Staley 2017 Models

Introduction

The staley2017 (or s17) module provides functions that implement the logistic regression models of Staley et al., 2017. In brief, these models estimate debris-flow likelihood as a function of rainfall accumulation, and can be inverted to estimate the rainfall accumulation needed to reach a given likelihood.

from pfdf.models import s17

In the paper, the likelihood model describes debris-flow likelihoods as a function of terrain, fire burn severity, soil, and rainfall accumulation, such that:

Equation 1

p = \frac{e^X}{1 + e^X}
X = B + C_t \, T \, R + C_f \, F \, R + C_s \, S \, R

where:

Variable      Description
p             Debris-flow likelihood (0 to 1)
T             A terrain variable
F             A fire severity variable
S             A soil variable
R             Rainfall accumulation in mm
B             Model intercept
Ct, Cf, Cs    Variable coefficients

The likelihood model can also be inverted to compute the rainfall accumulation needed to achieve a given probability level:

Equation 2

R = \frac{\ln\left(\frac{p}{1 - p}\right) - B}{C_t \, T + C_f \, F + C_s \, S}
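
As a quick numeric check, the sketch below evaluates equation 1 and then uses equation 2 to recover the same rainfall accumulation. The parameters are the M1 15-minute values listed in the model summary below; the T, F, S, and R values are made up purely for illustration:

import numpy as np

# M1 parameters for the 15-minute duration (see the model summary below)
B, Ct, Cf, Cs = -3.63, 0.41, 0.67, 0.70

# Made-up variable values and rainfall accumulation, for illustration only
T, F, S, R = 0.5, 0.4, 0.3, 6.0

# Equation 1: likelihood given rainfall accumulation
X = B + Ct*T*R + Cf*F*R + Cs*S*R
p = np.exp(X) / (1 + np.exp(X))    # roughly 0.61

# Equation 2: accumulation given likelihood (recovers R = 6.0)
R2 = (np.log(p / (1 - p)) - B) / (Ct*T + Cf*F + Cs*S)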

The paper then describes 4 specific models, each of which uses a different combination of earth-system variables as T, F, and S. Each model is calibrated 3 times: once each for 15 minute, 30 minute, and 60 minute rainfall durations. The following summarizes each model and its calibrated parameter values (for the 15, 30, and 60 minute durations, in that order):

M1
  T:  Proportion of catchment area with both (1) moderate or high burn severity, and (2) slope ≥ 23 degrees
  F:  Mean catchment dNBR / 1000
  S:  Mean catchment KF-factor
  B  (15, 30, 60):  -3.63, -3.61, -3.21
  Ct (15, 30, 60):  0.41, 0.26, 0.17
  Cf (15, 30, 60):  0.67, 0.39, 0.20
  Cs (15, 30, 60):  0.70, 0.50, 0.22

M2
  T:  Mean sin(θ) of catchment area burned at moderate or high severity, where θ is slope angle in degrees
  F:  Mean catchment dNBR / 1000
  S:  Mean catchment KF-factor
  B  (15, 30, 60):  -3.62, -3.61, -3.22
  Ct (15, 30, 60):  0.61, 0.42, 0.27
  Cf (15, 30, 60):  0.65, 0.38, 0.19
  Cs (15, 30, 60):  0.68, 0.49, 0.22

M3
  T:  Topographic ruggedness (vertical relief divided by the square root of catchment area)
  F:  Proportion of catchment area burned at moderate or high severity
  S:  Mean catchment soil thickness / 100
  B  (15, 30, 60):  -3.71, -3.79, -3.46
  Ct (15, 30, 60):  0.32, 0.21, 0.14
  Cf (15, 30, 60):  0.33, 0.19, 0.10
  Cs (15, 30, 60):  0.47, 0.36, 0.18

M4
  T:  Proportion of catchment area that both (1) was burned, and (2) has a slope ≥ 30 degrees
  F:  Mean catchment dNBR / 1000
  S:  Mean catchment soil thickness / 100
  B  (15, 30, 60):  -3.60, -3.64, -3.30
  Ct (15, 30, 60):  0.51, 0.33, 0.20
  Cf (15, 30, 60):  0.82, 0.46, 0.24
  Cs (15, 30, 60):  0.27, 0.26, 0.13

Workflow

The s17 module includes two functions that provide general solutions to equation 1 and equation 2. The module also provides 4 utility classes, which can be used to compute variables and look up parameters for the 4 models in the paper. A common workflow (sketched below) is to:

  1. Use a model class to compute T, F, and S variables

  2. Use the class to retrieve the model’s parameters, and then

  3. Solve equation 1 and/or 2 using the variables and parameters
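
Putting these steps together, a minimal sketch using the M1 model might look like the following (the segments network and the input rasters are assumed to already exist):

from pfdf.models import s17

# 1. Compute the terrain, fire, and soil variables for the M1 model
T, F, S = s17.M1.variables(segments, moderate_high, slopes, dnbr, kf)

# 2. Retrieve the model parameters (15, 30, and 60 minute calibrations by default)
B, Ct, Cf, Cs = s17.M1.parameters()

# 3. Solve equation 1 and/or equation 2
p = s17.likelihood(6, B, Ct, T, Cf, F, Cs, S)      # likelihoods for R = 6 mm
R = s17.accumulation(0.5, B, Ct, T, Cf, F, Cs, S)  # accumulations for p = 0.5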

Model Classes

The module provides the M1, M2, M3, and M4 classes, each of which implements one of the models presented in the paper. The models use different earth-system variables for T, F, and S, but all 4 follow the form of equations 1 and 2.

Each class provides a variables method, which computes the T, F, and S variables given a stream segment network and various inputs. For example, using the M1 model:

# Compute Terrain, Fire and Soil variables for M1
from pfdf.models import s17
T, F, S = s17.M1.variables(segments, moderate_high, slopes, dnbr, kf_factor)

You can alternatively return specific variables, using each model’s terrain, fire, and/or soil method:

# Compute variables individually
T = s17.M1.terrain(segments, moderate_high, slopes)
F = s17.M1.fire(dnbr)
S = s17.M1.soil(kf_factor)

Note

The specific inputs to these methods will vary by model.

Each class also provides a parameters method, which returns the B, Ct, Cf, and Cs coefficients for the model. By default, each parameter is a vector with the values for the 15, 30, and 60 minute calibration, in that order. For example, using the M1 model:

>>> # Acquire model parameters
>>> B, Ct, Cf, Cs = s17.M1.parameters()

>>> # Examine parameters
>>> B
array([-3.63, -3.61, -3.21])
>>> Ct
array([0.41, 0.26, 0.17])
>>> Cf
array([0.67, 0.39, 0.2 ])
>>> Cs
array([0.7 , 0.5 , 0.22])

The durations property reports the default duration-order of the parameters:

>>> s17.M1.durations
[15, 30, 60]

You can also use the durations option of the parameters method to query specific durations:

>>> # Return parameters for specific rainfall durations
>>> B, Ct, Cf, Cs = s17.M1.parameters(durations=[60, 15])

>>> # Examine parameters
>>> B
array([-3.21, -3.63])
>>> Ct
array([0.17, 0.41])
>>> Cf
array([0.2 , 0.67])
>>> Cs
array([0.22, 0.7 ])

Refer to the Specific Models section below for implementation details of the individual models.

Likelihood

You can solve equation 1 using the likelihood function. The equation has a general form, so the function is suitable for any of the 4 models in the paper, as well as any custom models following equation 1. This function solves for debris-flow likelihoods given a set of rainfall accumulations (R). Note that the function requires model variables (T, F, S) and parameters (B, Ct, Cf, Cs) in addition to rainfall accumulations.

In the simplest case - a single rainfall accumulation and duration - the function returns a vector of data values, with one element per stream segment. For example:

>>> # Compute variables and parameters for the M1 model
>>> T, F, S = s17.M1.variables(segments, mod_high, slopes, dnbr, kf)
>>> B, Ct, Cf, Cs = s17.M1.parameters(durations=15)

>>> # Examine sizes
>>> len(segments)
2561
>>> T.size, F.size, S.size
(2561, 2561, 2561)
>>> B.size, Ct.size, Cf.size, Cs.size
(1, 1, 1, 1)

>>> # Compute likelihoods given rainfall accumulation
>>> R = 6 # mm/duration
>>> p = s17.likelihood(R, B, Ct, T, Cf, F, Cs, S)
>>> p.shape
(2561,)

You can also solve for multiple rainfall accumulations simultaneously. In this case, the output array will be a matrix, where each row holds results for a stream segment, and each column is a rainfall accumulation:

>>> # Compute likelihoods for multiple rainfall accumulations
>>> R = [4, 6, 7, 10]
>>> p = s17.likelihood(R, B, Ct, T, Cf, F, Cs, S)
>>> p.shape
(2561, 4)

The command also allows you to solve for multiple parameter values simultaneously (as is common when solving for multiple rainfall durations). In this case, the output will be 3D, with each element along the third dimension representing one set of parameters:

>>> B, Ct, Cf, Cs = s17.M1.parameters(durations=[15, 30, 60])
>>> R = [4, 6, 7, 10]
>>> p = s17.likelihood(R, B, Ct, T, Cf, F, Cs, S)
>>> p.shape
(2561, 4, 3)

Durations and Intensities

The likelihood function requires rainfall accumulations rather than intensities. Note that the input rainfall accumulations should be in millimeters, and are interpreted as the total accumulation over the duration associated with each set of parameters.

For example, say we used a single rainfall accumulation of 6 mm for 15, 30, and 60-minute rainfall durations. Then the single rainfall accumulation is equivalent to the following intensities:

  • 6 mm in 15 minutes = 24 mm / hour

  • 6 mm in 30 minutes = 12 mm / hour

  • 6 mm in 60 minutes = 6 mm / hour

which correspond to the elements along the third dimension of the output array, respectively.

Tip

If you typically work with rainfall intensities, you can use the intensity module to convert them to rainfall accumulations for this function.

>>> from pfdf.utils import intensity
>>> I15 = [20, 24, 40] # mm/hour
>>> R15 = intensity.to_accumulation(I15, duration=15)
>>> R15
array([ 5.,  6., 10.])

Rainfall Accumulation

You can solve equation 2 using the accumulation function. The equation has a general form, so the function is suitable for any of the 4 models in the paper, as well as any custom models following equation 2. This function computes the rainfall accumulation needed to achieve a given debris-flow probability level (p). In some cases, it is not possible to achieve the desired probability level with any rainfall level. This is most common when a catchment basin is minimally burned, so the terrain (T) and fire (F) variables are very small. When this occurs, the rainfall accumulation for the stream segment will be nan.
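
For example, you may want to screen out such segments before further analysis. A minimal sketch, assuming the variables and parameters have already been computed:

import numpy as np

# NaN accumulations mark segments that cannot reach the probability level
R = s17.accumulation(p, B, Ct, T, Cf, F, Cs, S)
achievable = ~np.isnan(R)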

This function requires model variables (T, F, S) and parameters (B, Ct, Cf, Cs) in addition to probability levels. In the simplest case - a single probability level and single rainfall duration - the function returns a vector of data values, with one element per stream segment. For example:

>>> # Compute variables and parameters for the M1 model
>>> T, F, S = s17.M1.variables(segments, mod_high, slopes, dnbr, kf)
>>> B, Ct, Cf, Cs = s17.M1.parameters(durations=15)

>>> # Examine sizes
>>> len(segments)
2561
>>> T.size, F.size, S.size
(2561, 2561, 2561)
>>> B.size, Ct.size, Cf.size, Cs.size
(1, 1, 1, 1)

>>> # Compute accumulations given probability
>>> p = 0.5
>>> R = s17.accumulation(p, B, Ct, T, Cf, F, Cs, S)
>>> R.shape
(2561,)

You can also solve for multiple probability levels simultaneously. In this case, the output array will be a matrix. Each row holds results for a stream segment, and each column is a probability level. For example:

>>> # Compute accumulations for multiple probabilities
>>> p = [.2, .4, .6, .8]
>>> R = s17.accumulation(p, B, Ct, T, Cf, F, Cs, S)
>>> R.shape
(2561, 4)

The command also allows you to solve for multiple parameter values simultaneously (as is common when solving for multiple rainfall durations). In this case, the output will be 3D, with each element along the third dimension representing one set of parameters:

>>> B, Ct, Cf, Cs = s17.M1.parameters(durations=[15, 30, 60])
>>> p = [.2, .4, .6, .8]
>>> R = s17.accumulation(p, B, Ct, T, Cf, F, Cs, S)
>>> R.shape
(2561, 4, 3)

Durations and Intensities

Note that the output rainfall accumulations are in millimeters, and are relative to the duration associated with each set of parameters. Continuing the example, we provided three sets of parameters, calibrated for 15, 30, and 60 minute durations, respectively. So the output accumulations are:

  • Accumulations in mm per 15 minutes

  • Accumulations in mm per 30 minutes

  • Accumulations in mm per 60 minutes

which correspond to the elements along the third dimension of the output array, respectively.

Tip

You can use the intensity module to convert the output accumulations to intensities (in mm/hour):

from pfdf.utils import intensity
R = s17.accumulation(...)
I = intensity.from_accumulation(R, durations)  # durations used to calibrate the parameters, e.g. [15, 30, 60]

Advanced Usage

Tip

This section is mostly intended for researchers. You probably don’t need to read it for standard hazard assessments.

Custom Models

As noted above, the likelihood and accumulation functions solve the general forms of equations 1 and 2, so you are not limited to the M1-4 models and can design custom models instead. To use a custom model, compute its variables (T, F, S) and parameters (B, Ct, Cf, Cs), and then call likelihood and accumulation as usual.
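
For example, here is a minimal sketch of a custom model. The helper functions and coefficients are hypothetical placeholders, not values from the paper:

# Custom variables, one value per stream segment (hypothetical helpers)
T = my_terrain_metric(segments)
F = my_fire_metric(segments)
S = my_soil_metric(segments)

# Custom calibration (made-up coefficients)
B, Ct, Cf, Cs = -3.5, 0.45, 0.60, 0.65

# Solve equations 1 and 2 as usual
p = s17.likelihood(6, B, Ct, T, Cf, F, Cs, S)
R = s17.accumulation(0.5, B, Ct, T, Cf, F, Cs, S)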

Note

When designing custom models, the rainfall accumulations are always parsed/returned relative to the duration interval used to calibrate the model parameters. Refer above for an explanation of how this affects the likelihood and accumulation commands for the M1-4 models.

Parameter Sweeps

Some researchers may be interested in conducting parameter sweeps or using Monte Carlo methods to test model parameters. When doing so, note that each parameter (B, Ct, Cf, Cs) provided as input to likelihood and/or accumulation may be either a scalar or a vector. If a parameter is a scalar, then the same value is used for each computation. If a parameter is a vector, then each value will correspond to one column of the output array. Note that if multiple parameters are vectors, then the vectors must all be the same length.

For example, to test multiple values of a single parameter, you could do something like:

>>> # Get model parameters. Use 30 different values of Ct
>>> import numpy as np
>>> B = -3.63
>>> Ct = np.arange(0.2, 0.5, 0.01)
>>> Cf = 0.67
>>> Cs = 0.70
>>> R = 0.24  # mm/duration

>>> # Examine sizes.
>>> len(segments)
2561
>>> Ct.size
30

>>> # Estimate likelihoods for the 30 Ct values
>>> p = s17.likelihood(R, B, Ct, T, Cf, F, Cs, S)
>>> p.shape
(2561, 30)

In this example, we have tested an M1-like model using 30 different values of the Ct parameter. The output array is a matrix with one column per tested value.


Alternatively, you could test multiple values of multiple parameters simultaneously. For example, suppose you use a Latin hypercube to sample 5000 different sets of Ct, Cf, and Cs parameters. Your code might resemble the following:

>>> # Sample 5000 sets of Ct, Cf, and Cs values
>>> Ct, Cf, Cs = my_lhc_sampler(N=5000)
>>> Ct.size, Cf.size, Cs.size
(5000, 5000, 5000)

>>> # Estimate likelihoods for all 5000 sets
>>> p = s17.likelihood(R, B, Ct, T, Cf, F, Cs, S)
>>> p.shape
(2561, 5000)

Here, the output array is a matrix with one column per sampled set of parameters.
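
For reference, one way to build a sampler like the hypothetical my_lhc_sampler above is with scipy.stats.qmc (this assumes SciPy is available; the sampling bounds below are arbitrary):

from scipy.stats import qmc

# Draw 5000 Latin hypercube samples of (Ct, Cf, Cs) within arbitrary bounds
sampler = qmc.LatinHypercube(d=3)
samples = qmc.scale(sampler.random(n=5000), l_bounds=[0.1, 0.1, 0.1], u_bounds=[0.9, 0.9, 0.9])
Ct, Cf, Cs = samples.T    # each is a vector of 5000 sampled values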


If you need to vary a variable (T, F, S) in conjunction with parameters, then the variable should be a matrix with one column per parameter set. For example, the following code could be used to vary the terrain variable as a function of the Ct parameter:

>>> # Use a different terrain variable for 30 Ct values
>>> Ct = np.arange(0.2, 0.5, 0.01)
>>> T = my_terrain_sampler(segments, Ct)

>>> # Examine sizes
>>> Ct.size
30
>>> T.shape
(2561, 30)

>>> # Estimate likelihoods for the 30 sets of Ct and T
>>> p = s17.likelihood(R, B, Ct, T, Cf, F, Cs, S)
>>> p.shape
(2561, 30)

Specific Models

M1

Variable    Description
T           Proportion of catchment area with both (1) moderate or high burn severity, and (2) slope angle ≥ 23 degrees
F           Mean catchment dNBR / 1000
S           Mean catchment KF-factor

Example workflow:

# Get input rasters
from pfdf import severity, watershed
from pfdf.raster import Raster
moderate_high = severity.mask(barc4, ["moderate", "high"])
slopes = watershed.slopes(dem, flow)
dnbr = Raster('dnbr.tif')
kf = Raster('kf-factor.tif')

# Compute variables and parameters
T, F, S = s17.M1.variables(segments, moderate_high, slopes, dnbr, kf)
B, Ct, Cf, Cs = s17.M1.parameters()

M2

Variable    Description
T           Mean sin(θ) of catchment area burned at moderate or high severity
F           Mean catchment dNBR / 1000
S           Mean catchment KF-factor

Example workflow:

# Get input rasters
from pfdf import severity, watershed
from pfdf.raster import Raster
moderate_high = severity.mask(barc4, ["moderate", "high"])
slopes = watershed.slopes(dem, flow)
dnbr = Raster('dnbr.tif')
kf = Raster('kf-factor.tif')

# Compute variables and parameters
T, F, S = s17.M2.variables(segments, moderate_high, slopes, dnbr, kf)
B, Ct, Cf, Cs = s17.M2.parameters()

M3

Variable    Description
T           Topographic ruggedness (vertical relief divided by the square root of catchment area)
F           Proportion of catchment area burned at moderate or high severity
S           Mean catchment soil thickness / 100

Example workflow:

# Get input rasters
from pfdf import severity, watershed
from pfdf.raster import Raster
moderate_high = severity.mask(barc4, ["moderate", "high"])
relief = watershed.relief(dem, flow)
thickness = Raster('soil-thickness.tif')

# Compute variables and parameters
T, F, S = s17.M3.variables(segments, moderate_high, relief, thickness)
B, Ct, Cf, Cs = s17.M3.parameters()

Note

By default, the relief dataset is interpreted as units of meters. If this is not the case, use the “relief_per_m” option to specify a conversion factor between relief units and meters. For example, if your relief dataset is in feet, use:

T, F, S = s17.M3.variables(..., relief_per_m=3.28084)

M4

Variable    Description
T           Proportion of catchment area that both (1) was burned, and (2) has a slope angle ≥ 30 degrees
F           Mean catchment dNBR / 1000
S           Mean catchment soil thickness / 100

Example workflow:

# Get input rasters
from pfdf import severity, watershed
from pfdf.raster import Raster
isburned = severity.mask(barc4, "burned")
slopes = watershed.slopes(dem, flow)
dnbr = Raster('dnbr.tif')
thickness = Raster('soil-thickness.tif')

# Compute variables and parameters
T, F, S = s17.M4.variables(segments, isburned, slopes, dnbr, thickness)
B, Ct, Cf, Cs = s17.M4.parameters()