datasets.py API¶

The datasets.py file lives in the assessment folder. This file records (1) how ocelote should obtain input datasets, and (2) data provenance metadata. Configuration settings are organized as Python dictionaries, with one block per input dataset.
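As a rough sketch of this layout, the blocks might resemble the following. The field names are the ones documented on this page, but the file path, the placeholder values, and the exact form of the file that ocelote generates are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical provenance blocks. Field names follow this page; every value
# is a placeholder and the generated datasets.py may be laid out differently.
perimeter = {
    "dataset": Path("inputs/perimeter.shp"),  # assumed local file path
    "source": "Example mapping agency",       # placeholder source string
    "note": "",
    "doi": "",
    "url": "",
    "access_date": "2024-01-15",              # ISO 8601, YYYY-MM-DD
}

dem = {
    "dataset": "download",  # fetch the DEM rather than reading a local file
    "source": "",           # documented as auto-populated for downloads
    "note": "",
    "doi": "",
    "url": "",
    "access_date": "",
    "archive": False,
    "tiles": [],            # DEM-specific field
}
```

The perimeter block has no archive entry because, as noted later on this page, the archive option is not available for the perimeter.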

Input Datasets¶

perimeter¶

Type:: dict

Provenance metadata for the fire perimeter. The buffered perimeter is used to define the domain of the analysis, and the raw perimeter is used to locate areas within or downstream of the burn area.

dem¶

Type:: dict

Provenance metadata for the digital elevation model (DEM). The DEM is used to determine flow pathways, slopes, and vertical relief.

dnbr¶

Type:: dict

Provenance metadata for the differenced normalized burn ratio (dNBR) dataset. The dNBR informs the M1 likelihood model.

severity¶

Type:: dict

Provenance metadata for the burn severity dataset. The severity informs network delineation, the M1 likelihood model, and the volume model.

kf¶

Type:: dict

Provenance metadata for the soil KF-factors (a measure of soil erodibility). Soil KF-factors inform the M1 likelihood model.

evt¶

Type:: dict

Provenance metadata for the existing vegetation type (EVT) dataset. The EVT informs network delineation.

retainments¶

Type:: dict

Provenance metadata for debris retainment features. The retainments inform network delineation.

excluded¶

Type:: dict

Provenance metadata for an exclusion mask. The exclusion mask informs network delineation.

Dataset Fields¶

Each provenance dictionary includes a dataset field, which tells ocelote how to obtain the associated dataset. The options available for this field vary by input dataset.

perimeter.dataset¶

Type:: Path

Must be the path to a fire perimeter file.

dem.dataset¶

Type:: "download" | Path

Set to "download" to download DEM data from the USGS National Map 1/3 arc-second DEM. Otherwise, must be the path to a DEM dataset on the local filesystem.

dnbr.dataset¶

Type:: Path | number

Usually the path to a dNBR dataset on the local filesystem. If a dNBR dataset is not available, you may set this field to a number, in which case that number is used as a constant dNBR value throughout the analysis.

severity.dataset¶

Type:: Path | "estimate" | number

Usually the path to a burn severity file on the local filesystem. If a burn severity dataset is not available, you may set this field to "estimate", in which case severity is estimated from the dNBR. When using this option, you must update the severity_thresholds field in configuration.py. Alternatively, you may set this field to a number, in which case that number is used as a constant burn severity value throughout the analysis.
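To illustrate the general idea behind the "estimate" option, the sketch below bins a dNBR value into a severity class using a list of thresholds. This is not ocelote's implementation: the threshold values, the class labels, and the treatment of values that fall exactly on a threshold are all assumptions, and the real thresholds are whatever you set in severity_thresholds.

```python
from bisect import bisect_right

# Hypothetical thresholds and class labels, chosen only for illustration.
SEVERITY_THRESHOLDS = [125, 250, 500]
SEVERITY_CLASSES = ["unburned", "low", "moderate", "high"]


def estimate_severity(dnbr, thresholds=SEVERITY_THRESHOLDS):
    """Bin a dNBR value into a severity class.

    A value equal to a threshold is assigned to the higher class
    (an assumption of this sketch).
    """
    return SEVERITY_CLASSES[bisect_right(thresholds, dnbr)]


for value in (100, 300, 600):
    print(value, estimate_severity(value))
```

Three thresholds split the dNBR range into four classes, which is why the two lists differ in length by one.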

kf.dataset¶

Type:: "download" | Path | number

Set to "download" to download soil KF-factor data from STATSGO. Alternatively, use a path to load the dataset from the local filesystem. If a KF-factor dataset is not available, you can set this field to a number, in which case that number is used as a constant KF-factor throughout the analysis.

evt.dataset¶

Type:: "download" | "download:layer" | Path | None

Set to "download" to download the most up-to-date LANDFIRE EVT layer. Alternatively, use "download:layer" to download a specific LANDFIRE EVT layer, which may be useful for replicating historical assessments. In this case, you must replace the "layer" text with the name of the desired LANDFIRE data layer. The available layer names are given in the list of LANDFIRE layers.

You may also set the dataset to a path to load an EVT dataset from the local filesystem, or set the dataset to None. Setting the dataset to None runs the analysis without using EVT information to inform network delineation.
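The "download:layer" form packs two pieces of information into one string. A minimal sketch of how such a value could be split is shown below; the helper name evt_layer and the layer name EXAMPLE_LAYER are hypothetical, and ocelote's own parsing may differ.

```python
def evt_layer(value):
    """Return the requested layer name, or None for a plain "download".

    Hypothetical helper: it only handles the two download forms and raises
    ValueError for anything else (including paths and None).
    """
    if value == "download":
        return None  # no layer requested: use the most up-to-date layer
    if isinstance(value, str) and value.startswith("download:"):
        layer = value.split(":", 1)[1]
        if not layer:
            raise ValueError('"download:" must be followed by a layer name')
        return layer
    raise ValueError(f"not a download setting: {value!r}")


print(evt_layer("download"))
print(evt_layer("download:EXAMPLE_LAYER"))
```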

retainments.dataset¶

Type:: None | "download" | Path

Defaults to None, in which case debris retainment features are not used to inform network delineation. Set to "download" to download retainment features from the Los Angeles County archive, or use a file path to load a retainment feature dataset from the local filesystem.

excluded.dataset¶

Type:: None | Path

Defaults to None, in which case no exclusion mask is used to inform network delineation. Change to a file path to load an exclusion mask from the local filesystem. If using an exclusion mask, you must provide note metadata for the excluded dataset.
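The options above can be summarized as a lookup table. The sketch below transcribes the Type lines of this section into a dictionary and checks a candidate value against it. The names ALLOWED, classify, and is_allowed are hypothetical, "number" is interpreted here as any real number other than a bool, and ocelote's own validation may behave differently.

```python
from numbers import Real
from pathlib import Path

# Allowed kinds of value for each dataset field, transcribed from this page.
ALLOWED = {
    "perimeter": {"path"},
    "dem": {"download", "path"},
    "dnbr": {"path", "number"},
    "severity": {"path", "estimate", "number"},
    "kf": {"download", "path", "number"},
    "evt": {"download", "download:layer", "path", "none"},
    "retainments": {"none", "download", "path"},
    "excluded": {"none", "path"},
}


def classify(value):
    """Map a candidate dataset value onto one of the kinds used in ALLOWED."""
    if value is None:
        return "none"
    if isinstance(value, Path):
        return "path"
    if isinstance(value, bool):
        raise TypeError("bool is not treated as a number in this sketch")
    if isinstance(value, Real):
        return "number"
    if value in ("download", "estimate"):
        return value
    if isinstance(value, str) and value.startswith("download:"):
        return "download:layer"
    raise ValueError(f"unrecognized dataset value: {value!r}")


def is_allowed(name, value):
    """Return True if the value is a documented option for the named dataset."""
    return classify(value) in ALLOWED[name]


print(is_allowed("kf", 0.2))
print(is_allowed("perimeter", "download"))
```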

Standard Fields¶

The following fields are included in all provenance dictionaries.

<dataset>.source¶

Type:: str

A string indicating the dataset source. All datasets that are not None must have source metadata before you can run an assessment. The source will auto-populate for the fire perimeter, dNBR, and severity if you initialize the assessment with a --from <source> option. Source strings are also auto-populated for any downloaded datasets. The exclusion mask source will auto-populate if an exclusion mask is provided to the preprocessor.

<dataset>.note¶

Type:: str

An optional note providing additional details about the dataset. If an exclusion mask is provided, then you must provide a note indicating what was excluded.
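The two requirements stated so far (a source for every dataset that is not None, and a note for the exclusion mask when one is used) could be checked before running an assessment with something like the sketch below. The function name missing_metadata and the layout of the datasets mapping are assumptions, and this is not the check that ocelote itself performs.

```python
from pathlib import Path


def missing_metadata(datasets):
    """List missing required metadata in a mapping of provenance blocks."""
    problems = []
    for name, block in datasets.items():
        if block.get("dataset") is None:
            continue  # unused datasets need no source
        if not block.get("source"):
            problems.append(f"{name}: missing source")
    excluded = datasets.get("excluded", {})
    if excluded.get("dataset") is not None and not excluded.get("note"):
        problems.append("excluded: missing note")
    return problems


# Placeholder blocks with hypothetical paths and source strings.
example = {
    "perimeter": {"dataset": Path("perimeter.shp"), "source": ""},
    "evt": {"dataset": None, "source": ""},
    "excluded": {"dataset": Path("mask.tif"), "source": "Analyst", "note": ""},
}
print(missing_metadata(example))
```

An empty list means both requirements are satisfied for the blocks that were passed in.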

<dataset>.doi¶

Type:: str

A DOI for the dataset, if available. DOIs will auto-populate for downloaded datasets that have DOIs available.

<dataset>.url¶

Type:: str

A URL used to access the dataset, if appropriate. If a dataset is downloaded and does not have a DOI, then the URL metadata will auto-populate.

<dataset>.access_date¶

Type:: str

An optional ISO 8601 date string indicating the date the dataset was accessed. Should follow the format YYYY-MM-DD. The access date will auto-populate for any downloaded datasets.
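If you fill in the access date by hand, a quick format check can catch typos. The sketch below uses only the Python standard library; the helper name is_valid_access_date is hypothetical and is not part of ocelote.

```python
import re
from datetime import date


def is_valid_access_date(text):
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    # Enforce the zero-padded YYYY-MM-DD shape before parsing.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False  # right shape, but not a real date
    return True


print(is_valid_access_date("2024-02-29"))
print(is_valid_access_date("2024-2-9"))
```

Calling date.today().isoformat() produces a string in the same format for the current date.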

<dataset>.archive¶

Type:: bool

Set to True to archive a dataset with the assessment results, or False to prevent the dataset from being archived. By default, the archive fields in datasets.py are updated so that any dataset not associated with a permanent DOI is archived. You can change the archive options after preprocessing to modify this behavior.

Note

The archive option is not available for the perimeter dataset, as the perimeter is always included in the assessment results.
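The default rule described for the archive field can be pictured as follows. This is only a sketch: the function name default_archive is hypothetical, and "associated with a permanent DOI" is approximated here as "the doi field is non-empty", which may not match how ocelote decides.

```python
def default_archive(name, block):
    """Guess a default archive flag for one provenance block (illustrative)."""
    if name == "perimeter":
        return None  # no archive option: the perimeter is always included
    # Archive the dataset unless a DOI is recorded for it.
    return not block.get("doi")


print(default_archive("dem", {"doi": "10.0000/example"}))  # placeholder DOI
print(default_archive("dnbr", {"doi": ""}))
```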

Misc Fields¶

Fields specific to a particular dataset.

dem.tiles¶

Type:: list[str]

An optional list of DEM tiles used to construct the input DEM dataset. Many large DEM datasets are split into tiles to facilitate distribution. This field lets you record which tiles were used. The field auto-populates if the DEM dataset is downloaded.