digger.compare.confusion

Module Contents

Functions

calculate_metrics

Calculate binary classification metrics.

extract_shapes

Calculate polygon shapes of each element of the confusion matrix.

API

digger.compare.confusion.calculate_metrics(true_positive: float, true_negative: float, false_positive: float, false_negative: float, yamlout: str = 'confusion_metrics.yaml') → dict

Calculate binary classification metrics.

See the confusion matrix Wikipedia page for additional detail regarding many of the provided classification metrics. Note that many of these metrics have multiple names. For simplicity, only one name is written here.

\(\alpha_T\), \(\beta_T\), \(\gamma_T\), and \(\Omega_T\) are from Schraml et al. (2015).

Inputs:
true_positive : float

Number of true positives (TP)

true_negative : float

Number of true negatives (TN)

false_positive : float

Number of false positives (FP)

false_negative : float

Number of false negatives (FN)

yamlout : str

Name of the YAML file to which the calculated metrics are written.

Outputs:
out : dict

Dictionary with the following keys:

Table 7. Definition of the returned dictionary keys and values.

======== ========================= =====================================
Key      Description of value      Equation
======== ========================= =====================================
ppv      Positive predictive value \(\frac{TP}{TP + FP}\)
tpr      True positive rate        \(\frac{TP}{TP + FN}\)
tnr      True negative rate        \(\frac{TN}{TN + FP}\)
npv      Negative predictive value \(\frac{TN}{TN + FN}\)
fnr      False negative rate       \(\frac{FN}{FN + TP}\)
fpr      False positive rate       \(\frac{FP}{FP + TN}\)
fdr      False discovery rate      \(\frac{FP}{FP + TP}\)
for      False omission rate       \(\frac{FN}{FN + TN}\)
f1       \(F_1\) score             \(\frac{2TP}{2TP + FP + FN}\)
ts       Threat score              \(\frac{TP}{TP + FN + FP}\)
accuracy Accuracy                  \(\frac{TP + TN}{TP + FP + TN + FN}\)
alphaT   \(\alpha_T\)              \(\frac{TP}{TP + FN + FP}\)
betaT    \(\beta_T\)               \(\frac{FN}{TP + FN + FP}\)
gammaT   \(\gamma_T\)              \(\frac{FP}{TP + FN + FP}\)
OmegaT   \(\Omega_T\)              \(\frac{TP - FP - FN}{TP + FN + FP}\)
======== ========================= =====================================
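The formulas in Table 7 can be evaluated directly, which is a convenient way to check a result by hand. The following is a minimal standalone sketch, not digger's implementation: the helper name `confusion_metrics` and the input values are hypothetical, only a subset of the keys is computed, and nothing is written to a YAML file.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Evaluate a subset of the Table 7 formulas.

    Hypothetical helper for illustration only. It does not guard against
    zero denominators and does not write any file.
    """
    # Denominator shared by ts, alphaT, betaT, gammaT, and OmegaT
    footprint = tp + fn + fp
    return {
        "ppv": tp / (tp + fp),
        "tpr": tp / (tp + fn),
        "tnr": tn / (tn + fp),
        "npv": tn / (tn + fn),
        "fnr": fn / (fn + tp),
        "fpr": fp / (fp + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "ts": tp / footprint,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "alphaT": tp / footprint,
        "betaT": fn / footprint,
        "gammaT": fp / footprint,
        "OmegaT": (tp - fp - fn) / footprint,
    }


# Made-up counts (or areas) chosen only to exercise the formulas
m = confusion_metrics(tp=60.0, tn=900.0, fp=20.0, fn=20.0)
for key, value in m.items():
    print(f"{key:>8s} = {value:.4f}")
```

Two relationships follow from the equations themselves: \(\alpha_T + \beta_T + \gamma_T = 1\), so \(\Omega_T = \alpha_T - \beta_T - \gamma_T = 2\alpha_T - 1\), and \(\alpha_T\) has the same formula as the threat score.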

References

Schraml, K., Thomschitz, B., McArdell, B. W., Graf, C., & Kaitna, R. (2015). Modeling debris-flow runout patterns on two alpine fans with different dynamic simulation models. Natural Hazards and Earth System Sciences, 15(7), 1483–1492. https://doi.org/10.5194/nhess-15-1483-2015

digger.compare.confusion.extract_shapes(truth_fn: str, simulation_fn: str, region_fn: str | None = None, make_plot: bool = True, topo_file: str = None, initial_source: str = None) → dict

Calculate polygon shapes of each element of the confusion matrix.

Inputs:
truth_fn : str

Path to fiona-readable vector file containing the extent of observed runout.

simulation_fn : str

Path to fiona-readable vector file containing the extent of simulated runout.

region_fn : str

Path to fiona-readable vector file containing the extent of the considered region. This is necessary to calculate the true negative area correctly. If it is not provided, the enveloping rectangle of the observed and simulated polygons is used.

make_plot : bool

Whether to make a diagnostic plot. The plot is saved as ‘confustion.png’ in the current directory.

topo_file : str

Path to rasterio-readable raster file used to depict topography under the confusion matrix if make_plot=True.

initial_source : str

Path to a fiona-readable vector file containing the extent of the initial landslide source region. Used for plotting only.

Outputs:
out : dict

Dictionary with the keys ‘true_positive’, ‘true_negative’, ‘false_positive’, and ‘false_negative’. The value associated with each key is a shapely.Polygon or MultiPolygon associated with the classified area.
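The four returned shapes correspond to the usual set relationships between the observed and simulated extents. The sketch below illustrates that logic on a coarse grid, using plain Python sets of cells in place of polygons so that it runs without dependencies. It is an assumption-laden illustration, not digger's code: the helper `classify_cells`, the cell coordinates, and the fallback to a bounding rectangle of cells are all hypothetical stand-ins for the polygon operations described above.

```python
def classify_cells(truth, simulated, region=None):
    """Split grid cells into the four confusion-matrix classes.

    truth, simulated, region: sets of (col, row) cells. These are
    hypothetical stand-ins for the polygons read from vector files.
    """
    footprint = truth | simulated
    if region is None:
        # Assumed fallback: bounding rectangle of both footprints
        cols = [c for c, _ in footprint]
        rows = [r for _, r in footprint]
        region = {
            (c, r)
            for c in range(min(cols), max(cols) + 1)
            for r in range(min(rows), max(rows) + 1)
        }
    return {
        "true_positive": truth & simulated,    # observed and simulated
        "false_positive": simulated - truth,   # simulated but not observed
        "false_negative": truth - simulated,   # observed but not simulated
        "true_negative": region - footprint,   # neither, inside the region
    }


# Made-up footprints: two overlapping rectangles of unit cells
truth = {(c, r) for c in range(0, 4) for r in range(0, 2)}
simulated = {(c, r) for c in range(2, 6) for r in range(0, 3)}

shapes = classify_cells(truth, simulated)
areas = {key: float(len(cells)) for key, cells in shapes.items()}
print(areas)
```

With unit cells, the cell counts act as areas. The true negative count depends entirely on the region, which is why the default bounding rectangle and an explicitly supplied region can give different values for any metric that involves TN.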