pfdf 3.0.0 Release Notes

This release adds the data package, which provides routines to download commonly used datasets from the internet. The release also provides pfdf tutorials as Jupyter Notebooks, improves array broadcasting in the gartner2014 and staley2017 modules, and introduces the RasterMetadata class (which manages raster metaata without loading data arrays to memory).

Backwards Compatibility

The updates to array broadcasting in the gartner2014 and staley2017 modules may break backwards compatibility in some cases. Please read the breaking changes section for details.


Updates

Major Additions

  • Added the data package, which provides routines to download commonly-used datasets from the internet

  • Added the RasterMetadata class, which manages raster metadata without loading raster data arrays into memory

Tutorials

  • Converted the tutorials to Jupyter Notebooks

  • Added a tutorial on the data package

  • Split the raster tutorial into three tutorials: an introduction, raster properties, and raster factories

  • Added a tutorial on the RasterMetadata class

Installation

  • pfdf can now be installed from the official USGS package registry. This is the recommended way to install pfdf.

Projection Package

Raster

  • Added the from_url factory, which builds Raster objects for datasets accessed via web URLs

  • save now returns the path to the saved file

  • Added the dtype, field_casting, and operation options to from_points and from_polygons

  • Pinned rasterized vector feature dtypes to int32 for integer fields, and float64 for floating-point fields

  • Added __getitem__, which allows indexing into Raster objects

  • Added copy option to Raster.fill and Raster.set_range

  • repr now only includes non-default raster names

Segments

  • Added the located_basins property, which indicates if the object has pre-located outlet basins,

  • save now returns the path to the saved file

  • repr now includes more info on the network

Models

Exceptions

  • Added 9 new exceptions pertaining to download errors from third-party data providers

  • Added 2 new exceptions for when a dataset does not overlap a required bounding box

Bug Fixes

  • Fixed a bug where Raster.from_points and Raster.from_polygons would ignore an entire MultiPoint or MultiPolygon feature if a single point or polygon was outside a queried bounding box

  • Fixed a bug that prevented Raster.from_points and Raster.from_polygons from rasterizing features in geodatabases and other structured directories

  • Fixed a bug where some BoundingBox and Transform properties were returned as numpy arrays instead of floats

  • Fixed a bug wherein Raster.from_file would ignore the driver option

  • Fixed a bug wherein Raster.set_range and Raster.fill would fail when filling a float32 raster with NaN NoData

  • Fixed a bug that prevented loading float32 rasters from file with default NoData

  • Removed the unused y option from projection.Transform.reproject

For Developers

  • Added slow and web markers

  • Added quicktest and webtest developer scripts

  • Added developer scripts to build and run tutorials

  • Pipeline now runs tutorials and web tests on merge requests

  • Added a scheduled pipeline for daily builds. The daily builds do not use poetry.lock, instead building from the most up-to-date dependencies.

  • pyproject.toml now conforms to PEP 508 and poetry 2+

  • Improved pipeline implementation of multiple Python builds

API Breaking Changes

Changes to model broadcasting in the gartner2014 and staley2017 wil break backwards compatibility in some cases. This section details these changes and suggest possible fixes.

gartner2014

Formerly, output arrays from the gartner2014 module had dimensions of (Segments x Parameter Runs). In this formulation, rainfall intensities and confidence interval parameters were grouped with the model coefficients to characterize parameter runs. In v3.0.0, rainfall intensities and confidence intervals have been separated into their own dimensions. As such, output arrays can now be 4D with dimensions of (Segments x Rainfall Intensities x Parameter Runs x Confidence Intervals).

This change is unlikely to affect most users, as running the model with multiple coefficients and/or confidence intervals is rare. The most common case is to run the multiple over a stream segment network for multiple rainfall intensities, and this will still return a 2D array of model results.

staley2017

Formerly, output arrays from the likelihood and accumulation functions had dimensions of (Segments x Durations x Accumulations/Probabilities). The order of these dimensions has been altered to (Segments x Accumulations/Probabilities x Durations) in order to simplify broadcasting for the most common use cases.

This change is most likely to affect users of the accumulation function (often referred to as rainfall thresholds), as this function is often run with multiple sets of model coefficients. This change will also affect users of the likelihood function who set keepdims = True. The following is a suggested fix for common use cases:

Accumulation

Change this:

accumulations = s17.accumulation(...)
for s in segments:
    for d in durations:
        for p in probabilities:
            accumulations[s, d, p]

to this:

accumulations = s17.accumulation(...)
for s in segments:
    for p in probabilities:
        for d in durations:
            accumulations[s, p, d]

Likelihood

Change this:

likelihoods = s17.likelihood(..., keepdims=True)
for s in segments:
    for r in R15:
        likelihoods[s, :, r]

to this:

likelihoods = s17.likelihood(..., keepdims=True)
for s in segments:
    for r in R15:
        likelihoods[s, r, :]

intensity.from_accumulation

The utils.intensity.from_accumulation function has been updated to accommodate the new dimension order in the staley2017 model. Formerly, this function always broadcast durations along the second dimension of an array. Now, this function broadcasts durations along the final array dimension, regardless of dimensional index.

This change is unlikely to affect most users, as the dimensions of the accumulation arrays from the staley2017 module have shifted in a like manner.