data.usgs.statsgo module
Functions to load data from the STATSGO archive. Specifically, these functions load data from the STATSGO COG collection, which reformatted select data fields from the source STATSGO archive as cloud-optimized GeoTiff (COG) rasters. Currently, the supported data fields include: KFFACT and THICK.
Function | Description |
---|---|
Load Data | |
read | Loads data from a STATSGO field as a Raster object |
download | Downloads the COG data file for a STATSGO field |
Item Info | |
fields | Returns a pandas.DataFrame with information on the supported fields |
url | Returns the ScienceBase URLs for items in the STATSGO COG collection |
query | Returns ScienceBase metadata on a STATSGO item as a JSON dict |
Load Data
- pfdf.data.usgs.statsgo.read(field, bounds, *, timeout=60)
Reads data from a STATSGO field into memory as a Raster object
Read Data
read(field, bounds)
Reads data from the indicated STATSGO field within the provided bounding box. Supported fields include: KFFACT and THICK. Note that the bounds input should be a BoundingBox-like object with a CRS. Returns the loaded dataset as a Raster object.
Connection Timeout
read(..., *, timeout)
Specifies a maximum time in seconds for connecting to the ScienceBase data server. This option is typically a scalar, but may also use a vector with two elements. In this case, the first value is the timeout to connect with the server, and the second value is the time for the server to return the first byte. You can also set timeout to None, in which case API queries will never time out. This may be useful for some slow connections, but is generally not recommended as your code may hang indefinitely if the server fails to respond.
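As a rough illustration of the timeout semantics described above, the following sketch normalizes a timeout value into a (connect, read) pair. The normalize_timeout helper is hypothetical and is not part of pfdf. It assumes that a scalar applies to both stages, which is the convention used by the requests library, and that values must be positive. pfdf's internal handling may differ.

```python
from numbers import Real


def normalize_timeout(timeout):
    """Return a (connect, read) pair in seconds, or None for no time limit.

    Hypothetical helper for illustration only; not part of pfdf.
    """
    # None means queries never time out
    if timeout is None:
        return None

    # Assumption: a scalar applies to both the connect and read stages
    if isinstance(timeout, Real):
        values = (float(timeout), float(timeout))
    else:
        # A vector must hold exactly two elements: (connect, read)
        values = tuple(float(value) for value in timeout)
        if len(values) != 2:
            raise ValueError("a timeout vector must have exactly two elements")

    # Assumption: timeouts must be positive
    if any(value <= 0 for value in values):
        raise ValueError("timeout values must be positive")
    return values
```

Under these assumptions, a scalar such as 60 maps to the same limit for both stages, while a vector such as [5, 30] keeps a short connection limit and a longer wait for the first byte.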
- Inputs:
field (str) – The name of the STATSGO data field from which to load data
bounds (BoundingBox-like) – The bounding box, with a CRS, in which to load data
timeout (scalar | vector) – The maximum number of seconds to connect with the ScienceBase server
- Outputs:
Raster – The data loaded from the STATSGO archive
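A minimal usage sketch, assuming pfdf is installed and the ScienceBase server is reachable. The read_fields helper and the SUPPORTED_FIELDS tuple are hypothetical names introduced for illustration; only statsgo.read and its documented arguments come from this module. The helper loads each supported field within the same bounding box, so only the requested area is read into memory.

```python
SUPPORTED_FIELDS = ("KFFACT", "THICK")  # fields listed in this module's docs


def read_fields(bounds, fields=SUPPORTED_FIELDS, timeout=60):
    """Read each requested STATSGO field within `bounds`.

    `bounds` should be a BoundingBox-like object with a CRS, as required
    by statsgo.read. Returns a dict mapping field names to Raster objects.
    Hypothetical helper for illustration only; not part of pfdf.
    """
    # Imported lazily so the helper can be defined without pfdf installed
    from pfdf.data.usgs import statsgo

    return {
        field: statsgo.read(field, bounds, timeout=timeout) for field in fields
    }
```

For example, read_fields(bounds, timeout=[10, 60]) would pass a 10 second connection limit and a 60 second first-byte limit to each read call, where bounds is a BoundingBox-like object you have already built.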
- pfdf.data.usgs.statsgo.download(field, *, parent=None, name=None, overwrite=False, timeout=60)
Downloads the cloud-optimized GeoTiff for a STATSGO field
Download Data
download(field)
Downloads the cloud-optimized GeoTiff (COG) for the indicated STATSGO field. Supported fields include: KFFACT and THICK.
The dataset in the downloaded file spans the continental US at a nominal 30 meter resolution. A downloaded file requires 336 MB of disk space. Note that the COG format uses compression internally to reduce file size, so reading the full dataset into memory requires ~60 GB of RAM, significantly more than the size of the downloaded file.
Returns the path to the downloaded file as output. By default, downloads a file named STATSGO-<field>.tif to the current folder. Raises an error if the file already exists. Refer to the following syntax for additional file path options.
File Path
download(..., *, parent)
download(..., *, name)
download(..., *, overwrite=True)
Options for downloading the file. Use the parent input to specify the path to the parent folder where the file should be saved. If parent is a relative path, then it is interpreted relative to the current folder. Use name to set the name of the downloaded file. By default, raises an error if the path for the downloaded file already exists. Set overwrite=True to allow the download to overwrite an existing file.
Connection Timeout
download(..., *, timeout)
Specifies a maximum time in seconds for connecting to the ScienceBase data server. This option is typically a scalar, but may also use a vector with two elements. In this case, the first value is the timeout to connect with the server, and the second value is the time for the server to return the first byte. You can also set timeout to None, in which case API queries will never time out. This may be useful for some slow connections, but is generally not recommended as your code may hang indefinitely if the server fails to respond.
- Inputs:
field (str) – The name of the STATSGO data field to download
parent (Path-like) – The path to the parent folder where the file should be saved. Defaults to the current folder.
name (str) – The name for the downloaded file. Defaults to STATSGO-<field>.tif
overwrite (bool) – True to allow the downloaded file to replace an existing file. False (default) to not allow overwriting
timeout (scalar | vector) – The maximum number of seconds to connect with the ScienceBase server
- Outputs:
Path – The Path to the downloaded COG file
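The file path options above can be sketched as follows. Both helpers are hypothetical and are not part of pfdf: default_cog_path mirrors the documented defaults (current folder, STATSGO-<field>.tif), and download_once skips the download when the file is already on disk instead of triggering the error raised for existing files. The sketch assumes name is a bare file name and that pfdf is installed when a download is actually needed.

```python
from pathlib import Path


def default_cog_path(field, parent=None, name=None):
    """Return the path where the documented defaults would place the file.

    Hypothetical helper for illustration only; not part of pfdf.
    """
    # A relative parent is interpreted relative to the current folder;
    # an absolute parent replaces the current folder entirely
    folder = Path.cwd() if parent is None else Path.cwd() / parent
    filename = f"STATSGO-{field}.tif" if name is None else name
    return folder / filename


def download_once(field, parent=None, name=None, timeout=60):
    """Download the COG for `field` unless the file already exists."""
    path = default_cog_path(field, parent, name)
    if path.exists():
        # Reuse the existing file rather than overwriting it
        return path

    # Imported lazily so the helpers can be defined without pfdf installed
    from pfdf.data.usgs import statsgo

    return statsgo.download(field, parent=parent, name=name, timeout=timeout)
```

Reusing an existing file this way is a design choice for repeated runs of the same script. Pass overwrite=True to statsgo.download directly when a fresh copy is required.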
Item Info
- pfdf.data.usgs.statsgo.fields() -> DataFrame
Returns a pandas.DataFrame describing the supported STATSGO fields
fields()
Returns a pandas.DataFrame describing the STATSGO fields supported by this module. The index entries are the names of supported fields. Each row provides the description, units, and URL to the ScienceBase catalog item for the field.
- Outputs:
pandas.DataFrame – Documents the supported STATSGO fields
index (str) – The name of each field
Description (str) – A description of each field
Units (str) – The units of each field
URL (str) – The URL to the ScienceBase item for each field
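A short sketch of looking up one value in the returned DataFrame, assuming pfdf is installed. The field_units helper is a hypothetical name introduced for illustration; the index and the Units column are the ones documented above.

```python
def field_units(field):
    """Return the units reported for a supported STATSGO field.

    Hypothetical helper for illustration only; not part of pfdf.
    """
    # Imported lazily so the helper can be defined without pfdf installed
    from pfdf.data.usgs import statsgo

    info = statsgo.fields()

    # The index entries are the names of the supported fields
    if field not in info.index:
        supported = ", ".join(str(name) for name in info.index)
        raise ValueError(f"unsupported field {field!r}; expected one of: {supported}")
    return info.loc[field, "Units"]
```

The same pattern works for the Description and URL columns.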
- pfdf.data.usgs.statsgo.url(field=None)
Returns the URLs to ScienceBase items for the STATSGO dataset
Collection URL
url()
Returns the URL to the ScienceBase STATSGO collection item. This item is the parent of the individual STATSGO data field rasters, and it links to the ScienceBase items for the supported STATSGO data fields.
Field URL
url(field)
Returns the URL to the ScienceBase item for the queried STATSGO field. Supported fields include: KFFACT and THICK.
- Inputs:
field (str) – A STATSGO field whose ScienceBase item URL should be returned
- Outputs:
str – The URL to a ScienceBase item in the STATSGO archive
- pfdf.data.usgs.statsgo.query(field=None, *, timeout=60)
Queries the ScienceBase API for a STATSGO item and returns the response as a JSON dict
Query Collection
query()
Uses the ScienceBase API to query the parent item for the STATSGO collection. This item links to the items for the supported STATSGO data fields. Returns the query response as a JSON dict.
Query Field
query(field)
Uses the ScienceBase API to query the catalog item for the indicated STATSGO data field. Supported fields include: KFFACT and THICK.
Connection Timeout
query(..., *, timeout)
Specifies a maximum time in seconds for connecting to the ScienceBase data server. This option is typically a scalar, but may also use a vector with two elements. In this case, the first value is the timeout to connect with the server, and the second value is the time for the server to return the first byte. You can also set timeout to None, in which case API queries will never time out. This may be useful for some slow connections, but is generally not recommended as your code may hang indefinitely if the server fails to respond.
- Inputs:
field (str) – The name of a STATSGO data field to query
timeout (scalar | vector) – The maximum number of seconds to connect with the ScienceBase server
- Outputs:
dict – ScienceBase item info as a JSON dict
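A minimal sketch that combines url and query, assuming pfdf is installed and the ScienceBase server is reachable. The summarize_item helper is hypothetical. Because the structure of the ScienceBase response is not described here, the sketch only lists the top-level keys of the returned dict rather than assuming specific entries.

```python
def summarize_item(field=None, timeout=60):
    """Return the item URL and the top-level keys of its ScienceBase metadata.

    With field=None, this summarizes the parent STATSGO collection item.
    Hypothetical helper for illustration only; not part of pfdf.
    """
    # Imported lazily so the helper can be defined without pfdf installed
    from pfdf.data.usgs import statsgo

    item = statsgo.query(field, timeout=timeout)
    return {
        "url": statsgo.url(field),
        # Sorted top-level keys of the JSON dict, with no assumed names
        "keys": sorted(item),
    }
```

Calling summarize_item() inspects the collection item, while summarize_item("THICK") inspects the item for a single field.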