Raster Properties Tutorial¶

This tutorial introduces the Raster class and examines routines to manage data values and spatial metadata.

Introduction¶

Raster datasets are fundamental to pfdf - many routines require rasters as input, and many produce new rasters as output. In brief, a raster dataset is a rectangular grid of data values. The individual values (often called pixels) are regularly spaced along the X and Y axes, and each axis may use its own spacing interval. A raster is usually associated with some spatial metadata, which locates the raster’s pixels in space. Some rasters will also have a NoData value - when this is the case, pixels equal to the NoData value represent missing data.

A raster’s spatial metadata consists of a coordinate reference system (CRS) and an affine transformation matrix (also known as the transform). The transform converts the data grid’s column indices to spatial coordinates, and the CRS specifies the location of these coordinates on the Earth’s surface. A transform defines a raster’s resolution and alignment (the location of pixel edges) and takes the form:

\[\begin{split} \begin{vmatrix} dx & 0 & \mathrm{left}\\ 0 & dy & \mathrm{top} \end{vmatrix} \end{split}\]

Here dx and dy are the change in spatial coordinate when incrementing one column or row, and their absolute values define the raster’s resolution. Meanwhile, left and top indicate the spatial coordinates of the data grid’s left and top edges, which defines the raster’s alignment. The two remaining coefficients can be used to implement shear transforms, but pfdf only supports rectangular pixels, so these will always be 0 for our purposes.

In this tutorial, we’ll learn how to use Raster objects to manage data values and spatial metadata. Other routines are explored later in the Raster Factories and Preprocessing tutorials.

Prerequisites¶

Install pfdf¶

To run this tutorial, you must have installed pfdf 3+ with tutorial resources in your Jupyter kernel. The following line checks this is the case:

import check_installation

pfdf and tutorial resources are installed

Imports¶

We’ll next import the Raster class from pfdf. We’ll also use numpy to work with raster data grids.

from pfdf.raster import Raster
import numpy as np

Raster Object¶

We’ll start by using the from_file command to create a Raster object for an example dataset. (You can learn more about this command in the Raster Factories Tutorial. Here, we’ll specifically load the dNBR raster used in the main tutorial series:

raster = Raster('data/dnbr.tif')

Printing the object to the console, we find a summary of the data grid and spatial metadata:

print(raster)

Raster:
    Shape: (1280, 1587)
    Dtype: int16
    NoData: -32768
    CRS("NAD83 / UTM zone 11N")
    Transform(dx=10.0, dy=-10.0, left=408022.12009999994, top=3789055.5413000006, crs="NAD83 / UTM zone 11N")
    BoundingBox(left=408022.12009999994, bottom=3776255.5413000006, right=423892.12009999994, top=3789055.5413000006, crs="NAD83 / UTM zone 11N")

Data Grid¶

You can use the values property to return a Raster object’s data grid:

raster.values

array([[-32768, -32768, -32768, ..., -32768, -32768, -32768],
       [-32768, -32768, -32768, ..., -32768, -32768, -32768],
       [-32768, -32768, -32768, ..., -32768, -32768, -32768],
       ...,
       [-32768, -32768, -32768, ..., -32768, -32768, -32768],
       [-32768, -32768, -32768, ..., -32768, -32768, -32768],
       [-32768, -32768, -32768, ..., -32768, -32768, -32768]],
      shape=(1280, 1587), dtype=int16)

Raster objects represent their data grids as numpy arrays, so provide several properties determined by the array. For example, you can use the shape property to return the array shape (nrows x ncols), size to return the number of pixels. dtype to return the data type, and nbytes to return the memory consumed by the array. Users who prefer rasterio’s syntax can also use height and width to return the number of rows and columns, respectively:

print(f'shape = {raster.shape}')
print(f'height = {raster.height}')
print(f'width = {raster.width}')
print(f'size = {raster.size}')
print(f'dtype = {raster.dtype}')
print(f'nbytes = {raster.nbytes}')

shape = (1280, 1587)
height = 1280
width = 1587
size = 2031360
dtype = int16
nbytes = 4062720

The values property returns a read-only view of the Raster object’s data grid. Most routines will work as normal, but you’ll need to make a copy if you want to alter array elements directly:

# Most routines work as normal
median = np.median(raster.values)

# But this will fail because it attempts to alter array elements
try:
    rasters.values[0,:] = 0
except Exception:
    print('Failed because we attempted to change the array')

Failed because we attempted to change the array

# This is fine because we copied the array first
values = raster.values.copy()
values[0,:] = 0
print(values)

[[     0      0      0 ...      0      0      0]
 [-32768 -32768 -32768 ... -32768 -32768 -32768]
 [-32768 -32768 -32768 ... -32768 -32768 -32768]
 ...
 [-32768 -32768 -32768 ... -32768 -32768 -32768]
 [-32768 -32768 -32768 ... -32768 -32768 -32768]
 [-32768 -32768 -32768 ... -32768 -32768 -32768]]

NoData Values¶

You can use the nodata property to return a raster’s NoData value:

print(raster.nodata)

-32768

The nodata_mask property will return a boolean array indicating the locations of NoData values in the data grid. Here, True values indicate NoData pixels, and False values indicate data pixels. Inspecting the NoData mask for the example dataset, we can observe locations of NoData pixels along the data grid’s edges:

raster.nodata_mask

array([[ True,  True,  True, ...,  True,  True,  True],
       [ True,  True,  True, ...,  True,  True,  True],
       [ True,  True,  True, ...,  True,  True,  True],
       ...,
       [ True,  True,  True, ...,  True,  True,  True],
       [ True,  True,  True, ...,  True,  True,  True],
       [ True,  True,  True, ...,  True,  True,  True]],
      shape=(1280, 1587))

Alternatively, you can use the data_mask property to return the inverse mask, wherein True indicates data pixels and False is NoData:

raster.data_mask

array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]],
      shape=(1280, 1587))

These masks can be useful for manipulating and/or visualizing raster data values after processing.

CRS¶

Several other properties return a raster’s spatial metadata. The crs returns the raster’s coordinate reference system as a pyproj.CRS object, crs_units reports the CRS’s coordinate units along the X and Y axes, and utm_zone returns the CRS of the best UTM zone for the raster’s center point:

raster.crs

<Projected CRS: EPSG:26911>
Name: NAD83 / UTM zone 11N
Axis Info [cartesian]:
- [east]: Easting (metre)
- [north]: Northing (metre)
Area of Use:
- undefined
Coordinate Operation:
- name: UTM zone 11N
- method: Transverse Mercator
Datum: North American Datum 1983
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich

raster.crs_units

('metre', 'metre')

raster.utm_zone

<Projected CRS: EPSG:32611>
Name: WGS 84 / UTM zone 11N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 120°W and 114°W, northern hemisphere between equator and 84°N, onshore and offshore. Canada - Alberta; British Columbia (BC); Northwest Territories (NWT); Nunavut. Mexico. United States (USA).
- bounds: (-120.0, 0.0, -114.0, 84.0)
Coordinate Operation:
- name: UTM zone 11N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

Transform¶

You can use the transform property to return a raster’s Transform object. This object manages the affine transform, and you can learn more in the Spatial Metadata Tutorial:

raster.transform

Transform(dx=10.0, dy=-10.0, left=408022.12009999994, top=3789055.5413000006, crs="NAD83 / UTM zone 11N")

You can also use the resolution method to return the resolution along the X and Y axes, and pixel_area to return the area of a single pixel:

print(raster.resolution())
print(raster.pixel_area())

(10.0, 10.0)
100.0

By default, these commands return values in meters, but you can use the units option to select other units:

resolution = raster.resolution(units='feet')
area = raster.pixel_area(units='feet')
print(resolution)
print(area)

(32.808398950131235, 32.808398950131235)
1076.3910416709723

You can find a list of supported units here: Supported Units

Bounding Box¶

You can use the bounds property to return a raster’s BoundingBox object. This object manages the raster’s bounding box, and you can learn more in the Spatial Metadata Tutorial:

raster.bounds

BoundingBox(left=408022.12009999994, bottom=3776255.5413000006, right=423892.12009999994, top=3789055.5413000006, crs="NAD83 / UTM zone 11N")

You can also use the left, right, top, and bottom properties to return the coordinates of specific edges, and the center property to return the (X, Y) coordinate of the raster’s center point:

print(f'left = {raster.left}')
print(f'right = {raster.right}')
print(f'bottom = {raster.bottom}')
print(f'top = {raster.top}')
print(f'center = {raster.center}')

left = 408022.12009999994
right = 423892.12009999994
bottom = 3776255.5413000006
top = 3789055.5413000006
center = (415957.12009999994, 3782655.5413000006)

Conclusion¶

In this tutorial, we’ve introduced raster datasets, and learned how to use the Raster class to manage their data grids and spatial metadata. In the next tutorial, we’ll learn how to load and build Raster objects from a variety of different data sources.