The gmrecords Program

See also

Be sure to review the discussion in the Initial Setup section of how the gmrecords command line interface makes use of “projects”.

You can use the gmrecords program to download, process, and generate products for ground-motion records from a given set of earthquakes. Each processing step is a subcommand and you can run only one subcommand at a time. You can use Python scripting to chain together multiple subcommands.
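As a minimal sketch of such scripting (the `build_cmd` helper and the event ID are hypothetical; the sketch assumes `gmrecords` is on your PATH):

```python
import shutil
import subprocess

def build_cmd(subcommand, event_id=None, label=None):
    """Assemble the argument list for one gmrecords call."""
    cmd = ["gmrecords"]
    if event_id:
        cmd += ["-e", event_id]
    if label:
        cmd += ["-l", label]
    cmd.append(subcommand)
    return cmd

# Chain the most common steps for one event (hypothetical event ID).
# Only execute the commands if gmrecords is actually installed.
for step in ["download", "assemble", "process_waveforms",
             "compute_station_metrics", "compute_waveform_metrics"]:
    cmd = build_cmd(step, event_id="nc73799091")
    if shutil.which("gmrecords"):
        subprocess.run(cmd, check=True)  # stop the chain if a step fails
```

Because each subcommand is a separate process, `check=True` makes the chain stop at the first failing step rather than running later steps on incomplete data.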

Use the -h command line argument to output the list of subcommands and their descriptions:

$ gmrecords -h
usage: gmrecords [-h] [-d | -q] [-v] [--log LOG] [-e EVENT_ID]
                 [--resume event_id] [-t TEXTFILE] [-l LABEL]
                 [-n NUM_PROCESSES] [-o] [--datadir DATADIR]
                 [--confdir CONFDIR]
                 <command> (<aliases>) ...

gmrecords is a program for retrieving and processing ground motion records, as
well as exporting commonly used station and waveform parameters for earthquake
hazard analysis.

options:
  -h, --help            show this help message and exit
  -d, --debug           Print all informational messages.
  -q, --quiet           Print only errors.
  -v, --version         Print program version.
  --log LOG             Path to log file; if provided, logging is directed to
                        this file.
  -e EVENT_ID, --eventid EVENT_ID
                        ComCat event ID. If None (default), all events in
                        project data directory will be used. To specify
                        multiple eventids, use a comma separated list like
                        `gmrecords -e "nc73799091, nc73774300" autoprocess`
  --resume event_id     Allows processing to start from a given event_id. This
                        can be useful when a list of event ids is provided
                        using the text flag. Then this option can be used to
                        resume processing at a specific event in the list.
  -t TEXTFILE, --textfile TEXTFILE
                        A CSV file without column headers. The columns can be
                        either: (1) a single column with ComCat event IDs, or
                        (2) six columns in which those columns are: event_id
                        (string, no spaces), time (any ISO standard for
                        date/time), latitude (float, decimal degrees),
                        longitude (float, decimal degrees), depth (float, km),
                        magnitude (float).
  -l LABEL, --label LABEL
                        Processing label (single word, no spaces) to attach to
                        processed files. Default label is 'default'.
  -n NUM_PROCESSES, --num-processes NUM_PROCESSES
                        Number of parallel processes to run over events.
  -o, --overwrite       Overwrite results if they exist.
  --datadir DATADIR     Path to data directory. Setting this disables the use
                        of projects. Setting this also requires --confdir to
                        be set.
  --confdir CONFDIR     Path to directory containing config files. Setting this
                        disables the use of projects. Setting this also
                        requires --datadir to be set.

Subcommands:
  <command> (<aliases>)
    assemble            Assemble raw data and organize it into an ASDF file.
    autoprocess         Chain together the most common processing subcommands.
    autoshakemap        Chain together subcommands to get shakemap ground
                        motion file.
    clean               Clean (i.e., remove) project data.
    compute_station_metrics (sm)
                        Compute station metrics.
    compute_waveform_metrics (wm)
                        Compute waveform metrics.
    download            Download data and organize it in the project data
                        directory.
    export_cosmos (cosmos)
                        Export COSMOS format files.
    export_failure_tables (ftables)
                        Export failure tables.
    export_gmpacket (gmpacket)
                        Export JSON ground motion packet files.
    export_metric_tables (mtables)
                        Export metric tables.
    export_provenance_tables (ptables)
                        Export provenance tables.
    generate_regression_plot (regression)
                        Generate multi-event "regression" plot.
    generate_report (report)
                        Generate summary report (latex required).
    generate_station_maps (maps)
                        Generate interactive station maps.
    import              Import data for an event into the project data
                        directory.
    init                Initialize the current directory as a gmprocess
                        project directory.
    process_waveforms (process)
                        Process waveform data.
    processing_steps    Print a summary of the currently available processing
                        steps.
    projects (proj)     Manage gmrecords projects.

Note that some of the subcommands with longer names have short aliases to make the command line calls more concise. Use the syntax gmrecords SUBCOMMAND -h to show the help information for a given subcommand.
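For example, the -t/--textfile option accepts a headerless CSV in either of the two forms described above. A one-row, six-column file might look like the following (the event ID and values are illustrative, approximating the 2019 M7.1 Ridgecrest earthquake):

```text
ci38457511,2019-07-06T03:19:53,35.770,-117.599,8.0,7.1
```

A single-column file would instead contain only ComCat event IDs, one per row.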

General subcommands

config

Note that `config` is not a valid subcommand; running it produces an error listing the valid choices:
$ gmrecords config -h
usage: gmrecords [-h] [-d | -q] [-v] [--log LOG] [-e EVENT_ID]
                 [--resume event_id] [-t TEXTFILE] [-l LABEL]
                 [-n NUM_PROCESSES] [-o] [--datadir DATADIR]
                 [--confdir CONFDIR]
                 <command> (<aliases>) ...
gmrecords: error: argument <command> (<aliases>): invalid choice: 'config' (choose from 'assemble', 'autoprocess', 'autoshakemap', 'clean', 'compute_station_metrics', 'sm', 'compute_waveform_metrics', 'wm', 'download', 'export_cosmos', 'cosmos', 'export_failure_tables', 'ftables', 'export_gmpacket', 'gmpacket', 'export_metric_tables', 'mtables', 'export_provenance_tables', 'ptables', 'generate_regression_plot', 'regression', 'generate_report', 'report', 'generate_station_maps', 'maps', 'import', 'init', 'process_waveforms', 'process', 'processing_steps', 'projects', 'proj')

clean

$ gmrecords clean -h
usage: gmrecords clean [-h] [-a] [--raw] [--workspace] [--report] [--export]
                       [--plot] [--html]

options:
  -h, --help   show this help message and exit
  -a, --all    Remove all project files except raw data.
  --raw        Remove all raw directories.
  --workspace  Remove all workspace files.
  --report     Remove all PDF reports.
  --export     Remove all exported tables (.csv and .xlsx).
  --plot       Remove plots (*.png, plots/*).
  --html       Remove html maps.

init

Create a configuration file for projects in the current directory.

$ gmrecords init -h
usage: gmrecords init [-h]

options:
  -h, --help  show this help message and exit

projects

Manage local directory or system-level projects. Use this subcommand to switch among projects and add, delete, list, and rename projects.

$ gmrecords projects -h
usage: gmrecords projects [-h] [-l] [-s <name>] [-c] [-d <name>]
                          [-r <old> <new>] [--set-conf <name> <path>]
                          [--set-data <name> <path>]

options:
  -h, --help            show this help message and exit
  -l, --list            List all configured gmrecords projects.
  -s <name>, --switch <name>
                        Switch from current project to <name>.
  -c, --create          Create a project and switch to it.
  -d <name>, --delete <name>
                        Delete existing project <name>.
  -r <old> <new>, --rename <old> <new>
                        Rename project <old> to <new>.
  --set-conf <name> <path>
                        Set the conf path to <path> for project <name>.
  --set-data <name> <path>
                        Set the data path to <path> for project <name>.

    In order to simplify the command line interface, the gmrecords command makes use of
    "projects". You can have many projects configured on your system, and a project can
    have data from many events. A project is essentially a way to encapsulate the
    configuration and data directories so that they do not need to be specified as
    command line arguments.

    `gmrecords` first checks the current directory for the presence of
    `./.gmprocess/projects.conf` (this is called a "local" or "directory" project); if
    that is not found then it looks for the presence of `~/.gmprocess/projects.conf`
    (this is called a "system" level project).

    Within the `projects.conf` file, the key `project` indicates the currently
    selected project. Multiple projects can be included in a `projects.conf` file.

    Each project name is stored as a key at the top level, which itself has the keys
    `data_path` and `conf_path`. The `data_path` points to the directory where data
    is stored, organized at the top level into directories named by event ID. The
    `conf_path` points to the directory that holds configuration options in YML files.
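Based on the description above, a `projects.conf` with two projects might look roughly like this (the project names, paths, and exact syntax are illustrative, not copied from a real file):

```text
project = local-test

[local-test]
conf_path = ./conf
data_path = ./data

[regional-study]
conf_path = ~/gmprocess_projects/regional/conf
data_path = ~/gmprocess_projects/regional/data
```

Here `project = local-test` selects the currently active project, and each named section supplies that project's `conf_path` and `data_path`.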

Data gathering subcommands

download

The download subcommand will fetch data for a given set of earthquakes from a variety of data centers. The data includes the earthquake rupture information (for example, magnitude, location, origin time) and the raw waveforms.

The easiest way to get data for events is by specifying USGS ComCat event IDs. These event IDs can be found by searching for events on the USGS Search Earthquake Catalog page. With gmrecords you can specify a single event ID or a list of event IDs in a text file. Also, you can run customized searches of the earthquake catalog in Python using libcomcat, ObsPy, or web services directly in your code.
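As a sketch of the web-service route (the helper functions below are hypothetical; they target the public ComCat FDSN event endpoint, which returns matching events as GeoJSON):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

COMCAT_QUERY = "https://earthquake.usgs.gov/fdsnws/event/1/query"

def comcat_search_url(**params):
    """Build a ComCat event-search URL; keyword names follow the FDSN
    event specification (starttime, endtime, minmagnitude, ...)."""
    params.setdefault("format", "geojson")
    return COMCAT_QUERY + "?" + urlencode(sorted(params.items()))

def event_ids(url):
    """Fetch the search results and pull out the ComCat event IDs."""
    with urlopen(url) as resp:
        catalog = json.load(resp)
    return [feature["id"] for feature in catalog["features"]]

url = comcat_search_url(starttime="2019-07-04", endtime="2019-07-07",
                        minmagnitude=6.0)
# ids = event_ids(url)  # requires network access
```

The resulting IDs can then be passed to `gmrecords -e` or written to a text file for the -t option.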

A subdirectory for each event will be created in the data directory of the project, with the name of the directory set to the event ID. Within each subdirectory, the event information will be placed in event.json and the raw waveforms in a raw subdirectory. If STREC is enabled to associate events with tectonic regimes, that information will be placed in strec.json.
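The resulting layout for a single downloaded event would look something like this (the event ID is illustrative):

```text
data/
└── nc73799091/
    ├── event.json   # rupture information (magnitude, location, origin time)
    ├── strec.json   # tectonic-regime information (only if STREC is enabled)
    └── raw/         # raw waveform files
```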

$ gmrecords download -h
usage: gmrecords download [-h]

options:
  -h, --help  show this help message and exit

assemble

The assemble command reads the files in each event subdirectory and creates a corresponding ASDF file workspace.h5 with the event information and raw waveforms. All subsequent commands only access the ASDF file.

$ gmrecords assemble -h
usage: gmrecords assemble [-h]

options:
  -h, --help  show this help message and exit

import

$ gmrecords import -h
usage: gmrecords import [-h] [-p PATH]

options:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path to file or directory containing data to import.

Processing subcommands

processing_steps

Print a summary of the currently available processing steps.

$ gmrecords processing_steps -h
usage: gmrecords processing_steps [-h] [-o {text,myst}] [-p PATH]

options:
  -h, --help            show this help message and exit
  -o {text,myst}, --output-type {text,myst}
                        Output format: plain text or MyST markdown.
  -p PATH, --path PATH  File path to save output. If unset, then uses standard
                        out.

    These are the processing steps that can be included in the `processing` section of
    the config file.

    The "myst" output type is useful for building docs.

process_waveforms

Perform processing steps on the raw waveforms, such as baseline correction, bandpass filtering, and trimming.

$ gmrecords process_waveforms -h
usage: gmrecords process_waveforms [-h] [-r]

options:
  -h, --help       show this help message and exit
  -r, --reprocess  Reprocess data using manually reviewed information.

compute_station_metrics

Compute station metrics, such as rupture distance and back azimuth to rupture.

$ gmrecords compute_station_metrics -h
usage: gmrecords compute_station_metrics [-h]

options:
  -h, --help  show this help message and exit

compute_waveform_metrics

Compute waveform metrics, such as PGA, PGV, pseudospectral acceleration, and Fourier amplitude spectra.

$ gmrecords compute_waveform_metrics -h
usage: gmrecords compute_waveform_metrics [-h]

options:
  -h, --help  show this help message and exit

autoshakemap

An alias for downloading and assembling data, processing waveforms, computing station metrics, computing waveform metrics, and exporting files for ShakeMap.

$ gmrecords autoshakemap -h
usage: gmrecords autoshakemap [-h] [-p PATH] [--skip-download] [-d]

options:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path to external data file or directory. If given,
                        then the download step is also skipped.
  --skip-download       Skip data download step.
  -d, --diagnostics     Include diagnostic outputs that are created after
                        ShakeMap data file is created.

autoprocess

An alias for downloading and assembling data, processing waveforms, computing station metrics, computing waveform metrics, and generating a report and station map.

$ gmrecords autoprocess -h
usage: gmrecords autoprocess [-h] [--no-download] [--no-assemble]
                             [--no-process] [--no-station_metrics]
                             [--no-waveform_metrics] [--no-report] [--no-maps]

options:
  -h, --help            show this help message and exit
  --no-download         Skip download subcommand.
  --no-assemble         Skip assemble subcommand.
  --no-process          Skip process_waveforms subcommand.
  --no-station_metrics  Skip compute_station_metrics subcommand.
  --no-waveform_metrics
                        Skip compute_waveform_metrics subcommand.
  --no-report           Skip generate_report subcommand.
  --no-maps             Skip generate_station_maps subcommand.

    This is a convenience function, but it also provides a mechanism to loop over
    events, calling each of the following subcommands in order:
      - download
      - assemble
      - process_waveforms
      - compute_station_metrics
      - compute_waveform_metrics
      - generate_report
      - generate_station_maps
    Individual subcommands can be turned off with the arguments to this subcommand.
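A minimal sketch of that loop's step selection (the function and names below are illustrative, mirroring the --no-* arguments):

```python
# The subcommands autoprocess chains together, in order.
AUTOPROCESS_STEPS = [
    "download",
    "assemble",
    "process_waveforms",
    "compute_station_metrics",
    "compute_waveform_metrics",
    "generate_report",
    "generate_station_maps",
]

def selected_steps(skip=()):
    """Return the subcommands autoprocess would run, in order, with any
    skipped steps (e.g. from --no-download) removed."""
    return [step for step in AUTOPROCESS_STEPS if step not in skip]
```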

Export subcommands

export_cosmos

$ gmrecords export_cosmos -h
usage: gmrecords export_cosmos [-h] [-f OUTPUT_FOLDER] [-s SEPARATE_CHANNELS]
                               [-p PROCESS_LEVEL] [--label LABEL]

options:
  -h, --help            show this help message and exit
  -f OUTPUT_FOLDER, --output-folder OUTPUT_FOLDER
                        Choose output folder for COSMOS files. Default is
                        existing event folder.
  -s SEPARATE_CHANNELS, --separate-channels SEPARATE_CHANNELS
                        Turn off concatenation of COSMOS text files. Default
                        is to concatenate all channels from a stream into one
                        file. Setting this flag will result in one file per
                        channel.
  -p PROCESS_LEVEL, --process-level PROCESS_LEVEL
                        Select the volume or processing level to output.
                        OPTIONS are [RAW,CONVERTED,PROCESSED] (V0,V1,V2 in
                        Cosmos parlance).
  --label LABEL         Specify the desired processing label.
                        Choosing -p RAW will automatically use the
                        'unprocessed' label. If there is only one label for
                        processed data, then that one will be automatically
                        chosen when -p PROCESSED is selected.

export_failure_tables

$ gmrecords export_failure_tables -h
usage: gmrecords export_failure_tables [-h] [--type {short,long,net}]
                                       [-f {excel,csv}] [-l]

options:
  -h, --help            show this help message and exit
  --type {short,long,net}
                        Output failure information, either in short form
                        ("short"), long form ("long"), or network form ("net").
                        short: Two column table, where the columns are
                        "failure reason" and "number of records". net: Three
                        column table where the columns are "network", "number
                        passed", and "number failed". long: Two column table,
                        where columns are "station ID" and "status" where
                        status is "passed" or "failed" (with reason).
  -f {excel,csv}, --output-format {excel,csv}
                        Output file format.
  -l, --log-status      Include failure information in INFO logging.

export_gmpacket

$ gmrecords export_gmpacket -h
usage: gmrecords export_gmpacket [-h]

options:
  -h, --help  show this help message and exit

export_metric_tables

$ gmrecords export_metric_tables -h
usage: gmrecords export_metric_tables [-h] [-f {excel,csv}]

options:
  -h, --help            show this help message and exit
  -f {excel,csv}, --output-format {excel,csv}
                        Output file format.

export_provenance_tables

$ gmrecords export_provenance_tables -h
usage: gmrecords export_provenance_tables [-h] [-f {excel,csv}]

options:
  -h, --help            show this help message and exit
  -f {excel,csv}, --output-format {excel,csv}
                        Output file format.

Diagnostic subcommands

generate_regression_plot

Important

You must run the export_metric_tables subcommand before running the generate_regression_plot subcommand.

$ gmrecords generate_regression_plot -h
usage: gmrecords generate_regression_plot [-h]

options:
  -h, --help  show this help message and exit

generate_report

$ gmrecords generate_report -h
usage: gmrecords generate_report [-h]

options:
  -h, --help  show this help message and exit

generate_station_maps

$ gmrecords generate_station_maps -h
usage: gmrecords generate_station_maps [-h]

options:
  -h, --help  show this help message and exit