Indexer
Also see this page describing Indexer Processing
The Indexer maintains an index of received products. It uses this index to
associate related products into events, based on eventid or time,
latitude, and longitude. When multiple sources submit information for the
same event, the indexer determines which source is considered preferred
for that type of information.
The Index
The index is typically a database, although it is not required to be. The
default implementation uses JDBC, and should be able to maintain an index
in any JDBC compliant database.
Archive Policies
Archive policies define rules for when the indexer should remove
information from its index.
Search
Enabled by default.
The indexer listens on a socket to allow external users to search and
retrieve information from The Index.
See the command line client --search option, or the SearchSocket API class.
Searches and results use an XML format. See etc/schema/indexer.xsd for
details.
Indexer Events
When a product arrives and is added to the index, the indexer keeps track
of the changes it makes. Each Indexer Event is a group of one or more
changes that were made in response to one product arriving.
This tracking is performed through an onEventTrigger database trigger. For
a technical description of this trigger and instructions for implementing
it on a MySQL database, see Configuring the Product Index to Use MySQL.
Change Types
- EVENT_ADDED
-
An event was added to the index. This occurs when a product arrives that
cannot associate to an existing event, but has enough information (time,
latitude, longitude) to create a new event.
- EVENT_SPLIT
-
An event that was part of another event in the index is now
considered a separate event. This usually occurs when a network
updates its location to be far enough away from the "parent" event.
There may be several EVENT_SPLIT changes, but there will always also
be an EVENT_UPDATED for the event that the split events were split
from.
- EVENT_UPDATED
-
An event that already existed in the index was updated. This occurs when
a product arrives and associates to an existing event. This does not
necessarily mean the preferred event properties (eventid, time,
latitude, longitude, magnitude, depth) have changed, only that
information associated to this event is different than before.
- EVENT_DELETED
-
An event that already existed in the index was deleted. This effectively
means the event did not occur. This occurs when a product arrives,
associates to an existing event, and because of the new information the
event no longer has a time, latitude, or longitude.
- EVENT_MERGED
-
An event that already existed in the index merged with another
event. This means this event still occurred, but is now part of
another event (and is not preferred).
There may be several EVENT_MERGED changes, but there will always
also be an EVENT_UPDATED for the event that the merged events were
merged into.
- EVENT_ARCHIVED
-
An event was removed from the index due to a configured archive policy.
The event still occurred, but is no longer being tracked by this indexer.
- PRODUCT_ADDED
-
A product arrived, was unable to associate to an event, and did not have
enough information (time, latitude, longitude) to create a new event.
- PRODUCT_UPDATED
-
An unassociated product was updated. If an update causes the product to
associate, there will be an EVENT_UPDATED change instead of
PRODUCT_UPDATED.
- PRODUCT_DELETED
- An unassociated product was deleted.
- PRODUCT_ARCHIVED
-
An unassociated product was removed from the index due to a configured
archive policy.
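The change types above can be modeled as a simple enumeration. The following
is a minimal Python sketch, not the PDL implementation: the class name
IndexerChangeType and the helpers affects_event and has_required_update are
hypothetical names introduced for illustration. Only the ten change type
names, and the rule that EVENT_SPLIT and EVENT_MERGED changes are always
accompanied by an EVENT_UPDATED, come from the list above.

```python
from enum import Enum


class IndexerChangeType(Enum):
    """Hypothetical enumeration of the change types listed above."""

    EVENT_ADDED = "EVENT_ADDED"
    EVENT_SPLIT = "EVENT_SPLIT"
    EVENT_UPDATED = "EVENT_UPDATED"
    EVENT_DELETED = "EVENT_DELETED"
    EVENT_MERGED = "EVENT_MERGED"
    EVENT_ARCHIVED = "EVENT_ARCHIVED"
    PRODUCT_ADDED = "PRODUCT_ADDED"
    PRODUCT_UPDATED = "PRODUCT_UPDATED"
    PRODUCT_DELETED = "PRODUCT_DELETED"
    PRODUCT_ARCHIVED = "PRODUCT_ARCHIVED"


def affects_event(change):
    """Return True for event-level changes, False for changes that
    concern an unassociated product."""
    return change.name.startswith("EVENT_")


def has_required_update(changes):
    """Check the documented rule for one Indexer Event (a group of changes):
    any EVENT_SPLIT or EVENT_MERGED must come with an EVENT_UPDATED."""
    kinds = set(changes)
    needs_update = kinds & {
        IndexerChangeType.EVENT_SPLIT,
        IndexerChangeType.EVENT_MERGED,
    }
    return not needs_update or IndexerChangeType.EVENT_UPDATED in kinds
```

A listener written along these lines could use affects_event to decide
whether a change concerns an event or an unassociated product before
acting on it.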
Example Indexer Configuration File
In this example, an indexer is configured to:
- Process "origin", "associate", "disassociate", "trump", and "trump-origin"
type products
-
Call a listener named "indexerlistener_example" whenever the indexer makes
a change triggered by an "origin" product; the listener runs the configured
command (here, "echo").
-
Automatically remove events after one month, unassociated products after
one week, and old versions of products after one hour
; note this configuration does not include senders,
; which would be required for sending products.
receivers = receiver_pdl
listeners = indexer
; receive from production hubs
[receiver_pdl]
type = gov.usgs.earthquake.distribution.EIDSNotificationReceiver
storageDirectory = data/receiver_storage
indexFile = data/receiver_index.db
serverHost = prod01-pdl01.cr.usgs.gov
serverPort = 39977
alternateServers = prod02-pdl01.cr.usgs.gov:39977
cleanupInterval = 900000
storageage = 900000
; indexer is only listener
; currently it only receives origin messages
[indexer]
type = gov.usgs.earthquake.indexer.Indexer
listenerIndexFile = data/indexer_listener_index.db
storageDirectory = data/indexer_product_storage
indexfile = data/indexer_product_index.db
includeTypes = origin, associate, disassociate, trump, trump-origin
listeners = indexerlistener_example
archivePolicy = policyOldEvents, policyOldProducts, policyOldProductVersions
[policyOldEvents]
; remove events after one month
type = gov.usgs.earthquake.indexer.ArchivePolicy
maxAge = 2592000000
[policyOldProducts]
; remove unassociated products after one week
type = gov.usgs.earthquake.indexer.ProductArchivePolicy
maxAge = 604800000
onlyUnassociated = true
[policyOldProductVersions]
; remove old versions of products after one hour
type = gov.usgs.earthquake.indexer.ProductArchivePolicy
maxAge = 3600000
onlySuperseded = true
; whenever the indexer makes a change, it calls this listener
; currently it only receives changes triggered by origin products
[indexerlistener_example]
type = gov.usgs.earthquake.indexer.ExternalIndexerListener
storageDirectory = data/indexerlistener_storage
command = echo
processPreferredOnly = true
includeTypes = origin
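The maxAge, cleanupInterval, and storageage values in the configuration
above are large enough that a unit mistake is easy to make. The comments in
the example imply that they are expressed in milliseconds. The following
Python sketch shows how such values can be derived from readable durations.
The helper name to_millis is hypothetical and is not part of PDL.

```python
from datetime import timedelta


def to_millis(duration):
    """Convert a timedelta to a whole number of milliseconds."""
    return duration // timedelta(milliseconds=1)


# Values used in the example configuration above.
print(to_millis(timedelta(days=30)))     # policyOldEvents maxAge
print(to_millis(timedelta(weeks=1)))     # policyOldProducts maxAge
print(to_millis(timedelta(hours=1)))     # policyOldProductVersions maxAge
print(to_millis(timedelta(minutes=15)))  # receiver cleanupInterval, storageage
```

Here "one month" is taken to be 30 days, which matches the value
2592000000 used in the example.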
Indexer Summarization
As an aid to indexing, the Indexer maintains a product summary for each
product, associating the summaries to seismic events using time, latitude
and longitude. Using these three attributes, the Indexer assigns an eventID
to the summaries, so that multiple products can be efficiently
cross-referenced to a single event.
As part of the summarization process, the Indexer extracts a specific
subset of properties from various products, so that important key aspects
of an event are visible without having to interrogate the details of
multiple products.
Summarized Properties
The following properties are extracted from products and are associated
with summarizations of events:
- region
-
The name of a particular geographic region. The Indexer first attempts to
obtain the region directly from the origin or geoserve products. Failing
that, it derives the region from the event's latitude and longitude. This
derivation is performed by the feplus feature of the Indexer, where
individual regions are defined by latitude/longitude within the
etc/config/regions.xml file.
- maxmmi
-
The maximum shaking intensity found in the shakemap product. The Indexer
reads maxmmi directly from the losspager product; if it is not available
from losspager, maxmmi is obtained from the dyfi product.
- alertlevel
-
A categorized fatality or economic loss level, obtained from the losspager
product:
- Green
-
0 fatalities OR less than 1 million U.S. dollars economic loss.
- Yellow
-
1-99 fatalities OR less than 100 million U.S. dollars economic loss.
- Orange
-
100-999 fatalities OR less than 1 billion U.S. dollars economic
loss.
- Red
-
1000+ fatalities OR greater than 1 billion U.S. dollars economic
loss.
- review_status
-
Whether this event has been reviewed by a human, obtained from the origin
product.
- event_type
-
The type of event, such as earthquake or landslide, obtained from the
origin product.
- azimuthal_gap
-
Azimuthal Gap is obtained from the origin product.
- magnitude
-
Magnitude is obtained from the origin product.
- num_Resp
-
The number of individuals completing the DYFI web dialogue for this
event, obtained from the nresponses attribute of the event_data.xml file
included in the dyfi product.
- tsunamiFlag
-
A [“true”|“false”] Boolean string indicating if
the tsunami flag should be triggered automatically, obtained from the
geoserve product.
- utcOffset
-
Number of minutes between the epicenter timezone and UTC, obtained from
the geoserve product.
- significance
-
An integer value indicating the significance of an event, calculated
from properties of the origin, losspager and dyfi products.
Significance is calculated from the following multi-step formula:
- magnitude_significance
-
= (100/6.5) * magnitude^2
- pager_significance
-
= 2000 if red, 1000 if orange, 500 if yellow
- dyfi_significance
- = MIN(num_Resp, 1000) * maxmmi * 0.10
- significance
-
= MAX(magnitude_significance, pager_significance) +
dyfi_significance
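The multi-step formula above can be written as a single function. This is a
minimal Python sketch for checking values by hand, not the Indexer's own
code. It reads the magnitude term as magnitude squared. The function and
parameter names are hypothetical, and two behaviors are assumptions not
stated above: an alert level other than red, orange, or yellow (including
green or a missing value) contributes 0, and the result is left unrounded
even though significance is described as an integer.

```python
# Pager contribution per alert level, as listed above.
PAGER_SIGNIFICANCE = {"red": 2000, "orange": 1000, "yellow": 500}


def significance(magnitude, alertlevel=None, num_resp=0, maxmmi=0.0):
    """Combine the magnitude, pager, and dyfi terms into one value."""
    magnitude_significance = (100 / 6.5) * magnitude ** 2

    # Assumption: green or missing alert levels contribute nothing.
    level = (alertlevel or "").lower()
    pager_significance = PAGER_SIGNIFICANCE.get(level, 0)

    # The number of DYFI responses is capped at 1000.
    dyfi_significance = min(num_resp, 1000) * maxmmi * 0.10

    return max(magnitude_significance, pager_significance) + dyfi_significance


if __name__ == "__main__":
    print(significance(6.5, "orange", num_resp=2000, maxmmi=7.0))
```

For a magnitude 6.5 event with an orange alert, 2000 DYFI responses, and a
maxmmi of 7, the magnitude term is about 650, the pager term is 1000, and
the dyfi term is about 700, giving a significance of about 1700.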
Product Summarized Preferred Weight
Within each type of product, the summary with the largest preferred weight
is considered preferred. This calculated weight is the sum of four
components:
- DEFAULT_PREFERRED_WEIGHT = 1
- All product summaries have a preferred weight of at least 1.
- SAME_SOURCE_WEIGHT = 5
- Weight added when product source is same as event source.
- AUTHORITATIVE_WEIGHT = 100
-
Weight added when product author is in the product's authoritative
region.
- AUTHORITATIVE_EVENT_WEIGHT = 50
- Weight added when product refers to an authoritative event.
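The sum described above can be sketched as follows. This is an illustrative
Python example rather than the Indexer's implementation: the four constant
names and values come from the list above, while the function name and its
boolean parameters are hypothetical stand-ins for the checks the Indexer
performs.

```python
DEFAULT_PREFERRED_WEIGHT = 1
SAME_SOURCE_WEIGHT = 5
AUTHORITATIVE_WEIGHT = 100
AUTHORITATIVE_EVENT_WEIGHT = 50


def preferred_weight(same_source, authoritative_region, authoritative_event):
    """Sum the weight components that apply to one product summary."""
    weight = DEFAULT_PREFERRED_WEIGHT
    if same_source:
        weight += SAME_SOURCE_WEIGHT
    if authoritative_region:
        weight += AUTHORITATIVE_WEIGHT
    if authoritative_event:
        weight += AUTHORITATIVE_EVENT_WEIGHT
    return weight


# Two hypothetical summaries of the same product type: the one with the
# largest weight is considered preferred.
candidates = {
    "summary_a": preferred_weight(True, False, False),
    "summary_b": preferred_weight(False, True, False),
}
preferred = max(candidates, key=candidates.get)
```

In this example summary_b is preferred, because the authoritative weight
outweighs the same-source weight.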
Indexer Components
Indexer SQL Dependencies
The Indexer depends on two SQL components: the feplus system and the
OnEventUpdate stored procedures:
- mysql_feplus
-
Found in the schema/mysql_feplus directory, feplus implements
region-identifying functionality based on latitude and longitude. It uses
the definitions in the etc/config/regions.xml file to associate a region
name with the latitude/longitude location of an event or product. The
OnEventUpdate stored procedures use this functionality for origin and
geoserve products, which ultimately determine properties such as event
significance.
- onEventTrigger Stored Procedures
-
Found in the schema/productIndexOnEventUpdateMysql.sql file, these
procedures summarize products and events for efficient retrieval. The
trigger is invoked when the Indexer's Java classes use
time/latitude/longitude information in products to create or modify
events.
Some Major Java Components
- JDBCProductIndex
-
This class implements the ProductIndex interface to maintain events,
product summaries, event summaries and properties. It contains and
executes the SQL manipulations of the database.
- Indexer
-
This key class uses JDBCProductIndex to maintain the database. It also
adds and removes listeners, receives products, and sends notifications.
It extends the DefaultNotificationListener class.
Indexer Modules
Specific products sometimes have special needs for indexing; the three
existing product types of this nature are the shakemap, dyfi, and
moment-tensor products. This special indexing is configured in config.ini,
as documented in the Indexer Components section of the configuration
documentation and illustrated below.
The following code snippet from config.ini shows the minimum entries
necessary for requesting special indexing for the shakemap and dyfi
products:
[indexer]
modules = indexer_module_shakemap, indexer_module_dyfi
[indexer_module_shakemap]
type = gov.usgs.earthquake.shakemap.ShakeMapIndexerModule
[indexer_module_dyfi]
type = gov.usgs.earthquake.dyfi.DYFIIndexerModule
[indexer_module_momenttensor]
type = gov.usgs.earthquake.momenttensor.MTIndexerModule
-
The modules = line lists the labels of the sections that define the
shakemap and dyfi modules.
-
The [indexer_module_shakemap], [indexer_module_dyfi], and
[indexer_module_momenttensor] lines mark the start of those definitions.
-
The three type = lines specify the Java code classes that will handle the
special indexing for those three product types.
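The relationship between the modules = line and the labeled sections can be
illustrated with a short script. This is a sketch only: it uses Python's
standard configparser as a stand-in for PDL's own configuration loader,
which may behave differently, and the function name resolve_modules is
hypothetical. It shows how each label listed in modules resolves to the
type declared in the section of the same name.

```python
import configparser

# The snippet from config.ini shown above.
CONFIG_TEXT = """
[indexer]
modules = indexer_module_shakemap, indexer_module_dyfi

[indexer_module_shakemap]
type = gov.usgs.earthquake.shakemap.ShakeMapIndexerModule

[indexer_module_dyfi]
type = gov.usgs.earthquake.dyfi.DYFIIndexerModule

[indexer_module_momenttensor]
type = gov.usgs.earthquake.momenttensor.MTIndexerModule
"""


def resolve_modules(config_text):
    """Map each label in the indexer's modules list to its configured type."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    labels = [label.strip() for label in parser["indexer"]["modules"].split(",")]
    return {label: parser[label]["type"] for label in labels}


if __name__ == "__main__":
    for label, class_name in resolve_modules(CONFIG_TEXT).items():
        print(label, "->", class_name)
```

In this sketch only the two labels listed in modules are resolved. The
[indexer_module_momenttensor] section is defined but not referenced by the
modules = line in the snippet.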
As noted elsewhere in this documentation, custom programming of these
special indexing classes requires coordination between the product
producer and the PDL web team at gs-haz_dev_team_group@usgs.gov.
- gov.usgs.earthquake.shakemap.ShakeMapIndexerModule
-
This class handles the special indexing for shakemap products.
- gov.usgs.earthquake.dyfi.DYFIIndexerModule
-
This class handles the special indexing for dyfi products.
- gov.usgs.earthquake.momenttensor.MTIndexerModule
- This class adjusts the weight of moment tensor products.