1 Boaz Consultancy, Valwigerberg, Germany
We present a new tool that allows users to evaluate seismic station performance and characteristics by providing quick and easy transitions between visualizations of the frequency and time domains. The software is based on the probability density functions (PDFs) of power spectral densities (PSDs) (McNamara and Buland, 2004) and builds on the original development of the seismological data viewer application PQL (IRIS-PASSCAL Quick Look written by RIBoaz). With PQLX, computed PSDs are stored in a MySQL database, allowing a user to access specific time periods of PSDs (PDF subsets) and time series segments through a GUI-driven interface. The power of the method and software lies in the fact that there is no need to screen the data for system transients, earthquakes or general data artifacts since they map into a background probability level. In fact, examination of artifacts related to station operation and episodic cultural noise allows us to estimate both the overall station quality and a baseline level of earth noise at each site.
The output of this analysis tool is useful for both operational and scientific applications. Operationally, it is useful for characterizing the current and past performance of existing broadband stations, for conducting tests on potential new seismic station locations, for evaluating station baseline noise levels (McNamara et al., 2007), for detecting problems with the recording system or sensors, and for evaluating the overall quality of data and meta-data. Scientifically, the tool allows for mining of PSDs for investigations on the evolution of seismic noise (Aster et al., 2007). Currently, PQLX is operational at several international organizations including the USGS National Earthquake Information Center (NEIC), the Albuquerque Seismological Laboratory (ASL), and the IRIS Data Management Center for station data monitoring and instrument response quality control. The PQLX system has recently been made available to the community at large. This article aims to describe and illustrate some of its features and capabilities.
Development of PQLX has been generously supported through major funding from the IRIS PASSCAL program, the United States Geological Survey, the IRIS Data Management System program, and the National Science Foundation, with additional minor funding provided by the Institut de Ciencies de la Terra 'Jaume Almera'.
The PQLX server analysis program analyzes all data files and stores these results in a user-specified database. The server can be executed either explicitly from a command line or automated via cron. The system auto-detects and handles the following seismic data formats: mini-SEED, AH, SEGY, SAC, DR100, and NANO. The system is scalable from individual small installations (such as a temporary array) to very large and permanent datasets (multiple networks containing > 8000 real-time channels).
For every data channel "discovered" by the server process, all header information (start time, length, sample rate, etc.) and the locations of gaps and overlaps (mini-SEED format only) are stored in the database. Additionally, if a channel is configured for PSD analysis, these results are saved as well. The calculation of the PSDs and the subsequent creation of the PDF plot follow the method laid out by McNamara and Buland (2004).
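The PSD-to-PDF step can be sketched as follows. This is a minimal illustration, not the PQLX implementation: it assumes each PSD has already been computed and reduced to one power value (in dB) per period bin, and it assumes 1 dB power bins. The function name `psd_pdf` and the input layout are hypothetical.

```python
import math
from collections import Counter


def psd_pdf(psds, db_bin=1.0):
    """Return one {power_bin_dB: probability} dict per period bin.

    psds is a list of PSDs; each PSD is a list of power values in dB,
    one per period bin, with all PSDs sharing the same period bins.
    """
    if not psds:
        return []
    n_periods = len(psds[0])
    counts = [Counter() for _ in range(n_periods)]
    for psd in psds:
        for i, power_db in enumerate(psd):
            # Label each power bin by its lower edge.
            counts[i][math.floor(power_db / db_bin) * db_bin] += 1
    total = len(psds)
    # Probability = PSDs falling in this power bin / all PSDs at this period.
    return [{b: n / total for b, n in c.items()} for c in counts]


example_psds = [
    [-120.2, -140.7],
    [-120.9, -139.1],
    [-118.5, -140.3],
]
example_pdf = psd_pdf(example_psds)
```

Because every PSD contributes to the histogram, infrequent transients end up in low-probability bins rather than having to be screened out beforehand.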
Deconvolution of the trace data for PSD analysis is done using the output of the SEED format processing program rdseed, used as input to evalresp(); deconvolution also includes any digital filters. In addition, various PSD parameters are configurable by the user: window length, maximum period bound, and minimum and maximum power bounds. By a simple addition to a database table, the user can define, in generic terms, which channels (following a specific naming convention, e.g., BH*) should be analyzed using a specific configuration. For example, a single database may define two different PSD configurations for BH channels and LH channels (among, possibly, many others). The system is currently delivered to analyze the following channel groups: LH, BH, BL, HG, HN, HL, BN, HH, SH, EH, and EP.
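The idea of mapping channel-name patterns to PSD configurations can be sketched as below. The table contents, parameter names, and values are hypothetical and chosen only for illustration; the actual PQLX database schema and defaults are not described here.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration table: channel pattern -> PSD parameters.
# The parameter names and values are illustrative, not PQLX defaults.
PSD_CONFIGS = [
    ("BH*", {"window_length_s": 3600, "max_period_s": 100}),
    ("LH*", {"window_length_s": 10800, "max_period_s": 1000}),
]


def config_for(channel):
    """Return the first configuration whose pattern matches the channel."""
    for pattern, params in PSD_CONFIGS:
        if fnmatchcase(channel, pattern):
            return params
    return None  # channel is not configured for PSD analysis
```

With this table, `config_for("BHZ")` returns the BH configuration, while a channel such as `VHZ` matches no pattern and is skipped.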
The server also allows for importation (via XML) of other types of data for use by the client GUI. This includes Seismic Event Information (such as an event catalog), as well as Meta-Data Information for each channel (such as name, location, latitude, longitude, instrument type, sensitivity, etc.).
Once all data has been analyzed and results placed into the database, the client GUI program, pqlx, can connect to the database for visualization and client query of the analysis results. It is within the GUI application that most uses and features of the PQLX system are found.
Client GUI Application - pqlx
Building upon the framework of PQL, the client-side GUI can connect to either a local database (held on the same machine) or a remote server, either LAN-based or WAN-based (including internet-wide). This data visualization application is responsible for displaying all graphics of the PQLX system, and is divided into three viewing systems: the Trace Viewer (the original PQL), the PDF Viewer, and the STN (station) Viewer. Each provides different viewing capabilities for seismological data and contributes individually and collectively to the task of data quality control.
The Trace Viewer allows for display of waveform data read in by physical file. Additionally, it provides for the magnification of traces, including multiple zooming options, spectral analysis of selected data, viewing of each of these data views in split screen mode (i.e., simultaneously), as well as display of all header values. (Use of the Trace Viewer does not require a connection to a PQLX database. Indeed, a stand-alone version of PQL is delivered with the PQLX system.) Since all viewers are contained in the same application, pqlx, the Trace Viewer can be invoked from both PDF and STN viewers.
Figure 1 shows a sample Trace Viewer Split Screen Tab displaying three complete traces, a magnified portion and the spectral transformation of this magnified portion.
After connecting to a database, the PDF Viewer allows for display of various types of PDFs based on PSD data previously computed and stored by the server. Either System PDFs held on the database, or PDFs based on user-provided date and time parameters (e.g., a PDF of all PSDs for the month of June over all years of data held) can be requested for display. The main display tab has nine panels and renders System PDFs in various combinations: by station (three different PDFs for a chosen channel group of a single station), by PDF (three different stations for a chosen channel group of a chosen PDF), by both (three different PDFs for three different channels), or as a list (a list of chosen channels of a chosen PDF).
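A request such as "all PSDs for the month of June over all years" amounts to filtering stored PSDs by their start time before building the PDF. The sketch below assumes an in-memory list of `(start_time, psd)` records; the function name `select_psds` and the record layout are hypothetical, and PQLX itself performs this selection against its database.

```python
from datetime import datetime


def select_psds(records, months=None, hours=None):
    """Keep records whose start time falls in the given months and hours.

    records: iterable of (start_time, psd) pairs.
    months, hours: optional sets of allowed month numbers and hours of day.
    """
    selected = []
    for start_time, psd in records:
        if months is not None and start_time.month not in months:
            continue
        if hours is not None and start_time.hour not in hours:
            continue
        selected.append((start_time, psd))
    return selected


records = [
    (datetime(2005, 6, 14, 3), [-120.0, -140.0]),
    (datetime(2006, 1, 2, 12), [-118.0, -139.0]),
    (datetime(2007, 6, 30, 23), [-121.0, -141.0]),
]
june = select_psds(records, months={6})  # June PSDs across all years
```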
Figure 2 shows the nine-panel PDF Viewer Main Tab with the following selections defined by the sidebar controls: all PSDs held on the database (System PDF All); channels BHZ and LHZ; for all stations in the database.
Clicking on any PDF on display in the Main Tab will take the PDF to the Detail Tab of the PDF Viewer for further analysis. This data view allows a user to select a specified portion of the PDF; this "sub-select" is defined as either a single point or a bounding box. Once specified, three additional views are displayed: a PDF of all PSDs intersecting the point or bounding box, a histogram displaying start and stop times of all PSDs (X-Axis = day of year, Y-Axis = hour of day), and a view of the first 15 PSD source traces. In addition, mouse-clicking on the lower left panel PDF makes this the new main Detail PDF for further sub-selection in the frequency domain. Mouse-clicking on the trace panel takes the user to the Trace Viewer for in-depth inspection of the PSD source traces in the time domain.
Figure 3 shows the PDF Detail Tab. The upper left panel displays System PDF All with a sub-select bounding box. The lower left panel displays the resultant PDF of PSDs intersecting the bounding box defined between periods of 9.3 and 25.8 seconds, and between -85 and -103 dB (large earthquakes). The lower right panel is a histogram displaying the start times of the intersecting PSDs. The upper right panel displays the source traces for the intersecting PSDs, here containing the large earthquakes.
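The bounding-box sub-select can be sketched as follows, using the box from Figure 3. This is an illustration under stated assumptions: each PSD is a list of `(period_s, power_db)` samples, and a PSD "intersects" the box when at least one of its samples lies inside it. The function names and record layout are hypothetical.

```python
from datetime import datetime


def intersects_box(psd, period_min, period_max, power_min, power_max):
    """True if any (period, power) sample of the PSD lies inside the box."""
    return any(
        period_min <= period <= period_max and power_min <= power <= power_max
        for period, power in psd
    )


def sub_select(records, period_min, period_max, power_min, power_max):
    """Return intersecting records and their (day of year, hour) points."""
    hits = [
        (start, psd)
        for start, psd in records
        if intersects_box(psd, period_min, period_max, power_min, power_max)
    ]
    # Coordinates for the start-time histogram (X = day of year, Y = hour).
    points = [(start.timetuple().tm_yday, start.hour) for start, _ in hits]
    return hits, points


records = [
    (datetime(2007, 3, 1, 5), [(5.0, -130.0), (12.0, -95.0), (30.0, -150.0)]),
    (datetime(2007, 3, 2, 6), [(5.0, -130.0), (12.0, -140.0), (30.0, -150.0)]),
]
hits, points = sub_select(records, 9.3, 25.8, -103.0, -85.0)
```

Only the first record has a sample inside the box (12 s at -95 dB), so it alone contributes to the resulting PDF, the histogram, and the list of source traces.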
The third display is the STN Viewer, organizing the display of trace data by station and channel. Fully configurable, it allows the user to specify which stations and channels should be viewed (traversed as a list), how many days of data should be rendered (between 1 and 60), how many lines to display per page, as well as whether the data should be drawn as actual data, or simply rendered as a horizontal line indicating that data exists. This last option allows the user to check for the existence of data (as well as gaps and overlaps) via a connection to a database while not having access to the waveforms themselves.
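Gaps and overlaps can be derived from header information alone (start time, length, sample rate), which is why their existence can be checked without access to the waveforms. The sketch below is a simplified illustration, not the PQLX algorithm: segments are `(start_s, n_samples, sample_rate_hz)` tuples for a single channel, and the half-sample tolerance is an assumption.

```python
def find_gaps_and_overlaps(segments, tolerance_samples=0.5):
    """Return (gaps, overlaps) as lists of (begin_s, end_s) intervals.

    segments: (start_s, n_samples, sample_rate_hz) tuples for one channel.
    """
    gaps, overlaps = [], []
    ordered = sorted(segments)
    for (start, n_samples, rate), (next_start, _, _) in zip(ordered, ordered[1:]):
        expected = start + n_samples / rate  # where the next segment should begin
        delta = next_start - expected
        tolerance = tolerance_samples / rate
        if delta > tolerance:
            gaps.append((expected, next_start))
        elif delta < -tolerance:
            overlaps.append((next_start, expected))
    return gaps, overlaps


segments = [
    (0.0, 400, 40.0),   # covers 0 s to 10 s
    (10.0, 400, 40.0),  # contiguous, covers 10 s to 20 s
    (25.0, 400, 40.0),  # starts 5 s late: a gap
    (33.0, 400, 40.0),  # starts 2 s early: an overlap
]
gaps, overlaps = find_gaps_and_overlaps(segments)
```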
Figure 4 shows the Main Tab of the STN Viewer displaying two stations of data, displayed channels filtered using "[BL]H*" (i.e., all BH and LH channels), for one week. Hovering over the channel label displays a pop-up providing channel Meta-Data information previously imported by the server (figure 4a). Hovering over the ' Stats ' text for each channel displays a pop-up providing statistical info such as start and stop dates, total number of traces, total gaps and overlaps, and maximum and minimum values (figure 4b). Hovering over the ' PDF ' text displays a pop-up displaying the PDF of the data on display and with an additional click takes the user to the PDF Viewer Detail Tab for investigation (figure 4c).
Quality Control Possibilities
Using PQLX, the sources of many problems encountered with seismological data are easily identified, for example: data gaps and overlaps, instrument-based anomalies, meta-data quality and local noise sources. Historical datasets can be assessed for overall quality, thus increasing the confidence of scientific results. Operators of real-time seismic networks can also benefit from the PQLX system as it allows near real-time analysis and response to data and telemetry problems.
Application at the USGS NEIC
The USGS NEIC receives, in real-time, over 4500 channels of seismic data from over 500 global seismic stations. For many contributed stations, calibration information is not well known. In addition, instrumentation upgrades or changes occur, making it difficult to maintain accurate and timely meta-data. The use of real-time seismic data requires automated QC tools to ensure the accuracy of NEIC real-time earthquake products. Currently, on a daily basis, the NEIC computes PSDs for all channels using the PQLX software system. In order to identify out-of-nominal noise conditions, such as instrument response changes or system transients, we visually compare the short-term PDFs against the long-term station noise baselines. This method is very sensitive to instrument response inaccuracies.
Figure 5 is an example using the USGS Global Seismographic Network (GSN) station in Lhasa, Tibet, operated by the Chinese Digital Seismograph Network (CDSN) (IC.LSA.00.BHZ). An incorrect instrument response has been applied to 2803 hours of data, during 2007:060 to 2007:122, in order to demonstrate the sensitivity of this method to a possible error in instrument response units. Using units of acceleration instead of velocity, expressed as an extra zero in the response file, results in a counter-clockwise rotation of the PDF. This is clearly observed as low power at short periods and high power at long periods, relative to the long-term station baseline model.
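The rotation can be understood with a short calculation. Assuming the extra zero sits at the origin, the response is multiplied by iω, so the deconvolved amplitude spectrum is divided by ω = 2π/T and the PSD shifts by -20·log10(ω) dB. The sketch below evaluates that shift; it illustrates the reasoning only, it is not taken from PQLX, and the function name is hypothetical.

```python
import math


def extra_zero_shift_db(period_s):
    """PSD shift in dB caused by one extra zero at the origin.

    Dividing the amplitude spectrum by omega divides power by omega**2,
    i.e. shifts the PSD by -20*log10(omega) dB, with omega = 2*pi/period.
    """
    omega = 2.0 * math.pi / period_s
    return -20.0 * math.log10(omega)


shift_short = extra_zero_shift_db(0.1)    # negative: power lowered
shift_long = extra_zero_shift_db(100.0)   # positive: power raised
```

Under this assumption the shift grows by 20 dB per decade of period and crosses zero at T = 2π seconds. On a plot of power against log period this appears as a tilt about that pivot, consistent with the low power at short periods and high power at long periods described above.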
Steps Toward Automation
In a system under development at the NEIC, hourly PSDs are compared to the long-term station baselines (McNamara et al., 2007). If the hourly PSD does not fall within the bounds of the station baseline, it is flagged as "out of nominal" and then compared to a set of station-specific noise source models. An effort has begun to characterize station-specific noise source models in an attempt to more precisely monitor a station's state of health. "Out of nominal" noise conditions currently monitored include: 1) calibration pulses, 2) the occurrence of missing data, 3) spikes, and 4) mass re-centers. Once these noise models are defined, the server will be able to compare analysis results, as they are created, with the noise models and, if a result falls within (or outside) the noise model bounds, will flag this portion of the trace as having breached the noise model in question. A message screen within the GUI will alert the user to the existence of this breach/flag.
Figure 6 is an example of a common excursion from the station baseline due to the re-centering of the Streckeisen STS2 seismometer at the ANSS backbone station US.LRAL. Using the PQLX software, an analyst can select groups of similar PSDs (Figure 6a), define a noise-source model by storing the characteristic maximum and minimum (Figure 6b), and also view the time series segments through the client interface in order to determine the source and characteristics of the noise excursion (Figure 6c). These steps allow the user to visually define the spectral characteristics of known system transients for comparison against the long-term station baselines. In the prototype automated system, when an hourly PSD falls outside of the long-term station baseline, it is then compared to the set of known noise source models for that channel. If a match of at least 75% occurs, a noise-source detection is declared, stored and ultimately compiled in a report for further investigation by the analyst, system developers and operations managers.
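The two-stage check described above can be sketched as follows. This is one possible reading, not the NEIC prototype: the baseline and each noise-source model are represented as per-period minimum and maximum envelopes, a PSD is "out of nominal" if any period bin leaves the baseline envelope, and a "match" is taken to be the fraction of period bins lying inside a model's envelope. All names are hypothetical.

```python
def outside_baseline(psd, lower, upper):
    """True if any period bin of the PSD leaves the baseline envelope."""
    return any(p < lo or p > hi for p, lo, hi in zip(psd, lower, upper))


def match_fraction(psd, model_min, model_max):
    """Fraction of period bins lying inside the noise-model envelope."""
    inside = sum(lo <= p <= hi for p, lo, hi in zip(psd, model_min, model_max))
    return inside / len(psd)


def detect_noise_source(psd, baseline, models, threshold=0.75):
    """Return the best-matching model name, or None if nothing is declared."""
    lower, upper = baseline
    if not outside_baseline(psd, lower, upper):
        return None  # nominal: no comparison against noise models needed
    best_name, best_score = None, 0.0
    for name, (model_min, model_max) in models.items():
        score = match_fraction(psd, model_min, model_max)
        if score >= threshold and score > best_score:
            best_name, best_score = name, score
    return best_name


baseline = ([-150.0] * 4, [-120.0] * 4)
models = {
    "mass_recenter": ([-105.0, -110.0, -115.0, -120.0],
                      [-95.0, -100.0, -105.0, -110.0]),
}
hourly_psd = [-100.0, -105.0, -110.0, -130.0]
detection = detect_noise_source(hourly_psd, baseline, models)
```

Here three of the four period bins fall inside the model envelope, which meets the 75% threshold, so the excursion is attributed to the hypothetical `mass_recenter` model.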
PQLX continues to be a work in progress. Additional forthcoming development includes:
Where To Get It
PQLX has been designed to be relevant for as wide an audience as possible, and to be implemented in the easiest possible manner. Fully open-source, easy to compile and implement, self-configuring based on the existence of data, it allows the network operator and seismologist to concentrate their time and energies on what's important (what the data is trying to tell them), instead of fighting with the computer (and wasting precious time).
Source code for PQLX is freely available for download via WWW from:
Also provided is a fully searchable website for bug reporting and enhancement requests; this can be found at http://wush.net/bugzilla/PQLX. For additional information, including the PQLX installation and operations manual and details on PDF interpretation and method, see: http://geohazards.cr.usgs.gov/staffweb/mcnamara/Software/PQLX.html.
If you would like to be added to the PQLX mailing list, and be automatically informed of availability of future updates, please email Richard Boaz at firstname.lastname@example.org.
Compilation and installation of the entire system happens with a single command and has been demonstrated to work under the following operating systems: Linux (multiple flavors, including 64-bit architecture), Mac OS X, and Solaris. Portability has been a primary concern; as such, porting to additional systems is likely possible with little effort.
Development Philosophy and History
The PQLX software system has been (and continues to be) developed based on open-source software and is itself open-source (licensed under the GNU GPL, version 2). The overall design provides an architectural framework that is intended to be expandable in the future for inclusion of additional functionality as needed, for ease of use (both for end-user and technical maintainer), and for ease of software maintenance. The system comprises both contributed software and original development.
New development in the current release includes: database design, server-side analysis program, client-side user interface, PDF image rendering program (to .png format), data extraction API shell scripts, among others. All original development is provided by Richard Boaz with additional software contributions provided by:
Aster, R., D. McNamara, and P. Bromirski, 2008, Multi-Decadal Climate-Induced Oceanic Microseism Variability, Seismo. Res. Lett., 79, 2.
McNamara, D.E., C.R. Hutt, R.P. Buland, L.S. Gee, H. Bolton, and H.M. Benz, 2008, A method to establish station specific seismic noise baselines for automated QC at the USGS National Earthquake Information Center, Seismo. Res. Lett., in review.
McNamara, D.E. and R.P. Buland, 2004, Ambient Noise Levels in the Continental United States, Bull. Seism. Soc. Am., 94, 4, 1517-1527.