R. Sleeman (1), J. Vila (2)

(1) Koninklijk Nederlands Meteorologisch Instituut (KNMI)
(2) Lab. Estudis Geofisics "Eduard Fontsere" (IEC) / University of Barcelona (UB)

Introduction

The Virtual European Broadband Seismograph Network (VEBSN) is the European virtual network of broadband seismograph stations, linked together through the EC projects MEREDIAN (EVR1-CT-2000-40007) and NERIES (FP6-contract no. 026130), for which (near) real-time waveform data is collected at the ORFEUS Data Center (ODC). The ODC continuously receives data over the Internet and processes it automatically with Antelope®. At present (December 2006) more than 140 stations contribute to the VEBSN. Event data from the VEBSN is available to users by ftp (orfeus.knmi.nl) and by http (http://www.orfeus-eu.org/).

The VEBSN automates central data collection and archiving, and rapidly provides high quality waveform data to the research community. The collection of high quality waveform data also provides new opportunities for rapid secondary products, such as hypocenter and magnitude calculations for medium to large earthquakes and focal mechanism determination.

The quality of the waveform data depends primarily on the quality of the seismic station: the site conditions and the quality of the sensor and digitizer. The quality can be measured in many ways, for example in terms of noise level (background, instrumental), non-seismic disturbances or timing quality. Well-known installation techniques such as thermal insulation of the sensor are used in many places to decrease the instrumental noise at lower frequencies and hence to improve the quality of the data in this sense. Data availability, data gaps and data latency usually depend on the reliability and quality of the communication channel. The application of robust communication protocols over the Internet (e.g. SeisComp/SeedLink) has substantially improved the quality of real-time data in terms of gaps and latency.

To sustain a system with high quality data (and its secondary products), the quality of the system must be monitored over time. A robust monitoring system enables the (automatic) detection and inspection of problems or changes in the system, with the aim of minimizing failures and their effects on the quality of the system. The central data collection system in the VEBSN also provides opportunities for monitoring the quality of the seismic data and the seismic stations in a homogeneous, uniform way. Quality monitoring of the VEBSN is a prerequisite for detecting changes in site conditions, instrumental problems, data communication failures and missing or erroneous metadata (i.e. the technical characteristics of the instrumentation).

The ODC is currently setting up and testing quality control (QC) procedures to monitor the data quality of the VEBSN stations. Figure 1 displays the present dataflow at the ODC, from the primary real-time data collection to the ODC archive of (quality controlled) event data. For many years quality control was focused on the end of the pipeline, where event data was bundled together. This step mainly involved visual inspection of the waveform data, verification of metadata such as response information, and checks of the internal consistency of the SEED volumes. The present development and implementation of new QC procedures is mainly focused on the beginning of the pipeline, where the data is collected in real time. As shown in the figure, these procedures work on the buffer of continuous data with the aim of detecting problems at an early stage, so that they can be reported rapidly to the network or station operator and the impact of technical problems is minimized.


Fig. 1. Data processing pipeline at the ODC (status: December 2006)

Quality Control

A near real-time quality control procedure is currently applied to the continuous incoming waveform data of the VEBSN stations, with the aim to monitor and detect errors or changes in behavior of the whole system. Specifically, the initial aim is to detect malfunctioning of the seismic instrumentation, changes of system response information in time, changes in local site conditions and incorrect system response information (metadata).

The present setup is built on 3 sub-processes to monitor (1) the overall quality of the data in time, (2) the overall quality of the data in a statistical way, and (3) the quality of real-time communication and availability of data. Procedures (1) and (2) are based on an estimation of the Power Spectral Density (PSD) of the ground acceleration taken from 30 minutes of data. The third procedure uses a database, managed by Antelope®, to monitor gaps in the data as a function of time.

Monitoring of PSD variations in time

The main goal of this procedure is to detect (abnormal) changes of the PSD in time, to monitor gaps in the data and to visualize the overall behavior of the station in time. Changes of the PSD may reflect for example a disconnected or broken sensor, but also changes in local site conditions or instrumentation. This procedure uses a Power Spectral Density (PSD) estimation of the ground acceleration, taken from data recordings of 30 minutes. Basically the following steps are involved:

  1. Extraction of raw data from the pool of continuous data of the VEBSN stations into segments of 30 minutes. Segments containing less than 28 minutes of data are not processed and are visualized as gaps in the plots below.
  2. For each segment the PSD is estimated using periodogram averaging (Welch, 1967). Only positive frequencies are taken into account (the so-called one-sided PSD) so that the PSD can be compared with the USGS New High Noise Model (NHNM) and New Low Noise Model (NLNM) (Peterson, 1993). PSD values are smoothed slightly by taking the average of PSD values in a constant relative bandwidth of 1/10 of a decade.
  3. The PSD is deconvolved with the instrument response to convert the PSD from digital counts [counts^2/Hz] into ground acceleration [m^2/s^4/Hz]. The instrument response is extracted from dataless SEED volumes. All poles and zeros in SEED blockettes 53 and the (overall) sensitivity in SEED blockette 58 are used in the deconvolution. Digital filters are not used.
  4. For selected frequencies [0.01 Hz, 0.05 Hz, 0.5 Hz and 2.0 Hz] the PSD is stored as function of time. (See Fig. 2, top frame).
  5. The PSD is integrated in the period bands [20 - 10 sec] and [10 - 1 sec] (0.05 - 0.1 Hz and 0.1 - 1 Hz) to get the energy in these frequency bands, also as function of time. (See Fig. 2, middle frame).
  6. The bottom envelope of the PSD is tracked and updated after each 30-minute segment of data has been processed. This envelope shows the lowest measured noise at each frequency. It could be misinterpreted when no sensor is connected to the digitizer, because then only the digitizer noise is measured. To eliminate this type of data we apply the following criterion: PSD values from a segment may only be used to update the minimum PSD level if the PSD at 0.14 Hz is above -155 dB.
  7. All of the above results may be biased by incorrect system response information. In order to detect inconsistencies in this information we display the lowest measured noise for three variations of step 3:
    • after applying full response correction (including all metadata for the sensor and the digitizer) – in green
    • after applying full response correction (including all metadata for the sensor with a recalculated normalization factor, and all metadata for the digitizer) – in blue
    • after full gain correction (gain only) – in red

The recalculated normalization factor in the second variation is applied only as an additional check to verify this value. All three variations should give the same result in the frequency band for which the system has a flat velocity response.
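Steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the ODC implementation: the function names, the Welch sub-segment length (`nperseg`), the Hann window and the example poles, zeros and sensitivity are all assumptions chosen for illustration. It further assumes that the poles and zeros describe a velocity response in rad/s and that the sensitivity is given in counts per m/s.

```python
import numpy as np
from scipy import signal


def acceleration_psd(counts, fs, zeros, poles, norm_factor, sensitivity,
                     nperseg=4096):
    """One-sided PSD of ground acceleration in m^2/s^4/Hz.

    Assumptions (not from the article): poles and zeros are in rad/s and
    describe a velocity response; sensitivity is in counts per m/s.
    """
    # Welch's method: average the periodograms of overlapping, windowed
    # sub-segments of the 30-minute record.
    freqs, psd_counts = signal.welch(
        counts, fs=fs, window="hann", nperseg=nperseg,
        noverlap=nperseg // 2, detrend="linear",
        return_onesided=True, scaling="density")
    freqs, psd_counts = freqs[1:], psd_counts[1:]  # drop f = 0

    # Analogue response from poles, zeros and normalization factor,
    # evaluated at the angular frequencies of the PSD estimate.
    _, h = signal.freqs_zpk(zeros, poles, norm_factor,
                            worN=2.0 * np.pi * freqs)
    resp_velocity = sensitivity * h                    # counts per m/s
    # Velocity = acceleration / (i * 2 * pi * f), so dividing by this
    # factor gives the response in counts per m/s^2.
    resp_acceleration = resp_velocity / (2j * np.pi * freqs)
    return freqs, psd_counts / np.abs(resp_acceleration) ** 2


def smooth_tenth_decade(freqs, psd):
    """Average the PSD in a constant relative bandwidth of 1/10 decade."""
    half_width = 10.0 ** 0.05  # half of 1/10 decade on either side
    smoothed = np.empty_like(psd)
    for i, f in enumerate(freqs):
        band = (freqs >= f / half_width) & (freqs <= f * half_width)
        smoothed[i] = psd[band].mean()
    return smoothed


# Synthetic example: 30 minutes of random counts at 20 samples per second.
rng = np.random.default_rng(0)
counts = rng.normal(scale=100.0, size=30 * 60 * 20)
# Hypothetical broadband velocity sensor: two zeros at the origin and
# one complex pole pair (illustrative values only).
zeros = [0.0, 0.0]
poles = [-0.037 + 0.037j, -0.037 - 0.037j]
freqs, psd = acceleration_psd(counts, 20.0, zeros, poles, 1.0, 6.0e8)
psd_db = 10.0 * np.log10(smooth_tenth_decade(freqs, psd))
```

Replacing the pole-zero response by the sensitivity alone (`h = 1`) gives a gain-only correction in the sense of the third variation of step 7; within the flat part of the velocity response both corrections should agree.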

Fig. 2. Typical example of PSD variations (in dB rel. to 1 m^2/s^4/Hz) at one station as a function of time. The top frame shows the PSD as function of time for selected frequencies (0.01 Hz, 0.05 Hz, 0.5 Hz and 2.0 Hz). Typical features are (a) samples with high PSD values representing seismic events, (b) gaps representing missing waveform data and (c) seasonal fluctuations over a large frequency band. The middle frame shows the energy in different frequency bands (0.1 – 1 Hz and 0.05 – 0.1 Hz). The bottom frame displays the lowest measured PSD at each frequency, which can be interpreted as a measure of the background noise.
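The quantities shown in the middle and bottom frames (steps 5 and 6) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the integration uses a simple trapezoidal rule, and the PSD at 0.14 Hz is obtained by linear interpolation, which the text does not specify. Only the 0.14 Hz check frequency and the -155 dB threshold are taken from the text.

```python
import numpy as np

CHECK_FREQ_HZ = 0.14       # frequency at which the PSD level is checked
MIN_LEVEL_DB = -155.0      # at or below this, the segment is not used


def band_energy(freqs, psd, f_low, f_high):
    """Integrate the PSD over [f_low, f_high] with the trapezoidal rule."""
    band = (freqs >= f_low) & (freqs <= f_high)
    f, p = freqs[band], psd[band]
    return float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(f)))


def update_min_envelope(envelope_db, freqs, psd_db):
    """Return the lowest-noise envelope after one more 30-minute segment.

    The segment is ignored when its PSD at 0.14 Hz is not above -155 dB,
    since it then probably contains digitizer noise only.
    """
    level = np.interp(CHECK_FREQ_HZ, freqs, psd_db)
    if level <= MIN_LEVEL_DB:
        return envelope_db
    if envelope_db is None:          # first accepted segment
        return np.array(psd_db, dtype=float)
    return np.minimum(envelope_db, psd_db)
```

Calling `update_min_envelope` once per segment, starting from `None`, keeps a running minimum per frequency, while `band_energy(freqs, psd, 0.05, 0.1)` and `band_energy(freqs, psd, 0.1, 1.0)` give one point each for the two energy curves.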

Statistical analysis of PSD

Other implementations related to quality control monitoring are being tested. One such implementation presents the overall quality of the data in a statistical way (McNamara and Buland, 2004). The implementation, developed by Richard Boaz for IRIS, shows the most common levels of background noise in a station, as well as the probability values for different levels due to other types of signals (earthquakes, gaps, calibrations, etc). This procedure is part of QUACK (Quality Analysis Control Kit), which is an initiative by IRIS DMC to develop a modular design toolkit for monitoring the quality of seismic data. An example is given in Fig. 3.

This procedure is useful in characterizing the performance of a station in a statistical way. It uses a PSD estimation of the ground acceleration, taken from data recordings of 60 minutes. The procedure takes the data as it is, so it does not check for gaps, earthquakes or clipped data. The data is deconvolved with the complete system response using 'evalresp', which includes digital FIR filters where appropriate. For each 1 hour segment the PSD is calculated and smoothed in 1/8 octave intervals. Powers for each 1/8 octave interval are accumulated in 1 dB power bins. A statistical analysis of the power bins gives probability density functions (PDF) as a function of noise power for each frequency band. As no screening of the data takes place, different types of signals (earthquake signals, gaps, spikes, mass re-centers or calibrations) are included in the processing, but they appear only with low probability values.
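The accumulation into 1 dB bins can be sketched as below. This is a simplified illustration and not the IRIS code: the class and function names are hypothetical, the PSD is simply averaged inside consecutive 1/8 octave bands (the original method may differ in its smoothing details), and the dB range of the histogram is an arbitrary choice.

```python
import numpy as np


def octave_band_levels(freqs, psd, f_min, f_max, fraction=8):
    """Mean PSD level in dB for consecutive 1/fraction-octave bands."""
    n_bands = int(np.floor(fraction * np.log2(f_max / f_min) + 1e-9))
    edges = f_min * 2.0 ** (np.arange(n_bands + 1) / fraction)
    centres = np.sqrt(edges[:-1] * edges[1:])
    levels = np.full(n_bands, np.nan)   # NaN marks bands without samples
    for i in range(n_bands):
        band = (freqs >= edges[i]) & (freqs < edges[i + 1])
        if band.any():
            levels[i] = 10.0 * np.log10(psd[band].mean())
    return centres, levels


class NoisePdf:
    """Histogram of PSD levels per frequency band, in 1 dB power bins."""

    def __init__(self, n_bands, db_min=-200, db_max=-50):
        self.db_min = db_min
        self.counts = np.zeros((n_bands, db_max - db_min), dtype=int)

    def add(self, levels_db):
        """Accumulate the band levels of one 1-hour segment."""
        for band, level in enumerate(levels_db):
            if np.isnan(level):
                continue
            k = int(np.floor(level - self.db_min))
            if 0 <= k < self.counts.shape[1]:
                self.counts[band, k] += 1

    def probabilities(self):
        """Fraction of segments falling in each dB bin, per band."""
        totals = self.counts.sum(axis=1, keepdims=True)
        return np.divide(self.counts, totals,
                         out=np.zeros(self.counts.shape),
                         where=totals > 0)
```

Because every segment is added without screening, rare signals such as earthquakes or calibrations end up in bins with low probability, while the most populated bins trace the typical background noise.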

Fig. 3. Typical example of the distribution of PSD estimates (in dB rel. to 1 m^2/s^4/Hz) at one station.

Data availability

Another initiative taken by ORFEUS Data Center (ODC) is the development of real-time monitoring tools on the Antelope® system. Antelope® collects and processes seismic data from the VEBSN stations in real-time. Tools for monitoring communication, data gaps and RMS values may be valuable to improve the quality of the data.
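The Antelope®-based tools are not described in detail here. Purely as an illustration of the bookkeeping behind gap monitoring, the sketch below computes how much of a 30-minute segment is covered by data and applies the 28-minute criterion used in the PSD procedure. It does not use any Antelope® interface; the function names and the representation of data as (start, end) time spans in seconds are assumptions.

```python
def segment_coverage(spans, seg_start, seg_len=1800.0):
    """Seconds of data inside [seg_start, seg_start + seg_len).

    `spans` is an iterable of (start, end) times in seconds; overlapping
    spans are counted only once.
    """
    seg_end = seg_start + seg_len
    clipped = sorted((max(s, seg_start), min(e, seg_end))
                     for s, e in spans
                     if e > seg_start and s < seg_end)
    covered, cursor = 0.0, seg_start
    for s, e in clipped:
        s = max(s, cursor)       # skip the part already counted
        if e > s:
            covered += e - s
            cursor = e
    return covered


def is_processable(spans, seg_start, seg_len=1800.0, min_data=1680.0):
    """True if the segment holds at least 28 minutes (1680 s) of data."""
    return segment_coverage(spans, seg_start, seg_len) >= min_data
```

For example, a segment with a single 100-second gap is still processed, whereas one with a 200-second gap would be shown as a gap in the plots.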

Examples

Figures 4 to 7 give examples of PSD variations in time for 4 different seismic stations. The examples are shown as they reflect the usefulness of monitoring the PSD in detecting different features.

Figure 4 displays the PSD for station LRW (Lerwick, UK). The top frame and middle frame reveal a distinct step in PSD and energy around day 258 for both low and high frequencies, indicating a change that influences the whole seismic system. On 9 Sept 2004 both the seismometer and the digitizer were replaced, resulting in a different system response and sensitivity.

Figure 5 shows the results for station ZST (Bratislava-Zelezna Studnicka, Slovak Republic). Between day 220 and 230 the PSD levels for the selected frequencies drop by about 20 dB to a more or less constant level. According to the network operator, this feature was caused by a failure in the analogue pre-amplifier of the system during this period. Data between day 220 and 230 were automatically rejected for the minimum PSD level, as the PSD values at 0.14 Hz were below -155 dB (see above). Note in the bottom frame the different PSD curves obtained by deconvolving the data with the full response (in blue) and with the gain only (in red). As expected the PSD levels differ beyond the corner frequencies of the sensor, which is indicative of the correctness of the response description.

Figure 6 shows the results for station KHC (Kasperské Hory, Czech Republic). For all frequencies the minimum of the seismic background noise is 10 dB (or more) above the NLNM. As KHC is known as a very quiet station in terms of seismic background noise this result indicates that the sensitivity in the metadata is probably incorrect, which was corroborated by the operator when it was detected.

Figure 7 is an example using data from station VLC (Villacollemandina, Italy). The PSD levels for low frequencies (0.01 Hz and 0.05 Hz) show a large step at around day 180, whereas the PSD levels for high frequencies do not. This step is also visible in the energy in the low frequency band, but not in the high frequency band. Thermal insulation of the sensor (STS-2) can explain this observation: the network operator confirmed that the sensor was insulated on 7 July. Another remarkable feature in the upper frame is that the PSD level for 2.0 Hz is split over 2 bands separated by about 20 dB. A closer inspection of the waveform data (see Fig. 8) reveals many intervals (several hours long) throughout the year with increased energy around 2 Hz and 8 Hz.

Figure 8 displays a part of the time series from station VLC filtered above 2 Hz which shows distinct transients with high amplitudes at a constant level. At this stage the reason for this phenomenon is unknown.

Improvements and future plans

Real-time analysis of the second level time series (PSD as function of time) is a powerful tool to rapidly detect changes in the behavior of the seismic recording system or to detect erroneous interpretation of data (caused by incorrect metadata). The challenge for the future is to implement these monitoring tools in the ODC processing pipeline (see Figure 1) so that alert messages can be disseminated to data providers and network operators when the seismic network or the communication malfunctions. In this way the real-time quality monitoring tools will help to ensure the quality of operation of the VEBSN and to optimize the availability of continuous data from this virtual network.

Other monitoring tools will be developed to (a) graphically display the full system response of each station in time, and (b) monitor the output of the Antelope® system in terms of seismic information (e.g. magnitudes, time residuals). For the current status of the QCM, see http://www.orfeus-eu.org/data-info/dataquality.htm.

Acknowledgements

We thank all VEBSN network operators for making data available, in particular Salvatore Mazza, Lars Ottemuller, Jan Zednik and Peter Labak. Thanks to Daniel McNamara, Richard Boaz and IRIS for providing the PDF analysis code.

References

McNamara, D.E. and R.P. Buland, Ambient noise levels in the continental United States, Bull. Seism. Soc. Am., 94, 4, 1517-1527, 2004.

Peterson, J., Observations and modeling of seismic background noise. U.S. Geol. Survey Open-File Report 93-322, 95 pp., 1993.

Welch, P.D., The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans. Audio and Electroacoust., AU-15, 70-73, 1967.
