The Virtual European Broadband Seismograph Network (VEBSN) is the European virtual network of broadband seismograph stations, linked together through the EC projects MEREDIAN (EVR1-CT-2000-40007) and NERIES (FP6 contract no. 026130), for which (near) real-time waveform data is collected at the ORFEUS Data Center (ODC). The ODC receives data continuously over the Internet and processes it automatically with Antelope®. At present (December 2006) more than 140 stations contribute to the VEBSN. Event data from the VEBSN is available by ftp (orfeus.knmi.nl) and by http (http://www.orfeus-eu.org/).
The VEBSN automates central data collection and archiving, and rapidly provides high quality waveform data to the research community. The collection of high quality waveform data also provides new opportunities for rapid secondary products, such as hypocenter and magnitude calculations for medium to large earthquakes and focal mechanism determination.
The quality of the waveform data depends primarily on the quality of the seismic station: the site conditions and the quality of the sensor and digitizer. It can be measured in many ways, for example in terms of noise level (background, instrumental), non-seismic disturbances or timing quality. Well-known installation techniques such as thermal insulation of the sensor are used in many places to decrease the instrumental noise at lower frequencies and hence to improve the quality of the data in this sense. Data availability, data gaps and data latency usually depend on the reliability and quality of the communication channel. The application of robust communication protocols over the Internet (e.g. SeisComp/SeedLink) has substantially improved the quality of real-time data in terms of gaps and latency.
To sustain a system with high quality data (and its secondary products), the quality of the system must be monitored over time. A robust monitoring system enables the (automatic) detection and inspection of problems or changes in the system, with the aim of minimizing failures and their effects on the quality of the system. The central data collection system in the VEBSN also provides opportunities for monitoring the quality of the seismic data and the seismic stations in a homogeneous, uniform way. Quality monitoring of the VEBSN is a prerequisite for detecting changes in site conditions, instrumental problems, data communication failures and missing or erroneous metadata (i.e. the technical characteristics of the instrumentation).
The ODC is currently setting up and testing quality control (QC) procedures to monitor the data quality of the VEBSN stations. Figure 1 displays the present dataflow at the ODC, from the primary real-time data collection to the ODC archive of (quality controlled) event data. For many years quality control was focused on the end of the pipeline, where event data was bundled together. This step mainly involved visual inspection of the waveform data, verification of metadata such as response information, and checks of the internal consistency of the SEED volumes. The present development and implementation of new QC procedures is mainly focused on the beginning of the pipeline, where the data is collected in real time. As shown in the figure, these procedures work on the buffer of continuous data with the aim of detecting problems at an early stage, so that they can be reported rapidly to the network or station operator and the impact of technical problems is minimized.
A near real-time quality control procedure is currently applied to the continuous incoming waveform data of the VEBSN stations, with the aim to monitor and detect errors or changes in behavior of the whole system. Specifically, the initial aim is to detect malfunctioning of the seismic instrumentation, changes of system response information in time, changes in local site conditions and incorrect system response information (metadata).
The present setup is built on three sub-processes that monitor (1) the overall quality of the data in time, (2) the overall quality of the data in a statistical way, and (3) the quality of real-time communication and availability of data. Procedures (1) and (2) are based on an estimate of the Power Spectral Density (PSD) of the ground acceleration taken from 30 minutes of data. The third procedure uses a database, managed by Antelope®, to monitor gaps in the data as a function of time.
Monitoring of PSD variations in time
The main goal of this procedure is to detect (abnormal) changes of the PSD in time, to monitor gaps in the data and to visualize the overall behavior of the station in time. Changes of the PSD may reflect for example a disconnected or broken sensor, but also changes in local site conditions or instrumentation. This procedure uses a Power Spectral Density (PSD) estimation of the ground acceleration, taken from data recordings of 30 minutes. Basically the following steps are involved:
The recalculated normalization factor in step 2 is applied only as an additional check to verify this value. All 3 steps should give the same result in the frequency band for which the system has a flat velocity response.
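As an illustration of the kind of PSD estimate these procedures rely on, the following minimal Python sketch computes an averaged-periodogram PSD (in the sense of Welch, 1967) in dB. It is not the ODC implementation: the function name `welch_psd_db`, the Hann taper, the 50% overlap and the segment length are assumptions made for the example, and the input is assumed to be ground acceleration in m/s² with the instrument response already removed. A naive DFT from the standard library keeps the sketch self-contained; an operational system would use an FFT routine.

```python
import cmath
import math


def welch_psd_db(samples, fs, nperseg=64):
    """Averaged-periodogram PSD estimate in dB relative to 1 (unit)^2/Hz.

    samples : ground acceleration samples (response already removed)
    fs      : sampling rate in Hz
    nperseg : even segment length in samples (assumed value, for illustration)
    Returns (frequencies_hz, psd_db) for the one-sided spectrum without DC.
    """
    if len(samples) < nperseg:
        raise ValueError("record shorter than one segment")
    # Periodic Hann taper; its power normalizes the periodogram.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / nperseg)
              for n in range(nperseg)]
    window_power = sum(w * w for w in window)
    half = nperseg // 2
    summed = [0.0] * half
    n_segments = 0
    # Consecutive segments overlap by 50%.
    for start in range(0, len(samples) - nperseg + 1, half):
        segment = samples[start:start + nperseg]
        mean = sum(segment) / nperseg
        tapered = [(x - mean) * w for x, w in zip(segment, window)]
        for k in range(1, half + 1):
            coeff = sum(
                value * cmath.exp(-2j * math.pi * k * n / nperseg)
                for n, value in enumerate(tapered)
            )
            # Double every bin except Nyquist to fold in negative frequencies.
            factor = 1.0 if k == half else 2.0
            summed[k - 1] += factor * abs(coeff) ** 2 / (fs * window_power)
        n_segments += 1
    freqs = [k * fs / nperseg for k in range(1, half + 1)]
    # A tiny floor avoids log10(0) for perfectly silent bins.
    psd_db = [10 * math.log10(max(p / n_segments, 1e-300)) for p in summed]
    return freqs, psd_db
```

For a 30-minute record the same function would be called with the full sample list and a longer segment; the segment length sets the frequency resolution (fs / nperseg) and the number of averaged segments sets the variance of the estimate.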
Fig. 2. Typical example of PSD variations (in dB rel. to 1 m/s²) in one station as a function of time. The top frame shows the PSD as a function of time for selected frequencies (0.01 Hz, 0.05 Hz, 0.5 Hz and 2.0 Hz). Typical features are (a) samples with high PSD values representing seismic events, (b) gaps representing missing waveform data and (c) seasonal fluctuations over a large frequency band. The middle frame represents the energy in different frequency bands (0.1 – 1 Hz and 0.05 – 0.1 Hz). The bottom frame displays the lowest measured PSD at each frequency, which can be interpreted as a measure of the background noise.
Statistical analysis of PSD
Other implementations related to quality control monitoring are being tested. One such implementation presents the overall quality of the data in a statistical way (McNamara and Buland, 2004). The implementation, developed by Richard Boaz for IRIS, shows the most common levels of background noise in a station, as well as the probability values for different levels due to other types of signals (earthquakes, gaps, calibrations, etc). This procedure is part of QUACK (Quality Analysis Control Kit), which is an initiative by IRIS DMC to develop a modular design toolkit for monitoring the quality of seismic data. An example is given in Fig. 3.
This procedure is useful for characterizing the performance of a station in a statistical way. It uses a PSD estimate of the ground acceleration, taken from data recordings of 60 minutes. The procedure takes the data as it is, so it does not check for gaps, earthquakes or clipped data. The data is deconvolved with the complete system response using 'evalresp', including digital FIR filters where appropriate. For each 1-hour segment the PSD is calculated and smoothed in 1/8 octave intervals. Powers for each 1/8 octave interval are accumulated in 1 dB power bins. A statistical analysis of the power bins gives probability density functions (PDF) as a function of noise power for each frequency band. As no screening of the data takes place, different types of signals (earthquake signals, gaps, spikes, mass re-centers or calibrations) are included in the processing, but they are visible only at low probability values.
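The binning step described above can be sketched as follows. This is a minimal illustration and not the QUACK or IRIS code: the function names `accumulate_pdf` and `mode_db` and the input layout (one dictionary of smoothed power per 1-hour segment, keyed by band center frequency) are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def accumulate_pdf(psd_segments, bin_width_db=1.0):
    """Build an empirical PDF of noise power per frequency band.

    psd_segments : iterable of {center_frequency_hz: power_db}, one
                   dictionary per 1-hour segment (assumed layout)
    bin_width_db : width of the power bins, 1 dB as in the text
    Returns {frequency: {bin_lower_edge_db: probability}}.
    """
    counts = defaultdict(Counter)
    for segment in psd_segments:
        for freq, power_db in segment.items():
            # Lower edge of the power bin that contains this estimate.
            edge = math.floor(power_db / bin_width_db) * bin_width_db
            counts[freq][edge] += 1
    pdf = {}
    for freq, histogram in counts.items():
        total = sum(histogram.values())
        pdf[freq] = {edge: n / total for edge, n in sorted(histogram.items())}
    return pdf


def mode_db(pdf, freq):
    """Most probable power bin at one frequency (the usual noise level)."""
    return max(pdf[freq].items(), key=lambda item: item[1])[0]
```

Because nothing is screened out, an occasional earthquake or calibration pulse simply lands in a sparsely populated bin, while the bin holding the usual background noise carries the highest probability.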
Fig. 3. Typical example of distribution of PSD estimates (in dB rel. to 1 m/s²) in one station.
Another initiative taken by ORFEUS Data Center (ODC) is the development of real-time monitoring tools on the Antelope® system. Antelope® collects and processes seismic data from the VEBSN stations in real-time. Tools for monitoring communication, data gaps and RMS values may be valuable to improve the quality of the data.
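A gap monitor of this kind can be sketched in a few lines. The example below does not use the Antelope® database or any of its interfaces; it assumes, purely for illustration, that the start and end times of the received data segments are already available as pairs of seconds, and the names `find_gaps`, `availability` and `tolerance` are hypothetical.

```python
def find_gaps(segments, tolerance=0.5):
    """Return (gap_start, gap_end) pairs between data segments.

    segments  : iterable of (start, end) times in seconds
    tolerance : largest break, in seconds, still treated as continuous
                (assumed value)
    """
    ordered = sorted(segments)
    gaps = []
    if not ordered:
        return gaps
    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start - covered_until > tolerance:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    return gaps


def availability(segments, window_start, window_end):
    """Fraction of the window [window_start, window_end] covered by data."""
    covered = 0.0
    cursor = window_start
    for start, end in sorted(segments):
        # Clip to the window and skip anything already counted.
        start = max(start, cursor)
        end = min(end, window_end)
        if end > start:
            covered += end - start
            cursor = end
    return covered / (window_end - window_start)
```

Running both functions per station and per day gives a gap list for inspection and a single availability figure that can be tracked over time.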
Figures 4 to 7 give examples of PSD variations in time for 4 different seismic stations. The examples illustrate how monitoring the PSD helps to detect different kinds of features.
Figure 4 displays the PSD for station LRW (Lerwick, UK). The top frame and middle frame reveal a distinct step in PSD and energy around day 258 for both low and high frequencies, indicating a change that influences the whole seismic system. On 9 Sept 2004 both the seismometer and the digitizer were replaced, resulting in a different system response and sensitivity.
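A step of this kind could in principle be flagged automatically. The sketch below is one possible approach and not the procedure used at the ODC: it compares the median PSD level in a window before and after each sample and reports the center of the run where the difference exceeds a threshold. The function name `find_step`, the window length and the threshold are assumptions, and the sketch assumes at most one step in the series.

```python
from statistics import median


def find_step(series_db, window=10, threshold_db=6.0):
    """Locate a persistent level change in a PSD time series.

    series_db    : PSD values in dB at one frequency, one per time sample
    window       : number of samples in each comparison window (assumed)
    threshold_db : smallest median change reported as a step (assumed)
    Returns (index, jump_db) or None if no step is found.
    """
    candidates = []
    for i in range(window, len(series_db) - window + 1):
        before = median(series_db[i - window:i])
        after = median(series_db[i:i + window])
        jump = after - before
        if abs(jump) >= threshold_db:
            candidates.append((i, jump))
    if not candidates:
        return None
    # The medians change over a run of samples around the step;
    # the middle of that run is taken as the step position.
    return candidates[len(candidates) // 2]
```

Medians are used rather than means so that isolated high values, such as a single earthquake, do not trigger a detection. Applying the same test at a low and a high frequency distinguishes a change that affects the whole system from one confined to part of the band.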
Figure 5 shows the results for station ZST (Bratislava--Zelezna Studnicka, Slovak Republic). Between day 220 and 230 the PSD levels for the selected frequencies drop by about 20 dB to a more or less constant level. According to the network operator, this feature was caused by a failure in the analogue pre-amplifier of the system during this period. Data between day 220 and 230 were automatically rejected, as the PSD values at 0.14 Hz were below -155 dB (see above). Note in the bottom frame the different PSD curves obtained by deconvolving the data with the full response (in blue) and with the gain only (in red). As expected, the PSD levels differ beyond the corner frequencies of the sensor, which is indicative of the correctness of the response description.
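The rejection rule mentioned in this paragraph (PSD at 0.14 Hz below -155 dB) amounts to a single comparison. A minimal sketch follows, in which the function name `is_rejected` and the choice of the nearest available frequency sample are assumptions; only the 0.14 Hz check frequency and the -155 dB floor come from the text.

```python
def is_rejected(freqs_hz, psd_db, check_freq_hz=0.14, floor_db=-155.0):
    """Return True if the PSD near check_freq_hz falls below floor_db.

    freqs_hz : frequencies of the PSD estimate in Hz
    psd_db   : PSD values in dB at those frequencies
    """
    # Use the frequency sample closest to the check frequency.
    nearest = min(range(len(freqs_hz)),
                  key=lambda i: abs(freqs_hz[i] - check_freq_hz))
    return psd_db[nearest] < floor_db
```

A level this far below the usual noise at that frequency indicates that the recorded signal is not ground motion, as in the pre-amplifier failure described above.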
Figure 6 shows the results for station KHC (Kasperské Hory, Czech Republic). For all frequencies the minimum of the seismic background noise is 10 dB (or more) above the NLNM. As KHC is known to be a very quiet station in terms of seismic background noise, this result indicates that the sensitivity in the metadata is probably incorrect. The operator corroborated this once the discrepancy was detected.
Figure 7 is an example using data from station VLC (Villacollemandina, Italy). The PSD levels for low frequencies (0.01 Hz and 0.05 Hz) show a large step at around day 180, whereas the PSD levels for high frequencies do not show this step. This step is also visible in the energy in the low frequency band, but not in the high frequency band. Thermal insulation of the sensor (STS-2) can explain this observation. The network operator confirmed that the sensor was insulated on 7 July. Another remarkable feature in the upper frame is that the PSD level for 2.0 Hz is split over 2 bands separated by about 20 dB. A closer inspection of the waveform data (see Fig. 8) reveals many intervals over the year, each several hours long, with increased energy around 2 Hz and 8 Hz.
Figure 8 displays a part of the time series from station VLC filtered above 2 Hz which shows distinct transients with high amplitudes at a constant level. At this stage the reason for this phenomenon is unknown.
Improvements and future plans
Real-time analysis of the second level time series (PSD as a function of time) is a powerful tool to rapidly detect changes in the behavior of the seismic recording system or to detect erroneous interpretation of data (caused by incorrect metadata). The challenge for the future is to implement these monitoring tools in the ODC processing pipeline (see Figure 1) so as to disseminate alert messages to data providers and network operators when the seismic network or the communication malfunctions. In this way the real-time quality monitoring tools will help to ensure the quality of operation of the VEBSN as well as to optimize the availability of continuous data of this virtual network.
Other monitoring tools will be developed to (a) graphically display the full system response of each station in time, and (b) monitor the output of the Antelope® system in terms of seismic information (e.g. magnitudes, time residuals). For the current status of the quality control monitoring (QCM), see http://www.orfeus-eu.org/data-info/dataquality.htm.
We thank all VEBSN network operators for making data available, in particular Salvatore Mazza, Lars Ottemuller, Jan Zednik and Peter Labak. Thanks to Daniel McNamara, Richard Boaz and IRIS for providing the PDF analysis code.
McNamara, D.E. and R.P. Buland, Ambient noise levels in the continental United States, Bull. Seism. Soc. Am., 94, 4, 1517-1527, 2004.
Peterson, J., Observations and modeling of seismic background noise, U.S. Geol. Survey Open-File Report 93-322, 95 pp., 1993.
Welch, P.D., The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms, IEEE Trans. Audio and Electroacoust., AU-15, 70-73, 1967.