Observatories and Research Facilities for EUropean Seismology
Volume 5, no 2 September 2003 Orfeus Newsletter


Near Real Time Data at NORSAR for CTBT Monitoring

J. Fyen and K. Iranpour

NORSAR, Instituttveien 25, 2007 Kjeller, Norway

Introduction - IMS Stations Operated by NORSAR - Data transmission
Continuous Data Format CD1.0 - Continuous Data Format CD1.1 - Intra Array Communications
NORSAR Implementation - NORSAR Continuous Data Disk Loop - At the NDC - Summary

Introduction

The Norwegian National Data Center (NDC) is responsible for the design, installation, operation and maintenance of NORSAR's field installations, the data recording and transmission, and the processing and analysis of data. NORSAR's field installations include seismic array stations, three-component seismic stations and radionuclide stations. An infrasound station is planned for installation in 2005. The NDC also maintains databases of digital seismic data from earthquakes and from nuclear and non-nuclear explosions, going back to around 1970.
NORSAR performs Norway's technical duties relating to the Comprehensive Nuclear-Test-Ban Treaty (CTBT). The NDC section at NORSAR is set up to construct, maintain and operate the six Norwegian stations of the International Monitoring System (IMS) established to verify compliance with the treaty. Figure 1 shows a map of IMS stations in the Nordic countries, including the six Norwegian stations. The treaty's provisions, as detailed in the IMS station operational manual, set out a number of requirements that must be met. These requirements deal mostly with data quality and with communication between IMS stations and the CTBTO in Vienna, in particular what the IMS operational manual defines as data timeliness, data availability and data reliability. The purpose of this report is to describe NORSAR's solution to these issues.


Figure 1. Nordic / Arctic IMS stations. Station codes signify the station type and its number in the IMS network. "PS" and "AS" denote primary and auxiliary seismic stations, "R" denotes radionuclide stations and "IS" denotes an infrasound station.

IMS Stations Operated by NORSAR

NORSAR operates three IMS seismic arrays. These are the large teleseismic NORSAR array (NOA PS27)* with a diameter of 60 km, the regional ARCES array (ARCES PS28) with a diameter of 3 km and the small Spitsbergen array (SPITS AS72) with a diameter of 1 km. The NOA array consists of 42 sites with a total of 63 instruments, organized in 7 subarrays. It is the largest array in the IMS network. The ARCES array has 25 sites with 36 instruments and the SPITS array has 9 sites with 12 instruments. The SPITS array represents the minimum requirements for the size of an IMS array. Figure 2 shows the approximate design of the three arrays and their relative size.


Figure 2. Schematic plot of Norway's IMS seismic arrays. On the left the ARCES array in Karasjok and the SPITS array on the island of Spitsbergen. On the right the NOA array with its group of seven subarrays near the town of Hamar. The ARCES and SPITS arrays are included for comparison.

Data transmission

Near real time data may take several paths from the provider* (station or NDC) to the consumer (the International Data Centre, IDC). Common to all of them is the first phase: the transmission of data from each individual array element (digitizer) to the Central Recording Facility (CRF). From there the data is either forwarded directly to the IDC, in what is termed the basic topology, or sent to the NDC and from there to the IDC, in what is termed the independent subnetwork. The latter is the implementation chosen for ARCES and SPITS. For NOA, the CRF is at the NDC.

Data from primary stations arriving at the IDC must be in the Continuous Data Format, either CD1.0 or the more recent version CD1.1. In addition, all IMS data destined for the IDC must be authenticated (signed). After January 2000, data must be signed at the digitizer.

Continuous Data Format CD1.0

The Continuous Data Format CD1.0 is based on straightforward TCP/IP program-to-program socket communication and is used to send binary formatted data (frames) from the provider to the IDC or, conversely, from the IDC to the provider. After the initial connection has been established between the sender application and the receiver at the IDC, the station sends a Station Identification Frame and receives from the other end the port designated for further data transmission. Second, a Data Format Frame is sent to the IDC, identifying the station channels to which the subsequent data belongs. Finally, a continuous stream of data frames is sent. Each frame consists of a header containing the nominal time of the channel data and a number of subframes. Each subframe contains the data for one channel (normally 10 seconds), the time, the number of samples in the subframe, some state of health data and authentication information (a signature). Table 1 shows how a CD1.0 frame is constructed. The status field in each subframe is used for additional state of health data such as power on/off, tampering switch, vault open/close etc. Canadian compression is applied to the samples. The "Alpha library", developed by Science Applications International Corporation (SAIC) for the prototype IDC (pIDC) in the early nineties, is an example of an application used for data exchange between the provider and the IDC. The use of the Alpha library is briefly discussed later in this report.

Data Frame Header (20 bytes)

Subframe   Length   Signature   Time stamp   # samples   Compressed data samples
1          4        40          8            4           10 seconds
2          4        40          8            4           10 seconds
3          4        40          8            4           10 seconds
4          4        40          8            4           10 seconds
...
N          4        40          8            4           10 seconds

Table 1. CD1.0 Data Frame. The number in each cell represents the length of the field in bytes.
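
The layout in Table 1 can be illustrated with a short Python sketch that splits a frame into its header and subframes. This is not the Alpha library and not a reference decoder. Table 1 gives only the field widths, so the byte order (big-endian), the meaning of the length field (taken here to count the bytes that follow it within the subframe) and the function name split_frame are all assumptions made for illustration. The header, the time stamp and the compressed samples are left undecoded.

```python
import struct

HEADER_LEN = 20      # data frame header, treated as opaque bytes here
SIGNATURE_LEN = 40
TIME_LEN = 8
FIXED_LEN = SIGNATURE_LEN + TIME_LEN + 4  # signature + time stamp + sample count


def split_frame(frame: bytes):
    """Split a frame laid out as in Table 1 into (header, list of subframes)."""
    if len(frame) < HEADER_LEN:
        raise ValueError("frame shorter than the 20-byte header")
    header = frame[:HEADER_LEN]
    subframes = []
    pos = HEADER_LEN
    while pos < len(frame):
        if pos + 4 > len(frame):
            raise ValueError("truncated length field")
        # ASSUMPTION: big-endian length that counts the bytes following it.
        (length,) = struct.unpack_from(">I", frame, pos)
        pos += 4
        body = frame[pos:pos + length]
        if len(body) != length or length < FIXED_LEN:
            raise ValueError("truncated or malformed subframe")
        # ASSUMPTION: the sample count is a big-endian unsigned 32-bit integer.
        (n_samples,) = struct.unpack_from(">I", body, SIGNATURE_LEN + TIME_LEN)
        subframes.append({
            "signature": body[:SIGNATURE_LEN],
            "time_stamp": body[SIGNATURE_LEN:SIGNATURE_LEN + TIME_LEN],  # raw bytes
            "n_samples": n_samples,
            "compressed": body[FIXED_LEN:],  # samples are left compressed
        })
        pos += length
    return header, subframes
```

Because each subframe carries its own length, a reader can step through a frame without decompressing any samples, which is the only property of the layout this sketch relies on.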

Continuous Data Format CD1.1

Continuous Data Format CD1.1 offers some improvements over its predecessor, CD1.0. In CD1.0, channel identification is done immediately after the connection is established, through a frame specially designed for that purpose; in CD1.1 it happens at the subframe level. Each subframe carries channel information, making it easier to discover errors in the data. Another aspect unique to CD1.1 is the "application acknowledgment" inherent in the design. If the communication protocol does not guarantee error free data transmission, the application acknowledgment level of CD1.1 compensates for that deficiency. A third addition in CD1.1 is the ability to issue commands, e.g. to generate a public key.

To date, a CD1.1 implementation is available at only two IMS stations. The remaining stations still use CD1.0.

SAIC's "Public Software Bundle" library offers a comprehensive solution to the whole problem of intra-station communication, storage and data transmission to the consumer, based on CD1.1 formatted data. The solution is built around the concept of Framestores. A Framestore is basically a set of directories and files for buffering CD1.1 formatted data. Through applications that form part of the public bundle, data can be retrieved and transported to a similar Framestore, or to larger Framestores formed by multiplexing single ones, on the receiving end (CRF/NDC/IDC).

Intra Array Communications

Within an array, each individual site (digitizer) communicates with the CRF using some error free protocol. A frame of information (packet) in some vendor specific format is sent to the CRF by an asynchronous, synchronous, UDP or TCP/IP protocol. Data is then converted to CD format and authenticated (signed). The individual site then sends the signature and status information to the CRF separately from the sample data. At the CRF the CD1 subframes are recreated and, together with the corresponding signatures, form a 10 second CD1 frame, which is then sent to the NDC or the IDC. Current systems that use CD1.1 send signed data from the digitizer/authenticator to the CRF in CD1.1 subframes.

NORSAR Implementation

The arrays operated by NORSAR were built before 1 January 2000 and thus escaped the requirement, imposed later, that data be signed at the individual sites rather than centrally. The data arrives at the CRF in some vendor specific format. At PS27 this is Science Horizon's AIM24 single second packets of compressed data. The transmission protocol is synchronous SDLC (ADCCP). Each frame is synchronized to the start of a second. A Communication Interface Module (CIM) connects to the digitizers using an RS422 interface, buffers the data and delivers one second data frames on the SCSI interface connected to a Sun Solaris workstation.

At PS28, Nanometrics HRD24 digitizers pack 17 bytes of compressed data into one frame and transmit it using asynchronous communication to RM4 multiplexers at the center. The RM4 multiplexers are connected to the central Sun Solaris workstation over the local Ethernet based network. The RM4 acts as a server that can deliver 15 times 17 bytes data frames using the UDP protocol.

In both cases the NORSAR applications collect the frames from the AIM24 or the RM4 and store the data in a circular disk buffer. Another application then reads the disk buffer, reformats the data into CD1 subframes and records the data in a NORSAR style, time indexed disk loop. A third application, keeping track of the last transmitted CD1 frame, then sends the newly formed frames to the NDC. At the NDC a receiving application takes the CD1 frames and writes them to a corresponding CD1 indexed diskloop. The concept of NORSAR diskloops is discussed later in this document. Figure 3 and Figure 4 are schematic illustrations of the various steps in NORSAR's array data communication.


Figure 3. Schematic plot of NORSAR data acquisition from individual sites to the CRF. Data packets are written to a frameloop, converted to CD format before being dispatched to the NDC.


Figure 4. From CRF to NDC. Socket communication used to transmit CD1 frames from one diskloop at CRF to corresponding diskloop at NDC over a VSAT link.

Arriving at the NDC, the CD1 frames are forwarded to the IDC using the Alpha library over the GCI link (at NORSAR this is a frame relay link; VSAT is more commonly used). AlphaRead reads samples from the disk loop and calls a subroutine of alphalib to create CD1 formatted data, which it writes to the heap file. AlphaSend empties the heap file by sending CD1 frames to the IDC in LIFO (last in, first out) sequence.
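
The LIFO draining order can be sketched in a few lines of Python. This is an in-memory illustration only, not the Alpha library: the real heap is a file on disk, and the class name FrameHeap and its methods are hypothetical. What it shows is the consequence of the ordering: after an interruption, the most recently written frames go out before the older backlog.

```python
class FrameHeap:
    """Hypothetical in-memory stand-in for the heap file."""

    def __init__(self):
        self._frames = []

    def push(self, frame):
        """Called by the reader side: add a newly built frame."""
        self._frames.append(frame)

    def drain(self, send):
        """Called by the sender side: send newest frames first, return the count."""
        sent = 0
        while self._frames:
            send(self._frames.pop())  # pop() removes the last frame pushed
            sent += 1
        return sent


# Usage: three frames queued during an outage leave in reverse order.
heap = FrameHeap()
for name in ("frame-1", "frame-2", "frame-3"):
    heap.push(name)
order = []
heap.drain(order.append)
# order is now ["frame-3", "frame-2", "frame-1"]
```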

NORSAR Continuous Data Disk Loop

The NORSAR diskloop is simply a set of files in a UNIX file system, one file for each hour of data the disk loop spans. Thus a weeklong diskloop has 168 files. Each file contains a fixed number of record slots, determined by the length of each frame: with frames of 10 seconds, each file contains 360 slots. Indexing into the diskloop for reading or writing is then a matter of simple arithmetic, given the time of the first sample of a record. This structure is independent of the format of the data.
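
The indexing arithmetic can be sketched as follows in Python. The figures (168 hourly files, 10 second frames, 360 slots per file) come from the text; how an hour is mapped to a particular file is not stated, so the choice of hours since the UNIX epoch modulo the number of files is an assumption, as are the function names.

```python
from datetime import datetime, timezone

SECONDS_PER_HOUR = 3600


def slots_per_file(frame_seconds=10):
    """Number of record slots in one hourly file."""
    return SECONDS_PER_HOUR // frame_seconds


def diskloop_position(first_sample_time, loop_hours=168, frame_seconds=10):
    """Return (file_index, slot_index) for a record starting at first_sample_time."""
    epoch_seconds = int(first_sample_time.timestamp())
    hours_since_epoch, seconds_into_hour = divmod(epoch_seconds, SECONDS_PER_HOUR)
    # ASSUMPTION: files are reused cyclically, keyed on hours since the epoch.
    file_index = hours_since_epoch % loop_hours
    slot_index = seconds_into_hour // frame_seconds
    return file_index, slot_index


# Usage: a record whose first sample is at 01:59:50 UTC on 1 January 1970
# falls in the last slot of the second hourly file.
position = diskloop_position(datetime(1970, 1, 1, 1, 59, 50, tzinfo=timezone.utc))
# position is (1, 359)
```

Because the position depends only on the record's start time, a writer and a reader can locate the same slot independently, without any shared index beyond the clock.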

At the NDC

On arrival at the NDC, the data, now in a NORSAR style diskloop, is converted to continuous CSS3.0 format. The CSS formatted data is written to a file system organized by the date of the data and the station of origin. This file system is the input to the automatic array processing tasks. It also serves as the input to the archiving process. A tape robot with a capacity of 30 terabytes reads the data from the CSS file system and writes an indexed copy to tape, from which the data can easily be retrieved if needed. NORSAR keeps a comprehensive archive of data dating back to 1971. Segmented data from that time is recorded on ½ inch magnetic tapes. Since September 1982, all continuous data has been recorded on tape. The archive grows by 2.5 gigabytes of data every day.

In addition to the Norwegian stations, NORSAR collects data from the Finnish IMS primary station FINES (PS17) over the internet and from the Swedish HAGFORS array (AS101) over a TDMA VSAT link. Data from AS101 is then forwarded to Stockholm over the internet. Figure 5 illustrates the various paths of data from different sources to the NDC and from there to its final destination at the IDC.


Figure 5. NORSAR independent subnetwork

Summary

Solutions to the challenges of data communication and data processing for seismic arrays have evolved over more than 30 years at NORSAR. The principal idea throughout these years has been to develop solutions that are simple and at the same time robust. The acquisition and processing tasks have been reduced to smaller and more manageable modules, each concerned with only one section of the entire process. The key link between the various modules is the time of the latest processed data. All the applications, both those communicating directly with the hardware and those managing the processing, analysis and storage tasks, are time sequential.

To ensure recovery of the system when a problem stops some of the subtasks, UNIX crontabs are used extensively. The time sequential concept inherent in all the subtasks allows the crontab processes to locate and recover from the error easily.
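
The role of the "time of latest processed data" in recovery can be illustrated with a small Python sketch. It is hypothetical and not taken from NORSAR's software: it assumes that a module remembers the start time, in epoch seconds, of the last frame it completed, and that a restarted module can ask which frame start times are still outstanding.

```python
def pending_frame_times(last_processed, latest_available, frame_seconds=10):
    """Start times (epoch seconds) of frames that still need processing.

    last_processed: start time of the last frame this module completed.
    latest_available: start time of the newest frame delivered upstream.
    """
    first_pending = last_processed + frame_seconds
    return list(range(first_pending, latest_available + 1, frame_seconds))


# Usage: a module restarted after missing three 10 second frames.
backlog = pending_frame_times(last_processed=100, latest_available=130)
# backlog is [110, 120, 130]
```

Under these assumptions a restarted module needs no knowledge of why it stopped; the stored time alone tells it where to resume.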





* The code is the station's designated code in the IMS station network. Seismic stations are assigned either PS (primary station) or AS (auxiliary station).
* These terms are used in the IDC manual.

Copyright © 2003. Orfeus. All rights reserved.