SEISAN: Multiplatform implementation of MINISEED/SEED
SEED (Standard for the Exchange of Earthquake Data) has been around for nearly 15 years and is the FDSN (International Federation of Digital Seismograph Networks) standard for storing and exchanging seismic data. At its introduction, there were plans to quickly make toolboxes for several computer platforms and programming languages so that it would be easy for everybody to read and write. Some software was written for C and Unix; however, it has taken a long time for SEED to spread to other platforms and languages, and SEED has mainly been used for data storage by the big data centers. Essentially, there is currently only one good SEED reading program, rdseed, which works on Unix and Linux. At the time of writing this article, a Java version of rdseed is being developed at IRIS, which will work on different platforms. It is called jrdseed and is available for testing. While some SEED utilities and libraries are available, developed by different agencies, little has been done to facilitate use of SEED under Windows. In this communication, we describe an implementation of MINISEED/SEED routines with the SEISAN processing software.
Considering the huge amount of SEED data and its easy availability, it is surprising that not much processing software has been developed to use SEED. Although SEED was constructed for long-term archiving and data exchange, there is no good reason for not using it for processing. However, the lack of generally available processing software has made it current practice to download SEED, convert it to some other format like SAC and then start processing. This seems very backwards. Here we have a format with all required information, fast to read due to compression and easy to adapt to direct access. So why fool around with many other formats? The answer seems to be that general reading software is scarce and that there is a general reluctance to start programming from scratch, particularly since SEED has got the reputation of being the most complex format for seismic data on earth (probably true, and it was also the authors' opinion). The mammoth SEED manual is enough to scare any potential programmer before even starting; it certainly is not bedside reading.
A reduced version of SEED without the header information is MINISEED. MINISEED is easier to deal with. However, the data are not complete since the instrument response is missing. Fortunately, MINISEED has gained wider acceptance on a lower level and much more software is available for MINISEED than for SEED. An example is the SeismicHandler processing software, which reads MINISEED data. For Java programmers, there is MINISEED reading software made by Lomax as implemented with SeisGram2K. The most important factor contributing to the wider acceptance is that nearly all commercial recorders now record directly in MINISEED, so a de facto standard has been established, not a small thing in the individualistic seismologist world. This was actually attempted before with SUDS, but it never caught on due to problems of agreeing on a unified Unix-Windows standard. Fortunately, SEED has no such problems.
SEISAN (Havskov and Ottemöller, 2000) is probably the only major processing system that works identically on several computer platforms (including Microsoft Windows). It has long been a wish to read MINISEED; however, for the reasons mentioned above, it never got under way. Fortunately for us, we got some free programming time (co-author C. R.) and gave him the challenge of making a MINISEED reading routine. The initial idea was to fish out the reading routines from rdseed and link them to Fortran; however, rdseed is a very integrated program and far from being a toolbox, so this proved impractical. Since the routines had to be written from scratch, it was done in Fortran 77, which makes integration with SEISAN (also written in Fortran) easier. Nevertheless, rdseed was an invaluable help in writing the programs.
It turned out that reading MINISEED was not such a terrible job, making us wonder why we did not do it a long time before. Since SEISAN is a processing system, it would be even more useful to also read SEED, which is almost like reading MINISEED if the main headers are skipped, so that was also implemented. This of course throws away the instrument response, which is in the SEED main headers. On the other hand, if we run a network, the instrument response does not change every day, so it is enough to have the corresponding response files. So the last thing implemented was that SEISAN can read the SEED response files EXACTLY as extracted with rdseed, without change of name. Finally, MINISEED writing has also been implemented in SEISAN, also with Fortran routines. All these routines are identical for Solaris, Linux and Windows and are written without any reference to SEISAN, so it should be possible to use them in other programs.
The SEED format offers a great many ways of writing seismic and other data, and rdseed is probably the only software that can deal with them all. We have no such aspirations; the main goal was to be able to read nearly all data written in SEED. The following describes what was done and the limitations of the software.
The reading routines we developed have the following capabilities:
The functions in the library were designed to be used in two steps. First, the file is summarized (by reading all headers in the file); that is, we identify the number of channels and the following information for each of them: station name, channel (component) name, start time, sample rate, number of samples, flag of timing quality, and start and end position in the file. This summary is built by checking the information in the fixed header of the Data Records. This means that the SEED headers are mostly ignored. When SEED headers are present, only the block size and the encoding format are taken from them.
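The summarizing step can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Fortran library described here. It assumes a fixed, known record length, big-endian headers, and the 48-byte fixed data header layout of the SEED manual; the names `summarize`, `sample_rate` and the keys of the summary dictionary are hypothetical. The timing flag is taken here from bit 7 of the data quality flags, which is an assumption about what the library reports.

```python
import struct
from collections import OrderedDict

# Fixed section of a SEED data record header: 48 bytes, big-endian assumed.
FIXED_HEADER = struct.Struct(">6scc5s2s3s2sHHBBBBHHhhBBBBiHH")


def sample_rate(factor, multiplier):
    """Combine the SEED sample rate factor and multiplier into samples/s."""
    if factor == 0 or multiplier == 0:
        return 0.0
    rate = float(factor) if factor > 0 else -1.0 / factor
    return rate * multiplier if multiplier > 0 else -rate / multiplier


def summarize(data, record_length=512):
    """Scan only the fixed headers and return one summary per channel."""
    channels = OrderedDict()
    for offset in range(0, len(data) - record_length + 1, record_length):
        (_seq, kind, _res, sta, _loc, cha, _net,
         year, day, hour, minute, sec, _unused, frac,
         nsamp, factor, mult, _act, _io, dq_flags,
         _nblk, _tcorr, _data_off, _blk_off) = FIXED_HEADER.unpack_from(data, offset)
        if kind not in (b"D", b"R", b"Q", b"M"):
            continue  # not a data record, e.g. a SEED control header
        key = (sta.decode("ascii").strip(), cha.decode("ascii").strip())
        entry = channels.get(key)
        if entry is None:
            # First record of this channel: remember where and when it starts.
            entry = channels[key] = {
                "start_time": (year, day, hour, minute, sec, frac),
                "sample_rate": sample_rate(factor, mult),
                "nsamples": 0,
                "time_questionable": False,
                "first_offset": offset,
                "last_offset": offset,
            }
        entry["nsamples"] += nsamp
        entry["time_questionable"] |= bool(dq_flags & 0x80)
        entry["last_offset"] = offset
    return channels
```

Only the first and last record positions are kept per channel, which mirrors the start and end positions mentioned above; no waveform data is decompressed at this stage.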
After summarizing the file, it is possible to read the waveforms for each channel. Since some files may cover periods of time that are too large for the buffer in the calling program, the library offers a way to read just an interval of the channel.
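The arithmetic behind reading an interval can be sketched as follows. This is an illustration in Python, not the library's interface: the function names are hypothetical, times are assumed to be given in seconds on a common time axis, and the channel is assumed to be continuous with a constant sample rate.

```python
import math


def sample_window(start_time, rate, nsamples, t0, t1):
    """Return (first_index, count) of the samples that fall within [t0, t1]."""
    first = max(0, math.ceil((t0 - start_time) * rate))
    last = min(nsamples - 1, math.floor((t1 - start_time) * rate))
    if last < first:
        return 0, 0  # the requested interval does not overlap the channel
    return first, last - first + 1


def buffered_reads(first, count, buffer_size):
    """Split a window into pieces that each fit the caller's buffer."""
    for pos in range(first, first + count, buffer_size):
        yield pos, min(buffer_size, first + count - pos)
```

For a channel starting at time 0 with 20 samples/s and 1000 samples, the window from 10 s to 20 s gives `sample_window(0, 20, 1000, 10, 20) == (200, 201)`, which a caller with a 100-sample buffer could fetch in three reads.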
Different SEED and Mini-SEED file writers generate files with different peculiarities. That often forced us to change the way the files are read, so that our program could work with all the different files tested. This probably covers most file types in use; however, there will probably be files we cannot read. In order to find problem files, a test version of some SEISAN programs with SEED reading capability was made available to SEISAN users, and this brought some problems to our attention. All files were tested on all 3 platforms. Here we report some of the testing. For details, see Canabrava (2004).
SEED files tested from Orfeus had more than one blockette 30. This means that different channels can have different encoding formats. All files tested from Orfeus were ok.
The files created by IRIS may also have different encoding formats for each channel. The files tested showed another difference in implementation: the record length of the Data Records may differ from the record length of the headers. The only such combination seen so far was 4096-byte blocks for headers and 512-byte blocks for Data Records. Our implementation would fail if different channels had different block sizes. However, we found no such files and all files tested were ok. It is worth noting that IRIS files use the 8th byte of data records as a continuation code, although the SEED manual says it should always be a space character (ASCII hex 20).
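A minimal sketch of stepping through such a file is given below, in Python rather than the library's Fortran. It assumes that every record carries its type code in the 7th byte, that control headers use one block size and Data Records another, and that both sizes are already known; the function name `data_record_offsets` is hypothetical. The 8th byte is deliberately not checked, so both a space and a continuation code are accepted.

```python
CONTROL_TYPES = b"VAST"  # volume, abbreviation, station and time span headers
DATA_TYPES = b"DRQM"     # data record quality indicators


def data_record_offsets(data, header_len=4096, data_len=512):
    """Return the byte offsets of all Data Records in a SEED volume."""
    offsets = []
    pos = 0
    while pos + 8 <= len(data):
        kind = data[pos + 6:pos + 7]  # 7th byte: record type code
        if kind in CONTROL_TYPES:
            pos += header_len         # skip a control header block
        elif kind in DATA_TYPES:
            offsets.append(pos)
            pos += data_len           # Data Records use the smaller block
        else:
            break                     # padding or unknown content: stop
    return offsets
```

Because a single `data_len` is used for the whole file, this sketch shares the limitation described above: it would not cope with channels that use different block sizes.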
Quanterra data logger
Quanterra files are multiplexed and required a different set of routines. The multiplexing is also not regular; sequences like BHE BHN BHE BHZ are common. This means that, when reading the waveform, we cannot take full advantage of direct access unless all the blocks are indexed in memory. Since we only store the start and end point of each channel, when we decompress the waveform we have to go through all blocks in between again, check the headers and discard the ones that do not belong to the channel.
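The trade-off between the two strategies can be sketched in Python. A multiplexed file is represented here simply as a list of channel names, one per block; both function names are hypothetical and neither is part of the library described.

```python
def build_index(channel_of_block):
    """Full index: remember every block number for every channel."""
    index = {}
    for n, channel in enumerate(channel_of_block):
        index.setdefault(channel, []).append(n)
    return index


def scan_between(channel_of_block, channel, first, last):
    """Start/end only: visit each block from first to last, keep matches."""
    kept = []
    visited = 0
    for n in range(first, last + 1):
        visited += 1  # every header in the range has to be checked
        if channel_of_block[n] == channel:
            kept.append(n)
    return kept, visited
```

For the block sequence BHE BHN BHE BHZ BHN BHE, the index gives the BHE blocks 0, 2 and 5 directly, while scanning from block 0 to block 5 finds the same three blocks only after checking all six headers. The index costs memory proportional to the number of blocks; storing only start and end costs extra header reads.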
We also tried files from Seedlink, Kinemetrics and Guralp SCREAM and found that they are all well-behaved MINISEED files.
Functions for writing Mini-SEED were developed in Fortran as well. The Fortran routines are in the same library as the functions for reading SEED files. They have the following characteristics:
The writing routines were also tested on all 3 platforms.
SEISAN, in its latest release, fully supports reading SEED, including the response files, so it should be very easy to download data from a data center and start processing right away. The ability to read a segment of a large SEED volume directly, or to browse through a big volume, should make processing the data easier still. However, due to the format's complexity, it will take some time to further test and debug the software. Hopefully the SEISAN implementation will promote more use of SEED, which it certainly deserves.
During programming, Reinoud Sleeman provided help in figuring out some of the obscure sides of the SEED format and provided valuable suggestions. He also made many valuable suggestions/corrections to this communication. We also appreciated discussions with Chad Trabant and have used his MSI tool to check miniseed files (http://www.iris.edu/chad/).
Canabrava, R. N. B. (2004). A toolbox for reading SEED and MiniSEED and writing MiniSEED. Norwegian National Seismic Network, Technical Report No. 18, Department of Earth Science, University of Bergen.
Havskov, J. and L. Ottemöller (2000). SEISAN earthquake analysis software. Seismological Research Letters, 70, 532-534.
IRIS Consortium. Standard for the Exchange of Earthquake Data – Reference Manual, 2nd Edition, February 1993.