Next: Tabular Data Files - Up: File Headers Previous: File Headers   Contents


Metadata Syntax

All header entries take the format
parameter = value
where ``parameter'' and ``value'' are strings of printable characters. The equals sign may be surrounded by space characters, and white space before the ``parameter'' token is stripped. White space is valid within the ``value'' string, which is terminated by the newline character or by the comment marker, `!'. Leading and trailing white space is removed from the ``value'' token. If the ``value'' token contains only white space, a single space character is returned as the token.
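
The rules above can be illustrated with a minimal Python sketch. The function name `parse_entry` is hypothetical and is not part of QSAS or QIE; treating a completely empty value the same as a white-space-only value is an assumption.

```python
def parse_entry(line, comment="!"):
    """Split one header line into (parameter, value), or return None."""
    # Everything from the comment marker onward is discarded.
    body = line.split(comment, 1)[0]
    if "=" not in body:
        return None  # empty line, comment line, or not an entry
    parameter, value = body.split("=", 1)
    parameter = parameter.strip()
    # Leading and trailing white space is removed from the value.
    value = value.strip()
    # A value holding only white space is returned as a single space.
    # (Assumption: an entirely empty value is treated the same way.)
    if value == "":
        value = " "
    return parameter, value
```

For example, `parse_entry("  File_type = t   ! tabular")` yields the pair `("File_type", "t")`, and a comment-only line yields `None`.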

Metadata is divided into global information that applies to the whole file and information that describes the variables and their formatting within a record.

Blocks of information describing the variables start with a line
Start_variable = name
and end with the line
End_variable = name
where the value ``name'' is the name to be used for the variable. The variable blocks are described in the section ``Metadata Describing Variables''. These blocks are required: a header file is not valid unless it contains one block describing each variable.

The file parameters below are optional for all supported file types. Extra formatting parameters needed for the specific file types are listed in the section dealing with that file type.

File_name
This specifies the name of the file. No default value is assumed. It should not include the path, but does include the file type extension. If this is a detached header it is the name and extension of the header. Good practice would suggest that the header and the data files to which it applies share the same file name stem up to the non-unique part of the name.
File_type
This specifies the format type of the file. A default value is taken according to the file name extension. Valid types are `t' for tabular data files and `d' for delimited, and correspond to the file name extensions `.qft' and `.qfd' respectively.
Attribute_delimiter
This specifies the character to be used in the header itself to separate each entry in a multi-entry variable attribute. If no entry is supplied a comma is assumed since space characters are frequently valid within attribute strings.
Record_numbering
This may take the values ``on'' or ``off''. The default is ``off''. If this parameter is set to ``on'' then the first entry in each record must be the sequence number of that record in the file. This option is primarily intended as an aid to visual inspection of data output.
Start_meta
This starts a block of metadata supplying the entries associated with a global attribute. The block is closed by an End_meta parameter. Global attribute blocks are described more fully below. The value associated with both parameters is the name of the global attribute. No defaults are provided, and no global attributes need be supplied.
Comment_marker
The value associated with this parameter is a character that starts a comment line in a data file or header. This is in addition to empty lines and lines starting with an exclamation mark, which are always ignored as comment lines. Any line that starts with the character specified in ``Comment_marker'' will be ignored, whether it is in the data records or the header. Unlike the exclamation mark, this character is recognised only at the start of a line.
&
The value associated with this parameter is the number of subsequent lines to ignore. It takes effect immediately and may be repeated within either data or header. It is provided to aid interfacing to packages that add their own header to data files, or to skip data blocks.
Start_after
The value associated with this parameter is a string token contained in the line preceding the start of the data records. For files with attached headers, all lines will be ignored after the ``Start_data'' line up to and including a line containing the ``Start_after'' token. For files with detached headers, all lines in the data file up to and including the line containing the ``Start_after'' token will be ignored. This is provided to aid interfacing to packages that add their own header to data files.
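
As an illustration of how ``Start_after'' and ``Comment_marker'' interact, here is a minimal Python sketch for the detached-header case. The function `data_lines` and its arguments are hypothetical names, not QSAS code, and the sketch assumes that comment characters are matched at the very first column of a line.

```python
def data_lines(lines, start_after=None, comment_marker=None):
    """Yield the data lines left after Start_after and comment filtering."""
    skipping = start_after is not None
    for line in lines:
        if skipping:
            # Ignore everything up to and including the line with the token.
            if start_after in line:
                skipping = False
            continue
        text = line.rstrip("\n")
        if text.strip() == "":
            continue  # empty lines are always ignored
        if text.startswith("!"):
            continue  # exclamation-mark comment lines are always ignored
        if comment_marker and text.startswith(comment_marker):
            continue  # user-defined comment lines
        yield text
```

With `start_after="BEGIN"` and `comment_marker="#"`, a foreign preamble ending in a line that contains `BEGIN` is dropped, and only the genuine records that follow are returned.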

Global attributes

A global attribute block starts with a line
Start_meta = name
and ends with the line
End_meta = name
where the value ``name'' is the name of the global attribute. The global metadata block contains the following entries:

Number_of_entries
The value associated with this parameter is the number of attribute entries provided for this attribute. It must be followed by the same number of ``Entry'' parameters within the attribute block.
Entry
The value associated with this entry is the next entry in the global attribute. Space characters are valid within a text entry. This parameter may be repeated, with each successive value providing the next entry in the attribute.
Data_type
The ``Data_type'' of a global attribute is used to convert the ASCII text in each ``Entry'' into the appropriate data type in the data structure. The default for each global attribute, independently, is text. This parameter changes the data type for each subsequent ``Entry'' until it is reset by another ``Data_type'' parameter or the end of the attribute block is reached.
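
The way ``Data_type'' applies to subsequent entries can be sketched in Python as follows. The function `read_global_attribute` is a hypothetical name, and the type names `float` and `double` are assumed here to be accepted for global attributes as they are for variables; any other value falls back to text.

```python
def read_global_attribute(pairs):
    """Collect the entries of one Start_meta/End_meta block.

    `pairs` holds the (parameter, value) tuples found between
    Start_meta and End_meta.
    """
    converters = {"float": float, "double": float}  # assumed type names
    convert = str      # the default Data_type is text
    expected = None
    entries = []
    for parameter, value in pairs:
        if parameter == "Number_of_entries":
            expected = int(value)
        elif parameter == "Data_type":
            # Applies to every later Entry until reset or end of block.
            convert = converters.get(value, str)
        elif parameter == "Entry":
            entries.append(convert(value))
    if expected is not None and expected != len(entries):
        raise ValueError(f"expected {expected} entries, found {len(entries)}")
    return entries
```

An entry given before any ``Data_type'' line stays a string, while entries after `Data_type = float` are converted to numbers.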

Metadata Describing Variables

The only essential blocks of metadata are those describing the variables. These give metadata and formatting information specific to the named variable. Variables must appear in the same order within a record that the variable entries occur within the header. Each block of variable metadata takes the form,
Start_variable = name
parameter = value
...
End_variable = name
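
Because variables must appear in a record in the same order as their blocks appear in the header, a reader needs to keep the blocks in sequence. The following Python sketch groups already-parsed (parameter, value) pairs into ordered variable blocks. The function `variable_blocks` is a hypothetical name, and raising an error on an unclosed or mismatched block is an assumption about how such headers might be handled.

```python
def variable_blocks(pairs):
    """Return [(name, {parameter: value}), ...] in header order."""
    blocks = []
    name, current = None, None
    for parameter, value in pairs:
        if parameter == "Start_variable":
            if name is not None:
                raise ValueError(f"block {name!r} was never closed")
            name, current = value, {}
        elif parameter == "End_variable":
            if value != name:
                raise ValueError(f"End_variable = {value} does not match {name}")
            blocks.append((name, current))
            name, current = None, None
        elif name is not None:
            current[parameter] = value
        # Entries outside any variable block are file-level and skipped here.
    if name is not None:
        raise ValueError(f"block {name!r} was never closed")
    return blocks
```

The position of each block in the returned list then gives the position of that variable within a data record.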

Essential parameters for each file type are listed in the section dealing with that file type. The following parameters are essential for all file types:

Data_type
This identifies the data type and is necessary for conversion from the ASCII entry or for the XDR conversion from binary. Allowed values are
epoch
float
double
char
byte

Sizes
This is essential for any variable that has more than one element, such as arrays and vectors. The value string must comprise as many (`Attribute_delimiter' separated) integer values as there are dimensions in the variable (in the CDF sense), each number giving the size of the array in that dimension. Thus an 8 by 54 array would have the entry
Sizes = 8,54
It is not required for scalars.
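
As a small illustration, a ``Sizes'' value can be turned into a shape and an element count like this. The function `parse_sizes` is a hypothetical name, and the default delimiter is the comma assumed when no ``Attribute_delimiter'' is supplied.

```python
from math import prod

def parse_sizes(value, delimiter=","):
    """Return (shape, number of elements) for a Sizes value string."""
    # One integer per dimension, separated by the Attribute_delimiter.
    shape = tuple(int(part) for part in value.split(delimiter))
    return shape, prod(shape)
```

For the example above, `parse_sizes("8,54")` gives the shape `(8, 54)` and 432 elements per record.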

Time_format
In time series data the time variable must have this parameter to identify the time format used. At present there are two accepted values:

ISO
indicates that the time variable is in the standard ISO string format. Files output by QSAS and Qtran always have this Time_format.
FREE_TIME_FORMAT
indicates that the time variable in the file is held in a string which is not in ISO form but which can be parsed by QIE. The parsing information is passed in the parameter ``TIME_FORMAT_STRING'', augmented optionally by the pair of parameters ``All_records_format'' and ``All_records_time''. These are described briefly below and more completely toward the end of this document.

TIME_FORMAT_STRING
This parameter is mandatory for time variables with
Time_format = FREE_TIME_FORMAT
and describes the time field within the file in terms of key strings such as ``YYYY'' or ``MON''. For example, the ISO format could be written here as
TIME_FORMAT_STRING = YYYY-MO-DD HH:MI:SS.sss
A full list and description of key strings is given toward the end of this document. Note that all specifications of both parameters and values are CASE SENSITIVE and POSITION SENSITIVE, and unexpected results will be obtained if the specification does not match exactly that used within the data file. Note also that leading and trailing white spaces are stripped from the `TIME_FORMAT_STRING' but NOT from the data itself.
All_records_time
In the case of a `FREE_TIME_FORMAT' time format, this is an OPTIONAL parameter specifying one or more components of time that are the same for all records and not present in the data file itself. For example, for short segments of data some data archival systems do not repeat the year and date in each record but give them once in a header (or filename). In such cases an entry of, say,
All_records_time = 1984-JAN
supplies the year and month to be used for each record. The string ``1984-JAN'' is then appended by QIE to the entry in the data file to generate a complete time string. If this parameter is present, the following parameter `All_records_format' must also be present to specify its format.
All_records_format
This parameter specifies the format of the `All_records_time' string using the same syntax as the `TIME_FORMAT_STRING' (to which it is appended by QIE). For the example above a suitable entry would be
All_records_format = YYYY-MON
YY_offset
In the case of a `FREE_TIME_FORMAT' time format this parameter specifies a four character year to be added to year strings in two character format. It is ignored with four character year string formats. Thus, if the year 1996 is recorded in the file as `96', the YY_offset must be set to `1900'. No default is provided. Files with `YY' format time strings that cross century boundaries must also set the Year_increasing flag below.

Year_increasing
This parameter specifies that time really is monotonically increasing in the case of a `FREE_TIME_FORMAT' time format with year strings in two character format. It is ignored with four character year string formats. If set to `y', the `YY_offset' is automatically incremented by a century whenever the YY format year decreases in subsequent records. In monotonic YY format data this only happens when crossing a century boundary.
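
The interaction of `YY_offset' and `Year_increasing' can be sketched in Python as follows. The function `expand_years` is a hypothetical name, and the sketch only models the behaviour described above; it is not taken from the QIE source.

```python
def expand_years(two_digit_years, yy_offset, year_increasing=False):
    """Convert a sequence of two-digit years into four-digit years."""
    offset = yy_offset
    previous = None
    full_years = []
    for yy in two_digit_years:
        # With Year_increasing set, a drop in the YY value can only mean
        # that a century boundary was crossed, so the offset moves on.
        if year_increasing and previous is not None and yy < previous:
            offset += 100
        full_years.append(offset + yy)
        previous = yy
    return full_years
```

With an offset of 1900 and the flag set, the sequence 98, 99, 00, 01 becomes 1998, 1999, 2000, 2001. Without the flag, the last two years would come out as 1900 and 1901.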

Data
The CDF concept of a variable that is fixed for all records is supported for flat files. Data for these ``non-record-varying'' variables must be supplied within the header variable metadata segment, and no entry is then allowed in the data records. The presence of a parameter ``Data'' will be taken to indicate that this is a non-record-varying variable. The value(s) associated with this parameter are the data for that variable. Because the values are specified within the metadata segment, they are separated by the ``Attribute_delimiter''. Non-record-varying variables are particularly useful for label variables.

Desirable parameters are the QSAS subset of the CSDS attributes.

For time series data two further attributes are used by QSAS that do not appear in the CSDS standard:

Other variable attributes may be written within the variable block, and these will always be converted to text strings within the header.


Anthony Allen 2002-04-16