E.3 Grammar Rules

The grammar rules describe the syntax that the parser generated by q2c will understand. How the grammar rules are included in a q2c input file is described above.

The grammar rules are divided into tokens of the following types:

Identifier (ID)
An identifier token is a sequence of letters, digits, and underscores (_). Identifiers are not case-sensitive.
String (STRING)
String tokens are initiated by a double-quote character (") and consist of all the characters between that double quote and the next double quote, which must be on the same line as the first. Within a string, a backslash can be used as a “literal escape”. The only reasons to use a literal escape are to include a double quote or a backslash within a string.
Special character
Any other character, except white space, constitutes a token in itself.
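
As an illustration of these token types, here is a sketch of how one fragment of grammar input would be split into tokens. The setting and variable names (limit, max_rows) are made up for the example.

     limit(n:max_rows,"%s>0");

     ID       limit
     special  (
     ID       n
     special  :
     ID       max_rows
     special  ,
     STRING   "%s>0"
     special  )
     special  ;

Because identifiers are not case-sensitive, writing LIMIT or Limit in place of limit would yield the same ID token.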

The syntax of the grammar rules is as follows:

     grammar-rules ::= ID : subcommands .
     subcommands ::= subcommand
                 ::= subcommands ; subcommand

The syntax begins with an ID or STRING token that gives the name of the procedure to be parsed. The rest of the syntax consists of subcommands separated by semicolons (;) and terminated with a full stop (.).
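
A minimal sketch of this outer structure, assuming a hypothetical procedure named sample with two made-up subcommands:

     sample:
       missing=miss:include/!exclude;
       labels=show:!yes/no.

Note that the semicolon separates subcommands rather than terminating them, so the last subcommand is followed directly by the full stop.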

     subcommand ::= sbc-options ID sbc-defn
     sbc-options ::=
                 ::= sbc-option
                 ::= sbc-options sbc-options
     sbc-option ::= *
                ::= +
                ::= ^
     sbc-defn ::= opt-prefix = specifiers
              ::= [ ID ] = array-sbc
              ::= opt-prefix = sbc-special-form
     opt-prefix ::=
                ::= ( ID )

Each subcommand can be prefixed with one or more option characters. An asterisk (*) is used to indicate the default subcommand; the keyword used for the default subcommand can be omitted in the PSPP syntax file. A plus sign (+) is used to indicate that a subcommand can appear more than once; if it is not present then that subcommand can appear no more than once. A caret (^) is used to indicate that a subcommand must appear at least once.

The subcommand name appears after the option characters.
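
The option characters might be combined as in the following fragment. The subcommand names are hypothetical and chosen only to show where the options go.

     *+variables=custom;
     ^tables=custom;
     missing=miss:include/!exclude;

Here variables is the default subcommand and may be given more than once, tables must appear at least once, and missing is optional and may appear at most once.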

There are three forms of subcommands. The first and most common form simply gives an equals sign (=) and a list of specifiers, which can each be set to a single setting. The second form declares an array, which is a set of flags that can be individually turned on by the user. There are also several special forms that do not take a list of specifiers.

Arrays require an additional ID argument. This is used as a prefix, prepended to the variable names constructed from the specifiers. The other forms also allow an optional prefix to be specified.

     array-sbc ::= alternatives
               ::= array-sbc , alternatives
     alternatives ::= ID
                  ::= alternatives | ID

An array subcommand is a set of Boolean values that can independently be turned on by the user, listed separated by commas (,). If a value has more than one name then these names are separated by pipes (|).
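
For instance, an array subcommand could be declared as follows. This is only a sketch: the subcommand name, the prefix st_, and the flag names are illustrative assumptions.

     statistics[st_]=mean,stddev|std,minimum|min,all,none;

This declares five flags. The second flag can be selected by the user as either stddev or std, and the third as either minimum or min. The prefix st_ is prepended to the variable names built from the flags.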

     specifiers ::= specifier
                ::= specifiers , specifier
     specifier ::= opt-id : settings
     opt-id ::=
            ::= ID

Ordinary subcommands (other than arrays and special forms) require a list of specifiers. Each specifier has an optional name and a list of settings. If the name is given then a correspondingly named variable will be used to store the user's choice of setting. If no name is given then there is no way to tell which setting the user picked; in this case the settings should probably have values attached.
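
The two cases can be sketched as follows, with hypothetical subcommand, specifier, and setting names:

     format=labels:!labels/nolabels,
            spaces:!single/double;
     range=:minimum(d:min),
           :maximum(d:max);

In format, each specifier is named (labels, spaces), so the user's choice of setting is stored in a correspondingly named variable. In range, both specifiers are unnamed (each begins directly with the colon), so each setting carries a value instead.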

     settings ::= setting
              ::= settings / setting
     setting ::= setting-options ID setting-value
     setting-options ::=
                     ::= *
                     ::= !
                     ::= * !

Individual settings are separated by forward slashes (/). Each setting can be as little as an ID token, but options and values can optionally be included. The * option means that, for this setting, the ID can be omitted. The ! option means that this setting is the default for its specifier.
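
A short sketch of the setting options, again with made-up names:

     sort=order:!avalue/dvalue/afreq/dfreq;
     scale=unit:*freq/percent;

In the first line the specifier order offers four settings, of which avalue is the default. In the second line the * marks freq as the setting whose ID may be omitted.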

     setting-value ::=
                   ::= ( setting-value-2 )
                   ::= setting-value-2
     setting-value-2 ::= setting-value-options setting-value-type : ID
                         setting-value-restriction
     setting-value-options ::=
                           ::= *
     setting-value-type ::= N
                        ::= D
     setting-value-restriction ::=
                               ::= , STRING

Settings may have values. If the value must be enclosed in parentheses in the user's syntax, then enclose the value declaration in parentheses. Declare the value type as n or d (written N and D in the grammar above; identifiers are not case-sensitive) for integer or floating-point type, respectively. The given ID is used to construct a variable name. If option * is given, then the value is optional; otherwise it must be specified whenever the corresponding setting is specified. A “restriction” can also be specified: a string giving a C expression that limits the valid range of the value. The special escape %s should be used within the restriction to refer to the setting's value variable.
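
Putting these pieces together, a subcommand with valued settings might look like this. The names are illustrative assumptions, not taken from this section.

     format=cond:condense/onepage(*n:onepage_limit,"%s>=0")/!standard,
            table:limit(n:limit,"%s>0")/notable/!table;

The setting onepage takes an optional integer value (because of the *), which must be non-negative. The setting limit takes a mandatory integer value, which must be positive. Both value declarations are enclosed in parentheses, so the value itself must be parenthesized as well. In each restriction string, %s stands for the setting's value variable.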

     sbc-special-form ::= VAR
                      ::= VARLIST varlist-options
                      ::= INTEGER opt-list
                      ::= DOUBLE opt-list
                      ::= PINT
                      ::= STRING (the literal word STRING) string-options
                      ::= CUSTOM
     varlist-options ::=
                     ::= ( STRING )
     opt-list ::=
              ::= LIST
     string-options ::=
                    ::= ( STRING STRING )

The special forms are of the following types:

VAR
A single variable name.
VARLIST
A list of variables. If given, the string can be used to provide PV_* options to the call to parse_variables.
INTEGER
A single integer value.
INTEGER LIST
A list of integers separated by spaces or commas.
DOUBLE
A single floating-point value.
DOUBLE LIST
A list of floating-point values.
PINT
A single positive integer value.
STRING
A string value. If the options are given then the first string is an expression giving a restriction on the value of the string; the second string is an error message to display when the restriction is violated.
CUSTOM
A custom function is used to parse this subcommand. The function must have prototype int custom_name (void). It should return 0 on failure (when it has already issued an appropriate diagnostic), 1 on success, or 2 if it fails and the calling function should issue a syntax error on behalf of the custom handler.
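
As a rough sketch, the special forms could be used as follows. All subcommand names are hypothetical, and PV_NUMERIC is assumed here to be one of the PV_* options accepted by parse_variables.

     *variables=varlist("PV_NUMERIC");
     group=var;
     width=integer;
     breaks=double list;
     copies=pint;
     title=string;
     criteria=custom;

Each special form follows the equals sign directly, in place of a list of specifiers.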