GNATS uses GNU regular expression syntax with these settings:
RE_SYNTAX_POSIX_EXTENDED | RE_BK_PLUS_QM & RE_DOT_NEWLINE
This means that parentheses (`(' and `)') and pipe symbols (`|') do not need to be used with the escape symbol `\'. The tokens `+' and `?' do need the escape symbol, however.
Unfortunately, we do not have room in this manual for an adequate tutorial on regular expressions. The following is a basic summary of some regular expressions you might wish to use.
See section `Regular Expression Syntax' in Regex, for details on regular expression syntax. Also see section `Syntax of Regular Expressions' in GNU Emacs Manual, but beware that the syntax for regular expressions in Emacs is slightly different.
All search criteria options to query-pr rely on regular expression syntax to construct their search patterns. For example,
query-pr --state=open
matches all PRs whose `>State:' values match with the regular expression `open'.
We can substitute the expression `o' for `open', according to GNU regular expression syntax. This matches all values of `>State:' which begin with the letter `o'.
query-pr --state=o
is equivalent to
query-pr --state=open
in this case, since the only valid value for `>State:' which matches the expression `o' is `open'. `--state=o' also matches `o', `oswald', and even `oooooo', but none of those values are valid states for a Problem Report.
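The prefix-matching behavior described above can be sketched in Python, whose `re.match' anchors at the start of the string. This is only an approximation: Python's `re' module is not the GNU regex library that GNATS uses, and both the helper name `matches_field' and the list of sample values are assumptions made for illustration.

```python
import re

def matches_field(pattern, value):
    """Return True if pattern matches at the start of value (a prefix match)."""
    return re.match(pattern, value) is not None

# Sample field values taken from the text above; not a list of valid states.
values = ["open", "analyzed", "feedback", "o", "oswald", "oooooo"]

matched = [v for v in values if matches_field("o", v)]
print(matched)  # ['open', 'o', 'oswald', 'oooooo']
```

Every value beginning with `o' is selected, which is why `o' and `open' give the same result when `open' is the only such state in use.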
Regular expression syntax considers a regexp token surrounded with parentheses, as in `(regexp)', to be a group. This means that `(ab)*' matches any number of contiguous instances of `ab', including zero. Matches include `', `ab', and `ababab'.
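As a rough illustration of group repetition, the following Python sketch checks which whole strings consist of zero or more copies of `ab'. It uses Python's `re.fullmatch' so that the entire string must be consumed; the helper name `whole_string_matches' and the sample strings are assumptions, and Python's `re' module only approximates GNU regex behavior.

```python
import re

def whole_string_matches(pattern, text):
    """Return True only if pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

candidates = ["", "ab", "ababab", "aba"]
results = {text: whole_string_matches("(ab)*", text) for text in candidates}
print(results)  # {'': True, 'ab': True, 'ababab': True, 'aba': False}
```

The empty string matches because `*' allows zero repetitions, while `aba' fails because the trailing `a' is not a complete instance of the group.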
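Regular expression syntax considers characters surrounded with square brackets, as in `[abc]', to be a list, which matches any single character it contains. This means that `Char[lb]' matches the beginning of any of the words `Charley', `Charlene', or `Charbroiled' (case is significant; `charbroiled' is not matched). To match one of several whole strings, use a group with the `|' operator instead, as in `Char(ley|lene|broiled)'.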
Using groups and lists, we see that
query-pr --category="gcc|gdb|gas"
is equivalent to
query-pr --category="g(cc|db|as)"
and is also very similar to
query-pr --category="g[cda]"
with the exception that this last search matches any values which begin with `gc', `gd', or `ga'.
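The difference between the three category patterns can be checked with a small Python sketch. The category names `gawk' and `ld' are made-up sample values, the helper `prefix_matches' is hypothetical, and Python's `re.match' stands in for the anchored matching described in this section rather than reproducing GNATS itself.

```python
import re

def prefix_matches(pattern, values):
    """Return the values that the pattern matches, anchored at the start."""
    return [v for v in values if re.match(pattern, v)]

# Sample category names; only gcc, gdb, and gas come from the text above.
categories = ["gcc", "gdb", "gas", "gawk", "ld"]

patterns = ["gcc|gdb|gas", "g(cc|db|as)", "g[cda]"]
results = {p: prefix_matches(p, categories) for p in patterns}

for pattern, found in results.items():
    print(pattern, found)
```

The first two patterns select only `gcc', `gdb', and `gas'; the list form also selects `gawk', since it only requires that the second character be `c', `d', or `a'.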
The `.' character is known as a wildcard. `.' matches any single character (except newlines). `*' matches the previous character, list, or group any number of times, including zero. Therefore, we can understand `.*' to mean "match zero or more instances of any character." For this reason, we never specify it at the end of a regular expression, as that would be redundant. The expression `o' matches any instance of the letter `o' (followed by anything) at the beginning of a line, while the expression `o.*' matches any instance of the letter `o' at the beginning of a line followed by any number (including zero) of any characters.
We can also use the expression operator `|' to signify a logical OR, such that
query-pr --state="o|a"
matches all `open' or `analyzed' Problem Reports. (Double quotes (") are used to protect the pipe symbol (|) from the shell.)
By the same token, using
query-pr --state=".*a"
matches all values for `>State:' which contain an `a'. (These include `analyzed' and `feedback'.)
Another way to understand what wildcards do is to follow them on their search for matching text. By our syntax, `.*' matches any character any number of times, including zero. Therefore, `.*a' searches for any group of characters which ends with `a', ignoring the rest of the field. `.*a' matches `analyzed' (whose very first character is an `a') as well as `feedback'.
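The two state searches can be approximated in Python as follows. The values `closed' and `suspended' are sample states assumed for illustration, `prefix_matches' is a hypothetical helper, and `re.match' is used as a stand-in for matching anchored at the beginning of the field.

```python
import re

def prefix_matches(pattern, values):
    """Return the values that the pattern matches, anchored at the start."""
    return [v for v in values if re.match(pattern, v)]

# Sample >State: values; closed and suspended are assumed for illustration.
states = ["open", "analyzed", "feedback", "closed", "suspended"]

open_or_analyzed = prefix_matches("o|a", states)
contains_a = prefix_matches(".*a", states)

print(open_or_analyzed)  # ['open', 'analyzed']
print(contains_a)        # ['analyzed', 'feedback']
```

The leading `.*' lets the match skip over any characters before the `a', so a value only has to contain an `a' somewhere.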
Note: When using `--text' or `--multitext', you do not have to specify the token `.*' at the beginning of text to match the entire field. For the technically minded, this is because `--text' and `--multitext' use `re_search' rather than `re_match'. `re_match' anchors the search at the beginning of the field, while `re_search' does not anchor the search.
For example, to search in the `>Description:' field for the text
The defrobulator component returns a nil value.
we can use
query-pr --multitext="defrobulator.*nil"
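The distinction between anchored and unanchored searching has a close analogue in Python, where `re.match' anchors at the start of the string and `re.search' does not. The sketch below uses those two functions as stand-ins for `re_match' and `re_search'; it illustrates the idea and does not show how GNATS calls the GNU regex library.

```python
import re

description = "The defrobulator component returns a nil value."
pattern = "defrobulator.*nil"

# Anchored at the start: fails, because the text begins with "The ".
anchored = re.match(pattern, description)

# Unanchored: succeeds, because the pattern may start anywhere in the text.
unanchored = re.search(pattern, description)

# An anchored match needs an explicit leading ".*" to skip the prefix.
with_prefix = re.match(".*" + pattern, description)

print(anchored is None)     # True
print(unanchored.group(0))  # defrobulator component returns a nil
print(with_prefix is None)  # False
```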
To also match newlines, we have to include the expression `(.|^M)' instead of just a dot (`.'). `(.|^M)' matches "any single character except a newline (`.') or (`|') any newline (`^M')." This means that to search for the text
The defrobulator component enters the bifrabulator routine and returns a nil value.
we must use
query-pr --multitext="defrobulator(.|^M)*nil"
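The effect of a newline on the wildcard can be sketched in Python, where `.' also skips newlines by default. The position of the line break in the sample text (after `routine') is an assumption, and the escape `\n' stands in for the newline character that the shell command above writes as `^M'.

```python
import re

# Sample field text; the line break after "routine" is assumed.
description = (
    "The defrobulator component enters the bifrabulator routine\n"
    "and returns a nil value."
)

# "." does not cross the newline, so this search finds nothing.
dot_only = re.search("defrobulator.*nil", description)

# "(.|\n)" accepts either an ordinary character or a newline.
dot_or_newline = re.search("defrobulator(.|\n)*nil", description)

print(dot_only is None)        # True
print(dot_or_newline is None)  # False
```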
To generate the newline character `^M', type the following depending on your shell:
csh
tcsh
sh (or bash)
(.| )
Again, see section `Regular Expression Syntax' in Regex, for a much more complete discussion on regular expression syntax.