This section dissects the built-in awk patterns one by one to show how they work.
%%PATTERN "<awk><nul>" "<nul>" "" "<awk><sor>" "<sor>"
%%PATTERN "<awk><sor>" "<wht>" "" "<awk><sor>" "<nul>"
%%PATTERN "<awk><sor>" "<nul>" "" "<awk><sof>" "<sof>"
%%PATTERN "<awk><sof>" "<del>" "" "<awk><sor>" "<eof>"
%%PATTERN "<awk><sof>" "<any>" "" "<awk><sof>" "<fld>"
%%PATTERN "<awk><sof>" "<nul>" "" "<awk><sor>" "<eof>"
%%PATTERN "<awk><sor>" "<new>" "" "<awk><nul>" "<eor>"
The first pattern matches in the default initial mode. It matches any
character on the input stream (but leaves the input stream as it is) and
jumps to start of record mode, returning the start of record token in the
process.

In start of record mode there are three possible matching patterns. They
match on whitespace, on the nul character and on the newline character.
Whitespace is ignored (so that multiple spaces and tabs in the input are not
interpreted as multiple fields), whereas the <nul> pattern flags the start of
a field and sets the mode to start of field mode. Since the <nul> pattern is
defined after the pattern for whitespace, it will only match when the input
is not whitespace. The newline pattern (which should actually also be defined
above the <nul> pattern) sets the mode back to the initial mode and flags the
end of the record.

The remaining patterns only match in start of field mode. The three
possible matching characters are the delimiter (normally whitespace), any
other character except newline, and the nul character again. The order in
which these are declared is important. A delimiter character always matches
first; it sets the mode back to start of record mode (in preparation for
another field or the end of the record) and flags the end of the field.
Any other character (except newline) matches the next pattern, which leaves
the mode unchanged but flags that the character is to be appended to the
field definition. Finally, on a newline the <nul> pattern matches; it flags
the end of the field and puts the mode back to start of record mode.
Note that simply matching a newline here (rather than <nul>) would not work:
the newline signals the end of the record, but it would be consumed to flag
the end of the field, and the end of the record would then never be flagged.
Using <nul> instead matches at the newline to return end of field but leaves
the newline on the input stream, so that in start of record mode it can be
matched by the last pattern definition to indicate the end of the record.
This is necessary since the last field usually does not have a delimiter
after it.
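
The mode transitions described above can be illustrated with a small Python
simulation. This is only a sketch under stated assumptions, not the tool's
actual implementation: the names RULES, tokenize and records are hypothetical,
each rule is read as (mode, matcher, consumes input, next mode, token), <wht>
and <del> are both assumed to mean space or tab, and the newline rule is
placed above the <nul> rule in start of record mode, as the text recommends.

```python
# Sketch of the awk pattern set as a table-driven mode machine.
# All names here are illustrative assumptions, not part of the real tool.

def is_wht(c):   # <wht> / <del>: assumed to be space or tab
    return c in " \t"

def is_new(c):   # <new>: the newline character
    return c == "\n"

def is_any(c):   # <any>: any character except newline
    return c != "\n"

def is_nul(c):   # <nul>: always matches, but never consumes input
    return True

# (mode, matcher, consumes_input, next_mode, token); first match wins.
RULES = [
    ("nul", is_nul, False, "sor", "sor"),  # initial mode: start of record
    ("sor", is_wht, True,  "sor", None),   # skip leading/repeated whitespace
    ("sor", is_new, True,  "nul", "eor"),  # newline: end of record
    ("sor", is_nul, False, "sof", "sof"),  # anything else: start of field
    ("sof", is_wht, True,  "sor", "eof"),  # delimiter: end of field
    ("sof", is_any, True,  "sof", "fld"),  # append character to field
    ("sof", is_nul, False, "sor", "eof"),  # newline: end of field, not consumed
]

def tokenize(text):
    """Return a list of (token, character) pairs for the input text."""
    mode, pos, tokens = "nul", 0, []
    while pos < len(text):
        c = text[pos]
        for rule_mode, matches, consumes, next_mode, token in RULES:
            if rule_mode == mode and matches(c):
                if token is not None:
                    tokens.append((token, c if token == "fld" else ""))
                if consumes:
                    pos += 1
                mode = next_mode
                break
        else:
            raise ValueError(f"no rule for {c!r} in mode {mode!r}")
    return tokens

def records(text):
    """Assemble the token stream into a list of records (lists of fields)."""
    result, record, field = [], [], []
    for token, c in tokenize(text):
        if token == "sor":
            record = []
        elif token == "sof":
            field = []
        elif token == "fld":
            field.append(c)
        elif token == "eof":
            record.append("".join(field))
        elif token == "eor":
            result.append(record)
    return result

print(records("a b\n  c   d\n"))  # [['a', 'b'], ['c', 'd']]
```

In this sketch the last field of each line is closed by the non-consuming
<nul> rule, and the newline it leaves behind is then matched in start of
record mode to emit the end of record token. Moving the newline rule below
the <nul> rule in start of record mode would make this particular simulation
loop forever on a newline, which is consistent with the remark that the
newline pattern should be defined first.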