The following functions relate to input/output (I/O). Optional parameters are enclosed in square brackets ([ ]):
close(filename [, how])
When closing a coprocess, it is occasionally useful to first close
one end of the two-way pipe and then to close the other. This is done
by providing a second argument to close. This second argument
should be one of the two string values "to" or "from",
indicating which end of the pipe to close. Case in the string does
not matter.
See Two-way I/O, which discusses this feature in more detail and gives an example.
fflush([filename])
Many utility programs buffer their output; i.e., they save information
to write to a disk file or terminal in memory until there is enough
for it to be worthwhile to send the data to the output device.
This is often more efficient than writing
every little bit of information as soon as it is ready. However, sometimes
it is necessary to force a program to flush its buffers; that is,
to write the information to its destination, even if a buffer is not full.
This is the purpose of the fflush function: gawk also
buffers its output, and the fflush function forces
gawk to flush its buffers.
fflush was added to the Bell Laboratories research
version of awk in 1994; it is not part of the POSIX standard and is
not available if --posix has been specified on the
command line (see Options).
gawk extends the fflush function in two ways. The first
is to allow no argument at all. In this case, the buffer for the
standard output is flushed. The second is to allow the null string
("") as the argument. In this case, the buffers for
all open output files and pipes are flushed.
fflush returns zero if the buffer is successfully flushed;
otherwise, it returns -1.
In the case where all buffers are flushed, the return value is zero
only if all buffers were flushed successfully. Otherwise, it is
-1, and gawk warns about the problem filename.
gawk also issues a warning message if you attempt to flush
a file or pipe that was opened for reading (such as with getline),
or if filename is not an open file, pipe, or coprocess.
In such a case, fflush returns -1, as well.
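The return value described above can be checked directly. The following is a small sketch, not a definitive recipe: it assumes an awk that provides fflush (including the no-argument form) and uses cat as a stand-in output pipe. The function name flush_demo is hypothetical.

```shell
# Sketch only: flush_demo is a made-up name; cat stands in for any output pipe.
flush_demo() {
    awk 'BEGIN {
        print "to stdout"
        rc_stdout = fflush()          # no argument: flush standard output
        print "data" | "cat"
        rc_pipe = fflush("cat")       # flush the buffer for the pipe to cat
        close("cat")                  # wait for cat to finish before reporting
        print rc_stdout, rc_pipe      # zero means the flush succeeded
    }'
}

flush_demo
```

In a real program, a nonzero return value would typically be tested with something like if (fflush(file) != 0) and reported on the standard error.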
system(command)
The system function executes the command given by the string command.
It returns, as its value, the status returned by the command that was executed.
For example, if the following fragment of code is put in your awk program:
END { system("date | mail -s 'awk run done' root") }
the system administrator is sent mail when the awk program finishes processing input and begins its end-of-input processing.
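The value returned by system can also be captured and inspected. The sketch below assumes a POSIX-style shell is used to run the command string, so that "true" and "exit 3" behave as on a typical Unix system. The function name status_demo is hypothetical.

```shell
# Sketch only: status_demo is a made-up name for this illustration.
status_demo() {
    awk 'BEGIN {
        ok  = system("true")          # command succeeds: status 0
        bad = system("exit 3")        # the shell running the command exits with status 3
        print ok, bad
    }'
}

status_demo
```

A program can use this to decide whether to continue, for example by testing if (system(cmd) != 0) before relying on the command's results.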
Note that redirecting print or printf into a pipe is often
enough to accomplish your task. If you need to run many commands, it
is more efficient to simply print them down a pipeline to the shell:
while (more stuff to do)
    print command | "/bin/sh"
close("/bin/sh")
However, if your awk program is interactive, system is useful for cranking up large
self-contained programs, such as a shell or an editor.
Some operating systems cannot implement the system function.
Calling system causes a fatal error if it is not supported.
As a side point, buffering issues can be even more confusing, depending upon whether your program is interactive, i.e., communicating with a user sitting at a keyboard.
Interactive programs generally line buffer their output; i.e., they write out every line. Noninteractive programs wait until they have a full buffer, which may be many lines of output. Here is an example of the difference:
$ awk '{ print $1 + $2 }'
1 1
-| 2
2 3
-| 5
Ctrl-d
Each line of output is printed immediately. Compare that behavior with this example:
$ awk '{ print $1 + $2 }' | cat
1 1
2 3
Ctrl-d
-| 2
-| 5
Here, no output is printed until after the Ctrl-d is typed, because it is all buffered and sent down the pipe to cat in one shot.
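One way to get each result sent down the pipe as soon as it is computed is to flush after every print. The following sketch assumes an awk that supports fflush with no argument. The function name sum_pairs is hypothetical, and printf stands in for the lines a user would type at the keyboard.

```shell
# Sketch only: sum_pairs is a made-up name for this illustration.
sum_pairs() {
    # fflush() after each print sends the line down the pipe right away
    # instead of leaving it in awk's output buffer until end of input.
    awk '{ print $1 + $2; fflush() }'
}

printf '1 1\n2 3\n' | sum_pairs | cat
```

The output is the same as in the unflushed example. When typing the input by hand, the difference is that each sum should appear as soon as its line is entered.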
The fflush function provides explicit control over output buffering for
individual files and pipes. However, its use is not portable to many other
awk implementations. An alternative method to flush output
buffers is to call system with a null string as its argument:
system("") # flush output
gawk treats this use of the system function as a special
case and is smart enough not to run a shell (or other command
interpreter) with the empty command. Therefore, with gawk, this
idiom is not only useful, it is also efficient. While this method should work
with other awk implementations, it does not necessarily avoid
starting an unnecessary shell. (Other implementations may only
flush the buffer associated with the standard output and not necessarily
all buffered output.)
If you think about what a programmer expects, it makes sense that
system should flush any pending output. The following program:
BEGIN {
    print "first print"
    system("echo system echo")
    print "second print"
}
must print:
first print
system echo
second print
and not:
system echo
first print
second print
If awk did not flush its buffers before calling system,
you would see the latter (undesirable) output.