Now that we've mastered some simple tasks, let's look at what typical awk programs do. This example shows how awk can be used to summarize, select, and rearrange the output of another utility. It uses features that haven't been covered yet, so don't worry if you don't understand all the details:
ls -l | awk '$6 == "Nov" { sum += $5 } END { print sum }'
This command prints the total number of bytes in all the files in the current directory that were last modified in November (of any year). 1 The `ls -l' part of this example is a system command that gives you a listing of the files in a directory, including each file's size and the date the file was last modified. Its output looks like this:
-rw-r--r-- 1 arnold user 1933 Nov 7 13:05 Makefile -rw-r--r-- 1 arnold user 10809 Nov 7 13:03 awk.h -rw-r--r-- 1 arnold user 983 Apr 13 12:14 awk.tab.h -rw-r--r-- 1 arnold user 31869 Jun 15 12:20 awk.y -rw-r--r-- 1 arnold user 22414 Nov 7 13:03 awk1.c -rw-r--r-- 1 arnold user 37455 Nov 7 13:03 awk2.c -rw-r--r-- 1 arnold user 27511 Dec 9 13:07 awk3.c -rw-r--r-- 1 arnold user 7989 Nov 7 13:03 awk4.c
The first field contains read-write permissions, the second field contains the number of links to the file, and the third field identifies the owner of the file. The fourth field identifies the group of the file. The fifth field contains the size of the file in bytes. The sixth, seventh, and eighth fields contain the month, day, and time, respectively, that the file was last modified. Finally, the ninth field contains the name of the file.2
The `$6 == "Nov"' in our awk program is an expression that
tests whether the sixth field of the output from `ls -l'
matches the string `Nov'. Each time a line has the string
`Nov' for its sixth field, the action `sum += $5' is
performed. This adds the fifth field (the file's size) to the variable
sum
. As a result, when awk has finished reading all the
input lines, sum
is the total of the sizes of the files whose
lines matched the pattern. (This works because awk variables
are automatically initialized to zero.)
After the last line of output from ls has been processed, the
END
rule executes and prints the value of sum
.
In this example, the value of sum
is 80600.
These more advanced awk techniques are covered in later sections
(see Action Overview). Before you can move on to more
advanced awk programming, you have to know how awk interprets
your input and displays your output. By manipulating fields and using
print
statements, you can produce some very useful and
impressive-looking reports.
[1] In the C shell (csh), you need to type a semicolon and then a backslash at the end of the first line; see Statements/Lines, for an explanation. In a POSIX-compliant shell, such as the Bourne shell or bash, you can type the example as shown. If the command `echo $path' produces an empty output line, you are most likely using a POSIX-compliant shell. Otherwise, you are probably using the C shell or a shell derived from it.
[2] On some very old systems, you may need to use `ls -lg' to get this output.