Beginning with version 3.1 of gawk, you may produce execution traces of your awk programs. This is done with a specially compiled version of gawk, called pgawk (“profiling gawk”).
pgawk is identical in every way to gawk, except that when it has finished running, it creates a profile of your program in a file named awkprof.out. Because it is profiling, it also executes up to 45% slower than gawk normally does.
As shown in the following example, the --profile option can be used to change the name of the file where pgawk will write the profile:
$ pgawk --profile=myprog.prof -f myprog.awk data1 data2
In the above example, pgawk places the profile in myprog.prof instead of in awkprof.out.
Regular gawk also accepts this option. When called with just --profile, gawk “pretty prints” the program into awkprof.out, without any execution counts. You may supply an option to --profile to change the file name. Here is a sample session showing a simple awk program, its input data, and the results from running pgawk. First, the awk program:
BEGIN { print "First BEGIN rule" } END { print "First END rule" } /foo/ { print "matched /foo/, gosh" for (i = 1; i <= 3; i++) sing() } { if (/foo/) print "if is true" else print "else is true" } BEGIN { print "Second BEGIN rule" } END { print "Second END rule" } function sing( dummy) { print "I gotta be me!" }
Following is the input data:
foo bar baz foo junk
Here is the awkprof.out that results from running pgawk on this program and data (this example also illustrates that awk programmers sometimes have to work late):
# gawk profile, created Sun Aug 13 00:00:15 2000 # BEGIN block(s) BEGIN { 1 print "First BEGIN rule" 1 print "Second BEGIN rule" } # Rule(s) 5 /foo/ { # 2 2 print "matched /foo/, gosh" 6 for (i = 1; i <= 3; i++) { 6 sing() } } 5 { 5 if (/foo/) { # 2 2 print "if is true" 3 } else { 3 print "else is true" } } # END block(s) END { 1 print "First END rule" 1 print "Second END rule" } # Functions, listed alphabetically 6 function sing(dummy) { 6 print "I gotta be me!" }
This example illustrates many of the basic rules for profiling output. The rules are as follows:
BEGIN
rule,
pattern/action rules, END
rule and functions, listed
alphabetically.
Multiple BEGIN
and END
rules are merged together.
if
-else
statement shows how many times
the condition was tested.
To the right of the opening left brace for the if
's body
is a count showing how many times the condition was true.
The count for the else
indicates how many times the test failed.
for
or while
) shows how many times the loop test was executed.
(Because of this, you can't just look at the count on the first
statement in a rule to determine how many times the rule was executed.
If the first statement is a loop, the count is misleading.)
function
keyword indicates how many times the function was called.
The counts next to the statements in the body show how many times
those statements were executed.
if
, else
, or loop is only a single statement.
print
and printf
only when
the print
or printf
statement is followed by a redirection.
Similarly, if
the target of a redirection isn't a scalar, it gets parenthesized.
BEGIN
and END
rules,
the pattern/action rules, and the functions.
The profiled version of your program may not look exactly like what you
typed when you wrote it. This is because pgawk creates the
profiled version by “pretty printing” its internal representation of
the program. The advantage to this is that pgawk can produce
a standard representation. The disadvantage is that all source-code
comments are lost, as are the distinctions among multiple BEGIN
and END
rules. Also, things such as:
/foo/
come out as:
/foo/ { print $0 }
which is correct, but possibly surprising.
Besides creating profiles when a program has completed, pgawk can produce a profile while it is running. This is useful if your awk program goes into an infinite loop and you want to see what has been executed. To use this feature, run pgawk in the background:
$ pgawk -f myprog & [1] 13992
The shell prints a job number and process ID number; in this case, 13992.
Use the kill command to send the USR1
signal
to pgawk:
$ kill -USR1 13992
As usual, the profiled version of the program is written to awkprof.out, or to a different file if you use the --profile option.
Along with the regular profile, as shown earlier, the profile includes a trace of any active functions:
# Function Call Stack: # 3. baz # 2. bar # 1. foo # -- main --
You may send pgawk the USR1
signal as many times as you like.
Each time, the profile and function call trace are appended to the output
profile file.
If you use the HUP
signal instead of the USR1
signal,
pgawk produces the profile and the function call trace and then exits.
When pgawk runs on MS-DOS or MS-Windows, it uses the
INT
and QUIT
signals for producing the profile and, in
the case of the INT
signal, pgawk exits. This is
because these systems don't support the kill command, so the
only signals you can deliver to a program are those generated by the
keyboard. The INT
signal is generated by the
Ctrl-<C> or Ctrl-<BREAK> key, while the
QUIT
signal is generated by the Ctrl-<\> key.