
10. Portable Shell Programming

When writing your own checks, there are some shell-script programming techniques you should avoid in order to make your code portable. The Bourne shell and upward-compatible shells like the Korn shell and Bash have evolved over the years, but to prevent trouble, do not take advantage of features that were added after UNIX version 7, circa 1977 (see section Systemology).

You should not use shell functions, aliases, negated character classes, or other features that are not found in all Bourne-compatible shells; restrict yourself to the lowest common denominator. Even unset is not supported by all shells! Also, include a space after the exclamation point in interpreter specifications, like this:

 
#! /usr/bin/perl

If you omit the space before the path, then 4.2BSD-based systems (such as DYNIX) will ignore the line, because they interpret `#! /' as a 4-byte magic number. Some old systems also have quite small limits on the length of the `#!' line, for instance 32 bytes (not including the newline) on SunOS 4.

The set of external programs you should run in a configure script is fairly small. See (standards)Utilities in Makefiles section `Utilities in Makefiles' in GNU Coding Standards, for the list. This restriction allows users to start out with a fairly small set of programs and build the rest, avoiding too many interdependencies between packages.

Some of these external utilities have a portable subset of features; see Limitations of Usual Tools.

There are other sources of documentation about shells. See for instance the Shell FAQs.



10.1 Shellology

There are several families of shells, most prominently the Bourne family and the C shell family, which are deeply incompatible. If you want to write portable shell scripts, avoid members of the C shell family. The Shell difference FAQ includes a small history of Unix shells, and a comparison between several of them.

Below we describe some of the members of the Bourne shell family.

Ash

ash is often used on GNU/Linux and BSD systems as a light-weight Bourne-compatible shell. Ash 0.2 has some bugs that are fixed in the 0.3.x series, but portable shell scripts should work around them, since version 0.2 is still shipped with many GNU/Linux distributions.

To be compatible with Ash 0.2, heed the Ash-specific caveats noted later in this chapter (for instance under ``commands`' in Shell Substitutions, and in Assignments).

Bash

To detect whether you are running bash, test if BASH_VERSION is set. To disable its extensions and require POSIX compatibility, run `set -o posix'. See (bash)Bash POSIX Mode section `Bash POSIX Mode' in The GNU Bash Reference Manual, for details.
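For instance, here is a minimal sketch of this detection (illustrative, not verbatim Autoconf output):

# Enable POSIX mode only when Bash is detected; other shells
# leave BASH_VERSION unset.
if test "${BASH_VERSION+set}" = set; then
  set -o posix
fi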

Bash 2.05 and later

Versions 2.05 and later of bash use a different format for the output of the set builtin, designed to make evaluating its output easier. However, this output is not compatible with earlier versions of bash (or with many other shells, probably). So if you use bash 2.05 or higher to execute configure, you'll need to use bash 2.05 for all other build tasks as well.

/usr/xpg4/bin/sh on Solaris

The POSIX-compliant Bourne shell on a Solaris system is /usr/xpg4/bin/sh and is part of an extra optional package. There is no extra charge for this package, but it is also not part of a minimal OS install and therefore some folks may not have it.

Zsh

To detect whether you are running zsh, test if ZSH_VERSION is set. By default zsh is not compatible with the Bourne shell: you have to run `emulate sh' and set NULLCMD to `:'. See (zsh)Compatibility section `Compatibility' in The Z Shell Manual, for details.
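A minimal sketch of that setup, guarded so it only runs under Zsh (illustrative, not verbatim Autoconf output):

# Switch Zsh into Bourne-compatible mode; other shells skip this.
if test "${ZSH_VERSION+set}" = set; then
  emulate sh
  NULLCMD=:
fi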

Zsh 3.0.8 is the native /bin/sh on Mac OS X 10.0.3.

The following discussion between Russ Allbery and Robert Lipe is worth reading:

Russ Allbery:

The GNU assumption that /bin/sh is the one and only shell leads to a permanent deadlock. Vendors don't want to break users' existing shell scripts, and there are some corner cases in the Bourne shell that are not completely compatible with a POSIX shell. Thus, vendors who have taken this route will never (OK…"never say never") replace the Bourne shell (as /bin/sh) with a POSIX shell.

Robert Lipe:

This is exactly the problem. While most vendors (at least most System V's) do have a Bourne shell that accepts shell functions, most vendor /bin/sh programs are not the POSIX shell.

So while most modern systems do have a shell somewhere that meets the POSIX standard, the challenge is to find it.



10.2 Here-Documents

Don't rely on `\' being preserved just because it has no special meaning together with the next symbol. In the native /bin/sh on OpenBSD 2.7, `\"' expands to `"' in here-documents with an unquoted delimiter. As a general rule, if `\\' expands to `\', use `\\' to get `\'.

With OpenBSD 2.7's /bin/sh

 
$ cat <<EOF
> \" \\
> EOF
" \

and with Bash:

 
bash-2.04$ cat <<EOF
> \" \\
> EOF
\" \

Many older shells (including the Bourne shell) implement here-documents inefficiently. And some shells mishandle large here-documents: for example, Solaris 8 dtksh, which is derived from ksh M-12/28/93d, mishandles variable expansion that occurs on 1024-byte buffer boundaries within a here-document. Users can generally fix these problems by using a faster or more reliable shell, e.g., by using the command `bash ./configure' rather than plain `./configure'.

Some shells can be extremely inefficient when there are a lot of here-documents inside a single statement. For instance if your `configure.ac' includes something like:

 
if <cross_compiling>; then
  assume this and that
else
  check this
  check that
  check something else
  …
  on and on forever
  …
fi

A shell parses the whole if/fi construct, creating temporary files for each here-document in it. Some shells create links to such here-documents on every fork, so that the clean-up code they had installed removes them correctly; it is creating those links that can take the shell forever.

Moving the tests out of the if/fi, or creating multiple if/fi constructs, would improve the performance significantly. Anyway, this kind of construct is not exactly typical Autoconf usage. In fact, it is not even recommended: M4 macros cannot see into shell conditionals, so a macro may be expanded only inside a conditional path, and if that condition turns out to be false at run time, the macro ends up never being executed at all.



10.3 File Descriptors

Some file descriptors should not be used, since some systems, admittedly arcane, use them for special purposes:

 
3 --- some systems may open it to `/dev/tty'.
4 --- used on the Kubota Titan.

Don't redirect the same file descriptor several times, as you are doomed to failure under Ultrix.

 
ULTRIX V4.4 (Rev. 69) System #31: Thu Aug 10 19:42:23 GMT 1995
UWS V4.4 (Rev. 11)
$ eval 'echo matter >fullness' >void
illegal io
$ eval '(echo matter >fullness)' >void
illegal io
$ (eval '(echo matter >fullness)') >void
Ambiguous output redirect.

In each case the expected result is of course `fullness' containing `matter' and `void' being empty.

Don't try to redirect the standard error of a command substitution; the redirection must be done inside the command substitution. When running `: `cd /zorglub` 2>/dev/null', expect the error message to escape, while `: `cd /zorglub 2>/dev/null`' works properly.

It is worth noting that Zsh (but not Ash nor Bash) makes it possible in assignments though: `foo=`cd /zorglub` 2>/dev/null'.

Most shells, if not all (including Bash, Zsh, Ash), output traces on stderr, even for sub-shells. This might result in undesirable content if you meant to capture the standard-error output of the inner command:

 
$ ash -x -c '(eval "echo foo >&2") 2>stderr'
$ cat stderr
+ eval echo foo >&2
+ echo foo
foo
$ bash -x -c '(eval "echo foo >&2") 2>stderr'
$ cat stderr
+ eval 'echo foo >&2'
++ echo foo
foo
$ zsh -x -c '(eval "echo foo >&2") 2>stderr'
# Traces on startup files deleted here.
$ cat stderr
+zsh:1> eval echo foo >&2
+zsh:1> echo foo
foo

You'll appreciate the various levels of detail....

One workaround is to grep out uninteresting lines, hoping not to remove good ones....

Don't try to move/delete open files, such as in `exec >foo; mv foo bar'; see Limitations of Shell Builtins, mv for more details.



10.4 File System Conventions

While autoconf and friends are usually run on some Unix variety, they can and will be used on other systems, most notably DOS variants. This impacts several assumptions regarding file and path names.

For example, the following code:

 
case $foo_dir in
  /*) # Absolute
     ;;
  *)
     foo_dir=$dots$foo_dir ;;
esac

will fail to properly detect absolute paths on those systems, because they can use a drivespec, and will usually use a backslash as directory separator. The canonical way to check for absolute paths is:

 
case $foo_dir in
  [\\/]* | ?:[\\/]* ) # Absolute
     ;;
  *)
     foo_dir=$dots$foo_dir ;;
esac

Make sure you quote the brackets if appropriate and keep the backslash as first character (see section Limitations of Shell Builtins).

Also, because the colon is used as part of a drivespec, these systems don't use it as path separator. When creating or accessing paths, use the PATH_SEPARATOR output variable instead. configure sets this to the appropriate value (`:' or `;') when it starts up.
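For illustration, here is a hedged sketch of building such a list with PATH_SEPARATOR (foo_path and the directory names are made up):

# Join directories with the configure-detected separator rather
# than a hard-coded `:', so the result is also valid on DOS.
foo_path=.
for dir in src lib tools; do
  foo_path=$foo_path$PATH_SEPARATOR$dir
done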

File names need extra care as well. While DOS-based environments that are Unixy enough to run autoconf (such as DJGPP) will usually be able to handle long file names properly, there are still limitations that can seriously break packages. Several of these issues can be easily detected by the doschk package.

A short overview follows; problems are marked with SFN/LFN to indicate where they apply: SFN means the issues are only relevant to plain DOS, not to DOS boxes under Windows, while LFN identifies problems that exist even under Windows.

No multiple dots (SFN)

DOS cannot handle multiple dots in filenames. This is an especially important thing to remember when building a portable configure script, as autoconf uses a .in suffix for template files.

This is perfectly OK on Unices:

 
AC_CONFIG_HEADERS([config.h])
AC_CONFIG_FILES([source.c foo.bar])
AC_OUTPUT

but it causes problems on DOS, as it requires `config.h.in', `source.c.in' and `foo.bar.in'. To make your package more portable to DOS-based environments, you should use this instead:

 
AC_CONFIG_HEADERS([config.h:config.hin])
AC_CONFIG_FILES([source.c:source.cin foo.bar:foobar.in])
AC_OUTPUT

No leading dot (SFN)

DOS cannot handle filenames that start with a dot. This is usually not a very important issue for autoconf.

Case insensitivity (LFN)

DOS is case insensitive, so you cannot, for example, have both a file called `INSTALL' and a directory called `install'. This also affects make; if there's a file called `INSTALL' in the directory, `make install' will do nothing (unless the `install' target is marked as PHONY).

The 8+3 limit (SFN)

Because the DOS file system only stores the first 8 characters of the filename and the first 3 of the extension, those must be unique. That means that `foobar-part1.c', `foobar-part2.c' and `foobar-prettybird.c' all resolve to the same filename (`FOOBAR-P.C'). The same goes for `foo.bar' and `foo.bartender'.

Note: This is not usually a problem under Windows, as it uses numeric tails in the short version of filenames to make them unique. However, a registry setting can turn this behavior off. While this makes it possible to share file trees containing long file names between SFN and LFN environments, it also means the above problem applies there as well.

Invalid characters

Some characters are invalid in DOS filenames, and should therefore be avoided. In a LFN environment, these are `/', `\', `?', `*', `:', `<', `>', `|' and `"'. In a SFN environment, other characters are also invalid. These include `+', `,', `[' and `]'.



10.5 Shell Substitutions

Contrary to a persistent urban legend, the Bourne shell does not systematically split variables and back-quoted expressions, in particular on the right-hand side of assignments and in the argument of case. For instance, the following code:

 
case "$given_srcdir" in
.)  top_srcdir="`echo "$dots" | sed 's,/$,,'`" ;;
*)  top_srcdir="$dots$given_srcdir" ;;
esac

is more readable when written as:

 
case $given_srcdir in
.)  top_srcdir=`echo "$dots" | sed 's,/$,,'` ;;
*)  top_srcdir=$dots$given_srcdir ;;
esac

and in fact it is even more portable: in the first case of the first attempt, the computation of top_srcdir is not portable, since not all shells properly understand "`…"…"…`". Worse yet, not all shells understand "`…\"…\"…`" the same way. There is just no portable way to use double-quoted strings inside double-quoted back-quoted expressions (pfew!).

$@

One of the most famous shell-portability issues is related to `"$@"'. When there are no positional arguments, POSIX says that `"$@"' is supposed to be equivalent to nothing, but the original Unix Version 7 Bourne shell treated it as equivalent to `""' instead, and this behavior survives in later implementations like Digital Unix 5.0.

The traditional way to work around this portability problem is to use `${1+"$@"}'. Unfortunately this method does not work with Zsh (3.x and 4.x), which is used on Mac OS X. When emulating the Bourne shell, Zsh performs word splitting on `${1+"$@"}':

 
zsh $ emulate sh
zsh $ for i in "$@"; do echo $i; done
Hello World
!
zsh $ for i in ${1+"$@"}; do echo $i; done
Hello
World
!

Zsh handles plain `"$@"' properly, but we can't use plain `"$@"' because of the portability problems mentioned above. One workaround relies on Zsh's "global aliases" to convert `${1+"$@"}' into `"$@"' by itself:

 
test "${ZSH_VERSION+set}" = set && alias -g '${1+"$@"}'='"$@"'

A more conservative workaround is to avoid `"$@"' if it is possible that there may be no positional arguments. For example, instead of:

 
cat conftest.c "$@"

you can use this instead:

 
case $# in
0) cat conftest.c;;
*) cat conftest.c "$@";;
esac

${var:-value}

Old BSD shells, including the Ultrix sh, don't accept the colon for any shell substitution, and complain and die.

${var=literal}

Be sure to quote:

 
: ${var='Some words'}

otherwise some shells, such as on Digital Unix V 5.0, will die because of a "bad substitution".


Solaris' /bin/sh has a frightening bug in its interpretation of this. Imagine you need to set a variable to a string containing `}'. This `}' character confuses Solaris' /bin/sh when the affected variable was already set. This bug can be exercised by running:

 
$ unset foo
$ foo=${foo='}'}
$ echo $foo
}
$ foo=${foo='}'   # no error; this hints to what the bug is
$ echo $foo
}
$ foo=${foo='}'}
$ echo $foo
}}
 ^ ugh!

It seems that `}' is interpreted as matching `${', even though it is enclosed in single quotes. The problem doesn't happen using double quotes.

${var=expanded-value}

On Ultrix, running

 
default="yu,yaa"
: ${var="$default"}

will set var to `M-yM-uM-,M-yM-aM-a', i.e., the 8th bit of each char will be set. You won't observe the phenomenon using a simple `echo $var' since apparently the shell resets the 8th bit when it expands $var. Here are two means to make this shell confess its sins:

 
$ cat -v <<EOF
$var
EOF

and

 
$ set | grep '^var=' | cat -v

One classic incarnation of this bug is:

 
default="a b c"
: ${list="$default"}
for c in $list; do
  echo $c
done

You'll get `a b c' on a single line. Why? Because there are no spaces in `$list': there are `M- ', i.e., spaces with the 8th bit set, hence no IFS splitting is performed!!!

One piece of good news is that Ultrix works fine with `: ${list=$default}'; i.e., if you don't quote. The bad news is that QNX 4.25 then sets list to the last item of default!

The portable way out consists in using a double assignment, to switch the 8th bit twice on Ultrix:

 
list=${list="$default"}

…but beware of the `}' bug from Solaris (see above). For safety, use:

 
test "${var+set}" = set || var={value}

`commands`

While in general it makes no sense, do not substitute a single builtin with side effects, because Ash 0.2, trying to optimize, does not fork a subshell to perform the command.

For instance, if you wanted to check that cd is silent, do not use `test -z "`cd /`"' because the following can happen:

 
$ pwd
/tmp
$ test -n "`cd /`" && pwd
/

The result of `foo=`exit 1`' is left as an exercise to the reader.

$(commands)

This construct is meant to replace ``commands`'; they can be nested while this is impossible to do portably with back quotes. Unfortunately it is not yet widely supported. Most notably, even recent releases of Solaris don't support it:

 
$ showrev -c /bin/sh | grep version
Command version: SunOS 5.8 Generic 109324-02 February 2001
$ echo $(echo blah)
syntax error: `(' unexpected

nor does IRIX 6.5's Bourne shell:

 
$ uname -a
IRIX firebird-image 6.5 07151432 IP22
$ echo $(echo blah)
$(echo blah)


10.6 Assignments

When setting several variables in a row, be aware that the order of the evaluation is undefined. For instance `foo=1 foo=2; echo $foo' gives `1' with sh on Solaris, but `2' with Bash. You must use `;' to enforce the order: `foo=1; foo=2; echo $foo'.

Don't rely on the following to find `subdir/program':

 
PATH=subdir$PATH_SEPARATOR$PATH program

as this does not work with Zsh 3.0.6. Use something like this instead:

 
(PATH=subdir$PATH_SEPARATOR$PATH; export PATH; exec program)

Don't rely on the exit status of an assignment: Ash 0.2 does not change the status and propagates that of the last statement:

 
$ false || foo=bar; echo $?
1
$ false || foo=`:`; echo $?
0

and to make things even worse, QNX 4.25 just sets the exit status to 0 in any case:

 
$ foo=`exit 1`; echo $?
0

To assign default values, follow this algorithm:

  1. If the default value is a literal and does not contain any closing brace, use:

     
    : ${var='my literal'}
    
  2. If the default value contains no closing brace, has to be expanded, and the variable being initialized will never be IFS-split (i.e., it's not a list), then use:

     
    : ${var="$default"}
    
  3. If the default value contains no closing brace, has to be expanded, and the variable being initialized will be IFS-split (i.e., it's a list), then use:

     
    var=${var="$default"}
    
  4. If the default value contains a closing brace, then use:

     
    test "${var+set}" = set || var='${indirection}'
    

In most cases `var=${var="$default"}' is fine, but in case of doubt, just use the latter. See section Shell Substitutions, items `${var:-value}' and `${var=value}' for the rationale.



10.7 Special Shell Variables

Some shell variables should not be used, since they can have a deep influence on the behavior of the shell. In order to recover a sane behavior from the shell, some variables should be unset, but unset is not portable (see section Limitations of Shell Builtins) and a fallback value is needed. We list these values below.

CDPATH

When this variable is set it specifies a list of directories to search when invoking cd with a relative filename. POSIX 1003.1-2001 says that if a nonempty directory name from CDPATH is used successfully, cd prints the resulting absolute filename. Unfortunately this output can break idioms like `abs=`cd src && pwd`' because abs receives the path twice. Also, many shells do not conform to this part of POSIX; for example, zsh prints the result only if a directory name other than `.' was chosen from CDPATH.

In practice the shells that have this problem also support unset, so you can work around the problem as follows:

 
(unset CDPATH) >/dev/null 2>&1 && unset CDPATH

Autoconf-generated scripts automatically unset CDPATH if possible, so you need not worry about this problem in those scripts.

IFS

Don't set the first character of IFS to backslash. Indeed, Bourne shells use the first character (backslash) when joining the components in `"$@"' and some shells then re-interpret (!) the backslash escapes, so you can end up with backspace and other strange characters.

The proper value for IFS (in regular code, not when performing splits) is a space, a tab, and a newline, in that order. The first character is especially important, as it is used to join the arguments in `$*'.
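A sketch of restoring that value (the assignments below contain a literal tab and a literal newline):

# Reset IFS to space, tab, newline.
nl='
'
tab='	'
IFS=" $tab$nl"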

LANG
LC_ALL
LC_COLLATE
LC_CTYPE
LC_MESSAGES
LC_MONETARY
LC_NUMERIC
LC_TIME

Autoconf-generated scripts normally set all these variables to `C' because so much configuration code assumes the C locale and POSIX requires that locale environment variables be set to `C' if the C locale is desired. However, some older, nonstandard systems (notably SCO) break if locale environment variables are set to `C', so when running on these systems Autoconf-generated scripts unset the variables instead.

LANGUAGE

LANGUAGE is not specified by POSIX, but it is a GNU extension that overrides LC_ALL in some cases, so Autoconf-generated scripts set it too.

LC_ADDRESS
LC_IDENTIFICATION
LC_MEASUREMENT
LC_NAME
LC_PAPER
LC_TELEPHONE

These locale environment variables are GNU extensions. They are treated like their POSIX brethren (LC_COLLATE, etc.) as described above.

LINENO

Most modern shells provide the current line number in LINENO. Its value is the line number of the beginning of the current command. Autoconf attempts to execute configure with a modern shell. If no such shell is available, it attempts to implement LINENO with a Sed prepass that replaces each instance of the string $LINENO (not followed by an alphanumeric character) with the line's number.

You should not rely on LINENO within eval, as the behavior differs in practice. Also, the possibility of the Sed prepass means that you should not rely on $LINENO when quoted, when in here-documents, or when in long commands that cross line boundaries. Subshells should be OK, though. In the following example, lines 1, 6, and 9 are portable, but the other instances of LINENO are not:

 
$ cat lineno
echo 1. $LINENO
cat <<EOF
3. $LINENO
4. $LINENO
EOF
( echo 6. $LINENO )
eval 'echo 7. $LINENO'
echo 8. '$LINENO'
echo 9. $LINENO '
10.' $LINENO
$ bash-2.05 lineno
1. 1
3. 2
4. 2
6. 6
7. 1
8. $LINENO
9. 9
10. 9
$ zsh-3.0.6 lineno
1. 1
3. 2
4. 2
6. 6
7. 7
8. $LINENO
9. 9
10. 9
$ pdksh-5.2.14 lineno
1. 1
3. 2
4. 2
6. 6
7. 0
8. $LINENO
9. 9
10. 9
$ sed '=' <lineno |
>   sed '
>     N
>     s,$,-,
>     : loop
>     s,^\([0-9]*\)\(.*\)[$]LINENO\([^a-zA-Z0-9_]\),\1\2\1\3,
>     t loop
>     s,-$,,
>     s,^[0-9]*\n,,
>   ' |
>   sh
1. 1
3. 3
4. 4
6. 6
7. 7
8. 8
9. 9
10. 10

NULLCMD

When executing the command `>foo', zsh executes `$NULLCMD >foo'. The Bourne shell considers NULLCMD to be `:', while zsh, even in Bourne shell compatibility mode, sets NULLCMD to `cat'. If you forgot to set NULLCMD, your script might be suspended waiting for data on its standard input.

ENV
MAIL
MAILPATH
PS1
PS2
PS4

These variables should not matter for shell scripts, since they are supposed to affect only interactive shells. However, at least one shell (the pre-3.0 UWIN ksh) gets confused about whether it is interactive, which means that (for example) a PS1 with a side effect can unexpectedly modify `$?'. To work around this bug, Autoconf-generated scripts do something like this:

 
(unset ENV) >/dev/null 2>&1 && unset ENV MAIL MAILPATH
PS1='$ '
PS2='> '
PS4='+ '

PWD

POSIX 1003.1-2001 requires that cd and pwd must update the PWD environment variable to point to the logical path to the current directory, but traditional shells do not support this. This can cause confusion if one shell instance maintains PWD but a subsidiary and different shell does not know about PWD and executes cd; in this case PWD will point to the wrong directory. Use ``pwd`' rather than `$PWD'.
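A sketch of the recommended idiom (srcdir is a placeholder; remember to neutralize CDPATH first, as described above):

# Compute an absolute directory name from the shell itself,
# without trusting the inherited $PWD.
abs_srcdir=`cd "$srcdir" && pwd`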

status

This variable is an alias to `$?' for zsh (at least 3.1.6), hence read-only. Do not use it.

PATH_SEPARATOR

If it is not set, configure will detect the appropriate path separator for the build system and set the PATH_SEPARATOR output variable accordingly.

On DJGPP systems, the PATH_SEPARATOR environment variable can be set to either `:' or `;' to control the path separator bash uses to set up certain environment variables (such as PATH). Since this only works inside bash, you want configure to detect the regular DOS path separator (`;'), so it can be safely substituted in files that may not support `;' as path separator. So it is recommended to either unset this variable or set it to `;'.

RANDOM

Many shells provide RANDOM, a variable that returns a different integer each time it is used. Most of the time, its value does not change when it is not used, but on IRIX 6.5 the value changes all the time. This can be observed by using set.



10.8 Limitations of Shell Builtins

No, no, we are serious: some shells do have limitations! :)

You should always keep in mind that any builtin or command may support options, and therefore have a very different behavior with arguments starting with a dash. For instance, the innocent `echo "$word"' can give unexpected results when word starts with a dash. It is often possible to avoid this problem using `echo "x$word"', taking the `x' into account later in the pipe.
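A sketch of that trick (word is a placeholder that might start with a dash):

# Prefix the word so echo cannot mistake it for an option, then
# strip the sentinel downstream in the pipe.
word=-n
echo "x$word" | sed 's/^x//'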

.

Use . only with regular files (use `test -f'). Bash 2.03, for instance, chokes on `. /dev/null'. Also, remember that . uses PATH if its argument contains no slashes, so if you want to use . on a file `foo' in the current directory, you must use `. ./foo'.

!

You can't use !; you'll have to rewrite your code.

break

The use of `break 2' etc. is safe.

cd

POSIX 1003.1-2001 requires that cd must support the `-L' ("logical") and `-P' ("physical") options, with `-L' being the default. However, traditional shells do not support these options, and their cd command has the `-P' behavior.

Portable scripts should assume neither option is supported, and should assume neither behavior is the default. This can be a bit tricky, since the POSIX default behavior means that, for example, `ls ..' and `cd ..' may refer to different directories if the current logical directory is a symbolic link. It is safe to use cd dir if dir contains no `..' components. Also, Autoconf-generated scripts check for this problem when computing variables like ac_top_srcdir (see section Performing Configuration Actions), so it is safe to cd to these variables.

Also please see the discussion of the pwd command.

case

You don't need to quote the argument; no splitting is performed.

You don't need the final `;;', but you should use it.

Because of a bug in its fnmatch, bash fails to properly handle backslashes in character classes:

 
bash-2.02$ case /tmp in [/\\]*) echo OK;; esac
bash-2.02$

This is extremely unfortunate, since you are likely to use this code to handle UNIX or MS-DOS absolute paths. To work around this bug, always put the backslash first:

 
bash-2.02$ case '\TMP' in [\\/]*) echo OK;; esac
OK
bash-2.02$ case /tmp in [\\/]*) echo OK;; esac
OK

Some shells, such as Ash 0.3.8, are confused by an empty case/esac:

 
ash-0.3.8 $ case foo in esac;
error-->Syntax error: ";" unexpected (expecting ")")

Many shells still do not support parenthesized cases, which is a pity for those of us using tools that rely on balanced parentheses. For instance, Solaris 2.8's Bourne shell:

 
$ case foo in (foo) echo foo;; esac
error-->syntax error: `(' unexpected

echo

The simple echo is probably the most surprising source of portability troubles. It is not possible to use `echo' portably unless both options and escape sequences are omitted. New applications which are not aiming at portability should use `printf' instead of `echo'.

Don't expect any option. See section Preset Output Variables, ECHO_N etc. for a means to simulate `-n'.

Do not use backslashes in the arguments, as there is no consensus on their handling. On `echo '\n' | wc -l', the sh of Digital Unix 4.0 and MIPS RISC/OS 4.52 answer 2, but Solaris' sh, Bash, and Zsh (in sh emulation mode) report 1. Please note that the problem is truly echo: all the shells understand `'\n'' as the string composed of a backslash and an `n'.

Because of these problems, do not pass a string containing arbitrary characters to echo. For example, `echo "$foo"' is safe if you know that foo's value cannot contain backslashes and cannot start with `-', but otherwise you should use a here-document like this:

 
cat <<EOF
$foo
EOF

exit

The default value of exit is supposed to be $?; unfortunately, some shells, such as the DJGPP port of Bash 2.04, just perform `exit 0'.

 
bash-2.04$ foo=`exit 1` || echo fail
fail
bash-2.04$ foo=`(exit 1)` || echo fail
fail
bash-2.04$ foo=`(exit 1); exit` || echo fail
bash-2.04$

Using `exit $?' restores the expected behavior.

Some shell scripts, such as those generated by autoconf, use a trap to clean up before exiting. If the last shell command exited with nonzero status, the trap also exits with nonzero status so that the invoker can tell that an error occurred.

Unfortunately, in some shells, such as Solaris 8 sh, an exit trap ignores the exit command's argument. In these shells, a trap cannot determine whether it was invoked by plain exit or by exit 1. Instead of calling exit directly, use the AC_MSG_ERROR macro that has a workaround for this problem.

export

The builtin export dubs a shell variable an environment variable. Each update of an exported variable corresponds to an update of the environment variable. Conversely, each environment variable received by the shell when it is launched should be imported as a shell variable marked as exported.

Alas, many shells, such as Solaris 2.5, IRIX 6.3, IRIX 5.2, AIX 4.1.5, and Digital UNIX 4.0, forget to export the environment variables they receive. As a result, two variables coexist: the environment variable and the shell variable. The following code demonstrates this failure:

 
#! /bin/sh
echo $FOO
FOO=bar
echo $FOO
exec /bin/sh $0

when run with `FOO=foo' in the environment, these shells print alternately `foo' and `bar', although they should print only `foo' and then a sequence of `bar's.

Therefore you should export again each environment variable that you update.
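A minimal sketch of that advice:

# Re-export any environment variable you update, so the shell
# variable and the environment copy cannot diverge.
CC=gcc
export CC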

false

Don't expect false to exit with status 1: in the native Bourne shell of Solaris 8 it exits with status 255.

for

To loop over positional arguments, use:

 
for arg
do
  echo "$arg"
done

You may not leave the do on the same line as for, since some shells improperly grok:

 
for arg; do
  echo "$arg"
done

If you want to explicitly refer to the positional arguments, given the `$@' bug (see section Shell Substitutions), use:

 
for arg in ${1+"$@"}; do
  echo "$arg"
done

But keep in mind that Zsh, even in Bourne shell emulation mode, performs word splitting on `${1+"$@"}'; see Shell Substitutions, item `$@', for more.

if

Using `!' is not portable. Instead of:

 
if ! cmp -s file file.new; then
  mv file.new file
fi

use:

 
if cmp -s file file.new; then :; else
  mv file.new file
fi

There are shells that do not reset the exit status from an if:

 
$ if (exit 42); then true; fi; echo $?
42

whereas a proper shell should have printed `0'. This is especially bad in Makefiles since it produces false failures. This is why properly written Makefiles, such as Automake's, have such hairy constructs:

 
if test -f "$file"; then
  install "$file" "$dest"
else
  :
fi

pwd

With modern shells, plain pwd outputs a "logical" directory name, some of whose components may be symbolic links. These directory names are in contrast to "physical" directory names, whose components are all directories.

POSIX 1003.1-2001 requires that pwd must support the `-L' ("logical") and `-P' ("physical") options, with `-L' being the default. However, traditional shells do not support these options, and their pwd command has the `-P' behavior.

Portable scripts should assume neither option is supported, and should assume neither behavior is the default. Also, on many hosts `/bin/pwd' is equivalent to `pwd -P', but POSIX does not require this behavior and portable scripts should not rely on it.

Typically it's best to use plain pwd. On modern hosts this outputs logical directory names, which have several advantages over physical ones.

Also please see the discussion of the cd command.

set

This builtin faces the usual problem with arguments starting with a dash. Modern shells such as Bash or Zsh understand `--' to specify the end of the options (any argument after `--' is a parameter, even `-x' for instance), but most shells simply stop the option processing as soon as a non-option argument is found. Therefore, use `dummy' or simply `x' to end the option processing, and use shift to pop it out:

 
set x $my_list; shift

Some shells have the "opposite" problem of not recognizing all options (e.g., `set -e -x' assigns `-x' to the command line). It is better to elide these:

 
set -ex

shift

Not only is shifting a bad idea when there is nothing left to shift, but in addition it is not portable: the shell of MIPS RISC/OS 4.52 refuses to do it.

source

This command is not portable, as POSIX does not require it; use . instead.

test

The test program is the way to perform many file and string tests. It is often invoked by the alternate name `[', but using that name in Autoconf code is asking for trouble since it is an M4 quote character.

If you need to make multiple checks using test, combine them with the shell operators `&&' and `||' instead of using the test operators `-a' and `-o'. On System V, the precedence of `-a' and `-o' is wrong relative to the unary operators; consequently, POSIX does not specify them, so using them is nonportable. If you combine `&&' and `||' in the same statement, keep in mind that they have equal precedence.

You may use `!' with test, but not with if: `test ! -r foo || exit 1'.
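A short sketch of these portable forms ($file is a placeholder):

# Combine checks with shell operators, not -a/-o.
test -f "$file" && test -r "$file" && echo usable
# `!' is fine with test itself.
test ! -w "$file" || echo writable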

test (files)

To enable configure scripts to support cross-compilation, they shouldn't do anything that tests features of the build system instead of the host system. But occasionally you may find it necessary to check whether some arbitrary file exists. To do so, use `test -f' or `test -r'. Do not use `test -x', because 4.3BSD does not have it. Do not use `test -e' either, because Solaris 2.5 does not have it.

test (strings)

Avoid `test "string"', in particular if string might start with a dash, since test might interpret its argument as an option (e.g., `string = "-n"').

Contrary to a common belief, `test -n string' and `test -z string' are portable. Nevertheless many shells (such as Solaris 2.5, AIX 3.2, UNICOS 10.0.0.6, Digital Unix 4 etc.) have bizarre precedence and may be confused if string looks like an operator:

 
$ test -n =
test: argument expected

If there are risks, use `test "xstring" = x' or `test "xstring" != x' instead.
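For instance (a sketch; var is a placeholder that might expand to an operator-like value such as `='):

# The `x' prefix keeps test from parsing the value as an operator.
var='='
if test "x$var" != x; then
  echo 'var is non-empty'
fi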

It is common to find variations of the following idiom:

 
test -n "`echo $ac_feature | sed 's/[-a-zA-Z0-9_]//g'`" &&
  action

to take an action when a token matches a given pattern. Such constructs should always be avoided by using:

 
echo "$ac_feature" | grep '[^-a-zA-Z0-9_]' >/dev/null 2>&1 &&
  action

Use case where possible since it is faster, being a shell builtin:

 
case $ac_feature in
  *[!-a-zA-Z0-9_]*) action;;
esac

Alas, negated character classes are probably not portable, although no shell is known to not support the POSIX syntax `[!…]' (when in interactive mode, zsh is confused by the `[!…]' syntax and looks for an event in its history because of `!'). Many shells do not support the alternative syntax `[^…]' (Solaris, Digital Unix, etc.).

One solution can be:

 
expr "$ac_feature" : '.*[^-a-zA-Z0-9_]' >/dev/null &&
  action

or better yet

 
expr "x$ac_feature" : '.*[^-a-zA-Z0-9_]' >/dev/null &&
  action

`expr "Xfoo" : "Xbar"' is more robust than `echo "Xfoo" | grep "^Xbar"', because it avoids problems when `foo' contains backslashes.

trap

It is safe to trap at least the signals 1, 2, 13, and 15. You can also trap 0, i.e., have the trap run when the script ends (either via an explicit exit, or the end of the script).

Although POSIX is not absolutely clear on this point, it is widely admitted that when entering the trap `$?' should be set to the exit status of the last command run before the trap. The ambiguity can be summarized as: "when the trap is launched by an exit, what is the last command run: that before exit, or exit itself?"

Bash considers exit to be the last command, while Zsh and Solaris 8 sh consider that when the trap is run it is still in the exit, hence it is the previous exit status that the trap receives:

 
$ cat trap.sh
trap 'echo $?' 0
(exit 42); exit 0
$ zsh trap.sh
42
$ bash trap.sh
0

The portable solution is then simple: when you want to `exit 42', run `(exit 42); exit 42', the first exit being used to set the exit status to 42 for Zsh, and the second to trigger the trap and pass 42 as exit status for Bash.
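A minimal sketch of the idiom:

# The subshell sets $? to 42 for shells whose trap sees the
# pre-exit status (Zsh, Solaris 8 sh); the explicit argument covers
# shells that report exit's own status (Bash).
trap 'echo "trap saw: $?"' 0
(exit 42); exit 42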

The shell in FreeBSD 4.0 has the following bug: `$?' is reset to 0 by empty lines if the code is inside trap.

 
$ trap 'false

echo $?' 0
$ exit
0

Fortunately, this bug only affects trap.

true

Don't worry: as far as we know true is portable. Nevertheless, it's not always a builtin (e.g., Bash 1.x), and the portable shell community tends to prefer using :. This has a funny side effect: when asked whether false is more portable than true Alexandre Oliva answered:

In a sense, yes, because if it doesn't exist, the shell will produce an exit status of failure, which is correct for false, but not for true.

unset

You cannot assume the support of unset. Nevertheless, because it is extremely useful to disable embarrassing variables such as PS1, you can test for its existence and use it provided you give a neutralizing value when unset is not supported:

 
if (unset FOO) >/dev/null 2>&1; then
  unset=unset
else
  unset=false
fi
$unset PS1 || PS1='$ '

See section Special Shell Variables, for some neutralizing values. Also, see Limitations of Shell Builtins, documentation of export, for the case of environment variables.



10.9 Limitations of Usual Tools

The small set of tools you can expect to find on any machine can still include some limitations you should be aware of.

awk

Don't leave white space before the parentheses in user function calls; GNU awk will reject it:

 
$ gawk 'function die () { print "Aaaaarg!"  }
        BEGIN { die () }'
gawk: cmd. line:2:         BEGIN { die () }
gawk: cmd. line:2:                      ^ parse error
$ gawk 'function die () { print "Aaaaarg!"  }
        BEGIN { die() }'
Aaaaarg!

If you want your program to be deterministic, don't depend on the order in which `for' traverses arrays:

 
$ cat for.awk
END {
  arr["foo"] = 1
  arr["bar"] = 1
  for (i in arr)
    print i
}
$ gawk -f for.awk </dev/null
foo
bar
$ nawk -f for.awk </dev/null
bar
foo

Some AWK implementations, such as HP-UX 11.0's native one, have regex engines that are fragile with inner anchors:

 
$ echo xfoo | $AWK '/foo|^bar/ { print }'
$ echo bar | $AWK '/foo|^bar/ { print }'
bar
$ echo xfoo | $AWK '/^bar|foo/ { print }'
xfoo
$ echo bar | $AWK '/^bar|foo/ { print }'
bar

Either do not depend on such patterns (i.e., use `/^(.*foo|bar)/'), or use a simple test to reject such AWK implementations.

cat

Don't rely on any option. The option `-v', which displays non-printing characters, seems portable, though.

cc

When a compilation such as `cc foo.c -o foo' fails, some compilers (such as CDS on Reliant UNIX) leave a `foo.o'.

HP-UX cc doesn't accept `.S' files to preprocess and assemble. `cc -c foo.S' will appear to succeed, but in fact does nothing.

The default executable, produced by `cc foo.c', can be `a.out' (the usual portable convention), `b.out' (i960 compilers), `a.exe' (the DJGPP port of GCC), `a_out.exe' (GNU-win32 ports), or `foo.exe' (various MS-DOS compilers).

cmp

cmp performs a raw data comparison of two files, while diff compares two text files. Therefore, if you might compare DOS files, even if only checking whether two files are different, use diff to avoid spurious differences due to differences of newline encoding.

cp

SunOS cp does not support `-f', although its mv does. It's possible to deduce why mv and cp are different with respect to `-f'. mv prompts by default before overwriting a read-only file. cp does not. Therefore, mv requires a `-f' option, but cp does not. mv and cp behave differently with respect to read-only files because the simplest form of cp cannot overwrite a read-only file, but the simplest form of mv can. This is because cp opens the target for write access, whereas mv simply calls link (or, in newer systems, rename).

Bob Proulx notes that `cp -p' always tries to copy ownerships. But whether it actually does copy ownerships or not is a system dependent policy decision implemented by the kernel. If the kernel allows it then it happens. If the kernel does not allow it then it does not happen. It is not something cp itself has control over.

In SysV any user can chown files to any other user, and SysV also had a non-sticky `/tmp'. That undoubtedly derives from the heritage of SysV in a business environment without hostile users. BSD changed this to be a more secure model where only root can chown files and a sticky `/tmp' is used. That undoubtedly derives from the heritage of BSD in a campus environment.

Linux by default follows BSD, but it can be configured to allow chown. HP-UX as an alternate example follows SysV, but it can be configured to use the modern security model and disallow chown. Since it is an administrator configurable parameter you can't use the name of the kernel as an indicator of the behavior.

date

Some versions of date do not recognize special % directives, and unfortunately, instead of complaining, they just pass them through, and exit with success:

 
$ uname -a
OSF1 medusa.sis.pasteur.fr V5.1 732 alpha
$ date "+%s"
%s

diff

Option `-u' is nonportable.

Some implementations, such as Tru64's, fail when comparing to `/dev/null'. Use an empty file instead.

dirname

Not all hosts have a working dirname, and you should instead use AS_DIRNAME (see section Programming in M4sh). For example:

 
dir=`dirname "$file"`       # This is not portable.
dir=`AS_DIRNAME(["$file"])` # This is more portable.

This handles a few subtleties in the standard way required by POSIX. For example, under UN*X, should `dirname //1' give `/'? Paul Eggert answers:

No, under some older flavors of Unix, leading `//' is a special path name: it refers to a "super-root" and is used to access other machines' files. Leading `///', `////', etc. are equivalent to `/'; but leading `//' is special. I think this tradition started with Apollo Domain/OS, an OS that is still in use on some older hosts.

POSIX allows but does not require the special treatment for `//'. It says that the behavior of dirname on path names of the form `//([^/]+/*)?' is implementation defined. In these cases, GNU dirname returns `/', but it's more portable to return `//' as this works even on those older flavors of Unix.

egrep

POSIX 1003.1-2001 no longer requires egrep, but many older hosts do not yet support the POSIX replacement grep -E. To work around this problem, invoke AC_PROG_EGREP and then use $EGREP.

The empty alternative is not portable; use `?' instead. For instance, with Digital Unix v5.0:

 
> printf "foo\n|foo\n" | $EGREP '^(|foo|bar)$'
|foo
> printf "bar\nbar|\n" | $EGREP '^(foo|bar|)$'
bar|
> printf "foo\nfoo|\n|bar\nbar\n" | $EGREP '^(foo||bar)$'
foo
|bar

$EGREP also suffers the limitations of grep.

expr

No expr keyword starts with `x', so use `expr x"word" : 'xregex'' to keep expr from misinterpreting word.

Don't use length, substr, match and index.
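To illustrate the `x' guard above (word is a placeholder that could have held an expr keyword such as `match'):

# The leading `x' keeps expr from parsing the operand as a keyword;
# the pattern accounts for it.  This prints `ma'.
word=match
expr x"$word" : 'x\(ma\)'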

expr (`|')

You can use `|'. Although POSIX does require that `expr ''' return the empty string, it does not specify the result when you `|' together the empty string (or zero) with the empty string. For example:

 
expr '' \| ''

GNU/Linux and POSIX.2-1992 return the empty string for this case, but traditional UNIX returns `0' (Solaris is one such example). In POSIX.1-2001, the specification has been changed to match traditional UNIX's behavior (which is bizarre, but it's too late to fix this). Please note that the same problem does arise when the empty string results from a computation, as in:

 
expr bar : foo \| foo : bar

Avoid this portability problem by avoiding the empty string.

expr (`:')

Don't use `\?', `\+' and `\|' in patterns, as they are not supported on Solaris.

The POSIX standard is ambiguous as to whether `expr 'a' : '\(b\)'' outputs `0' or the empty string. In practice, it outputs the empty string on most platforms, but portable scripts should not assume this. For instance, the QNX 4.25 native expr returns `0'.

One might think that a way to get a uniform behavior would be to use the empty string as a default value:

 
expr a : '\(b\)' \| ''

Unfortunately this behaves exactly as the original expression; see the `expr (`|')' entry for more information.

Older expr implementations (e.g., SunOS 4 expr and Solaris 8 /usr/ucb/expr) have a silly length limit that causes expr to fail if the matched substring is longer than 120 bytes. In this case, you might want to fall back on `echo|sed' if expr fails.

Don't leave, there is some more!

The QNX 4.25 expr, in addition to preferring `0' to the empty string, has a funny behavior in its exit status: it's always 1 when parentheses are used!

 
$ val=`expr 'a' : 'a'`; echo "$?: $val"
0: 1
$ val=`expr 'a' : 'b'`; echo "$?: $val"
1: 0

$ val=`expr 'a' : '\(a\)'`; echo "$?: $val"
1: a
$ val=`expr 'a' : '\(b\)'`; echo "$?: $val"
1: 0

In practice this can be a big problem if you are ready to catch failures of expr with some other method (such as using sed), since you may get the result twice. For instance

 
$ expr 'a' : '\(a\)' || echo 'a' | sed 's/^\(a\)$/\1/'

will output `a' on most hosts, but `aa' on QNX 4.25. A simple workaround consists in testing expr and using a variable set to expr or to false according to the result.
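A sketch of that workaround (EXPR is a name of our choosing):

# Probe expr's exit status once; where it is untrustworthy (e.g.,
# QNX 4.25), force the fallback method to run instead.
if expr a : '\(a\)' >/dev/null 2>&1; then
  EXPR=expr
else
  EXPR=false
fi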

fgrep

POSIX 1003.1-2001 no longer requires fgrep, but many older hosts do not yet support the POSIX replacement grep -F. To work around this problem, invoke AC_PROG_FGREP and then use $FGREP.

find

The option `-maxdepth' seems to be GNU specific. Tru64 v5.1, NetBSD 1.5 and Solaris 2.5 find commands do not understand it.

The replacement of `{}' is guaranteed only if the argument is exactly {}, not if it's only a part of an argument. For instance on Digital Unix, HP-UX 10.20, and HP-UX 11:

 
$ touch foo
$ find . -name foo -exec echo "{}-{}" \;
{}-{}

while GNU find reports `./foo-./foo'.

grep

Don't use `grep -s' to suppress output, because `grep -s' on System V does not suppress output, only error messages. Instead, redirect the standard output and standard error (in case the file doesn't exist) of grep to `/dev/null'. Check the exit status of grep to determine whether it found a match.
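A minimal sketch of that replacement (pattern and file are placeholders):

# Silence grep portably: discard both output streams and rely on
# the exit status alone.
if grep "$pattern" "$file" >/dev/null 2>&1; then
  echo found
fi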

Don't use multiple regexps with `-e', as some grep will only honor the last pattern (e.g., IRIX 6.5 and Solaris 2.5.1). Anyway, Stardent Vistra SVR4 grep lacks `-e'… Instead, use extended regular expressions and alternation.

Don't rely on `-w', as Irix 6.5.16m's grep does not support it.

ln

Don't rely on ln having a `-f' option. Symbolic links are not available on old systems; use `$(LN_S)' as a portable substitute.

For versions of the DJGPP before 2.04, ln emulates soft links to executables by generating a stub that in turn calls the real program. This feature also works with nonexistent files like in the Unix spec. So `ln -s file link' will generate `link.exe', which will attempt to call `file.exe' if run. But this feature only works for executables, so `cp -p' is used instead for these systems. DJGPP versions 2.04 and later have full symlink support.

ls

The portable options are `-acdilrtu'. Modern practice is for `-l' to output both owner and group, but traditional ls omits the group.

Modern practice is for all diagnostics to go to standard error, but traditional `ls foo' prints the message `foo not found' to standard output if `foo' does not exist. Be careful when writing shell commands like `sources=`ls *.c 2>/dev/null`', since with traditional ls this is equivalent to `sources="*.c not found"' if there are no `.c' files.

mkdir

None of mkdir's options are portable. Instead of `mkdir -p filename', you should use AS_MKDIR_P(filename) (see section Programming in M4sh).

mv

The only portable options are `-f' and `-i'.

Moving individual files between file systems is portable (it was in V6), but it is not always atomic: when doing `mv new existing', there's a critical section where neither the old nor the new version of `existing' actually exists.

Be aware that moving files from `/tmp' can sometimes cause undesirable (but perfectly valid) warnings, even if you created these files. On some systems, a file created in `/tmp' inherits the directory's group (such as wheel), which you may not be a member of. So when you move it, the file is copied, and then the chgrp fails:

 
$ touch /tmp/foo
$ mv /tmp/foo .
error-->mv: ./foo: set owner/group (was: 3830/0): Operation not permitted
$ echo $?
0
$ ls foo
foo

This behavior conforms to POSIX:

If the duplication of the file characteristics fails for any reason, mv shall write a diagnostic message to standard error, but this failure shall not cause mv to modify its exit status.

Moving directories across mount points is not portable; use cp and rm.

Moving/Deleting open files isn't portable. The following can't be done on DOS/WIN32:

 
exec > foo
mv foo bar

nor can

 
exec > foo
rm -f foo

sed

Patterns should not include the separator (unless escaped), even as part of a character class. In conformance with POSIX, the Cray sed will reject `s/[^/]*$//': use `s,[^/]*$,,'.

Sed scripts should not use branch labels longer than 8 characters and should not contain comments.

Don't include extra `;', as some sed, such as NetBSD 1.4.2's, try to interpret the second as a command:

 
$ echo a | sed 's/x/x/;;s/x/x/'
sed: 1: "s/x/x/;;s/x/x/": invalid command code ;

Input should not have overly long lines, since some sed implementations have an input buffer limited to 4000 bytes.

Alternation, `\|', is common but POSIX does not require its support, so it should be avoided in portable scripts. Solaris 8 sed does not support alternation; e.g., `sed '/a\|b/d'' deletes only lines that contain the literal string `a|b'.

Anchors (`^' and `$') inside groups are not portable.

Nested parenthesization in patterns (e.g., `\(\(a*\)b*\)') is quite portable to modern hosts, but is not supported by some older sed implementations like SVR3.

Of course the option `-e' is portable, but it is not needed. No valid Sed program can start with a dash, so it does not help disambiguation. Its sole usefulness is to help enforce indentation, as in:

 
sed -e instruction-1 \
    -e instruction-2

as opposed to

 
sed instruction-1;instruction-2

Contrary to yet another urban legend, you may portably use `&' in the replacement part of the s command to mean "what was matched". All descendants of Bell Labs' V7 sed (at least; we don't have first-hand experience with older seds) have supported it.

POSIX requires that you must not have any white space between `!' and the following command. It is OK to have blanks between the address and the `!'. For instance, on Solaris 8:

 
$ echo "foo" | sed -n '/bar/ ! p'
error-->Unrecognized command: /bar/ ! p
$ echo "foo" | sed -n '/bar/! p'
error-->Unrecognized command: /bar/! p
$ echo "foo" | sed -n '/bar/ !p'
foo

sed (`t')

Some old systems have sed implementations that "forget" to reset their `t' flag when starting a new cycle. For instance on MIPS RISC/OS and on IRIX 5.3, if you run the following sed script (the line numbers are not an actual part of the text):

 
s/keep me/kept/g  # a
t end             # b
s/.*/deleted/g    # c
: end             # d

on

 
delete me         # 1
delete me         # 2
keep me           # 3
delete me         # 4

you get

 
deleted
delete me
kept
deleted

instead of

 
deleted
deleted
kept
deleted

Why? When processing 1, a matches, therefore sets the t flag, b jumps to d, and the output is produced. When processing line 2, the t flag is still set (this is the bug). Line a fails to match, but sed is not supposed to clear the t flag when a substitution fails. Line b sees that the flag is set, therefore it clears it, and jumps to d, hence you get `delete me' instead of `deleted'. When processing 3, t is clear, a matches, so the flag is set, hence b clears the flags and jumps. Finally, since the flag is clear, 4 is processed properly.

There are two things one should remember about `t' in sed. Firstly, always remember that `t' jumps if some substitution succeeded, not only the immediately preceding one. Therefore, always use a fake `t clear; : clear' to reset the t flag where needed.

Secondly, you cannot rely on sed to clear the flag at each new cycle.

One portable implementation of the script above is:

 
t clear
: clear
s/keep me/kept/g
t end
s/.*/deleted/g
: end

touch

On some old BSD systems, touch or any command that results in an empty file does not update the timestamps, so use a command like echo as a workaround.
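A sketch of the workaround (the file name is illustrative):

# Writing real content updates the timestamp even on the old BSD
# systems described above.
echo timestamp > stamp-h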

GNU touch 3.16r (and presumably all before that) fails to work on SunOS 4.1.3 when the empty file is on an NFS-mounted 4.2 volume.



10.10 Limitations of Make

make itself suffers a great number of limitations, only a few of which are listed here. First of all, remember that since commands are executed by the shell, all its weaknesses are inherited....

$<

POSIX says that the `$<' construct in makefiles can be used only in inference rules and in the `.DEFAULT' rule; its meaning in ordinary rules is unspecified. Solaris 8's make, for instance, will replace it with the empty string.

Leading underscore in macro names

Some makes don't support leading underscores in macro names, such as on NEWS-OS 4.2R.

 
$ cat Makefile
_am_include = #
_am_quote =
all:; @echo this is test
$ make
Make: Must be a separator on rules line 2.  Stop.
$ cat Makefile2
am_include = #
am_quote =
all:; @echo this is test
$ make -f Makefile2
this is test

Trailing backslash in macro

On some versions of HP-UX, make will read multiple newlines following a backslash, continuing to the next non-empty line. For example,

 
FOO = one \

BAR = two

test:
        : FOO is "$(FOO)"
        : BAR is "$(BAR)"

shows FOO equal to one BAR = two. Other makes sensibly let a backslash continue only to the immediately following line.

Escaped newline in comments

According to POSIX, `Makefile' comments start with # and continue until an unescaped newline is reached.

 
% cat Makefile
# A = foo \
      bar \
      baz

all:
        @echo ok
% make   # GNU make
ok

However, in the real world this is not always the case. Some implementations discard anything from # up to the end of the line, ignoring any trailing backslash.

 
% pmake  # BSD make
"Makefile", line 3: Need an operator
Fatal errors encountered -- cannot continue

Therefore, if you want to comment out a multi-line definition, prefix each line with #, not only the first.

 
# A = foo \
#     bar \
#     baz

make macro=value and sub-makes.

A command-line variable definition such as foo=bar overrides any definition of foo in the `Makefile'. Some make implementations (such as GNU make) will propagate this override to sub-invocations of make. This is allowed but not required by POSIX.

 
% cat Makefile
foo = foo
one:
        @echo $(foo)
        $(MAKE) two
two:
        @echo $(foo)
% make foo=bar            # GNU make 3.79.1
bar
make two
make[1]: Entering directory `/home/adl'
bar
make[1]: Leaving directory `/home/adl'
% pmake foo=bar           # BSD make
bar
pmake two
foo

You have a few possibilities if you do want the foo=bar override to propagate to sub-makes. One is to use the -e option, which causes all environment variables to have precedence over the `Makefile' macro definitions, and declare foo as an environment variable:

 
% env foo=bar make -e

The -e option is propagated to sub-makes automatically, and since the environment is inherited between make invocations, the foo macro will be overridden in sub-makes as expected.

Using -e could have unexpected side-effects if your environment contains some other macros usually defined by the Makefile. (See also the note about make -e and SHELL below.)

Another way to propagate overrides to sub-makes is to do it manually, from your `Makefile':

 
foo = foo
one:
        @echo $(foo)
        $(MAKE) foo=$(foo) two
two:
        @echo $(foo)

You need to foresee all macros that a user might want to override if you do that.

The SHELL macro

POSIX-compliant makes internally use the $(SHELL) macro to spawn shell processes and execute `Makefile' rules. This is a builtin macro supplied by make, but it can be modified from the `Makefile' or a command-line argument.

Not all makes will define this SHELL macro. OSF/Tru64 make is an example; this implementation will always use /bin/sh. So it's a good idea to always define SHELL in your `Makefile's. If you use Autoconf, do

 
SHELL = @SHELL@

POSIX-compliant makes should never acquire the value of $(SHELL) from the environment, even when make -e is used (otherwise, think about what would happen to your rules if SHELL=/bin/tcsh).

However, not all make implementations make this exception. For instance, it is not surprising that OSF/Tru64 make doesn't protect SHELL, since it doesn't use it.

 
% cat Makefile
SHELL = /bin/sh
FOO = foo
all:
        @echo $(SHELL)
        @echo $(FOO)
% env SHELL=/bin/tcsh FOO=bar make -e   # OSF1 V4.0 Make
/bin/tcsh
bar
% env SHELL=/bin/tcsh FOO=bar gmake -e  # GNU make
/bin/sh
bar

Comments in rules

Never put comments in a rule.

Some make implementations treat anything starting with a tab as a command for the current rule, even if the tab is immediately followed by a #. The make from Tru64 Unix V5.1 is one of them. The following `Makefile' will run `# foo' through the shell.

 
all:
        # foo
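
If a command needs an explanation, a safe pattern is to keep the comment outside the rule, at the left margin (a minimal sketch, not taken from the original example):

 
# Safe: a `Makefile' comment outside the rule.
all:
        @echo ok
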
The `obj/' subdirectory.

Never name one of your subdirectories `obj/' if you don't like surprises.

If an `obj/' directory exists, BSD make will enter it before reading `Makefile'. Hence the `Makefile' in the current directory will not be read.

 
% cat Makefile
all:
        echo Hello
% cat obj/Makefile
all:
        echo World
% make      # GNU make
echo Hello
Hello
% pmake     # BSD make
echo World
World

make -k

Do not rely on the exit status of make -k. Some implementations reflect whether they encountered an error in their exit status; other implementations always succeed.

 
% cat Makefile
all:
        false
% make -k; echo exit status: $?    # GNU make
false
make: *** [all] Error 1
exit status: 2
% pmake -k; echo exit status: $?   # BSD make
false
*** Error code 1 (continuing)
exit status: 0

VPATH

There is no VPATH support specified in POSIX. Many makes have a form of VPATH support, but its implementation is not consistent amongst makes.

Maybe the best suggestion to give to people who need the VPATH feature is to choose a make implementation and stick to it. Since the resulting `Makefile's are not portable anyway, it is better to choose a portable make (hint, hint).

Here are a couple of known issues with some VPATH implementations.

VPATH and double-colon rules

Any assignment to VPATH causes Sun make to only execute the first set of double-colon rules. (This comment has been here since 1994 and the context has been lost. It's probably about SunOS 4. If you can reproduce this, please send us a test case for illustration.)
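
For reference, a double-colon setup of the kind this note describes might look as follows; under the reported bug only the first set of commands would run (a hypothetical sketch, not a confirmed reproduction):

 
VPATH = ../src
all::
        @echo first set
all::
        @echo second set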

$< in inference rules

One make implementation did not prefix $< with the VPATH directory when the prerequisite was found there. This means that

 
VPATH = ../src
.c.o:
        cc -c $< -o $@

would run cc -c foo.c -o foo.o, even if `foo.c' was actually found in `../src/'.

This can be fixed as follows.

 
VPATH = ../src
.c.o:
        cc -c `test -f $< || echo ../src/`$< -o $@

This kludge was introduced in Automake in 2000, but the exact context has been lost. If you know which make implementation is involved here, please drop us a note.

$< not supported in explicit rules

As said elsewhere, using $< in explicit rules is not portable. The prerequisite file must be named explicitly in the rule. If you want to find the prerequisite via a VPATH search, you have to code the whole thing manually. For instance, using the same pattern as above:

 
VPATH = ../src
foo.o: foo.c
        cc -c `test -f foo.c || echo ../src/`foo.c -o foo.o

Automatic rule rewriting

Some make implementations, such as SunOS make, will search prerequisites in VPATH and rewrite all their occurrences in the rule appropriately.

For instance

 
VPATH = ../src
foo.o: foo.c
        cc -c foo.c -o foo.o

would execute cc -c ../src/foo.c -o foo.o if `foo.c' was found in `../src'. That sounds great.

However, for the sake of other make implementations, we can't rely on this, and we have to search VPATH manually:

 
VPATH = ../src
foo.o: foo.c
        cc -c `test -f foo.c || echo ../src/`foo.c -o foo.o

However the "prerequisite rewriting" still applies here. So if `foo.c' is in `../src', SunOS make will execute

 
cc -c `test -f ../src/foo.c || echo ../src/`foo.c -o foo.o

which reduces to

 
cc -c foo.c -o foo.o

and thus fails. Oops.

One workaround is to make sure that foo.c never appears as a plain word in the rule. For instance these three rules would be safe.

 
VPATH = ../src
foo.o: foo.c
        cc -c `test -f ./foo.c || echo ../src/`foo.c -o foo.o
foo2.o: foo2.c
        cc -c `test -f 'foo2.c' || echo ../src/`foo2.c -o foo2.o
foo3.o: foo3.c
        cc -c `test -f "foo3.c" || echo ../src/`foo3.c -o foo3.o

Things get worse when your prerequisites are in a macro.

 
VPATH = ../src
HEADERS = foo.h foo2.h foo3.h
install-HEADERS: $(HEADERS)
        for i in $(HEADERS); do \
          $(INSTALL) -m 644 `test -f $$i || echo ../src/`$$i \
            $(DESTDIR)$(includedir)/$$i; \
        done

The above install-HEADERS rule is not SunOS-proof, because for i in $(HEADERS); is expanded to for i in foo.h foo2.h foo3.h; where foo.h and foo2.h are plain words and hence subject to VPATH adjustments (foo3.h escapes because it is immediately followed by the `;').

If the three files are in `../src', the rule is run as:

 
for i in ../src/foo.h ../src/foo2.h foo3.h; do \
  install -m 644 `test -f $i || echo ../src/`$i \
     /usr/local/include/$i; \
done

where the first two install calls will fail. For instance, consider the foo.h installation:

 
install -m 644 `test -f ../src/foo.h || echo ../src/`../src/foo.h \
  /usr/local/include/../src/foo.h;

It reduces to:

 
install -m 644 ../src/foo.h /usr/local/include/../src/foo.h;

Note that the manual VPATH search did not cause any problems here; however this command installs `foo.h' in an incorrect directory.

Trying to quote $(HEADERS) in some way, as we did for foo.c a few `Makefile's ago, does not help:

 
install-HEADERS: $(HEADERS)
        headers='$(HEADERS)'; for i in $$headers; do \
          $(INSTALL) -m 644 `test -f $$i || echo ../src/`$$i \
            $(DESTDIR)$(includedir)/$$i; \
        done

Indeed, headers='$(HEADERS)' expands to headers='foo.h foo2.h foo3.h' where foo2.h is still a plain word. (Aside: the headers='$(HEADERS)'; for i in $$headers; idiom is a good idea if $(HEADERS) can be empty, because some shells produce a syntax error on `for i in;'.)
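
For illustration, here is the failure mode that idiom avoids (a hypothetical transcript; the exact diagnostic varies by shell):

 
$ for i in; do echo $i; done
syntax error
$ list=''; for i in $list; do echo $i; done
$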

One workaround is to strip this unwanted `../src/' prefix manually:

 
VPATH = ../src
HEADERS = foo.h foo2.h foo3.h
install-HEADERS: $(HEADERS)
        headers='$(HEADERS)'; for i in $$headers; do \
          i=`expr "$$i" : '../src/\(.*\)' \| "$$i"`; \
          $(INSTALL) -m 644 `test -f $$i || echo ../src/`$$i \
            $(DESTDIR)$(includedir)/$$i; \
        done

OSF/Tru64 make creates prerequisite directories magically

When a prerequisite is a sub-directory of VPATH, Tru64 make will create it in the current directory.

 
% mkdir -p foo/bar build
% cd build
% cat >Makefile <<END
VPATH = ..
all: foo/bar
END
% make
mkdir foo
mkdir foo/bar

This can yield unexpected results if a rule uses a manual VPATH search as presented before.

 
VPATH = ..
all : foo/bar
        command `test -d foo/bar || echo ../`foo/bar

The above command will be run on the empty `foo/bar' directory that was created in the current directory.

Target lookup

GNU make uses a rather complex algorithm to decide when it should use files found via a VPATH search. See (make)Search Algorithm section `How Directory Searches are Performed' in The GNU Make Manual.

If a target needs to be rebuilt, GNU make discards the filename found during the VPATH search for this target, and builds the file locally using the filename given in the `Makefile'. If a target does not need to be rebuilt, GNU make uses the filename found during the VPATH search.

Other make implementations, like BSD make, are easier to describe: the filename found during the VPATH search will be used whether the target needs to be rebuilt or not. Therefore new files are created locally, but existing files are updated at their VPATH location.

When attempting a VPATH build for an autoconfiscated package (e.g., mkdir build; cd build; ../configure), this means that GNU make will build everything locally in the `build' directory, while BSD make will build new files locally and update existing files in the source directory.

 
% cat Makefile
VPATH = ..
all: foo.x bar.x
foo.x bar.x: newer.x
        @echo Building $@
% touch ../bar.x
% touch ../newer.x
% make        # GNU make
Building foo.x
Building bar.x
% pmake       # BSD make
Building foo.x
Building ../bar.x

Another point worth mentioning is that once GNU make has decided to ignore a VPATH filename (e.g., it ignored `../bar.x' in the above example) it will continue to ignore it when the target occurs as a prerequisite of another rule.

The following example shows that GNU make does not look up `bar.x' in VPATH before performing the .x.y rule, because it ignored the VPATH result of `bar.x' while running the bar.x: newer.x rule.

 
% cat Makefile
VPATH = ..
all: bar.y
bar.x: newer.x
        @echo Building $@
.SUFFIXES: .x .y
.x.y:
        cp $< $@
% touch ../bar.x
% touch ../newer.x
% make        # GNU make
Building bar.x
cp bar.x bar.y
cp: cannot stat `bar.x': No such file or directory
make: *** [bar.y] Error 1
% pmake       # BSD make
Building ../bar.x
cp ../bar.x bar.y

Note that if you drop the command from the bar.x: newer.x rule, things magically start to work: GNU make knows that bar.x hasn't been updated, and therefore it doesn't discard the result of the VPATH search (`../bar.x') in subsequent uses.

 
% cat Makefile
VPATH = ..
all: bar.y
bar.x: newer.x
.SUFFIXES: .x .y
.x.y:
        cp $< $@
% touch ../bar.x
% touch ../newer.x
% make        # GNU make
cp ../bar.x bar.y
% rm bar.y
% pmake       # BSD make
cp ../bar.x bar.y

Single Suffix Rules and Separated Dependencies

A Single Suffix Rule is basically an ordinary suffix (inference) rule (`.from.to:') whose destination suffix is empty (`.from:').

Separated dependencies simply refers to listing the prerequisites of a target without defining a rule, so that all the rules can be kept in one place and all the dependencies in another.
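
For instance, the following `Makefile' keeps the dependency declaration apart from the inference rule that supplies the commands (a small sketch assuming C sources):

 
# Dependency only: no commands attached.
foo.o: foo.h
# The commands come from the inference rule.
.c.o:
        cc -c $< -o $@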

Solaris make does not support separated dependencies for targets defined by single suffix rules:

 
$ cat Makefile
.SUFFIXES: .in
foo: foo.in
.in:
        cp $< $@
$ touch foo.in
$ make
$ ls
Makefile  foo.in

while GNU Make does:

 
$ gmake
cp foo.in foo
$ ls
Makefile  foo       foo.in

Note that it works without the `foo: foo.in' dependency.

 
$ cat Makefile
.SUFFIXES: .in
.in:
        cp $< $@
$ make foo
cp foo.in foo

and it works with double suffix inference rules:

 
$ cat Makefile
foo.out: foo.in
.SUFFIXES: .in .out
.in.out:
        cp $< $@
$ make
cp foo.in foo.out

As a result, in such a case, you have to write target rules.

