8 Controlling the Archive Format
For historical reasons, there are several formats of tar archives.
All of them are based on the same principles, but have some subtle
differences that often make them incompatible with each other.
GNU tar is able to create and handle archives in a variety of formats.
The most frequently used formats are (in alphabetical order):
- gnu
- Format used by GNU tar versions up to 1.13.25. This format was derived
from an early POSIX standard, adding some improvements such as
sparse file handling and incremental archives. Unfortunately, these
features were implemented in a way incompatible with other archive
formats.
Archives in ‘gnu’ format are able to hold pathnames of unlimited
length.
- oldgnu
- Format used by GNU tar versions prior to 1.12.
- v7
- Archive format compatible with the V7 implementation of tar. This
format imposes a number of limitations. The most important of them
are:
- The maximum length of a file name is limited to 99 characters.
- The maximum length of a symbolic link is limited to 99 characters.
- It is impossible to store special files (block and character
devices, fifos, etc.).
- The maximum value of a user or group ID is limited to 2097151 (7777777
octal).
- V7 archives do not contain symbolic ownership information (user
and group name of the file owner).
This format has traditionally been used by Automake when producing
Makefiles. This practice will change in the future. In the meantime,
however, projects containing file names more than 99 characters long
will not be able to use GNU tar 1.15.92 and Automake prior to 1.9.
- ustar
- Archive format defined by POSIX.1-1988 specification. It stores
symbolic ownership information. It is also able to store
special files. However, it imposes several restrictions as well:
- The maximum length of a file name is limited to 256 characters,
provided that the file name can be split at a directory separator into
two parts, the first of them being at most 155 bytes long. So, in most
cases the maximum file name length will be shorter than 256
characters.
- The maximum length of a symbolic link name is limited to
100 characters.
- The maximum size of a file the archive is able to accommodate
is 8GB.
- The maximum value of UID/GID is 2097151.
- The maximum number of bits in device major and minor numbers is 21.
- star
- Format used by Jörg Schilling's star
implementation. GNU tar is able to read ‘star’ archives but
currently does not produce them.
- posix
- Archive format defined by POSIX.1-2001 specification. This is the
most flexible and feature-rich format. It does not impose any
restrictions on file sizes or filename lengths. This format is quite
recent, so not all tar implementations are able to handle it properly.
However, this format is designed in such a way that any tar
implementation able to read ‘ustar’ archives will be able to read
most ‘posix’ archives as well. The only exception is that any
additional information (such as long file names) will in that
case be extracted as plain text files along with the files it refers to.
This archive format will be the default format for future versions
of GNU tar.
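The ‘ustar’ file name rule described above can be illustrated with a short sketch. It is not taken from GNU tar itself; it assumes the conventional ustar header layout with a 100-byte name field and a 155-byte prefix field, and the helper name `split_ustar_name` is hypothetical.

```python
# Assumed ustar header field widths (not stated in full by the text above).
USTAR_NAME_MAX = 100     # bytes available for the final part of the name
USTAR_PREFIX_MAX = 155   # bytes available for the leading directory part


def split_ustar_name(path):
    """Return (prefix, name) if path fits a ustar header, else None."""
    if len(path.encode("utf-8")) <= USTAR_NAME_MAX:
        # Short names need no prefix at all.
        return ("", path)
    # Try each directory separator as a candidate split point.
    for i, ch in enumerate(path):
        if ch != "/":
            continue
        prefix, name = path[:i], path[i + 1:]
        if not prefix or not name:
            continue
        if (len(prefix.encode("utf-8")) <= USTAR_PREFIX_MAX
                and len(name.encode("utf-8")) <= USTAR_NAME_MAX):
            return (prefix, name)
    return None


short_case = split_ustar_name("src/main.c")
split_case = split_ustar_name("d" * 150 + "/" + "f" * 100)
# 167 bytes in total, yet the only separator leaves a 156-byte prefix.
too_long_case = split_ustar_name("d" * 156 + "/" + "f" * 10)
```

The last case shows why the practical limit is usually shorter than 256 characters: a name well under that length is still rejected when no directory separator falls at a suitable position.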
The following table summarizes the limitations of each of these
formats:
Format | UID | File Size | Path Name | Devn (bits)
---|---|---|---|---
gnu | 1.8e19 | Unlimited | Unlimited | 63
oldgnu | 1.8e19 | Unlimited | Unlimited | 63
v7 | 2097151 | 8GB | 99 | n/a
ustar | 2097151 | 8GB | 256 | 21
posix | Unlimited | Unlimited | Unlimited | Unlimited
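The numeric limits for ‘v7’ and ‘ustar’ follow from storing values as fixed-width octal numbers. The sketch below reproduces the figures from the table, assuming 7 octal digits for the UID/GID and device number fields and 11 octal digits for the size field; these widths are an assumption about the header layout, and `max_octal_value` is a hypothetical helper.

```python
def max_octal_value(digits):
    """Largest number representable with the given count of octal digits."""
    return 8 ** digits - 1


# Assumed field widths: 7 octal digits for UID/GID and device numbers,
# 11 octal digits for the file size.
uid_max = max_octal_value(7)            # 7777777 octal
size_limit = max_octal_value(11) + 1    # one past the largest storable size
dev_bits = uid_max.bit_length()         # each octal digit carries 3 bits

print(uid_max, oct(uid_max))
print(size_limit // 1024 ** 3, "GB")
print(dev_bits, "bits")
```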
The default format for GNU tar is defined at compilation
time. You may check it by running tar --help and examining
the last lines of its output. Usually, GNU tar is configured
to create archives in ‘gnu’ format; however, future versions will
switch to ‘posix’.
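Because the default varies between installations, it can be safer to select the format explicitly when creating an archive. The sketch below illustrates the effect of that choice using Python's standard `tarfile` module rather than GNU tar itself; its `USTAR_FORMAT`, `GNU_FORMAT` and `PAX_FORMAT` constants correspond to the ‘ustar’, ‘gnu’ and ‘posix’ formats. The helpers `archive_with_format` and `member_names` are hypothetical names introduced for illustration.

```python
import io
import tarfile


def archive_with_format(member_name, payload, fmt):
    """Build an in-memory archive holding one file, using the given format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def member_names(data):
    """List the member names stored in an in-memory archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        return tar.getnames()


# A 401-character name with no usable split point for ustar.
long_name = "d" * 200 + "/" + "f" * 200

pax_data = archive_with_format(long_name, b"hello", tarfile.PAX_FORMAT)
gnu_data = archive_with_format(long_name, b"hello", tarfile.GNU_FORMAT)

try:
    archive_with_format(long_name, b"hello", tarfile.USTAR_FORMAT)
    ustar_ok = True
except ValueError:
    # tarfile refuses names that do not fit the ustar header.
    ustar_ok = False
```

In this sketch the ‘posix’ and ‘gnu’ archives keep the long name intact, while the ‘ustar’ attempt is rejected, which matches the path name column of the table above.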