Any tar implementation will be able to extract sparse members from a PAX archive. However, the extracted files will be condensed, i.e. any zero blocks will be removed from them. When we restore such a condensed file to its original form, by adding zero blocks (or holes) back to their original locations, we call this process expanding a condensed sparse file.
To expand a file, you will need a simple auxiliary program called xsparse. It is available in source form from the GNU tar home page.
Let's begin with archive members in sparse format version 1.0, which are the easiest to expand. The condensed file will contain both the file map and the file data, so no additional data will be needed to restore it. If the original file name was dir/name, then the condensed file will be named dir/GNUSparseFile.n/name, where n is a decimal number [2].
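After extracting an archive with another tar, the condensed files can be located by this naming pattern. A minimal sketch, assuming the dir/GNUSparseFile.n/name layout described above; the function name list_condensed is hypothetical.

```shell
# Hypothetical helper: list regular files that sit directly or indirectly
# under a GNUSparseFile.n directory, starting from DIR (default: current
# directory).
list_condensed() {
    find "${1:-.}" -type f -path '*/GNUSparseFile.*/*'
}

# Usage: list_condensed /home/gray
```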
To expand a version 1.0 file, run xsparse as follows:
$ xsparse cond-file
where cond-file is the name of the condensed file. The utility will deduce the name for the resulting expanded file using the following algorithm:
In the unlikely case when this algorithm does not suit your needs, you can explicitly specify the output file name as a second argument to the command:
$ xsparse cond-file out-file
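When scripting, it can help to compute the output name yourself and pass it explicitly as the second argument. The sketch below only strips a GNUSparseFile.n path component, following the naming convention described earlier; it is not a reproduction of the algorithm xsparse itself uses, and the function name expanded_name is hypothetical.

```shell
# Hypothetical helper: map dir/GNUSparseFile.n/name to dir/name.
# Names that do not follow that layout are printed unchanged.
expanded_name() {
    cond=$1
    case $cond in
        */*) ;;                              # has a directory part
        *) printf '%s\n' "$cond"; return ;;  # plain name: nothing to strip
    esac
    name=${cond##*/}      # last path component
    parent=${cond%/*}     # everything before it
    case ${parent##*/} in
        GNUSparseFile.*)
            case $parent in
                */*) printf '%s/%s\n' "${parent%/*}" "$name" ;;
                *)   printf '%s\n' "$name" ;;
            esac ;;
        *) printf '%s\n' "$cond" ;;
    esac
}

# Usage (xsparse must be installed):
#   cond=/home/gray/GNUSparseFile.6058/sparsefile
#   xsparse "$cond" "$(expanded_name "$cond")"
```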
It is often a good idea to run xsparse in dry run mode first. In this mode, the command does not actually expand the file, but verbosely lists all actions it would take to do so. The dry run mode is enabled by the -n command line option:
$ xsparse -n /home/gray/GNUSparseFile.6058/sparsefile
Reading v.1.0 sparse map
Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to `/home/gray/sparsefile'
Finished dry run
To actually expand the file, you would run:
$ xsparse /home/gray/GNUSparseFile.6058/sparsefile
The program behaves the same way all UNIX utilities do: it keeps quiet unless it has something important to tell you (e.g. an error condition). If you wish it to produce verbose output, similar to that from the dry run mode, give it the -v option:
$ xsparse -v /home/gray/GNUSparseFile.6058/sparsefile
Reading v.1.0 sparse map
Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to `/home/gray/sparsefile'
Done
Additionally, if your tar implementation has extracted the extended headers for this file, you can instruct xsparse to use them in order to verify the integrity of the expanded file. The option -x sets the name of the extended header file to use. Continuing our example:
$ xsparse -v -x /home/gray/PaxHeaders.6058/sparsefile \
  /home/gray/GNUSparseFile.6058/sparsefile
Reading extended header file
Found variable GNU.sparse.major = 1
Found variable GNU.sparse.minor = 0
Found variable GNU.sparse.name = sparsefile
Found variable GNU.sparse.realsize = 217481216
Reading v.1.0 sparse map
Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to `/home/gray/sparsefile'
Done
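As an additional check made by hand, the size of the expanded file can be compared with the GNU.sparse.realsize value reported above. A minimal sketch; the function name check_size is hypothetical, and a size comparison is only a partial check, not a description of what xsparse verifies internally.

```shell
# Hypothetical helper: succeed if FILE is exactly EXPECTED bytes long.
# Returns 2 if FILE does not exist or is not a regular file.
check_size() {
    [ -f "$1" ] || return 2
    actual=$(wc -c < "$1")
    # $((actual)) normalizes any padding some wc implementations emit.
    [ "$((actual))" -eq "$2" ]
}

# Usage: check_size /home/gray/sparsefile 217481216 && echo "size matches"
```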
An extended header is a special tar archive header that precedes an archive member and contains a set of variables describing member properties that cannot be stored in the standard ustar header. While optional for expanding sparse version 1.0 members, the use of extended headers is mandatory when expanding sparse members in the older sparse formats, v.0.0 and v.0.1 (the sparse formats are described in detail in Sparse Formats). So, for these formats, the question is: how to obtain extended headers from the archive?
If you use a tar implementation that does not support the PAX format, the extended headers for each member will be extracted as a separate file. If we represent the member name as dir/name, then the extended header file will be named dir/PaxHeaders.n/name, where n is an integer.
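The header file that belongs to a given condensed file can then be derived from its name. A minimal sketch, assuming that the number n is the same in GNUSparseFile.n and PaxHeaders.n, as it is in the example above; the function name xhdr_for is hypothetical.

```shell
# Hypothetical helper: map dir/GNUSparseFile.n/name to dir/PaxHeaders.n/name.
# ASSUMPTION: both directories carry the same number n.
xhdr_for() {
    printf '%s\n' "$1" |
        sed 's|GNUSparseFile\.\([0-9]*\)/\([^/]*\)$|PaxHeaders.\1/\2|'
}

# Usage (xsparse must be installed):
#   cond=/home/gray/GNUSparseFile.6058/sparsefile
#   xsparse -v -x "$(xhdr_for "$cond")" "$cond"
```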
Things become more difficult if your tar implementation does support PAX headers, because in this case you will have to manually extract the headers. We recommend the following algorithm:
$ star -t -v -block-number -f arc.tar
...
star: Unknown extended header keyword 'GNU.sparse.size' ignored.
star: Unknown extended header keyword 'GNU.sparse.numblocks' ignored.
star: Unknown extended header keyword 'GNU.sparse.name' ignored.
star: Unknown extended header keyword 'GNU.sparse.map' ignored.
block 56: 425984 -rw-r--r-- gray/users Jun 25 14:46 2006 GNUSparseFile.28124/sparsefile
block 897: 65391 -rw-r--r-- gray/users Jun 24 20:06 2006 README
...
(as usual, ignore the warnings about unknown keywords.)
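The block numbers and the member size can be picked out of such a listing mechanically. A minimal sketch, assuming the listing layout shown above (a `block N:` prefix, then the size, with the member name as the last field) and member names without spaces; the function name member_blocks is hypothetical.

```shell
# Hypothetical helper: read a block-numbered listing on standard input and
# print "<member block> <member size> <next member block>" for MEMBER.
member_blocks() {
    awk -v member="$1" '
        $1 == "block" && $2 ~ /^[0-9]+:$/ {
            blk = $2
            sub(/:$/, "", blk)                 # drop the trailing colon
            if (found) { print bs, size, blk; exit }
            if ($NF == member) { bs = blk; size = $3; found = 1 }
        }'
}

# Usage:
#   star -t -v -block-number -f arc.tar 2>/dev/null |
#       member_blocks GNUSparseFile.28124/sparsefile
```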
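Let size be the size of the sparse member in bytes, Bs be its block number and Bn be the block number of the member that follows it. Compute:
N = Bn - Bs - size/512 - 2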
This number gives the size of the extended header part in tar blocks.
In our example, this formula gives: 897 - 56 - 425984 / 512 - 2 = 7.
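The arithmetic of the example can be reproduced in the shell. A minimal sketch; the function name xhdr_blocks is hypothetical. It takes the sparse member's block number (56 in the example), the next member's block number (897) and the member size in bytes. As an assumption beyond the text, the size is rounded up to whole 512-byte blocks, which gives the same result whenever the size is a multiple of 512, as it is here.

```shell
# Hypothetical helper: number of 512-byte blocks occupied by the extended
# header data. Arguments: <member block> <next member block> <size in bytes>.
xhdr_blocks() {
    bs=$1 bn=$2 size=$3
    # ASSUMPTION: member data is padded to a whole number of blocks.
    data_blocks=$(( (size + 511) / 512 ))
    echo $(( bn - bs - data_blocks - 2 ))
}

xhdr_blocks 56 897 425984    # prints 7
```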
dd if=archive of=hname bs=512 skip=Bs count=N
where archive is the archive name, hname is the name of the file to store the extended header in, and Bs and N are computed in the previous steps.
In our example, this command will be
$ dd if=arc.tar of=xhdr bs=512 skip=56 count=7
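Before handing the extracted header to xsparse, you may want to look at the variables it contains. A minimal sketch, assuming the standard pax record layout (`length keyword=value` followed by a newline) and that the remaining bytes of the file are tar header fields and NUL padding; the function name show_sparse_vars is hypothetical.

```shell
# Hypothetical helper: print the GNU.sparse.* variables found in an
# extracted extended header file.
show_sparse_vars() {
    # Turn NUL padding into line breaks so that each record starts a line,
    # then keep only the "length GNU.sparse.key=value" records.
    tr '\000' '\n' < "$1" |
        sed -n 's/^[0-9][0-9]* \(GNU\.sparse\.[^=]*\)=\(.*\)$/\1 = \2/p'
}

# Usage: show_sparse_vars xhdr
```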
Finally, you can expand the condensed file, using the obtained header:
$ xsparse -v -x xhdr GNUSparseFile.28124/sparsefile
Reading extended header file
Found variable GNU.sparse.size = 217481216
Found variable GNU.sparse.numblocks = 208
Found variable GNU.sparse.name = sparsefile
Found variable GNU.sparse.map = 0,2048,1050624,2048,...
Expanding file `GNUSparseFile.28124/sparsefile' to `sparsefile'
Done
[2] Technically speaking, n is the process ID of the tar process that created the archive (see PAX keywords).