tar (Section 38.2) is a general-purpose archiving utility capable of packing many files into a single archive file, retaining information such as file permissions and ownership. The name tar stands for tape archive, because the tool was originally used to archive files as backups on tape. However, use of tar is not at all restricted to making tape backups, as we'll see.
The format of the tar command is:
tar functionoptions files...
where function is a single letter indicating the operation to perform, options is a list of (single-letter) options to that function, and files is the list of files to pack or unpack in an archive. (Note that function is not separated from options by any space.)
The most commonly used functions are c (create), x (extract), and t (table of contents).
There are other options, which we cover in Section 38.5. Section 38.12 has more information about the order of tar options, and Section 39.3 has a lot more about GNU tar.
Although the tar syntax might appear complex at first, in practice it's quite simple. For example, say we have a directory named mt, containing these files:
rutabaga% ls -l mt
total 37
-rw-r--r-- 1 root root 24 Sep 21 1993 Makefile
-rw-r--r-- 1 root root 847 Sep 21 1993 README
-rwxr-xr-x 1 root root 9220 Nov 16 19:03 mt
-rw-r--r-- 1 root root 2775 Aug 7 1993 mt.1
-rw-r--r-- 1 root root 6421 Aug 7 1993 mt.c
-rw-r--r-- 1 root root 3948 Nov 16 19:02 mt.o
-rw-r--r-- 1 root root 11204 Sep 5 1993 st_info.txt
We wish to pack the contents of this directory into a single tar archive. To do this, we use the following command:
tar cf mt.tar mt
The first argument to tar is the function (here, c, for create) followed by any options. Here, we use the one option f mt.tar, to specify that the resulting tar archive be named mt.tar. The last argument is the name of the file or files to archive; in this case, we give the name of a directory, so tar packs all files in that directory into the archive.
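To try this without an mt directory of your own, here is a minimal sketch that builds a stand-in directory in a scratch location and archives it. The scratch directory and the file contents are placeholders for illustration, not the real mt sources.

```shell
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Stand-in for the mt directory; the contents are placeholders.
mkdir mt
echo "all: mt" > mt/Makefile
echo "placeholder notes" > mt/README

# c = create, f mt.tar = name of the archive, mt = what to pack.
tar cf mt.tar mt

# The archive is an ordinary file in the current directory.
ls -l mt.tar
```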
Note that the first argument to tar must be a function letter followed by a list of options. Because of this, there's no reason to use a hyphen (-) to precede the options as many Unix commands require. tar allows you to use a hyphen, as in:
tar -cf mt.tar mt
but it's really not necessary. In some versions of tar, the first letter must be the function, as in c, t, or x. In other versions, the order of letters does not matter as long as there is one and only one function given.
The function letters as described here follow the so-called "old option style." There is also a newer "short option style," in which you precede the function options with a hyphen. On some versions of tar, a "long option style" is available, in which you use long option names with two hyphens. See the manpage or info page (Section 2.9) for tar for more details if you are interested.
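The three styles can be compared side by side. The sketch below assumes a tar that accepts the long options --create and --file (GNU tar does; check the manpage for yours), and it uses a scratch directory with placeholder contents.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir mt
echo "placeholder" > mt/README

# Old option style: function and options in one word, no hyphen.
tar cf old.tar mt

# Short option style: the same letters, preceded by a hyphen.
tar -cf short.tar mt

# Long option style: assumes your tar supports long option names.
tar --create --file=long.tar mt

# All three archives should list the same members.
tar tf old.tar
tar tf short.tar
tar tf long.tar
```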
It is often a good idea to use the v option with tar to list each file as it is archived. For example:
rutabaga% tar cvf mt.tar mt
mt/
mt/st_info.txt
mt/README
mt/mt.1
mt/Makefile
mt/mt.c
mt/mt.o
mt/mt
On some tars, if you use v multiple times, additional information will be printed, as in:
rutabaga% tar cvvf mt.tar mt
drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
-rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
-rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
-rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
-rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
-rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
-rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
-rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt
This is especially useful as it lets you verify that tar is doing the right thing.
In some versions of tar, f must be the last letter in the list of options. This is because tar expects the f option to be followed by a filename: the name of the tar file to read from or write to. If you don't specify f filename at all, tar uses a default tape device (some versions of tar use /dev/rmt0 for historical reasons regardless of the OS; some have a slightly more specific default). Section 38.5 talks about using tar in conjunction with a tape drive to make backups.
Now we can give the file mt.tar to other people, and they can extract it on their own system. To do this, they would use the command:
tar xvf mt.tar
This creates the subdirectory mt and places all the original files into it, with the same permissions as found on the original system. The new files will be owned by the user running tar xvf (you) unless you are running as root, in which case the original owner is generally preserved. Some versions require the o option to set ownership.
The x function stands for "extract." The v option is used again here to list each file as it is extracted. This produces:
courgette% tar xvf mt.tar
mt/
mt/st_info.txt
mt/README
mt/mt.1
mt/Makefile
mt/mt.c
mt/mt.o
mt/mt
We can see that tar saves the pathname of each file relative to the location where the tar file was originally created. That is, when we created the archive using tar cf mt.tar mt, the only input filename we specified was mt, the name of the directory containing the files. Therefore, tar stores the directory itself and all of the files below that directory in the tar file. When we extract the tar file, the directory mt is created and the files are placed into it, which is the exact inverse of what was done to create the archive.
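The round trip can be checked on a single machine. This is a minimal sketch using a scratch directory and placeholder files; the directory name unpacked is made up for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir mt
echo "placeholder" > mt/README
printf '#!/bin/sh\necho mt\n' > mt/mt
chmod 755 mt/mt

tar cf mt.tar mt

# Extract somewhere else, as a recipient would.
mkdir unpacked
cd unpacked || exit 1
tar xvf ../mt.tar

# The mt directory is recreated here; its contents should match the original.
diff -r ../mt mt && echo "contents match"

# Permissions travel with the archive, so mt/mt should still be executable.
ls -l mt/mt
```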
If you were to pack up the contents of your /bin directory with the command:
tar cvf bin.tar /bin
you could do serious damage when extracting the tar file: an archive packed as /bin could overwrite the contents of your /bin directory. If you want to archive /bin, you should create the archive from the root directory, /, using the relative pathname (Section 1.16) bin (with no leading slash). If you really do want to overwrite /bin, extract the tar file after cding to / first. Section 38.11 explains and lists workarounds.
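The relative-pathname approach can be rehearsed safely. In this sketch a scratch directory named pretend-root (a made-up name) stands in for /, so the real /bin is never touched.

```shell
# A scratch directory plays the role of / here.
scratch=$(mktemp -d)
mkdir -p "$scratch/pretend-root/bin"
echo "placeholder" > "$scratch/pretend-root/bin/ls"

# cd to the parent and name the directory relatively (no leading slash).
cd "$scratch/pretend-root" || exit 1
tar cf "$scratch/bin.tar" bin

# Member names are relative, so extraction lands under whatever
# directory you cd to first, not under a fixed absolute path.
tar tf "$scratch/bin.tar"
```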
Another way to create the tar file mt.tar would be to cd into the mt directory itself, and use a command such as:
tar cvf mt.tar *
This way the mt subdirectory would not be stored in the tar file; when extracted, the files would be placed directly in your current working directory. One fine point of tar etiquette is always to pack tar files so that they contain a subdirectory, as we did in the first example with tar cvf mt.tar mt. That way, when the archive is extracted, the subdirectory is created and the files are placed in it. This ensures that the files won't be placed directly in your current working directory; they will be tucked out of the way, which prevents confusion. It also saves the person doing the extraction the trouble of having to create a separate directory (should they wish to do so) to unpack the tar file. Of course, there are plenty of situations where you wouldn't want to do this. So much for etiquette.
When creating archives, you can, of course, give tar a list of files or directories to pack into the archive. In the first example, we have given tar the single directory mt, but in the previous paragraph we used the wildcard *, which the shell expands into the list of filenames in the current directory.
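The difference between the two ways of packing shows up in the member names. The following sketch uses a scratch directory with placeholder files; the archive names with-dir.tar and flat.tar are made up for the comparison.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir mt
echo "placeholder" > mt/README
echo "placeholder" > mt/Makefile

# Packed from the parent: every member name starts with mt/.
tar cf with-dir.tar mt
tar tf with-dir.tar

# Packed from inside with *: the shell expands * to Makefile README
# before tar runs, so the names carry no directory prefix.
# (By default, * does not match names beginning with a dot.)
(cd mt && tar cf ../flat.tar *)
tar tf flat.tar
```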
Before extracting a tar file, it's usually a good idea to take a look at its table of contents to determine how it was packed. This way you can determine whether you do need to create a subdirectory yourself where you can unpack the archive. A command such as:
tar tvf tarfile
lists the table of contents for the named tarfile. Note that when using the t function, only one v is required to get the long file listing, as in this example:
courgette% tar tvf mt.tar
drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
-rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
-rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
-rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
-rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
-rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
-rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
-rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt
No extraction is being done here; we're just displaying the archive's table of contents. We can see from the filenames that this file was packed with all files in the subdirectory mt, so that when we extract the tar file, the directory mt will be created, and the files placed there.
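If you check archives often, the inspection can be scripted. This is a rough sketch, not a complete test: it looks only at the first component of each member name, so archives whose names begin with ./ would need extra handling. The variable names tops and count are made up for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir mt
echo "placeholder" > mt/README
tar cf mt.tar mt

# Take the first path component of every member and keep the distinct values.
tops=$(tar tf mt.tar | cut -d/ -f1 | sort -u)
count=$(printf '%s\n' "$tops" | wc -l | tr -d ' ')

if [ "$count" -eq 1 ]; then
    echo "unpacks into its own directory: $tops"
else
    echo "unpacks into the current directory; consider making a subdirectory first"
fi
```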
You can also extract individual files from a tar archive. To do this, use the command:
tar xvf tarfile files
where files is the list of files to extract. As we've seen, if you don't specify any files, tar extracts the entire archive.
When specifying individual files to extract, you must give the full pathname as it is stored in the tar file. For example, if we wanted to grab just the file mt.c from the previous archive mt.tar, we'd use the command:
tar xvf mt.tar mt/mt.c
This would create the subdirectory mt and place the file mt.c within it.
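A small sketch of single-file extraction, using a scratch directory and placeholder file contents, shows that only the named member is unpacked:

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir mt
echo "int main(void) { return 0; }" > mt/mt.c
echo "placeholder" > mt/README
tar cf mt.tar mt

mkdir unpacked
cd unpacked || exit 1

# The name must match what tar tf shows: mt/mt.c, not just mt.c.
tar xvf ../mt.tar mt/mt.c

# Only the requested file (and its parent directory) is created.
ls mt
```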
tar has many more options than those mentioned here. These are the features that you're likely to use most of the time, but GNU tar, in particular, has extensions that make it ideal for creating backups and the like. See the tar manpage or info page (Section 2.9) and the following chapter for more information.