A proficient Linux systems administrator is capable of using at least one text editor in Linux. This skill is needed to manage daily work such as modifying configuration files and creating shell scripts. You have the choice of several editors. Many individuals find a particular editor whose functionality they love and use that one exclusively. This chapter provides a brief sampling of two text editors that are popular with admins.
When managing any computer server, you'll often have files that contain large amounts of text data. It's typically difficult to handle the information and make it useful. It's easy to get data overload when working with system commands. Fortunately, Linux provides several command‐line utilities that help you manage large amounts of data.
When managing a Linux system, it's wise to gain proficiency using at least one text editor. Using features such as searching, cutting, and pasting allows you to modify configuration files more quickly. In Chapter 19, “Writing Scripts,” you'll need the text editor skills that you learn in this chapter to quickly create Bash shell scripts. This section walks you through the basics of using the vim editor, which is typically available for most Linux distros.
Before you begin your exploration of the vim editor, it's a good idea to understand what “flavor” of vim your Linux system has installed. Some distributions install a fully functioning vim editor, but others do not, which can cause you some difficulties.
Three commands can help you determine whether your Linux distro has a fully functioning vim editor.
alias command: To display whether a command calls a different command and/or uses additional command options, you type alias command-name at the command-line interface (CLI). For our purposes here, we need to see whether the typical command to run the vim editor, vi, is aliased to the vim command. Thus, alias vi will give us this information. If vi is not aliased, it's our first clue that a fully functioning vim editor is probably not installed.
which command: This particular command is useful in many circumstances. When you type which command-name, it shows you exactly where the program associated with command-name resides in the Linux virtual directory system. For our investigation, we'll use the which vim command to get the program's location so that we can properly use the next command, readlink.
readlink command: When researching whether you have a fully functioning vim editor, you need to determine if a symbolic link is involved with the vim program name. Sometimes when a vim editor is installed, the vim program file, which by all initial appearances is fully functioning, is symbolically linked to a lesser vim flavor. We could adopt the ls -l technique used in Chapter 7, “Exploring Linux File Management,” to investigate the soft links, but vim file soft links are often chained so that one soft link points to another soft link, which points to an additional soft link, and so on. In this case, it's best to use the readlink -f command, which will quickly find the final file in a chain of links.
Using these three commands to check the vim editor on a CentOS distribution reveals the following:
$ alias vi
alias vi='vim'
$
$ which vim
/usr/bin/vim
$
$ readlink -f /usr/bin/vim
/usr/bin/vim
$
The vi command is aliased, which is a good sign. And there are no soft links that indicate that the vim editor program is linked to a less than full-featured editor program. From the investigation here, you can rest assured that this CentOS distribution has a fully functioning vim editor.
Running these same commands on this Ubuntu distribution shows different results:
$ alias vi
-bash: alias: vi: not found
$
$ which vim
/usr/bin/vim
$
$ readlink -f /usr/bin/vim
/usr/bin/vim.basic
$
Notice that there is no alias for the vi command, but the /usr/bin/vim program file points to /usr/bin/vim.basic. If you have the vim.basic program as shown here on your Ubuntu system, you are OK. However, if in your investigation you find the vim.tiny program, you'll want to install the vim package to get vim.basic so that you can follow along with the vim editor examples in this chapter. Package installation for Ubuntu distributions was covered in Chapter 3, “Installing and Maintaining Software in Ubuntu.”
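The path lookup and link-chasing steps above can be combined into one small helper. This is a minimal sketch, not part of the original walkthrough: the function name editor_target is hypothetical, and it uses the shell built-in command -v in place of which. It does not check aliases, because aliases defined in your interactive session are not visible to a script.

```shell
# Hypothetical helper: print where a command name lives and the file
# its chain of symbolic links finally ends at.
editor_target() {
    # command -v prints the path the shell would run for this name
    target_path=$(command -v "$1") || { echo "$1: not found in PATH" >&2; return 1; }
    # readlink -f follows every link in the chain to the final file
    echo "$target_path -> $(readlink -f "$target_path")"
}

# Report on both names; a missing editor is not treated as fatal here
editor_target vim || true
editor_target vi || true
```

If the right-hand side of the output ends in vim.tiny, that is the cue to install the full vim package as described above.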
Setting up an alias for the vi command on your Ubuntu system is fairly easy: just type alias vi='vim' at the command line. However, this alias will not survive once you log out of the system. To make it permanent, you'll need to add it to one of your login startup files, covered in Chapter 13, “Managing Users and Groups.”
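As an illustration of making the alias permanent, the sketch below appends it to a startup file only if the line is not already present. The function name add_vi_alias is hypothetical, and the default of ~/.bashrc is an assumption; which startup file is right for your system is covered in Chapter 13.

```shell
# Hypothetical helper: add the vi alias to a startup file exactly once.
# The default file (~/.bashrc) is an assumption; pass another path if needed.
add_vi_alias() {
    startup_file="${1:-$HOME/.bashrc}"
    alias_line="alias vi='vim'"
    # Make sure the file exists so grep has something to read
    touch "$startup_file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as plain text
    grep -qxF "$alias_line" "$startup_file" || echo "$alias_line" >> "$startup_file"
}
```

Calling add_vi_alias with no argument edits ~/.bashrc. The alias takes effect in new login sessions, or right away if you re-read the file with the source command.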
To start using the vim text editor, type vim or vi, depending on your distribution, followed by the name of the file you want to edit or create. Figure 8.1 shows a vim text editor screen in action with a text file that was previously created.
In Figure 8.1, the file being edited is the editorTestFile.txt file. The vim editor works on the file data in a memory buffer, and this buffer is displayed on the screen. If you open vim without a filename, or the filename you entered doesn't yet exist, vim starts a new buffer area for editing.
The vim editor has a message area near the bottom line. If you have just opened an already created file, it will display the filename along with the number of lines and characters read into the buffer area. If you are creating a new file, you will see [New File] in the message area.
The vim editor has three standard modes.
Command mode: This is the mode vim uses when you first enter the buffer area; this is sometimes called normal mode. Here you enter keystrokes to enact commands. For example, pressing the J key will move your cursor down one line. This is the best mode to use for quickly moving around the buffer area.
Insert mode: This is the mode in which you type text into the buffer. While it is active, --Insert-- will display in the message area. You leave this mode by pressing the Esc key.
Ex mode: Commands in this mode are preceded by a colon (:). For example, to leave the vim editor and not save any changes, you type :q and press the Enter key.
Since you start in Command mode when entering the vim editor's buffer area, it's good to understand a few of the commonly used commands to move around in this mode. Table 8.1 contains several commands for moving around in the editor.
TABLE 8.1: Commonly Used vim Command Mode Moving Commands
KEYSTROKE | DESCRIPTION |
---|---|
h | Move cursor left one character. |
l | Move cursor right one character. |
j | Move cursor down one line (the next line in the text). |
k | Move cursor up one line (the previous line in the text). |
w | Move cursor forward one word to front of next word. |
e | Move cursor to end of current word. |
b | Move cursor backward one word. |
^ | Move cursor to the first non-blank character of the line. |
$ | Move cursor to end of line. |
gg | Move cursor to the file's first line. |
G | Move cursor to the file's last line. |
nG | Move cursor to file line number n. |
Ctrl+B | Scroll up almost one full screen. |
Ctrl+F | Scroll down almost one full screen. |
Ctrl+U | Scroll up half of a screen. |
Ctrl+D | Scroll down half of a screen. |
Ctrl+Y | Scroll up one line. |
Ctrl+E | Scroll down one line. |
Quickly moving around in the vim editor buffer is useful. However, there are also several editing commands that help to speed up your modification process. For example, by moving your cursor to a word's first letter and pressing CW, the word is deleted, and you are placed into Insert mode. You can then type in the new word and press Esc to leave Insert mode.
Once you have made any needed text changes in the vim buffer area, it's time to save your work. You can type ZZ in Command mode to write the buffer to disk and exit the vim editor.
The third vim mode, Ex mode, has additional handy commands. You must be in Command mode to enter into Ex mode. You cannot jump from Insert mode to Ex mode. Therefore, if you're currently in Insert mode, press the Esc key to go back to Command mode first.
Table 8.2 shows several Ex commands that can help you manage your text file. Notice that all the keystrokes include the necessary colon (:) to use Ex commands.
TABLE 8.2: Commonly Used vim Ex Mode Commands
KEYSTROKES | DESCRIPTION |
---|---|
:x | Write buffer to file and quit editor. |
:wq | Write buffer to file and quit editor. |
:wq! | Write buffer to file and quit editor (overrides protection). |
:w | Write buffer to file and stay in editor. |
:w! | Write buffer to file and stay in editor (overrides protection). |
:q | Quit editor without writing buffer to file. |
:q! | Quit editor without writing buffer to file (overrides protection). |
:! command | Execute shell command and display results, but don't quit editor. |
:r! command | Execute shell command and include the results in editor buffer area. |
:r file | Read file contents and include them in editor buffer area. |
After reading through the various mode commands, you may see why some people despise the vim editor. There are a lot of obscure commands to know. However, some people love the vim editor because it is so powerful.
It's tempting to learn only one text editor and ignore the others. But knowing at least two text editors is useful in your day-to-day Linux work. For complex editing and writing programs, the vim editor is one of the most popular text editors. However, if you just need to make a small change to a file, the nano text editor shines. We cover this one next.
In contrast to vim, which is a complicated editor with powerful features, nano is a simple editor. For individuals who need a simple console mode text editor that is easy to navigate, nano is the tool to use. It's also a great text editor for those who are just starting on their Linux command-line adventure.
The nano text editor is installed on most Linux distributions by default. Everything about the nano text editor is easy. To open a file at the command line with nano, enter nano filename.
If you start nano without a filename or if the file doesn't exist, nano simply opens a new buffer area for editing. If you specify an existing file on the command line, nano reads the entire contents of the file into a buffer area, where it is ready for editing, as shown in Figure 8.2.
Notice at the bottom of the nano editor window, various commands with a brief description are shown. These commands are the nano control commands. The caret (^) symbol shown represents the Ctrl key. Therefore, ^X stands for the keyboard sequence Ctrl+X. Though the nano control commands list capital letters in the keyboard sequences, you can use either lowercase or uppercase characters for control commands.
Having most of the basic commands listed right in front of you is great: there's no need to memorize what control command does what. Table 8.3 presents the most common nano control commands.
TABLE 8.3: nano Common Control Commands
COMMAND | DESCRIPTION |
---|---|
Ctrl+C | Displays the cursor's position within the text editing buffer |
Ctrl+G | Displays nano's main help window |
Ctrl+J | Justifies the current text paragraph |
Ctrl+K | Cuts the text line and stores it in the cut buffer |
Ctrl+O | Writes out the current text editing buffer to a file |
Ctrl+R | Reads a file into the current text editing buffer |
Ctrl+T | Starts the available spell checker |
Ctrl+U | Pastes the text stored in the cut buffer at the current line |
Ctrl+V | Scrolls text editing buffer to the next page |
Ctrl+W | Searches for word or phrases within the text editing buffer |
Ctrl+X | Closes the current text editing buffer, exits nano, and returns to the shell |
Ctrl+Y | Scrolls the text editing buffer to previous page |
The control commands listed in Table 8.3 are really all you need. However, if you desire more powerful control features than those listed, nano has them. To see more control commands, press Ctrl+G in the nano text editor to display its main help window containing additional control commands.
A few of these additional control commands are called Meta-key sequences. In the nano documentation, they are denoted by the letter M. For example, you'll find the key sequence to undo the last task denoted as M-U in the nano help system. But don't press the M key to accomplish this. Instead, M represents the Esc, Alt, or Meta key, depending on your keyboard's configuration. Thus, you might press the Alt+U key combination to undo the last task within nano.
Using text editors at the CLI is one way to work with text files. The rest of this chapter explores additional methods you can use to manipulate text file data.
When you have a large amount of data, it's often difficult to handle the information and make it useful. The Linux system provides several CLI tools to help you manage large amounts of data. This section covers the basic commands that every system administrator—as well as any everyday Linux user—should know how to use to make their lives easier.
Often, to understand the data within text files, you need to reformat file data in some way. The sort utility sorts a file's data. It makes no changes to the original file. It only reads the file, sorts its data, and displays the sorted data to STDOUT (covered in Chapter 6, “Working with the Shell”).
If you want to order a file's content alphabetically, simply enter the sort command followed by the name of the file you want to sort.
$ nano alphabetKey.txt
$ cat alphabetKey.txt
Alpha
Tango
Sierra
Bravo
Foxtrot
Echo
$
$ sort alphabetKey.txt
Alpha
Bravo
Echo
Foxtrot
Sierra
Tango
$
It's pretty simple. However, things aren't always as easy as they appear. Take a look at this example:
$ nano numberKey.txt
$ cat numberKey.txt
1 One
2 Two
4 Four
3 Three
10 Ten
20 Twenty
100 One Hundred
$
$ sort numberKey.txt
1 One
10 Ten
100 One Hundred
2 Two
20 Twenty
3 Three
4 Four
$
If you were expecting the numbers to sort in numerical order, you were disappointed. By default, the sort command interprets numbers as characters, producing a sorted output that you may not want. Add the -n option to the command to sort the lines by their numerical values, if that's what you are seeking.
$ sort -n numberKey.txt
1 One
2 Two
3 Three
4 Four
10 Ten
20 Twenty
100 One Hundred
$
There are several useful sort parameters you can use depending on what kind of sort is needed. Table 8.4 shows commonly used options.
TABLE 8.4: Commonly Used sort Command Options
SINGLE DASH | DOUBLE DASH | DESCRIPTION |
---|---|---|
-b | --ignore-leading-blanks | Ignore leading blanks when sorting. |
-d | --dictionary-order | Consider only blanks and alphanumeric characters; don't consider special characters. |
-f | --ignore-case | By default, sort orders capitalized letters first. This parameter ignores case. |
-g | --general-numeric-sort | Use general numerical value to sort. |
-i | --ignore-nonprinting | Ignore nonprintable characters in the sort. |
-k | --key=POS1[,POS2] | Sort based on position POS1, and end at POS2 if specified. |
-n | --numeric-sort | Sort by string numerical value. |
-o | --output=file | Write results to file specified. |
-r | --reverse | Reverse the sort order (descending instead of ascending). |
-t | --field-separator=SEP | Specify the character used to distinguish key positions. |
-z | --zero-terminated | End all lines with a NULL character instead of a new line. |
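Several of the options in Table 8.4 are usually combined on one command line. The following sketch uses made-up, comma-separated records (name, department, user ID) purely for illustration; the data and variable names are hypothetical.

```shell
# Hypothetical sample records: name,department,uid
data='carol,ops,1003
alice,dev,1001
Bob,dev,1002
dave,ops,999'

# -t sets the field separator, -k 3 picks the third field as the key,
# -n compares it numerically, and -r puts the largest value first
by_uid_desc=$(printf '%s\n' "$data" | sort -t ',' -k 3 -n -r)
printf '%s\n' "$by_uid_desc"

echo "---"

# -k 1,1 limits the key to the first field; -f ignores case so that
# Bob is not sorted ahead of the lowercase names
by_name=$(printf '%s\n' "$data" | sort -t ',' -k 1,1 -f)
printf '%s\n' "$by_name"
```

The first listing starts with the highest user ID, and the second is in alphabetical order by name regardless of capitalization.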
Viewing sorted data is helpful, but what do you do if you want to keep that sorted data? STDOUT redirection (covered in Chapter 6) can help here:
$ nano numberKeySciFi.txt
$ cat numberKeySciFi.txt
1984 101
Wars 1138
Pi 3.14
Trek 1701
Back 88
$
$ sort -n -t ' ' -k 2 numberKeySciFi.txt > sortedSciFi.txt
$ cat sortedSciFi.txt
Pi 3.14
Back 88
1984 101
Wars 1138
Trek 1701
$
Keeping sorted data is especially handy when you've used a complex sort like the previous one. Another useful function when dealing with text data is searching for it. We'll cover that topic next.
You may need to locate a text file within the virtual directory structure, or just search through a file for text. Either way, Linux provides you lots of options to accomplish your task.
A simple utility to use in finding files quickly is the locate program. What makes it fast is that this command searches a database that is pre-filled with filenames and their locations.
To find a file with the locate command, just enter locate followed by the name of the file you want to find. If the file is on your system and you have permission to view it, the locate utility will display the file's directory path and name as demonstrated here on an Ubuntu distribution:
$ locate .bash_history
/home/sysadmin/.bash_history
$
Another nice feature of locate is that it uses a pattern to find files. This allows you to employ partial filenames and regular expressions (covered later in this chapter) and, with the command options, ignore case. Table 8.5 shows a few of the more commonly used locate command options.
TABLE 8.5: The locate Command's Commonly Used Options
SHORT | LONG | DESCRIPTION |
---|---|---|
-A | --all | Display only filenames that match all the patterns, instead of displaying files that match only one pattern in the pattern list. |
-b | --basename | Display only filenames that match the pattern and do not include any directory names that match the pattern. |
-c | --count | Display only the number of files whose name matches the pattern instead of displaying filenames. |
-i | --ignore-case | Ignore case in the pattern for matching filenames. |
-q | --quiet | Do not display any error messages, such as permission denied, when processing. |
-r | --regexp R | Use the regular expression, R, instead of the pattern list to match filenames. |
-w | --wholename | Display filenames that match the pattern and include any directory names that match the pattern. This is default behavior. |
Where you can run into problems with locate is when a file is newly created. Here's an example that uses the touch command to create a file and then tries to find it with the locate utility:
$ touch newFile.txt
$ ls newFile.txt
newFile.txt
$
$ locate newFile.txt
$
When a file is newly created (or downloaded), often it is not yet listed in the locate database. Typically, this database is updated only periodically. Also, when you have a newly installed Linux system, the database may not even yet exist!
To fix both of these issues, you'll need to obtain super user privileges and run the updatedb command. This will update the database, named /var/lib/mlocate/mlocate.db (or some variation), or create it if it does not yet exist.
$ sudo updatedb
[sudo] password for sysadmin:
$
$ locate newFile.txt
/home/sysadmin/newFile.txt
$
That's much better! Now that the database is updated, the newly created file can be found by the locate utility.
The locate command is useful when you want to find files by their name, but it's not useful when you're trying to find a file based on its size or who owns it. This is where the find command can help.
The find command is flexible. It allows you to locate files based on data, such as who owns the file, when the file was last modified, permissions set on the file, and so on. Its command-line format is a little different.
find [path] [option] [expression]
The path argument is a starting point directory: you designate a starting point in a directory tree, and find will search through that directory and all its subdirectories (recursively) for the file or files you seek. You can use a single period (.) to designate your present working directory as the starting point directory.
The expression command argument and its preceding option control what type of filters are applied to the search as well as any settings that may limit the search. Table 8.6 shows the more commonly used option and expression combinations.
TABLE 8.6: The find Command's Commonly Used Options and Expressions
OPTION | EXPRESSION | DESCRIPTION |
---|---|---|
-cmin | n | Display names of files whose status changed n minutes ago. |
-empty | N/A | Display names of files that are empty and are a regular text file or a directory. |
-gid | n | Display names of files whose group ID is equal to n. |
-group | name | Display names of files whose group is name. |
-inum | n | Display names of files whose inode number is equal to n. |
-maxdepth | n | When searching for files, traverse down into the starting point directory's tree only n levels. |
-mmin | n | Display names of files whose data changed n minutes ago. |
-name | pattern | Display names of files whose name matches pattern. Shell-style wildcards (such as * and ?) may be used in the pattern, which needs to be enclosed in quotation marks to avoid unpredictable results. Replace -name with -iname to ignore case. |
-nogroup | N/A | Display names of files where no group name exists for the file's group ID. |
-nouser | N/A | Display names of files where no username exists for the file's user ID. |
-perm | mode | Display names of files whose permissions match mode. Either octal or symbolic modes may be used. |
-size | n | Display names of files whose size matches n. Suffixes can be used to make the size more human readable, such as G for gigabytes. |
-user | name | Display names of files whose owner is name. |
One nice feature of find is that it will display all the files in your present working directory (and any subdirectories) that have no data in them:
$ find . -empty
[…]
./newFile.txt
./.local/share/nano
[…]
$
Unlike locate, the find command can quickly find newly created files without updating a database.
$ touch anotherNewFile.txt
$
$ find /home/sysadmin -name anotherNewFile.txt
/home/sysadmin/anotherNewFile.txt
$
You can search for files whose status was recently changed, such as when data has been added to the file.
$ nano anotherNewFile.txt
$
$ find /home/sysadmin -cmin 1
[…]
/home/sysadmin/anotherNewFile.txt
$
You can run these find command searches from any starting location within the virtual directory system. However, you may want to use super user privileges to get accurate results and avoid an overload of permission denied messages.
$ sudo find / -name mlocate.db
[sudo] password for sysadmin:
/var/lib/mlocate/mlocate.db
$
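To try a few more of the Table 8.6 options without touching real files, you can build a small throwaway directory tree first. This is only a sketch: the directory layout and filenames are invented for the example, and -type f (match regular files only) is a standard find test that does not appear in Table 8.6.

```shell
# Build a scratch directory tree so the searches have known results
work_dir=$(mktemp -d)
mkdir -p "$work_dir/logs/old"
touch "$work_dir/empty.txt"
echo "some data" > "$work_dir/Notes.TXT"
echo "deep" > "$work_dir/logs/old/deep.txt"

# Case-insensitive name match; the quotes keep the shell from
# expanding the wildcard before find sees it
find "$work_dir" -iname '*.txt'

# -maxdepth 1 looks only in the starting directory, and -name is
# case-sensitive, so Notes.TXT is not reported
find "$work_dir" -maxdepth 1 -name '*.txt'

# Empty regular files only (-type f filters out empty directories)
find "$work_dir" -type f -empty
```

When you are finished experimenting, remove the scratch tree with rm -rf "$work_dir".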
Both locate and find are useful for discovering a file's location or performing a basic file analysis. However, neither of these commands can search through and display a file's contents. We'll cover a utility that does offer that feature next.
When you need a utility that lets you search for files that contain certain data, grep is the winner. The command-line format for the grep command is as follows:
grep [options] pattern [file]
In the following example, the text files we've used or created for this chapter are listed in a single-column format via the ls -1 *.txt command. Two of those files contain the word Pi, but which ones? The grep command can easily determine the correct answer.
$ ls -1 *.txt
alphabetKey.txt
anotherNewFile.txt
editorTestFile.txt
editorTestFileNano.txt
newFile.txt
numberKey.txt
numberKeySciFi.txt
sortedSciFi.txt
$
$ cat numberKeySciFi.txt
1984 101
Wars 1138
Pi 3.14
Trek 1701
Back 88
$
$ cat sortedSciFi.txt
Pi 3.14
Back 88
1984 101
Wars 1138
Trek 1701
$
$ grep Pi *.txt
numberKeySciFi.txt:Pi 3.14
sortedSciFi.txt:Pi 3.14
$
In the preceding example, the pattern used with grep was Pi, and in the current directory, all the text files, *.txt, were searched. The grep command lists the search results by displaying each file's name that contains the pattern and then shows the entire text line that has the pattern.
There are some nice options you can use in your grep searches. Table 8.7 shows some of the more commonly used grep utility options.
TABLE 8.7: The grep Command's Commonly Used Options
SHORT | LONG | DESCRIPTION |
---|---|---|
-c | --count | Display a count of text file records that contain a PATTERN match. |
-d action | --directories=action | When a file is a directory, if action is set to read, read the directory as if it were a regular text file; if action is set to skip, ignore the directory; and if action is set to recurse, act as if the -R, -r, or --recursive option was used. |
-E | --extended-regexp | Designate the PATTERN as an extended regular expression. |
-i | --ignore-case | Ignore the case in the PATTERN as well as in any text file records. |
-R, -r | --recursive | Search a directory's contents, and for any subdirectory within the original directory tree, consecutively search its contents as well (recursively). |
-v | --invert-match | Display only text file's records that do not contain a PATTERN match. |
When searching through larger sections of the virtual directory structure for files containing certain data, it's a good idea to employ the -d skip option so that grep doesn't complain at you when it encounters a directory file.
$ grep -d skip sysadmin /etc/*
grep: /etc/at.deny: Permission denied
/etc/group:sysadmin:x:1000:
[…]
/etc/passwd:sysadmin:x:1000:1000:[…]:/home/sysadmin:/bin/bash
[…]
$
Notice that the grep utility found the word sysadmin in two files. This is a real time-saver when you're trying to locate data. However, also notice that a Permission denied message was produced. You'll need to use super user privileges to search through files that require higher permission levels to look through.
You can also use grep to conduct searches on one particular file. Often in a large file, you have to look for a specific line of data buried somewhere in the middle of the file. Instead of manually scrolling through the entire file, you can let the grep command search for you.
$ grep bash /etc/passwd
root:x:0:0:root:/root:/bin/bash
sysadmin:x:1000:1000:[…]:/home/sysadmin:/bin/bash
$
When looking for a particular piece of data whose case you cannot remember, use the -i option to make grep case-insensitive.
$ sudo grep -d skip Ubuntu-Server /etc/*
[sudo] password for sysadmin:
$
$ sudo grep -i -d skip Ubuntu-Server /etc/*
/etc/hostname:ubuntu-server
/etc/hosts:127.0.1.1 ubuntu-server
$
The -d skip and -i options, along with super user privileges, make your grep search results cleaner and provide you with faster results.
You can conduct rather complex searches with grep by using regular expressions. The grep command can even handle extended regular expressions, if you use the -E option.
$ grep -E "(^root|^sysadmin)" /etc/passwd
root:x:0:0:root:/root:/bin/bash
sysadmin:x:1000:1000:[…]:/home/sysadmin:/bin/bash
$
The grep -E command is the more modern version of the egrep utility. The two are functionally the same, but egrep is now deprecated. When a command is deprecated, this means that it may not be available in the future, so you should stop using it as soon as possible and start using its modern equivalent.
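The -c, -v, and -E options from Table 8.7 can be tried safely against a scratch file rather than the real /etc/passwd. The sample records below are invented for this sketch and only imitate the passwd file format.

```shell
# Create a small scratch file that imitates /etc/passwd records
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sysadmin:x:1000:1000::/home/sysadmin:/bin/bash
EOF

# -c prints how many lines match instead of the lines themselves
grep -c bash "$sample"

# -v inverts the match: show accounts that do not use bash
grep -v bash "$sample"

# -E enables extended regular expressions; the ^ anchors the
# alternation to the start of each line
grep -E '^(root|sysadmin):' "$sample"
```

Anchoring the pattern with ^ and a trailing colon keeps it from matching a username that merely contains root or sysadmin somewhere else in the record.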
Linux contains several file compression utilities that allow you to easily compress large files into smaller files that take up less space. While this may sound great, it often leads to confusion and chaos when you're trying to determine which utility to use. The following popular utilities are available on Linux:
gzip
bzip2
xz
The advantages and disadvantages of each of these data compression methods are explored in this section.
gzip: The gzip utility was developed in 1992 as a replacement for the old compress program. Achieving text-based file compression rates of 60–70 percent, gzip has long been a popular data compression utility. To compress a file, simply type gzip followed by the file's name. The original file is replaced by a compressed version with a .gz filename extension. To reverse the operation, type gunzip followed by the compressed file's name.
bzip2: Developed in 1996, the bzip2 utility offers higher compression rates than gzip but takes slightly longer to perform the data compression. There was a bzip program, but it had some patent issues, so bzip2 was created to replace it. The bzip2 utility employs multiple layers of compression techniques and algorithms. Until 2013, this data compression utility was used to compress the Linux kernel for distribution. To compress a file, simply type bzip2 followed by the file's name. The original file is replaced by a compressed version with a .bz2 file extension. To reverse the operation, type bunzip2 followed by the compressed file's name, which decompresses the data.
xz: Developed in 2009, the xz data compression utility quickly became popular among Linux administrators. It boasts a higher default compression rate than bzip2 and gzip. In 2013, the xz compression utility replaced bzip2 for compressing the Linux kernel for distribution. To compress a file, simply type xz followed by the file's name. The original file is replaced by a compressed version with an .xz file extension. To reverse the operation, type unxz followed by the compressed file's name.
It's helpful to see a side-by-side comparison of some of the compression utilities using their defaults. Here is a compression comparison example on an Ubuntu distribution:
$ ls -hs /var/log/syslog
344K /var/log/syslog
$
$ cp /var/log/syslog syslog1
$ cp /var/log/syslog syslog2
$ cp /var/log/syslog syslog3
$
$ gzip syslog1
$ bzip2 syslog2
$ xz syslog3
$
$ ls -hs syslog?.*
72K syslog1.gz 40K syslog2.bz2 32K syslog3.xz
$
In the preceding example, first the /var/log/syslog file size is shown, which is 344 K. (You can use /var/log/lastlog in place of /var/log/syslog for this comparison on a CentOS distribution.) Then the file is copied three times to the local directory using a new filename each time. Next, three compression utilities are used. After the files are compressed with the various utilities, another ls -hs command displays the compressed files' names and their sizes. You can see that the xz program produces the highest compression of this file, because its file, syslog3.xz, is the smallest in size.
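The same comparison can be repeated without making copies by hand. The sketch below is an illustration rather than the chapter's method: it generates its own repetitive, log-like test file (the content is made up), and it relies on the -c option that gzip, bzip2, and xz all accept to write compressed data to STDOUT and leave the source file untouched.

```shell
# Generate a repetitive, log-like text file to compress
src=$(mktemp)
i=0
while [ "$i" -lt 2000 ]; do
    echo "Jan  1 00:00:00 host service[$i]: sample log message"
    i=$((i + 1))
done > "$src"

echo "original: $(wc -c < "$src") bytes"

for tool in gzip bzip2 xz; do
    # Skip any utility that is not installed on this system
    command -v "$tool" > /dev/null || { echo "$tool: not installed"; continue; }
    # -c sends the compressed data to STDOUT; wc -c counts the bytes
    echo "$tool: $("$tool" -c "$src" | wc -c) bytes"
done
```

The exact byte counts depend on the input and on the tool versions, so treat the output as a rough guide rather than a benchmark.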
Compression goes hand in hand with backing up files, because the resulting file containing a backup is often rather large. We'll cover backing up files next.
Backing up files is often called archiving, especially in the Linux world. There are several programs you can employ for managing backups. Some of the more popular products are Amanda, Bacula, Bareos, Duplicity, and BackupPC. Yet, often these GUI and/or web‐based programs have command‐line utilities at their core, which include the following:
cpio
dd
rsync
tar
The tar command was originally used to write files to a tape device for archiving. However, it can also write the output to a file, which has become a popular way to archive data in Linux, and that's the command we'll focus on in this chapter.
The tar command copies the selected files and stores them in a single file. This file is called a tar archive file. If this archive file is compressed using a data compression utility, the compressed archive file is called a tarball.
The tar program has several useful options. Table 8.8 describes the more commonly used ones for creating data backups.
TABLE 8.8: The tar Command's Commonly Used Archive Creation Options
SHORT | LONG | DESCRIPTION |
---|---|---|
-c | --create | Creates a tar archive file. The backup can be a full or incremental backup, depending upon the other selected options. |
-u | --update | Appends files to an existing tar archive file, but only copies those files that were modified since the original archive file was created. |
-g | --listed-incremental | Creates an incremental or full archive based upon metadata stored in the provided file. |
-z | --gzip | Compresses a tar archive file into a tarball using gzip. |
-j | --bzip2 | Compresses a tar archive file into a tarball using bzip2. |
-J | --xz | Compresses a tar archive file into a tarball using xz. |
-v | --verbose | Displays each file's name as each file is processed. |
Notice that there are some compression options in Table 8.8. When you use a compression utility along with an archive and restore program for data backups, it is vital that you use a lossless compression method. A lossless compression is just as it sounds: no data is lost. The gzip, bzip2, and xz utilities provide lossless compression. Obviously, it is important not to lose data when doing backups!
To create an archive using the tar utility, you have to add a few arguments for the options and the command.
$ ls n*.txt
newFile.txt numberKey.txt numberKeySciFi.txt
$
$ tar -cvf archive.tar n*.txt
newFile.txt
numberKey.txt
numberKeySciFi.txt
$
In the preceding example, three options are used:
The -c option creates the tar archive.
The -v option displays the filenames as they are placed into the archive file.
The -f option designates the archive filename, which is archive.tar.
Though not required, it is considered good form to use the .tar extension on tar archive files. The example command's last argument designates the files to copy into this archive.
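Table 8.8 also lists the -u option, which appends a file to an existing archive only when the file is newer than the copy already stored. The sketch below assumes GNU tar and uses made-up names (notes.txt, notes.tar) in a scratch directory. The file is backdated first so that the later edit is clearly newer than the archived copy.

```shell
#!/bin/sh
workdir=$(mktemp -d)                 # scratch directory (illustrative location)
cd "$workdir" || exit 1

echo "first draft" > notes.txt
# Backdate the file so the later edit is unmistakably newer than the archived copy.
touch -t 202001010000 notes.txt

tar -cf notes.tar notes.txt          # initial archive

echo "second draft" >> notes.txt     # editing sets the modification time to now
tar -uvf notes.tar notes.txt         # -u appends because the file is newer

# The listing now shows the member twice: the original copy and the update.
tar -tf notes.tar
copies=$(tar -tf notes.tar | grep -c '^notes\.txt$')
echo "copies of notes.txt in the archive: $copies"
```

Because -u appends rather than replaces, the archive grows with each update. When the archive is extracted, the later copy overwrites the earlier one.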
If you are backing up lots of files or large amounts of data, it is a good idea to employ a compression utility. This is easily accomplished by adding an additional switch to your tar command options. Here gzip compression is used to create a tarball:
$ ls -hs /var/log/syslog.*
132K /var/log/syslog.1 168K /var/log/syslog.2.gz
$
$ tar -zcvf syslog.tar.gz /var/log/syslog.*
tar: Removing leading `/' from member names
/var/log/syslog.1
tar: Removing leading `/' from hard link targets
/var/log/syslog.2.gz
$
$ ls -hs syslog.tar.gz
196K syslog.tar.gz
$
There are a couple of things to note in this example. First, look at the tar: Removing leading messages. The tar utility strips off the first forward slash (/) in filenames so that they can be restored anywhere in the future. If that forward slash were left in place, the files could only be restored to their original location in the virtual directory structure, which is not very flexible.
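The effect of stripping the leading slash is easy to check by comparing the path given on the command line with the name stored in the archive. This is a small sketch that assumes GNU tar and that mktemp returns an absolute path. The data/a.txt and abs.tar names are invented for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)                 # absolute path, typically under /tmp
mkdir -p "$workdir/data"
echo "hello" > "$workdir/data/a.txt"

# Archive the file by its absolute path. tar prints its
# "Removing leading" notice on standard error.
tar -cf "$workdir/abs.tar" "$workdir/data/a.txt"

# List the stored member name; it no longer begins with a forward slash.
member=$(tar -tf "$workdir/abs.tar")
echo "original path: $workdir/data/a.txt"
echo "stored name:   $member"
```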
The next thing to note in the preceding example is that the tarball filename has the .tar.gz file extension. It is considered good form to use the .tar extension and tack on an indicator showing the compression method that was used. However, you can shorten it to .tgz if desired.
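The same naming convention carries over to the other compression options in Table 8.8. The sketch below builds one uncompressed archive and up to three tarballs from the same directory so the sizes can be compared. It assumes GNU tar. The logs directory and its contents are made up, and the .tar.bz2 and .tar.xz extensions follow the common convention rather than anything tar requires. The bzip2 and xz variants are skipped if those utilities are not installed.

```shell
#!/bin/sh
workdir=$(mktemp -d)                        # scratch directory (illustrative location)
cd "$workdir" || exit 1
mkdir logs

# Generate some repetitive text so compression has something to work with.
i=0
while [ "$i" -lt 500 ]; do
    echo "entry $i: service started"
    i=$((i + 1))
done > logs/app.log

tar -cf logs.tar logs                       # plain archive, no compression
tar -zcf logs.tar.gz logs                   # gzip tarball
created="logs.tar logs.tar.gz"

if command -v bzip2 >/dev/null 2>&1; then
    tar -jcf logs.tar.bz2 logs              # bzip2 tarball
    created="$created logs.tar.bz2"
fi
if command -v xz >/dev/null 2>&1; then
    tar -Jcf logs.tar.xz logs               # xz tarball
    created="$created logs.tar.xz"
fi

# $created is left unquoted on purpose so that each filename is a separate argument.
ls -hs $created
```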
Whenever you create data backups, it is a good practice to verify them. Table 8.9 provides some tar command options for viewing and verifying data backups.
TABLE 8.9: The tar Command's Commonly Used Archive Verification Options
SHORT | LONG | DESCRIPTION |
---|---|---|
-d | --compare, --diff | Compares a tar archive file's members with external files and lists the differences. |
-t | --list | Displays a tar archive file's contents. |
-W | --verify | Verifies each file as the file is processed. This option cannot be used with the compression options. |
Backup verification can take several different forms. While the backup is being created, you can use the -v option on the tar command to watch the desired files (sometimes called members) being listed as they are added to the archive file. You can also verify that the desired files are included in your backup after the fact. Use the -t option to list a tarball or archive file's contents, as shown here:
$ tar -tf archive.tar
newFile.txt
numberKey.txt
numberKeySciFi.txt
$
$ tar -tf syslog.tar.gz
var/log/syslog.1
var/log/syslog.2.gz
$
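The other two options in Table 8.9 can be tried in the same way. The following sketch assumes GNU tar and uses invented filenames (one.txt, two.txt, verified.tar). It writes an uncompressed archive with -W, compares it against the files on disk with -d, and then repeats the comparison after one file has been edited.

```shell
#!/bin/sh
workdir=$(mktemp -d)                 # scratch directory (illustrative location)
cd "$workdir" || exit 1
echo "alpha" > one.txt
echo "beta"  > two.txt

# -W re-reads the archive after writing it and checks it against the files on disk.
# No compression option is used, because -W cannot be combined with one.
tar -cWf verified.tar one.txt two.txt && echo "archive written and verified"

# -d compares archive members with the files on disk; it prints nothing when they match.
if tar -df verified.tar; then
    echo "no differences found"
fi

# Change a file, then compare again: tar reports the mismatch and exits non-zero.
echo "gamma" >> two.txt
if tar -df verified.tar; then
    changed=no
else
    changed=yes
fi
echo "differences after editing two.txt: $changed"
```

A non-zero exit status from tar -d makes it straightforward to use the comparison in a script that should stop when a backup no longer matches its source files.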
Table 8.10 lists some of the options that you can use with the tar utility to restore data from a tar archive file or tarball. Several options used to create the backup, such as -g, are also available when restoring data.
TABLE 8.10: The tar Command's Commonly Used File Restore Options
SHORT | LONG | DESCRIPTION |
---|---|---|
-x | --extract, --get | Extracts files from a tarball or archive file and places them in the current working directory. |
-z | --gunzip | Decompresses files in a tarball using gunzip. |
-j | --bunzip2 | Decompresses files in a tarball using bunzip2. |
-J | --unxz | Decompresses files in a tarball using unxz. |
Extracting files from an archive or tarball is fairly simple using the tar utility. Here is an example of extracting files from our previously created tarball:
$ mkdir Extract
$ mv syslog.tar.gz Extract/
$ cd Extract
$
$ tar -zxvf syslog.tar.gz
var/log/syslog.1
var/log/syslog.2.gz
$
$ ls -F
syslog.tar.gz var/
$
$ ls var/log/
syslog.1 syslog.2.gz
$
In the previous example, a new subdirectory, Extract, is created. The tarball is moved to the new subdirectory, and then the files are restored from the tarball. Notice that the files were not put in the top level of the new subdirectory, Extract, but were instead placed in its var/log subdirectory. That's because tar removed the leading forward slash of the original files but kept the rest of the directory reference. This is a rather useful feature of the tar utility.
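Moving the tarball is not the only way to restore files somewhere else. GNU tar also offers the -C (--directory) option, which is not covered in the tables above, to switch to a target directory before extracting. The sketch below assumes GNU tar. The source and restore directories, and the small stand-in syslog.1 file, are made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)                 # scratch directory (illustrative location)
cd "$workdir" || exit 1
mkdir -p source/var/log restore
echo "log entry" > source/var/log/syslog.1

# Build a small tarball whose member carries a relative var/log/ path.
# Here -C makes tar change into source/ before reading the file to archive.
tar -zcf syslog.tar.gz -C source var/log/syslog.1

# On extraction, -C changes to the named directory first, so the tarball stays put.
tar -zxvf syslog.tar.gz -C restore

ls -R restore
```

The restored file ends up under restore/var/log/, again because only the leading slash, not the rest of the path, is dropped from member names.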
Using the tar command is a simple way to create a backup file of various files. You can also create archive files of entire directory structures. This is a common method for distributing source code files for open source applications in the Linux world.
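Archiving a directory works the same way as archiving individual files: name the directory as the last argument. The sketch below also tries the -g option from Table 8.8 to follow a full backup with an incremental one. It assumes GNU tar. The project directory, its files, and the project.snar snapshot filename are all invented for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)                 # scratch directory (illustrative location)
cd "$workdir" || exit 1
mkdir -p project/src project/docs
echo 'int main(void) { return 0; }' > project/src/main.c
echo "Read me first" > project/docs/README

# Full backup: the snapshot file does not exist yet, so everything is archived
# and tar records the directory's state in project.snar.
tar -g project.snar -zcf project-full.tar.gz project

sleep 1   # make sure the next file's timestamp is later than the first backup
echo "Change log" > project/docs/CHANGES

# Incremental backup: the same snapshot file is reused, so only files that are
# new or changed since the previous run are archived.
tar -g project.snar -zcf project-incr.tar.gz project

echo "Full backup:";        tar -tf project-full.tar.gz
echo "Incremental backup:"; tar -tf project-incr.tar.gz
```

The incremental listing still shows the directory entries, but of the regular files only the newly added one appears. Restoring the full state means extracting the full backup first and then each incremental backup in order.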
vim editor's basic features. The vim editor is one of the most popular text editors in use. Though it can be tricky to use, modifying text files using vim is worth the time to learn. Grasping the basics of the vim editor is all that is needed for a system admin.
vim editor. You only want to quickly add a paragraph of comments to the top of the file. What editor commands can you employ to accomplish this task quickly?
nano editor for everyday text file editing. The nano text editor is a simple and quick editor to use in your daily work. You can quickly get into a file, make any needed modifications, save your work, and go on with other tasks. It's a favorite editor of system administrators because of its simplicity.
nano editor with this file in the buffer, what editor commands covered in this chapter can you use to accomplish this task quickly?
grep command is a utility to learn. With its ability to conduct simple or complex searches, locating the information or the files you need is a snap.
/etc directory (but not its subdirectories) that contain the word host. The search must be case‐insensitive, and you don't want to see any error messages concerning directory files. Assuming you need to use the sudo command along with your grep command, what will your command look like to conduct this search?
tar utility has been around for a long time. It provides useful options to create archive files. While tar has the ability to compress files on the fly, you can also use the gzip, bzip2, and xz compression utilities to compress tar archive files as well as other files.
tar archive file, myArchive.tar, but did not compress it with a tar option, because you needed to verify each file as it was processed with the -W option. Now that the archive file was successfully created and verified, what command will you use to compress it to the highest level, and what will the resulting file's name be?