Implementation Issues

There are many differences among the current crop of SSH implementations: features that aren't dictated by the protocols, but are simply inclusions or omissions by the software authors. Here we discuss a few implementation-dependent features of various products:

Host keys
Authorization in hostbased authentication
SSH-1 backward compatibility
Randomness
Privilege separation

3.6.1 Host Keys

SSH host keys are long-term asymmetric keys that distinguish and identify hosts running SSH, or instances of the SSH server, depending on the SSH implementation. Host keys are used in two places in the SSH protocol:

Server authentication, verifying the server host's identity to connecting clients. This process occurs for every SSH connection.^[20]
Authentication of a client host to the server; used only during RhostsRSA or hostbased user authentication.

Unfortunately, the term "host key" is confusing. It implies that only one such key may belong to a given host. This is true for client authentication but not for server authentication, because multiple SSH servers may run on a single machine, each with a different identifying key.^[21] This so-called "host key" actually identifies a running instance of the SSH server program, not a machine.

OpenSSH maintains a single database serving both server authentication and client authentication. It is the union of the system's known_hosts file (/etc/ssh/ssh_known_hosts), together with the user's ~/.ssh/known_hosts file on either the source machine (for server authentication) or the target machine (for client authentication). The database maps a hostname or address to a set of keys acceptable for authenticating a host with that name or address. One name may be associated with multiple keys (more on this shortly).

Tectia, on the other hand, maintains two separate maps for these purposes:

The hostkeys map for authentication of the server host by the client
The knownhosts map for authentication of the client host by the server

Hooray, more confusing terminology. Here, the term "known hosts" is reused with slightly different formatting ("knownhosts" versus "known_hosts") for an overlapping but not identical purpose.

While OpenSSH keeps host keys in a file with multiple entries, Tectia stores them in a filesystem directory, one key per file, indexed by filename. For instance, a knownhosts directory looks like this:

    $ ls -l /etc/ssh2/knownhosts/
    total 2
    -r--r--r--   1 root     root       697 Jun  5 22:22 wynken.sleepy.net.ssh-dss.pub
    -r--r--r--   1 root     root       697 Jul 21  1999 blynken.sleepy.net.ssh-dss.pub

Note that the filename is of the form <hostname>.<key type>.pub.

The other map, hostkeys, is keyed not just on name/address, but also on the server's TCP listening port; that is to say, it is keyed on TCP sockets. This allows for multiple keys per host in a more specific manner than before. Here, the filenames are of the form key_<port number>_<hostname>.pub. The following example shows the public keys for one SSH server running on blynken, port 22, and two running on wynken, ports 22 and 220. Furthermore, we've created a symbolic link to make "nod" another name for the server at wynken:22. End users may add to these maps by placing keys (either manually or automatically by the client) into the directories ~/.ssh2/knownhosts and ~/.ssh2/hostkeys.

    $ ls -l /etc/ssh2/hostkeys/
    total 5
    -rw-r--r--   1 root     root    757 May 31 14:52 key_22_blynken.sleepy.net.pub
    -rw-r--r--   1 root     root    743 May 31 14:52 key_22_wynken.sleepy.net.pub
    -rw-r--r--   1 root     root    755 May 31 14:52 key_220_wynken.sleepy.net.pub
    lrwxrwxrwx   1 root     root     28 May 31 14:57 key_22_nod.pub -> key_22_wynken.sleepy.net.pub

Even though it allows for multiple keys per host, Tectia is missing one useful feature of OpenSSH: multiple keys per name. This sounds like the same thing, but there's a subtle difference: names can refer to more than one host. A common example is a set of load-sharing login servers hidden behind a single hostname. A university might have a set of three machines intended for general login access, each with its own name and address:

login1.foo.edu → 10.0.0.1

login2.foo.edu → 10.0.0.2

login3.foo.edu → 10.0.0.3

In addition, there is a single generic name that carries all three addresses:

login.foo.edu → {10.0.0.1, 10.0.0.2, 10.0.0.3}

The university computing center tells people to connect only to login.foo.edu, and the university's naming service hands out the three addresses in round-robin order (e.g., using round-robin DNS) to share the load among the three machines. SSH has problems with this setup by default. Each time you connect to login.foo.edu, you have a two-thirds chance of reaching a different machine than you reached last time, with a different host key. SSH repeatedly complains that the host key of login.foo.edu has changed and issues a warning about a possible attack against your client. This soon gets annoying. With OpenSSH, you can edit the known_hosts file to associate the generic name with each of the individual host keys, changing this:

    login1.foo.edu 1024 35 1519086808544755383...
    login2.foo.edu 1024 35 1508058310547044394...
    login3.foo.edu 1024 35 1087309429906462914...

to this:

    login1.foo.edu,login.foo.edu 1024 35 1519086808544755383...
    login2.foo.edu,login.foo.edu 1024 35 1508058310547044394...
    login3.foo.edu,login.foo.edu 1024 35 1087309429906462914...

With Tectia, however, there's no general way to do this; since the database is indexed by entries in a directory, with one key per file, it can't have more than one key per name.

It might seem that you're losing some security by doing this, but we don't think so. All that's really happening is the recognition that a particular name may refer to different hosts at different times, and thus you tell SSH to trust a connection to that name if it's authenticated by any of a given set of keys. Most of the time, that set happens to have size 1, and you're telling SSH, "When I connect to this name, I want to make sure I'm connecting to this particular host." With multiple keys per name, you can also say, "When I connect to this name, I want to make sure that I get one of the following set of hosts." That's a perfectly valid and useful thing to do.
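The "set of keys per name" idea can be sketched as follows. This is a simplified model, not OpenSSH's actual parser: it assumes each line is a comma-separated name list followed by the key, and the key values in SAMPLE are made-up stand-ins for real keys.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Made-up stand-ins for real keys, laid out like the known_hosts example
SAMPLE = [
    "login1.foo.edu,login.foo.edu 1024 35 1111",
    "login2.foo.edu,login.foo.edu 1024 35 2222",
    "login3.foo.edu,login.foo.edu 1024 35 3333",
]


def load_known_hosts(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each name to the set of keys acceptable for that name."""
    keys_by_name: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        names, key = line.split(None, 1)  # first field: comma-separated names
        for name in names.split(","):
            keys_by_name[name].add(key)
    return keys_by_name


def key_is_acceptable(db: Dict[str, Set[str]], name: str, offered_key: str) -> bool:
    """Trust the connection if the offered key is in the set for this name."""
    return offered_key in db.get(name, set())


db = load_known_hosts(SAMPLE)
print(len(db["login.foo.edu"]))    # size of the key set for the generic name
print(len(db["login1.foo.edu"]))   # size of the key set for one specific host
```

In this model the generic name accepts any of the three keys, while each specific name still accepts only its own.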

Another way to solve this problem is for the system administrators of login.foo.edu to install the same host key on all three machines. But this defeats the ability of SSH to distinguish between these hosts, even if you want it to. We prefer the former approach.

3.6.2 Authorization in Hostbased Authentication

The most complicated aspect of hostbased authentication is not the method itself, but the implementation details of configuring it, particularly authorization. We'll discuss:

Hostbased access files
Control file details
Netgroups as wildcards

3.6.2.1 Hostbased access files

Two pairs of files on the SSH server machine provide access control for hostbased authentication, in both its weak and strong forms:

/etc/hosts.equiv and ~/.rhosts (weak)
/etc/shosts.equiv and ~/.shosts (strong)

The files in /etc have machine-global scope, while those in the target account's home directory are specific to that account. The hosts.equiv and shosts.equiv files have the same syntax, as do the .rhosts and .shosts files, and by default they are all checked.

Warning

If any of the four access files allows access for a particular connection, it's allowed, even if another of the files forbids it.

The /etc/hosts.equiv and ~/.rhosts files originated with the insecure r-commands. For backward compatibility, SSH can also use these files for making its hostbased authentication decisions. If you're using both the r-commands and SSH, however, you might not want the two systems to have the same configuration. Also, because of their poor security, it's common to disable the r-commands, by turning off the servers in your inetd.conf files and/or removing the software. In that case, you may not want to have any traditional control files lying around, as a defensive measure in case an attacker managed to get one of these services turned on again.

To separate itself from the r-commands, SSH reads two additional files, /etc/shosts.equiv and ~/.shosts, which have the same syntax and meaning as /etc/hosts.equiv and ~/.rhosts, but are specific to SSH. If you use only the SSH-specific files, you can have SSH hostbased authentication without leaving any files the r-commands would look at.^[22]

All four files have the same syntax, and SSH interprets them very similarly—but not identically—to the way the r-commands do. Read the following sections carefully to make sure you understand this behavior.

3.6.2.2 Control file details

Here is the common format of all four hostbased control files. Each entry is a single line, containing either one or two tokens separated by tabs and/or spaces. Comments begin with #, continue to the end of the line, and may be placed anywhere; empty and comment-only lines are allowed.

    # example control file entry
    [+-][@]hostspec  [+-][@]userspec  # comment

The two tokens indicate host(s) and user(s), respectively; the userspec may be omitted. If the at sign (@) is present, then the token is interpreted as a netgroup (see the sidebar "Netgroups") and is looked up using the innetgr() library call, and the resulting list of usernames or hostnames is substituted. Otherwise, the token is interpreted as a single host or username. Hostnames must be canonical as reported by gethostbyaddr() on the server host; other names won't work.

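A netgroup is a set of (host, user, domain) triples. Two triples (x, y, z) and (a, b, c) match if and only if all three of the following conditions hold:

    x = a, or x is null, or a is null
    y = b, or y is null, or b is null
    z = c, or z is null, or c is null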

This means that a null field in a triple acts as a wildcard. By "null," we mean missing; that is, in the triple (, user, domain), the host part is null. This isn't the same as the empty string: ("", user, domain). In this triple, the host part isn't null. It is the empty string, and the triple can match only another whose host part is also the empty string.

When SSH matches a username U against a netgroup, it matches the triple (, U,); similarly, when matching a hostname H, it matches (H,,). You might expect it to use (, U, D) and (H,, D) where D is the host's domain, but it doesn't.
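The matching rule for triples can be sketched in Python. This is a model of the rule as described here, not the innetgr() implementation; None stands for a null (missing) field, and the function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# (host, user, domain); None means the field is null, i.e. missing
Triple = Tuple[Optional[str], Optional[str], Optional[str]]


def fields_match(p: Optional[str], q: Optional[str]) -> bool:
    """Two fields match if they are equal or if either one is null."""
    return p is None or q is None or p == q


def triples_match(t1: Triple, t2: Triple) -> bool:
    """Two triples match when all three field pairs match."""
    return all(fields_match(p, q) for p, q in zip(t1, t2))


def user_in_netgroup(user: str, netgroup: Iterable[Triple]) -> bool:
    """A username U is looked up as the triple (, U,)."""
    return any(triples_match((None, user, None), t) for t in netgroup)


def host_in_netgroup(host: str, netgroup: Iterable[Triple]) -> bool:
    """A hostname H is looked up as the triple (H,,)."""
    return any(triples_match((host, None, None), t) for t in netgroup)


# A null host is a wildcard; an empty-string host is not
print(triples_match((None, "user", "domain"), ("h", "user", "domain")))
print(triples_match(("", "user", "domain"), ("h", "user", "domain")))
```

The two print calls contrast the null host field with the empty-string host field discussed above: only the null one behaves as a wildcard.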

If either or both tokens are preceded by a minus sign (-), the whole entry is considered negated. It doesn't matter which token has the minus sign; the effect is the same. Let's see some examples before explaining the rules.
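The entry format, including the negation rule, can be sketched as a small parser. The names Spec, Entry, parse_token, and parse_entry are hypothetical, and the sketch makes one assumption beyond the format line: an entry containing a bare + or - is dropped, since SSH ignores those wildcards (see "Netgroups as wildcards").

```python
from typing import NamedTuple, Optional, Tuple


class Spec(NamedTuple):
    name: str
    is_netgroup: bool  # True when the token was written as @name


class Entry(NamedTuple):
    negated: bool              # True if either token carried a minus sign
    hostspec: Spec
    userspec: Optional[Spec]   # None when the userspec is omitted


def parse_token(token: str) -> Tuple[bool, Spec]:
    """Split one [+-][@]name token into (negated, Spec)."""
    negated = token[0] == "-"
    if token[0] in "+-":
        token = token[1:]
    is_netgroup = token.startswith("@")
    if is_netgroup:
        token = token[1:]
    return negated, Spec(token, is_netgroup)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one control-file line; return None if it holds no usable entry."""
    tokens = line.split("#", 1)[0].split()  # drop the comment, split on blanks
    if not tokens:
        return None
    if len(tokens) > 2:
        raise ValueError("expected one or two tokens: " + line.strip())
    parsed = [parse_token(t) for t in tokens]
    if any(spec.name == "" for _, spec in parsed):
        return None  # bare + or - wildcard: ignored in this sketch
    negated = any(neg for neg, _ in parsed)
    userspec = parsed[1][1] if len(parsed) == 2 else None
    return Entry(negated, parsed[0][1], userspec)


print(parse_entry("sister.host.org -mark   # mark may not log in"))
```

Note that the minus sign on either token sets the same single negated flag, which is why it doesn't matter which token carries it.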

The following hostspec allows anyone from fred.flintstone.gov to log in if the remote and local usernames are the same:

    # /etc/shosts.equiv
    fred.flintstone.gov

The following hostspecs allow anyone from any host in the netgroup "hostbasedusers" to log in, if the remote and local usernames are the same, but not from evil.empire.org, even if it is in the hostbasedusers netgroup:

    # /etc/shosts.equiv
    -evil.empire.org
    @hostbasedusers

This next entry (hostspec and userspec) allows mark@way.too.trusted to log into any local account! Even if a user has -way.too.trusted mark in ~/.shosts, it won't prevent access since the global file is consulted first. You probably never want to do this.

    # /etc/shosts.equiv
    way.too.trusted mark              # Don't do this!!

On the other hand, the following entries allow anyone from sister.host.org to connect under the same account name, except mark, who can't access any local account:

    # /etc/shosts.equiv
    sister.host.org -mark
    sister.host.org

Remember, however, that a target account can override this restriction by placing sister.host.org mark in ~/.shosts. Note also, as shown earlier, that the negated line must come first; in the other order, it's ineffective.

This next hostspec allows user wilma on fred.flintstone.gov to log into the local wilma account:

    # ~wilma/.shosts
    fred.flintstone.gov

This entry allows user fred on fred.flintstone.gov to log into the local wilma account, but no one else—not even wilma@fred.flintstone.gov:

    # ~wilma/.shosts
    fred.flintstone.gov fred

These entries allow both fred and wilma on fred.flintstone.gov to log into the local wilma account:

    # ~wilma/.shosts
    fred.flintstone.gov fred
    fred.flintstone.gov

Now that we've covered some examples, let's discuss the precise rules. Suppose the client username is C, and the target account of the SSH command is T. Then:

1. A hostspec entry with no userspec permits access from all hostspec hosts when T = C.
2. In a per-account file (~/.rhosts or ~/.shosts), a hostspec userspec entry permits access to the containing account from hostspec hosts when C is any one of the userspec usernames.
3. In a global file (/etc/hosts.equiv or /etc/shosts.equiv), a hostspec userspec entry permits access to any local target account from any hostspec host, when C is any one of the userspec usernames.
4. For negated entries, replace "permits" with "denies" in the preceding rules.

Note Rule #3 carefully. You never, ever want to open your machine to such a security hole. The only reasonable use for such a rule is if it is negated, thus disallowing access to any local account for a particular remote account. The sister.host.org -mark entry shown earlier is one such use.

The files are checked in the following order (a missing file is simply skipped, with no effect on the authorization decision):

/etc/hosts.equiv
/etc/shosts.equiv
~/.shosts
~/.rhosts

SSH makes a special exception when the target user is root: it doesn't check the global files. Access to the root account can be granted only via the root account's /.rhosts and /.shosts files. If you block the use of those files with the IgnoreRootRhosts server directive, this effectively prevents access to the root account via hostbased authentication.

When checking these files, there are two rules to keep in mind. The first rule is: the first accepting line wins. That is, if you have two netgroups:

    set     (one,,) (two,,) (three,,)
    subset  (one,,) (two,,)

the following /etc/shosts.equiv file permits access only from host three:

    -@subset
    @set

But this next one allows access from all three:

    @set
    -@subset

The second line has no effect, because all its hosts have already been accepted by a previous line.

The second rule is: if any file accepts the connection, it's allowed. That is, if /etc/shosts.equiv forbids a connection but the target user's ~/.shosts file accepts it, then it is accepted. Therefore, the sysadmin cannot rely on the global file to block connections. Similarly, if your per-account file forbids a connection, it can be overridden by a global file that accepts it. Keep these facts carefully in mind when using hostbased authentication.^[23]
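Both rules, plus the root exception, can be sketched together. This is a simplified model rather than the server's code: hostspecs and userspecs are assumed to be already expanded into plain sets of names, and, as the set/subset example implies, the first line that matches (negated or not) is taken to decide the outcome for that file.

```python
from typing import List, Optional, Set, Tuple

# (negated, hosts, users); users is None when the entry has no userspec
Entry = Tuple[bool, Set[str], Optional[Set[str]]]


def check_file(entries: List[Entry], host: str,
               client_user: str, target_user: str) -> Optional[bool]:
    """Return True (accept), False (deny), or None (no line matched)."""
    for negated, hosts, users in entries:
        if host not in hosts:
            continue
        if users is None:
            user_ok = client_user == target_user  # no userspec: names must agree
        else:
            user_ok = client_user in users
        if user_ok:
            return not negated  # the first matching line decides for this file
    return None


def authorize(global_files: List[List[Entry]], account_files: List[List[Entry]],
              host: str, client_user: str, target_user: str) -> bool:
    """Allow the connection if any consulted file accepts it."""
    files = list(account_files)
    if target_user != "root":  # the global files are not checked for root
        files = list(global_files) + files
    return any(check_file(f, host, client_user, target_user) for f in files)


SET = {"one", "two", "three"}
SUBSET = {"one", "two"}
deny_first = [(True, SUBSET, None), (False, SET, None)]   # -@subset, then @set
allow_first = [(False, SET, None), (True, SUBSET, None)]  # @set, then -@subset

print(authorize([deny_first], [], "one", "pat", "pat"))
print(authorize([deny_first], [], "three", "pat", "pat"))
print(authorize([allow_first], [], "one", "pat", "pat"))
```

With deny_first only host three gets in; with allow_first all three hosts do, because the negated line is never reached. Passing an accepting per-account file alongside deny_first shows the second rule: the account file's acceptance overrides the global denial.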

3.6.2.3 Netgroups as wildcards

You may have noticed the rule syntax has no wildcards; this omission is deliberate. The r-commands recognize bare + and - characters as positive and negative wildcards, respectively, and a number of attacks are based on surreptitiously adding a "+" to someone's .rhosts file, immediately allowing anyone to rlogin as that user. So, SSH deliberately ignores these wildcards. You'll see messages to that effect in the server's debugging output if it encounters such a wildcard:

    Remote: Ignoring wild host/user names in /etc/shosts.equiv

However, there's still a way to get the effect of a wildcard: using the wildcards available in netgroups. An empty netgroup:

    empty  # nothing here

matches nothing at all. However, this netgroup:

    wild  (,,)

matches everything. In fact, a netgroup containing (,,) anywhere matches everything, regardless of what else is in the netgroup. So, this entry:

    # ~/.shosts
    @wild

allows access from any host at all,^[24] as long as the remote and local usernames match. This one:

    # ~/.shosts
    way.too.trusted @wild

allows any user on way.too.trusted to log into this account, while this entry:

    # ~/.shosts
    @wild @wild

allows any user access from anywhere.

Given this wildcard behavior, it's important to pay careful attention to netgroup definitions. It's easier to create a wildcard netgroup than you might think. Including the null triple (,,) is the obvious approach. However, remember that the order of elements in a netgroup triple is (host, user, domain). Suppose you define a group "oops" like this:

    oops        (fred,,) (wilma,,) (barney,,)

You intend for this to be a group of usernames, but you've placed the usernames in the host slots, and the username fields are left null. If you use this group as the userspec of a rule, it acts as a wildcard. Thus, this entry:

    # ~/.shosts
    home.flintstones.gov @oops

allows anyone on home.flintstones.gov, not just your three friends, to log into your account. Beware!
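The "oops" mistake can be reproduced with a small model of netgroup matching. This is a sketch of the rule described in this section, not the innetgr() library call; None stands for a null field, and the FIXED group is a hypothetical corrected definition added for contrast.

```python
from typing import List, Optional, Tuple

# (host, user, domain); None means the field is null
Triple = Tuple[Optional[str], Optional[str], Optional[str]]


def field_matches(p: Optional[str], q: Optional[str]) -> bool:
    """Fields match if equal or if either is null."""
    return p is None or q is None or p == q


def user_matches(user: str, netgroup: List[Triple]) -> bool:
    """Look up a username as the triple (, user,) against each group member."""
    probe: Triple = (None, user, None)
    return any(all(field_matches(p, q) for p, q in zip(probe, member))
               for member in netgroup)


# Usernames mistakenly placed in the host slot: the user field is null
OOPS: List[Triple] = [("fred", None, None), ("wilma", None, None), ("barney", None, None)]
# Hypothetical corrected group: usernames in the user slot
FIXED: List[Triple] = [(None, "fred", None), (None, "wilma", None), (None, "barney", None)]

print(user_matches("stranger", OOPS))
print(user_matches("stranger", FIXED))
```

Because every member of OOPS has a null user field, any username at all matches it when the group is used as a userspec; the corrected group matches only the three intended names.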

3.6.2.4 Summary

Hostbased authentication is convenient for users and administrators, because it can set up automatic authentication between hosts based on username correspondence and interhost trust relationships. This removes the burden of typing passwords or dealing with key management. However, it is heavily dependent on the correct administration and security of the hosts involved; compromising one trusted host can give an attacker automatic access to all accounts on other hosts. Also, the rules for the access control files are complicated, fragile, and easy to get wrong in ways that compromise security. In an environment more concerned with eavesdropping and disclosure than active attacks, it may be acceptable to deploy hostbased authentication for general user authentication. In a more security-conscious scenario, however, it is probably inappropriate, though it may be acceptable for limited use in special-purpose accounts, such as for unattended batch jobs. [11.1.3]

3.6.3 SSH-1 Backward Compatibility

The Tectia server can provide backward compatibility for the SSH-1 protocol, as long as another package supporting SSH-1 (such as OpenSSH) is also installed on the same machine. When the Tectia server encounters a client requesting an SSH-1 connection, it simply runs the SSH-1 server.^[25] This is rather cumbersome. It's also wasteful and slow, since each new sshd1 needs to generate its own server key, which otherwise the single master server regenerates only once an hour. This wastes random bits, sometimes a precious commodity, and can cause noticeable delays in the startup of SSH-1 connections to a Tectia server. Further, it is an administrative headache and a security problem, since one must maintain two separate SSH server configurations and try to make sure all desired restrictions are adequately covered in both.

OpenSSH, on the other hand, supports both SSH-1 and SSH-2 in a single set of programs, an approach we prefer.

3.6.4 Randomness

Cryptographic algorithms and protocols require a good source of random bits. Randomness is used in various ways:

To generate data-encryption keys
As plaintext padding and initialization vectors in encryption algorithms, to help foil cryptanalysis
For check-bytes or cookies in protocol exchanges, as a measure against packet-spoofing attacks

Randomness is harder to achieve than you might think; in fact, even defining randomness is difficult (or picking the right definition for a given situation). For example, "random" numbers that are perfectly good for statistical modeling might be terrible for cryptography. Each of these applications requires certain properties of its random input, such as an even distribution. Cryptography, in particular, demands unpredictability, so an attacker reading our data can't guess our keys.

True randomness—in the sense of complete unpredictability—can't be produced by a computer program. Any sequence of bits produced as the output of a program eventually repeats itself. For true randomness, you have to turn to physical processes, such as fluid turbulence or the quantum dice of radioactive decay. Even there, you must take great care that measurement artifacts don't introduce unwanted structure.

There are algorithms, however, that produce long sequences of practically unpredictable output, with good statistical randomness properties. These are good enough for many cryptographic applications, and such algorithms are called pseudo-random number generators, or PRNGs. A PRNG requires a small random input, called the seed, so it doesn't always produce the same output. From the seed, the PRNG produces a much larger string of acceptably random output; essentially, it is a randomness "stretcher." So, a program using a PRNG still needs to find some good random bits, just fewer of them, but they had better be quite unpredictable.
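The "stretcher" idea can be illustrated with a toy generator that hashes the seed together with a counter. This is only a sketch of the concept: it is not the generator any SSH implementation uses, the function name is hypothetical, and real programs should draw from a vetted source such as os.urandom or the secrets module.

```python
import hashlib


def stretch(seed: bytes, nbytes: int) -> bytes:
    """Expand a short seed into nbytes of output (toy example only)."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        # Each block is a hash of the seed plus a running counter
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:nbytes])


a = stretch(b"short seed", 1000)
b = stretch(b"short seed", 1000)
c = stretch(b"other seed", 1000)
print(len(a), a == b, a == c)
```

The same seed always yields the same output, which is exactly why the seed itself must be unpredictable: anyone who learns it can regenerate every byte.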

Since various programs require random bits, some operating systems have built-in facilities for providing them. Some Unix variants (including Linux and OpenBSD) have a device driver, accessed through /dev/random and /dev/urandom, that provides random bits when opened and read as a file. These bits are derived by all sorts of methods, some quite clever. Correctly filtered timing measurements of disk accesses, for example, can represent the fluctuations due to air turbulence around the drive heads. Another technique is to look at the least significant bits of noise coming from an unused microphone port. And of course, they can track fluctuating events such as network packet arrival times, keyboard events, interrupts, etc.

SSH implementations make use of randomness, but the process is largely invisible to the end user. Here's what happens under the hood. OpenSSH and Tectia, for example, use a kernel-based randomness source if it is available, along with their own sampling of (one hopes) fluctuating system parameters, gleaned by running such programs as ps or netstat. They use these sources to seed their PRNGs, as well as to "stir in" more randomness every once in a while. Since it can be expensive to gather randomness, SSH stores its pool of random bits in a file between invocations of the program, as shown in the following table:

          OpenSSH                     Tectia
Server    /etc/ssh/ssh_random_seed    /etc/ssh2/random_seed
Client    ~/.ssh/random_seed          ~/.ssh2/random_seed

These files should be kept protected, since they contain sensitive information that can weaken SSH's security if disclosed to an attacker, although SSH takes steps to reduce that possibility. The seed information is always mixed with some new random bits before being used, and only half the pool is ever saved to disk, to reduce its predictive value if stolen.
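The two precautions just described (mixing the saved seed with fresh bits, and saving only half the pool) can be sketched as follows. This is an illustration of the idea under stated assumptions, not the code of OpenSSH or Tectia: the pool size, the use of SHA-512 for mixing, and the function names are all hypothetical choices.

```python
import hashlib
import os

POOL_SIZE = 64  # bytes; an arbitrary size chosen for this sketch


def load_pool(seed_path: str) -> bytes:
    """Build the pool from the saved seed mixed with fresh random bits."""
    try:
        with open(seed_path, "rb") as f:
            saved = f.read()
    except FileNotFoundError:
        saved = b""  # first run: no seed file yet
    fresh = os.urandom(POOL_SIZE)
    # The saved seed is never used alone; it is always mixed with new bits
    return hashlib.sha512(saved + fresh).digest()


def save_seed(seed_path: str, pool: bytes) -> None:
    """Write only half of the pool, in a file readable by the owner alone."""
    fd = os.open(seed_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(pool[: len(pool) // 2])
```

A caller would invoke load_pool at startup and save_seed at exit. Since the file holds only half the pool and is remixed on the next load, a stolen copy reveals less about the bits actually in use.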

In OpenSSH and Tectia, all this happens automatically and invisibly. OpenSSH links against the OpenSSL library and uses its randomness source, a kernel source if available. When building OpenSSH on a platform without /dev/random, you have a choice. If you have installed an add-on randomness source, such as the Entropy Gathering Daemon (EGD, http://www.lothar.com/tech/crypto/), you can compile OpenSSH to use it with the --with-egd-pool compile-time configuration option. Or you can use the OpenSSH entropy-gathering mechanism. You can tailor which programs are run to gather entropy and "how random" they're considered to be, by editing the file /etc/ssh/ssh_prng_cmds. Also, note that the OpenSSH random seed is kept in the ~/.ssh/prng_seed file, even the daemon's, which is just the root user's seed file. Earlier versions of OpenSSH use this method internally and automatically if there is no /dev/random and no pool specified. OpenSSH 3.8 and later have the random generator factored into a separate program, ssh-rand-helper, selected with the --with-rand-helper compile-time configuration option.

3.6.5 Privilege Separation in OpenSSH

A persistent problem in the world of Unix security is the lack of fine-grained permissions when it comes to process capabilities. Basically, either you're God (that is, "root") or you're not. The "Church" of Unix is missing the hosts of angels, archangels, cherubim, etc., that fill other pantheons and smooth the relationship between mere mortals and the divine, embodied for us in the mystical uid 0. This means that in order to accomplish some common tasks, such as listening on port 22 or creating processes under other uid's, the SSH server must also take on all the other powers of the root account. This flies in the face of a basic rule of security engineering: the Principle of Least Privilege, which says that a process should have only the privileges it needs, only when it needs them, and no more. If a serious vulnerability is found in the code of a server running as root, you can kiss your system goodbye, because when an attacker gets in, he has complete control.

In order to address this general problem, OpenSSH has a feature called privilege separation. The developers have factored out those server functions which require root privilege, and placed them in a separate process. The main server does not run as root; it gives up that privilege as soon as possible after startup, leaving a separate privileged "monitor" process with which it can communicate. The monitor opens the server listening socket which the main server inherits, but then closes its copy so that it does not communicate directly with clients (i.e., potential attackers). It communicates only by a private pipe to the main server and obeys a strict protocol, performing only those privileged operations necessary from time to time for the operation of the main server, and nothing else. This design mitigates the problem by restoring the Principle of Least Privilege, at least as much as is possible given the limitations of Unix.
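The monitor's "strict protocol" can be illustrated with a toy model: two processes joined by a private socket pair, where one side serves a fixed list of requests and refuses everything else. This is not OpenSSH's implementation. The request names are hypothetical, no privileges are actually held or dropped, and the monitor is the forked child here purely for convenience.

```python
import os
import socket
from typing import List

ALLOWED = {"check_password", "open_pty"}  # hypothetical request names


def handle_request(request: str) -> str:
    """Monitor side: serve only known requests and refuse the rest."""
    op = request.split(" ", 1)[0]
    if op not in ALLOWED:
        return "ERR refused"
    return "OK " + op  # a real monitor would do the privileged work here


def monitor_loop(chan: socket.socket) -> None:
    """Answer one request per line until the other end closes the channel."""
    with chan, chan.makefile("rw") as f:
        for line in f:
            f.write(handle_request(line.strip()) + "\n")
            f.flush()


def run_demo() -> List[str]:
    mon_end, srv_end = socket.socketpair()  # the private channel
    pid = os.fork()
    if pid == 0:  # child: plays the monitor in this sketch
        srv_end.close()
        monitor_loop(mon_end)
        os._exit(0)
    mon_end.close()
    # A real unprivileged server would give up root here, before
    # it handles any data from a client.
    replies = []
    with srv_end, srv_end.makefile("rw") as f:
        for req in ("check_password alice", "read_shadow_file"):
            f.write(req + "\n")
            f.flush()
            replies.append(f.readline().strip())
    os.waitpid(pid, 0)
    return replies


if __name__ == "__main__":
    print(run_demo())
```

The point of the design is visible in handle_request: even if the client-facing process is compromised, it can ask the monitor only for the operations on the list.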

Privilege separation is a complicated feature to implement, however, due to many small differences among Unix platforms with regard to the exact behavior of relevant system calls such as setuid, seteuid, setgid, etc., as well as difficulties with related software such as PAM. The early implementations of privilege separation in OpenSSH were notorious for causing mysterious errors in the operation of the server. Things have improved a great deal, but if you run into odd problems you can't explain—especially having to do with a privilege or access violation on the part of the server—you could do worse than to disable privilege separation and see what happens.

For more information on privilege separation, see:

http://www.citi.umich.edu/u/provos/ssh/privsep.html
"Preventing Privilege Escalation," Niels Provos, Markus Friedl, and Peter Honeyman, 12th USENIX Security Symposium, Washington, D.C., August 2003, http://www.citi.umich.edu/u/provos/papers/privsep.pdf.

^[20]In SSH-1, the host key also encrypts the session key for transmission to the server. However, this use is actually for server authentication, rather than for data protection per se; the server later proves its identity by showing that it correctly decrypted the session key. Protection of the session key is obtained by encrypting it a second time with the ephemeral server key.

^[21]Or sharing the same key, if you wish, assuming the servers are compatible with one another.

^[22]Unfortunately, you can't configure the server to look at one set but not the other. If it looks at ~/.shosts, then it also considers ~/.rhosts, and both global files are always considered.

^[23]By setting the server's IgnoreRhosts keyword to yes, you can cause the server to ignore the per-account files completely and consult the global files exclusively instead. [5.4.4]

^[24]If strong hostbased authentication is in use, this means any host verified by public key against the server's known hosts database.

^[25]Or it can use an internal SSH-1 compatibility mode.