Inside SSH-2

The SSH protocol has two major, incompatible versions, called Version 1[14] and Version 2. [1.5] We refer to these as SSH-1 and SSH-2. The SSH-1 protocol is now a relic; it is less flexible than SSH-2, has unfixable security weaknesses, and has been deprecated for years. Its implementations see no real development aside from bug fixes, and the default protocol for most SSH software has been SSH-2 for some time now. In this chapter, as we describe "the SSH protocol," we are talking about SSH-2. We limit our treatment of SSH-1 to a summary of its design, its differences with SSH-2, and its weaknesses.

The SSH protocol is actually divided into four major pieces, formally described as four separate protocols in different IETF documents, and in principle independent of one another. In practice, they are layered together to provide the set of services most users associate with SSH as a whole. These are:

There are other documents that describe other aspects of, or extensions to, the protocols, but the preceding ones represent the core of SSH. As of this writing, these documents are still "Internet-Drafts," but after much effort by the IETF SECSH working group, they have been submitted to the IESG for consideration as proposed standards and may soon be published as Internet RFCs.

Figure 3-2 outlines the division of labor between these protocols, and how they relate to each other, application programs, and the network. Elements in italics are protocol extensions defined in separate Internet-Draft documents, which have attained fairly widespread use.

SSH-2 protocol family

Figure 3-2. SSH-2 protocol family

SSH is designed to be modular and extensible. All of the core protocols define abstract services they provide and requirements they must meet, but allow multiple mechanisms for doing so, as well as a way of easily adding new mechanisms. All the critical parameters of an SSH connection are negotiable, including the methods and algorithms used in:

Client and server negotiate the use of a common set of methods, allowing broad interoperability among different implementations. In most categories, the protocol defines at least one required method, to further promote interoperability. Note that this only means a conforming implementation is required to support the method in its code; any particular method may in fact be turned off by the administrator in a particular environment. So, the fact that public-key authentication is required by SSH-AUTH doesn't mean it's always available to clients from any particular running SSH server; it merely means it must be available and could be turned on, if need be.

SSH-TRANS is the fundamental building block, providing the initial connection, record protocol, server authentication, and basic encryption and integrity services. After establishing an SSH-TRANS connection, the client has a single, secure, full-duplex byte stream to an authenticated peer.

Next, the client can use SSH-AUTH over the SSH-TRANS connection to authenticate itself to the server. SSH-AUTH defines a framework within which multiple authentication mechanisms may be used, fixing such things as the format and order of authentication requests, conditions for success or failure, and how a client learns the available methods. There may be any number of actual methods implemented, and the protocol allows arbitrary exchanges as part of any particular mechanism so that protocol extensions are easily defined to incorporate any desired authentication method in the future. SSH-AUTH requires only one method: public key with the DSS algorithm. It further defines two more methods: password and hostbased. A number of other methods have been defined in various Internet-Drafts, and some of them have gained wide acceptance.

After authentication, SSH clients invoke the SSH-CONN protocol, which provides a variety of richer services over the single pipe provided by SSH-TRANS. This includes everything needed to support multiple interactive and noninteractive sessions: multiplexing several streams (or channels ) over the underlying connection; managing X, TCP, and agent forwarding; propagating signals across the connection (such as SIGINT, when a user types ^C to interrupt a process); terminal handling; data compression; and remote program execution.

Finally, an application may use SSH-SFTP over an SSH-CONN channel to provide file-transfer and remote filesystem manipulation functions.

It's important to understand that the arrangement, layering, and sequencing of these protocols is a matter of convention or need, not design; although they are typically used in a particular order, other arrangements are possible. For instance, note that SSH-CONN is not layered on top of SSH-AUTH; they are both at the same level above SSH-TRANS. Typically, an SSH server requires authentication via SSH-AUTH before allowing the client to invoke SSH-CONN—and also typically, clients want to use SSH-CONN in order to obtain the usual SSH services (remote terminal, agent forwarding, etc.). However, this need not be the case. A specialized SSH server for a particular, limited purpose might not require authentication, and hence could allow a client to invoke an application service (SSH-CONN, or perhaps some other locally defined service) immediately after establishing an SSH-TRANS connection. An anonymous SFTP server might be implemented this way, for example. However, such nonstandard protocol arrangements are probably seen only in a closed environment with custom client/server software. Since most SFTP clients in the world expect to do SSH-AUTH, they probably won't interoperate with such a server. An anonymous SFTP server for general use would use SSH-AUTH in the usual fashion and simply report immediate success for any attempted client authentication.

That said, these protocols were conceived as a group and rely on each other in practice. For instance, SSH-SFTP on its own provides no security whatsoever; it is merely a language for conducting remote-filing operations. It's assumed to be run over a secure transport if security is needed, such as an SSH session. However, using the sftp -S option of OpenSSH and Tectia, for example, you could connect the sftp client to an sftp-server running on another host using some other method: over a serial line, or some other secure network protocol...or rsh if you want to be perverse. Similarly, SSH-AUTH mechanisms rely on a secure underlying transport to varying degrees. The most obvious is the "password" mechanism, which simply sends the password in plaintext over the transport as part of an authentication request. Obviously, that mechanism would be disastrous over an insecure transport.

Another important point is that the SSH protocol deals only with communication "on-the-wire"--that is, its formats and conventions apply only to data being exchanged dynamically between the SSH client and server. It says nothing at all, for instance, about:

...and many other things that people typically think of as part of SSH. These facets are implementation-dependent: they are not specified by the standard, and hence may be done differently depending on what software you're using. And in fact they do differ: OpenSSH and Tectia use different file formats for keys. Even if you convert one to the other, you'll find that OpenSSH keys belong in ~/.ssh/authorized_keys, whereas each Tectia key goes in its own file, listed by reference in yet another file, ~/.ssh2/authorization. And although both products sport a private-key agent—with the same name even, ssh-agent--they are incompatible.

Now that we have an overview of the major components of SSH, let's dive in and examine each of these protocols in detail. To give structure and concreteness to an otherwise abstract description of the protocols, we frame our discussion by following a particular SSH connection from beginning to end. We follow the thread of debugging messages produced by ssh -vv, explaining the significance of the various messages and turning aside now and then to describe the protocol phases occurring at that point.

Since this -vv level of verbosity produces quite a few messages not relevant to our protocol discussion, we omit some for the sake of clarity.

As soon as the server accepts the connection, the SSH protocol begins. The server announces its protocol version using a text string:

    debug1: Remote protocol version 2.0, remote software version 4.1.0.34 SSH Secure Shell

You can see this string yourself by simply connecting to the server socket, e.g., with telnet:

    $ telnet host.foo.net 22
    Trying 10.1.1.1...
    Connected to host.foo.net
    Escape character is '^]'.
    SSH-2.0-4.1.0.34 SSH Secure Shell
    ^]
    telnet> quit
    Connection closed.

The format of the announcement is:

SSH-<protocol version>-<comment>

In this case, the server implements the SSH-2 protocol, and the software version is 4.1.0.34 of SSH Secure Shell from SSH Communications Security. Although the comment field can contain anything at all, SSH servers commonly put their product name and version there. This is useful, as clients often recognize specific servers in order to work around known bugs or incompatibilities. Some people don't like this practice on security grounds, and try to remove or change the comment. Be aware that if you do, you may cause more trouble than it's worth, since previously working SSH sessions may suddenly start failing if they had relied on such workarounds.

The protocol version number "1.99" has special significance: it means the server supports both SSH-1 and SSH-2, and the client may choose either one.

Next, OpenSSH parses the comment:

    debug1: no match: 4.1.0.34 SSH Secure Shell
    debug1: Enabling compatibility mode for protocol 2.0
    debug1: Local version string SSH-2.0-OpenSSH_3.6.1p1+CAN-2003-0693

but does not find a match in its list of known problems to work around. It elects to proceed with SSH-2 (the only choice in this case), and sends its own version string to the server, in the same format. If the client and server agree that their versions are compatible, the connection process continues; otherwise, either party may decide to terminate the connection.

At this point, if the connection proceeds, both sides switch to a nontextual, record-oriented protocol for further communication, which is the basis of SSH transport. This is often referred to as the SSH binary packet protocol , and is defined in SSH-TRANS.

Having established a connection and agreed on a protocol version, the first real function of SSH-TRANS is to arrange for the basic security properties of SSH:

But first, the two sides must agree on session parameters, including the methods to achieve these properties. The whole process happens in the protocol phase called the key exchange , even though the first part also negotiates some parameters unrelated to the key exchange per se.

    debug1: SSH2_MSG_KEXINIT sent
    debug1: SSH2_MSG_KEXINIT received

The client sends its KEXINIT ("key exchange initialization") message, and receives one from the server. Here are the choices it gives to the server:

    debug2: kex_parse_kexinit: gss-group1-sha1-toWM5Slw5Ew8Mqkay+al2g==,
                     gss-group1-sha1-A/vxljAEU54gt9a48EiANQ==,
                     diffie-hellman-group-exchange-sha1,
                     diffie-hellman-group1-sha1

These are the key exchange algorithms the client supports, which are:

diffie-hellman-group1-sha1

This algorithm is defined and required by SSH-TRANS; this specifies the well-known Diffie-Hellman procedure for key agreement, together with specific parameters (Oakley Group 2 [RFC-2409] and the SHA-1 hash algorithm).

diffie-hellman-group-exchange-sha1

Similar, but allows the client to choose from a list of group parameters, addressing concerns about possible attacks based on a fixed group; defined in the IETF draft document "secsh-dh-group-exchange."[15]

gss-group1-sha1-toWM5Slw5Ew8Mqkay+al2g==

gss-group1-sha1-A/vxljAEU54gt9a48EiANQ==

These odd-looking names are partially encoded in Base64--they represent two variants of a Kerberos-authenticated Diffie-Hellman exchange as defined in IETF draft "secsh-gsskeyex." These are useful where a Kerberos infrastructure is available, providing automatic and flexible server authentication without maintaining separate SSH host keys and known-hosts files. The Kerberos authentication proceeds by way of GSSAPI, and the name suffixes are the Base64 encoding of the MD5 hash of the ASN.1 DER encoding of the underlying GSSAPI mechanism's OID. Say that five times fast.

In terms of abstract requirements, an SSH key exchange algorithm has two outputs:

  • A shared secret, K

  • An "exchange hash," H

K is the master secret for the session: SSH-TRANS defines a method for deriving from secret K the various keys and other cryptographic parameters needed for specific encryption and integrity algorithms used in the SSH connection. The exchange hash H does not have to be secret, although it should not be divulged unnecessarily. It should be unique to each session, and computed in such a way that neither side can force a particular value of hash H. We'll see the significance of that later.

The key exchange should also perform server authentication, in order to guard against spoofing and man-in-the-middle (MITM) attacks. There is an inherent asymmetry in the SSH client/server relationship: the server accepts connections from as-yet unknown parties, whereas a client always has a particular server as the target of its connection. The server may demand secret information as part of user authentication (e.g., password). The client is the first party to rely on the identity of the other side, and hence server authentication comes first. Without server authentication, an attacker might redirect the client's TCP connection to a host of his choice (perhaps by subverting the DNS or network routing) and trick the user into logging into the wrong host; this is called spoofing. Or, he might interpose himself between the client and the (legitimate) server, executing the SSH protocol as server on one side and client on the other, passing messages back and forth and reading all the traffic! This is a man-in-the-middle attack.

The key exchange phase of SSH-TRANS may be repeated later in a connection, in order to replace an aging master secret or re-authenticate the server. In fact, the draft recommends that a connection be re-keyed after each gigabyte of transmitted data or after each hour of connection time, whichever comes sooner. However, the hash output H of the very first key exchange is used as the "session identifier" for this SSH connection; we'll see its use later.

Next, the client offers a choice of SSH host key types it can accept:

    debug2: kex_parse_kexinit: ssh-rsa,ssh-dss,null

In this case, it offers RSA, DSA, and "null," for no key at all. It includes "null" because of its support of Kerberos for host authentication; if a Kerberos key exchange is used, no SSH-specific host key is needed for server authentication.

After that, the client lists the bulk data encryption ciphers it supports:

    debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour,
                     aes192-cbc,aes256-cbc,rijndael-cbc@lysator.liu.se

The selected cipher is used for privacy of data flowing over the connection. Bulk data is never enciphered directly with public-key methods such as RSA or DSA because they are far too slow. Instead, we use a symmetric cipher such as those listed, protecting the session key for that cipher with public-key methods if appropriate. The names here indicate particular algorithms and associated cryptographic parameters; for instance, aes128-cbc refers to the Advanced Encryption Standard algorithm, with a 128-bit key in cipher-block-chaining mode.

Note the use of a private algorithm name as well: . This email-address-like syntax is defined in the SSH Architecture draft ("secsh-architecture"), and allows any individuals or organizations to define and use their own algorithms or other SSH protocol identifiers without going through the IETF to have them approved. Identifiers that don't contain an @ sign are global and must be centrally registered.

The draft also defines the "none" cipher, meaning no encryption is to be applied. While there are legitimate reasons for wanting such a connection (including debugging!), some SSH implementations do not support it, at least in their default configuration. Often, recompiling the software from source with different flags, or hacking the code itself, is needed to turn on support for "none" encryption .[16] The reason is that it's deemed just too dangerous. If a user can easily turn off encryption, so can an attacker who gains access to a user's account, even briefly. Imagine surreptitiously adding this to an OpenSSH user's client configuration file, ~/.ssh/config:

    # OpenSSH
    Host *
     Ciphers none

or simply replacing the ssh program on a compromised machine with one that uses the "none" cipher, and issues no warnings about it. Bingo! All the user's SSH sessions become transparent, until he notices the change (if ever). If the client doesn't support "none," then this simple config file hack won't work; if the server doesn't, then the client-side Trojan horse won't work, either.

Next, the client presents its list of available integrity algorithms:

    debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,
                     hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96

The integrity algorithm is applied to each message sent by the SSH record protocol, together with a sequence number and session key, to produce a message authentication code (MAC) appended to each message. The receiver can use the MAC and its own copy of the session key to verify that the message has not been altered in transit, is not a replay, and came from the other holder of the session key; these are the message integrity properties.

SSH-TRANS defines several MAC algorithms, and requires support for one: "hmac-sha1," a 160-bit hash using the standard keyed HMAC construction with SHA-1 (see RFC-2104, "HMAC: Keyed-Hashing for Message Authentication").

Finally, the client indicates which data-compression techniques it supports:

    debug2: kex_parse_kexinit: none,zlib

The draft does not require any compression to be available (i.e., "none" is the required type). It does define "zlib": LZ77 compression as described in RFC-1950 and in RFC-1951. Although it does not appear here, SSH speakers also at this point also can negotiate a language tag for the session (as described in RFC-3066), e.g., to allow a server to provide error messages in a language appropriate to the user.

Having sent its negotiation message, the client also receives one from the server, listing the various parameters it supports in the same categories:

    debug2: kex_parse_kexinit: diffie-hellman-group1-sha1
    debug2: kex_parse_kexinit: ssh-dss,x509v3-sign-rsa
    debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,twofish128-cbc,cast128-cbc,
                     twofish-cbc, blowfish-cbc,aes192-cbc,aes256-cbc,
                     twofish192-cbc,twofish256-cbc,arcfour
    debug2: kex_parse_kexinit: hmac-sha1,hmac-sha1-96,hmac-md5,hmac-md5-96
    debug2: kex_parse_kexinit: none,zlib

Note that this server supports a much smaller set of key exchange algorithms: only the required one, in fact. It has two host key types to offer: plain DSS, and RSA with X.509 public-key certificate attached. It does not support a null host key since its single key exchange algorithm requires one.

Next, each side chooses a cipher/integrity/compression combination from the other side's set of supported algorithms:

    debug1: kex: server->client aes128-cbc hmac-md5 none
    debug1: kex: client->server aes128-cbc hmac-md5 none

In this case, the choices in both directions are the same; however, they need not be. The choice of these mechanisms is entirely independent, and they are independently keyed, as well. Data flowing in one direction might be encrypted with AES and compressed, while the return stream could be encrypted with 3DES without compression.

At this point, we are ready to engage in the actual key exchange:

    debug2: dh_gen_key: priv key bits set: 131/256
    debug2: bits set: 510/1024
    debug1: sending SSH2_MSG_KEXDH_INIT

The client chooses an exchange algorithm from the server's advertised set; in this case, the server offers only one, and we go with it. We generate an ephemeral key as part of the Diffie-Hellman algorithm, and send the initial message of the diffie-hellman-group1-sha1 exchange, simultaneously letting the server know which method we're using, and actually starting it.

Next the client expects, and the server sends, its reply to our KEXDH_INIT message:

    debug1: expecting SSH2_MSG_KEXDH_REPLY
    debug1: Host 'host.foo.net' is known and matches the DSA host key.
    debug1: Found key in /Users/res/.ssh/known_hosts:169
    debug2: bits set: 526/1024
    debug1: ssh_dss_verify: signature correct

Contained in the reply is the server's SSH public host key, of a type we said we'd accept in the earlier parameter negotiation (DSA), along with a signature proving it holds the corresponding private key. The signature is verified, of course, but that by itself is meaningless; for all we know, the server just generated this key. The crucial step here is to check that the public key identifies the server we wanted to contact. In this case, the client finds a record associating the name foo.host.net with the key supplied by the server, at line 169 in the user's OpenSSH known_hosts file.

Note that the approach used to verify the host key is entirely unspecified by the SSH protocol; it's completely implementation-dependent. Most SSH products provide some version of the known-hosts file method used here: simple, but limiting and cumbersome for large numbers of hosts, users, or different SSH implementations. A client could do anything that makes sense to verify the host key, perhaps taking advantage of some existing secure infrastructure, for example; look it up in a trusted LDAP directory.

Of course, the problem of verifying the owner of a public key is hardly a new one; that's what Public Key Infrastructure (PKI) systems are for, such as the X.509 standard for public-key certificates. SSH-2 supports PKI, defining a number of key types which include attached certificates:

ssh-rsa

Plain RSA key

ssh-dss

Plain DSS key

x509v3-sign-rsa

X.509 certificates (RSA key)

x509v3-sign-dss

X.509 certificates (DSS key)

spki-sign-rsa

SPKI certificates (RSA key)

spki-sign-dss

SPKI certificates (DSS key)

pgp-sign-rsa

OpenPGP certificates (RSA key)

pgp-sign-dss

OpenPGP certificates (DSS key)

Many SSH products handle only plain DSS/RSA keys, but some (such as Tectia) offer PKI support as well. Recall that earlier, the server offered a key type of x509v3-sign-rsa along with plain DSS. Our OpenSSH client does not support certificates, and so selected the DSS key. However, with PKI support, the client could verify the host key by its accompanying certificate. New hosts could be added and existing keys changed, without having to push out new known-hosts files to all clients every time—often a practical impossibility anyway, when you consider laptops, many different SSH clients with different ways of storing host keys, etc. Instead, clients only need a single key; that of the authority issuing your host key certificates. We'll cover PKI in more detail in a case study. [11.5]

As noted earlier, we're avoiding diving too deeply into protocol details, instead attempting a technical overview that covers the issues SSH administrators most need to understand to deploy effective systems. However, it's worth going a little deeper here regarding the actual mechanism of server authentication, since our description begs the question. Simply saying that the server "provides a signature" to prove its identity doesn't cut it. Here's a naive protocol:

We're being at least moderately clever here; by using a random challenge, we assure that the response can't be replayed by an attacker, i.e., is not a reply from an earlier session. Not bad, but no cigar: this simple procedure does not prevent MITM attacks! An MITM attacker can simply pass along the challenge to the server, get the signature, and pass it back to the client. All this protocol really proves to the client is that the entity at the other end of its connection can talk to the real server, when what the client wants to verify is that entity actually is the real server. So, here's how it's done: instead of a random challenge, the server signs the SSH session identifier, which we described earlier. Recall that the identifier is unique to each session, and that neither side can force a particular value for it. In order to do MITM, our attacker has to execute the SSH protocol independently on two sides: once with the client, and again with the server. The identifiers for those two connections are guaranteed to be different, no matter what the attacker does. He needs to produce the client-side identifier signed by the server in order to fool the client, but all he can get is the server-side identifier; he can't force the server to sign the wrong identifier.

Cryptographers are devious people. We like them.

Back to our debug trace example: we've sent and received a single key-exchange message on each side now, and this key-exchange method in fact only requires the two messages. Other exchange mechanisms could take any number and form of messages, but ours is now complete. Based on the contents of these messages, both sides compute the needed shared master secret K and exchange hash H, in such a way that an observer can't feasibly discover them (we leave the mathematical details to your perusal of the actual draft document, if you're that curious). Having authenticated the exchange using the server's host key, we are convinced that we have shared keys with the server we really wanted to talk to, and now everything is in place to turn on security in the form of encryption and integrity checking.

Using a procedure defined in the draft, the client derives appropriate encryption and integrity keys from the master secret; the server does the same to produce matching keys:

    debug2: kex_derive_keys
    debug2: set_newkeys: mode 1
    debug1: SSH2_MSG_NEWKEYS sent
    debug1: expecting SSH2_MSG_NEWKEYS
    debug2: set_newkeys: mode 0
    debug1: SSH2_MSG_NEWKEYS received

Both sides then send the NEWKEYS message, each which marks taking the new keys into effect in its own direction; all messages after NEWKEYS are protected using the new set of keys just negotiated. With a functioning SSH-TRANS session at hand, the client now requests the first service it wants access over the connection: user authentication.

    debug1: SSH2_MSG_SERVICE_REQUEST sent
    debug2: service_accept: ssh-userauth
    debug1: SSH2_MSG_SERVICE_ACCEPT received

Compared to SSH-TRANS, SSH-AUTH is a relatively simple affair, defined in a mere 12 pages as opposed to the 28 of the SSH-TRANS document (and that's not counting various extensions!). As with SSH-TRANS and key-exchange methods, the authentication protocol defines a framework within which arbitrary authentication exchanges may take place. It then defines a small number of actual authentication mechanisms, and allows for easy extension to define others. The three defined methods are password, public-key, and host-based authentication, of which only public-key is required.

The authentication process is driven by the client, framed by client requests and server responses. A request contains the following parts:

Once a particular authentication method starts, it may include any number of other message types specific to its needs. Or in the simplest case, the data carried by the initial request is enough, and the server can respond right away. In any case, after the request and some number of subsequent method-specific messages back and forth, the server issues an authentication response.

Note that, strictly speaking, calling this an "authentication request" is not quite accurate; this request actually mixes authentication and authorization. It requests verifying an authentication identity via some method, and simultaneously asks the server to check that identity's right to access a particular account: an authorization decision. If the attempt fails, the client doesn't know whether this was because authentication failed (e.g., it supplied the wrong password), or authentication succeeded but authorization failed (e.g., the password was right but the account was disabled). A human-readable error message might make that clear, but the situations are indistinguishable as far as the protocol is concerned (in general, but individual methods may provide more information, as we will see later with the public-key method).

An SSH-AUTH authentication response comes in two flavors: SUCCESS and FAILURE (an early version of the protocol had chocolate, too, which was unfortunately abandoned). A SUCCESS message carries no other data; it simply means that authentication was successful, and the requested service has been started; further SSH-TRANS messages sent by the client should be defined within that service's protocol, and the SSH-AUTH run is over.

A FAILURE message has more structure:

The name "failure" is actually a bit misleading here. If the partial success flag is false, then this message does mean the preceding authentication method has failed for some reason (e.g., a supplied password was incorrect, a mismatched public key produced an incorrect signature, the requested account is locked out, etc.). If the flag is true, however, the message means that the method succeeded; however, the server requires that additional methods also succeed before granting access. Thus, the protocol allows an SSH server to require multiple authentication methods—although not all implementations provide the feature; Tectia does, for instance, while OpenSSH currently does not.[17]

In either case, the message also supplies the list of authentication methods the server is willing to accept next. This allows for much flexibility; if it wants, the server can completely control the authentication process by only allowing one method at any time. But it can also specify multiple methods, allowing the client to choose them in an order which makes sense for the user. For instance, given a choice, a SSH client usually first tries methods that allow automatic authentication, such as Kerberos or public key with an agent, before those that require user intervention, such as entering a password or key passphrase.

One thing is missing from all this: if the client drives the authentication process by making requests, but the list of available authentication methods is contained in server responses, then how does the client pick a first method to try? Of course, it could always just try any method and see what happens; the worst that could happen is that it fails or isn't available, and the client gets a correct list to pick from. But that's messy, and there's a standard way to do it: the "none" method. The protocol reserves the method name "none," and gives it a special meaning: if authentication is required at all, then this method must always fail. A client typically starts SSH-AUTH by sending a "none" request, expecting failure and getting back the list of available non-"none" methods to try. Of course, if the account in question does not require authentication, the server may respond with SUCCESS, immediately granting access.

Here, the client, having already sent the "none" request to start with, now receives its initial list of methods to try:

    debug1: Authentications that can continue: publickey,password

If you're debugging on the server side, you see something like this (with the OpenSSH server):

    debug1: userauth-request for user res service ssh-connection method none
    debug1: attempt 0 failures 0
    Failed none for res from 10.1.1.1 port 50459 ssh2

This message is confusing if you're debugging some other problem, as it appears to show some mysterious failure.

The client continues, choosing public-key authentication to try first, with a DSS key stored in the SSH agent:

    debug1: Next authentication method: publickey
    debug1: Offering agent key: res-dsa
    debug2: we sent a publickey packet, wait for reply

A public-key authentication request carries the method name "publickey" and may have different forms depending on a flag setting. One form has this method-specific payload:

The usable public-key algorithms are the same set defined in SSH-TRANS, and the format key data depends on the type; e.g., for "ssh-dss" it contains just the key, whereas for x509v3-sign-rsa it contains an X.509 public-key certificate.

With the flag set to FALSE, this message is merely an authorization test: it asks the server to check whether this key is authorized to access the desired account, without actually performing authentication. If it is, a special response message comes back; this is an example of the possible method-specific SSH-AUTH messages we mentioned earlier. If the key is not authorized, the response is simply FAILURE.

The second form is:

This actually requests authentication; the signature is computed over a set of request-specific data which includes the session ID, which binds the request to this SSH session and gives the public-key method its own measure of MITM resistance, similar to that described earlier for key exchange.

The reason for providing both forms of request is that computing and verifying public-key signatures are compute-intensive tasks, which might also require interaction with the user (e.g., typing in a key passphrase). Hence, it makes sense to test a key first, to see whether it's worth going to the trouble of using it.

The way a server actually authorizes a key for access to an account is outside the scope of the protocol, and can be anything at all. The usual way is to list or refer to the key in some file in the account, as with the OpenSSH ~/.ssh/authorized_keys file. However, the server might access any type of service to do this; again, checking an entry in an LDAP directory comes to mind. Or again, certificates might be used: just as with host authentication, the key here might include a certificate, and any of the certificate's data might be used to make the authorization decision.

Coming back to our debug trace, we see that the server accepts the offered key:

    debug1: Server accepts key: pkalg ssh-dss blen 435 lastkey 0x309a40 hint -1
    debug2: input_userauth_pk_ok: fp 63:24:90:03:cb:78:85:e6:59:71:49:26:55:81:f5:70
    debug1: Authentication succeeded (publickey).

Then it logs the key's fingerprint and returns the final SUCCESS message, indicating that access is granted and the SSH-AUTH session is finished.

Before moving on to the final protocol phase, let's examine two other methods defined in SSH-AUTH: password and hostbased authentication.

The password method is very simple: its name is "password," and the data is, surprise, the password. The server simply returns success or failure messages as appropriate. The method it uses to verify the password is implementation-dependent, and varies a great deal: PAM, Unix password files, LDAP, Kerberos, NTLM; all these are available in various products.

The password is passed in plaintext, at least as far as SSH-AUTH is concerned; hence, it is critical that this method be used over an encrypted connection (as is usually the case with SSH). Furthermore, since this method reveals the password to the server, it is crucial that the server not be an impostor. Even if an SSH product may warn of, but allow, a connection to an unauthenticated server in SSH-TRANS, it usually disallows password authentication in SSH-AUTH for this reason. Compare this with the public-key method, which doesn't reveal the user's key in the authentication process.

It should be mentioned that "password authentication" is a pretty broad term, and might be construed as encompassing other, better methods. If you think of it as describing any mechanisms that rely on secrets that can be easily memorized and typed by a human, then there are "password" methods with much better security properties than the trivial one described here; the Secure Remote Password protocol (SRP, http://srp.stanford.edu/) is one. [1.6.5] In this book, however, when we talk about "password" authentication, we mean as defined in SSH-AUTH.

SSH-AUTH also has a set of messages for password changing—for example, allowing a user whose password has expired to set a new one before logging in.

Hostbased authentication is fundamentally different from its public-key and password cousins, in that the server actually delegates responsibility for user authentication to the client host. In short, hostbased authentication establishes trust relationships between machines. Rather than directly verifying the user's identity, the SSH server verifies the identity of the client host--and then believes the host when it says the user has already authenticated on the client side. Therefore, you needn't prove your identity to every host that you visit. If you are logged in as user andrew on machine A, and you connect by SSH to account bob on machine B using hostbased authentication, the SSH server on machine B doesn't check your identity directly. Instead, it checks the identity of host A, making sure that A is a trusted host. It further checks that the connection is coming from a trusted program on A, one installed by the system administrator that won't lie about andrew's identity. If the connection passes these two tests, the server takes A's word that you have been authenticated as andrew and proceeds to make an authorization check that is allowed to access the account .

This sort of authentication makes sense only in a tightly administrated environment with less stringent security requirements, or when deployed for very specific and limited purposes, such as batch jobs. It demands that all participating hosts be centrally administered, making sure that usernames are globally selected and coordinated. If not, you could get access to someone else's account just by adding an account with the same name to your own machine! Also, there's the problem of transitive compromise: once one host is broken, the attacker automatically gets access to all accounts accessible via hostbased authentication from there, without any further work.

Nevertheless, hostbased authentication has advantages. For one, it is simple: you don't have to type passwords or passphrases, or generate, distribute, and maintain keys. It also provides ease of automation. Unattended processes such as cron jobs may have difficulty using SSH if they need a key, passphrase, or password coded into a script, placed in a protected file, or stored in memory. This isn't only a potential security risk but also a maintenance nightmare. If the authenticator ever changes, you must hunt down and change these hardcoded copies, a situation just begging for things to break mysteriously later on. Hostbased authentication gets around this problem neatly.

The "hostbased" request looks like:

  • Host key algorithm

  • Client host key

  • Client hostname

  • Client-side username, C

  • Signature

Note that this request has two usernames: the requested server-side account name U present in every SSH-AUTH request, and the client-side username C specific to the hostbased request. The interpretation is that user C on the client is requesting access to account U on the server, and the client's authentication as C is vouched for by the signature of the client host key. The mapping of which client usernames may access which accounts on the server is up to the implementation. Unix products tend to use semantics similar to the historical rhosts syntax, in the files /etc/shosts.equiv and ~/.shosts. These can implement global identity mappings, allowing matching usernames automatic access, as well as more complicated or limited access patterns.

In order to perform this authentication, the server must verify the client host identity—that is, it must check that the supplied key matches the claimed client hostname (e.g., with a known-hosts file). Having checked that and verified the signature, it then uses that same hostname in the authorization check (e.g., in /etc/shosts.equiv), to see if the requested client/server name pair is allowed access from this client host. Some implementations also check that the client's network address actually maps to the given hostname via the local naming service (DNS, NIS, etc.), but this is not really necessary; the meat of the authorization is in the association of the verified hostname supplied in the request, and the authorization rules. In fact, the address check may cause more trouble than it's worth, in the presence of poorly maintained DNS, network complications such as NAT, firewalls, proxying, etc.

Of course, for this whole scenario to make any sense at all, there are yet more administrative burdens to be met. The signature, after all, is supplied by the client; and yet it is interpreted here as a trusted third party—the client host as a separate entity—vouching for the user's identity. But the user is behind the SSH client; how does this work? The answer is that the client host and SSH software must be arranged so that the user is not fully in control of what's going on. The private client host key must not be accessible to the user; rather, there must be a trusted service whereby the user can obtain the needed signature for the hostbased authentication request, and such signatures are only issued as appropriate. In a Unix context, usually the private host key file is readable only by the root account, and some part of SSH is installed with special privileges by the sysadmin ("setuid root"; typically this is a separate program called ssh-signer, which serves only this purpose). This trusted program checks the uid of the user running it, and issues signatures only for the corresponding username. This effectively translates the local authentication that allowed the person to log in to begin with, into an SSH certificate which can be transmitted and trusted as part of hostbased authentication. This description makes it even more clear how the whole arrangement is predicated on a very centrally controlled and consistently administrated system. One should evaluate very carefully whether hostbased authentication is the right choice.

In its final, successful authentication request, the client specified a service name of "ssh-connection"; this is not visible in the OpenSSH client debug trace but shows up on the server as:

    debug1: userauth-request for user res service ssh-connection method publickey

Since it authenticated the client, the server now starts that service, and we move on to the SSH Connection Protocol. This layer actually provides the capabilities that users want to employ directly and that define SSH for most people: remote login and command execution, agent forwarding, file transfer, TCP port forwarding, X forwarding, etc.

There is a lot of detail in the connection protocol, but much of it is too low-level for our present discussion; we give a fairly high-level description here, sufficient to interpret most debugging messages and to understand how an SSH product provides its services using SSH-CONN. Unlike the earlier protocols, a really detailed understanding of SSH-CONN is not usually needed for debugging everyday SSH problems.

The basic service SSH-CONN provides is multiplexing. SSH-CONN takes the single, secure, duplex byte-stream provided by SSH-TRANS, and allows its clients to create dynamically any number of logical SSH-CONN channels over it. Channels are identified by channel numbers , and may be created or destroyed by either side. Channels are individually flow-controlled, and each channel has a channel type which defines its use. Types and other items in SSH-CONN are named in the same extensible manner as other SSH namespaces (key exchanges, key algorithm and authenticated method names, etc.). The defined types are:

session

The remote execution of a program.

Merely opening a session channel does not start a program; that is done using subsequent requests on the channel. An SSH-CONN session may have multiple session channels open at once, simultaneously supporting several terminal, file-transfer, or program executions at once. Various Windows-based SSH products have used this ability for some time now; it has only recently appeared in OpenSSH with the ControlMaster/ControlPath feature. [7.4.4.2]

x11

An X11 client connection.

One of these is opened from server to client, for each X11 program using X forwarding as established by an x11-req on a session channel (discussed later).

forwarded-tcpip

An inbound connection to a remotely forwarded port.

When a connection arrives on a remotely forwarded TCP port, the server opens this channel back to the client to carry the connection.

direct-tcpip

An outbound TCP connection.

This directs the peer to open a TCP connection to a given socket, and attach the channel to that connection. The socket may be specified using a domain name or IP address, allowing a name to be resolved on the remote side in a possibly different namespace than the client. These channels are used to implement local TCP forwarding (ssh -L). Preparing for local forwarding is purely a client-side affair: the client simply starts listening on the requested port.[18] The server first hears of it when a connection actually arrives on the port, whereupon the client opens a direct-tcpip channel with the appropriate target socket. This means that if certain local forwardings are disallowed by the server, this isn't noticed on connection setup, but only when a connection is actually attempted

Channel semantics are richer than a traditional Unix file handle; the data they carry can be typed, and this facility is used to distinguish between stdout and stderr output from a program on a single channel.

In addition to an array of channel operations—open, close, send data, send urgent data, etc.--SSH-CONN defines a set of requests , with global or channel scope. A global request affects the state of the connection as a whole, whereas a channel request is tied to a particular open channel. The global requests are:

Now let's summarize the channel requests ; except as indicated, most operations refer to the remote side of a session channel:

pty-req

Allocate a pty, including window size and terminal modes.

This creates a pseudo-terminal for the channel, generally required for interactive applications; the pseudo-terminal is a virtual device which makes it appear that the remote program is directly connected to a terminal.

x11-req

Set up X11 forwarding.

Do the preparation necessary for X11 forwarding on the remote; usually involves listening on a socket (TCP or otherwise) for X11 connections, setting the DISPLAY variable to point to that socket, and setting up proxy X11 authentication.

env

Set an environment variable.

Although useful, this feature is also a potential security problem. It has not been widely supported by SSH implementations until recently and is generally carefully controlled.

shell, exec, subsystem

Run the default account shell, an arbitrary program, or an abstract service, respectively.

These requests start a program running on the remote side, and connect the channel to the program's standard input/output/error streams. The "subsystem" request allows a remote program to be named abstractly, rather than being depended on by a particular remote filename. For instance, an SFTP file transfer is usually started by sending a subsystem request with the name "sftp." The SSH server is configured to execute the correct server program in response to the request; this way, the location of the SFTP server program can change without affecting clients. Or indeed, SFTP could be implemented internal to the SSH server itself, rather than being a separate program, and this, too, would be transparent to clients; this is an option with Tectia.

window-change

Change terminal window size.

xon-xoff

Use client-side ^S/^Q flow control.

signal

Send a specified signal to a remote process (as in the Unix kill command).

exit-status

Return the program's exit status to the initiator.

exit-signal

Return the signal that terminated the program (e.g., if a remote program dies by signal, as from a segmentation fault or manual kill -9 command).

Theoretically, all these requests are symmetric; that is, the protocol allows the server to open a session channel to the client and request a program to be started on it, for example. However, in most SSH implementations as a remote-login tool, this simply doesn't make sense, and is an obvious security risk to boot! So, such requests are usually not honored by clients (and the SSH-CONN draft recommends as much).

With all this behind us, we can easily make sense of the remainder of the connection setup. The client opens a session channel with id 0:

    debug1: channel 0: new [client-session]
    debug2: channel 0: send open
    debug1: Entering interactive session.

This session is a terminal login, so next we request a pseudo-terminal on the session channel:

    debug1: channel 0: request pty-req

X forwarding is turned on, so the client first gets the local X11 display key by running the xauth program on this side, then requests X forwarding on the remote by sending an x11-req global request:

    debug2: x11_get_proto: /usr/X11R6/bin/xauth list :0 2>/dev/null
    debug1: Requesting X11 forwarding with authentication spoofing.
    debug1: channel 0: request x11-req

Agent forwarding is also turned on, so we open a channel for that as well:

    debug1: Requesting authentication agent forwarding.
    debug1: channel 0: request auth-agent-req@openssh.com

But wait... we didn't mention agent forwarding anywhere in SSH-CONN, nor the channel type that appears here, . Indeed, that's because it's not there; key agents are an implementation detail outside the purview of the protocol. This channel type is an example of the naming extension syntax; it is particular to the OpenSSH implementation. An OpenSSH server accepts such a channel request and sets up an agent-forwarding socket on the remote end (whose details are specific to the OpenSSH program suite). A non-OpenSSH server would refuse the unrecognized request, and agent forwarding would not be available.

Finally, the client issues a "shell" request on the session channel:

    debug1: channel 0: request shell

directing the remote account's default command be started. And at long last...

    debug1: channel 0: open confirm rwindow 100000 rmax 1638
    Last login: Mon Aug 30 2004 18:04:10 -0400 from foo.host.net
    $

...we're logged in!



[14] SSH Version 1 went through several revisions, the most popular known as Versions 1.3 and 1.5.

[15] A group is a mathematical abstraction relevant to the Diffie-Hellman procedure; see a reference on group theory, number theory, or abstract algebra if you're curious.

[16] OpenSSH has no support for the "none" cipher; it can't even be enabled at compile time. In contrast, Tectia fully supports the "none" cipher, but it is not enabled by default; it needs to be explicitly included using the Ciphers keyword. [5.3.5]

[17] The OpenSSH team is working on multiple authentication support.

[18] Unlike remote forwarding, no initial setup is required on the remote side.

[19] The OpenSSH team is working on adding this feature.