TCP flow control and avoiding deadlock

When designing application protocols and writing network code, we need to be careful to prevent a deadlock state. A deadlock occurs when each side of a connection is waiting for the other side to do something. In the worst case, both sides end up waiting indefinitely.

A trivial example of a deadlock is if both the client and server call recv() immediately after the connection is established. In that case, both sides wait forever for data that is never going to come.

A less obvious deadlock situation can happen if both parties try to send data at the same time. Before we can consider this situation, we must first understand a few more details of how TCP connections operate.

When data is sent over a TCP connection, this data is broken up into segments. A few segments are sent immediately, but additional segments aren't sent over the network until the first few segments are acknowledged as being received by the connected peer. This is part of TCP's flow-control scheme, and it helps to prevent a sender from transmitting data faster than a receiver can handle.

Consider the following diagram:

[Diagram: the Client sends three DATA segments to the Server, then pauses until it receives an ACK message, and then resumes sending.]

In the preceding diagram, the Client sends three TCP segments of data to the Server. The Client has additional DATA ready to send, but it must wait until the already-sent data is acknowledged. Once the ACK Message is received, the Client resumes sending its remaining DATA.

This is the TCP flow-control mechanism that ensures that the sender isn't transmitting faster than the receiver can handle.

Now, keeping in mind that a TCP socket can send only a limited amount of data before requiring acknowledgment of receipt, imagine what happens if both parties to a TCP connection try to send a bunch of data at the same time. In this case, both parties send the first few TCP segments. They both then wait until their peer acknowledges receipt before sending more. However, if neither party is reading data, then neither party acknowledges receiving data. This is a deadlock state. Both parties are stuck waiting forever.
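To make the failure concrete, here is a minimal sketch of the pattern that gets both peers stuck. POSIX sockets are assumed, and the function name and the already-connected socket, sock, are ours for illustration only. If both peers run a loop like this against each other, every send() eventually blocks.

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Anti-pattern: both peers run a loop like this at the same time and
     * never call recv(). Once the local send buffer and the peer's receive
     * window fill up, send() blocks. Because the peer isn't reading either,
     * it blocks forever. */
    void flood_peer(int sock) {
        char buffer[4096];
        memset(buffer, 'X', sizeof(buffer));

        long long total = 0;
        while (total < 100000000) {            /* try to push ~100 MB */
            ssize_t sent = send(sock, buffer, sizeof(buffer), 0);
            if (sent < 1) break;               /* error or connection closed */
            total += sent;                     /* note: no recv() anywhere */
        }
    }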

Many application protocols prevent this problem by design. These protocols naturally alternate between sending and receiving data. For example, in HTTP, the client sends a request, and then the server sends a reply. The server only starts sending data after the client has finished sending.

However, TCP is a full-duplex protocol. Applications that do need to send data in both directions simultaneously should take advantage of TCP's ability to do so.

As a motivating example, imagine implementing a file-transfer program where both peers to a TCP connection are sending large parts of a file at the same time. How do we prevent the deadlock condition?

The solution to this is straightforward. Both sides should alternate calls to send() with calls to recv(). The liberal use of select() will help us do this efficiently.

Recall that select() indicates which sockets are ready to be read from and which sockets are ready to be written to. The send() function should be called only when you know that a socket is ready to be written to. Otherwise, you risk having send() block; in the worst case, it blocks indefinitely.
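As a rough illustration of that rule, and not a prescribed API, a helper along these lines waits for writability before sending. POSIX select() is assumed, and the function name send_when_ready is made up:

    #include <stdio.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Wait until 'sock' is writable, then hand it as much of 'data' as it
     * will take. Returns the number of bytes send() consumed, or -1 on error. */
    ssize_t send_when_ready(int sock, const char *data, size_t data_len) {
        fd_set writes;
        FD_ZERO(&writes);
        FD_SET(sock, &writes);

        /* Block until the socket can accept more outgoing data. */
        if (select(sock + 1, 0, &writes, 0, 0) < 0) {
            perror("select");
            return -1;
        }

        /* send() can now take at least some data without blocking, although
         * it may still consume fewer bytes than we asked for. */
        return send(sock, data, data_len, 0);
    }

Even after select() reports the socket as writable, send() may accept fewer bytes than requested; handling that leftover is exactly what step 2 of the following procedure is about.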

Thus, one procedure to send a large amount of data is as follows:

  1. Call send() with your remaining data.
  2. The return value of send() indicates how many bytes were actually consumed. If fewer bytes were sent than intended, the remainder should be transmitted by a later call to send().
  3. Call select() with your socket in both the read and write sets.
  4. If select() indicates that the socket is ready to be read from, call recv() on it and handle the received data as needed.
  5. If select() indicates that the socket is ready to write to again, go to step 1 and call send() with the remaining data to be sent.

The important point is that calls to send() are interspersed with calls to recv(). In this way, incoming data is consumed promptly, the peer's sends can make progress, and the deadlock condition does not occur.
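Putting the procedure together, a sketch of such a loop might look like the following. It is an outline under stated assumptions rather than a complete program: POSIX sockets, an already-connected socket sock, and the whole outgoing payload held in memory as send_buf and send_len.

    #include <stdio.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Send 'send_len' bytes from 'send_buf' over the connected socket 'sock'
     * while also draining anything the peer sends to us. */
    void send_and_receive(int sock, const char *send_buf, size_t send_len) {
        const char *send_ptr = send_buf;       /* next byte to transmit */
        size_t send_remaining = send_len;      /* bytes still to send   */

        while (send_remaining > 0) {
            fd_set reads, writes;
            FD_ZERO(&reads);  FD_SET(sock, &reads);
            FD_ZERO(&writes); FD_SET(sock, &writes);

            if (select(sock + 1, &reads, &writes, 0, 0) < 0) {
                perror("select");
                return;
            }

            if (FD_ISSET(sock, &reads)) {
                /* The peer sent us something. Read it so that its send()
                 * can make progress and our receive window stays open. */
                char in[4096];
                ssize_t received = recv(sock, in, sizeof(in), 0);
                if (received < 1) return;      /* connection closed or error */
                /* ...handle the 'received' bytes as needed... */
            }

            if (FD_ISSET(sock, &writes)) {
                /* Send as much as the socket will take right now. */
                ssize_t sent = send(sock, send_ptr, send_remaining, 0);
                if (sent < 0) return;          /* error */
                send_ptr += sent;
                send_remaining -= sent;        /* send() may take only part */
            }
        }
    }

If this peer is also receiving a large transfer, the recv() branch would store the incoming bytes somewhere useful. The key property is that neither direction can starve the other, because select() wakes the loop whenever either direction can make progress.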

This method also neatly extends to applications with many open sockets. Each socket is added to the select() call, and ready sockets are serviced as needed. Your application will need to keep track of the data that remains to be sent for each connection.
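One way to do that bookkeeping, purely as an illustration (the struct and field names are invented), is to keep a small state record per connection:

    #include <stddef.h>

    /* Per-connection bookkeeping for a select()-based loop. */
    struct connection_state {
        int socket;                /* the connected TCP socket              */
        const char *send_buffer;   /* outgoing data for this connection     */
        size_t bytes_sent;         /* how much of send_buffer has gone out  */
        size_t bytes_total;        /* total length of send_buffer           */
    };

Whenever select() reports a connection's socket as writable, you would send from send_buffer + bytes_sent and advance bytes_sent by however many bytes send() reports consuming.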

It is also worth noting that putting sockets into non-blocking mode can simplify your program's logic in some cases. Even with non-blocking sockets, select() can still be used as a central blocking point to wait for socket events.
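For reference, on POSIX systems a socket can be put into non-blocking mode with fcntl(); Windows uses ioctlsocket() with FIONBIO instead. The following helper is only a sketch of the POSIX call, and the function name is made up:

    #include <fcntl.h>
    #include <stdio.h>

    /* Put a socket into non-blocking mode (POSIX). Afterward, send() and
     * recv() return -1 with errno set to EWOULDBLOCK/EAGAIN instead of
     * blocking when they can't make progress right away. */
    int set_nonblocking(int sock) {
        int flags = fcntl(sock, F_GETFL, 0);
        if (flags < 0) { perror("fcntl(F_GETFL)"); return -1; }
        if (fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0) {
            perror("fcntl(F_SETFL)");
            return -1;
        }
        return 0;
    }

With this set, a send() that would have blocked returns immediately, and the select() loop decides when to retry.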

Two files included in this chapter's code repository help to demonstrate the deadlock state and how select() can be used to prevent it. The first file, server_ignore.c, implements a simple TCP server that accepts connections and then ignores them. The second file, big_send.c, initiates a TCP connection and then attempts to send lots of data. By using the big_send program to connect to the server_ignore program, you can investigate the blocking behavior of send() for yourself.

Deadlocks represent only one way a TCP connection can unexpectedly fail. While deadlocks can be very difficult to diagnose, they are preventable with careful programming. Besides the risk of deadlock, TCP also presents other data-transfer pitfalls. Let's consider another common performance problem next.