If we are looking for more reliable data transfer, we will have to employ the Transmission Control Protocol (TCP). It is also known as TCP/IP because it uses IP addresses and runs on top of the lower-level Internet Protocol (IP).
TCP uses a networking concept known as a connection: a contract between two endpoints stating that data will be sent between them, backed by some contextual data describing the current state of the connection. TCP uses this state to maintain counters for the sent and received packets on both ends of the connection. There are several aspects to TCP, such as the following:
- Connection setup: If we want to communicate with TCP, we must first establish a connection between both endpoints. The mechanism that's used is called a three-way handshake: first, a connection request (called SYN) is sent, then the counterpart responds with SYN-ACK, signalling its intention to participate in the connection, and lastly the initiator completes the handshake with ACK. This sequence is illustrated by the following diagram:
- Data transmission: After that, data can be exchanged between both endpoints over the connection. In this basic model, each sent packet needs to be acknowledged by the receiving side (that is, ACK) before the next packet is sent. If we don't receive the ACK within the interval we are willing to wait for it, we declare the packet lost and send it again, a procedure known as retransmission. Of course, if we retransmit too soon, the remote node gets two copies of the data and has to cope with that by discarding one of them.
- Flow and congestion control: TCP also contains mechanisms to avoid sending data faster than the receiver can process it (flow control) and to prevent the network as a whole from becoming overloaded (congestion control).
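The acknowledge-and-retransmit cycle from the data transmission step can be sketched as a small simulation. This is a toy model, not how a TCP stack is implemented: the names `Receiver`, `send_all`, `lost_packets`, and `lost_acks` are made up for illustration, packets are numbered one by one rather than by byte offset, and a "timeout" is simply a dropped packet or a dropped ACK on a simulated link.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Toy receiving endpoint that tracks the next sequence number it expects."""
    expected_seq: int = 0
    delivered: list = field(default_factory=list)
    duplicates_discarded: int = 0

    def on_packet(self, seq, payload):
        """Accept in-order data, discard duplicates, and return an ACK either way."""
        if seq == self.expected_seq:
            self.delivered.append(payload)
            self.expected_seq += 1
        else:
            # A second copy of data we already have: drop it.
            self.duplicates_discarded += 1
        return seq  # ACK for the sequence number just seen


def send_all(payloads, receiver, lost_packets=(), lost_acks=()):
    """Send payloads one at a time, waiting for each ACK before continuing.

    lost_packets and lost_acks contain (seq, attempt) pairs that the
    simulated link drops. Returns the total number of transmissions,
    including retransmissions.
    """
    transmissions = 0
    for seq, payload in enumerate(payloads):
        attempt = 0
        while True:
            attempt += 1
            transmissions += 1
            if (seq, attempt) in lost_packets:
                continue  # packet never arrived: wait times out, retransmit
            ack = receiver.on_packet(seq, payload)
            if (seq, attempt) in lost_acks:
                continue  # ACK never arrived: wait times out, retransmit
            if ack == seq:
                break  # acknowledged, move on to the next packet
    return transmissions


receiver = Receiver()
count = send_all(["a", "b", "c"], receiver,
                 lost_packets={(1, 1)}, lost_acks={(2, 1)})
print(receiver.delivered, count, receiver.duplicates_discarded)
```

In this run, the first copy of packet 1 is lost and the first ACK for packet 2 is lost, so five transmissions are needed for three packets. The lost ACK has the same effect as retransmitting too soon: the receiver sees packet 2 twice and discards the second copy, while all three payloads are still delivered exactly once and in order.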
This describes the basic mechanism of TCP; however, the real implementation contains many more features and optimizations. We will describe them later in this chapter, when we discuss TCP's performance.
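From an application's point of view, none of these steps are coded by hand: the operating system performs the handshake, acknowledgements, and retransmissions behind the socket API. The following minimal sketch, using Python's standard `socket` module over the loopback interface, shows where connection setup and data transmission occur in a program. The helper names `echo_once` and `roundtrip` are hypothetical and exist only for this example.

```python
import socket
import threading


def echo_once(listener):
    """Accept one connection and echo back whatever arrives until the peer closes."""
    conn, _addr = listener.accept()  # returns once a connection has been established
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # empty read: the peer has finished sending
                break
            conn.sendall(data)


def roundtrip(message):
    """Send message over a loopback TCP connection and return the echoed bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(target=echo_once, args=(listener,))
        server.start()
        # create_connection() triggers the three-way handshake in the OS.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal that we are done sending
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        server.join()
    return b"".join(chunks)


print(roundtrip(b"hello over TCP"))
```

The program only sees a reliable, ordered byte stream; any lost packets and duplicate copies are handled below the socket interface.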