Chapter 7
Impact of Satellite Networks on Transport Layer Protocols

This chapter discusses the impact of satellite networks on transport layer protocols, including the transmission control protocol (TCP), and their applications. TCP is the reliable transport layer protocol of the Internet protocol stack. It provides end-to-end communications between a client process in one host and a server process in another host across the Internet. TCP has no information about applications, Internet traffic conditions or the underlying transmission technologies (such as LAN, WAN, wireless, mobile and satellite networks). It relies on mechanisms including flow control, error control and congestion control between the client and server hosts to recover from transmission errors, data loss, network congestion and buffer overflows. All these mechanisms affect the performance of TCP over satellite, and hence the Internet applications, directly. This chapter also explains the major enhancements designed to improve TCP performance over satellite towards a 'satellite-friendly TCP', although not all of these enhancements have become IETF standards, since they may cause side-effects on normal TCP operations. This chapter also provides an introduction to real-time transport protocols built on top of the user datagram protocol (UDP), including RTP, RTCP, SAP, SIP and so on, and related applications including voice over IP (VoIP) and multimedia conferencing (MMC). When you have completed this chapter, you should be able to:

7.1 Introduction

TCP is the protocol for end-to-end communications between processes in different hosts across Internet networks. It is implemented within the client and server hosts in order to provide applications with reliable transmission services. It is transparent to the Internet; that is, the Internet treats it only as the payload of IP packets (see Figure 7.1).


Figure 7.1 TCP protocol over satellite Internet

The most challenging task of TCP is to provide reliable and efficient transmission services without knowing anything about applications above it or anything about the Internet below it. TCP carries out proper actions according to application characteristics, client and server parameters and network parameters and conditions (particularly satellite networks).

7.1.1 Application Characteristics

There is a wide range of applications built on TCP, including remote login, file transfer, email and the WWW. The amount of data to be transmitted by TCP can range from a few bytes to kilobytes, megabytes or even gigabytes. The duration of a TCP session can range from a fraction of a second up to many hours. Therefore, the data size of each transaction and the total data size of each TCP session are important factors affecting TCP performance.

7.1.2 Client and Server Host Parameters

The current Internet applications built on TCP are elastic; that is, they can tolerate slow processing and slow transmission of their data. It is this feature, together with TCP, that allows us to build the Internet using different types of computers, from PCs to supercomputers, and enables them to communicate with each other using networks of different types.

The main parameters affecting TCP performance include the processing power (how fast the host can process data within the TCP session), buffer sizes (memory space allocated to the TCP session for data buffering) and speeds of network interface cards (how fast the hosts can send data to networks) in both client and server hosts, and the round-trip time (RTT) between the client and the server.

7.1.3 Satellite Network Configurations

Satellites can play many different roles in the Internet. Figure 7.2 shows a typical example of satellite network configuration, with the satellite network in the centre connecting two terrestrial access networks.


Figure 7.2 Example of satellite network configurations

For ease of discussion, we assume that all constraints are due to the satellite network (long delay, high bit error rate (BER), limited bandwidth, etc.), as these constraints are negligible in terrestrial networks compared to satellite networks. Both access networks and interworking units (routers or switches) are capable of dealing with traffic flows between access networks and the satellite network. The following are some typical satellite network configurations:

  • Asymmetric satellite networks: DVB-S/S2, DVB-RCS/RCS2 and VSAT satellite networks are configured with bandwidth asymmetry, that is, a larger data rate in the forward direction (from the satellite gateway station to user earth stations) than in the return direction (from user earth stations to the satellite gateway station), because of limits on transmission power and antenna size at different satellite earth stations. Receive-only broadcasting satellite systems are unidirectional and can use a non-satellite return path (such as terrestrial networks). The nature of most TCP traffic is asymmetric, with data flowing in one direction and acknowledgements in the opposite direction.
  • Satellite link as last hop: Satellite links that provide service directly to end users, as opposed to satellite links located in the middle of a network, may allow for specialised design of protocols used over the last hop. Some satellite providers use the satellite link as a shared high-speed downlink to users, with a lower-speed, non-shared terrestrial link used as a return link for requests and acknowledgements. In this configuration, the client host has direct access to the satellite network.
  • Hybrid satellite networks: In the more general case, satellite links may be located at any point in the network topology. In this case, the satellite link acts as just another link between two gateways, and a given connection may be sent over terrestrial links (including terrestrial wireless) as well as satellite links. This is a typical transit network configuration.
  • Point-to-point satellite networks: In point-to-point satellite networks, the only hop in the network is over the satellite link. This is a pure satellite network configuration.
  • Multiple satellite hops: In some situations, network traffic may traverse multiple satellite hops between the source and the destination. Such an environment aggravates the satellite characteristics. This is a generic problem, with special cases such as deep-space communications where there are many more constraints due to long delay, high BER and limited bandwidth.
  • Constellation satellite networks with and without inter-satellite links (ISL): In constellation satellite networks without ISL, multiple satellite hops are used for wide coverage. In constellation satellite networks with ISL, wide coverage is achieved by the ISL. The problem is that the network route is highly dynamic, hence the end-to-end delay is variable.

7.1.4 TCP and Satellite Channel Characteristics

The Internet differs from a single network because it consists of different network topologies, bandwidths, delays and packet sizes. TCP is formally defined in RFC 793 and updated in RFC 1122; extensions for high performance in such heterogeneous networks are given in RFC 1323.

TCP provides a byte stream, not a message stream; message boundaries are not preserved end to end. All TCP connections are full-duplex and point-to-point. As such, TCP does not support multicasting or broadcasting.

The sending and receiving TCP entities exchange data in the form of segments. A segment consists of a fixed 20-byte header (plus an optional part) followed by zero or more data bytes. Two limits restrict the TCP segment size:

  • Each segment must fit into the 65 535-byte IP payload (RFC 2675 describes IPv6 jumbograms, which allow TCP and UDP to use datagrams larger than 65 535 bytes).
  • Each network has a maximum transfer unit (MTU), and the segment must fit into the MTU. This is limited by the underlying network technology, such as the Ethernet payload size at the MAC layer.

In practice, the MTU is a few kilobytes (KB) and thus defines the upper bound of the segment size. Satellite channels have several characteristics that differ from most terrestrial channels and may degrade the performance of TCP. These characteristics include:

  • Long round-trip time (RTT): Due to the propagation delay of satellite channels, it may take a long time for a TCP sender to determine whether or not a packet has been successfully received at the final destination. This delay affects interactive applications such as web services, as well as some of the TCP congestion control algorithms.
  • Large delay-bandwidth product: The delay-bandwidth (DB) product defines the amount of data a protocol should have 'in flight' (data that has been transmitted but not yet acknowledged) at any one time to utilise the available channel capacity. The delay is the end-to-end RTT and the bandwidth is the capacity of the bottleneck link in the network path.
  • Transmission errors: Satellite channels exhibit a higher bit-error rate (BER) than typical terrestrial networks. TCP assumes that all packet drops are caused by network congestion and reduces its window size in an attempt to alleviate the congestion. In the absence of knowledge about why a packet was dropped (congestion in the network or corruption due to transmission error), TCP must assume the drop was due to network congestion to avoid congestion collapse. Therefore, packets dropped due to corruption cause TCP to reduce the size of its sliding window, even though these drops do not signal congestion in the network.
  • Asymmetric use: Due to the expense of the equipment used to send data to satellites, asymmetric satellite networks are often constructed. A common situation is that the uplink for sending requests has less available capacity than the downlink used for downloading. This asymmetry may have an impact on TCP performance.
  • Variable round-trip times: In LEO constellations, the propagation delay to and from the satellite varies over time. This may affect retransmission timeout (RTO) granularity.
  • Intermittent connectivity: In non-GSO satellite configurations, TCP connections may be handed over from one satellite to another, or from one ground station to another, from time to time. This may cause packet loss when the connections are interrupted.
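The delay-bandwidth product listed above is easy to quantify. The sketch below uses illustrative link figures (a 10 Mbit/s GEO link with a 500 ms RTT, against a 100 Mbit/s LAN); these values are assumptions for illustration, not measurements:

```python
# Delay-bandwidth (DB) product: the amount of data a sender must keep
# "in flight" to utilise the available channel capacity. The link
# figures below are illustrative assumptions, not measured values.

def db_product_bytes(bandwidth_bps, rtt_s):
    """Return the delay-bandwidth product in bytes."""
    return bandwidth_bps * rtt_s / 8

# A GEO link: ~250 ms one-way propagation, so RTT ~ 500 ms.
geo = db_product_bytes(10e6, 0.5)      # 10 Mbit/s forward link
lan = db_product_bytes(100e6, 0.001)   # 100 Mbit/s LAN, 1 ms RTT

print(f"GEO in-flight data: {geo/1024:.0f} KB")
print(f"LAN in-flight data: {lan/1024:.1f} KB")
```

With a classic 64 KB receiver window, a sender could keep only about a tenth of this GEO pipe full, which is one reason the window-scaling option matters on such paths.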

7.1.5 TCP Flow Control, Congestion Control and Error Recovery

As part of implementing a reliable service, TCP is responsible for flow and congestion control: ensuring that data is transmitted at a rate consistent with the capacities of both the receiver and the intermediate links in the network path.

Since multiple TCP connections may be active on a link, TCP is also responsible for ensuring that the link's capacity is shared responsibly among the connections using it. As a result, most throughput issues are rooted in TCP.

To avoid generating an inappropriate amount of network traffic under the current network conditions, TCP employs four control mechanisms. These algorithms are:

  • slow-start;
  • congestion avoidance;
  • fast retransmit before RTO expires;
  • fast recovery to avoid slow-start.

These algorithms are described in detail in RFC 5681. They are used to adjust the amount of unacknowledged data that can be injected into the network and to retransmit segments dropped by the network.

TCP senders use two state variables to accomplish congestion control. The first variable is the congestion window (cwnd). This is an upper bound on the amount of data the sender can inject into the network before receiving an acknowledgement (ACK). The value of cwnd is limited to the receiver's advertised window. The congestion window is increased or decreased during the transfer based on the inferred amount of congestion present in the network.

The second variable is the slow-start threshold (ssthresh). This variable determines which algorithm is used to increase the value of cwnd. If cwnd is less than ssthresh, the slow-start algorithm is used to increase the value of cwnd. However, if cwnd is greater than or equal to (or, in some TCP implementations, just greater than) ssthresh, the congestion avoidance algorithm is used. The initial value of ssthresh is the receiver's advertised window size. Furthermore, the value of ssthresh is reset when congestion is detected. Figure 7.3 illustrates an example of TCP operations.


Figure 7.3 An example of TCP operations

The above algorithms have a negative impact on the performance of individual TCP connections because they probe the network only slowly for additional capacity, which in turn wastes bandwidth. This is especially true over long-delay satellite channels because of the large amount of time required for the sender to obtain feedback from the receiver. However, the algorithms are necessary to prevent congestive collapse in a shared network. Therefore, the negative impact on a given connection is more than offset by the benefit to the entire network.
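As a rough illustration of the interplay between cwnd and ssthresh described above, the following sketch (in units of whole segments, with illustrative parameter values) reproduces the doubling-then-linear growth pattern; real stacks work in bytes and react to losses and timeouts, which are omitted here:

```python
# Sketch of cwnd growth per RTT under slow-start and congestion
# avoidance (RFC 5681), in units of segments. Simplified: one update
# per RTT, no losses. Parameter values are illustrative assumptions.

def cwnd_growth(initial_cwnd, ssthresh, rwnd, rtts):
    """Return the cwnd value (in segments) at each successive RTT."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd *= 2           # slow-start: exponential growth
        else:
            cwnd += 1           # congestion avoidance: linear growth
        cwnd = min(cwnd, rwnd)  # never exceed the advertised window
    return history

print(cwnd_growth(1, 16, 24, 10))
# [1, 2, 4, 8, 16, 17, 18, 19, 20, 21]
```

Note how the first five RTTs double cwnd until ssthresh (16 segments) is reached, after which growth is one segment per RTT.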

7.2 TCP Performance Analysis

The key parameter considered here is satellite link utilisation, as satellite networks are very expensive and take a long time to build. The performance of TCP over satellite can be measured as utilisation U, the ratio of useful data throughput to link bandwidth. The TCP transmission may complete before the full bandwidth has been reached if the amount of data is small, even though slow-start, congestion avoidance, fast retransmission and fast recovery may all be performed during the session. Figure 7.4 illustrates TCP segment traffic block bursts. This section provides analysis and calculation of bandwidth utilisation of TCP connections over a point-to-point satellite network.


Figure 7.4 TCP segment traffic block bursts

7.2.1 First TCP Segment Transmission

After TCP connection set-up, we can calculate the bandwidth utilisation when TCP has completed the first data segment S:

7.1   U1 = t_S/(t_S + RTT) = (S/B)/(S/B + 2·t_p)

where t_S = S/B is the time to transmit the data segment S, t_p is the propagation delay and B is the bandwidth capacity of the TCP session. It takes a round-trip time (RTT) of 2·t_p to acknowledge a successful transmission. This does not take into account the TCP three-way handshake connection set-up delay and connection close-down delay. The TCP transmission can finish when there are no more data for transmission, that is, when the total data size is less than the maximum segment size (MSS). Therefore, the utilisation is as shown in Equation 7.1. It can be seen that the delay × bandwidth (DB) product is a key parameter affecting TCP performance. For satellite networks, broadband satellite networks in particular, the DB product can be very large. It takes the round-trip time RTT plus the data transmission time t_S to complete the TCP data transmission: T1 = RTT + t_S.
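Equation 7.1 can be evaluated numerically. The figures below (a 1460-byte segment, a 2 Mbit/s link and a 500 ms RTT) are illustrative assumptions for a GEO path:

```python
# Utilisation after a single segment (Equation 7.1):
#   U1 = t_S / (t_S + RTT), with t_S = S / B.
# Link figures are illustrative assumptions for a GEO satellite path.

def first_segment_utilisation(segment_bytes, bandwidth_bps, rtt_s):
    t_s = segment_bytes * 8 / bandwidth_bps   # transmission time t_S
    return t_s / (t_s + rtt_s)

# 1460-byte segment over a 2 Mbit/s GEO link with RTT = 0.5 s:
u = first_segment_utilisation(1460, 2e6, 0.5)
print(f"utilisation = {u:.4%}")
```

The result is a utilisation of roughly 1%, showing how a single small transaction leaves a long-delay link almost idle.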

7.2.2 TCP Transmission in the Slow-start Stage

The utilisation can be improved if the data to be sent are larger than the MSS. The transmission then enters the TCP slow-start stage. After successful transmission of the first TCP segment traffic block S, two more segments (2S) are transmitted; then twice as many again (4S) are transmitted after each previous successful transmission. We can see that the amount of data per round trip increases exponentially, as 2^(i−1)·S, for every round-trip time (RTT) if there is no packet loss. TCP can thus transmit total data D as a sequence of block sizes S, 2S, 4S, …, 2^(n−1)·S. Let

7.2   D = S + 2S + 4S + … + 2^(n−1)·S = (2^n − 1)·S

where n is the total number of RTTs needed to complete the transmission. We can calculate the utilisation of the TCP connection as:

7.3   U = D/(B·T) = (2^n − 1)·S / (n·B·RTT + (2^n − 1)·S)

The time it takes to complete the TCP data transmission is T = n·RTT + D/B, that is, the round-trip times plus the total data transmission time.
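The slow-start model above can be checked numerically. The helper below, under the same simplifying assumptions (one block per RTT, no loss; the link figures are illustrative), computes the number of RTTs n and the resulting utilisation:

```python
import math

# Slow-start transfer model (Equations 7.2 and 7.3): blocks of
# S, 2S, 4S, ... are sent, one block per RTT, so D = (2**n - 1) * S
# and the transfer takes n * RTT + D / B. Figures are assumptions.

def slow_start_utilisation(total_bytes, mss, bandwidth_bps, rtt_s):
    # Smallest n such that (2**n - 1) * MSS >= total data.
    n = math.ceil(math.log2(total_bytes / mss + 1))
    t_data = total_bytes * 8 / bandwidth_bps
    t_total = n * rtt_s + t_data
    return n, t_data / t_total

# 100 full-sized segments over a 2 Mbit/s link, RTT = 0.5 s:
n, u = slow_start_utilisation(100 * 1460, 1460, 2e6, 0.5)
print(f"n = {n} RTTs, utilisation = {u:.2%}")
```

For these assumed figures the transfer needs 7 RTTs and achieves roughly 14% utilisation, far below the link capacity.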

7.2.3 TCP Transmission in the Congestion Avoidance Stage

When the transmitted data block size reaches the slow-start threshold, the slow-start algorithm stops and the congestion avoidance mechanism starts, until the window size is reached. The transmitted data size and link utilisation can then be calculated as follows:

D = (2^m − 1)·S + (W_t + S) + (W_t + 2S) + … + (W_t + k·S)

7.4   U = D/(B·T) = D/((m + k)·B·RTT + D)

where W_t = 2^(m−1)·S is the slow-start threshold, m is the number of slow-start RTTs, k is the number of congestion avoidance RTTs and W = W_t + k·S is the window size. When the transmission reaches the window size, TCP transmits at a constant speed of one window of data per RTT.

In classical TCP, the slow-start threshold and the window size are agreed initially between the client and server. They change according to the network conditions and the rules of TCP. If a packet is lost, TCP goes back to the slow-start algorithm and the threshold is halved. The window size depends on how fast the receiver can empty the receive buffer.

The basic assumption is that packet loss is due to network congestion, and such an assumption is true in normal networks but not always true in satellite networks where transmission errors can also be the major cause of packet loss.

7.3 Slow-start Enhancement for Satellite Networks

There are many TCP enhancements to make TCP friendly to satellite networks. In order to optimise TCP performance, we can adapt some of the parameters and rules of TCP to the satellite networking environment:

  • Increasing the minimum segment size S, although this is limited by the slow-start threshold, congestion window size and receiver buffer size.
  • Improving the slow-start algorithm at the start and when a packet is lost. This may cause problems such as overwhelming a slower receiver or a congested network.
  • Improving acknowledgement schemes. This may need additional buffer space.
  • Early detection of packet loss due to transmission error rather than network congestion. This may not work if acknowledgements are transmitted over different network paths.
  • Improving congestion avoidance mechanisms. This has similar problems to the slow-start algorithm.

One of the major problems is that TCP does not have any knowledge of the total data size or the available bandwidth. If the bandwidth B is shared among many TCP connections, the available bandwidth can also be variable. Another is that TCP does not know how the IP layer actually carries the TCP segments across the Internet, because the IP packets may need to be limited in size or split into smaller packets by the network technologies transporting them. This makes TCP a robust protocol that provides reliable services for different applications over different technologies, but it is often not very efficient, particularly for satellite networks (see Figure 7.5). The RTT is measured by timing the interval M between a packet being sent out and its acknowledgement returning; the smoothed average RTT is calculated with a weight factor α (typically α = 7/8; RTT is initially set to a default value) as:

RTT = α·RTT + (1 − α)·M

The deviation D is calculated with the same weight factor α as:

D = α·D + (1 − α)·|RTT − M|

Then the timeout can be calculated as:

Timeout = RTT + 4·D

Figure 7.5 Traffic and control flows
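The RTT and timeout estimation described above can be sketched as follows; the weight factor α = 7/8 is the classic choice, the sample measurements are illustrative assumptions, and RFC 6298 gives the standardised form of this computation:

```python
# Jacobson/Karels-style RTT estimator and retransmission timeout,
# following the formulas in the text: smoothed RTT and deviation D
# are exponentially weighted with alpha = 7/8, and
# Timeout = RTT + 4 * D. Sample values are illustrative assumptions.

ALPHA = 7 / 8

def update_rto(srtt, dev, measurement):
    """Return (new_srtt, new_dev, timeout) after one RTT sample M."""
    srtt = ALPHA * srtt + (1 - ALPHA) * measurement
    dev = ALPHA * dev + (1 - ALPHA) * abs(srtt - measurement)
    return srtt, dev, srtt + 4 * dev

srtt, dev = 0.5, 0.05         # e.g. a GEO path, RTT around 500 ms
for m in (0.52, 0.55, 0.51):  # three simulated measurements
    srtt, dev, rto = update_rto(srtt, dev, m)
print(f"SRTT = {srtt*1000:.1f} ms, RTO = {rto*1000:.1f} ms")
```

Because the deviation term inflates the timeout, a path with variable RTT (for example, a LEO constellation) automatically gets a more conservative RTO.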

Here, we discuss some TCP enhancement techniques. These are optimised to deal with particular conditions in satellite network configurations, but may have side effects or may not be applicable to general network configurations. It is also a great challenge for these enhancements to interwork with existing TCP implementations.

7.3.1 TCP for Transactions

In a transaction service, particularly for small data sizes and short TCP sessions, the utilisation is significantly affected by the connection set-up and connection close-down time. TCP uses a three-way handshake to set up a connection between two hosts. This connection set-up requires 1 or 1.5 RTTs, depending upon whether the data sender starts the connection actively or passively. This start-up time can be eliminated by using TCP extensions for transactions (T/TCP), defined in RFC 1644. After the first connection between a pair of hosts is established, T/TCP is able to bypass the three-way handshake, allowing the data sender to begin transmitting data in the first segment sent (along with the SYN). This is especially helpful for short request/response traffic, as it saves a potentially long set-up phase when no useful data are being transmitted.

As each transaction has a small data size, the utilisation of satellite bandwidth can be very low. However, many TCP sessions and hosts can potentially share the same bandwidth to improve bandwidth utilisation. T/TCP requires changes to both the sender and the receiver. While T/TCP is safe to implement in shared networks from a congestion control perspective, several security implications of sending data in the first data segment have been identified.

7.3.2 Slow-start and Delayed Acknowledgement (ACK)

As we have discussed, TCP uses the slow-start algorithm to increase the size of TCP's congestion window (cwnd) exponentially. The algorithm is an important safeguard against transmitting an inappropriate amount of data into the network when the connection starts up. However, slow-start can also waste available network capacity due to the large delay-bandwidth product of the network, especially in satellite networks.

In delayed acknowledgement (ACK) schemes, receivers refrain from acknowledging every incoming data segment (refer to RFC 1122). Every second full-sized segment is acknowledged. If a second full-sized segment does not arrive within a given timeout, an ACK must be generated (this timeout cannot exceed 500 ms). Since the sender increases the size of cwnd based on the number of arriving ACKs, reducing the number of ACKs slows the cwnd growth rate. In addition, when TCP starts sending, it sends one segment. When using delayed ACKs a second segment must arrive before an ACK is sent. Therefore, the receiver is always forced to wait for the delayed ACK timer to expire before ACKing the first segment, which also increases the transfer time.

7.3.3 Larger Initial Window

One method that can reduce the amount of time required by slow-start (and therefore the amount of wasted capacity) is to increase the initial value of cwnd. Separately, TCP has been extended to support larger windows (RFC 1323); the window-scaling option can be used in satellite environments, as can the companion algorithms PAWS (protection against wrapped sequence space) and RTTM (round-trip time measurements).

By increasing the initial value of cwnd, more packets are sent during the first RTT of data transmission, which will trigger more ACKs, allowing the congestion window to open more rapidly. In addition, by sending at least two segments initially, the first segment does not need to wait for the delayed ACK timer to expire, as is the case when the initial size of cwnd is one segment. Therefore, a larger initial value of cwnd saves a number of RTTs and a delayed ACK timeout. The standards-track document RFC 5681 allows an initial cwnd of up to four segments, depending on the segment size. It is expected that the use of a larger initial window will be beneficial for satellite networks.

The use of a larger initial cwnd requires changes to the sender's TCP stack, as defined in RFC 5681. Using an initial congestion window of three or four segments is not expected to present any danger of congestion collapse; however, it may degrade performance in some networks if the network or terminal cannot cope with such bursty traffic.
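The initial window bound in RFC 5681 (Section 3.1) is IW = min(4·SMSS, max(2·SMSS, 4380 bytes)), which yields two, three or four segments depending on the sender's maximum segment size. A small sketch:

```python
# Initial window (IW) upper bound from RFC 5681, Section 3.1:
#   IW = min(4 * SMSS, max(2 * SMSS, 4380 bytes))
# so senders may start with 2, 3 or 4 segments depending on SMSS.

def initial_window(smss):
    return min(4 * smss, max(2 * smss, 4380))

for smss in (536, 1460, 2190, 4380):
    iw = initial_window(smss)
    print(f"SMSS = {smss:5d} -> IW = {iw} bytes ({iw // smss} segments)")
```

For the common Ethernet-derived SMSS of 1460 bytes this permits an initial window of three segments; only small segment sizes reach four.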

Using a fixed larger initial congestion window decreases the impact of a long RTT on transfer time (especially for short transfers) at the cost of bursting data into a network with unknown conditions. A mechanism is required to limit the effect of these bursts. Also, using delayed ACKs only after slow-start offers an alternative: the receiver can immediately ACK the first segment of a transfer, which opens the congestion window more rapidly.

7.3.4 Terminating Slow-start

The initial slow-start phase is used by TCP to determine an appropriate congestion window size for the given network conditions. Slow-start is terminated when TCP detects congestion, or when the size of cwnd reaches the size of the receiver's advertised window. Slow-start is also terminated if cwnd grows beyond a certain size. TCP ends slow-start and begins using the congestion avoidance algorithm when it reaches the slow-start threshold (ssthresh). In most implementations, the initial value for ssthresh is the receiver's advertised window. During slow-start, TCP roughly doubles the size of cwnd every RTT and therefore can overwhelm the network with at most twice as many segments as the network can handle. By setting ssthresh to a value less than the receiver's advertised window initially, the sender may avoid overwhelming the network with twice the appropriate number of segments.

It is possible to use the packet-pair algorithm and the measured RTT to determine a more appropriate value for ssthresh. The algorithm observes the spacing between the first few returning ACKs to determine the bandwidth of the bottleneck link. Together with the measured RTT, the delay bandwidth product is determined and ssthresh is set to this value. When the cwnd reaches this reduced ssthresh, slow-start is terminated and transmission continues using congestion avoidance, which is a more conservative algorithm for increasing the size of the congestion window.
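A minimal sketch of the packet-pair idea described above, assuming evenly spaced ACKs for back-to-back segments and ignoring ACK-path jitter (the timings and link figures below are invented for illustration):

```python
# Packet-pair sketch: estimate the bottleneck bandwidth from the
# spacing of the first returning ACKs, then set ssthresh to the
# delay-bandwidth product. Timings are illustrative assumptions.

def estimate_ssthresh(ack_times, segment_bytes, rtt_s):
    """ack_times: arrival times (s) of consecutive ACKs for
    back-to-back, full-sized segments."""
    gaps = [b - a for a, b in zip(ack_times, ack_times[1:])]
    spacing = sum(gaps) / len(gaps)           # mean inter-ACK gap
    bottleneck_bps = segment_bytes * 8 / spacing
    return bottleneck_bps * rtt_s / 8         # DB product in bytes

# ACKs arriving 5.84 ms apart imply 1460-byte segments at ~2 Mbit/s.
ssthresh = estimate_ssthresh([0.0, 0.00584, 0.01168], 1460, 0.5)
print(f"ssthresh estimate: {ssthresh:.0f} bytes")
```

On an asymmetric link the inter-ACK spacing reflects the return path rather than the forward bottleneck, which is exactly the failure mode discussed below.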

Estimating ssthresh can improve performance and decrease packet loss, but obtaining an accurate estimate of the available bandwidth in a dynamic network is very challenging, especially when attempted at the sending side of the TCP connection.

Estimating ssthresh requires changes to the data sender's TCP stack. Bandwidth estimates may be more accurate when taken by the TCP receiver, in which case both sender and receiver changes would be required. This makes TCP more conservative than outlined in RFC 5681.

It is expected that this mechanism will work equally well in all symmetric satellite network configurations. However, asymmetric links pose a special problem, as the rate of the returning ACKs may not reflect the bottleneck bandwidth in the forward direction. This can lead to the sender setting ssthresh too low. Premature termination of slow-start can hurt performance, as congestion avoidance opens cwnd more conservatively. Receiver-based bandwidth estimators do not suffer from this problem, but require changes to the receiver's TCP stack as well.

Terminating slow-start at the right time is useful to avoid overflowing the network, hence avoiding multiple dropped segments. However, using a selective acknowledgement-based loss recovery scheme can drastically improve TCP's ability to quickly recover from multiple lost segments.

7.4 Loss Recovery Enhancement

Satellite paths have higher error rates than terrestrial lines. Higher error rates matter for two reasons. First, they cause errors in data transmissions, which will have to be retransmitted. Second, as noted above, TCP typically interprets loss as a sign of congestion and goes back into slow-start. Clearly, we need either to reduce the error rate to a level acceptable to TCP (i.e. one that allows data transmissions to reach the full window size without suffering any packet loss) or to find a way to let TCP know that the datagram loss is due to transmission errors, not congestion (so that TCP need not reduce its transmission rate).

Loss recovery enhancement aims to prevent TCP from entering slow-start unnecessarily when data segments are lost due to errors rather than network congestion. Several similar sender-side algorithms have been developed and studied that improve TCP's ability to recover from multiple lost segments without relying on the (often long) retransmission timeout. These algorithms, known as NewReno TCP (one of the TCP implementations), do not depend on the availability of selective acknowledgements (SACK).

7.4.1 Fast Retransmission and Fast Recovery

It is possible during transmission that one or more TCP segments may not reach the other end of the connection, and TCP uses timeout mechanisms to detect those missing segments. In normal situations, TCP assumes that segments are dropped due to network congestion. This usually results in ssthresh being set to half the current value of the congestion window (cwnd) and the cwnd size being reduced to the size of one TCP segment, which severely affects TCP throughput. The situation is worse when the loss of TCP segments is not due to network congestion. To avoid unnecessarily going back to slow-start each time a segment fails to reach the intended destination, fast retransmission was introduced.

The fast retransmission algorithm uses duplicate ACKs to detect the loss of segments. If three duplicate ACKs are received within the timeout period, TCP immediately retransmits the missing segment without waiting for the timeout to occur. Once fast retransmission has been used to retransmit the missing data segment, TCP can use its fast recovery algorithm, which resumes normal transmission via the congestion avoidance phase instead of slow-start. In this case, ssthresh is reduced to half the value of cwnd, and cwnd itself is set to ssthresh plus 3 MSS. This allows faster data transmission than waiting for TCP's normal timeout. When the ACK for new data arrives, TCP sets cwnd back to ssthresh. For details, refer to RFC 5681.
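The entry step of fast retransmit and fast recovery described above can be sketched as follows; the state handling is simplified (the window deflation when the ACK for new data arrives is omitted) and all values are illustrative:

```python
# Sketch of the fast retransmit / fast recovery entry step (RFC 5681):
# after three duplicate ACKs, retransmit, halve ssthresh and inflate
# cwnd by 3 segments. Variables are in bytes; values are assumptions.

MSS = 1460
DUP_ACK_THRESHOLD = 3

def on_duplicate_ack(dup_acks, cwnd, ssthresh):
    """Return (dup_acks, cwnd, ssthresh, retransmit_now)."""
    dup_acks += 1
    if dup_acks == DUP_ACK_THRESHOLD:
        ssthresh = max(cwnd // 2, 2 * MSS)   # halve, floor of 2 MSS
        cwnd = ssthresh + 3 * MSS            # inflate for the 3 dup ACKs
        return dup_acks, cwnd, ssthresh, True
    return dup_acks, cwnd, ssthresh, False

dup, cwnd, ssth = 0, 16 * MSS, 64 * MSS
for _ in range(3):
    dup, cwnd, ssth, rtx = on_duplicate_ack(dup, cwnd, ssth)
print(cwnd // MSS, ssth // MSS, rtx)   # 11 8 True
```

Starting from a 16-segment cwnd, the third duplicate ACK triggers the retransmission with ssthresh halved to 8 segments and cwnd inflated to 11.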

7.4.2 Selective Acknowledgement (SACK)

TCP, even with fast retransmission and fast recovery, still performs poorly when multiple segments are lost within a single transmission window. This is because, with cumulative acknowledgements, TCP can only learn of one missing segment per RTT. This limitation reduces TCP throughput.

To improve TCP performance in this situation, selective acknowledgement (SACK) was proposed (RFC 2018). The SACK option allows any missing segments to be identified and typically retransmitted within a single RTT. By adding extra information about the sequence numbers of all received segments, the sender is notified about which segments have not been received and therefore need to be retransmitted. This feature is very important in satellite network environments due to the occasionally high bit-error rates (BER) of the channel; the use of larger transmission windows also increases the possibility of multiple segment losses in a single round trip.

7.4.3 SACK Based Enhancement Mechanisms

It is possible to use a conservative extension to the fast recovery algorithm that takes into account information provided by SACKs. The algorithm starts after fast retransmit triggers the resending of a segment. As with fast retransmit, the algorithm halves cwnd when a loss is detected. The algorithm keeps a variable called 'pipe', which is an estimate of the number of outstanding segments in the network. The pipe variable is decremented by one segment for each duplicate ACK that arrives with new SACK information. The pipe variable is incremented by one for each new or retransmitted segment sent. A segment may be sent when the value of pipe is less than cwnd (this segment is either a retransmission per the SACK information or a new segment if the SACK information indicates that no more retransmits are needed).

This algorithm generally allows TCP to recover from multiple segment losses in a window of data within one RTT of loss detection. The SACK information allows the pipe algorithm to decouple the choice of when to send a segment from the choice of what segment to send. It is also consistent with the spirit of the fast recovery algorithm.
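A minimal sketch of the 'pipe' bookkeeping described above, counting in whole segments and ignoring the retransmission scoreboard that a real SACK implementation would also keep:

```python
# Sketch of the SACK-based 'pipe' estimate: pipe counts segments in
# flight, and a segment may be sent whenever pipe < cwnd (both in
# segments here). A simplified illustration, not a full scoreboard.

class PipeTracker:
    def __init__(self, cwnd_segments):
        self.cwnd = cwnd_segments
        self.pipe = 0

    def on_send(self):                 # new or retransmitted segment
        self.pipe += 1

    def on_dup_ack_with_sack(self):    # duplicate ACK with new SACK info
        self.pipe = max(self.pipe - 1, 0)

    def can_send(self):
        return self.pipe < self.cwnd

p = PipeTracker(cwnd_segments=4)
while p.can_send():
    p.on_send()                        # fill the pipe: 4 segments
p.on_dup_ack_with_sack()               # one dup ACK drains one segment
print(p.pipe, p.can_send())            # 3 True
```

Decoupling "when to send" (pipe < cwnd) from "what to send" (the SACK scoreboard) is the key property noted in the text.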

Some research has shown that the SACK based algorithm performs better than several non-SACK based recovery algorithms, and that the algorithm improves performance over satellite links. Other research shows that in certain circumstances, the SACK algorithm can hurt performance by generating a large line-rate burst of data at the end of loss recovery, which causes further loss.

This algorithm is implemented in the sender's TCP stack. However, it relies on SACK information generated by the receiver (RFC 5681).

7.4.4 ACK Congestion Control

Acknowledgement enhancement is concerned with the flow of acknowledgement packets. In a symmetric network this is not an issue, as the ACK traffic is much lighter than the data traffic itself. In asymmetric networks, however, the return link has a much lower speed than the forward link, and the ACK traffic can overload the return link, thereby restricting the performance of TCP transmissions.

In highly asymmetric networks, such as VSAT satellite networks, a low-speed return link can restrict the performance of the data flow on a high-speed forward link by limiting the flow of acknowledgements returned to the data sender. If a terrestrial modem link is used as a reverse link, ACK congestion is also likely, especially as the speed of the forward link is increased. Current congestion control mechanisms are aimed at controlling the flow of data segments, but do not affect the flow of ACKs.

The flow of acknowledgements can be restricted on the low-speed link not only by the bandwidth of the link, but also by the queue length of the router. The router may limit its queue length by counting packets, not bytes, and therefore begin discarding ACKs even if there is enough bandwidth to forward them.

7.4.5 ACK Filtering

ACK filtering (AF) is designed to address the same ACK congestion effects. Contrary to ACK congestion control (ACC), however, AF is designed to operate without host modifications.

AF takes advantage of the cumulative acknowledgement structure of TCP. The bottleneck router in the reverse direction (on the low-speed link) must be modified to implement AF. Upon receipt of a segment carrying a TCP acknowledgement, the router scans its queue for redundant ACKs of the same connection, that is, ACKs which acknowledge portions of the window already covered by the most recent ACK. All of these ‘earlier’ ACKs are removed from the queue and discarded.

The router does not store state information, but does need to implement the additional processing required to find and remove segments from the queue upon receipt of an ACK.
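
The queue-scanning step can be sketched as below, assuming the router can read each queued packet's connection identifier and cumulative ACK number (both names are illustrative):

```python
def ack_filter(queue, new_ack):
    """Sketch of ACK filtering at the reverse-link router: on arrival of a
    cumulative ACK, purge queued ACKs of the same connection that
    acknowledge less data, then enqueue the new one. Packets are modelled
    as (conn_id, ack_no) tuples; data segments would pass through untouched."""
    conn, ack_no = new_ack
    filtered = [p for p in queue if not (p[0] == conn and p[1] <= ack_no)]
    filtered.append(new_ack)
    return filtered

q = [("c1", 1000), ("c2", 500), ("c1", 2000)]
print(ack_filter(q, ("c1", 3000)))
# -> [('c2', 500), ('c1', 3000)]
```

Note that because TCP ACKs are cumulative, discarding the earlier ACKs of connection `c1` loses no information: the surviving ACK covers everything they acknowledged.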

As is the case in ACC, the use of ACK filtering alone would produce significant sender bursts, since the ACKs will be acknowledging more previously unacknowledged data. The sender adaptation (SA) modifications could be used to prevent those bursts, at the cost of requiring host modifications. To prevent the need for modifications in the TCP stack, AF is more likely to be paired with the ACK reconstruction (AR) technique, which can be implemented at the router where segments exit the slow reverse link.

AR inspects ACKs exiting the link, and if it detects large ‘gaps’ in the ACK sequence, it generates additional ACKs to reconstruct an acknowledgement flow which more closely resembles what the data sender would have seen had ACK filtering not been introduced. AR requires two parameters: one parameter is the desired ACK frequency; while the second controls the spacing, in time, between the releases of consecutive reconstructed ACKs.
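
Ignoring the second (time-spacing) parameter for brevity, the ACK-regeneration step might look like this sketch, where `ack_freq` is the desired number of segments covered per reconstructed ACK:

```python
def reconstruct_acks(prev_ack, new_ack, mss, ack_freq):
    """Sketch of ACK reconstruction (AR) at the exit of the slow link:
    if the gap between consecutive cumulative ACKs spans more than
    ack_freq segments, synthesise intermediate ACKs so the data sender
    sees a smoother acknowledgement flow."""
    acks = []
    step = ack_freq * mss
    a = prev_ack + step
    while a < new_ack:
        acks.append(a)   # synthesised intermediate cumulative ACK
        a += step
    acks.append(new_ack)  # the genuine ACK is always released last
    return acks

# A gap of 8 segments, with a desired frequency of one ACK per 2 segments:
print(reconstruct_acks(0, 8 * 1460, mss=1460, ack_freq=2))
# -> [2920, 5840, 8760, 11680]
```

The reconstructed flow lets the sender's window grow and clock out data gradually, avoiding the line-rate burst that a single large-jump ACK would trigger.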

7.4.6 Explicit Congestion Notification

Explicit congestion notification (ECN) allows routers to inform TCP senders about imminent congestion without dropping segments [RFC 3168]. There are two major forms of ECN:

  • The first major form of congestion notification is backward ECN (BECN). A router employing BECN transmits messages directly to the data originator informing it of congestion. IP routers can accomplish this with an ICMP source quench message. The arrival of a BECN signal may or may not mean that a TCP data segment has been dropped, but it is a clear indication that the TCP sender should reduce its sending rate (i.e., the value of cwnd).
  • The second major form of congestion notification is forward ECN (FECN). FECN routers mark data segments with a special tag when congestion is imminent, but forward the data segment. The data receiver then echoes the congestion information back to the sender in the ACK packet.

Senders transmit segments with an ‘ECN-capable transport’ bit set in the IP header of each packet. If a router employing an active queuing strategy, such as random early detection (RED), would otherwise drop this segment, a ‘congestion experienced’ bit in the IP header is set instead. Upon reception, the information is echoed back to TCP senders using a bit in the TCP header. The TCP sender adjusts the congestion window just as it would if a segment was dropped.
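
The marking decision can be sketched as follows; the thresholds are illustrative, and a real RED queue marks or drops probabilistically between the thresholds rather than always, as simplified here:

```python
def red_action(avg_queue, min_th, max_th, ect):
    """Sketch of RED combined with ECN: where RED would otherwise drop,
    an ECN-capable (ECT) packet is marked 'congestion experienced' (CE)
    and forwarded instead."""
    if avg_queue < min_th:
        return "forward"
    if avg_queue >= max_th:
        return "drop"                      # hard limit: drop even ECT packets
    # in the RED region: mark if ECN-capable, otherwise drop
    return "mark_CE" if ect else "drop"

print(red_action(avg_queue=20, min_th=10, max_th=30, ect=True))   # mark_CE
print(red_action(avg_queue=20, min_th=10, max_th=30, ect=False))  # drop
```

The receiver echoes any CE mark back in its ACKs, and the sender then reduces cwnd exactly as it would for a segment drop.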

The implementation of ECN requires the deployment of active queue management mechanisms in the affected routers. This allows the routers to signal congestion by sending TCP a small number of ‘congestion signals’ (segment drops or ECN messages), rather than discarding a large number of segments, as can happen when TCP overwhelms a drop-tail router queue.

Since satellite networks generally have higher bit-error rates than terrestrial networks, the ability to determine whether a segment was lost due to congestion or corruption could allow TCP to achieve better performance in high-BER environments than is currently possible (because TCP assumes that all loss is due to congestion). While ECN does not by itself solve this problem, adding an ECN mechanism to TCP may form part of a mechanism that helps achieve this goal.

Research shows that ECN is effective in reducing the segment loss rate, which yields better performance especially for short and interactive TCP connections, and that ECN avoids some unnecessary and costly TCP retransmission timeouts.

Deployment of ECN requires changes to the TCP implementation on both sender and receiver. Additionally, deployment of ECN requires some active queue management infrastructure in routers. RED is assumed in most ECN discussions, because RED is already identifying segments to drop, even before its buffer space is exhausted. ECN simply allows the delivery of ‘marked’ segments while still notifying the end nodes that congestion is occurring along the path. ECN maintains the same TCP congestion control principles as are used when congestion is detected via segment drops. Due to long propagation delay, the ECN signalling may not reflect the current status of networks accurately.

7.4.7 Detecting Corruption Loss

Differentiating between congestion (loss of segments due to router buffer overflow or imminent buffer overflow) and corruption (loss of segments due to damaged bits of data) is a difficult problem for TCP. This differentiation is particularly important because the action that TCP should take in the two cases is entirely different. In the case of corruption, TCP should merely retransmit the damaged segment as soon as its loss is detected; there is no need for TCP to adjust its congestion window. On the other hand, when the TCP sender detects congestion, it should immediately reduce its congestion window to avoid making the congestion worse.

TCP's defined behaviour in terrestrial wired networks is to assume that all loss is due to congestion and to trigger the congestion control algorithms. The loss may be detected using the fast retransmit algorithm, or in the worst case is detected by the expiration of TCP's retransmission timer. TCP's assumption that loss is due to congestion rather than corruption is a conservative mechanism that prevents congestion collapse.

Over satellite networks, however, as in many wireless environments, loss due to corruption is more common than on terrestrial networks. One common partial solution to this problem is to add forward error correction (FEC) to the data that are sent over the satellite or wireless links. However, given that FEC does not always work or cannot be universally applied, it is important to make TCP able to differentiate between congestion-based and corruption-based loss.

Corrupted TCP segments are most often dropped by intervening routers when link-level checksum mechanisms detect that an incoming frame has any error. Occasionally, a TCP segment containing an error may survive without detection until it arrives at the TCP receiving host, at which point it will almost always either fail the IP header checksum or the TCP checksum and be discarded as in the link-level error case. Unfortunately, in either of these cases, it is not generally safe for the node detecting the corruption to return information about the corrupted packet to the TCP sender because the sending address itself might have also been corrupted.
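
For reference, the 16-bit one's-complement Internet checksum (RFC 1071) used by both the IP header and TCP can be sketched as:

```python
def internet_checksum(data):
    """One's-complement sum of 16-bit words (RFC 1071): corruption that
    survives the link layer is usually caught by this check and the
    segment silently discarded."""
    if len(data) % 2:
        data += b"\x00"                     # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold carry back in
    return ~total & 0xFFFF

pkt = b"\x45\x00\x00\x1c"                   # sample header fragment
c = internet_checksum(pkt)
# Verification: data with its correct checksum appended sums to zero
print(internet_checksum(pkt + c.to_bytes(2, "big")))  # -> 0
```

Crucially, a failed checksum tells the receiving host only that *something* in the segment is damaged; since the source address itself may be corrupted, no corruption report can safely be sent back, which is why the loss looks identical to a congestion drop from the sender's point of view.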

Because the probability of link errors on a satellite link is relatively greater than on a hardwired link, it is particularly important that the TCP sender retransmits these lost segments without reducing its congestion window. Because corrupted segments do not indicate congestion, there is no need for the TCP sender to enter a congestion avoidance phase, which may waste available bandwidth. Therefore, it can improve TCP performance if TCP can properly differentiate between corruption caused by error and congestion caused by network overload.

7.4.8 Congestion Avoidance Enhancement Policy

During congestion avoidance, in the absence of loss, the TCP sender adds approximately one segment to its congestion window during each RTT. This policy leads to unfair sharing of bandwidth when multiple connections with different RTTs traverse the same bottleneck link, with the long RTT connections obtaining only a small fraction of their fair share of the bandwidth.

One effective solution to this problem is to deploy fair queuing and TCP-friendly buffer management in network routers. However, in the absence of help from the network, there are two possible changes available to the congestion avoidance policy at the TCP sender:

  • The ‘constant-rate’ increase policy attempts to equalise the rate at which TCP senders increase their sending rate during congestion avoidance. It could correct the bias against long RTT connections, but may be difficult to incrementally deploy in an operational network. Further studies are required on the proper selection of a constant (for a constant rate of increase).
  • The ‘increase-by-K’ policy can be selectively used by long RTT connections in a heterogeneous environment. This policy simply changes the slope of the linear increase, with connections over a given RTT threshold adding K segments to the congestion window every RTT, instead of one. This policy, when used with small values of K, may be successful in reducing the unfairness while keeping the link utilisation high, when a small number of connections share a bottleneck link. Further studies are required on the selection of the constant K, the RTT threshold at which to invoke this policy, and performance under a large number of flows.
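
The per-RTT increment under the second policy can be contrasted with the standard one in a small sketch (the constant-rate policy is omitted, since its constant is still an open research question; the values of K and the RTT threshold here are purely illustrative):

```python
def next_cwnd(cwnd_segments, rtt, policy="standard", k=2, rtt_threshold=0.5):
    """Per-RTT congestion-avoidance increment. The standard policy adds
    one segment per RTT; the 'increase-by-K' policy lets long-RTT
    connections (rtt above rtt_threshold, in seconds) add K segments per
    RTT instead, reducing the bias against them."""
    if policy == "increase-by-K" and rtt > rtt_threshold:
        return cwnd_segments + k
    return cwnd_segments + 1

print(next_cwnd(10, rtt=0.56, policy="increase-by-K"))  # GEO path: 12
print(next_cwnd(10, rtt=0.05, policy="increase-by-K"))  # terrestrial path: 11
```

A GEO connection thus grows its window twice as fast per RTT, partially compensating for having far fewer RTTs per second than its short-delay competitors.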

Implementation of either the ‘constant-rate’ or ‘increase-by-K’ policy requires a change to the congestion avoidance mechanism at the TCP sender. In the case of ‘constant-rate’, such a change must be implemented globally. Additionally, the TCP sender must have a reasonably accurate estimate of the RTT of the connection. The algorithms outlined above violate the congestion avoidance algorithm defined in RFC 5681 and should therefore be treated with caution if deployed in shared networks at this time.

These solutions are applicable to all satellite networks that are integrated with a terrestrial network, in which satellite connections may be competing with terrestrial connections for the same bottleneck link. But increasing the congestion window by multiple segments per RTT can cause TCP to drop multiple segments and force a retransmission timeout in some versions of TCP. Therefore, the above changes to the congestion avoidance algorithm may need to be accompanied by a SACK-based loss recovery algorithm that can quickly repair multiple dropped segments.

7.5 Enhancements for Satellite Networks Using Interruptive Mechanisms

According to the layering principle of protocols, each layer should only make use of the services provided by the layer below it in order to provide services to the layer above it. TCP is a transport layer protocol providing end-to-end connection-oriented services. No function placed between the TCP end points, or in the Internet protocol below them, should disturb or interrupt the TCP data and acknowledgement flows.

As the characteristics of satellite networks are known to network designers, there is potential to improve performance by making use of such knowledge, but in an interruptive manner. Two methods have been widely used: TCP spoofing and TCP cascading (also known as split TCP). Both violate the protocol layering principle for the benefit of network performance. Figure 7.6 illustrates the concept of interruptive mechanisms for satellite-friendly TCP (TCP-sat).

c07f006

Figure 7.6 The concept of satellite-friendly TCP (TCP-sat)

7.5.1 TCP Spoofing

TCP spoofing is a technique for getting around slow-start, used particularly on GEO satellite links. The idea calls for a router near the satellite link to send back acknowledgements of TCP data, giving the sender the illusion of a short-delay path. The router then suppresses acknowledgements returning from the receiver and takes responsibility for retransmitting any segments lost downstream of the router. TCP spoofing is implemented in the router, so the sender and receiver know nothing about it. Although TCP spoofing helps to improve TCP performance over satellite, there are a number of problems with the scheme.

First, the router must do a considerable amount of work after it sends an acknowledgement. It must buffer the data segment because the original sender is now free to discard its copy (the segment has been acknowledged) and so if the segment gets lost between the router and the receiver, the router has to take full responsibility for retransmitting it. One side effect of this behaviour is that if a queue builds up, it is likely to be a queue of TCP segments that the router is holding for possible retransmission. Unlike an IP datagram, this data cannot be deleted until the router gets the relevant acknowledgements from the receiver.

Second, spoofing requires symmetric paths: the data and acknowledgements must flow along the same path through the router. However, in much of the Internet, asymmetric paths are quite common.

Third, spoofing is vulnerable to unexpected failures. If a path changes or the router crashes, data may be lost. Data may even be lost after the sender has finished sending and, based on the router's acknowledgements, reported data successfully transferred.

Fourth, it does not work if the data in the IP datagram are encrypted because the router will be unable to read the TCP header.

7.5.2 Cascading TCP or Split TCP

Cascading TCP, also known as split TCP, is an idea where a TCP connection is divided into multiple TCP connections, with a special TCP connection running over the satellite link. The thought behind this idea is that the TCP running over the satellite link can be modified, with knowledge of the satellite's properties, to run faster.

Because each TCP connection is terminated, cascading TCP is not vulnerable to asymmetric paths. And in cases where applications actively participate in TCP connection management (such as web caching) it works well. But otherwise cascading TCP has the same problems as TCP spoofing.

7.5.3 Other Considerations for Satellite Networking

A perfect solution should be able to meet the requirements of user applications, take into account the characteristics of data traffic and make full use of network resources (processing power, memory and bandwidth). Current solutions based on the enhancement of existing TCP mechanisms have reached their limits as neither knowledge about applications nor knowledge about networks and hosts (client and server computers) are taken into account.

In future networks, with application traffic characteristics and QoS requirements together with knowledge of network resources, it should be possible to achieve a perfect solution for TCP within the integrated network architecture. This will need new techniques to achieve multi-layer and cross-layer optimisation of the protocol architecture. It will potentially bring more benefit to satellite networks, where efficient utilisation of the expensive bandwidth resources is the main objective. Given the recent development of virtualisation, cloud computing and software defined networking (SDN), performance improvement can also be achieved at higher layers above TCP to provide better QoS and user quality of experience (QoE).

7.6 Impacts on Applications

TCP supports a wide range of applications. Different applications have different characteristics; hence they are affected by TCP in different ways. This also tells us that it is impossible to have one perfect solution for all the different applications without knowing the characteristics of these applications. Here we give examples of how different applications may be affected by TCP in satellite networks.

7.6.1 Bulk Data Transfer

The file transfer protocol (FTP) can be found on all TCP/IP installed systems and provides an example for the most commonly executed bulk transfer protocol. FTP allows the user to log onto a remote machine and either download files from or upload files to the machine.

At bandwidths of 64 kbit/s and 9.6 kbit/s, throughput was proportional to the bandwidth available and delay had little effect on the performance. This was due to the 24-kbyte window size, which was large enough to prevent any window exhaustion. At a bandwidth of 1 Mbit/s, however, window exhaustion occurred and the delay had a detrimental effect on the throughput of the system. Link utilisation dropped from 98% at 64 kbit/s to only 30% at 1 Mbit/s. The throughput, however, was still higher in the 1 Mbit/s case (due to the reduced serialisation delay of the data). All transfers were conducted with a 1 Mbyte file, which was large enough to negate the effect of the slow-start algorithm. Other bulk transfer protocols, e.g. the simple mail transfer protocol (SMTP) and remote copy (RCP), recorded similar performance using typical application file sizes.
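
These figures are consistent with the simple window-exhaustion bound: throughput cannot exceed the smaller of the link rate and window/RTT. A quick check, assuming a GEO round-trip time of roughly 0.56 s:

```python
def tcp_throughput_limit(window_bytes, rtt_s, link_bps):
    """Maximum TCP throughput is bounded by both the link rate and the
    window/RTT limit (window exhaustion)."""
    window_bps = window_bytes * 8 / rtt_s
    return min(link_bps, window_bps)

# 24-kbyte window over an assumed 0.56 s GEO path:
print(tcp_throughput_limit(24_000, 0.56, 64_000))     # link-limited: 64 kbit/s
print(tcp_throughput_limit(24_000, 0.56, 1_000_000))  # window-limited: ~343 kbit/s
```

At 1 Mbit/s the window limit of about 343 kbit/s corresponds to roughly 34% link utilisation, close to the 30% reported above; at 64 kbit/s the link itself is the bottleneck, hence the near-full utilisation.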

At 64 kbit/s link capacity the return link could be reduced to 4.8 kbit/s with no effect on the throughput of the system. This was because the limited bandwidth of the outbound connection, not the return link, was the point of congestion. At a 2.4 kbit/s return link bandwidth, transfers showed a 25% decrease in throughput, resulting from ACK congestion on the return link.

At a 1 Mbit/s outbound link speed, the performance of FTP was affected more by the TCP window size (24 kbytes) than by any variation in the bandwidth of the return link. It was not affected until the return link dropped to 9.6 kbit/s and started to show congestion. A 15% drop in performance was recorded for the return of 9.6 kbit/s. Delay again had a significant effect on the performance at 1 Mbit/s due to window exhaustion.

The high ratio of outbound to inbound traffic experienced in the FTP session means that it is well suited to links with limited return bandwidth. For a 64 kbit/s outbound link, FTP will perform well with return links down to 4.8 kbit/s.

7.6.2 Interactive Applications

WWW browsers use the HTTP protocol to view graphical pages downloaded from remote machines. The performance of the HTTP protocol is largely dependent on the structure of the HTML files being downloaded.

At bandwidths of 1 Mbit/s and 64 kbit/s the throughput was largely governed by the delay, due to the majority of the sessions being spent in the open/close and slow-start stages of transfer, which are affected by the RTT of the Internet. At 9.6 kbit/s this effect was overshadowed by the serialisation delay caused by the limited bandwidth on the outbound link. With bandwidths of 1 Mbit/s and 64 kbit/s the performance was less effective. At 9.6 kbit/s the users tended to get frustrated when downloading large files and would abandon the session.

At 1 Mbit/s and 64 kbit/s, the speed of the return link had a far greater effect than any variation in delay. This was due to congestion in the return link, arising because of the low server/client traffic ratio. The lower ratio was a result of the increased number of TCP connections required to download each object. At 9.6 kbit/s the return link was close to the congestion, but still offered throughputs comparable to that at 64 kbit/s. At 4.8 kbit/s the return link became congested and the outbound throughput showed a 50% drop off. A further 50% reduction in the outbound throughput occurred when the return link dropped to 2.4 kbit/s.

For both the 1 Mbit/s and 64 kbit/s inbound links, performance remained acceptable with the return link speed reduced down to 19.2 kbit/s. Below this rate, users started to become frustrated by the time taken to request a WWW page. A return bandwidth of at least 19.2 kbit/s is therefore recommended for WWW applications.

7.6.4 Distributed Caching for Internet Services and Applications

In the early Internet client/server model, user requests are served by a single machine. Very often and especially when this server exists in a rather distant location, the user experiences reduced throughput and network performance. This low throughput is due to bottlenecks that can be either the server itself or one or more congested Internet routing hops. Furthermore, that server represents a single point of failure – if it is down, access to the information is lost.

To preserve the usability of information distributed in the Internet, such as in data centres, cloud computing and peer-to-peer networks, the following issues need to be addressed at the server level:

  • Document retrieval latency times must be decreased by putting a copy near the users when they need it.
  • Document availability must be increased, perhaps by distributing documents among several servers or even data centres.
  • The amount of data transferred must be reduced – certainly an important issue for anyone paying for network usage.
  • Network access must be redistributed to avoid peak hours.
  • General user-perceived performance, in terms of user quality of experience (QoE), must be improved.

Of course these goals must be implemented to retain transparency for the user as well as backward compatibility with existing standards. A popular and widely accepted approach to address at least some of these problems is the use of caching proxies.

A user may experience high latency when accessing a server that is attached to a network with limited bandwidth. Caching is a standard solution for this type of problem, and it was applied to the Internet (mainly to WWW) early for this reason. Caching has been a well-known solution to increase computer performance since the 1960s. The technique is now applied in nearly every computer's architecture. Caching relies on the principle of locality of reference which assumes that the most recently accessed data have the highest probability of being accessed again in the near future. The idea of Internet caching relies on the same principle.
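
The locality principle is exactly what a least-recently-used (LRU) replacement policy exploits; a minimal sketch with hypothetical URLs:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache illustrating locality of reference: recently
    accessed objects stay cached; the least recently used is evicted."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None                   # MISS: fetch from origin server
        self.store.move_to_end(key)       # HIT: mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("/index.html", "<html>...")
cache.put("/logo.png", "binary")
cache.get("/index.html")                  # touch: now most recently used
cache.put("/news.html", "<html>...")      # evicts /logo.png
print(cache.get("/logo.png"))             # -> None
```

A web proxy cache applies the same idea at the level of documents rather than memory blocks: popular pages stay near the users, unpopular ones age out.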

ICP (Internet cache protocol) is a well-organised, university-based effort that deals with these issues, as specified in RFC 2186 (its application to hierarchical caching is described in RFC 2187). ICP is currently implemented in the public-domain Squid proxy server and is used for communication among Squid caches. ICP is primarily used within a cache hierarchy to locate specific objects in sibling caches. If a Squid cache does not have a requested document, it sends an ICP query to its siblings, and the siblings respond with ICP replies indicating a ‘HIT’ or a ‘MISS’. The cache then uses the replies to choose from which cache to resolve its own MISS. ICP also supports multiplexed transmission of multiple object streams over a single TCP connection. ICP is currently implemented on top of UDP, and ICP via multicast is also supported.
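
The query/reply logic can be sketched as follows (cache names and contents are hypothetical, and real ICP exchanges UDP messages rather than inspecting sets directly):

```python
def resolve(url, local_cache, siblings):
    """Sketch of ICP-style sibling resolution: on a local MISS, query the
    sibling caches; fetch from the first sibling reporting a HIT,
    otherwise go to the origin server."""
    if url in local_cache:
        return "local HIT"
    hits = [name for name, cache in siblings.items() if url in cache]
    return f"fetch from sibling {hits[0]}" if hits else "fetch from origin"

siblings = {"cacheA": {"/a.html"}, "cacheB": {"/b.html"}}
print(resolve("/b.html", set(), siblings))  # -> fetch from sibling cacheB
print(resolve("/c.html", set(), siblings))  # -> fetch from origin
```

Fetching from a nearby sibling avoids both the origin server and any congested long-haul (e.g. satellite or trans-Atlantic) hops in between.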

Another way of reducing the overall bandwidth and the latency, thus increasing the user-perceived throughput, is by using replication. This solution can also provide a more fault-tolerant and evenly balanced system. Replication offers promise towards solving some of the deficiencies of the proxy caching method.

An example of replication was the information from NASA's mission to Mars. In that case the information about the mission was replicated at several sites in the United States, Europe, Japan and Australia in order to satisfy the millions of user requests.

7.6.5 Web Caching in Satellite Networks

The concept of web caching is quite popular since many Internet service providers (ISPs) already use central servers to hold popular web pages, thus avoiding the increased traffic and delays created when thousands of subscribers request and download the same page across the network. Caches can be quite efficient but they have several weak points as they are limited by the number of people that are using each cache.

A solution can be provided by using a satellite system to distribute caches among ISPs. This concept can boost Internet performance, since many ISPs already have high-speed broadband networks for web traffic. The broadcast satellite can avoid much of the terrestrial backhaul, providing efficient content delivery to many sites for caching or storage.

Such a satellite system can be useful, and significantly exploited, in circumstances where bandwidth is expensive and traffic jams and delays are significant, that is, trans-Atlantic access. For example, a large amount of web content resides in the United States, and European ISPs face a heavy bandwidth crunch to move data their way. A complete satellite system where caching can be introduced at most of its points (i.e., ISP, Internet, LAN, etc.) is illustrated in Figure 7.7.

c07f007

Figure 7.7 Satellite configuration with caches at IWU

7.7 Real-time Transport Protocol (RTP)

TCP was primarily specified for the transmission of raw data between computer systems. For a long time TCP was adequate for the transmission of still pictures and other raw-data-based documents. However, modern applications, mainly those based on real-time voice and video, present new requirements and seek to exploit the benefits of UDP. Real-time protocols have been developed on top of UDP to meet the real-time requirements of these new applications. Products are available that support streaming audio, streaming video and audio-video conferencing.

7.7.1 Basics of RTP

The real-time transport protocol (RTP) provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP does not address resource reservation and does not guarantee QoS for real-time services.

The data transport is augmented by a real-time control protocol (RTCP), which allows monitoring of the data delivery in a manner scalable to large multicast networks, and provides minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers.

Applications typically run RTP on top of UDP to make use of its multiplexing and checksum services. Figure 7.8 illustrates that the RTP is encapsulated into a UDP datagram, which is transported by an IP packet, as explained in RFC 3550.

c07f008

Figure 7.8 RTP packet encapsulations

Both RTP and RTCP protocols contribute parts of the transport protocol functionality. There are two closely linked parts:

  • The real-time transport protocol (RTP), to carry data that has real-time properties.
  • The RTP control protocol (RTCP), to monitor the quality of service and to convey information about the participants in an ongoing session.

A defining property of real-time applications is the ability of one party to signal one or more other parties and initiate a call. The session initiation protocol (SIP) is a client-server protocol that enables peer users to establish a virtual connection (association) between them, which then refers to an RTP (real-time transport protocol; RFC 3550) session carrying a single media type.

Note that RTP itself does not provide any mechanism to ensure timely delivery or provide other QoS guarantees, but relies on lower layer services to do so. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable and delivers packets in sequence.

There are four network components:

  • End system: an application that generates the content to be sent in RTP packets and/or consumes the content of received RTP packets.
  • Mixer: an intermediate system that receives RTP packets from one or more sources, possibly changes the data format, combines the packets in some manner and then forwards a new RTP packet.
  • Translator: an intermediate system that forwards RTP packets with their synchronisation source identifier intact. Examples of translators include devices that convert encodings without mixing, replicators from multicast to unicast and application-level filters in firewalls.
  • Monitor: an application that receives RTCP packets sent by participants in an RTP session, in particular the reception reports, and estimates the current QoS for distribution monitoring, fault diagnosis and long-term statistics.

Figure 7.9 shows the RTP header format. The first 12 octets are present in every RTP packet, while the list of contribution source (CSRC) identifiers is present only when inserted by a mixer. The fields have the following meaning:

  • Version (V): two bits – this field identifies the version of RTP. The current version is two (2). (The value 1 is used by the first draft version of RTP and the value 0 is used by the protocol initially implemented in the ‘vat’ audio tool.)
  • Padding (P): one bit – if the padding bit is set, the packet contains one or more additional padding octets at the end which are not part of the payload. The last octet of the padding contains a count of how many padding octets should be ignored, including the last padding octet.
  • Extension (X): one bit – if the extension bit is set, the fixed header must be followed by exactly one header extension, with a defined format.
  • Contribution source (CSRC) count (CC): four bits – the CSRC count contains the number of CSRC identifiers that follow the fixed header.
  • Marker (M): one bit – the interpretation of the marker is defined by a profile.
  • Payload type (PT): seven bits – this field identifies the format of the RTP payload and determines its interpretation by the application. A set of default mappings for audio and video is specified in the companion RFC 3551.
  • Sequence number: 16 bits – the sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence.
  • Timestamp: 32 bits – the timestamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronisation and jitter calculations.
  • Synchronisation source (SSRC): 32 bits – the SSRC field identifies the synchronisation source. This identifier should be chosen randomly, with the intent that no two synchronisation sources within the same RTP session will have the same SSRC identifier.
  • CSRC list: 0 to 15 items, 32 bits each – the CSRC list identifies the contributing sources for the payload contained in this packet. The number of identifiers is given by the CC field. If there are more than 15 contributing sources, only 15 can be identified.
c07f009

Figure 7.9 RTP header information
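The field layout above can be illustrated with a short parsing sketch in Python. This is an illustrative decoder of the 12-byte fixed header and CSRC list only (it ignores header extensions and padding removal), not part of any standard implementation:

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (RFC 3550) plus the CSRC list."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CSRC count (CC)
    csrc = list(struct.unpack(f"!{cc}I", packet[12:12 + 4 * cc])) if cc else []
    return {
        "version": b0 >> 6,             # should be 2
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": cc,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": ts,
        "ssrc": ssrc,
        "csrc": csrc,
    }
```

Note the network byte order (`!` in the format string): all RTP header fields are transmitted most significant byte first.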

7.7.2 RTP Control Protocol (RTCP)

The RTP control protocol (RTCP) is based on the periodic transmission of control packets to all participants in the session, using the same distribution mechanism as the data packets. The underlying protocol must provide multiplexing of the data and control packets, for example using separate port numbers with UDP. RTCP performs four functions:

  • The primary function is to provide feedback on the quality of the data distribution. This is an integral part of the RTP role as a transport protocol and is related to the flow and congestion control functions of other transport protocols. The feedback may be directly useful for control of adaptive encodings, but experiments with IP multicasting have shown that it is also critical to get feedback from the receivers to diagnose faults in the distribution. Sending reception feedback reports to all participants allows whoever is observing problems to evaluate whether those problems are local or global. With a distribution mechanism like IP multicast, it is also possible for an entity such as a network service provider who is not otherwise involved in the session to receive the feedback information and act as a third-party monitor to diagnose network problems. This feedback function is performed by the RTCP sender and receiver reports (SR and RR) – see Figure 7.10.
  • RTCP carries a persistent transport-level identifier for an RTP source called the canonical name or CNAME. Since the SSRC identifier may change if a conflict is discovered or a program is restarted, receivers require the CNAME to keep track of each participant. Receivers may also require the CNAME to associate multiple data streams from a given participant in a set of related RTP sessions, for example to synchronise audio and video. Inter-media synchronisation also requires the NTP and RTP timestamps included in RTCP packets by data senders. The NTP is the network time protocol specified by RFC 5905.
  • The first two functions require that all participants send RTCP packets, therefore the rate must be controlled in order for RTP to scale up to a large number of participants. By having each participant send its control packets to all the others, each can independently observe the number of participants.
  • A fourth, optional function is to convey minimal session control information, for example participant identification to be displayed in the user interface. This is most likely to be useful in ‘loosely controlled’ sessions where participants enter and leave without membership control or parameter negotiation.
c07f010

Figure 7.10 Sender report (SR) and receiver report (RR)
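The third RTCP function, rate control, is commonly summarised as holding control traffic to about 5% of the session bandwidth, shared among all observed participants, with a floor on the reporting interval. The sketch below is a deliberate simplification of the full RFC 3550 interval algorithm (it omits randomisation, the sender/receiver bandwidth split and reconsideration); the 100-byte average packet size is an assumed value:

```python
def rtcp_interval(members: int, session_bw_bps: float,
                  avg_rtcp_size_bytes: float = 100.0) -> float:
    """Rough RTCP report interval in seconds: control traffic is held
    to about 5% of the session bandwidth, shared among all members,
    with a 5-second floor (simplified from RFC 3550)."""
    rtcp_bw = 0.05 * session_bw_bps / 8.0      # bytes/second available to RTCP
    interval = members * avg_rtcp_size_bytes / rtcp_bw
    return max(interval, 5.0)
```

Because each participant sees the others' RTCP packets, it can count `members` itself, which is what lets the interval grow automatically as the session scales up.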

7.7.3 Sender Report (SR) Packets

The SR packet consists of three sections. The first section, the header, consists of the following fields:

  • Version (V): two bits – identifies the version of RTP, which is the same in RTCP packets as in RTP data packets. The current version is two (2).
  • Padding (P): one bit – if the padding bit is set, this individual RTCP packet contains some additional padding octets at the end which are not part of the control information but are included in the length field. The last octet of the padding is a count of how many padding octets should be ignored, including itself (it will be a multiple of four).
  • Reception report count (RC): five bits – the number of report blocks contained in this packet.
  • Packet type (PT): eight bits – contains the constant 200 to identify this as an RTCP SR packet.
  • Length: 16 bits – the length of this RTCP packet in 32-bit words minus one, including the header and any padding.
  • SSRC: 32 bits – the synchronisation source identifier for the originator of this SR packet.

The second section, the sender information, is 20 octets long and is present in every sender report packet. It summarises the data transmissions from this sender. The fields have the following meaning:

  • NTP timestamp: 64 bits – indicates the wall clock time when this report was sent so that it may be used in combination with timestamps returned in reception reports from other receivers to measure round-trip propagation to those receivers.
  • RTP timestamp: 32 bits – corresponds to the same time as the NTP timestamp (above), but in the same units and with the same random offset as the RTP timestamps in data packets.
  • Sender's packet count: 32 bits – the total number of RTP data packets transmitted by the sender since starting transmission up until the time this SR packet was generated.
  • Sender's octet count: 32 bits – the total number of payload octets (i.e., not including header or padding) transmitted in RTP data packets by the sender since starting transmission up until the time this SR packet was generated.
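Putting the header and sender information together, a minimal SR packet with no report blocks occupies seven 32-bit words, so its length field is 6. The following sketch (illustrative only) packs such a packet, including the conversion of a wall clock time into the 64-bit NTP fixed-point format:

```python
import struct

def build_sr(ssrc: int, ntp_time: float, rtp_ts: int,
             pkt_count: int, octet_count: int) -> bytes:
    """Build a minimal RTCP sender report (no report blocks, RFC 3550)."""
    ntp_sec = int(ntp_time)
    ntp_frac = int((ntp_time - ntp_sec) * (1 << 32)) & 0xFFFFFFFF
    body = struct.pack("!IIIIII", ssrc, ntp_sec, ntp_frac,
                       rtp_ts, pkt_count, octet_count)
    length = (4 + len(body)) // 4 - 1           # 32-bit words minus one
    # First word: V=2, P=0, RC=0 (0x80), PT=200, then the length field.
    header = struct.pack("!BBH", 0x80, 200, length)
    return header + body
```

Each reception report block appended to this packet would add six further words, and RC in the first octet would count the blocks.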

The third section contains zero or more reception report blocks depending on the number of other sources heard by this sender since the last report. Each reception report block conveys statistics on the reception of RTP packets from a single synchronisation source.

SSRC_n (source identifier): 32 bits – the SSRC identifier of the source to which the information in this reception report block pertains. Each block contains the following fields:

  • Fraction lost: eight bits – the fraction of RTP data packets from source SSRC_n lost since the previous SR or RR packet was sent, expressed as a fixed point number with the binary point at the left edge of the field. This fraction is defined to be the number of packets lost divided by the number of packets expected.
  • Cumulative number of packets lost: 24 bits – the total number of RTP data packets from source SSRC_n that have been lost since the beginning of reception. This number is defined to be the number of packets expected less the number of packets actually received.
  • Extended highest sequence number received: 32 bits – the least significant 16 bits contain the highest sequence number received in an RTP data packet from source SSRC_n, and the most significant 16 bits extend that sequence number with the corresponding count of sequence number cycles.
  • Inter-arrival jitter: 32 bits – an estimate of the statistical variance of the RTP data packet inter-arrival time, measured in timestamp units and expressed as an unsigned integer. The inter-arrival jitter J is defined to be the mean deviation (smoothed absolute value) of the difference D in packet spacing at the receiver compared to the sender for a pair of packets.
  • Last SR timestamp (LSR): 32 bits – the middle 32 bits out of 64 in the NTP timestamp received as part of the most recent RTCP sender report (SR) packet from source SSRC_n. If no SR has been received yet, the field is set to zero.
  • Delay since last SR (DLSR): 32 bits – the delay, expressed in units of 1/65536 seconds, between receiving the last SR packet from source SSRC_n and sending this reception report block. If no SR packet has been received yet from SSRC_n, the DLSR field is set to zero.
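Two of these fields support simple calculations worth sketching. RFC 3550 smooths the jitter estimate with a gain of 1/16 (J = J + (|D| - J)/16), and the LSR/DLSR pair lets a sender compute round-trip time without synchronised clocks, since its own send time cancels out. An illustrative sketch, with times in the protocol's own units:

```python
def update_jitter(jitter: float, transit_prev: int, transit_now: int) -> float:
    """One step of the RFC 3550 inter-arrival jitter estimator,
    J = J + (|D| - J)/16, in RTP timestamp units.  The transit values
    are (arrival time - RTP timestamp) for consecutive packets."""
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0

def round_trip_time(arrival_ntp_mid32: int, lsr: int, dlsr: int) -> float:
    """RTT in seconds from a reception report block: arrival time of the
    report, the LSR field and the DLSR field, all in 1/65536-s units."""
    return ((arrival_ntp_mid32 - lsr - dlsr) & 0xFFFFFFFF) / 65536.0
```

On a GEO satellite path the computed RTT would be dominated by the roughly half-second propagation delay, which is exactly the kind of diagnosis these report fields enable.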

7.7.4 Receiver Report (RR) Packets

The format of the receiver report (RR) packet is the same as that of the SR packet except that the packet type field contains the constant 201 and the five words of sender information are omitted. The remaining fields have the same meaning as for the SR packet.

7.7.5 Source Description (SDES) RTCP Packet

The SDES packet is a three-level structure composed of a header and zero or more chunks, each of which is composed of items describing the source identified in that chunk. Each chunk consists of an SSRC/CSRC identifier followed by a list of zero or more items, which carry information about the SSRC/CSRC. Each chunk starts on a 32-bit boundary. Each item consists of an eight-bit type field, an eight-bit octet count describing the length of the text (thus, not including this two-octet header), and the text itself. Note that the text can be no longer than 255 octets, but this is consistent with the need to limit RTCP bandwidth consumption.

End systems send one SDES packet containing their own source identifier (the same as the SSRC in the fixed RTP header). A mixer sends one SDES packet containing a chunk for each contributing source from which it is receiving SDES information, or multiple complete SDES packets if there are more than 31 such sources.

The SDES items currently defined include:

  • CNAME: canonical identifier (mandatory);
  • NAME: name of user;
  • EMAIL: email address of user;
  • PHONE: phone number of user;
  • LOC: location of user, application specific;
  • TOOL: name of application/tool;
  • NOTE: transient messages from user;
  • PRIV: application specific/experimental use.
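The chunk layout described above (SSRC, then type/length/text items, a zero terminator, and padding to a 32-bit boundary) can be sketched as follows; this is an illustration of the encoding, not a complete SDES packet builder (the RTCP packet header around the chunks is omitted):

```python
import struct

SDES_CNAME = 1  # item type code for the canonical name

def sdes_chunk(ssrc: int, cname: str) -> bytes:
    """Encode one SDES chunk carrying a CNAME item (RFC 3550): the SSRC,
    then (type, length, text) items, a null terminator, and zero padding
    to the next 32-bit boundary."""
    text = cname.encode("utf-8")
    if len(text) > 255:
        raise ValueError("SDES item text is limited to 255 octets")
    item = bytes([SDES_CNAME, len(text)]) + text    # two-octet header + text
    chunk = struct.pack("!I", ssrc) + item + b"\x00"  # item list terminator
    pad = (-len(chunk)) % 4                          # align to 32-bit boundary
    return chunk + b"\x00" * pad
```

The one-octet length field is why each item's text cannot exceed 255 octets, as noted above.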

Goodbye RTCP packet (BYE): the BYE packet indicates that one or more sources are no longer active.

Application-defined RTCP packet (APP): the APP packet is intended for experimental use as new applications and new features are developed, without requiring packet type value registration.

7.7.6 SAP and SIP Protocols for Session Initiations

There are several complementary mechanisms for initiating sessions, depending on the purpose of the session, but they essentially can be divided into invitation and announcement mechanisms. A traditional example of an invitation mechanism would be making a telephone call, which is essentially an invitation to participate in a private session. A traditional example of an announcement mechanism is the television guide in a newspaper, which announces the time and channel that each programme is broadcast. In the Internet, in addition to these two extremes, there are also sessions that fall in the middle, such as an invitation to listen to a public session, and announcements of private sessions to restricted groups.

The session announcement protocol (SAP) must be one of the simplest protocols around (RFC 2974). To announce a multicast session, the session creator merely multicasts packets periodically to a well-known multicast group carrying a session description protocol (SDP) description of the session that is going to take place (RFC 4566). People that wish to know which sessions are going to be active simply listen to the same well-known multicast group, and receive those announcement packets. Of course, the protocol gets a little more complex when we take security and caching into account, but basically that is it.
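The payload a SAP announcer multicasts periodically is an SDP text description of the coming session. The sketch below composes such a description; all the values (session name, addresses, port) are illustrative, and the SAP packet header that would precede the SDP text on the wire is omitted:

```python
def build_sdp(session_name: str, origin_ip: str, group: str, port: int) -> str:
    """Compose a minimal SDP session description (RFC 4566) of the kind
    a SAP announcer would multicast periodically to its well-known group."""
    return "\r\n".join([
        "v=0",                                  # SDP version
        f"o=- 0 0 IN IP4 {origin_ip}",          # origin of the announcement
        f"s={session_name}",                    # session name
        f"c=IN IP4 {group}/127",                # multicast address with TTL
        "t=0 0",                                # unbounded session time
        f"m=audio {port} RTP/AVP 0",            # one PCMU audio stream
    ]) + "\r\n"
```

A listener subscribed to the announcement group simply parses each received description to build its list of upcoming sessions.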

The session initiation protocol (SIP) works like making a telephone call, for example it finds the person you are trying to reach and causes their phone to ring (RFC 3261). The most important way that SIP differs from making an existing telephone call (apart from being an IP-based protocol) is that you may not be dialling a number at all. Although SIP can call traditional telephone numbers, SIP's native concept of an address is a SIP URL, which looks very much like an email address. Figure 7.11 illustrates a typical SIP call with session initiation and termination.

c07f011

Figure 7.11 Typical SIP call with session initiation and termination
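The request that starts such a call is a text message, readable much like email. The sketch below builds a bare-bones INVITE with the mandatory RFC 3261 header fields; the addresses, branch and tag values are illustrative placeholders, and the SDP body that would normally describe the offered media is omitted (hence Content-Length: 0):

```python
def sip_invite(caller: str, callee: str, call_id: str) -> str:
    """Compose a minimal SIP INVITE request (RFC 3261) as plain text.
    Real user agents add an SDP offer body, Max-Forwards, Contact, etc."""
    return "\r\n".join([
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK{call_id}",
        f"From: <sip:{caller}>;tag=1928301774",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Length: 0",
    ]) + "\r\n\r\n"
```

A proxy or redirect server receiving this message inspects the request URI and To header to route the call onward, as the figures in this section show.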

Users may move to a different location. Redirect servers and location servers can be used to assist SIP calls. Figure 7.12 illustrates a typical SIP call using redirect server and location server.

c07f012

Figure 7.12 Typical SIP call using a redirect server and a location server

SIP makes extensive use of proxy servers, each of which looks at the call request, looks up whatever local information it has about the person being called (i.e., the callee), performs any security checks it has been asked by the callee or her organisation to make, and then routes the call onward. Figure 7.13 shows a typical SIP call using a proxy server and location server.

c07f013

Figure 7.13 Typical SIP call using a proxy server and a location server

There are two multicast channels per application per session: one for RTP and the other for RTCP. This allows ad hoc, stand-alone configurations for individual applications, and also allows advertised conferences with session directory (SDR) and configuration information.

7.7.7 Session Directory Service (SDS)

The growth in multicast services and applications has led to some navigation difficulties (just as there are in the WWW). This has led to the creation of a session directory service (SDS). This has several functions:

  • A user creating a conference needs to choose a multicast address that is not in use. The session directory system has two ways of doing this: first, it allocates addresses using a pseudo-random strategy based on how widespread the conference is going to be, according to the user, and where the creator is; second, it multicasts the session information out, and if it detects a clash with an existing session announcement, it changes its allocation. This is a simple mechanism for the management of allocation and listing of dynamic multicast addresses.
  • Users need to know what conferences there are on the multicast backbone (Mbone), what multicast addresses they are using, and what media are in use on them. They can use the session directory messages to discover all of this. The latest versions of multicast include a facility for administrative scoping, which allows session creators to designate a logical region of interest outside of which traffic will not (necessarily) flow.
  • Furthermore, the session directory tools currently implemented will launch applications for the user.
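The pseudo-random allocation with clash detection described in the first point can be sketched as follows. This is a toy model: the scope prefix is an assumed administratively scoped range, and `in_use` stands in for the set of addresses learned from heard announcements:

```python
import random

def allocate_multicast_address(in_use: set, scope_prefix: str = "239.255",
                               tries: int = 100) -> str:
    """Pick a pseudo-random multicast address within a scope, redrawing
    on a clash with addresses already announced, in the spirit of the
    session directory's allocation strategy."""
    for _ in range(tries):
        addr = f"{scope_prefix}.{random.randint(0, 255)}.{random.randint(1, 254)}"
        if addr not in in_use:
            return addr
    raise RuntimeError("no free address found in this scope")
```

In the real system the clash check is ongoing: even after allocation, hearing another session announce the same address triggers a re-allocation.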

7.8 Voice over IP

Based on RTP, IP telephony is becoming a mainstream application, moving away from proprietary solutions towards standards-based solutions, providing QoS comparable to that of the telecommunication networks, as well as transparent interoperability between IP networks and the telecommunication networks.

7.8.1 Gateway Decomposition

The signalling gateway is responsible for signalling between end users on either network. On the telecommunication networks side, the signalling is translated to an IP signalling protocol such as SIP or H.323, and transported across the IP network. Session announcement protocol (SAP) is used to announce the session. Session description protocol (SDP) is used to describe the call (or session; RFC 4566).

Once a call is set up, the media gateway is responsible for the transfer of the data, video and audio streams. On the telecommunication network side, media transport is by PCM-encoded data on TDM streams; on the IP network side, it is carried over RTP/UDP. The media gateway controller is used to control one or more media gateways.
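The media gateway's RTP-side job can be sketched for the common G.711 mu-law case: each 20 ms TDM frame yields 160 payload octets at 8 kHz, carried with static payload type 0, so the sequence number steps by 1 and the timestamp by 160 per packet. An illustrative sketch (header fields as in Figure 7.9; marker and CSRC handling omitted):

```python
import struct

def packetise_pcmu(frames, ssrc: int, seq0: int = 0, ts0: int = 0):
    """Wrap successive 20 ms G.711 mu-law frames (160 octets at 8 kHz)
    into RTP packets with payload type 0, as a media gateway would."""
    packets = []
    for i, frame in enumerate(frames):
        header = struct.pack("!BBHII",
                             0x80,                       # V=2, no P/X/CC
                             0x00,                       # M=0, PT=0 (PCMU)
                             (seq0 + i) & 0xFFFF,        # 16-bit sequence
                             (ts0 + 160 * i) & 0xFFFFFFFF,  # 8 kHz timestamp
                             ssrc)
        packets.append(header + frame)
    return packets
```

In the reverse direction the gateway strips the 12-byte header, reorders by sequence number and plays the PCM octets back onto the TDM stream.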

7.8.2 Protocols

VoIP uses a number of protocols. As far back as 1994, the ITU-T introduced its H.323 family of protocols to provide multimedia capability over the Internet. Many vendors have developed and deployed these solutions. In parallel, the IETF introduced many protocols used for IP telephony – RTP, RTSP, RTCP, Megaco, SIP and SDP. These protocols provide the foundation for standards-based IP telephony.

7.8.3 Gatekeepers

Gatekeepers are responsible for addressing, authorisation and authentication of terminals and gateways, bandwidth management, accounting, billing and charging. They may also provide call-routing services. A terminal is a PC or a stand-alone device running multimedia applications. Multipoint control units (MCUs) provide support for conferences of three or more terminals.

7.8.4 Multimedia Conferencing (MMC)

Multimedia conferencing (MMC) is a typical example of an application based on IP multicast. It is also well suited to satellite networks, which offer great advantages for multicast distribution. It consists of multimedia applications with the following components:

  • Voice provides packet audio in time slices, numerous audio-coding schemes, redundant audio for repair, unicast or multicast, configurable data rates.
  • Video provides packet video in frames, numerous video-coding schemes, unicast or multicast, configurable data rates.
  • Network Text Editor can be used for message exchanges.
  • Whiteboard can be used for free-hand drawing.

It should allow locally scoped groups, globally scoped groups and administratively scoped groups, and also a unicast traffic gateway (UTG) so that routers, routing protocols and multicast domains can be reached by tunnelling. That is, in a LAN, IP packets are multicast to all hosts directly, while in a WAN a virtual overlay network runs on top of the Internet. RTP/RTCP is used as the protocol for transmission and control. Overlapping multicast domains can be configured by using different administratively scoped addresses in each of the domains.
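The scoping rules above hinge on recognising which multicast range an address falls into. A small classification sketch (the administratively scoped range 239.0.0.0/8 is defined in RFC 2365; the coarse link-local/global split below is a simplification):

```python
import ipaddress

def multicast_scope(addr: str) -> str:
    """Classify an IPv4 address by multicast scope: administratively
    scoped traffic (239.0.0.0/8) stays within its domain, 224.0.0.0/24
    is link-local, and the rest of 224.0.0.0/4 may travel globally."""
    ip = ipaddress.ip_address(addr)
    if not ip.is_multicast:
        return "not multicast"
    if ip in ipaddress.ip_network("239.0.0.0/8"):
        return "administrative"
    if ip in ipaddress.ip_network("224.0.0.0/24"):
        return "link-local"
    return "global"
```

A border router for an administratively scoped domain would refuse to forward "administrative" traffic outward, which is what allows overlapping domains to reuse the same group numbers.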

7.8.5 Conference Control

Conference control provides functions and mechanisms for users to control how to organise, manage and control a conference, with the following control functions:

  • Floor control: who speaks? chairman control or distributed control?
  • Loose control: one person speaks at a time, grabbing the channel.
  • Strict control: application specific, for example a lecture.
  • Resource reservation: bandwidth requirement and quality of the conference.
  • Per-flow reservation: audio only, video only, or audio and video.

Further Readings

  1. Allman, M., Floyd, S. and Partridge, C., Increasing TCP's Initial Window, RFC 2414, IETF, September 1998.
  2. Allman, M., Glover, D. and Sanchez, L., Enhancing TCP over Satellite Channels using Standard Mechanisms, BCP 28, RFC 2488, IETF, January 1999.
  3. Allman, M., Paxson, V. and Blanton, E., TCP Congestion Control, RFC 5681, IETF, September 2009.
  4. Braden, R., Extending TCP for Transactions – Concepts, RFC 1379, IETF, November 1992.
  5. Braden, R., T/TCP – TCP Extensions for Transactions: Functional Specification, RFC 1644, IETF, July 1994.
  6. Chotikapong, Y., TCP/IP and ATM over LEO Satellite Networks, PhD thesis, University of Surrey, 2000.
  7. Chotikapong, Y., Cruickshank, H. and Sun, Z., Evaluation of TCP and Internet traffic via low earth orbit satellites, IEEE Personal Communications over Satellites, Special Issue on Multimedia Communications over Satellites, 3: 28–34, 2001.
  8. Chotikapong, Y. and Sun, Z., Evaluation of application performance for TCP/IP via satellite links, IEE Colloquium on Satellite Services and the Internet, 17 February 2000.
  9. Chotikapong, Y., Sun, Z., Örs, T. and Evans, B.G., Network architecture and performance evaluation of TCP/IP and ATM over satellite, 18th AIAA International Communication Satellite Systems Conference and Exhibit, Oakland, April 2000.
  10. Jacobson, V., Compressing TCP/IP Headers for Low-Speed Serial Links, RFC 1144, IETF, February 1990.
  11. Mathis, M., Mahdavi, J., Floyd, S. and Romanow, A., TCP Selective Acknowledgment Options, RFC 2018, IETF, October 1996.
  12. Paxson, V., Allman, M., Dawson, S., Heavens, I. and Volz, B., Known TCP Implementation Problems, RFC 2525, IETF, March 1999.
  13. Ramakrishnan, K., Floyd, S. and Black, D., The Addition of Explicit Congestion Notification (ECN) to IP, RFC 3168, IETF, September 2001.
  14. Stevens, W., TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms, RFC 2001, IETF, January 1997.
  15. Sun, Z., TCP/IP over satellite, in Service Efficient Network Interconnection via Satellite, Fun Hu, Y., Maral, G. and Ferro, E. (eds), John Wiley & Sons, Inc., pp. 195–212.
  16. Sun, Z., Chotikapong, Y. and Chaisompong, C., Simulation studies of TCP/IP performance over satellite, 18th AIAA International Communication Satellite Systems Conference and Exhibit, Oakland, April 2000.
  17. Sun, Z. and Cruickshank, H., Analysis of IP voice conferencing over geostationary satellite systems, IEE Colloquium on Satellite Services and the Internet, 17 February 2000.
  18. Postel, J., Transmission Control Protocol, RFC 793, IETF, September 1981.
  19. Braden, R., Requirements for Internet Hosts – Communication Layers, RFC 1122, IETF, October 1989.
  20. Jacobson, V., Braden, R. and Borman, D., TCP Extensions for High Performance, RFC 1323, IETF, May 1992.
  21. Crocker, D., Mailbox Names for Common Services, Roles and Functions, RFC 2142, IETF, May 1997.
  22. Schulzrinne, H., Casner, S., Frederick, R. and Jacobson, V., RTP: A Transport Protocol for Real-Time Applications, RFC 3550, IETF, July 2003.
  23. Schulzrinne, H. and Casner, S., RTP Profile for Audio and Video Conferences with Minimal Control, RFC 3551, IETF, July 2003.
  24. Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and Schooler, E., SIP: Session Initiation Protocol, RFC 3261, IETF, June 2002.
  25. Handley, M., Jacobson, V. and Perkins, C., SDP: Session Description Protocol, RFC 4566, IETF, July 2006.
  26. Borman, D., Deering, S. and Hinden, R., IPv6 Jumbograms, RFC 2675, IETF, August 1999.
  27. Mills, D., Martin, J., Burbank, J. and Kasch, W., Network Time Protocol Version 4: Protocol and Algorithms Specification, RFC 5905, IETF, June 2010.
  28. Wessels, D. and Claffy, K., Application of Internet Cache Protocol (ICP), Version 2, RFC 2187, IETF, September 1997.
  29. Handley, M., Perkins, C. and Whelan, E., Session Announcement Protocol, RFC 2974, IETF, October 2000.

Exercises

1 Explain how satellite networks affect the performance of TCP due to flow control, error controls and congestion control mechanisms.

2 Discuss typical satellite network configurations for Internet connections.

3 Explain TCP enhancement for satellite networks based on the slow-start algorithm. Explain TCP enhancement based on the congestion avoidance mechanism.

4 Discuss how to achieve TCP enhancement based on acknowledgement.

5 Calculate the utilisation of satellite bandwidth in the slow-start and congestion avoidance stages.

6 Explain TCP enhancement on error recovery mechanisms, including fast retransmission and fast recovery.

7 Explain the pros and cons of TCP spoofing and split TCP (also known as cascading TCP) mechanisms.

8 Explain the limitation of TCP enhancement mechanisms based on existing TCP mechanisms.

9 Discuss real-time protocols, including RTP, RTCP, SAP, SIP and so on, and the HTTP protocol.

10 Compare differences between non-real-time applications, WWW and FTP, and real-time applications, VoIP and MMC.