Chapter 3

Quality of Service: Mechanisms and Protocols 1

3.1. QoS and IP

3.1.1. The stack of IP protocols

QoS in TCP/IP networks can be found on many interdependent levels:

– in the case of a local network, QoS mechanisms can be implemented at the link layer access protocols level to give priorities to some machines (the 802.1p standard over Ethernet enables the definition of 8 traffic classes);

– in the case of an extended network, the link layer can provide QoS guarantees during the connection between a user and the network (the ATM CBR class provides a guaranteed fixed bandwidth for the duration of the connection);

– at the routers and IP protocol levels, specific processing can be applied to certain datagrams or to identified flows (for example, all IP packets belonging to a video flow will have priority in the router queues);

– the QoS approach can be based on adaptive applications: these will modify their algorithms according to the network’s behavior (for example, sound coding will adapt to the network’s throughput).

Figure 3.1 illustrates these different QoS levels by placing IP, and the QoS models or protocols associated with it, at the core of the figure.

Figure 3.1. IP and the different levels of QoS

ch3-fig3.1.gif

At the IP level, packets are identified, classified and processed on the basis of a field reserved in the header to encode QoS-related information.

3.1.2. The IPv4 TOS field

The IP packet, or IP datagram, is organized in 32-bit fields. The Type of Service (TOS) field was provided for when the IP protocol was designed, to indicate with a specific code the QoS associated with a packet. Figure 3.2 positions this field in the IPv4 (RFC 791) header.

Figure 3.2. Structure of the IPv4 header

ch3-fig3.2.gif

The detail of the 8-bit TOS field is shown in Figure 3.3 (a decoding sketch follows this list):

– the 3 PRECEDENCE bits indicate the priority of the datagram. This goes from 0 (PPP = 000) for normal routing or no priority, to 7 (PPP = 111) for information with maximum priority;

– the following bits indicate which type of quality is required: D (Delay) for a short delay, T (Throughput) for high throughput, R (Reliability) for reliable routing, C (Cost) for a priority linked to cost (number of hops, etc.);

– the last bit, set to 0, is not used.
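As an illustration, here is a minimal Python sketch decoding the original TOS byte into its precedence and D/T/R/C bits (the dictionary keys are illustrative names, not taken from any particular implementation):

    def decode_tos(tos: int) -> dict:
        """Decode the original IPv4 TOS byte (RFC 791 layout)."""
        return {
            "precedence":  (tos >> 5) & 0b111,       # 3 PPP bits: 0 = normal .. 7 = maximum
            "delay":       bool(tos & 0b00010000),   # D: request a short delay
            "throughput":  bool(tos & 0b00001000),   # T: request a high throughput
            "reliability": bool(tos & 0b00000100),   # R: request a reliable routing
            "cost":        bool(tos & 0b00000010),   # C: priority linked to cost
            # the last bit (0b00000001) is unused and stays at 0
        }

    # Example: precedence 5 with the Delay bit set -> 0b10110000 = 0xB0
    print(decode_tos(0xB0))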

Figure 3.3. Detail of the IPv4 TOS

ch3-fig3.3.gif

The IPv4 TOS field is rarely used in this form. Another definition proposed in the DiffServ model is explained in section 3.3.

3.1.3. QoS on IPv6

3.1.3.1. Traffic class field

Within the IPv6 header, the Traffic Class (TC) field, albeit in a different location, takes the place of the IPv4 TOS field. Figure 3.4 shows the structure of this header (RFC2460).

Figure 3.4. Structure of IPv6 header

ch3-fig3.4.gif

As in IPv4, the TC field was designed to be used by transmitter nodes and routers to identify and distinguish the different classes or priorities of IPv6 packets. This control is managed by the DiffServ (RFC 2474) protocol, which is designed to exploit this field.

The transmitter of a specialized flow (multimedia, real time, etc.) specifies a service class through the TC field. The routers, equipped with classification algorithms (packet classifier), interpret this field and apply a differentiated treatment (adaptation, queuing, etc.) with the help of a packet scheduler. It is important to note that this field is not covered by any checksum and may therefore be modified along the route.

The traffic class field is composed of two parts:

– DSCP (DiffServ Code Point) over 6 bits which contains the values of the different behaviors;

– CU (Currently Unused) over 2 bits, not used at present; it is intended to let routers indicate a congestion risk.

Today, only two types of behaviors are standardized by the IETF (Internet Engineering Task Force):

Assured Forwarding (AF): this defines four traffic classes and three drop precedences (see section 3.3 on DiffServ). The class is chosen by the user and remains the same until the packets reach the recipient. The precedence may be modified by routers.

Expedited Forwarding (EF): its behavior is equivalent to a constant throughput leased line.
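As an illustration, the codepoints associated with these behaviors can be computed as in the following sketch (based on the standard encodings, AFxy = class x and drop precedence y; the detailed codepoints are discussed again in section 3.3):

    def af_dscp(af_class: int, drop_precedence: int) -> int:
        """DSCP value for AF class 1..4 and drop precedence 1..3."""
        assert 1 <= af_class <= 4 and 1 <= drop_precedence <= 3
        return (af_class << 3) | (drop_precedence << 1)

    EF_DSCP = 0b101110          # Expedited Forwarding (46)
    print(af_dscp(1, 1))        # AF11 -> 10 (0b001010)
    print(af_dscp(4, 3))        # AF43 -> 38 (0b100110)
    print(EF_DSCP)              # 46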

3.1.3.2. Flow label field

The flow label field is also linked to QoS. It can be used by a source to label a sequence of packets (unicast or multicast) for which special processing is requested from the transit routers (choice of route, real-time processing of the information, video sequences, etc.). The chosen flow label value is the same for all packets of the same application going toward the same recipient: determining which flow a packet belongs to is thereby simplified. A flow is then defined: it is identified by the combination of the source address and a non-null flow label.

3.1.3.3. Other IPv6 contributions for QoS

Other IPv6 functions meet QoS criteria, besides TC and flow label specific fields:

Simplification of the header format: some IPv4 header fields have been deleted or have become optional. The header now has 8 fields instead of 15. This reduces the packet processing cost in ordinary situations and limits the bandwidth consumed by the header.

Fragmentation: IPv6 does not manage fragmentation. IPv6 requires that each internetwork link have a Maximum Transfer Unit (MTU) higher than or equal to 1,280 bytes. For each link that does not have the necessary capacity, fragmentation and re-assembly services must be supplied by the layer that is below IPv6.

Packet maximum lifetime: IPv6 nodes are not required to impose a maximum lifetime on packets. Information loss is therefore reduced, since packets are no longer rejected when their lifetime expires.

ICMPv6: the IP control protocol has been revised. In IPv4, ICMP is used for error reporting, tests and automatic equipment configuration. These functions are better defined in ICMPv6; furthermore, ICMPv6 integrates multicast group management functions and those of the ARP protocol.

3.1.4. Processing in routers

A QoS router integrates a specific processing logic by using different packet processing algorithms associated with output queues (Figure 3.5).

The router’s first step is packet classification. It applies directly to IP and can be executed in different ways (a multifield classifier sketch is given after this list):

– from the TOS field of the IPv4 header;

– from the redefined DSCP field within the IPv4 header or defined by default in the IPv6 header in a DiffServ context;

– from a multifield definition integrating, for example:

   - source and destination IP addresses,

   - the TOS or DSCP field,

   - TCP or UDP source or destination ports, which enable a more precise classification according to the application that will be used but that makes the analysis of the headers more complex.
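To illustrate the multifield case, here is a minimal sketch in which a rule is simply a dictionary of header fields to match; the rules, field names and class names are hypothetical and not taken from any router implementation:

    # Hypothetical multifield classifier: each rule maps header fields to a traffic class.
    RULES = [
        ({"dst_ip": "10.0.0.5", "protocol": "UDP", "dst_port": 5004}, "video"),
        ({"dscp": 46}, "voice"),                       # EF-marked packets
        ({"protocol": "TCP", "dst_port": 80}, "web"),
    ]

    def classify(packet: dict, default: str = "best-effort") -> str:
        """Return the class of the first rule whose fields all match the packet."""
        for fields, traffic_class in RULES:
            if all(packet.get(k) == v for k, v in fields.items()):
                return traffic_class
        return default

    pkt = {"src_ip": "192.0.2.1", "dst_ip": "10.0.0.5",
           "protocol": "UDP", "src_port": 40000, "dst_port": 5004, "dscp": 0}
    print(classify(pkt))   # -> "video"

The more fields a rule inspects, the more precise the classification, but the more costly the analysis of each header becomes.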

Figure 3.5. IP processing in a router

ch3-fig3.5.gif

3.2. IntServ (RSVP) model

3.2.1. Principle

IntServ (Integrated Services) is an architecture model defined by the IETF (RFC 1633) which proposes resource reservation in the intermediary nodes (routers) before using them. Contrary to the DiffServ model explained in section 3.3, each application is free to request a specific QoS according to its needs. Using the associated RSVP (Resource ReSerVation Protocol) protocol, the intermediary routers check whether or not they can grant this QoS and accept or decline the application’s reservation request accordingly. If the reservation is accepted, the application is granted data transfer guarantees, according to what has been negotiated (Figure 3.6). Compared to the classic Internet infrastructure, the IntServ model thus adds two constraints:

– the data flow identification of an application needing QoS;

– the maintenance of additional state information in the routers to process this QoS data flow.

Figure 3.6. The integrated services model

ch3-fig3.6.gif

The protocols used by IntServ sit at levels 3 and 4 of the OSI model. The RSVP messages used to establish and maintain a reserved route are transported directly in IP datagrams (the IP header protocol field is then set to 46) or over the UDP protocol if direct mode is not supported (ports 1698 and 1699 are then used). RSVP is not a routing protocol (it is meant to work with unicast or multicast routing protocols such as RIP, OSPF, RPM, etc.) and is assimilated, in the OSI model, to a transport protocol (Figure 3.7).

Figure 3.7. IntServ protocols

ch3-fig3.7.gif

3.2.2. IntServ services

IntServ defines two types of services:

– the Controlled Load (CL) service, which is equivalent to the best effort service on a lightly loaded network and controls the available throughput;

– Guaranteed Service (GS) which corresponds to a dedicated virtual circuit and offers bandwidth and end-to-end delay guarantees.

3.2.3. How an IntServ router works

An IntServ router (Figure 3.8) must integrate supplementary control mechanisms enabling the requested QoS reservations to be honored. Four distinct functions can be distinguished (an admission control sketch follows this list):

– the classifier, whose role is to classify each incoming packet according to the flow it belongs to (best effort packet, packet with a QoS request, etc.);

– the scheduler, which controls output packet transmission by using multiple queues. Each queue corresponds to a service classification (classifier role) controlled by different algorithms (CBQ, WFQ, etc.);

– the admission control, which must decide according to the requested service classification if a new flow can or cannot be accepted based on already existing flows;

– the process linked to the RSVP reservation protocol, which drives this set of functions and creates and updates the state relative to a reservation within each router located on a route used by the QoS flow.
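To make the admission control function more concrete, here is a minimal sketch with hypothetical link capacity, reservable fraction and bookkeeping (it is not the actual RSVP machinery): a new flow is accepted only if the bandwidth already reserved on the output link leaves enough room for the request.

    class AdmissionControl:
        """Simplistic per-link bandwidth accounting for reservation requests."""
        def __init__(self, link_capacity_bps: float, reservable_fraction: float = 0.75):
            self.limit = link_capacity_bps * reservable_fraction   # keep room for best effort
            self.flows = {}                                         # flow_id -> reserved rate

        def request(self, flow_id: str, rate_bps: float) -> bool:
            if sum(self.flows.values()) + rate_bps > self.limit:
                return False                    # request refused: a ResvErr would be returned
            self.flows[flow_id] = rate_bps      # request accepted: state is kept (soft state)
            return True

    ac = AdmissionControl(link_capacity_bps=10e6)
    print(ac.request("flow-1", 512e3))   # True: 512 kbit/s accepted
    print(ac.request("flow-2", 8e6))     # False: would exceed the reservable share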

Figure 3.8. IntServ router architecture

ch3-fig3.8.gif

3.2.4. The RSVP protocol

3.2.4.1. Principle

RSVP (RFC 2205) is primarily a signaling protocol that makes it possible to reserve bandwidth dynamically and to guarantee a delay for unicast and multicast applications. It is based on the QoS requested by the recipient and not the transmitter, which helps to prevent transmitting applications from monopolizing resources uselessly and thus jeopardizing the global performance of the network.

Routers located on the data flow route honor the RSVP requests, establish and maintain connections (RSVP messages pass transparently through non-RSVP routers). Unlike the reservation of a static virtual-circuit-type route, routers reserve resources dynamically by memorizing state information (soft state). When a route is no longer used, the resources are freed. Likewise, if the route is modified, the state tables must be kept up to date, which requires periodic exchanges between routers.

Reservation within RSVP is executed in two steps (Figure 3.9):

– information sources periodically generate Path messages searching for a QoS route, which spread through the routers according to a unicast routing protocol (RIP, OSPF, etc.) or according to a multicast distribution tree;

– the unicast or multicast addressed receivers are informed of the sources’ requirements and respond with Resv reservation requests that trigger, in the routers, the requested reservations and travel back up to the selected sources following the opposite route.

Figure 3.9. RSVP reservation process

ch3-fig3.9.gif

The logical connection put in place by RSVP is called a session. It is characterized by a flow of data with a specific destination and a transport protocol. The session is therefore defined by the three elements described below:

– DestAdress: unicast IP address for a unique recipient or a group of IP addresses for multicast;

– Protocol Id: transport protocol;

– DestPort: UDP/TCP port.

3.2.4.2. RSVP messages

An RSVP message is made up of a 64 bit header followed by objects corresponding to the different reservation parameters (Figure 3.10).

The fields of the header part are as follows (a packing sketch follows this list):

Version specifies the RSVP protocol version (presently 1);

Flags is a field not used at this time;

Type specifies the type of RSVP message:

   - 1 for a Path search message,

   - 2 for a reservation Resv message,

   - 3 identifies a PathErr message for an error in response to a Path message,

   - 4 identifies a ResvErr message for an error in response to a Resv message,

   - 5 for a PathTear message that tells routers to cancel the states concerning the route,

   - 6 for a ResvTear message that tells routers to cancel the reservation states (end of session),

   - 7 for a ResvConf optional confirmation message sent to the receiver by the last router that received the Resv message;

Checksum verifies the integrity of the full RSVP message;

Send TTL gives the RSVP TTL value, to be compared with the TTL of the IP packet in order to detect non-RSVP routers (the RSVP TTL is not decremented by a non-RSVP router);

RSVP Length gives the total length of the message in bytes (header and objects).
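As an illustration, the 64-bit common header can be assembled with Python's struct module; this is a sketch following the field layout described above (the checksum is left at zero and the message length is an arbitrary example value):

    import struct

    RSVP_MSG_TYPES = {"Path": 1, "Resv": 2, "PathErr": 3, "ResvErr": 4,
                      "PathTear": 5, "ResvTear": 6, "ResvConf": 7}

    def rsvp_common_header(msg_type: str, length: int, send_ttl: int = 64) -> bytes:
        """Build the 8-byte RSVP common header (checksum left at 0 in this sketch)."""
        vers_flags = (1 << 4) | 0          # version 1 in the high nibble, flags unused
        return struct.pack("!BBHBBH",
                           vers_flags,
                           RSVP_MSG_TYPES[msg_type],
                           0,              # checksum, normally computed over the full message
                           send_ttl,
                           0,              # reserved byte
                           length)         # total length: header plus objects

    print(rsvp_common_header("Path", length=172).hex())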

Figure 3.10. Structure of RSVP messages

ch3-fig3.10.gif

Objects also share a header:

Object Length represents the length in bytes of the object;

Class-Num identifies the nature of the object (see Table 3.1);

C-type groups objects with common properties (C-type = 1 for objects defined in IPv4 and C-type = 2 for objects defined in IPv6, for example).

3.2.4.3. PATH and RESV messages

According to the previous RSVP message description, a Path message contains, among other things, the objects defined in Figure 3.11. The three objects that make up the sender descriptor describe the full characteristics of the source (IP addresses, quantitative specifications of the data flow, etc.). Only one object, AD_SPEC, is modified by the transit routers in order to take into account the progressive characteristics of the network at the router or link level.

Table 3.1. Objects within RSVP messages

Class-Num   Object name       Description
0           NULL              Content ignored by the receiver.
1           SESSION           Required in all RSVP messages: contains the destination IP address, destination port and transport protocol.
3           RSVP_HOP          Contains the IP address of the RSVP node which transmits the message (Previous Hop in the source-receiver direction, Next Hop in the opposite direction).
4           INTEGRITY         Encrypted authentication data.
5           TIME_VALUES       Message refresh interval set by the creator.
6           ERROR_SPEC        Specifies the error for the PathErr and ResvErr messages.
7           SCOPE             List of hosts affected by a Resv reservation message.
8           STYLE             Defines the reservation style in a Resv message.
9           FLOW_SPEC         Defines the QoS requested in a Resv message.
10          FILTER_SPEC       Defines a data packet subset within the session receiving the QoS specified in FLOW_SPEC.
11          SENDER_TEMPLATE   Characterizes a source in a Path message (IP address and other information).
12          SENDER_TSPEC      Defines a source’s data flow characteristics within a Path message.
13          AD_SPEC           Transports, in a Path message, information on the state of the network used by receivers.
14          POLICY_DATA       Transports information on reservation rules defined from an administrative viewpoint.
15          RESV_CONFIRM      Contains the IP address of the receiver requesting confirmation, in a Resv or ResvConf message.

Figure 3.11. Structure of a Path message

ch3-fig3.11.gif

The structure of a Resv message is described in Figure 3.12. The flow descriptor is made up of FLOW_SPEC, which quantitatively represents the resources requested by the receiver, and of FILTER_SPEC, which identifies the packets that will receive the QoS specified in FLOW_SPEC.

Figure 3.12. Structure of a Resv message

ch3-fig3.12.gif

Figure 3.13 shows an example of an exchange of Path and Resv messages:

1) the source prepares a Path message in which it specifies its unique characteristics in a TEMPLATE object as well as the desired traffic in a TSPEC descriptor (throughput, variability and packet size);

2) the Path message is dispatched toward the destination by applying the unicast (RIP or OSPF) or multicast routing protocol. As it passes, routers record the path state that will enable the return of the Resv message. At each router crossing, the AD_SPEC object can be modified to reflect the resources available for allocation (for example, to specify that the GS service is the only one available);

3) the receiver determines, thanks to TSPEC and AD_SPEC, which parameters to use in return and sends back a Resv message including the FLOW_SPEC objects to specify the requested QoS (for example, a GS service with a bandwidth of 512 Kbit/s) and FILTER_SPEC to characterize the packets for which the reservation will be established (for example, all packets with a port number equal to 3105);

4) the Resv message comes back using the same route as Path. The routers receiving Resv execute their admission control modules to analyze the request and possibly proceed with the allocation of resources (the packet classifier is programmed according to the FILTER_SPEC parameter and the bandwidth defined in FLOW_SPEC is reserved in the appropriate link);

5) the last router that accepts the reservation sends a ResvConf confirmation message to the receiver.

Figure 3.13. Example of Path and Resv message exchanges

ch3-fig3.13.gif

3.2.4.4. Reservation styles

The reservation style of a Resv message, specified in the STYLE object, clarifies two reservation characteristics in the routers:

– the source selection is done either by a specific list or by default according to the session’s common parameters (destination addresses, transport protocol and destination port);

– the allocation of resources, either shared by all packets of the selected sources or distinct for each source.

Table 3.2 summarizes the different reservation styles according to the combination of these two characteristics:

– for the FF style, all packets from each source defined by an explicit list get their own reservation;

– the SE style implies that all packets from the sources defined by an explicit list use a shared reservation;

– for the WF style, all the packets share the same reservation whatever their source.

Table 3.2. RSVP reservation styles

Source selection      Resource allocation: distinct for each source    Resource allocation: shared by all sources
Explicit              FF (Fixed Filter) style                          SE (Shared Explicit) style
Global (wildcard)     Not defined                                      WF (Wildcard Filter) style

Figure 3.14 illustrates a request for an SE style reservation on a router with 4 interfaces. This session has three sources (S1, S2 and S3) and three receivers (R1, R2 and R3).

For each interface, a list of the explicit sources and of the requested shared resources is produced. Thus, the reservation request coming from R3 concerns the two sources S1 and S3, which will share the 3B bandwidth. The router reserves by keeping the maximum of the requested resources for the specified sources. The source list is then limited, on each interface, to the sources reachable through that interface.
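A minimal sketch of this merging logic, using illustrative data structures only (the request from R2 is invented for the example): for an SE reservation the router keeps the union of the explicitly listed sources and the maximum of the requested bandwidths.

    def merge_se_requests(requests):
        """requests: list of (source_list, bandwidth) received from downstream receivers."""
        sources, bandwidth = set(), 0.0
        for src_list, bw in requests:
            sources |= set(src_list)          # union of the explicit source lists
            bandwidth = max(bandwidth, bw)    # keep the maximum requested resource
        return sources, bandwidth

    # Hypothetical requests received by the router (in the spirit of Figure 3.14)
    B = 1.0  # arbitrary bandwidth unit
    print(merge_se_requests([(("S1", "S2"), 2 * B), (("S1", "S3"), 3 * B)]))
    # -> ({'S1', 'S2', 'S3'}, 3.0): a shared reservation of 3B for the listed sources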

Figure 3.14. Example of an SE reservation in a router

ch3-fig3.14.gif

3.2.5. The disadvantages of IntServ

RSVP requires that state information be maintained for each flow in every node on the route linking the source to the receiver. When the number of sessions or of participants in a session increases (scalability), the number of states in the routers and the refresh exchanges between routers become sizeable and jeopardize the validity of the model in high traffic networks.

Furthermore, all the routers, including those at the core of the high throughput network, must inspect multiple fields within each packet in order to determine its associated reservation. After classification, each packet is placed in the queue corresponding to its reservation. The classification and management of queues for each individual flow make the IntServ model difficult to use in high throughput networks.

Besides, even if RSVP is expected to work with traditional routers that do not guarantee resource reservation, the process becomes less efficient as the number of non-RSVP routers increases.

Finally, since RSVP was not designed to define a control policy but only to manage the mechanisms enforcing such a policy, a protocol for exchanging control information (such as COPS) between the nodes and the policy servers (PDP – Policy Decision Point) must be added.

3.3. The DiffServ model

3.3.1. Principle

Contrary to IntServ, the DiffServ (Differentiated Services) model as defined by IETF (RFC 2475) does not propose reservations in the intermediary nodes. The basic principle consists of introducing multiple service classifications, each offering a different QoS. Depending on the application’s needs, each traffic flow will then be attributed with an appropriate service classification.

This traffic classification is executed at the edge of the network, directly at the source or on an edge router, according to a preconfigured set of criteria (IP addresses or TCP/UDP ports). Each packet is marked with a code (DSCP – DiffServ Code Point) which indicates its assigned traffic classification. The routers at the core of the network (core router) use this code, transported in an IP datagram field, to determine the QoS required by the packet and the associated behavior (PHB Per Hop Behavior), as illustrated in Figure 3.15. All the packets with the same code get the same treatment.

The packet classification criteria must reflect the real requirements of the source application, and therefore of the information that it transmits, in terms of bandwidth, sensitivity to packet loss, and sensitivity to delays and to delay variations (jitter). For example, VoIP, which by itself justifies the introduction of QoS, is much more sensitive to delays and delay variations than to packet loss.

For a given service class, an SLA (Service Level Agreement) is defined. This agreement specifies a group of rules, generally detailed according to the context, for traffic conditioning:

– average availability rate, average loss rate;

– delay limit, average delay, average jitter;

– microflow types within each classification;

– DSCP marking value;

– allocated bandwidth, accepted data peaks;

– policing selection in case of agreement overflow (a token-bucket policer sketch follows this list):

   - packet transmission,

   - packet rejection,

   - lowering of priority level (change of class),

   - flow shaping (spreading of the exceeding traffic over time),

– buffer size in queues.
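As a sketch of the policing choices listed above, a token bucket can be used to check the agreed rate and decide, per packet, whether to transmit, remark to a lower class or drop; the rates, burst size and action names are illustrative, not prescribed by the DiffServ specifications.

    import time

    class TokenBucketPolicer:
        """Polices a flow at rate_bps with a burst allowance of burst_bytes."""
        def __init__(self, rate_bps: float, burst_bytes: int):
            self.rate = rate_bps / 8.0            # refill rate in bytes per second
            self.burst = burst_bytes
            self.tokens = float(burst_bytes)
            self.last = time.monotonic()

        def police(self, packet_len: int) -> str:
            now = time.monotonic()
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_len:
                self.tokens -= packet_len
                return "transmit"                 # in profile
            return "remark-or-drop"               # out of profile: lower the class or reject

    policer = TokenBucketPolicer(rate_bps=1_000_000, burst_bytes=15_000)
    print(policer.police(1500))   # "transmit" while the burst allowance lasts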

Figure 3.15. “Differentiated services” model

ch3-fig3.15.gif

3.3.2. Architecture

3.3.2.1. DiffServ domains and regions

A DiffServ domain corresponds to a zone of contiguous nodes operating according to the DiffServ model and applying a common control policy and common PHBs. This policy, which defines the QoS, is the subject of an SLA with the source of the data flows. A DiffServ domain is generally associated with a service operator or an intranet. In order to communicate with the outside and match the PHBs, these DiffServ domains execute the appropriate traffic conditioning (TCA – Traffic Conditioning Agreement) on their boundary nodes (Figure 3.16).

A DiffServ region contains a group of contiguous DiffServ domains. Each domain applies predefined SLA service contracts as well as a coherent control policy. The operator must guarantee, however, that the QoS will be ensured end-to-end on the whole DiffServ region by maintaining a constant matching between domains.

Figure 3.16. DiffServ elements

ch3-fig3.16.gif

3.3.2.2. DiffServ nodes

The processes executed on the edge routers, and more often on boundary nodes (these can also be servers or workstations on which the applications emitting the flows are executed), are generally the most complex and correspond to a predefined SLA. They can recondition packets if the rules (TCA) are not the same in the destination domain. The SLA of an organization X may, for example, classify an incoming Gold (AF3 service class) flow from client Y as a Silver (AF2 class) flow within its domain. In this case, the edge router of client Y’s DiffServ domain must then execute conditioning operations (TCA) on the outgoing traffic.

Depending on their situation, the boundary nodes are classified into two categories:

– the Ingress Nodes of the domain that classify traffic and verify compliance of classified traffic;

– the Egress Nodes of the domain that must aggregate the different classifications and verify the compliance with contracts negotiated downstream.

Depending on their location, different functions can be put in place in the boundary nodes (Figure 3.17):

– the classifier which detects flow service classes. It is usually based on the association of one or more characteristics of the network and/or transport layer (source or destination IP address, source or destination port, TCP or UDP protocol, etc.);

– the meter which verifies that the inbound flow classifications do not exceed the SLA defined in the router;

– the marker which works on the DSCP field. This module may, for example, decide that in the case where the contract is exceeded, exceeding flows are marked with a lower priority;

– the shaper, which slows down traffic (lower priority queue) and may be activated when the flow of a class exceeds the predefined SLA;

– the dropper, which intervenes to guarantee a fixed throughput for each service class.

In the case of shaping, since the queues have a finite size, packets can be dropped when the profile is exceeded by too much.

Figure 3.17. DiffServ edge router functions

ch3-fig3.17.gif

Contrary to the IntServ model, where the major problem is the complexity of operation within the intermediary nodes, DiffServ attempts to decrease processing in the interior nodes (core routers). The packet’s DSCP field is analyzed and an appropriate predefined behavior (PHB) is executed.
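In a core router, the per-packet work can thus be reduced to a table lookup on the DSCP, as in this minimal sketch (the queue names and the exact set of codepoints installed are illustrative):

    # DSCP -> (queue, behavior) table of a hypothetical core router
    PHB_TABLE = {
        46: ("priority", "EF"),        # Expedited Forwarding
        10: ("assured",  "AF11"),      # Assured Forwarding, class 1, low drop precedence
        0:  ("default",  "BE"),        # Best Effort
    }

    def select_phb(dscp: int):
        """Return the queue and behavior for a packet, falling back to best effort."""
        return PHB_TABLE.get(dscp, ("default", "BE"))

    print(select_phb(46))   # ('priority', 'EF')
    print(select_phb(12))   # unknown codepoint -> ('default', 'BE')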

3.3.3. Service classes

3.3.3.1. DSCP codes

The DiffServ model was designed for use with the IP network layer (see sections 3.1.2 and 3.1.3). It uses the TOS field initially defined in the IPv4 protocol but rarely used until now and is an integral part of the IPv6 protocol for which it uses the Traffic Class (TC) field.

The TOS or TC field has been renamed in the DiffServ model; its first 6 bits carry the DSCP (Differentiated Services Code Point). The breakdown of this byte is detailed in Figure 3.18.

Figure 3.18. DSCP field in TOS IPv4 or TC IPv6 byte

ch3-fig3.18.gif

Class Selector: the ccc class selector (where c can be 0 or 1) helps to define the major service classes. These will be associated with PHBs which enable differentiated flow processing in the intermediary routers. The higher the value of the ccc selector, the higher the priority given to the corresponding flow.

Precedence: this field extends the class selectors with the help of three additional ppp bits that make it possible to define priorities. We then obtain an additional granularity (8 possible priorities per class selector);

Currently Unused: this field is not used at this time and is therefore ignored by intermediary routers. Its goal is to facilitate future protocol expansions.

At this moment, three PHB behaviors are defined for DiffServ:

– EF which offers accelerated processing with bandwidth, delay, loss ratio and jitter guarantees;

– AF, which guarantees a high probability of packet delivery, with more options (four traffic classes and three priority levels are defined);

– Best Effort (BE), which corresponds to the default service, without quality guarantees, offered over the Internet.

3.3.3.2. EF service

This service, which can be compared to a virtual leased line, is also named Premium because the associated traffic is serviced with high priority through routers with these characteristics:

– priority processing in queues;

– traffic parameters (bandwidth, delay, loss ratio and jitter) which are always compliant with SLA;

– outgoing traffic shaping when necessary to ensure an inter-domain QoS.

To make sure adequate bandwidth remains for the other flows, the associated traffic is usually limited to 10% of the total traffic.

This type of behavior is widely used for the transmission of real-time data such as voice or videoconferencing.

The DSCP associated with the EF service is equal to 101 110, which presently corresponds to the highest priority.

3.3.3.3. AF service

Since not all traffic on the Internet needs guarantees as high as those supplied by the EF service, a second service, AF, is offered in order to ensure a minimum, variable quality according to the applications, favoring the associated data flows over best effort, especially in the case of congestion.

The AF service offers multiple levels of packet transmission guarantee (see Table 3.3).

It is made up of a group of 4 service classes (AF4 to AF1), each having 3 drop precedences.

For example, when congestion occurs in service class 4, packets that have a high drop precedence value, i.e. a DSCP value of 100 110, will be dropped first.

The AF service can be used to implement the Olympic service, which has three classes: bronze, silver and gold, corresponding to AF1, AF2 and AF3.

Similarly, within each Olympic class, a priority level (1, 2 or 3) can be assigned.

Table 3.3. Point code for the different PHB

ch3-tab3.3.gif

3.3.4. DiffServ advantages and disadvantages

One of the major advantages is the capacity to limit the processing time in intermediary routers; network operators, who had a hard time justifying the complexity brought about by IntServ and its reservations, can thus implement DiffServ on their whole infrastructure.

The PHB (behaviors) normalization constitutes a second major advantage of DiffServ by simplifying the interconnection between the different DiffServ domains.

One of the disadvantages is the obligation to establish, ahead of time, a contract (SLA) within all the equipment of the domain.

This constraint implies a thorough knowledge of the applications that will go through the network and a policy defined centrally and distributed from specific servers (PDP – Policy Decision Point).

Behavior based on the aggregation of flows also implies a loss of granularity for the applications passing through the network, which in certain cases may be inappropriate.

3.4. MPLS architecture

3.4.1. Principle

MPLS (Multiprotocol Label Switching) is an architecture standardized by IETF that makes it possible to integrate and homogenize the different routing and switching protocols in place at different levels in the standard networks (Ethernet, IP, ATM, Frame Relay, etc.).

The major goal is to improve delays, and therefore QoS, in the nodes with multilevel rapid switching based on the identification of labels carried by frames or packets.

MPLS has the following characteristics:

– independent from the protocols at layers 2 and 3;

– supports layer 2 in IP, ATM, and Frame Relay networks;

– interaction with existing reservation and routing protocols (RSVP, OSPF);

– possibility of associating specific traffic profiles (FEC: Forward Equivalence Class) to labels.

Within MPLS, data transmission is done over label or LSP (Label Switched Path) switching routes.

LSPs correspond to a label sequence at each node in the route from source to destination.

Labels are identifiers specific to the lower-layer protocols (Ethernet MAC addresses, VPI/VCI fields in ATM cells, etc.) and are distributed according to LDP (Label Distribution Protocol).

Each data packet encapsulates and transports its label along its route.

Since fixed-length labels are inserted at the front of the frame or cell, high throughput switching becomes possible.

The nodes read the labels and switch the frames or the cells according to the value of these labels and of the switching tables previously established in the LSP (see Figure 3.19).

These nodes, depending on their location on the network, are either:

– LSR (Label Switch Router) for router or switch type equipment located at the core of an MPLS network, which is limited to reading labels and switching (IP addresses are not read by the LSR); or

– LER (Label Edge Routers) for a router/switch at the border of the access network or of the MPLS network that can support multiple ports connected to different networks (ATM, Frame Relay or Ethernet). The LERs play an important role in the assignment or deletion of labels for incoming or outgoing packets.

Figure 3.19. MPLS nodes and route

ch3-fig3.19.gif

3.4.2. MPLS label and classes

A label, in its simplest form, identifies the route that the packet must follow. The label is encapsulated and transported in the packet’s header. Once a packet is labeled, the rest of its route is based on label switching. The router that receives the packet analyzes its label and searches for the corresponding entry in its switching table in order to determine the output interface and the new label allocated to the packet. The label values have local significance and can be linked to a specific architecture in order to determine a virtual route (DLCI type for Frame Relay or VCI/VPI for ATM). The generic format of a label is illustrated in Figure 3.20. It is located between layers 2 and 3 or directly in the header of layer 2 (MAC addresses for Ethernet, VCI/VPI in ATM, etc.).

Figure 3.20. MPLS labels’ basic format

ch3-fig3.20.gif

Next to the label value, different fields are present to add functionality (a packing sketch follows this list):

– the experimental field is not standardized and may be used to control QoS, for example, by associating a DiffServ type PHB;

– the Stack bit takes the value 1 when the label is the last one of the stack (bottom of the stack) in a network interconnection with multiple label levels (VPI/VCI hierarchy of an ATM network, for example);

– as with IP, the TTL (Time To Live) field helps to prevent looping.
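As a sketch, the 32-bit generic label (20-bit label value, 3-bit experimental field, stack bit, 8-bit TTL) can be built and read as follows; the function names are illustrative:

    def pack_mpls_label(label: int, exp: int = 0, bottom_of_stack: bool = True, ttl: int = 64) -> int:
        """Assemble the 32-bit MPLS label entry from its four fields."""
        return (label << 12) | (exp << 9) | (int(bottom_of_stack) << 8) | ttl

    def unpack_mpls_label(word: int) -> dict:
        return {"label": word >> 12, "exp": (word >> 9) & 0b111,
                "stack": (word >> 8) & 0b1, "ttl": word & 0xFF}

    word = pack_mpls_label(label=9, exp=0b101, bottom_of_stack=True, ttl=64)
    print(hex(word), unpack_mpls_label(word))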

An FEC (Forwarding Equivalence Class) corresponds to a group of packets that have the same requirements, in terms of address prefixes or QoS.

Contrary to the other models, within MPLS a packet is only assigned once to an FEC, when entering the network.

Each LSR builds an LIB (Label Information Base) table in order to determine how a packet must be transmitted.

The labels are therefore associated with an FEC according to a logic or a policy based on different criteria: QoS, same source or destination address prefix, packets from the same application, affiliation to a VPN (Virtual Private Network).

3.4.3. MPLS routes

In the example illustrated by Figure 3.21, the FEC corresponds to the destination address prefix 18.1. When entering the MPLS network, the LER edge router adds to the incoming packet the label corresponding to this FEC, according to its LIB switching table.

The LSR core routers then swap the label according to the FEC and switch the packet.

The outgoing LER edge router removes the label and ensures the routing of the packet to its destination.

The traced LSP route corresponds, in this example, to the label sequence 9-7-1.
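The behavior of this example can be sketched with per-node switching tables; the node names are invented for illustration, and only the label values 9, 7 and 1 come from the figure description:

    # Hypothetical switching tables: incoming label -> (outgoing label, next node)
    LIB = {
        "LER-in":  {None: (9, "LSR-1")},   # ingress: FEC 18.1.* is mapped to label 9
        "LSR-1":   {9: (7, "LSR-2")},      # core: swap 9 -> 7
        "LSR-2":   {7: (1, "LER-out")},    # core: swap 7 -> 1
        "LER-out": {1: (None, "IP")},      # egress: pop the label, resume IP routing
    }

    def follow_lsp(start: str = "LER-in"):
        node, label, path = start, None, []
        while node != "IP":
            label, node = LIB[node][label]
            path.append((node, label))
        return path

    print(follow_lsp())
    # [('LSR-1', 9), ('LSR-2', 7), ('LER-out', 1), ('IP', None)]: the 9-7-1 LSP of Figure 3.21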

Figure 3.21. MPLS switching example

ch3-fig3.21.gif

The LER edge routers must go up to the network level to analyze the IP addresses and position the labels accordingly. The LSR core routers, once the tables have been set up (LDP protocol or routing protocol), play a simple label switching role (LSRs can be simple ATM switches). The routing process therefore requires two protocol levels:

– a routing protocol such as OSPF or BGP responsible for route distribution and for routing tables set-up;

– a label distribution protocol such as LDP that is responsible for the setting up of switching tables from routing tables and FECs.

3.5. QoS at level 2

3.5.1. QoS with ATM

The ATM technology was originally designed with QoS support in mind. Within its model, the ATM Adaptation Layer (AAL), located above the ATM layer (which ensures multiplexing and cell switching), is responsible for supplying a QoS to applications.

Five adaptation classifications have been defined for the different flows:

– AAL1 is designed for the support of voice applications or circuit emulation requiring constant throughput flows;

– AAL2 is designed for voice or video variable throughput flows;

– AAL 3/4 for data transmission in connected or connectionless mode, very rarely used in practice;

– AAL5, a simplified version of AAL 3/4, is widely used for secure data transport.

Table 3.4 shows the correspondence between AAL classifications and the services used for different connections or applications.

Table 3.4. AAL services and classes

ch3-tab3.4.gif

Moreover, when the ATM technology is used on a large scale to transport IP packets in a heterogeneous network like the Internet, its QoS functionality is rarely used:

– native ATM applications able to use the QoS parameters are scarce;

– if ATM has not been deployed end-to-end, its QoS functionality might be inefficient: queuing introduced by non-ATM routers has an influence on the delay and jitter calculation;

– ATM and TCP behave differently when encountering congestion: ATM discards cells and the terminal system detects the packet loss; TCP then reduces its transmission window and lowers its throughput, whereas the ATM congestion has already been dealt with. This problem can be avoided, however, if the ATM network is correctly sized to prevent large scale congestion.

3.5.2. QoS with Ethernet

Some QoS functions are planned in local networks such as Ethernet, in particular within the two complementary 802.1q and 802.1p norms.

The IEEE 802.1q norm is today the de facto standard for the identification of Ethernet frames within a Virtual Local Area Network (VLAN). The general principle is the addition, within each Ethernet frame destined to be transmitted from one switch to another, of an additional header containing in particular the identifier of the virtual network to which it belongs (VID: VLAN Identifier) and a field to establish priorities over the frames. The QoS mechanisms are not precisely defined by this standard, but the possibility, through interconnected Ethernet switches, of grouping within a VLAN stations with common characteristics enables the reduction of broadcast domains, whose traffic is very bandwidth intensive on Ethernet.

The 802.1q fields are inserted in the Ethernet 802.3 frame as shown in Figure 3.22 (a short sketch building this tag follows the list):

– the TPID (Tag Protocol ID) field indicates that the frame is signed according to the 802.1q protocol;

– the TCI (Tag Control Information) part contains:

   - the priority field on 3 bits as defined by the 802.1p standard,

   - the CFI (Canonical Format Indicator) field specifies that the MAC address format is standard,

   - the VLAN ID on 12 bits identifies the VLAN to which the frame belongs.
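As an illustration, the 4 bytes inserted can be assembled as follows (TPID 0x8100, then 3 priority bits, 1 CFI bit and a 12-bit VLAN ID; the example priority and VLAN values are arbitrary):

    import struct

    def dot1q_tag(priority: int, vlan_id: int, cfi: int = 0) -> bytes:
        """Build the 4-byte 802.1q tag inserted into an Ethernet frame."""
        tpid = 0x8100                                   # identifies an 802.1q-tagged frame
        tci = (priority << 13) | (cfi << 12) | (vlan_id & 0x0FFF)
        return struct.pack("!HH", tpid, tci)

    # Voice traffic (802.1p priority 6) on VLAN 100
    print(dot1q_tag(priority=6, vlan_id=100).hex())     # '8100c064'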

Figure 3.22. 802.1q frame format

ch3-fig3.22.gif

Priority or QoS at MAC level as defined by the 802.1p standard corresponds to different traffic classifications. The number of available traffic classifications is linked directly to the Ethernet switch capacity and its number of queues.

For a switch that has 7 queues, the 7 traffic classes that can be coded by the priority field will be available (the value 0 is reserved for best effort traffic). The network control traffic will correspond, for example, to the highest priority class, voice and video to classes 6 and 5 respectively, and the controlled load traffic to class 4.

If the switch does not have enough capacity, priorities will be grouped. For 2 queues, for example, priorities lower than or equal to 3 will correspond to traffic class 0 and priorities higher than 3 to traffic class 1.
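A minimal sketch of this grouping; the 2-queue threshold is the one given above, while the mappings for other queue counts are illustrative and would in practice depend on the switch:

    def priority_to_traffic_class(priority: int, nb_queues: int) -> int:
        """Map an 802.1p priority (0..7) to one of the switch's traffic classes."""
        if nb_queues >= 8:
            return priority                    # one queue per priority
        if nb_queues == 2:
            return 0 if priority <= 3 else 1   # grouping described above
        # other cases: spread the 8 priorities evenly over the available queues
        return priority * nb_queues // 8

    print(priority_to_traffic_class(5, 2))   # -> 1
    print(priority_to_traffic_class(2, 2))   # -> 0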

3.5.3. QoS with wireless networks

The 802.11 norm used in Wi-Fi networks defines, at MAC sub-layer level, two exchange coordination functions corresponding to two different access methods:

– DCF (Distributed Coordination Function) is not based on centralized management and enables the control of asynchronous data transport with equal chances of access to the medium for all stations (best effort type);

– PCF (Point Coordination Function) is based on polling (querying each terminal one after the other) from the access point (AP). This mode, rarely implemented and designed for real-time applications, enables some form of QoS since the AP can control priorities per station.

The 802.11e standard is specifically designed for QoS in Wi-Fi networks and adds two new access methods: EDCF (Enhanced DCF) and HCF (Hybrid Coordination Function).

3.5.3.1. EDCF access

A priority control mechanism is added to the DCF method. The frames are classified in distinct queues within the originating station according to eight priority levels that correspond to eight traffic levels or Traffic Categories (TC). The lowest level will be associated with best effort traffic, whereas the highest level may correspond to video flows, for example.

For each TC, three parameters controlling support access priority are defined:

– AIFS (Arbitration InterFrame Spacing) replaces DIFS (DCF InterFrame Spacing) with an equal or higher duration and controls the wait time between two frames according to the priority level (the highest priority frames wait only the standard minimum time DIFS);

– the CW (Contention Window) duration is still set according to the backoff algorithm (wait time before transmission once the medium becomes available, depending on a random value and on the number of attempts) while taking the TC into account: a station with higher priority will have a shorter CW;

– TxOP (Transmission Opportunity), a timer whose value is fixed according to the priority level, enables a differentiated transmission delay when several backoff timers expire at the same time (when two classes are ready to transmit, the one with higher priority will have a lower TxOP and will transmit first).

Figure 3.23 shows an example of EDCF priority access when 3 stations want to transmit frames of different priorities following a previous transmission. The AIFS interframe wait time and the duration of the CW are lower for the higher priority frame.
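A sketch of the resulting access delay per traffic category; the AIFS and CW values below are purely illustrative (the standard defines the actual values per access category):

    import random

    # Hypothetical per-TC parameters: higher priority -> shorter AIFS and CW
    EDCF_PARAMS = {            # tc: (aifs_slots, cw_min_slots)
        "voice":       (2, 3),
        "video":       (2, 7),
        "best_effort": (3, 15),
        "background":  (7, 31),
    }

    def access_delay_slots(tc: str) -> int:
        """Wait time (in slots) before a station of category tc may transmit."""
        aifs, cw_min = EDCF_PARAMS[tc]
        return aifs + random.randint(0, cw_min)   # AIFS plus the random backoff

    for tc in EDCF_PARAMS:
        print(tc, access_delay_slots(tc))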

Figure 3.23. EDCF priority access example

ch3-fig3.23.gif

3.5.3.2. HCF access

HCF is a hybrid access method that can be used during Contention Periods (CP) and during Contention Free Periods (CFP), combining the two access methods with and without access point control (PCF and EDCF).

In PCF mode, during CFP or CP periods, the access point controls the use of the medium, with the possibility for the stations to generate successive (burst) or periodic frames, as illustrated in Figure 3.24.

Figure 3.24. 802.11e HCF hybrid access

ch3-fig3.4.gif

1 Chapter written by Stéphane LOHIER.