Chapter 3

Optimization and Control Theoretic Analysis of Congestion Control

In this chapter, optimization theory and control theory are applied to a differential equation–based fluid flow model of the packet network. This results in the decomposition of the global optimization problem into independent optimizations at each source node, and the explicit derivation of the optimal source rate control rules as a function of a network-wide utility function. In the case of TCP Reno, the source rate control rules are known, but then the theory can be used to derive the network utility function that TCP Reno optimizes. In addition to the mathematical elegance of these results, the results of this theory have been used recently to obtain optimal congestion control algorithms using machine learning techniques. The system stability analysis techniques using the Nyquist criterion, which is introduced in this chapter, have become an essential tool in the study of congestion control algorithms. By using a linear approximation to the delay-differential equation describing the system dynamics, this technique enables us to derive explicit conditions on system parameters to achieve a stable system.

Keywords

congestion control and optimization theory; congestion control and optimal control theory; fluid flow model of congestion control; congestion control and Nyquist stability; congestion control Lagrangian optimization; congestion control dual optimization problem; congestion control primal optimization problem; Active Queue Management; AQM; Random Early Detection; RED; proportional controller; proportional plus integral controller; the averaging principle; TCP utility function

3.1 Introduction

In this chapter, we introduce a network-wide model of congestion control in the fluid limit and use it to ask the following questions: Can the optimal congestion control laws be derived as a solution to an optimization problem? If so, what is the utility function being optimized? What is meant by stability of this system, and what are the conditions under which the system is stable? This line of investigation was pursued by Kelly [1], Kelly et al. [2], Low [3], Low et al. [4], Kunniyur and Srikant [5], and Hollot et al. [6,7], among others, and their results form the subject matter of this chapter.

As in Chapter 2, we continue to use fluid flow models for the system. Unlike the models used in Chapter 2, the models in this chapter are used to represent an entire network with multiple connections. These models are analogous to the “mean field” models in physics, in which phenomena such as magnetism are represented using similar ideas. They enable the researcher to capture the most important aspects of the system in a compact manner, such that the impact of important system parameters can be analyzed without worrying about per-packet level details.

It has been shown that by applying Lagrangian optimization theory to a fluid flow model of the network, it is possible to decompose the global optimization problem into independent local optimization problems at each source. Furthermore, the Lagrange multiplier that appears in the solution can be interpreted as the congestion feedback coming from the network. This is an elegant theoretical result that provides a justification for the way congestion control protocols are designed. Indeed, TCP can be put into this theoretical framework by modeling it in the fluid limit, and then the theory enables us to compute the global utility function that TCP optimizes. Alternatively, we can derive new congestion control algorithms by starting from a utility function and then using the theory to compute the optimal rate function at the source nodes.

The fluid flow model can also be used to answer questions about TCP’s stability as link speeds and round trip latencies are varied. This is done by applying tools from classical control theory to a linearized version of the differential equations obeyed by the congestion control algorithm. This technique leads to some interesting results, such as the fact that TCP Reno with Active Queue Management (AQM) is inherently unstable, especially when a combination of high bandwidth and large propagation delay is encountered. The same approach has been used to analyze the Random Early Detection (RED) controller and discover suitable parameter settings for it. It has also been used to find other controllers that perform better than RED.

The techniques developed in this chapter for analyzing TCP constitute a useful toolkit that can be used to analyze other congestion control algorithms. In recent years, algorithms such as Data Center TCP (DCTCP) and the IEEE 802.1 Quantized Congestion Notification (QCN) have been analyzed using these methods. We end the chapter with a discussion of a recent result called the averaging principle (AP), which shows the equivalence of a proportional-integral (PI) type AQM and a special type of rate control rule at the source.

The rest of this chapter is organized as follows: In Section 3.2, we use Lagrangian optimization theory to analyze the congestion control problem and derive an expression for the utility function for TCP Reno. In Section 3.3, we introduce and analyze generalized additive increase/multiplicative decrease (GAIMD) algorithms. Section 3.4 discusses the application of control theory to the congestion control problem and the derivation of system stability criteria, and Section 3.5 explores a recent result called the AP that has proven to be very useful in designing congestion control algorithms.

An initial reading of this chapter can be done in the following sequence: 3.1→3.2→3.4 (3.4.1 and 3.4.1.1), which covers the basic results needed to understand the material in Part 2 of the book. The most important concepts covered in these sections are that of the formulation of congestion control as the solution to an optimization problem and the application of classical Nyquist stability criteria to congestion control algorithms. More advanced readers can venture into Sections 3.3, 3.4.1.2, 3.4.1.3, and 3.5, which cover the topics of GAIMD algorithms, advanced AQM controllers, and the AP.

3.2 Congestion Control Using Optimization Theory

Consider a network with L links and N sources (Figure 3.1).

Figure 3.1 Illustration of the model for N sources sharing a link.

Define the following:

C_l: Capacity of the l^th link, for $1 \leq l \leq L$ ; it is the l^th element of the column vector C

L_i: Set of links that are used by source i

X_li: Element of the L×N routing matrix X, such that X_li=1 if $l \in L_{i}$ , and 0 otherwise

R_i(t): Transmission rate of source i, for $1 \leq i \leq N$

r_i: Steady-state value of R_i(t)

Y_l(t): Aggregate rate at link l from all the N sources, for $1 \leq l \leq L$

y_l: Steady-state value of Y_l(t)

P_l(t): Congestion measure at link l, for $1 \leq l \leq L$ . This is later identified as the buffer occupancy at the link.

p_l: Steady-state value of P_l(t)

$τ_{l i}^{f} (t)$ : Propagation+Transmission+Queuing delay between the ith source and link l, in the forward direction

$τ_{l i}^{b} (t)$ : Propagation+Transmission+Queuing delay between the ith source and link l, in the backward direction

$T_{i} (t) = τ_{l i}^{f} (t) + τ_{l i}^{b} (t)$ : Total round trip delay

Q_i(t): Aggregate of all congestion measures for source i, along its route for $1 \leq i \leq N$

q_i: Steady-state value of Q_i(t)

b_l(t): Buffer occupancy at the link l

Note that

$Y_{l} (t) = \sum_{i = 1}^{N} X_{l i} R_{i} (t - τ_{l i}^{f} (t)), \quad 1 \leq l \leq L,$ (1)

and

$Q_{i} (t) = \sum_{l = 1}^{L} X_{l i} P_{l} (t - τ_{l i}^{b} (t)), \quad 1 \leq i \leq N .$ (2)

In steady state,

$y = X r \quad \text{and} \quad q = X^{T} p .$ (3)

Assume that the equilibrium data rate is given by

$r_{i} = f_{i} (q_{i}), 1 \leq i \leq N$ (4)

where f_i is a positive, strictly monotone decreasing function. This is a natural assumption to make because if the congestion along a source’s path increases, then it should lead to a decrease in its data rate.

Define

$U_{i} (r_{i}) = \int f_{i}^{- 1} (r_{i}) \, d r_{i} \quad \text{so that} \quad \frac{d U_{i} (r_{i})}{d r_{i}} = f_{i}^{- 1} (r_{i}), \quad 1 \leq i \leq N$ (5)

Because U_i has a positive, strictly decreasing derivative (the inverse of the decreasing function f_i is itself decreasing), it follows that U_i is monotone increasing and strictly concave.

By construction, the equilibrium rate r_i is the solution to the maximization problem

$\max_{r_{i}} [U_{i} (r_{i}) - r_{i} q_{i}] .$ (6)

This equation has the following interpretation: If U_i(r_i) is the utility that the source attains as a result of transmitting at rate r_i, and q_i is the price per unit of data that it is charged by the network, then Equation 6 maximizes the source’s profit.

Note that Equation 6 is an optimization carried out by each source independently of the others (i.e., the solution r_i is individually optimal). We wish to show that r_i is also the solution to the following global optimality problem:

$\max_{r \geq 0} \sum_{i = 1}^{N} U_{i} (r_{i})$ (7)

subject to

$X r \leq C .$ (8)

Equations 7 and 8 constitute what is known as the primal problem. A unique maximizer, called the primal optimal solution, exists because the objective function is strictly concave and the feasible solution set is compact. A fully distributed implementation to solve the optimality problem described by equations 7 and 8 is not possible because the sources are coupled to each other through the constraint (equation 8).

There are two ways to approach this problem:

1. By modifying the objective function (equation 7) for the primal problem, by adding an extra term called the penalty or barrier function, or

2. By solving the problem that is dual to that described by equations 7 and 8.

In this chapter, we pursue the second option and solve the dual problem (we will briefly describe the solution based on the primal problem in Section 3.7). It can be shown that the solution to the primal problem leads to a direct feedback of buffer-related data without any averaging at the nodes, and the solution to the dual problem leads to processing at the nodes before feedback of more explicit information [1]. In general, we will show that the dual problem leads to congestion feedback that is proportional to the queue size at the congested node, and hence is more appropriate for modeling congestion control systems with AQM algorithms operating at the nodes.

The duality method is a way to solve equations 7 and 8 (Appendix 3.D), which naturally leads to a distributed implementation, as shown next. Let $λ_{l}, 1 \leq l \leq L$ , be the Lagrange multipliers (one per link) and define the Lagrangian L(r, $λ$ ) for equations 7 and 8 by

$\begin{aligned} L (r, λ) & = \sum_{i = 1}^{N} U_{i} (r_{i}) - \sum_{l = 1}^{L} λ_{l} (y_{l} - C_{l}) \\ & = \sum_{i = 1}^{N} U_{i} (r_{i}) - \sum_{l = 1}^{L} λ_{l} \sum_{i = 1}^{N} X_{l i} r_{i} + \sum_{l = 1}^{L} λ_{l} C_{l} \\ & = \sum_{i = 1}^{N} U_{i} (r_{i}) - \sum_{i = 1}^{N} r_{i} \sum_{l = 1}^{L} X_{l i} λ_{l} + \sum_{l = 1}^{L} λ_{l} C_{l} \\ & = \sum_{i = 1}^{N} [U_{i} (r_{i}) - {\bar{q}}_{i} r_{i}] + \sum_{l = 1}^{L} λ_{l} C_{l} \end{aligned}$ (9)

where

${\bar{q}}_{i} = \sum_{l = 1}^{L} X_{l i} λ_{l}, \quad 1 \leq i \leq N .$

The dual function is defined by

$\begin{aligned} D (λ) & = \max_{r_{i} \geq 0} L (r, λ) \\ & = \sum_{i = 1}^{N} \max_{r_{i} \geq 0} [U_{i} (r_{i}) - {\bar{q}}_{i} r_{i}] + \sum_{l = 1}^{L} λ_{l} C_{l} \end{aligned}$ (10)

Note that the values

$r_{i}^{\max} = {U_{i}'}^{- 1} ({\bar{q}}_{i}) = {U_{i}'}^{- 1} (\sum_{l = 1}^{L} X_{l i} λ_{l}), 1 \leq i \leq N$ (11)

that maximize $L (r, λ)$ can be computed separately by each source, in N separate subproblems, without the need to coordinate with the other sources. However, as equation 11 shows, a source needs information from the network, in the form of ${\bar{q}}_{i}$ , before it can compute its optimum rate. Hence, to complete the solution, we need to solve the dual problem, that is, find the multipliers $λ_{l}, 1 \leq l \leq L$ , that attain

$\min_{λ \geq 0} D (λ)$ (12)

and substitute them into equation 11 for $r_{i}^{\max}$ . The convex duality theorem then states that the optimum $r_{i}^{\max}$ computed in equation 11 also maximizes the original primal problem (equation 7).

The dual problem (equation 12) can be solved using the gradient projection method [8], such that

$λ_{l}^{n + 1} = {[λ_{l}^{n} - γ \frac{\partial D (λ)}{\partial λ_{l}}]}^{+}$ (13)

where $γ > 0$ is the step size and [z]⁺=max{z,0}. From equation 10, it follows that

$\frac{\partial D (λ)}{\partial λ_{l}} = C_{l} - \sum_{i = 1}^{N} X_{l i} r_{i}^{\max} = C_{l} - y_{l} (r^{\max}), 1 \leq l \leq L$ (14)

Substituting equation 14 back into equation 13, we get

$λ_{l}^{n + 1} = {[λ_{l}^{n} + γ (y_{l} (r^{\max}) - C_{l})]}^{+}$ (15)

Note that the Lagrange multipliers $λ_{l}$ behave as a congestion measure at the link because this quantity increases when the aggregate traffic rate at the link y_l(r^max) exceeds the capacity C_l of the link and conversely decreases when the aggregate traffic falls below the link capacity. Hence, it makes sense to identify the Lagrange multipliers $λ_{l}$ with the link congestion measure p_l, so that $λ_{l} = p_{l}, 1 \leq l \leq L$ , and

$p_{l}^{n + 1} = {[p_{l}^{n} + γ (y_{l} (r^{\max}) - C_{l})]}^{+}$ (16a)

or in the fluid limit

$\frac{d p_{l}}{d t} = \begin{cases} γ (y_{l} (t) - C_{l}) & \text{if } p_{l} (t) > 0 \\ γ {(y_{l} (t) - C_{l})}^{+} & \text{if } p_{l} (t) = 0 \end{cases}$ (16b)

Equations 11 and 16 constitute the solutions to the dual problem. This solution can be implemented in a fully distributed way at the N sources and L links, in the following way:

At link l, $1 \leq l \leq L$

1. Link l obtains an estimate of the total rate of the traffic from all sources that pass through it, y_l.

2. It periodically computes the congestion measure $p_{l}$ using equation 16a, and this quantity is communicated to all the sources whose route passes through link l. This communication can be either explicit, as in ECN schemes, or implicit, as in random packet drops with RED.

At source i, $1 \leq i \leq N$

1. Source i periodically computes the aggregate congestion measure for all the links which lie along its route given by

$q_{i}^{n} = \sum_{l = 1}^{L} X_{l i} p_{l}^{n}$ (17)

2. Source i periodically chooses its new rate using the formula

$r_{i}^{n} = {U_{i}'}^{- 1} (q_{i}^{n})$ (18)

From equations 17 and 18, we obtain the rate control equations for the congestion control problem, if the utility function U_i is known. This distributed procedure is strongly reminiscent of the way TCP congestion control operates, and hence it will not come as a surprise that it can be put into this framework. Hence, TCP congestion control can be interpreted as a solution to the problem of maximizing a global network utility function (equation 7) under the constraints (equation 8). In the next section, we obtain expressions for the utility function U for TCP Reno.
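The distributed procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the utility is taken to be $U_{i} (r_{i}) = w_{i} \log r_{i}$ (so that equation 18 becomes $r_{i} = w_{i} / q_{i}$ ), and the function name, step size, iteration count, and small guard against a zero price are hypothetical choices, not part of the text.

```python
def dual_gradient_projection(routing, capacity, weights, gamma=0.05, iterations=5000):
    """Iterate equations 16a, 17, and 18 for U_i(r) = w_i * log(r) (assumed utility).

    routing[l][i] is X_li, capacity[l] is C_l, weights[i] is w_i.
    Returns the final source rates r_i and link prices p_l.
    """
    num_links = len(routing)
    num_sources = len(routing[0])
    prices = [1.0] * num_links      # p_l, arbitrary positive starting point
    rates = [0.0] * num_sources     # r_i
    for _ in range(iterations):
        # Source side: aggregate path price (equation 17), then rate (equation 18).
        for i in range(num_sources):
            q_i = sum(routing[l][i] * prices[l] for l in range(num_links))
            rates[i] = weights[i] / max(q_i, 1e-9)  # guard against q_i = 0 (assumption)
        # Link side: projected gradient step on the price (equation 16a).
        for l in range(num_links):
            y_l = sum(routing[l][i] * rates[i] for i in range(num_sources))
            prices[l] = max(prices[l] + gamma * (y_l - capacity[l]), 0.0)
    return rates, prices


if __name__ == "__main__":
    # Two unit-capacity links; source 0 crosses both, sources 1 and 2 cross one each.
    X = [[1, 1, 0],
         [1, 0, 1]]
    r, p = dual_gradient_projection(X, [1.0, 1.0], [1.0, 1.0, 1.0])
    print("rates:", r)
    print("prices:", p)
```

For this two-link example, solving the optimality conditions by hand gives rates of 1/3 for the two-hop source and 2/3 for each one-hop source, with both link prices equal to 1.5, so the iteration should approach these values.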

Note that for the case $γ = 1$ , equation 16b is precisely the equation satisfied by the queue size process at the node; hence, the feedback variable p_l(t) can be identified as the queue size at link l. AQM type schemes can also be put in this framework, as explained next.

The RED algorithm can be described as follows: Let b_l be the queue length at node l and let b_l^av be its average; then they satisfy the following equations:

$\begin{array}{l} b_{l}^{n + 1} = {[b_{l}^{n} + y_{l}^{n} - C_{l}]}^{+} \\ b_{l}^{a v, n + 1} = (1 - α_{l}) b_{l}^{a v, n} + α_{l} b_{l}^{n} \end{array}$ (19)

In the fluid limit, these become

$\frac{d b_{l} (t)}{d t} = \begin{cases} y_{l} (t) - C_{l} & \text{if } b_{l} (t) > 0 \\ {[y_{l} (t) - C_{l}]}^{+} & \text{if } b_{l} (t) = 0 \end{cases}$ (20)

and

$\frac{d b_{l}^{a v} (t)}{d t} = - α_{l} C_{l} (b_{l}^{a v} (t) - b_{l} (t))$ (21)

for some constant $0 \leq α_{l} \leq 1$ . Then the dropping probability p_l is given by

$p_{l}^{n} = \begin{cases} 0 & \text{if } b_{l}^{a v, n} < B_{\min} \\ K (b_{l}^{a v, n} - B_{\min}) & \text{if } B_{\min} < b_{l}^{a v, n} < B_{\max} \\ 1 & \text{if } b_{l}^{a v, n} > B_{\max} \end{cases}$ (22)

If we ignore the queue length averaging and let B_min=0 and consider only the linear portion of equation 22, then the dropping probability becomes

$p_{l}^{n} = K b_{l}^{n}$ (23)

This is referred to as a proportional controller and is discussed further in Section 3.4.1.2 of this chapter. Taking the fluid limit and using equation 20, we get

$\frac{d p_{l} (t)}{d t} = K \frac{d b_{l} (t)}{d t} = \begin{cases} K (y_{l} (t) - C_{l}) & \text{if } b_{l} (t) > 0 \\ K {[y_{l} (t) - C_{l}]}^{+} & \text{if } b_{l} (t) = 0 \end{cases}$ (24)

Equation 24 has exactly the form of equation 16b, which was derived from the gradient projection method for solving the dual problem. Hence, a proportional controller–type RED arises as a natural consequence of solving the dual problem at the network nodes.
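As a concrete illustration of the RED description in equations 19 and 22, the following Python sketch implements one update step of the queue, its running average, and the resulting drop probability. The function names and the numerical values in the usage lines are hypothetical, and the treatment of the boundary points (exactly at B_min or B_max, which equation 22 leaves unspecified) is an assumption.

```python
def red_queue_update(queue, arrivals, capacity):
    """First line of equation 19: b <- [b + y - C]^+ over one time step."""
    return max(queue + arrivals - capacity, 0.0)


def red_average(avg_queue, queue, alpha):
    """Second line of equation 19: exponentially weighted average of the queue."""
    return (1.0 - alpha) * avg_queue + alpha * queue


def red_drop_probability(avg_queue, b_min, b_max, slope):
    """Equation 22: zero below b_min, linear with slope K in between, one above b_max.

    Values exactly at b_min or b_max are assigned to the linear segment (assumption).
    """
    if avg_queue < b_min:
        return 0.0
    if avg_queue > b_max:
        return 1.0
    return min(slope * (avg_queue - b_min), 1.0)


if __name__ == "__main__":
    b, b_av = 0.0, 0.0
    for step in range(5):
        b = red_queue_update(b, arrivals=12.0, capacity=10.0)
        b_av = red_average(b_av, b, alpha=0.2)
        print(step, b, b_av, red_drop_probability(b_av, 1.0, 30.0, 0.005))
```

Setting `b_min = 0` and skipping the averaging step reduces the last function to the proportional controller of equation 23.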

3.2.1 Utility Function for TCP Reno

There are two ways in which the theoretical results from Section 3.2 can be used in practice:

• Given the system dynamics and the source rate function f (equation 4), compute the network utility function U (equation 5) that this rate function optimizes.

• Given a network utility function U, find the system dynamics and the source rate function that optimizes this utility function.

In this section, we use the first approach, in which a fluid model description of TCP Reno is used to derive the network utility function that it optimizes. We start by deriving an equation for TCP Reno’s window dynamics in the fluid limit.

Note that the total round trip delay is given by

$T_{i} (t) = D_{i} + \sum_{l} X_{l i} \frac{b_{l} (t)}{C_{l}}$ (25)

where $D_{i} = T_{i d} + T_{i u}$ is the total propagation delay and b_l(t) is the queue length at link l at time t. The source i rate R_i(t) at time t is defined by

$R_{i} (t) = \frac{W_{i} (t)}{T_{i} (t)}$ (26)

Using the notation from Section 3.2, the aggregate congestion measure Q_i(t) at a source can be written as

$Q_{i} (t) = 1 - \prod_{l \in L_{i}} (1 - P_{l} (t - τ_{l i}^{b} (t))) \approx \sum_{l \in L_{i}} P_{l} (t - τ_{l i}^{b} (t))$ (27)

because it takes $τ_{l i}^{b} (t)$ seconds for the ACK feedback from link l to reach back to source i. Note that equation 27 is in the form required in equation 2. The aggregate flow rate at link l is given by

$Y_{l} (t) = \sum_{i} X_{l i} R_{i} (t - τ_{l i}^{f} (t))$ (28)

because it takes $τ_{l i}^{f} (t)$ seconds for the data rate at source i to reach link l.

The rate of change in window size at source i for TCP Reno is then given by the following equation in the fluid limit:

$\frac{d W_{i} (t)}{d t} = \frac{R_{i} (t - T_{i} (t)) [1 - Q_{i} (t)]}{W_{i} (t)} - R_{i} (t - T_{i} (t)) Q_{i} (t) \frac{1}{2} \frac{4 W_{i} (t)}{3}$ (29)

The first term on the Right Hand Side (RHS) of equation 29 captures the rate of increase in window size, which is the rate at which positive ACKs return to the source multiplied by the increase in window size caused by each such ACK during the congestion avoidance phase. The second term captures the rate of decrease of the window size when the source gets positive congestion indications from the network (the factor 4/3 is added to account for the small-scale fluctuation of W_i(t); see Low [3]). Note that equation 29 ignores the change in window size caused by timeouts.

We now make the following approximations: We ignore the queuing delays in equation 25 so that the round trip delay is now fixed and approximated by

$T_{i} = D_{i} + \sum_{l} \frac{X_{l i}}{C_{l}}$ (30)

The throughput is then approximated by

$R_{i} (t) \approx \frac{W_{i} (t)}{T_{i}} .$ (31)

It follows from (29) and (31) that

$\frac{d R_{i} (t)}{d t} \approx \frac{R_{i} (t - T_{i}) [1 - Q_{i} (t)]}{R_{i} (t) T_{i}^{2}} - \frac{2 R_{i} (t) R_{i} (t - T_{i}) Q_{i} (t)}{3}$ (32)

We now assume that R_i(t)≈R_i(t−T_i), so that equation 32 reduces to

$\frac{d R_{i} (t)}{d t} \approx \frac{[1 - Q_{i} (t)]}{T_{i}^{2}} - \frac{2 R_{i}^{2} (t) Q_{i} (t)}{3}$ (33)

Note that we ignored the propagation delays in the simplifications that led to equation 33. However, modeling these delays is critical in doing a stability analysis of the system, which we postpone to Section 3.4.
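To get a feel for equation 33, the following Python sketch integrates it with a simple forward Euler scheme for a constant congestion measure q and compares the result with the rate obtained by setting the right-hand side of equation 33 to zero. The function names, step size, and parameter values are hypothetical; holding q fixed is a simplifying assumption, because in the full model q responds to the source rates.

```python
import math


def reno_rate_derivative(rate, q, rtt):
    """Right-hand side of equation 33 for a constant congestion measure q."""
    return (1.0 - q) / rtt ** 2 - 2.0 * rate ** 2 * q / 3.0


def integrate_reno(q, rtt, rate0=1.0, dt=1e-3, steps=20000):
    """Forward Euler integration of equation 33, starting from rate0."""
    rate = rate0
    for _ in range(steps):
        rate += dt * reno_rate_derivative(rate, q, rtt)
    return rate


def reno_equilibrium_rate(q, rtt):
    """Rate at which the right-hand side of equation 33 vanishes."""
    return math.sqrt(1.5 * (1.0 - q) / q) / rtt


if __name__ == "__main__":
    q, rtt = 0.01, 0.1  # 1% congestion measure, 100 ms round trip (hypothetical)
    print("integrated :", integrate_reno(q, rtt))
    print("closed form:", reno_equilibrium_rate(q, rtt))
```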

In steady state, it follows from equation 33 that

$r_{i} = \frac{1}{T_{i}} \sqrt{\frac{3}{2} \frac{1 - q_{i}}{q_{i}}}$ (34)

so that we again recover the square-root formula. Its inverse is given by

$q_{i} = \frac{\frac{1}{T_{i}^{2}}}{\frac{1}{T_{i}^{2}} + \frac{2}{3} r_{i}^{2}} = \frac{1}{1 + \frac{2}{3} r_{i}^{2} T_{i}^{2}}$ (35)

which is in the form postulated in equation 4. From equation 5, it follows that the utility function for TCP Reno is given by

$U_{i} (r_{i}) = \int \frac{\frac{1}{T_{i}^{2}}}{(\frac{1}{T_{i}^{2}} + \frac{2}{3} r_{i}^{2})} d r_{i} = \frac{\sqrt{3 / 2}}{T_{i}} \tan^{- 1} (\sqrt{\frac{2}{3}} r_{i} T_{i})$ (36)

Note that for small values of q_i, equation 34 reduces to the expression that is familiar from Chapter 2

$r_{i} = \frac{1}{T_{i}} \sqrt{\frac{3}{2 q_{i}}}$ (37)

If equation 37 is used in equation 5 instead of equation 34, then

$q_{i} = \frac{3}{2 r_{i}^{2} T_{i}^{2}}$

and the utility function assumes the simpler form

$U_{i} (r_{i}) = - \frac{1.5}{T_{i}^{2} r_{i}}$ (38)

Equation 38 is of the form

$U_{i} (r_{i}) = - \frac{w_{i}}{r_{i}} \quad \text{with} \quad w_{i} = \frac{3}{2 T_{i}^{2}}$ (39)

Utility functions of the form of equation 39 are known to lead to rate allocations that achieve “minimum potential delay fairness” in the network (Appendix 3.E); that is, they minimize the overall potential delay of transfers in progress.
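The relationship between the TCP Reno utility function and the congestion measure can be checked numerically. The sketch below, with hypothetical function names and parameter values, evaluates equation 36 and verifies by a central finite difference that its derivative with respect to the rate matches equation 35, as equation 5 requires.

```python
import math


def reno_utility(rate, rtt):
    """Equation 36: the utility function that TCP Reno optimizes."""
    return math.sqrt(1.5) / rtt * math.atan(math.sqrt(2.0 / 3.0) * rate * rtt)


def reno_price(rate, rtt):
    """Equation 35: the congestion measure q_i as a function of the rate."""
    return 1.0 / (1.0 + (2.0 / 3.0) * (rate * rtt) ** 2)


def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    rate, rtt = 50.0, 0.1  # packets per second and seconds (hypothetical values)
    slope = numerical_derivative(lambda r: reno_utility(r, rtt), rate)
    print("dU/dr from equation 36:", slope)
    print("q from equation 35    :", reno_price(rate, rtt))
```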

If the utility function is of the form

$U_{i} (r_{i}) = w_{i} \log r_{i}$ (40)

then it maximizes the “proportional fairness” in the network. It can be shown that TCP Vegas’s utility function is of the form of equation 40 [9]; hence, it achieves proportional fairness.

A utility function of type

$U_{i} (r_{i}) = \lim_{α \to \infty} \frac{r_{i}^{1 - α}}{1 - α}$ (41)

leads to a max-min fair allocation of rates [10]. Max-min fairness is in some sense the ideal way of allocating bandwidth in a network; however, in general, it is difficult to achieve max-min fairness using AIMD type algorithms. It can be achieved by using explicit calculations at the network nodes, as was done by the ATM ABR scheme.
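The three utility functions in equations 39 to 41 can be viewed as members of one family indexed by the exponent α in equation 41. The sketch below assumes the commonly used weighted form $w_{i} r_{i}^{1 - α} / (1 - α)$ , with the logarithm taken as the α=1 member; the weight argument and the function name are illustrative assumptions rather than something stated in the text.

```python
import math


def alpha_fair_utility(rate, alpha, weight=1.0):
    """Weighted utility w * r^(1 - alpha) / (1 - alpha), with w * log(r) at alpha = 1.

    alpha = 1 gives the form of equation 40, alpha = 2 gives the form of
    equation 39, and letting alpha grow without bound corresponds to equation 41.
    """
    if alpha == 1.0:
        return weight * math.log(rate)
    return weight * rate ** (1.0 - alpha) / (1.0 - alpha)


if __name__ == "__main__":
    for alpha in (1.0, 2.0, 10.0):
        print(alpha, alpha_fair_utility(4.0, alpha))
```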

The theory developed in the previous two sections forms a useful conceptual framework for designing congestion control algorithms. In Chapter 9, we discuss the recent application of these ideas to the design of congestion control algorithms, which also uses techniques from machine learning theory.

3.3 Generalized TCP–Friendly Algorithms

In this section, we use the theory developed in Section 3.2 to analyze a generalized version of the TCP congestion control algorithm with nonlinear window increment–decrement rules, called GAIMD. We then derive conditions on the GAIMD parameters to ensure that if a GAIMD flow and a TCP Reno flow pass through the same bottleneck link, then they share the available bandwidth fairly. This line of investigation was originally pursued to design novel congestion control algorithms for traffic sources such as video, which may not do well with TCP Reno.

Define the following:

$α_{s} (R_{s} (t))$ : Rule for increasing the data rate in the absence of congestion

$β_{s} (R_{s} (t))$ : Rule for decreasing the rate in the presence of congestion

Then the rate dynamics is governed by the following equation:

$\frac{d R_{s} (t)}{d t} = (1 - Q_{s} (t)) R_{s} (t - T_{s}) α_{s} (R_{s} (t)) - Q_{s} (t) R_{s} (t - T_{s}) β_{s} (R_{s} (t))$ (42)

Comparing equation 42 with equation 32 (and omitting the 4/3 correction factor introduced in equation 29), it follows that for TCP Reno,

$α_{s} (R_{s}) = \frac{1}{R_{s} T_{s}^{2}} \quad \text{and} \quad β_{s} (R_{s}) = \frac{R_{s}}{2}$ (43)

From equation 42, it follows that in equilibrium,

$q_{s} = \frac{α_{s} (r_{s})}{α_{s} (r_{s}) + β_{s} (r_{s})} = f_{s}^{- 1} (r_{s})$ (44)

so that the utility function for GAIMD is given by

$U_{s} (r_{s}) = \int \frac{α_{s} (r_{s})}{α_{s} (r_{s}) + β_{s} (r_{s})} d r_{s}$ (45)

Define an algorithm to be TCP-friendly if its utility function coincides with that of TCP Reno. Substituting the TCP Reno functions of equation 43 into equation 44, it follows that an algorithm with increase–decrease functions given by $(α_{s}, β_{s})$ is TCP-friendly if and only if

$\frac{α_{s} (r_{s})}{α_{s} (r_{s}) + β_{s} (r_{s})} = \frac{2}{2 + r_{s}^{2} T_{s}^{2}}, \quad \text{i.e.,} \quad \frac{α_{s} (r_{s})}{β_{s} (r_{s})} = \frac{2}{r_{s}^{2} T_{s}^{2}}$ (46)

Following Bansal and Balakrishnan [11], we now connect the rate increase–decrease rules to the rules used for incrementing and decrementing the window size.

Consider a nonlinear window increase–decrease function parameterized by exponents (k,l) and constants $(α, β)$ and of the following form (these rules are applied on a per RTT basis):

$W \leftarrow W + \frac{α}{W^{k}} \quad \text{on positive ACK}$ (47a)

$W \leftarrow W - β W^{l} \quad \text{on packet drop}$ (47b)

Equations 47a and 47b are a generalization of the AIMD algorithm to nonlinear window increase and decrease and hence are known as GAIMD algorithms. For (k,l)=(0,1), we get AIMD; for (k,l)=(−1,1), we get multiplicative increase/multiplicative decrease (MIMD); for (k,l)=(−1,0), we get multiplicative increase/additive decrease (MIAD); and for (k,l)=(0,0), we get additive increase/additive decrease (AIAD).
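The window rules of equations 47a and 47b translate directly into code. The following Python sketch is a minimal illustration; the function names and the lower bound on the window are assumptions added to keep the example well behaved, and the parameter values in the usage lines are hypothetical.

```python
def gaimd_increase(window, alpha, k):
    """Equation 47a: per-RTT window increase in the absence of congestion."""
    return window + alpha / window ** k


def gaimd_decrease(window, beta, l, min_window=1.0):
    """Equation 47b: window decrease on a packet drop.

    The floor at min_window is an assumption, not part of equation 47b.
    """
    return max(window - beta * window ** l, min_window)


if __name__ == "__main__":
    w = 10.0
    # (k, l) = (0, 1): AIMD, that is, add alpha per RTT and cut by a factor beta.
    print("AIMD:", gaimd_increase(w, 1.0, 0), gaimd_decrease(w, 0.5, 1))
    # (k, l) = (-1, 1): MIMD, that is, both steps scale with the window.
    print("MIMD:", gaimd_increase(w, 0.1, -1), gaimd_decrease(w, 0.5, 1))
    # (k, l) = (0, 0): AIAD, that is, both steps are constants.
    print("AIAD:", gaimd_increase(w, 1.0, 0), gaimd_decrease(w, 2.0, 0))
```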

Using the same arguments as for TCP Reno, the window dynamics for this algorithm are given by

$\frac{d W_{s} (t)}{d t} = (1 - Q_{s} (t)) R_{s} (t - T_{s}) \frac{α}{W_{s}^{k + 1} (t)} - Q_{s} (t) R_{s} (t - T_{s}) β W_{s}^{l} (t)$

Substituting $W_{s} (t) = R_{s} (t) T_{s}$ , it follows that

$\frac{d R_{s} (t)}{d t} = (1 - Q_{s} (t)) R_{s} (t - T_{s}) \frac{α}{R_{s}^{k + 1} (t) T_{s}^{k + 2}} - Q_{s} (t) R_{s} (t - T_{s}) β R_{s}^{l} (t) T_{s}^{l - 1}$ (48)

Comparing this equation with equation 42, we obtain the following expressions for the rate increase–decrease functions for the GAIMD source that corresponds to equations 47a and 47b:

$α_{s} (r_{s}) = \frac{α}{r_{s}^{k + 1} T_{s}^{k + 2}} \quad \text{and} \quad β_{s} (r_{s}) = β r_{s}^{l} T_{s}^{l - 1}$ (49)

Substituting equation 49 back into the TCP friendliness criterion from equation 46 yields

$\frac{α}{β} \frac{1}{{(r_{s} T_{s})}^{k + l + 1}} = \frac{2}{r_{s}^{2} T_{s}^{2}}$ (50)

Hence, the algorithm is TCP friendly if and only if

$k + l = 1 \quad \text{and} \quad \frac{α}{β} = 2$ (51)

Note that if l<1, then the window is reduced less drastically than with TCP Reno on detection of network congestion. Using equation 51, it follows that for 0<l<1 we have 0<k<1, so that the window increase is also more gradual than with TCP Reno.

If we set dR_s/dt=0 in equation 48, then we obtain the following expression, which is the analog of the square-root formula for GAIMD algorithms:

$r_{s} = {(\frac{α}{β})}^{1 / (k + l + 1)} \frac{1}{T_{s}} {(\frac{1}{q_{s}} - 1)}^{1 / (k + l + 1)} \approx {(\frac{α}{β})}^{1 / (k + l + 1)} \frac{1}{T_{s}} \frac{1}{q_{s}^{1 / (k + l + 1)}}$ (52)

Note that the condition k+l=1 also leads to the square root law for TCP throughput.
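The equilibrium rate and the TCP-friendliness condition can be combined in a short numerical check. The sketch below solves dR_s/dt=0 in equation 48 exactly, that is, $(r_{s} T_{s})^{k + l + 1} = (α / β) (1 - q_{s}) / q_{s}$ , and tests the condition of equation 51. The function names and the parameter values are hypothetical.

```python
def gaimd_equilibrium_rate(q, rtt, alpha, beta, k, l):
    """Exact steady-state rate obtained by setting dR_s/dt = 0 in equation 48."""
    exponent = 1.0 / (k + l + 1.0)
    return ((alpha / beta) * (1.0 - q) / q) ** exponent / rtt


def is_tcp_friendly(alpha, beta, k, l, tol=1e-12):
    """Equation 51: k + l = 1 and alpha / beta = 2."""
    return abs(k + l - 1.0) < tol and abs(alpha / beta - 2.0) < tol


if __name__ == "__main__":
    q, rtt = 0.01, 0.1  # hypothetical congestion measure and round trip time
    # AIMD with (k, l) = (0, 1) and a smoother rule with (k, l) = (0.5, 0.5).
    for k, l in ((0.0, 1.0), (0.5, 0.5)):
        rate = gaimd_equilibrium_rate(q, rtt, alpha=1.0, beta=0.5, k=k, l=l)
        print((k, l), is_tcp_friendly(1.0, 0.5, k, l), rate)
```

Both parameter pairs satisfy equation 51, so the two printed rates should coincide, which is the sense in which such algorithms share bandwidth fairly with TCP Reno.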

3.4 Stability Analysis of TCP with Active Queue Management

The stability of a congestion control system is defined in terms of the behavior of the bottleneck queue size b(t). If the bottleneck queue size fluctuates excessively and very frequently touches zero, thus leading to link under utilization, then the system is considered to be unstable. Also, if the bottleneck queue size grows and spends all its time completely full, which leads to excessive packet drops, then again the system is unstable. Hence, ideally, we would like to control the system so that the bottleneck queue size stays in the neighborhood of a target length, showing only small fluctuations.

It has been one of the achievements of fluid modeling of congestion control systems that it makes it possible to approach this problem by using the tools of classical control system theory. This program was first carried out by Vinnicombe [12] and Hollot et al. [6,7]. The latter group of researchers modeled TCP Reno, for which they analyzed the RED controller and another controller that they introduced, called the PI controller. Since then, the technique has been applied to many other congestion control algorithms and constitutes one of the basic techniques in the toolset for analyzing congestion control.

To analyze the stability of the system, we will follow Hollot and colleagues [6,7] and use the simplified model shown in Figure 3.2. In particular, we assume that there are N homogeneous TCP sources, all of which pass through a single bottleneck node.

Figure 3.2 Block diagram of the control system.

Initially, let’s consider the open-loop system shown in Figure 3.3. Later in this and subsequent sections, we will connect the queue length process b(t) with the congestion indicator process Q(t), using a variety of controllers. Using equation 29, the window control dynamics of a source are given by

$\frac{d W (t)}{d t} = \frac{R (t - T (t)) [1 - Q (t)]}{W (t)} - R (t - T (t)) W (t) \frac{Q (t)}{2}$ (53)

We substitute W(t)=R(t)T(t) in the first term on the RHS, write R(t−T(t))=W(t−T(t))/T(t−T(t)) in the second term, and make the approximations $R (t) \approx R (t - T (t)), 1 - Q (t) \approx 1$ , leading to the equation

$\frac{d W (t)}{d t} = \frac{1}{T (t)} - \frac{W (t) W (t - T (t))}{2 T (t - T (t))} Q (t)$ (54)

where $T (t) = \frac{b (t)}{C} + D$ .

The fluid approximation for the queue length process at the bottleneck can be written as

$\frac{d b (t)}{d t} = \frac{W (t)}{T (t)} N (t) - C$ (55)

Equations 54 and 55 give the nonlinear dynamics for the rate control and queuing blocks, respectively, in Figure 3.3. To apply the techniques of classical control theory, we now linearize these equations. With (W(t), b(t)) as the state and Q(t) as the input, the operating point (W₀, b₀, Q₀) is defined by dW/dt=0 and db/dt=0, which leads to

$\frac{d W}{d t} = 0 \Rightarrow W_{0}^{2} Q_{0} = 2$ (56)

$\frac{d b}{d t} = 0 \Rightarrow W_{0} = \frac{C T_{0}}{N}$ (57)

where

$T_{0} = \frac{b_{0}}{C} + D \quad \text{(as per equation 25)}$ (58)
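The operating point defined by equations 56 to 58 is easy to evaluate numerically. The following Python sketch does so for a set of hypothetical parameter values (link capacity in packets per second, number of flows, propagation delay in seconds, and target queue length in packets); the function name is also an assumption.

```python
def operating_point(capacity, num_flows, prop_delay, queue0):
    """Return (T0, W0, Q0) from equations 58, 57, and 56, in that order."""
    rtt0 = queue0 / capacity + prop_delay     # equation 58
    window0 = capacity * rtt0 / num_flows     # equation 57
    mark_prob0 = 2.0 / window0 ** 2           # equation 56
    return rtt0, window0, mark_prob0


if __name__ == "__main__":
    # Hypothetical values: 3750 packets/s link, 60 flows, 200 ms propagation
    # delay, and a target queue of 175 packets.
    t0, w0, q0 = operating_point(3750.0, 60, 0.2, 175.0)
    print("T0 =", t0, "W0 =", w0, "Q0 =", q0)
```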

Assuming N(t)=N and T(t)=T₀ are constants, equations 54 and 55 can be linearized around the operating point (W₀, b₀, Q₀) (Appendix 3.A), resulting in

$\frac{d W_{δ} (t)}{d t} = - \frac{2 N}{C T_{0}^{2}} W_{δ} (t) - \frac{C^{2} T_{0}}{2 N^{2}} Q_{δ} (t)$ (59)

$\frac{d b_{δ} (t)}{d t} = \frac{N}{T_{0}} W_{δ} (t) - \frac{1}{T_{0}} b_{δ} (t)$ (60)

where

$\begin{array}{l} W_{δ} = W - W_{0} \\ b_{δ} = b - b_{0} \\ Q_{δ} = Q - Q_{0} \end{array}$

The Laplace transforms of equations 59 and 60 are given by

$U_{t c p} (s) = \frac{W_{δ} (s)}{Q_{δ} (s)} = \frac{\frac{T_{0} C^{2}}{2 N^{2}}}{s + \frac{2 N}{T_{0}^{2} C}}, \quad U_{queue} (s) = \frac{b_{δ} (s)}{W_{δ} (s)} = \frac{\frac{N}{T_{0}}}{s + \frac{1}{T_{0}}}$ (61)

so that the Laplace transform of the plant dynamics for the linearized TCP+queuing system is

$U (s) = U_{t c p} (s) U_{queue} (s) = \frac{\frac{C^{2}}{2 N}}{(s + \frac{2 N}{T_{0}^{2} C}) (s + \frac{1}{T_{0}})} = \frac{\frac{{(C T_{0})}^{3}}{{(2 N)}^{2}}}{(\frac{s}{2 N / C T_{0}^{2}} + 1) (s T_{0} + 1)}$ (62)

and it relates the change in the packet marking probability $Q_{δ}$ to the change in queue size $b_{δ}$ .
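The plant transfer function of equation 62 can be evaluated along the imaginary axis to see how its gain and phase vary with frequency. The Python sketch below is an illustration with hypothetical parameter values and function names; it also checks that the zero-frequency value agrees with the gain $(C T_{0})^{3} / (2 N)^{2}$ that appears in the second form of equation 62.

```python
import cmath
import math


def plant_response(omega, capacity, num_flows, rtt0):
    """U(j*omega) from the first form of equation 62."""
    s = 1j * omega
    tcp_pole = 2.0 * num_flows / (rtt0 ** 2 * capacity)
    queue_pole = 1.0 / rtt0
    gain = capacity ** 2 / (2.0 * num_flows)
    return gain / ((s + tcp_pole) * (s + queue_pole))


def dc_gain(capacity, num_flows, rtt0):
    """Zero-frequency gain from the second form of equation 62."""
    return (capacity * rtt0) ** 3 / (2.0 * num_flows) ** 2


def phase_degrees(value):
    """Phase of a complex number in degrees."""
    return math.degrees(cmath.phase(value))


if __name__ == "__main__":
    c, n, t0 = 3750.0, 60, 0.25  # hypothetical capacity, flow count, and RTT
    print("DC gain:", dc_gain(c, n, t0))
    for omega in (0.1, 1.0, 10.0):
        u = plant_response(omega, c, n, t0)
        print(omega, abs(u), phase_degrees(u))
```

Because the plant has two real poles and no zeros, the phase stays between 0 and −180 degrees at every frequency, which is why the delay-free loop discussed next cannot encircle the point (−1, 0).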

System stability can be studied with the help of various techniques from classical control theory, including Nyquist plots, Bode plots, and root locus plots [13]. Of these, the Nyquist plot is the most amenable to analyzing systems with delay lags of the type seen here and hence is used in this chapter. Appendix 3.B has a quick introduction to the Nyquist stability criterion that the reader may want to consult at this point.

Assume initially that no controller is present, so that there is no AQM stabilization and the queue size difference is fed back to the sender with no delay. The system is then as shown in Figure 3.5, with $Q_{δ} = P_{δ} = b_{δ}$ .

From Equation 62, the open loop transfer function of this system is of the form

$U (s) = \frac{K}{(a s + 1) (b s + 1)}$

with a,b>0. The corresponding closed loop transfer function W(s) is given by

$W (s) = \frac{U (s)}{1 + U (s)} = \frac{K}{(a s + 1) (b s + 1) + K} .$

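Using the Nyquist criterion, we can see that the locus of $U (j ω)$ , as $ω$ varies from 0 to $\infty$ , does not encircle the point (−1, 0), even for large values of K. The phase lag $\tan^{- 1} a ω + \tan^{- 1} b ω$ stays below $π$ for every finite $ω$ , so the locus never crosses the negative real axis. Hence, the system is unconditionally stable (see Figure 3.4A).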
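
The same conclusion can be checked directly from the closed loop poles, which are the roots of $a b s^{2} + (a + b) s + (1 + K) = 0$. The following minimal sketch, with a hypothetical function name and sample values for a and b, confirms that both roots stay in the left half plane as K grows.

```python
import cmath


def closed_loop_poles(K, a, b):
    """Roots of (a s + 1)(b s + 1) + K = 0, the denominator of W(s)."""
    A = a * b          # coefficient of s^2
    B = a + b          # coefficient of s
    C0 = 1.0 + K       # constant term
    disc = cmath.sqrt(B * B - 4.0 * A * C0)
    return (-B + disc) / (2.0 * A), (-B - disc) / (2.0 * A)


# Hypothetical time constants a and b, for illustration only
for K in (0.5, 10.0, 1e3, 1e6):
    p1, p2 = closed_loop_poles(K, a=2.0, b=0.25)
    print(K, p1, p2, p1.real < 0 and p2.real < 0)
```

For large K the poles become a complex pair whose real part stays fixed at −(a+b)/(2ab), so the response grows more oscillatory but never unstable.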

Figure 3.4 Example Nyquist plots without loop delay (A) and with loop delay (B).

Figure 3.5 System without Active Queue Management (AQM) controller with propagation delay ignored.

Next, let us introduce the propagation delay into the system, so that $Q_{δ} (t) = P_{δ} (t - T_{0})$ , as shown in Figure 3.6. In this case, equation 59 changes to

$\frac{d W_{δ} (t)}{d t} = - \frac{2 N}{C T_{0}^{2}} W_{δ} (t) - \frac{C^{2} T_{0}}{2 N^{2}} P_{δ} (t - T_{0})$

Figure 3.6 System without Active Queue Management (AQM) controller but with propagation delay included.

This introduction of the delay in the feedback loop can lead to system instability, as shown next.

The open loop and closed loop transfer functions for this system are given by

$\begin{array}{l} U' (s) = U (s) e^{- s T_{0}} = \frac{K e^{- s T_{0}}}{(a s + 1) (b s + 1)} \\ W' (s) = \frac{K e^{- s T_{0}}}{(a s + 1) (b s + 1) + K e^{- s T_{0}}} \end{array}$

From the Nyquist plot for $U' (j ω)$ shown in Figure 3.4B, note the following:

• The addition of the delay component has led to a downward spiraling effect in the locus of $U' (j ω)$ . This is because $U' (j ω) = | U' (j ω) | e^{- j \arg (U' (j ω))}$ , where the phase lag of the point $U' (j ω)$ on the curve, measured clockwise from the positive real axis, is

$\arg (U' (j ω)) = ω T_{0} + \tan^{- 1} a ω + \tan^{- 1} b ω$

Hence, as $ω$ increases, the first term on the RHS increases linearly without bound, whereas the other two terms are limited to at most $\frac{π}{2}$ each, so the locus keeps rotating around the origin while its magnitude shrinks.

• The system is no longer unconditionally stable: as the gain K increases, the locus may encircle the point (−1,0) in the clockwise direction, which renders the closed loop system unstable.

From equation 62, the gain K for TCP Reno is given by

$K = \frac{{(C T_{0})}^{3}}{{(2 N)}^{2}}$

so that the system can become unstable in the presence of feedback delay if (1) C increases, (2) T₀ increases, or (3) N decreases. This shows that high link capacity or high round trip latency can cause system instability.

Although the onset of instability with increasing C or T₀ is to be expected, the variation with N implies that the system gains in stability with more sessions. The intuitive reason for this is that more sessions average out the fluctuations caused by the high variations in TCP window size (especially the decrease by half on detecting congestion). With only a few sessions, these variations affect the stability of the system because even a single packet drop causes a large drop in window size. On the other hand, with many connections, the decrease in window size of any one of them does not have as large an effect. Some of the high-speed variants of TCP that we will meet in Chapter 5 reduce their windows by less than half on detecting packet loss, thus reducing K and consequently increasing their stability in the presence of large C and T₀.
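
The dependence of stability on C, T₀, and N described above can be explored numerically. The sketch below is a minimal illustration under stated assumptions: it uses the delayed open loop transfer function U'(s) without any controller, and all function names and sample values are hypothetical. Because the gain of U'(jω) decreases and its phase lag increases monotonically with ω, the closed loop is stable when the phase lag at the gain crossover frequency (where |U'(jω)| = 1) is below π.

```python
import math


def loop_params(C, N, T0):
    """Gain K and time constants a, b of U(s) in equation 62."""
    K = (C * T0) ** 3 / (2.0 * N) ** 2
    a = C * T0 ** 2 / (2.0 * N)   # reciprocal of the TCP pole
    b = T0                        # reciprocal of the queue pole
    return K, a, b


def magnitude(w, K, a, b):
    return K / (math.hypot(1.0, a * w) * math.hypot(1.0, b * w))


def phase_lag(w, a, b, T0):
    return w * T0 + math.atan(a * w) + math.atan(b * w)


def phase_margin(C, N, T0):
    """Return pi minus the phase lag at gain crossover (positive means stable)."""
    K, a, b = loop_params(C, N, T0)
    if K <= 1.0:
        return math.inf           # the locus never leaves the unit circle
    lo, hi = 0.0, 1.0
    while magnitude(hi, K, a, b) > 1.0:
        hi *= 2.0                 # bracket the crossover frequency
    for _ in range(200):          # bisection on the monotone magnitude
        mid = 0.5 * (lo + hi)
        if magnitude(mid, K, a, b) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.pi - phase_lag(hi, a, b, T0)


# Hypothetical link: C = 100 packets/s, T0 = 0.1 s, varying session count
for N in (5, 12, 20):
    print(N, phase_margin(C=100.0, N=N, T0=0.1))
```

With these sample values the margin is negative for N = 5 and positive for N = 12, and doubling T₀ at N = 12 makes it negative again, in line with the trends derived from the expression for K.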

3.4.1 The Addition of Controllers to the Mix

Because the system shown in Figure 3.6 can become unstable, we now explore the option of adding a controller into the loop, as shown in Figure 3.7, and then adjusting the controller parameters to stabilize the system. We will first analyze the RED controller and then, based on what we learn from that analysis, introduce two other controllers, the Proportional and the PI controllers, which are shown to have better performance than the RED controller.

Figure 3.7 Congestion control system with a controller in the loop.

3.4.1.1 Random Early Detection (RED) Controllers

The RED controller is shown in Figure 3.8, and its Laplace transform is given by (see Appendix 3.C for a derivation)

$V_{r e d} (s) = \frac{L_{r e d}}{\frac{s}{K} + 1}$ (63)

where

$L_{r e d} = \frac{\max_{p}}{\max_{t h} - \min_{t h}} \quad \text{and} \quad K = - \frac{\log (1 - w_{q})}{δ}$ (64)

where w_q is the weighting factor used to smooth the queue size estimate and $δ$ is the queue size sampling interval, which is set to 1/C. Note that K in equations 63 and 64 denotes the corner frequency of the averaging filter, not the loop gain K of equation 62. The other RED parameters are defined in Section 1.4.2 in Chapter 1. As Figure 3.8 shows, the RED controller consists of a Low Pass Filter (LPF) followed by a constant gain.
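
To make equations 63 and 64 concrete, the following minimal sketch maps a set of RED configuration values to the controller gain L_red and the filter parameter K, taking log as the natural logarithm. The function names and the sample RED settings are assumptions chosen for illustration.

```python
import math


def red_controller(max_p, min_th, max_th, w_q, C):
    """Return (L_red, K) from equation 64, with delta set to 1/C."""
    delta = 1.0 / C                        # interval between queue samples, in s
    L_red = max_p / (max_th - min_th)      # slope of the RED marking profile
    K = -math.log(1.0 - w_q) / delta       # corner frequency of the LPF, in rad/s
    return L_red, K


def V_red(s, L_red, K):
    """RED transfer function of equation 63."""
    return L_red / (s / K + 1.0)


# Hypothetical RED settings, for illustration only
L_red, K = red_controller(max_p=0.1, min_th=150.0, max_th=700.0,
                          w_q=0.002, C=3750.0)
print(L_red, K)
print(abs(V_red(1j * K, L_red, K)))   # gain at the corner frequency
```

For small w_q, −log(1 − w_q) is close to w_q, so K is approximately w_q·C: a smaller averaging weight or a slower link lowers the filter's corner frequency.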

Figure 3.8 Random Early Detection (RED) controller.

The objective of the optimal control problem is to choose the RED parameters L_red and K so as to stabilize the system shown in Figure 3.7. Note that the open loop transfer function for the system is given by

$\begin{array}{rl} T (s) & = U_{t c p} (s) U_{queue} (s) V_{r e d} (s) e^{- s T_{0}} \\ & = \frac{\frac{L_{r e d} {(C T_{0})}^{3}}{{(2 N)}^{2}} e^{- s T_{0}}}{(\frac{s}{K} + 1) (\frac{s}{2 N / C T_{0}^{2}} + 1) (\frac{s}{1 / T_{0}} + 1)} \end{array}$ (65)

To apply the Nyquist criterion, set $s = j ω$ , so that

$T (j ω) = | T (j ω) | e^{- j \arg (T (j ω))}$ where

$| T (j ω) | = \frac{\frac{L_{r e d} {(C T_{0})}^{3}}{{(2 N)}^{2}}}{\sqrt{{(\frac{ω}{K})}^{2} + 1} \sqrt{{(\frac{ω}{2 N / C T_{0}^{2}})}^{2} + 1} \sqrt{{(\frac{ω}{1 / T_{0}})}^{2} + 1}}$ (66)

and

$\arg (T (j ω)) = ω T_{0} + \tan^{- 1} \frac{ω}{K} + \tan^{- 1} \frac{ω}{2 N / C T_{0}^{2}} + \tan^{- 1} \frac{ω}{1 / T_{0}}$ (67)

Hollot et al. [6] proved the following result:

If L_red and K satisfy

$\frac{L_{r e d} {(T^{+} C)}^{3}}{{(2 N^{-})}^{2}} \leq \sqrt{\frac{ω_{c}^{2}}{K^{2}} + 1}$ (68)

where

$ω_{c} = 0.1 \min \left\{ \frac{2 N^{-}}{{(T^{+})}^{2} C}, \frac{1}{T^{+}} \right\}$ (69)

then the linear feedback control system in Figure 3.7 using a RED controller is stable for all $N \geq N^{-}$ and all $T_{0} \leq T^{+}$ .
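
The condition in equations 68 and 69 is straightforward to evaluate for a candidate RED configuration. The sketch below is a minimal illustration; the function name and all numerical values are hypothetical. Since the result gives a sufficient condition, a configuration that fails the check is not thereby shown to be unstable.

```python
import math


def red_stability_condition(L_red, K, C, N_min, T_max):
    """Evaluate equations 68 and 69.

    Returns (w_c, lhs, rhs, holds), where holds is True when lhs <= rhs.
    """
    w_c = 0.1 * min(2.0 * N_min / (T_max ** 2 * C), 1.0 / T_max)   # equation 69
    lhs = L_red * (T_max * C) ** 3 / (2.0 * N_min) ** 2            # left side of 68
    rhs = math.sqrt((w_c / K) ** 2 + 1.0)                          # right side of 68
    return w_c, lhs, rhs, lhs <= rhs


# Hypothetical link: C = 3750 packets/s, at least 60 sessions, T0 up to 0.246 s
link = dict(C=3750.0, N_min=60, T_max=0.246)

# A slow averaging filter (small K) satisfies the condition ...
print(red_stability_condition(L_red=1.86e-4, K=0.005, **link))

# ... whereas a fast filter with a similar gain does not.
print(red_stability_condition(L_red=1.818e-4, K=7.5, **link))
```

The comparison shows the trade-off implied by equation 68: for a given gain L_red, the condition can be met only by lowering K, that is, by averaging the queue size over a longer time window.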

To prove this, we start with the following observation: From equation 66, it follows that

$| T (j ω) | \leq \frac{\frac{L_{r e d} {(C T_{0})}^{3}}{{(2 N)}^{2}}}{\sqrt{{(\frac{ω}{K})}^{2} + 1}}$ (70)

Furthermore, if $ω = ω_{c}, N \geq N^{-}, T_{0} \leq T^{+}$ , then it follows from equations 68 and 70 that

$| T (j ω_{c}) | \leq 1$ (71)

From this, we conclude that the critical frequency $ω_{*}$ at which $| T (j ω_{*}) | = 1$ , satisfies the relation $ω_{*} \leq ω_{c}$ (this is because $| T (j ω) |$ is monotonically decreasing in $ω$ ). Hence, it follows from the Nyquist criterion that if we can show that the angle $\arg T (j ω_{c})$ satisfies the condition