12
Duality Revisited

12.1 Introduction

The material presented in this chapter is largely theoretical in nature and may be considered to be a more “refined” approach to the study of linear programming and duality. That is to say, the mathematical techniques employed herein are quite general in nature in that they encompass a significant portion of the foundations of linear (as well as nonlinear) programming. In particular, an assortment of mathematical concepts often encountered in the calculus, along with the standard matrix operations which normally underlie the theoretical development of linear programming, are utilized to derive the Karush‐Kuhn‐Tucker necessary conditions for a constrained extremum and to demonstrate the formal equivalence between a solution to the primal maximum problem and the associated saddle‐point problem. Additionally, the duality and complementary slackness theorems of the preceding chapter are reexamined in the context of this “alternative” view of the primal and dual problems.

12.2 A Reformulation of the Primal and Dual Problems

Turning to the primal problem, let us maximize f(X) = CX subject to AX ≤ b, X ≥ O, X ∈ En, where A is of order (m × n) with rows a1, …, am, and b is an (m × 1) vector with components b1, …, bm. Alternatively, if

images

are, respectively, of order (m + n × n) and (m + n × 1), then we may maximize f(X) = CX subject to images where X ∈ En is now unrestricted. In this formulation xj ≥ 0, j = 1, …, n, is treated as a structural constraint so that images defines a region of feasible or admissible solutions ⊆ En.

If X0 yields an optimal solution to the primal maximum problem, then no permissible change in X (i.e. one that does not violate any of the constraints specifying K) can improve the optimal value of the objective function. How may we characterize such admissible changes? It is evident that, if starting at some feasible X, we can find a vector h such that a small movement along it violates no constraint, then h specifies a feasible direction at X. More formally, let δ(X0) be a suitably restricted spherical δ – neighborhood about the point X0 ∈ En. Then the (n × 1) vector h is a feasible direction at X0 if there exists a scalar t ≥ 0, 0 ≤ t < δ, such that X0 + th is feasible. In this light, for δ(X0) again a spherical δ − neighborhood about the feasible point X0 ∈ En, the set of feasible directions at X0, D(X0), is the collection of all (n × 1) vectors h such that X0 + th is feasible for t sufficiently small, i.e.

images

Here D(X0) is the tangent support cone consisting of all feasible directions at X0 and is generated by the tangent or supporting hyperplanes to K at X0 (Figure 12.1). And since each such hyperplane

images
Graph depicting patterns (D(Xo)) formed below an intersecting ascending (ar X=br (tangent hyperplane) and descending (as X=bs (tangent hyperplane) lines with a dot marker (X0) at the intersection point.

Figure 12.1 Set of feasible directions D(X0).

specifies a closed half‐space images the tangent support cone D(X0) represents that portion of the intersection of these closed half‐planes in the immediate vicinity of (X0).

To discern the relationship between D(X0)(X0 optimal) and: (i) f(X0); (ii) the constraints images let us consider the following two theorems, starting with Theorem 12.1.

Hence f must decrease along any feasible direction h. Geometrically, no feasible direction may form an acute angle (<π/2) between itself and C.

To set the stage for our next theorem, we note that for any optimal XK, it is usually the case that not all the inequality constraints are binding or hold as an equality. To incorporate this observation into our discussion, let us divide all of the constraints images into two mutually exclusive classes – those that are binding at images and those which are not, images So if

images

depicts the index set of binding constraints, then images for iI. In this regard, we have Theorem 12.5.

To interpret this theorem, we note that images may be characterized as an inward pointing or interior normal to images at X0. So if h is a feasible direction, it makes a nonobtuse angle (≤π/2) with all of the interior normals to the boundary of K at X0. Geometrically, these interior normal or gradients of the binding constraints form a finite cone containing all feasible directions making nonobtuse angles with the supporting hyperplanes to K at X0. Such a cone is polar to the tangent support cone D(X0) and will be termed the polar support cone D(X0)+(Figure 12.2a) Thus, D(X0)+ is the cone spanned by the gradients images such that for hD images iJ. Looked at from another perspective, images may be considered an outward‐pointing or exterior normal to the boundary of K at X0. In this instance, if h is a feasible direction, it must now make a non‐acute angle (≥π/2) with all of the outward‐pointing normals to the boundary of K at X0. Again looking to geometry, the exterior normals or negative gradients of the binding constraints form a finite cone containing all feasible directions making nonacute angles with the hyperplanes tangent to K at X0. This cone is the dual of the tangent support cone D(X0) and is termed the dual support cone D(X0)*(Figure 12.2b). Thus, D(X0)* is the cone spanned by the negative gradients images such that, for all hD images iJ.

Image described by caption and surrounding text.

Figure 12.2 (a) Polar support cone D(X0)+; (b) Dual support cone D(X0)*.

At this point, let us collect our major results. We found that if f(X) subject to images iJ, attains its maximum at X0, then

How may we interpret this condition? Given any hD(X0), (12.1) holds if the gradient of f or C lies within the finite cone spanned by the exterior normals images iJ, i.e. CD(X0)*(or if −C is contained within the finite cone generated by the interior normals images iJ, i.e. −CD(X0)+). Hence (12.1) requires that the gradient of f be a nonnegative linear combination of the negative gradients of the binding constraints at X0 (Figure 12.3). In this regard, there must exist real numbers images such that

Graph displaying an intersecting ascending and descending lines with dot marker (x0) at the intersection point. The dot marker has 3 upward arrows (a’r, C, and a’s) forming line patterns labeled D (X0)*.

Figure 12.3 C ∈ D(X0)*.

Under what conditions will the numbers images, iJ, exist? To answer this question, let us employ Theorem 12.3.

Returning now to our previous question, in terms of the notation used above, if: C = V; the vectors images iJ, are taken to be the columns of B (i.e. images); and the images iJ, are the elements of λ ≥ O, then a necessary and sufficient condition for C to lie within the finite cone generated by the vectors images is that Ch ≤ 0 for all h satisfying images i ∈ J. Hence, there exist real numbers images such that (12.2) holds.

We may incorporate all of the constraints (active or otherwise) images into our discussion by defining the scalar images as zero whenever images or iJ. Then (12.2) is equivalent to the system

where images Note that images for all i values since, if images then images while if images then images

When we formed the structural constraint inequality images the n non‐negativity conditions X ≥ O were treated as structural constraints. That is, xj ≥ 0 was converted to images However, if the non‐negativity conditions are not written in this fashion, but appear explicitly as X ≥ O, then (12.3) may be rewritten in an equivalent form provided by Theorem 12.4.

We note briefly that if the primal problem appears as

images

Then (12.4) becomes

(12.4.1) images

We now turn to a set of important corollaries to the preceding theorem (Corollaries 12.1, 12.2, and 12.3).

Next,

Finally,

So if the right‐hand side bi of the ith primal structural constraint were changed by a sufficiently small amount εi, then the corresponding change in the optimal value of f, f0, would be images Moreover, since images we may determine the amount by which the optimal value of the primal objective function changes given small (marginal) changes in the right‐hand sides of any specific subset of structural constraints. For instance, let us assume, without loss of generality, that the right‐hand sides of the first t < m primal structural constraints are increased by sufficiently small increments ε1, …, εt, respectively. Then the optimal value of the primal objective function would be increased by images

Throughout the preceding discussion the reader may have been wondering why the qualification “generally” was included in the statement of Corollary 12.3. The fact of the matter is that there exist values of, say, bk, for which ∂f0/∂bk is not defined, i.e. points where ∂f0/∂bk is discontinuous so that (12.10) does not hold. To see this, let us express f in terms of bk, as

images

where the images depict fixed levels of the requirements images.

If in Figure 12.4 we take area OABCG to be the initial feasible region when images (here the line segment CG consists of all feasible points X that satisfy images, then increasing the kth requirement from images changes the optimal extreme point from C to D. The result of this increase in bk is a concomitant increase in f0 proportionate to the distance images where the constant of proportionality is images Thus, over this range of bk values, ∂f0/∂bk exists (is continuous). If we again augment the level of bk by the same amount as before, this time shifting the kth structural constraint so that it passes through point E, ∂f0/∂bk becomes discontinuous since now the maximal point moves along a more steeply sloping portion (segment EF) of the boundary of the feasible region so that any further equal parallel shift in the kth structural constraint would produce a much larger increment in the optimal value of f than before. Thus, each time the constraint line akX = bk passes through an extreme point of the feasible region, there occurs a discontinuity in ∂f0/∂bk.

Image described by caption and surrounding text.

Figure 12.4 Increasing the requirements of bk.

To relate this argument directly to f(bk) above, let us describe the behavior of f(bk) as bk assumes the values images In Figure 12.5 the piecewise continuous curve f(bk) indicates the effect on f(bk) precipitated by changes in bk. (Note that each time the constraint akX = bk passes through an extreme point of the feasible region in Figure 12.4, the f(bk) curve assumes a kink, i.e. a point of discontinuity in its derivative. Note also that for a requirement level of images [say images], f cannot increase any further so that beyond F′, f(bk) is horizontal) As far as the dual objective function g(u) = bu is concerned, g may be expressed as g(bk) = bkuk+ constant since images is independent of bk. For images (images optimal), images Moreover, since the dual structural constraints are unaffected by changes in bk, dual feasibility is preserved for variations in bk away from images so that, in general, for any pair of feasible solutions to the primal and dual problems, f(bk) ≤ g(bk), i.e. g(bk) is tangent to f(bk) for images (point E′ in Figure 12.5) while images sufficiently small. In this regard, for images while for images whence images So while both the left‐ and right‐hand derivatives of f(bk) exist at images itself does not exist since images i.e. ∂f/∂bk possesses a finite discontinuity at images. In fact, ∂f/∂bk is discontinuous at each of the points, A′, B′, E′ and F′ in Figure 12.5. For images it is evident that images since f(bk), g(bk) coincide all along the segment B′ E′ of Figure 12.5. In this instance, we may unequivocally write images Moreover, for images (see Figures 12.4 and 12.5), images since the constraint images is superfluous, i.e. it does not form any part of the boundary of the feasible region. In general, then, ∂f/∂bk|+ ≤ uk ≤ ∂f/∂bk|.

Top: Graph displaying an ascending curve labeled f(bk) with dot markers labeled A, B, C, D, E, and F. Bottom: Graph displaying horizontal lines in between vertical lines connecting to dot markers indicated on top.

Figure 12.5 Tracking f(bk) as bk increases.

To formalize the preceding discussion, we state Theorem 12.5.

To summarize: if both the left‐ and right‐hand derivatives ∂f0/∂bi|, ∂f0/∂bi|+ exist and their common value is images then images otherwise images

12.3 Lagrangian Saddle Points

(Belinski and Baumol 1968; Williams 1963)

If we structure the primal‐dual pair of problems as

PRIMAL PROBLEM
images
DUAL PROBLEM
images

then the Lagrangian associated with the primal problem is L(X, U) = CX + U′(b − AX) with X ≥ O, U ≥ O. And if we now express the dual problem as maximize {−bU} subject to −AU ≤  − C,U ≥ O, then the Lagrangian of the dual problem is M(U, X) =  − bU + X′(−C + AU), where again U ≥ O, X ≥ O. Moreover, since M(U, X) =  − bU + XC + XAU =  − CX − Ub + UAX =  − L(X, U), we see that the Lagrangian of the dual problem is actually the negative of the Lagrangian of the primal problem. Alternatively, if xj ≥ 0 is converted to images then the revised primal problem appears as

images

with associated Lagrangian images where images As far as the dual of this revised problem is concerned, we seek to

images

In this instance images also.

We next turn to the notion of a saddle point of the Lagrangian. Specifically, a point (X0, Uo)ε En + m is termed a saddle point of the Lagrangian L(X, U) = CX + U′(b − AX) if

for all X ≥ O, U ≥ O. Alternatively, images E2n + m is a saddle point of images if

for all X unrestricted and U ≥ O. What these definitions imply is that images simultaneously attains a maximum with respect to X and a minimum with respect to images Hence (12.11), (12.12) appear, respectively, as

Under what conditions will images possess a saddle point at images To answer these questions, we must look to the solution of what may be called the saddle point problem: To find a point (X0, U0) such that (12.11) holds for all (X, U)ε En + m, X ≥ O, U ≥ O; or, to find a point images such that (12.12) holds for all images E2n + m, X unrestricted, images

In the light of (12.11.1), (12.12.1) we shall ultimately see that determining a saddle point of a Lagrangian corresponds to maxi‐minimizing or mini‐maximizing it. To this end, we look to Theorem 12.6.

Our discussion in this chapter has centered around the solution of two important types of problems – the primal linear programming problem and the saddle point problem. As we shall now see, there exists an important connection between them. Specifically, what does the attainment of a saddle point of the Lagrangian of the primal problem imply about the existence of a solution to the primal problem? To answer this, we cite Theorem 12.7.

The importance of the developments in this section is that they set the stage for an analysis of the Karush‐Kuhn‐Tucker equivalence theorem. This theorem (Theorem 12.5) establishes the notion that the existence of an optimal solution to the primal problem is equivalent to the existence of a saddle point of its associated Lagrangian, i.e. solving the primal problem is equivalent to maxi‐minimizing (mini‐maximizing) its Lagrangian.

It is instructive to analyze this equivalence from an alternative viewpoint. We noted above that solving the primal linear programming problem amounts to maxi‐minimizing (mini‐maximizing) its associated Lagrangian. In this regard, if we express the primal problem as

images

then, if L(X, U0) has a maximum in the X direction at X0, it follows from (12.4a, b, f) that

images

while, if L(X0, U) attains a minimum in the U direction at U0, then, from (12.4c, d, e),

images

Thus, X0 is an optimal solution to the preceding primal problem if and only if there exists a vector U0 ≥ O such that (X0, U0) is a saddle point of L(X, U), in which case system (12.4) is satisfied.

One final point is in order. As a result of the discussion underlying the Karush‐Kuhn‐Tucker equivalence theorem, we have Corollary 12.4.

12.4 Duality and Complementary Slackness Theorems

(Dreyfus and Freimer 1962)

We now turn to an alternative exposition of two of the aforementioned fundamental theorems of linear programming (Chapter 6), namely the duality and complementary slackness theorems. First, Theorem 12.5 explains the duality theorem.

Next, if, as above, we express the primal‐dual pair of problems as

PRIMAL PROBLEM
images
DUAL PROBLEM
images

then their associated Lagrangians may be written, respectively, as

images
images

where images is an (m + n × 1) vector of nonnegative primal slack variables, i.e.

images

In this light, we now look to Theorem 12.5.

By way of interpretation: (i) if the primal slack variable images corresponding to the ith primal structural constraint is positive, then the associated dual variable images must equal zero. Conversely, if images is positive, then the ith primal structural constraint must hold as an equality with images Before turning to point (ii), let us examine the form of the jth dual structural constraint. Since A = [α1, …, αn],

images

and thus the jth dual structural constraint appears as images In this light, (ii) if the jth primal variable images is different from zero, then the dual surplus variable images associated with the jth dual structural constraint images equals zero. Conversely, if images then the jth dual structural constraint is not binding and thus the jth primal variable images must be zero.

Notes

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.188.241.82