CHAPTER 11
MAXIMA AND MINIMA AND OPTIMIZATION PROBLEMS
11.2. Maxima and Minima of Functions of One Real Variable
11.2-1. Relative Maxima and Minima
11.2-2. Conditions for the Existence of Interior Maxima and Minima
11.3. Maxima and Minima of Functions of Two or More Real Variables
11.3-1. Relative Maxima and Minima
11.3-3. Conditions for the Existence of Interior Maxima and Minima
11.4. Linear Programming, Games, and Related Topics
11.4-1. Linear-programming Problems
(b) The Linear-programming Problem in Standard Form. Feasible Solutions
11.4-3. Nonlinear Programming. The Kuhn-Tucker Theorem
11.4-4. Introduction to Finite Zero-sum Two-person Games
(a) Games with Pure Strategies
(b) Games with Mixed Strategies
(c) Relation to Linear Programming
11.5. Calculus of Variations. Maxima and Minima of Definite Integrals
11.5-2. Maxima and Minima of Definite Integrals
11.5-3. Solution of Variation Problems
11.6. Extremals as Solutions of Differential Equations: Classical Theory
11.6-1. Necessary Conditions for the Existence of Maxima and Minima
11.6-3. Isoperimetric Problems
11.6-4. Solution of Variation Problems Involving Higher Derivatives in the Integrand
11.6-5. Variation Problems with Unknown Boundary Values and/or Unknown Integration Limits
(a) Given Integration Limits, Unknown Boundary Values
(b) Given Boundary Values, Unknown Integration Limit
(c) General Transversality Conditions
11.6-6. The Problems of Bolza and Mayer
11.6-7. Extremals with Corners. Refraction, Reflection, and Boundary Extrema
11.6-8. Canonical Equations and Hamilton-Jacobi Equation
11.6-10. Simple Sufficient Conditions for Maxima and Minima
11.7. Solution of Variation Problems by Direct Methods
11.7-2. The Rayleigh-Ritz Method
11.7-3. Approximation of y(x) by Polygonal Functions
11.8. Optimal-control Problems and the Maximum Principle
(a) State Equations, Controls, and Criterion
(b) Optimum-control Theory and Calculus of Variations
(c) Initial-state and Terminal-state Manifolds
(d) Continuity, Differentiability, and Independence Assumptions
11.8-2. Pontryagin's Maximum Principle
(a) Adjoint Variables and Optimal Hamiltonian
(b) The Boundary-value Problem
(a) Zermelo's Navigation Problem
(b) Simple Bang-bang Time-optimal Control
(c) A Simple Minimal-time Orbit-transfer and Rendezvous Problem
11.8-4. Matrix Notation for Control Problems
11.8-5. Inequality Constraints on State Variables. Corner Conditions
11.8-6. The Dynamic-programming Approach
11.9. Stepwise-control Problems and Dynamic Programming
11.9-2. Bellman's Principle of Optimality
11.10. Related Topics, References, and Bibliography
11.10-2. References and Bibliography
11.1-1. A large class of problems can be stated as extreme-value problems: one desires to find parameter values or functions which maximize or minimize a quantity dependent upon them. In many engineering problems it is, for instance, desirable to maximize a measure of performance
or to minimize cost. Again, one can at least approximate the solutions of many problems by choosing unknown parameter values or functions so as to minimize errors in trial solutions; restatement of a problem as an extreme-value problem may then lead to powerful numerical approximation methods.
EXAMPLES: Solution of eigenvalue problems in vibration theory and quantum mechanics (Secs. 15.4-7 and 15.4-8b); Hamilton's and Jacobi's principles in dynamics.
11.2. MAXIMA AND MINIMA OF FUNCTIONS OF ONE REAL VARIABLE
11.2-1. Relative Maxima and Minima (see also Sec. 4.3-3). A real function f(x) defined for x = a has a (relative) maximum or a (relative) minimum f(a) for x = a if and only if there exists a positive real number δ such that, respectively,
for all Δx = x – a such that f(a + Δx) exists and 0 < |Δx| < δ.* The relative maximum (minimum) is an interior maximum (interior minimum) or a boundary maximum (boundary minimum) if x = a is, respectively, an interior point or a boundary point of the domain of definition assigned to f(x) (Sec. 4.3-6a).†
11.2-2. Conditions for the Existence of Interior Maxima and Minima. (a) If f'(x) exists for x = a, then f(a) can be a (necessarily interior) maximum or minimum only if f(x) has a stationary value for x = a, i.e.,

f'(a) = 0
(b) If f(x) has a second derivative f" (x) for x = a, then f (a) is
A maximum if f'(a) = 0 and f''(a) < 0
A minimum if f'(a) = 0 and f''(a) > 0
(c) More generally, if f(x) has n continuous derivatives f'(x), f''(x), . . . , f(n)(x) for x = a, and f'(a) = f''(a) = . . . = f(n-1)(a) = 0, then f(a) is
A maximum if n is even and f(n)(a) < 0
A minimum if n is even and f(n)(a) > 0
* Δf is defined as the change of a given function f(x) resulting from a change Δx in the independent variable x. Δf is a function of a and Δx. Δf must not be confused with the variation δf introduced in Sec. 11.5-1.
† The problem statement must specify the domain of definition of f(x). Note that f1(x) = x (– ∞ < x < ∞) has no maximum, but f2(x) = x (x ≤ 1) has a boundary maximum for x = 1.
If n is odd and f(n)(a) ≠ 0, then f(x) has neither a maximum nor a minimum for x = a, but a point of inflection (Sec. 17.1-5).
EXAMPLES: Each of the functions x2, x4, x6, . . . has a minimum for x = 0. Each of the functions x3, x5, x7, . . . has a point of inflection for x = 0.
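The higher-derivative test of Sec. 11.2-2c is easy to mechanize symbolically. The following Python sketch (the helper name is illustrative, not from the text) locates the first nonvanishing derivative at x = a and classifies the point accordingly:

```python
import sympy as sp

# Illustrative helper for the test of Sec. 11.2-2c: find the first
# nonvanishing derivative of f at x = a and classify the point.
def classify_stationary_point(f, x, a, max_order=10):
    for n in range(1, max_order + 1):
        d = sp.diff(f, x, n).subs(x, a)
        if d != 0:
            if n == 1:
                return "not stationary"
            if n % 2 == 1:
                return "inflection"     # n odd: neither maximum nor minimum
            return "minimum" if d > 0 else "maximum"  # n even
    return "undetermined"

x = sp.symbols('x')
print(classify_stationary_point(x**4, x, 0))    # minimum (n = 4)
print(classify_stationary_point(x**3, x, 0))    # inflection (n = 3)
print(classify_stationary_point(-x**6, x, 0))   # maximum (n = 6)
```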
11.3. MAXIMA AND MINIMA OF FUNCTIONS OF TWO OR MORE REAL VARIABLES
11.3-1. Relative Maxima and Minima.* A real function f(x1, x2, . . . , xn) defined for x1 = a1, x2 = a2, . . . , xn = an has a (relative) maximum or a (relative) minimum f(a1, a2, . . . , an) for x1 = a1, x2 = a2, . . . , xn = an if and only if there exists a positive real number δ such that

Δf ≡ f(a1 + Δx1, a2 + Δx2, . . . , an + Δxn) – f(a1, a2, . . . , an)     (1)

is, respectively, less than zero or greater than zero for all Δx1, Δx2, . . . , Δxn such that f(a1 + Δx1, a2 + Δx2, . . . , an + Δxn) exists and 0 < Δx1² + Δx2² + . . . + Δxn² < δ². The relative maximum (minimum) is an interior maximum (interior minimum) or a boundary maximum (boundary minimum) if the point (a1, a2, . . . , an) is, respectively, an interior point or a boundary point of the region of definition assigned to f(x1, x2, . . . , xn) (Sec. 4.3-6a).
11.3-2. Expansion of Δf. The quantity Δf defined by Eq. (1) is a function of a1, a2, . . . , an and Δx1, Δx2, . . . , Δxn. If f(x1, x2, . . . , xn) is suitably differentiable,

Δf = Σi (∂f/∂xi) Δxi + ½ Σi Σk (∂²f/∂xi ∂xk) Δxi Δxk + . . .

(Sec. 4.10-5). The terms of degree 1, 2, . . . in the Δxi in this expansion respectively constitute the first-order change (principal part of Δf) Δ1f, the second-order change Δ2f, . . . of the function f(x1, x2, . . . , xn) for x1 = a1, x2 = a2, . . . , xn = an (see also Sec. 4.5-3b).
11.3-3. Conditions for the Existence of Interior Maxima and Minima. (a) If f(x1, x2, . . . , xn) is differentiable for x1 = a1, x2 = a2, . . . , xn = an, f(a1, a2, . . . , an) can be a (necessarily interior) maximum or minimum only if the first-order change Δ1f vanishes, i.e., only if

∂f/∂x1 = ∂f/∂x2 = . . . = ∂f/∂xn = 0     (3)
* See footnotes to Sec. 11.2-1.
for x1 = a1, x2 = a2, . . . , xn = an. f(x1, x2, . . . , xn) is then said to have a stationary value for x1 = a1, x2 = a2, . . . , xn = an.
(b) If f(x1, x2, . . . , xn) is twice differentiable and satisfies the necessary condition (3) for x1 = a1, x2 = a2, . . . , xn = an, then f(a1, a2, . . . , an) is a maximum if the (real symmetric) quadratic form

Δ2f = Σi Σk [∂²f/∂xi ∂xk](a1, a2, . . . , an) Δxi Δxk     (4)

is negative definite (Sec. 13.5-2); and f(a1, a2, . . . , an) is a minimum if the quadratic form (4) is positive definite.
EXAMPLE: Find the maxima and minima of the function
Here the necessary conditions
show that the only extreme values are
(c) If the first-order change Δ1f of a suitably differentiable function f(x1, x2, . . . , xn) vanishes and Δ2f is semidefinite (Sec. 13.5-2), the nature of the stationary value depends on higher-order derivatives. f(x1, x2, . . . , xn) cannot have a maximum or minimum where Δ2f exists and is indefinite.
Given a set of values x1 = a1, x2 = a2, . . . , xn = an obtained from Eq. (3), one may investigate the nature of the quadratic form (4) by any of the methods listed in Sec. 13.5-6; or one may test for a maximum or minimum by actual computation of f(a1 + Δx1, a2 + Δx2, . . . , an + Δxn) for judiciously chosen combinations of increments Δx1, Δx2, . . . , Δxn.
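One of the tests just mentioned can be carried out numerically by inspecting the eigenvalue signs of the Hessian matrix of the quadratic form (4). A minimal Python sketch, with an illustrative helper name and data not taken from the text:

```python
import numpy as np

# Illustrative test for Sec. 11.3-3b: classify the quadratic form (4)
# at a stationary point by the eigenvalue signs of the Hessian matrix.
def classify_hessian(H, tol=1e-12):
    w = np.linalg.eigvalsh(H)           # real eigenvalues (H symmetric)
    if np.all(w > tol):
        return "minimum"                # positive definite form
    if np.all(w < -tol):
        return "maximum"                # negative definite form
    if np.any(w > tol) and np.any(w < -tol):
        return "saddle (no extremum)"   # indefinite form
    return "semidefinite: higher-order terms decide"

# f(x1, x2) = x1^2 + x2^2 has Hessian 2I at the origin:
print(classify_hessian(np.array([[2.0, 0.0], [0.0, 2.0]])))
# f(x1, x2) = x1^2 - x2^2 gives an indefinite form:
print(classify_hessian(np.array([[2.0, 0.0], [0.0, -2.0]])))
```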
11.3-4. Extreme-value Problems with Constraints or Accessory Conditions. The Method of Lagrange Multipliers. Maxima and minima of a real function f(x1, x2, . . . , xn) of n variables x1, x2, . . . , xn subject to suitably differentiable constraints or accessory conditions in the form of m < n equations

φj(x1, x2, . . . , xn) = 0     (j = 1, 2, . . . , m)     (5)
may, in principle, be found in the manner of Sec. 11.3-3 after m of the n variables x1, x2, . . . , xn have been eliminated with the aid of the relations (5). If it is impossible or impractical to eliminate m variables directly, one applies the following necessary condition for a maximum or minimum of f(x1, x2, . . . , xn) subject to the constraints (5):

∂Φ/∂xi = 0     (i = 1, 2, . . . , n)     with     Φ ≡ f + λ1φ1 + λ2φ2 + . . . + λmφm     (6)
The m parameters λj are called Lagrange multipliers. The n + m unknowns xi = ai and λj are obtained from the n + m equations (5) and (6).
Note that Eq. (6) is a necessary condition for a stationary value of the function Φ(x1, x2, . . . , xn) if x1, x2, . . . , xn are independent variables.
EXAMPLE: Find the sides of the rectangle of maximum area inscribed in the circle
The rectangle area A may be expressed in the form
Then
The necessary condition for a maximum or minimum yields
so that λ = – 2, and x = y yields the desired maximum.
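The Lagrange-multiplier computation of this example can be checked symbolically. Since the displayed equations of the example are not reproduced here, this sketch assumes the standard setup of the problem: a rectangle with corner (x, y) in the first quadrant inscribed in x² + y² = r², with area A = 4xy.

```python
import sympy as sp

# Assumed setup (the displayed equations are omitted above):
# maximize A = 4*x*y subject to x**2 + y**2 = r**2, x, y > 0.
x, y, r = sp.symbols('x y r', positive=True)
lam = sp.symbols('lambda', real=True)

Phi = 4*x*y + lam*(x**2 + y**2 - r**2)   # Phi = f + lambda*phi, Eq. (6)
sols = sp.solve([sp.diff(Phi, x), sp.diff(Phi, y), x**2 + y**2 - r**2],
                [x, y, lam], dict=True)
print(sols)   # x = y = r/sqrt(2), lambda = -2, as found in the text
```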
11.3-5. Numerical Methods. If, as is often the case, the function to be maximized or minimized depends on many variables, or if explicit differentiation is difficult or impossible, maxima and minima must be found by systematic numerical trial-and-error schemes. The most important numerical methods for problems with and without constraints are outlined in Secs. 20.2-6 and 20.3-2.
11.4. LINEAR PROGRAMMING, GAMES, AND RELATED TOPICS
11.4-1. Linear-programming Problems. (a) Problem Statement. A linear-programming problem requires one to determine the values of r variables X1, X2, . . . , Xr which minimize a given linear objective function (criterion function)
(or maximize –z) subject to n ≥ r linear inequality constraints
Figure 11.4-la illustrates a simple example.
In a typical application, the problem is to buy necessarily positive quantities X1, X2, . . . , Xr of r types of raw materials (“inputs”) so as to minimize a given total cost (1a) while keeping the respective quantities
of m “output” products at or above m specified levels B1, B2, . . . , Bm; in view of the r conditions Xk ≥ 0 (k = 1, 2, . . . , r), one has a total of n = r + m inequality
FIG. 11.4-1. Solution sets in (X1,X2) space (a), and in (x1,x2,x3) space (b) for the linear-programming problem
In this case, r = 2, n = 3, m = n – r = 1.
constraints. A large number of applications, especially to management problems, are discussed in Refs. 11.2 to 11.10. Note that a given problem may or may not have a solution.
(b) The Linear-programming Problem in Standard Form. Feasible Solutions. Expressed in terms of the n positive slack variables (1b), the linear-programming problem (1) requires one to minimize the linear objective function
subject to m = n – r < n linear constraints
and n linear inequality constraints
(standard form of the linear-programming problem).
A feasible solution (feasible program) for a linear-programming problem given in the standard form (2) is an ordered set (“point”) (x1, x2, . . . , xn) which satisfies the constraints (2b) and (2c); a minimum feasible solution actually minimizes the given objective function (2a) and thus solves the problem. A basic feasible solution is a feasible solution with N ≤ m positive xk; a basic feasible solution is non-degenerate if and only if N = m.
Whenever a feasible program exists, there exists one with at least n – m of the xk equal to zero. If, in addition, the objective function z has a lower bound on the solution set, then there exists a minimum feasible solution with at least n – m of the xk equal to zero (Ref. 11.2).
The set of all feasible solutions to a given linear-programming problem is a convex set in the n-dimensional solution space; i.e., every point (ξ1, ξ2, . . . , ξn) on a straight-line segment joining two feasible-solution points (x1, x2, . . . , xn) and (x'1, x'2, . . . , x'n), or

ξi = λxi + (1 – λ)x'i     (i = 1, 2, . . . , n; 0 ≤ λ ≤ 1)

(see also Sec. 3.1-7) is also a feasible solution. More specifically, the solution set is a convex polygon or polyhedron on the plane or hyperplane defined by the constraint equations (2b), with boundary lines, planes, or hyperplanes defined by the inequalities (2c), as shown in the simple example of Fig. 11.4-1b. If a finite minimum feasible solution exists, the linear-programming problem either has a unique minimum feasible solution at a vertex of the solution polygon or polyhedron, or the same minimal z is obtained on the entire convex set generated by two or more vertices (degenerate solution).
(c) Duality. The minimization problem
and the corresponding maximization problem
are dual linear-programming problems. If either problem (3) or (4) has a finite optimum solution, then the same is true for the other problem, and
If either problem has an unbounded solution, then the other problem has no feasible solution (Ref. 11.2).
The duality theorem permits replacement of a given linear-programming problem with one having fewer unknowns or fewer given inequalities and is also useful in certain numerical solution methods (Ref. 11.5).
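The dual pair (3)-(4) can be illustrated numerically with scipy's linear-programming routine; the data below are made up for illustration, and since `linprog` expects "≤" constraints, the "≥" rows of the primal are negated.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative dual pair (data assumed, not from the text):
# primal: minimize c @ X subject to A @ X >= b, X >= 0
# dual:   maximize b @ Y subject to A.T @ Y <= c, Y >= 0
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([4.0, 6.0])

# A @ X >= b is rewritten as -A @ X <= -b for linprog
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)

print(primal.fun, -dual.fun)   # equal optima: z_min = w_max
```

Both solves return the same optimum, as the duality theorem requires.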
11.4-2. The Simplex Method. (a) Linear-programming problems can be solved by the numerical hill-climbing methods outlined in Sec. 20.2-6, but the simplex method (Refs. 11.2 to 11.6) takes advantage of the linearity of the expressions (1) and (2). Assuming that a solution of the linear-programming problem (2) exists, any solution-polygon vertex can be found through simultaneous solution of the m linear equations (2b) and n – m linear equations xk = 0. To avoid laborious investigation of an excessive number of nonminimal vertices, the simplex method provides a systematic computing scheme progressing from vertex to vertex in the direction of lower z values.
(b) To find a solution vertex, one selects m basic variables, say x1, x2, . . . , xm, which are to be greater than zero at the vertex. By systematic elimination of variables (Secs. 1.9-1 and 20.3-1), x1, x2, . . . , xm can be expressed in terms of the remaining unknowns xm+1, xm+2, . . . , xn; then Eqs. (2a) and (2b) can be rewritten in canonical form relative to the selected basic variables, viz.,
Now xm+1 = xm+2 = . . . = xn = 0 defines a basic feasible solution (Sec. 11.4-1b) if all βi in Eq. (6) are nonnegative. If, in addition, all γk are nonnegative, then
is a minimum feasible solution with z = z0; this solution is unique if all γk are positive.
(c) Consider next the case that Eq. (6) yields a nondegenerate basic feasible solution (i.e., β1, β2, . . . , βm > 0, see also Sec. 11.4-1b) which is not optimal, i.e., at least one γk is negative. Let γK be the smallest (most negative) γk. To obtain a basic feasible solution with a smaller z value, xK is then introduced as a basic variable in the next iteration cycle; xK will replace that former basic variable xI which, referring to Eq. (6), reaches zero first as xK is increased from zero, i.e., I is associated with the smallest possible
Then an improved basic feasible solution is given by (8a) and
Now, introduction of
into Eq. (6) produces the canonical form relative to the new basic variables. The complete simplex algorithm, which permits convenient tabulation (simplex tableau, Refs. 11.5 and 11.6) and also adapts readily to machine computation, is repeated and usually progresses to an optimal solution in a finite number of steps. If, in the course of the computation, the basic variables correspond to a degenerate basic feasible solution (i.e., if at least one of the basic variables turns out to be zero, see also Sec. 11.4-1b), further improvement of the objective function z could, in principle, stop, but this rarely if ever occurs in a practical problem (Ref. 11.5). One can extend the simplex method to degenerate cases, e.g., by introducing small changes in the given coefficients during the computation (perturbation method) (Ref. 11.15). Many practical refinements of the basic simplex method appear in the literature (Refs. 11.2 to 11.7).
NOTE: The simplex algorithm necessarily fails when all αiK are nonpositive; in this case, the objective function z has no lower bound on the solution set.
EXAMPLE (Fig. 11.4-2): The problem
transforms to the standard form
FIG. 11.4-2. Solution set in (X1,X2) space for the example of Sec. 11.4-2c.
Starting with x1, x2 as basic variables (first vertex from the right in Fig. 11.4-2), one obtains the canonical form
The coefficient of x4 in the last expression is negative; x4 is, therefore, increased until x2 reaches zero (this takes place before x1 reaches zero and corresponds to the second vertex from the right in Fig. 11.4-2). Thus, K = 4, I = 2, and
The new canonical form relative to x1, x4 is
which corresponds to the unique minimum feasible solution with
(d) Use of Artificial Variables to Start the Simplex Procedure.
The simplex algorithm, as described, presupposes that a suitable choice of the initial basic variables x1, x2, . . . , xm has produced a basic feasible solution with β1, β2, . . . , βm > 0. Such a choice may not be obvious. The following procedure may be used to start the solution, or to decide whether a feasible solution exists.
With the given problem in the standard form (2), introduce m artificial variables xn+1, xn+2, . . . , xn+m and solve the “augmented” linear-programming problem minimizing
subject to the constraints
The m artificial variables necessarily define a basic feasible solution of the augmented problem; the original problem has no solution unless the minimum of the feasibility form ω = xn+1+ xn+2 + . . . + xn+m is zero.
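The phase-I device just described can be sketched numerically: augment the constraints with artificial variables and minimize the feasibility form; the original problem is feasible exactly when the minimum is zero. The data below are illustrative, not from the text.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative standard-form constraints A @ x = b, x >= 0 (b >= 0 assumed
# so that the artificial variables give an immediate basic feasible start).
A = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])
b = np.array([2.0, 0.0])
n, m = A.shape[1], A.shape[0]

# Augmented problem: variables (x, x_art), constraints A x + I x_art = b,
# objective = feasibility form w = sum of the artificial variables.
A_aug = np.hstack([A, np.eye(m)])
c_aug = np.concatenate([np.zeros(n), np.ones(m)])

res = linprog(c_aug, A_eq=A_aug, b_eq=b, bounds=[(0, None)] * (n + m))
print(res.fun)   # minimum of w; zero means the original problem is feasible
```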
11.4-3. Nonlinear Programming. The Kuhn-Tucker Theorem.
If the objective function and/or one or more inequalities in the linear-programming problem (1) is replaced by an expression nonlinear in the problem variables Xk, one has a nonlinear-programming problem. Such a problem results, for example, if the solution-set boundaries and/or the lines of constant z in Fig. 11.4-1 a are replaced by nonlinear curves. Nonlinear-programming problems are of considerable practical interest but are, with few exceptions (Ref. 11.11), accessible only to numerical solution methods (see also Sec. 20.2-6).
Assuming suitable differentiability, a necessary (not sufficient) condition for a maximum of
subject to m inequality constraints
is the existence of m + 1 nonnegative Lagrange multipliers λ0, λ1, λ2, . . . , λm (see also Sec. 11.3-4) not all equal to zero and such that
(Fritz John Theorem). λ0 is positive and can be set equal to unity in the condition (12a) whenever there exists a set of n real numbers (y1, y2, . . . , yn) such that, for the values of x1, x2, . . . , xn in question,
(Abadie's form of the Kuhn-Tucker Theorem, Ref. 11.1).
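The Kuhn-Tucker stationarity condition can be verified numerically on a small example. The problem data, sign conventions, and the multiplier value below are assumptions chosen for illustration, not taken from the text.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative nonlinear program (data assumed):
# maximize f(x) = -((x1-1)^2 + (x2-1)^2)
# subject to phi(x) = 1 - x1 - x2 >= 0.
f_neg = lambda x: (x[0] - 1)**2 + (x[1] - 1)**2      # minimize -f
con = {'type': 'ineq', 'fun': lambda x: 1 - x[0] - x[1]}

res = minimize(f_neg, x0=[0.0, 0.0], constraints=[con])
x = res.x                                            # optimum near (0.5, 0.5)

# Stationarity with lambda0 = 1: grad f + lambda * grad phi = 0,
# where lambda = 1 >= 0 was found by hand for this example.
grad_f = np.array([-2 * (x[0] - 1), -2 * (x[1] - 1)])
grad_phi = np.array([-1.0, -1.0])
lam = 1.0
print(np.abs(grad_f + lam * grad_phi).max())         # residual near zero
```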
11.4-4. Introduction to Finite Zero-sum Two-person Games. (a) Games with Pure Strategies. A finite zero-sum two-person game is a model of a conflict situation characterized by a finite payoff matrix,
where aik is the (positive or negative) payment due player A from player B if A chooses the ith of n pure strategies S1, S2, . . . , Sn available to him, and B chooses the kth of m pure strategies S'1, S'2, . . . , S'm possible for him. Neither player knows the other's choice. Note that the sum of the payoffs to both players is zero for each move (hence the name zero-sum game). The game is symmetrical if and only if m = n, and aki = –aik for all i, k.
To win, the maximizing player A selects the ith row of the payoff matrix so as to maximize the minimum over k of aik, while the minimizing player B selects the kth column so as to minimize the maximum over i of aik.
For every given payoff matrix (13)
If the two quantities in Eq. (14) are equal for a (not necessarily unique) pair i = I, k = K, the game is said to have the saddle point or solution I, K and the (necessarily unique) value aIK. The optimal strategies for such a game are unaffected if player A knows B's move beforehand, and vice versa.
(b) Games with Mixed Strategies. Mixed strategies are defined by probabilities (Sec. 18.2-2) p1, p2, . . . , pn assigned by player A to his n respective strategies S1, S2, . . . , Sn, and probabilities p'1, p'2, . . . , p'm assigned by player B to his strategies S'1, S'2, . . . , S'm. In a mixed-strategy game, A attempts to maximize the minimum of the expected value (Sec. 18.3-3)
through his choice of p1, p2, . . . , pn, while B tries to minimize
But for every given payoff matrix (13),
(Minimax Theorem for finite zero-sum two-person games). The common value of the two quantities (15) is called the value v of the game. Every finite zero-sum two-person game has at least one solution defined by the optimal strategies p1, p2, . . . , pn; p'1, p'2, . . . , p'm. Multiple solutions are possible, but the value of the game is necessarily unique. A fair zero-sum two-person game has the value 0.
EXAMPLE: The familiar penny-matching game has the payoff matrix
where strategy 1 for each player is a bet on heads, and strategy 2 is a bet on tails. The game is symmetrical and fair and has no saddle point. The solution is p1 = p2 = ½; p'1 = p'2 = ½.
(c) Relation to Linear Programming. While a number of approximation methods yielding solutions to mixed-strategy games have been developed (Ref. 11.10), the most generally applicable approach relates the problem to linear programming.
The solution of a given matrix game is unchanged by addition of a positive constant a to each aik, so that it is not a restriction to consider only games with positive values v > 0. In this case, the optimal strategies p1, p2, . . . , pn and p'1, p'2, . . . , p'm and the value v for a finite
zero-sum two-person game with the payoff matrix (13) are given by
where zmax, wmin, and the Xi and Yk are determined by the solution of the dual linear-programming problems (Sec. 11.4-1)
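This reduction can be demonstrated on the penny-matching example, including the constant shift that makes the game value positive. The sketch uses the standard substitution Xi = pi/v; since the displayed equations are not reproduced above, this formulation is an assumption consistent with the text's description.

```python
import numpy as np
from scipy.optimize import linprog

# Penny-matching payoff matrix; shift by a constant so the value is positive
a = np.array([[1.0, -1.0], [-1.0, 1.0]])
shift = 2.0
A = a + shift                        # value of shifted game = v + shift > 0

# Standard reduction (Xi = pi / v): minimize sum(X) subject to
# A.T @ X >= 1, X >= 0; then v = 1/sum(X) and p = v * X.
res = linprog(np.ones(2), A_ub=-A.T, b_ub=-np.ones(2),
              bounds=[(0, None)] * 2)
v_shifted = 1.0 / res.x.sum()
p = v_shifted * res.x

print(v_shifted - shift)             # value of the original game: 0 (fair)
print(p)                             # optimal mixed strategy [0.5, 0.5]
```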
11.5. CALCULUS OF VARIATIONS. MAXIMA AND MINIMA OF DEFINITE INTEGRALS
11.5-1. Variations. (a) A variation δy of a function y(x) of x is a function of x defined for each value of x as the difference δy = Y(x) – y(x) between a new function Y(x) and y(x).
Each variation δy defines a change in the functional relationship of y and x and must not be confused with a change Δy in the value of a given function y(x) due to a change Δx in the independent variable (Sec. 11.2-1).
(b) Given a function F[y1(x), y2(x), . . . , yn(x); x], the variation δF corresponding to the respective variations δy1, δy2, . . . , δyn of the functions y1(x), y2(x), . . . , yn(x) is
If y(x) and δy are differentiable, the variation δy' of the derivative y'(x) due to the variation δy is

δy' ≡ δ(dy/dx) = (d/dx) δy

More generally, given F[y1(x), y2(x), . . . , yn(x); y'1(x), y'2(x), . . . , y'n(x); x],
(c) If F is suitably differentiable, Eq. (3) can be expanded in the form
for each value of x (Sec. 4.10-5). The terms of degree 1, 2, . . . in the δyi and δy'i in the Taylor expansion (4) respectively constitute the first-order variation δ1F of F, the second-order variation δ2F of F,
(d) The definitions and formulas of Secs. 11.5-1a, b, and c apply without change to functions y, yi, and F of two or more independent variables x1, x2, . . . .
11.5-2. Maxima and Minima of Definite Integrals. While the theory of ordinary maxima and minima (Secs. 11.2-1 to 11.3-5) is concerned with unknown values of independent variables x or xi corresponding to maxima and minima of given functions, it is the objective of the calculus of variations to find unknown functions y(x) or yi(x) which will maximize or minimize definite integrals like
for a specified function F. I is a functional (Sec. 12.1-4) determined by the function(s) y(x) or yi(x) together with the integration limits x0, xF and the boundary values y(x0), y(xF) or yi(x0), yi(xF).
A definite integral (5) has a strong maximum for given integration limits x0, xF and a given function y(x) or a given set of functions yi(x) if and only if there exists a positive real number ε such that the variation
is negative for all increments δx0, δxF and all variations δy or δy1, δy2, . . . , δyn whose absolute values are less than ε and not identically zero. A strong minimum of a definite integral (5) is similarly defined by δI > 0.
The integral I has a weak maximum (or a weak minimum) for given x0, xF and y(x) or y1(x), y2(x), . . . , yn(x) if and only if there exists a positive real number ε such that δI < 0 (or δI > 0, respectively) only for all increments δx0, δxF and all variations δy or δy1, δy2, . . . , δyn and δy' or δy'1, δy'2, . . . , δy'n whose absolute values are less than ε and not identically zero. A strong maximum (or minimum) is necessarily also a weak maximum (or minimum).
If the region of definition for “admissible” functions y(x) or y1(x), y2(x), . . . , yn(x) is bounded through inequalities such as
a function y(x) or yi(x) maximizing or minimizing I can lie wholly in the interior of the region of definition (interior maximum or minimum), or wholly or partially on its boundary (boundary maximum or minimum, see also Sec. 11.6-6).
In most applications, the maximizing or minimizing functions y(x) or yi(x) need not be compared with all possible functions of x. In the following, the existence of the definite integrals in question is assumed wherever necessary, and it is understood that (1) maximizing or minimizing functions are to be chosen from the set of all functions having piecewise continuous first derivatives on the interval or region under consideration. In addition, it will be assumed that (2) each integrand F is twice continuously differentiable on the integration domain (see also Sec. 11.6-1c).
11.5-3. Solution of Variation Problems. Functions y(x) or yi(x) which maximize or minimize a given definite integral may be found (1) as solutions of differential equations which ensure that δ1I = 0 (Secs. 11.6-1 to 11.6-7), or (2) by the “direct” methods described in Secs. 11.7-1 to 11.7-3.
It is not a trivial observation that a given problem of the type discussed here may not possess any solution. Every solution derived with the aid of the necessary (not sufficient) conditions of Secs. 11.6-1 to 11.6-7 must be tested for actual maximum or minimum properties. Some sufficient conditions for the existence of maxima or minima of definite integrals are discussed in Sec. 11.6-10.
11.6. EXTREMALS AS SOLUTIONS OF DIFFERENTIAL EQUATIONS: CLASSICAL THEORY
11.6-1. Necessary Conditions for the Existence of Maxima and Minima. (a) A necessary condition for the existence of either a maximum or a minimum of the definite integral
for fixed x0, xF is
for an arbitrary small variation δy. Hence every maximizing or minimizing function y(x) must satisfy the differential equation

∂F/∂y – (d/dx)(∂F/∂y') = 0     (Euler equation)     (2)

wherever the quantity on the left exists and is continuous (see also Sec. 11.6-1c). In addition, y(x) must either assume given boundary values y(x0) and/or y(xF), or y(x) must satisfy other conditions determining its boundary values (Sec. 11.6-5).
(b) Similarly, every set of n functions y1(x), y2(x), . . . , yn(x) maximizing or minimizing the definite integral
must satisfy the set of n differential equations

∂F/∂yi – (d/dx)(∂F/∂y'i) = 0     (i = 1, 2, . . . , n)     (4)

together with suitably given boundary conditions, wherever all the quantities on the left exist and are continuous (see also Sec. 11.6-1c).
(c) Functions y(x) or yi(x) satisfying the Euler equation or equations associated with a given variation problem are called extremals for the problem in question.
A further necessary (still not sufficient) condition for a maximum or minimum of I on a given extremal is that the matrix [∂²F/∂y'i ∂y'k] be, respectively, negative or positive semidefinite (Sec. 13.5-2) on the extremal (Legendre's Condition); this reduces to ∂²F/∂y'² ≤ 0 or ∂²F/∂y'² ≥ 0, respectively, in the one-dimensional case (see also Secs. 11.6-7 and 11.6-10).
In general, the necessary conditions of Secs. 11.6-1a and b apply only where the y(x) or yi(x) are twice continuously differentiable, but this is not as restrictive a condition as it might seem. Let the integrand F be twice continuously differentiable for x0 < x < xF. Then all continuously differentiable functions y(x) or yi(x) which actually maximize or minimize I for given x0, xF and given boundary values necessarily have continuous second-order derivatives for all x in (x0, xF) where the matrix [∂²F/∂y'i ∂y'k] is negative or positive definite (Sec. 13.5-2), or ∂²F/∂y'² ≠ 0 (Theorem of Du Bois-Reymond).
(d) The derivation of the conditions (2) and (4) from δ1I = 0 is based on the fundamental lemma of the calculus of variations: if f(x) is continuous in (x0, xF) and

∫ from x0 to xF of f(x)g(x) dx = 0

for every continuously differentiable g(x) such that g(x0) = g(xF) = 0, then f(x) ≡ 0 in (x0, xF); and on Du Bois-Reymond's Lemma: a continuous function f(x) is necessarily a constant in (x0, xF) if

∫ from x0 to xF of f(x)g'(x) dx = 0

for every such g(x).
FIG. 11.6-1a. Two of the infinite set of broken-line extremals minimizing I = ∫ from 0 to 2 of y'²(1 – y')² dx with y(0) = 0, y(2) = 1. The Euler-Lagrange equation yields extremal segments y = ax + b, with a and b determined by the boundary conditions and a Weierstrass-Erdmann condition for each corner; it follows that a is either 0 or 1. Note that ∂²F/∂y'² = 12y'² – 12y' + 2 (with undetermined y') can vanish anywhere in (0, 2), but ∂²F/∂y'² > 0 on each minimizing extremal. Note also that no continuously differentiable extremal yields a smaller value of I (Refs. 11.19 and 11.22).
FIG. 11.6-1c. Reflection of extremals from a boundary curve.
FIG. 11.6-1b. Extremals with corners.
EXAMPLES OF APPLICATIONS INVOLVING VARIATION PROBLEMS: geodesics in Riemann spaces (Sec. 17.4-3); derivation of Lagrange's equations from Hamilton's principle. See also Sec. 15.4-7.
EXAMPLE: Brachistochrone in Three Dimensions. Given a three-dimensional rectangular cartesian coordinate system with the positive x axis vertically downward, this classical problem requires the determination of the space curve y = y(x), z = z(x) which will minimize the time t taken by a particle sliding on the curve without friction under the action of gravity to get from the origin to some specified point [xF > 0, y(xF), z(xF)]. Since by conservation of energy
where g is the acceleration of gravity, the quantity to be minimized is
The Euler equations (4) become
where c1 and c2 are constants. The curve must lie in a vertical plane, which will be the xy plane if the boundary conditions
are assumed. In this case, z' = 0 and
The constant k must vanish, since y = 0 for x = 0; the constant a will depend on the values of xF and yF. The desired curve represents a cycloid (Sec. 2.6-2e) with its base along the y axis and its cusp at the origin.
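The Euler equation for this example can be reproduced symbolically with sympy's `euler_equations`; the sketch below applies it to the time integrand written out above (variable names are ours, and the factor 1/sqrt(2g) is carried along explicitly).

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

# Planar brachistochrone, x axis vertically downward as in the text,
# so the speed is v = sqrt(2 g x) by conservation of energy.
x, g = sp.symbols('x g', positive=True)
y = sp.Function('y')
yp = y(x).diff(x)

# Integrand of the time functional t = integral of sqrt((1 + y'^2)/(2 g x)) dx
F = sp.sqrt((1 + yp**2) / (2 * g * x))

eq, = euler_equations(F, [y(x)], x)    # the Euler equation (2) for this F
print(eq)

# F does not contain y itself, so dF/dy' is a first integral,
# proportional to y' / sqrt(x (1 + y'^2)), constant along extremals
print(sp.simplify(sp.diff(F, yp)))
```

Setting the first integral equal to a constant reproduces the cycloid equation quoted in the text.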
NOTE: The problem of the brachistochrone, like the problem of Sec. 11.6-3, is an example of the minimization of a line integral (Sec. 4.6-10). Problems of this type can always be reduced to problems involving integration over a single independent variable.
11.6-2. Variation Problems with Constraints or Accessory Conditions. The Method of Lagrange Multipliers. It is desired to find sets of functions y1(x), y2(x), . . . , yn(x) which maximize or minimize the definite integral (3) while also subject to m < n suitably differentiable constraints or accessory conditions
Unless it is possible to eliminate m of the n variables yi directly with the aid of the relations (5), one obtains the desired sets of functions y1(x), y2(x), . . . , yn(x) as solutions of the set of differential equations (Euler equations)
subject to the constraints (5), where
The unknown functions λj(x) are called Lagrange multipliers (see also Sec. 11.3-4). If the problem has a solution, the n + m functions yi and λj can be determined from the n + m relations (6) and (5) together with the given boundary conditions. The differential equations (6) are necessary (but not sufficient) conditions for a maximum or minimum, provided that all the quantities on the left exist and are continuous.
The method of Lagrange multipliers remains applicable in the case of suitably differentiable nonholonomic constraints of the form
11.6-3. Isoperimetric Problems. An isoperimetric problem requires one to find sets of functions y1(x), y2(x), . . . , yn(x) which will maximize or minimize a definite integral (3) subject to the accessory conditions
where the ck are given constants. If the given functions Ψk are suitably differentiable, the method of Lagrange multipliers applies; the unknown functions yi(x) must satisfy a set of differential equations (6) subject to the constraints (9), where now
The m' Lagrange multipliers μk are constants to be determined together with the n unknown functions yi(x) from the m' + n relations (9) and (6) and the given boundary conditions.
EXAMPLE: The area of a closed plane curve x = x(t), y = y(t) can be written in the form
if the parameter t is suitably chosen. To maximize I subject to the accessory condition
where R is a constant different from zero, write
The resulting Euler equations (6), viz.,
and the given accessory conditions are satisfied by
i.e., the desired curve is a circle of radius R (see also Sec. 11.7-1).
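The isoperimetric property of the circle can be illustrated numerically. This sketch (the comparison shapes and the common perimeter are assumptions, not part of the text) checks that at equal perimeter L the circle encloses more area than a square or an equilateral triangle:

```python
import math

L = 1.0  # common perimeter (assumed value)

area_circle   = L**2 / (4 * math.pi)          # circle of radius L/(2*pi)
area_square   = (L / 4)**2                    # square of side L/4
area_triangle = (L / 3)**2 * math.sqrt(3) / 4 # equilateral triangle, side L/3

print(area_circle, area_square, area_triangle)
```

As expected, the circle's area L²/4π exceeds L²/16 (square) and L²/12√3 (triangle).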
NOTE: To maximize or minimize the definite integral (3) subject to m accessory conditions (5) and m' accessory conditions (9), apply the conditions (6) with
11.6-4. Solution of Variation Problems Involving Higher-order Derivatives in the Integrand. To maximize or minimize definite integrals of the form
one may introduce all derivatives of order higher than one as new dependent variables related to one another and to the yi by constraints yi'' = dyi'/dx, yi''' = dyi''/dx, . . . . The resulting necessary conditions for a maximum or minimum of the definite integral (11) take the form
provided that the integrand F is suitably differentiable, and all the quantities on the left exist and are continuous. In order to solve the problem completely, one must also specify the boundary values of all but the highest derivatives of each function yi appearing in the integrand.
11.6-5. Variation Problems with Unknown Boundary Values and/or Unknown Integration Limits. (a) Given Integration Limits, Unknown Boundary Values. To maximize or minimize the definite integral (3) when one or more of the boundary values yi(x0) and/or yi(xF) are not specified or constrained (Sec. 11.6-5c), replace each missing boundary condition by a corresponding natural boundary condition
(b) Given Boundary Values, Unknown Integration Limit. If one of the integration limits, say xF, is unknown but all boundary values yi(xF) are given, then the extremals yi(x) must satisfy
(c) General Transversality Conditions. Frequently, an integration limit, say xF, and/or one or more of the corresponding boundary values yi(xF) will not be given explicitly, but the “point” [xF; y1(xF), y2(xF), . . . , yn(xF)] is known to satisfy nF given continuously differentiable relations
The integration limit xF and the unknown functions yi(x) which together maximize or minimize the integral (3) must then satisfy the Euler equations (4) and the boundary conditions (15) together with a set of transversality conditions expressed by
for each yi(xF) which is not given explicitly, and/or
if xF is not explicitly known. The Λi are nF constant Lagrange multipliers to be determined, together with up to n + 1 unknowns yi(xF) and/or xF, from the n + nF + 1 relations (15) and (16). Note that Eqs. (13) and (14) are special cases of Eqs. (16a) and (16b).
(d) Analogous methods apply if the boundary conditions corresponding to x = xo are not completely specified.
EXAMPLE: To minimize a line integral
where (x0, y0, z0) and (xF, yF, zF) lie on two suitable (convex) nonintersecting regular surfaces S0, SF specified by
the solution extremal must satisfy the transversality conditions
i.e., it must intersect S0 and SF orthogonally. If, in particular, G(x, y, z) ≡ 1, then I is a distance between the surfaces, and the extremals are straight lines. The actual existence of a minimum depends on the specific nature (convexity) of the given surfaces.
11.6-6. The Problems of Bolza and Mayer. A functional of the form
to be maximized by a suitable choice of functions yi(x) (Problem of Bolza), subject to suitably given boundary conditions of the form (15), can, for suitably differentiable h, be written as
where h = h(x) is an additional dependent variable satisfying the differential-equation constraint
and the boundary conditions
The yi(x) must satisfy the usual Euler-Lagrange equations for x0 < x < xF.
An analogous procedure is used if a function of the boundary values corresponding to x = xo is added to the given expression (17).
If the yi(x) are subject to given differential-equation constraints but F = 0, so that only the boundary function h remains to be maximized or minimized, one has the Problem of Mayer, which will be considered in the context of optimum-control theory in Secs. 11.8-1 to 11.8-6.
11.6-7. Extremals with Corners. Refraction, Reflection, and Boundary Extrema. The functions y(x) or yi(x) maximizing or minimizing a definite integral (1) or (3) were assumed to be only piecewise continuously differentiable (Sec. 11.5-2). They can have corners (discontinuities in first-order derivatives) for values of x such that ∂²F/∂y'² = 0, or where the matrix [∂²F/∂y'i ∂y'k] is only semidefinite for some y, y' or y1, y2, . . . , yn; y'1, y'2, . . . , y'n, or where F has a discontinuity (Secs. 11.6-1 and 13.5-2). Corners may, in particular, occur
1. For any x over an interval of x values, as illustrated by the example of Fig. 11.6-1a.
2. On a curve, surface, or hypersurface
crossed by the extremal curves (“refraction” of extremals, Fig. 11.6-1b).
3. On the boundary of a region from which the extremals are excluded by an inequality constraint
(Fig. 11.6-1c and d).
At every “free” corner [x1; y(x1)] or [x1; y1(x1), y2(x1), . . . , yn(x1)] between extremals (Fig. 11.6-1a, but not b, c, d), the extremal segments must satisfy
For refraction (Fig. 11.6-1b) and reflection (Fig. 11.6-1c) of extremals, the curve, surface, or hypersurface defined by Eq. (21) or (22) acts like an intermediate boundary where each extremal must satisfy Eq. (21) together with the corner conditions
where Λ is a constant Lagrange multiplier. In each case, the corner conditions supplement the given boundary conditions to determine the corner points.
In the boundary-extremum case illustrated by Fig. 11.6-1d, extremals must enter and leave the boundary along a tangent to the boundary (see also Sec. 11.8-5).
NOTE: In the special case I = ∫ G(x, y, z) ds with ds² = dx² + dy² + dz², the corner conditions yield the refraction and reflection laws of optics if the function G(x, y, z) is interpreted as the reciprocal of the propagation velocity (Ref. 11.21).
11.6-8. Canonical Equations and Hamilton-Jacobi Equation. The n second-order Euler equations (4) associated with the variation problem of Sec. 11.6-1b are equivalent to the 2n first-order canonical equations
(contact transformation, see also Secs. 10.2-6 and 10.2-7).
Note that the transversality conditions (13), (14), (16), and the corner conditions (23), (24) are greatly simplified if one introduces the pi and H. In classical dynamics, the canonically conjugate or adjoint variables pi are interpreted as generalized momenta, and the Hamiltonian function H has the dimensions of energy. Sections 11.8-1 to 11.8-6 restate the calculus of variations in terms of canonical equations in the context of optimum-control theory.
One can sometimes simplify the solution of the problem by introducing 2n + 1 new variables ỹi, p̃i, and H̃ through a suitable canonical transformation (Sec. 10.2-6) which yields new canonical equations
(b) Assuming that the minimum
defines unique optimal extremals for fixed xF, yi(xF), and variable x0 = X, yi(x0) = Yi (i = 1, 2, . . . , n), the function S(X, Y1, Y2, . . . , Yn) satisfies the first-order partial differential equation
The ordinary differential equations (25a) are the corresponding characteristic equations (see also Secs. 10.2-4 and 11.8-7).
11.6-9. Variation Problems Involving Two or More Independent Variables: Maxima and Minima of Multiple Integrals. One often wishes to determine a set of n functions y1(x1, x2, . . . , xm), y2(x1, x2, . . . , xm), . . . , yn(x1, x2, . . . , xm) of m independent variables x1, x2, . . . , xm so as to maximize or minimize a multiple integral
where the integrand F is a given function of the n + m + mn variables yi, xk, and ∂yi/∂xk. The boundary S of the region of integration V and the boundary values of the functions yi may or may not be given. The functions yi are assumed to be chosen from the set of all functions having continuous second partial derivatives in V (see also Sec. 11.5-2).
If the integrand F is suitably differentiable, every set of maximizing or minimizing functions y1, y2, . . . , yn must satisfy the set of partial differential equations
together with suitable boundary conditions, whenever the quantities on the left exist and are continuous. The Lagrange-multiplier methods of Secs. 11.6-2 and 11.6-3 apply to variation problems involving accessory conditions.
If the boundary values of one or more of the unknown functions yi are not given, one must use the condition δ1I = 0 to derive natural boundary conditions analogous to Eq. (13). Thus, in the case of two independent variables x1, x2, let the boundary S be a regular closed curve described in terms of its arc length s by x1 = x1(s), x2 = x2(s);
then the natural boundary conditions are
The case of an unknown boundary is treated in Ref. 11.21.
EXAMPLE: The small displacements of a string of length L may be represented by a function y(x, t) of the coordinate x measured along the string and of the time t. The kinetic and potential energies of the entire string are, respectively,
where m is the mass per unit length, and Q is the string tension. For a minimum of the integral
(Hamilton's principle, Ref. 11.22), one must have
which is the correct wave equation for the vibrating string.
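The wave equation m ∂²y/∂t² = Q ∂²y/∂x² can be verified numerically on a standing-wave mode. In this sketch, the mode shape and the parameter values are assumptions introduced for illustration; second derivatives are estimated by central differences:

```python
import math

m, Q, L = 1.0, 4.0, 1.0          # assumed mass density, tension, length
c = math.sqrt(Q / m)             # propagation velocity
w = math.pi * c / L              # fundamental angular frequency

def y(x, t):
    # fundamental standing-wave mode, satisfying y(0, t) = y(L, t) = 0
    return math.sin(math.pi * x / L) * math.cos(w * t)

# central-difference estimates of the second partial derivatives
h = 1e-4
x0, t0 = 0.3, 0.7
ytt = (y(x0, t0 + h) - 2 * y(x0, t0) + y(x0, t0 - h)) / h**2
yxx = (y(x0 + h, t0) - 2 * y(x0, t0) + y(x0 - h, t0)) / h**2

residual = m * ytt - Q * yxx     # should be near zero for a true solution
print(residual)
```

Any other point (x0, t0) in the interior gives a similarly small residual.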
11.6-10. Simple Sufficient Conditions for Maxima and Minima. Consider a one-parameter family of extremals [solutions of the Euler-Lagrange equation (2)] y = y(x, μ) for the definite integral
with y(x0, μ) = y0 for a parameter range μ1 < μ < μ2 such that
exists. The extremals corresponding to this parameter range all go through the point (x0, y0); they will not intersect again in (x0, xF) if z(x, μ) satisfies the differential equation
in (x0, xF) [extremal field centered at (x0, y0)]. The function
obtained through elimination of μ with the aid of y = y(x,μ) represents the slope of the extremal field at a point (x, y).
An extremal y(x, μ) of the field satisfying Eqs. (2) and (31) maximizes the given integral (30) if the function
is nonpositive for all points (x, y) sufficiently close to y(x, μ); this is, in particular, true if
along the field extremal in question. The maximum is a strong maximum (Sec. 11.5-2) if E ≤ 0 for all (x, y) sufficiently close to the extremal curve and arbitrary y'.
E ≤ 0 along the extremal for every finite y' (Weierstrass's necessary condition) is necessary for a strong maximum, but is not sufficient by itself. Conditions for a minimum of the integral (30) similarly correspond to E ≥ 0. The multidimensional case is treated in Refs. 11.16 and 11.17.
11.7. SOLUTION OF VARIATION PROBLEMS BY DIRECT METHODS
11.7-1. Direct Methods. Each so-called direct method for maximizing or minimizing a given definite integral
attempts to approximate the desired function y(x) successively by a sequence of functions u1(x), u2(x), . . . selected so as to satisfy the boundary conditions imposed on y(x). Each function ur(x) is taken to be a differentiable function of x and of r parameters αr1, αr2, . . . , αrr. The latter are then chosen so as to maximize or minimize the function
with the aid of the relations
(Sec. 11.3-3a).
Every tentative solution y(x) obtained in this manner as the limit of a sequence of approximating functions u1(x), u2(x), . . . still requires a proof that the definite integral (1) is actually a maximum or minimum.
Analogous methods of solution apply to variation problems involving more than one unknown function (Sec. 11.6-1b) and/or more than one independent variable (Sec. 11.6-9). If accessory conditions are given (Secs. 11.6-2 and 11.6-3), they are made to apply to each approximating function; the approximating integrals may then be maximized or minimized by the Lagrange-multiplier method of Sec. 11.3-4 (see example).
Direct methods can yield numerical approximations and/or exact solutions. Since, under the conditions listed in Sec. 11.5-2, the solutions of variation problems must satisfy differential equations expressing the condition δ1I = 0, every direct method for the solution of a variation problem is also an approximation method for solving differential equations.
EXAMPLE: To solve the isoperimetric problem given as an example in Sec. 11.6-3,
so that
To maximize Ir subject to the accessory condition
one applies the Lagrange-multiplier method of Sec. 11.3-4 and finds
for r = 1, 2, . . . . Starting with k = 1, one obtains λr = ½, β1 = −α1, α1 = b1;
for k > 1, ak = bk = αk = βk = 0. In this example, the direct method yields the exact solution
11.7-2. The Rayleigh-Ritz Method. This method attempts to expand the desired solution y(x) as a series in terms of a complete set of functions φi(x) (Sec. 15.2-4) so that the approximating functions satisfy the given boundary conditions. As a rule, the φi(x) are orthogonal functions, so that the parameters αrj = αj are independent of r (Sec. 15.2-4), as in the example of Sec. 11.7-1.
The Rayleigh-Ritz method is useful for numerical solution of certain eigenvalue problems in vibration theory and quantum mechanics (see also Sec. 15.4-7).
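A minimal Rayleigh-Ritz sketch follows; the model functional, the sine basis, and the tolerances are assumptions chosen for illustration. One minimizes I[y] = ∫₀¹ (½y'² − y) dx with y(0) = y(1) = 0, whose Euler equation y'' = −1 has the exact solution y = x(1 − x)/2; the quadratic form in the coefficients leads to the linear system K a = b:

```python
import numpy as np

N = 5                                  # number of Ritz basis functions (assumed)
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]

def integ(f):
    # composite trapezoidal rule on the fixed grid
    return (f[0] / 2 + f[1:-1].sum() + f[-1] / 2) * dx

# orthogonal basis phi_j = sin(j*pi*x), satisfying the boundary conditions
phi  = np.array([np.sin(j * np.pi * x) for j in range(1, N + 1)])
dphi = np.array([j * np.pi * np.cos(j * np.pi * x) for j in range(1, N + 1)])

# Minimizing I(a) = 1/2 a^T K a - b^T a gives K a = b, with
# K_ij = integral of phi_i' phi_j' and b_i = integral of phi_i.
K = np.array([[integ(dphi[i] * dphi[j]) for j in range(N)] for i in range(N)])
b = np.array([integ(phi[i]) for i in range(N)])
a = np.linalg.solve(K, b)

y_ritz  = a @ phi
y_exact = x * (1 - x) / 2
err = np.max(np.abs(y_ritz - y_exact))
print(err)    # small: the Ritz coefficients 4/(j*pi)^3 decay rapidly
```

Because the sine functions are orthogonal, K is (numerically) diagonal and each coefficient is independent of N, as the text notes.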
11.7-3. Approximation of y(x) by Polygonal Functions. y(x) may be approximated by polygonal functions ur(x) each defined, say, by its r values αr1 = ur(x0 + Δx), αr2 = ur(x0 + 2Δx), . . . , αrr = ur(x0 + r Δx) = ur(xF — Δx). In this case, the conditions (3) yield a difference equation (Sec. 20.4-3) approximating the Euler equation (11.6-2) of the given variation problem.
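As a sketch of this polygonal (finite-difference) approach on an assumed model problem, minimize I[y] = ∫₀¹ (y'² + y²) dx with y(0) = 0, y(1) = 1. Setting ∂I_r/∂α_i = 0 for the polygon ordinates yields the difference equation (y_{i+1} − 2y_i + y_{i−1})/Δx² = y_i, a discrete analogue of the Euler equation y'' = y, whose exact solution is y = sinh x / sinh 1:

```python
import math
import numpy as np

N = 100                     # number of subintervals (assumed)
h = 1.0 / N

# Interior ordinates y_1 .. y_{N-1}; boundary values y_0 = 0, y_N = 1.
# Stationarity of I_r = sum((y_{i+1}-y_i)^2/h + h*y_i^2) gives the
# tridiagonal system (2/h^2 + 1) y_i - (y_{i-1} + y_{i+1})/h^2 = 0.
A = np.zeros((N - 1, N - 1))
rhs = np.zeros(N - 1)
for i in range(N - 1):
    A[i, i] = 2.0 / h**2 + 1.0
    if i > 0:
        A[i, i - 1] = -1.0 / h**2
    if i < N - 2:
        A[i, i + 1] = -1.0 / h**2
rhs[-1] = 1.0 / h**2        # known boundary value y_N = 1 moved to the right side

y = np.linalg.solve(A, rhs)
x = np.linspace(h, 1.0 - h, N - 1)
err = np.max(np.abs(y - np.sinh(x) / math.sinh(1.0)))
print(err)                  # O(h^2) discretization error
```

Refining Δx drives the polygonal solution to the extremal, mirroring the convergence statement of Sec. 11.7-1.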
11.8. OPTIMAL-CONTROL PROBLEMS AND THE MAXIMUM PRINCIPLE
11.8-1. Problem Statement. (a) State Equations, Controls, and Criterion. In control problems, the state of a dynamical (mechanical, electrical, chemical, etc.) system is represented by n state variables x1(t), x2(t), . . . , xn(t) satisfying n first-order differential equations
Typical state variables are generalized coordinates and velocities in mechanics, electric currents and voltages, and concentrations of chemicals (Sec. 11.8-3); the independent variable t is usually the time. The problem is to determine the r control variables (controls) uk = uk(t) (k = 1, 2, . . . , r) as functions of t for t0 ≤ t ≤ tF so as to minimize a given criterion functional
(e.g., cost, mean-square error, time to achieve a task, etc.) subject to inequality constraints
defining the closed domain of admissible controls U. Possible constraints on the admissible states (x1, x2, . . . , xn) will be discussed in Secs. 11.8-1d and 11.8-5.
The optimal control functions uk(t) will produce an optimal trajectory xi = xi(t) in the n-dimensional state space. Solution of such a control problem requires suitably given boundary conditions to determine initial and final values xi(t0), xi(tF); the initial and final times t0, tF may themselves be unknowns (Sec. 11.8-1c).
(b) Optimum-control Theory and the Calculus of Variations. The methods of Secs. 11.8-2 to 11.8-5 may be regarded as a somewhat generalized calculus of variations applied to the important class of problems defined in Sec. 11.8-1. In the language of earlier sections, both the xi(t) and the uk(t) are unknown dependent variables* yi(t). The state equations (1) are differential-equation constraints, and the variables pi defined in Sec. 11.8-2 are the corresponding variable Lagrange multipliers. The adjoint equations and maximum principle introduced in Sec. 11.8-2 constitute necessary (not sufficient) conditions for optimal uk, xi and are essentially equivalent to the Euler equations (11.6-4) plus the condition E ≥ 0 (Sec. 11.6-10), where they apply. The maximum principle states the optimizing conditions in an elegant, convenient, and
* As a matter of fact, the entire problem of Secs. 11.6-1 to 11.6-7, i.e., maximization or minimization of integrals
with suitable constraints and/or boundary conditions is reformulated as an optimum-control problem if one substitutes
In this special case, the theory of Sec. 11.8-2 leads to the classical canonical equations of Sec. 11.6-8; this approach may or may not simplify the given problem.
more general form, permitting a relatively straightforward treatment of systems with discontinuous control variables (Sec. 11.8-3).
(c) Initial-state and Terminal-state Manifolds. While the starting time t0 and the initial values xi(t0) of the state variables are given in many control problems, a more general problem statement merely requires the initial state [x1(t0), x2(t0), . . . , xn(t0)] to lie on a given (n − n0)-dimensional initial-state manifold (hypersurface, curve, or point in the state space) defined by
The terminal state [x1(tF), x2(tF), . . . , xn(tF)] is similarly constrained to lie on a given (n − nF)-dimensional terminal-state manifold defined by
In addition, one may have given inequality constraints defining allowable regions on each manifold (4). Note that a single inequality
confining, say, the terminal state to the interior of a given n-dimensional state-space region is essentially equivalent to the corresponding equality constraint, since each trajectory entering the terminal-state region must cross its boundary.
(d) Continuity, Differentiability, and Independence Assumptions (see also Sec. 11.5-2 and Ref. 11.25). Unless the contrary is specifically stated, it will be assumed in the following that, for all xi, Uk, and t under consideration,
1. The given functions f0, f1, f2, . . . , fn are continuously differentiable with respect to the state variables xi, and continuous in the control variables uk.
2. The functions Qj are continuously differentiable with nonzero gradients.
3. The n0 functions Bj and the nF functions Gj defining the initial- and terminal-state manifolds are continuously differentiable; functional independence is assured by the condition that the ranks (Sec. 13.2-7) of the matrices [∂Bj/∂xi] and [∂Gj/∂xi] are, respectively, equal to n0 and nF.
4. All admissible controls uk(t) are to be chosen from the class of functions of bounded variation in (t0, tF) (hence, unilateral limits exist everywhere, Sec. 4.4-8), and piecewise continuously differentiable wherever they are continuous. The corresponding xi(t) will be piecewise continuously differentiable.
Less restrictive assumptions are often possible, but the resulting theorem statements become cumbersome (Ref. 11.25).
(e) Generalizations (see also Secs. 11.8-4 and 11.8-5). The optimization methods introduced in Secs. 11.8-1 to 11.8-5 can be made to apply to much more general problems. In particular,
1. If one or more of the given functions fi, Qj, φj, ψj depend explicitly on the independent variable t (“nonautonomous” system), one reduces the problem to the simpler case by introducing t = xn+1 as an extra state variable, with
This procedure applies also if a Bj depends explicitly on t0, or a Gj on tF.
2. Continuously differentiable equality constraints on the state variables,
and isoperimetric conditions
with piecewise continuously differentiable φj can be handled by the Lagrange-multiplier method of Secs. 11.6-2 and 11.6-3. One replaces the function f0 in the criterion functional (2) with
where the λj and μj are, respectively, variable and constant Lagrange multipliers. Such constraints are not enforced by the control, but modify the definition of the given system. Control-enforced constraints on the xi will be treated in Sec. 11.8-5.
3. Criterion functionals of the form
reduce to the simpler form (2) if one introduces
(see also Sec. 11.6-6).
4. If the constraints (3) on the control variables uk depend explicitly on the state variables xi (including, possibly, t = xn+1), one can usually eliminate this dependence by introducing new control variables.
EXAMPLE: A constraint (3) of the form
reduces to the form
if one introduces the new control variable v defined by uk = v q(x1, x2, . . . , xn).
11.8-2. Pontryagin's Maximum Principle. (a) Adjoint Variables and Optimal Hamiltonian. It is convenient to treat the criterion functional (2) as the final value x0(tF) of an added state variable x0(t) satisfying the state equation
and the initial condition
Necessary conditions for optimal control are then defined in terms of Pontryagin's Maximum Principle: Define n + 1 adjoint variables p0(t), p1(t), p2(t), . . . , pn(t) as solutions of the n + 1 first-order differential equations
Then the optimal control minimizing the criterion functional (2) is given by that set of admissible control variables uk = uk(t) which maximize the Hamiltonian function
for each t between t0 and tF; moreover,
In addition, the optimal xi(t) and uk(t) must satisfy the given conditions (1) to (4), and transversality conditions
corresponding to the given boundary conditions (4); the Λj and Λ'j are unknown constants (see also Secs. 11.6-5, 11.6-8, and 11.8-2b).
Given the assumptions listed in Sec. 11.8-1d, the adjoint variables pi(t) will be piecewise continuously differentiable functions. They remain continuous even if one admits discontinuities of f0 and/or f1, f2, . . . , fn on a hypersurface S given by
provided that g is continuously differentiable, and that the fi possess unilateral derivatives with respect to x1, x2, . . . , xn on either side of S (“refraction” of optimal trajectories, see also Sec. 11.6-7). Under these conditions, the optimal Hamiltonian, too, remains continuous on S.
(b) The Boundary-value Problem. Pontryagin's maximum condition yields, in principle, relations expressing each control variable uk in terms of the xi and pi, i.e.,
These relations may be obtained through solution of an ordinary maximum-of-a-function problem [possibly with inequality constraints (3)] for each t. Once this has been done, the optimum-control problem reduces to the solution of the 2n + 2 first-order differential equations (1), (12), and (14), or
subject to Eq. (17) and the boundary conditions. Since the adjoint equations (14) are homogeneous in the pi, one can arbitrarily choose the constant in Eq. (15), so that
One now has exactly 2n + n0 + nF + 2 boundary conditions (4), (13), (18), and (21) to determine 2n + 2 unknown constants of integration, n0 + nF unknown multipliers Λj, Λ'j, and the unknown time interval tF − t0. The missing boundary condition is obtained through substitution of t = tF into Eq. (17). Unless either t0 or tF is given explicitly, one introduces the additional state variable xn+1 defined by
(see also Sec. 11.8-1e).
NOTE: Whenever a boundary condition (4), say Gj = 0, permits explicit determination or elimination of a boundary value, say x1(tF), then p1(tF) is undetermined by the relations (4) and (18). In this case, the jth equation (4b) (and hence Λ'j) and the first equation (18b) are simply omitted. If, on the other hand, the boundary value of a state variable, say xi(tF), is left “free” or undetermined by the terminal-state constraints (4b), then the ith equation (18b) yields the corresponding “natural boundary condition” pi(tF) = 0 (see also Sec. 11.6-5). Similar special cases apply to Eqs. (4a) and (18a).
(c) Since the maximum principle, as given, expresses only necessary (not sufficient) conditions for optimum control, the Pontryagin method may yield multiple candidates for the optimal solution, or no solution may exist. Actual solution of the two-point boundary-value problem usually requires numerical iteration methods (see also Secs. 20.9-2 and 20.9-3). In addition, actual derivation of the maximizing functions (19) may also require successive approximations (Refs. 11.23 and 11.24).
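A minimal shooting sketch for such a two-point boundary-value problem (the example problem is an assumption, not from the text): minimize ∫₀¹ ½u² dt with ẋ = u, x(0) = 0, x(1) = 1. Here H = p u − ½u² is maximized by u = p, the adjoint equation gives ṗ = 0, and one iterates on the unknown p(0) until the terminal condition x(1) = 1 is met:

```python
def shoot(p0, steps=1000):
    # integrate x' = u = p, p' = 0 by Euler's method on [0, 1]
    x, p, dt = 0.0, p0, 1.0 / steps
    for _ in range(steps):
        u = p          # maximizes H = p*u - u**2/2
        x += dt * u
        p += dt * 0.0  # adjoint equation p' = -dH/dx = 0
    return x

# bisect on the unknown initial adjoint value p(0)
lo, hi = 0.0, 5.0
for _ in range(60):
    mid = (lo + hi) / 2
    if shoot(mid) < 1.0:
        lo = mid
    else:
        hi = mid
p0 = (lo + hi) / 2
print(p0)
```

Bisection works here because the terminal value x(1) is monotone in p(0); in general one would need Newton-type or gradient iteration on several unknown initial adjoint values.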
11.8-3. Examples. (a) Zermelo's Navigation Problem. A ship with rectangular cartesian coordinates x1, x2 runs at a given constant velocity V in a current with given local velocity components v1(x1, x2), v2(x1, x2) (v1² + v2² = v²). Given x1(0) = 0, x2(0) = 0, one desires to minimize the time tF required to reach the given terminal state x1F, x2F by choosing the angle u = u(t) between the instantaneous velocity relative to the water and the x1 axis. The state equations are
If, in particular, v1 and v2 are constant, then so are p1, p2, and u; their values, together with tF, must satisfy
(b) Simple Bang-bang Time-optimal Control. Given x1(0), x2(0) = ẋ1(0) and the state equations
one desires to minimize the time tF required to reach a given (attainable) point (x1(tF), x2(tF)).
Maximization of the Hamiltonian
so that p1 = p1(0), p2 = p2(0) − tp1(0). The optimal trajectories in the x1, x2 plane (phase plane) are parabolic arcs corresponding to u = 1 and u = −1. These arcs
FIG. 11.8-1. Phase-plane trajectories for bang-bang control of a simple optimal-control problem (Sec. 11.8-3b).
intersect the “switching curve” corresponding to p2 = 0, and each trajectory continues to the origin on the latter (Fig. 11.8-1). Each specific trajectory depends on the unknown parameters p1(0), p2(0), which must be determined so that the given boundary conditions x1(tF) = x2(tF) = 0 are satisfied.
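The switching-curve behavior can be simulated in feedback form (a sketch; the state equations ẋ1 = x2, ẋ2 = u with |u| ≤ 1 are the standard double-integrator example and, like the step size, initial state, and tolerances, are assumptions where the text's equations are elided). The feedback uses u = −1 above the curve x1 = −½x2|x2| and u = +1 below it; from x1(0) = 1, x2(0) = 0 the minimal time to the origin is 2:

```python
import math

def simulate(x1, x2, dt=1e-4, t_max=5.0):
    # time-optimal bang-bang feedback for x1' = x2, x2' = u, |u| <= 1
    t = 0.0
    while math.hypot(x1, x2) > 0.02 and t < t_max:
        s = x1 + 0.5 * x2 * abs(x2)     # switching function
        u = -1.0 if s > 0 else 1.0
        x1 += dt * x2                   # forward-Euler integration
        x2 += dt * u
        t += dt
    return t, x1, x2

t, x1, x2 = simulate(1.0, 0.0)
print(t, x1, x2)
```

The simulated trajectory follows the parabolic arc for u = −1, switches once on the curve, and runs along it to the origin in a time close to the optimal value 2.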
(c) A Simple Minimal-time Orbit-transfer and Rendezvous Problem. Referring to Fig. 11.8-2, the motion of a simple rocket-driven vehicle in a vertical plane, assuming a flat earth (constant acceleration of gravity −g) and no air resistance, is given by the four state equations
(Fig. 11.8-2). x = x1, ẋ = x2, y = x3, and ẏ = x4 are state variables; g, T (thrust), m0 (vehicle mass at start), and ṁ < 0 (fuel-consumption rate, assumed constant) are
FIG. 11.8-2. Geometry for the orbit-transfer problem (Sec. 11.8-3c).
given constants. It is desired to transfer the vehicle from a given horizontal starting orbit defined by
to the given horizontal orbit of a target, and to match its given position and constant velocity:
This is to be accomplished by programming the control variable (attitude angle) u = u(t) so as to minimize the total fuel consumption |ṁ|(tF − t0), or simply tF − t0. Since the initial- and final-state manifolds depend explicitly on t0 and tF, one introduces x5 = t with dx5/dt = 1, x5(t0) = t0, x5(tF) = tF. Maximization of the Hamiltonian,
Since the starting and terminal values of x, y, and y are fixed, the only transversality conditions (18) of interest are those resulting from
With p1 and p5 constant, this implies p1 = p5 = 0. Furthermore, the maximal Hamiltonian is zero at t = tF:
This, together with the eight given initial and terminal conditions, determines the nine quantities t0, tF, x(t0), ẋ(t0), y(t0), ẏ(t0), p2(t0), p3(t0), p4(t0), and thus the complete solution.
11.8-4. Matrix Notation for Control Problems. The control problem can be very conveniently expressed in the matrix notation of Sec. 13.6-1 (and also in the corresponding tensor notation, Sec. 14.7-7). One introduces
x = x{t) = [x0, x1, x2, . . . , xn] (n+1) x 1 column matrix representing the state vector u = u(t) = {u1, u2,. . . , ur} r X 1 column matrix representing the control vector p = p(t) = (p0, p1 p2, . . . , pn), 1 X (n + 1) row matrix representing the adjoint vector (see also Sec. 14.5-2). The relations of Secs. 11.8-1 and 11.8-2 can now be restated compactly; note that is an (n -j- 1) X (n + 1) matrix. In particular,one has
The criterion functional (2) may be regarded as the matrix product (inner product, Sec. 14.7-1) x0 = cx of the column matrix x and the 1 × (n + 1) row matrix c = (1, 0, 0, . . . , 0). Reference 11.25 treats the case of criterion functionals defined as cx with a more general row matrix c.
11.8-5. Inequality Constraints on State Variables. Corner Conditions (see also Sec. 11.6-7). (a) If the domain X of admissible system states (x0, x1, x2, . . . , xn) is restricted by continuously differentiable inequality constraints
which are to be enforced by the control in the course of the optimal-trajectory generation (see also Sec. 11.8-1e), the theory of Sec. 11.8-2 applies without change to trajectories and portions of trajectories situated in the interior of the closed state region X defined by Eq. (23). For trajectories and portions of trajectories situated on the boundary of X, at least one of the constraints (23) becomes an equality. For simplicity, consider a boundary region Ds with only a single equality, say
In this boundary region, the Lagrange-multiplier method of Secs. 11.6-2 and 11.8-1e applies; i.e., each optimal trajectory satisfies Eq. (24) and the relations of Sec. 11.8-2a, with f0 replaced by
where λ is a variable Lagrange multiplier.
Alternatively (and equivalently), f0 can be replaced on Ds by any one of the expressions
where λ(t) is a variable Lagrange multiplier.
(b) If S(K) is the first of the functions (27) which contains a control variable uk explicitly, S ≤ 0 is termed a Kth-order inequality constraint. In this case, an optimal trajectory entering the boundary region Ds defined by Eq. (24) from the interior of X at the time t = t1 must satisfy the corner conditions (jump conditions)
where the vk are constant Lagrange multipliers, and b is an arbitrary constant.
An optimal trajectory leaving the state-space boundary at the time t = t2 for the first time after entering at t = t1 must satisfy the corner conditions
The arbitrary constant b occurring in both Eq. (28) and Eq. (29) is usually taken to be zero, so that the pi(t) are continuous at the exit point. If an optimal trajectory is reflected by a boundary region Ds corresponding to a single constraint (24), one has t1 = t2, and the corner conditions (28) apply with b = 0.
If the single constraint (24) is replaced by two or more such constraints, terms corresponding to each new constraint (with additional multipliers) must be added to each sum in Eqs. (25), (26), (28), and (29). Explicit time dependence of constraints is treated in the manner of Sec. 11.8-1e.
NOTE: See Ref. 11.25 for exceptions to the corner conditions (28), (29) in special situations.
11.8-6. The Dynamic-programming Approach (see also Sec. 11.9-2). If the optimum-control problem defined in Sec. 11.8-1 defines unique optimal extremals for fixed xi(tF) and variable
then the criterion-function minimum
satisfies the first-order partial differential equation
where M is the optimal (maximized) Hamiltonian function Eq. (17) (see also Sec. 11.6-8). The ordinary differential equations (20) are the corresponding characteristic equations; their solutions are thus readily determined if a complete integral of the partial differential equation (31) is known (Sec. 10.2-4).
The dynamic-programming approach "imbeds" a given optimum-control problem in an entire class of similar problems with different initial coordinates. The partial differential equation (31), which solves this entire set of problems, expresses the fact that every portion of an optimal trajectory optimizes the criterion functional for the corresponding initial and terminal points (Bellman's Principle of Optimality). It is possible to derive the entire optimum-control theory under very general assumptions from the principle of optimality. In general, numerical solution of Eq. (31) is difficult for n > 2 (see Refs. 11.13 to 11.15, which also discuss dynamic programming in a more general context).
11.9. STEPWISE-CONTROL PROBLEMS AND DYNAMIC PROGRAMMING
11.9-1. Problem Statement. An important class of optimization problems involving stepwise control (this includes control of continuous systems by digital computers) or stepwise optimal resource allocation can be formulated as follows: a system described by a set of state variables x ≡ (x1, x2, . . . , xn) proceeds through a sequence of states 0x, 1x, 2x, . . . such that each state change is given by the state equations (in
this case, difference equations, see also Sec. 20.4-3)
where the “control variable” k+1u ≡ (k+1u1, k+1u2, . . . , k+1ur) defines the sequence of decisions (policy) changing the kth system state into the (k + 1)st state. Given the initial state 0x (and, possibly, a set of suitable equality or inequality constraints on state and control variables), it is desired to find an optimal policy 1u, 2u, . . . , Nu which will minimize a given criterion function
where N = 1, 2, . . . is the number of steps considered (dynamic programming). As in continuous optimal-control problems, the initial and final states may or may not be given by the problem; and it is again possible to generalize the problem statement in the manner of Sec. 11.8-1e.
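The problem statement above can be sketched concretely. In the following illustration (not from the handbook), the state equation is a hypothetical scalar difference equation k+1x = kx + k+1u, and the criterion accumulates a hypothetical quadratic per-step cost over the N decision steps.

```python
# Sketch of the stepwise-control problem of Sec. 11.9-1 (illustrative only).
# State equation (a difference equation):  (k+1)x = f((k)x, (k+1)u).
# The scalar dynamics and quadratic cost below are hypothetical examples.

def f(x, u):
    """State equation: next state from current state x and decision u."""
    return x + u

def step_cost(x, u):
    """Per-step contribution to the criterion function."""
    return x**2 + u**2

def criterion(x0, policy):
    """Total cost of applying the policy (1)u, ..., (N)u from initial state x0."""
    x, total = x0, 0.0
    for u in policy:
        total += step_cost(x, u)
        x = f(x, u)
    return total

cost = criterion(2.0, [-1.0, -0.5, -0.5])   # one candidate 3-step policy
```

Finding the policy that minimizes `criterion` over all admissible decision sequences is exactly the dynamic-programming problem; the recurrence of Sec. 11.9-2 reduces that search to a stepwise minimization.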
11.9-2. Bellman's Principle of Optimality (see also Sec. 11.8-6). If 1u, 2u, . . . , Nu is an optimal policy resulting in the state sequence 0x, 1x, 2x, . . . , Nx for a given dynamic-programming problem with initial state 0x, then 2u, 3u, . . . , Nu is an optimal policy for the same criterion function and final state Nx but initial state 1x. If one denotes min Nx0(X) by NS(X),
the optimality principle leads to the fundamental recurrence relation (partial difference equation, Sec. 20.4-3b)
where minima are determined in accordance with any given constraints. Numerical solution of this functional equation for the unknown function NS(X) amounts to stepwise construction of a class of optimal policies for many initial states. The desired optimal policy is “imbedded” in this class. Solution usually requires digital computation; even so, solution of problems with more than two or three state variables xi is practical only in special cases. References 11.12 to 11.18 describe many examples and approximation methods, and Ref. 11.20 discusses a stepwise-control analogy to the Maximum Principle.
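The recurrence can be solved numerically by a backward sweep that tabulates kS(x) over a discretized state space, which is the stepwise construction described above. The sketch below (illustrative only; the integer state grid, admissible controls, dynamics, and quadratic cost are hypothetical) implements NS(x) = min over u of [cost(x, u) + N−1S(f(x, u))].

```python
# Sketch of the fundamental recurrence (N)S(x) = min_u [cost(x,u) + (N-1)S(f(x,u))],
# solved by tabulating (k)S over a discrete state grid.  Hypothetical example:
# integer states 0..10, decisions u in {-1, 0, 1} clipped to the grid.

def f(x, u):
    """State equation (difference equation), clipped to the state grid."""
    return max(0, min(10, x + u))

def cost(x, u):
    """Per-step cost (hypothetical quadratic example)."""
    return x * x + u * u

def solve(N):
    """Return S[x] = optimal N-step cost from state x, and optimal first decisions."""
    states, controls = range(11), (-1, 0, 1)
    S = {x: 0.0 for x in states}          # (0)S: zero-step cost-to-go
    best_u = {}
    for _ in range(N):                    # backward sweep: (k)S from (k-1)S
        S_new = {}
        for x in states:
            S_new[x], best_u[x] = min(
                (cost(x, u) + S[f(x, u)], u) for u in controls)
        S = S_new
    return S, best_u
```

Each sweep solves the problem for *every* grid state, so the table simultaneously yields optimal policies for many initial states; the one actually wanted is then read off, exactly as described in the text. The exponential growth of the grid with the number of state variables is the practical limitation noted above.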
11.10. RELATED TOPICS, REFERENCES, AND BIBLIOGRAPHY
11.10-1. Related Topics. The following topics related to the study of maxima and minima are treated in other chapters of this handbook:
Ordinary differential equations Chap. 9
Partial differential equations, canonical transformations Chap. 10
Eigenvalues of matrices and quadratic forms Chap. 13
Sturm-Liouville problems Chap. 15
Statistical decisions Chap. 19
Numerical methods Chap. 20
11.10-2. References and Bibliography (see also Sec. 20.2-6 for numerical optimization techniques).
Linear and Nonlinear Programming, Theory of Games
11.1 Abadie, J.: On the Kuhn-Tucker Theorem, ORC 65-18, Operations Research Center, University of California, Berkeley, 1965.
11.2 Dantzig, G. B.: Linear Programming and Extensions, Princeton University Press, Princeton, N.J., 1963.
11.3 Dresher, M.: Games of Strategy: Theory and Applications, Prentice-Hall, Englewood Cliffs, N.J., 1961.
11.4 Gale, D.: The Theory of Linear Economic Models, McGraw-Hill, New York, 1960.
11.5 Gass, S. I.: Linear Programming, 2d ed., McGraw-Hill, New York, 1964.
11.6 Hadley, G.: Linear Programming, Addison-Wesley, Reading, Mass., 1962.
11.7 Karlin, S.: Mathematical Methods and Theory in Games, Programming, and Economics, 3d ed., Addison-Wesley, Reading, Mass., 1959.
11.8 Kuhn, H. W., and A. W. Tucker: Nonlinear Programming, in Proc. 2d Berkeley Symp. on Math. Stat. and Prob., University of California Press, Berkeley, 1951.
11.9 Luce, R. D., and H. Raiffa: Games and Decisions, Wiley, New York, 1957.
11.10 Vajda, S.: Theory of Games and Linear Programming, Wiley, New York, 1956.
11.11 Boot, J. C. G.: Quadratic Programming, North-Holland, Amsterdam, 1964.
See also the articles by L. Rosenberg and E. M. Beale in M. Klerer and G. A. Korn: Digital Computer User's Handbook, McGraw-Hill, New York, 1967.
Calculus of Variations, Optimum-control Theory, and Dynamic Programming
11.12 Aris, Rutherford: Discrete Dynamic Programming, Wiley, New York, 1963.
11.13 Bellman, R.: Dynamic Programming, Princeton University Press, Princeton, N.J., 1957.
11.14 Bellman, R., and S. E. Dreyfus: Applied Dynamic Programming, Princeton University Press, Princeton, N.J., 1962.
11.15 Bellman, R., and R. Kalaba: Dynamic Programming and Modern Control Theory, Academic, New York, 1965.
11.16 Bliss, G. A.: Calculus of Variations, Open Court, Chicago, 1925.
11.17 Bolza, O.: Vorlesungen über Variationsrechnung, Stechert, New York, 1931.
11.18 Dreyfus, S. E.: Dynamic Programming and the Calculus of Variations, Academic, New York, 1965.
11.19 Elsgolc, L. E.: Calculus of Variations, Pergamon/Addison-Wesley, Reading, Mass., 1962.
11.20 Fan, Liang-Tsen, and Chin-sen Wang: The Discrete Maximum Principle, Wiley, New York, 1964.
11.21 Fan, Liang-Tsen: The Continuous Maximum Principle, Wiley, New York, 1966.
11.22 Gelfand, I. M., and S. V. Fomin: Calculus of Variations, Prentice-Hall, Englewood Cliffs, N.J., 1963.
11.23 Leitmann, G.: Optimization Techniques, Academic, New York, 1962.
11.24 Merriam, C. W.: Optimization Theory and the Design of Feedback Control Systems, McGraw-Hill, New York, 1964.
11.25 Pontryagin, L. S., et al.: The Mathematical Theory of Optimal Processes, Wiley, New York, 1962.
11.26 Weinstock, R.: Calculus of Variations, McGraw-Hill, New York, 1952.
11.27 Athans, M., and P. L. Falb: Optimal Control, McGraw-Hill, New York, 1966.