8
The Mathematical Theory of Control Processes
RICHARD BELLMAN
RESEARCH MATHEMATICIAN
THE RAND CORPORATION
8.1 Introduction
Let us begin by stating precisely what we mean by a control process. Consider a physical system S that we suppose to be described at any time t by a finite-dimensional vector
x(t) ≡ [x1(t), x2(t), …, xn(t)]
which we shall call the state vector. Initially, let us assume that this vector is determined as a function of time by means of a vector differential equation

dx/dt = g(x),  x(0) = c  (8.1)

where the vector c represents the initial state.
A large part of the theory of differential equations is devoted to the study of the existence and uniqueness of the solutions of Eq. (8.1) and to the qualitative and quantitative analysis of the solution. Let us assume here that we are able in one way or another to predict the future behavior of the system.
On comparing this actual behavior with some idealized or desired behavior, we may find that the extent of agreement is not satisfactory. If, as is often the case, it is impossible, or too expensive in time or resources, to construct a totally new system, we must devise an efficient method for modifying the behavior of the system we possess.
One ingenious idea is to use the very deviation of the system from desired behavior as a restoring force that will influence its behavior in the right way. This is the concept of feedback control, an idea that goes back at least as far as the governor on the Watt steam engine.
In many cases the mathematical version of this fruitful concept is the following: Suppose that the vector z(t) is the desired state of the system at any particular time, and let h(x − z) represent a restoring force applied to the system at time t. Assume that this is done in such a way that the new vector differential equation describing the system is

dx/dt = g(x) + h(x − z),  x(0) = c
In general, there will be a certain cost involved in doing this over a time interval [0,T], measured by a functional J1(x,z,h), and a cost due to the deviation of x from z. This latter may be a tractable functional such as

J2 = ∫0T (x − z, x − z) dt

where (x,y) is the inner product

(x,y) = x1y1 + x2y2 + ⋯ + xnyn

Or, more realistically, the functional may be a measure such as

J2 = max |xi(t) − zi(t)|, taken over i = 1, 2, …, n and 0 ≤ t ≤ T

where xi and zi are the ith components of x and z, respectively.
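As a small illustration of the first of these deviation measures, the quadratic cost can be estimated numerically from sampled trajectories. The following sketch assumes the trajectories are given as lists of state vectors sampled every dt time units, and uses trapezoidal weights; both choices are illustrative assumptions, not part of the text.

```python
def inner(x, y):
    # the inner product (x, y) = x1*y1 + ... + xn*yn
    return sum(xi * yi for xi, yi in zip(x, y))

def deviation_cost(xs, zs, dt):
    # trapezoidal estimate of the integral of (x - z, x - z) dt over [0, T],
    # given the actual trajectory xs and the desired trajectory zs,
    # each sampled every dt time units
    devs = []
    for x, z in zip(xs, zs):
        d = [xi - zi for xi, zi in zip(x, z)]
        devs.append(inner(d, d))
    return dt * (sum(devs) - 0.5 * (devs[0] + devs[-1]))
```

For a constant deviation of unit length over [0, 1], the estimate is exactly 1, and it vanishes when the actual and desired trajectories coincide.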
We thus arrive at the problem of choosing the vector function h, subject perhaps to certain physical constraints that we shall discuss in detail below, so as to minimize the sum J1 + J2.
Analytically, many of these problems can be subsumed under the problem of choosing a vector function y to minimize the functional

J(y) = ∫0T h(x,y) dG(t)

subject to the condition that x and y are related by a vector differential equation of the form

dx/dt = g(x,y),  x(0) = c
and perhaps subject to additional constraints of the type
ri(x,y) ≤ 0,  i = 1, 2, …, M,  0 ≤ t ≤ T
A process of this type we shall call a control process.
Two particularly important cases are those in which T is fixed and either dG(t) = dt or G(t) is a jump function with a single jump at t = T, so that
J(y) = h(x(T),y(T))
In this latter case, we often speak of terminal control.
The case in which T is variable and itself dependent on x and y is of great current interest. Particular cases are the “bang-bang” control problem27,41,48,53–55 and various trajectory problems.42 We shall call problems of this nature implicit variational problems. Little has been done on these problems, and they offer a fascinating field of research.
The role of the mathematician in this challenging study is threefold. In the first place, he has the task of formulating questions of this nature—which arise in economic, industrial, engineering, and biological contexts—in precise mathematical terms. He must convert a well-posed scientific problem into a well-posed mathematical problem. The two types of problems are not at all equivalent; see Ref. 21 for a further discussion of this point.
The foregoing consideration necessitates a study of the existence and uniqueness of solution of the variational problems that arise. Even more, it requires a choice of formulation. One of the main points of this survey will be the fact that in studying control processes we have two approaches: the classical calculus of variations,49 with modern amplifications, and the theory of dynamic programming.9,32,49
A second part of the general problem is that of determining the analytic structure of the solution. Infrequently, this can be done by means of an explicit solution, but most often it must be done without this luxury. We shall present, below, many examples of what we mean here.
A third part of the problem, and one that is required before we can consider the problem to be resolved in any sense, is that of producing algorithms that can be used for numerical solution.
The three parts are completely intertwined and interdependent. Clearly there is no point in attempting to find the numerical solution to a problem that does not possess a solution. Conversely, a knowledge of the uniqueness or multiplicity of a solution is an essential feature of a computational approach.
Finally, the very formulation of the control process in analytic terms is dependent on the numerical information that is available and the numerical results that are desired.
DETERMINISTIC CONTROL PROCESSES
8.2 The Calculus of Variations
A powerful tool for the investigation of problems of the type posed in Sec. 8.1 is the classical calculus of variations. Let us, for the sake of simplicity, consider the scalar variational problem of maximizing the functional

J(v) = ∫0T h(u,v) dt

over all scalar functions v(t), where u and v are related by means of the relationship

du/dt = g(u,v),  u(0) = c

By proceeding in the usual way, introducing a multiplier function λ(t), we obtain the Euler equations

∂h/∂v + λ ∂g/∂v = 0,  dλ/dt = −(∂h/∂u + λ ∂g/∂u)  (8.2)

and the boundary conditions

u(0) = c,  λ(T) = 0  (8.3)

The first condition is one with which we started, while the second arises from the variational process.
There is a temptation to regard Eq. (8.2) as a solution of the original variational problem. Actually, the problem has only begun at this point. Often, the computational solution is so difficult that the original variational problem is used to furnish a solution to Eqs. (8.2) and (8.3).
8.3 A Catalogue of Catastrophes
Let us now indicate briefly some of the difficulties encountered in applying the variational technique outlined above. Further discussion will be found in Ref. 32.
a. Two-point Boundary-value Problems. The computational solution of a differential equation with initial conditions is today a rather simple matter with the aid of digital computers. Systems involving 100 simultaneous differential equations in 100 unknowns are readily treated.
The situation is quite different when two-point conditions are imposed. For a discussion of these matters and of the methods that can be used to treat these more complex problems, see Ref. 17.
b. Relative Extrema. As in the case of ordinary calculus, the variational equation determines the position of the absolute maximum, but also the positions of relative maxima, relative minima, and critical points of more complex nature.
c. Constraints. If, in the problem of Sec. 8.2, we introduce the constraint
|v| ≤ m
then the process of solution becomes very much more intricate, and in some cases it eludes us completely. This is due to the fact that there is no longer one variational condition; rather, there is a set of Euler equations and inequalities. Furthermore, the number and order of these relationships depend on the solution itself.
d. Implicit Functionals. The Euler equation presupposes the existence of an analytic functional. As we shall see, some quite important problems cannot be posed in these terms.
e. Linearity. At the other extreme, variational problems posed in the simplest of analytic terms, even in linear terms, cannot be treated by means of classical variational techniques.
In view of these briefly sketched facts (a more detailed account of which will be found in Ref. 32) our program will be the following: We shall first discuss various classes of problems that can be treated by conventional techniques and modern extensions. Then we shall present a new technique, the theory of dynamic programming,9 which will enable us to treat deterministic, stochastic, and adaptive control processes in a uniform fashion.
8.4 Quadratic Criteria and Linear Equations
Let us begin by indicating one class of variational problems that can be analyzed rigorously and completely. Problems of this nature, regardless of their apparent simplicity and lack of realism, are always valuable as stepping stones, via successive approximations, to the solution of more significant problems.
Consider the problem of minimizing the quadratic functional

J(y) = ∫0T [(x, C(t)x) + (y, D(t)y)] dt

over all vector functions y(t), where x and y are related by means of the linear equation

dx/dt = A(t)x + y,  x(0) = c  (8.4)

Here, A(t), C(t), and D(t) are matrix functions.
It is easy to verify that the basic variational equation is linear in this case. Since, however, the boundary conditions are of the two-point type, there are still matters of existence and uniqueness to settle.
In place of deciding these questions by means of the theory of differential equations, it is better to take a more general point of view that simultaneously resolves a large class of problems of similar form. By using the solution X(t) of the matrix equation

dX/dt = A(t)X,  X(0) = I

where I denotes the unit matrix, we can write the solution of Eq. (8.4) in the form

x = X(t)c + X(t) ∫0t X−1(s) y(s) ds
Let us then write
x = L(y)
where L is an inhomogeneous linear transformation of the vector y.
The variational problem is now that of minimizing

J(y) = ∫0T [(L(y), C(t)L(y)) + (y, D(t)y)] dt
over all admissible vectors y. The advantage of this formulation lies in the fact that a large class of problems arising from differential-difference equations, partial differential equations, difference equations, integral equations, and so on, can all be subsumed under this general form.
Once the problem has been posed in this manner, we can employ quite general Hilbert-space techniques to establish existence and uniqueness of the solution, to derive the variational equations, and to deduce the dependence of the solution on basic parameters such as T. For detailed discussions see Refs. 26 and 29.
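In discrete time, the linear-quadratic problem of this section can be solved completely by a backward recursion of Riccati type. The following scalar sketch is an illustration only: the particular dynamics x_{k+1} = a x_k + b u_k and the stage cost q x² + r u² are assumed forms, not the text's notation.

```python
def lqr_gains(a, b, q, r, qT, N):
    # backward Riccati recursion for x_{k+1} = a*x_k + b*u_k with
    # cost sum_k (q*x_k^2 + r*u_k^2) + qT*x_N^2; returns feedback gains
    P = qT
    gains = []
    for _ in range(N):
        K = a * b * P / (r + b * b * P)    # optimal gain at this stage
        P = q + a * a * P - a * b * P * K  # cost-to-go coefficient
        gains.append(K)
    gains.reverse()                        # gains[k] is used at stage k
    return gains

def closed_loop_cost(a, b, q, r, qT, x0, gains):
    # simulate u_k = -gains[k] * x_k and accumulate the quadratic cost
    x, J = x0, 0.0
    for K in gains:
        u = -K * x
        J += q * x * x + r * u * u
        x = a * x + b * u
    return J + qT * x * x
```

For an unstable system (a > 1) the feedback policy beats the uncontrolled one by a wide margin, which is the linear-quadratic theory in miniature.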
8.5 Linear Criteria and Linear Constraints
A problem of a type that arises frequently in mathematical economics and occasionally in engineering work is that of maximizing the linear functional
over all vector functions y(t) satisfying the constraints
c(t) ≤ y(t) ≤ d(t),  0 ≤ t ≤ T
where the vectors x and y are connected by a linear differential equation, as in Eq. (8.4).
Here, we face an excess of linearity! The correct approach to variational problems of this nature is by means of the famous Neyman-Pearson lemma of mathematical statistics.26,29 See Refs. 26, 36, and 56 for the solution of particular problems.
8.6 Nonlinear Criteria and Constraints
The addition of constraints renders nontrivial even the simplest-appearing problems. As mentioned above, the reason for this lies in the fact that, in place of one Euler equation over the entire interval, one has a set of equalities and inequalities. Thus, if the problem is that of maximizing a functional of the type considered in Sec. 8.2, over all scalar functions v satisfying a constraint such as |v| ≤ m, where u and v are again related by means of a differential equation, then the solution can take the form shown in Fig. 8.1.
This means that the Euler equation is satisfied for 0 < t < T1, an Euler inequality for T1 < t < T2, and so on.
Fig. 8.1 Possible solution of variational problem with inequality constraint.
At the present time, there exists no systematic technique for determining the number of transition points, or their location. For a detailed discussion of particular problems under various assumptions, see Refs. 25, 26, and 57.
8.7 Implicit Functionals
An interesting class of control processes that has attracted a great deal of attention in recent years arises from the following considerations. Let the state of the system be determined by the linear vector differential equation

dx/dt = Ax + y,  x(0) = c

and let it be required to determine the vector y, subject to the constraint
|yi(t)| ≤ mi,  i = 1, 2, …, n,  0 ≤ t
so as to minimize the time required to drive the system into a preassigned state. This problem, under the name of the “bang-bang” control problem, has been extensively investigated in the United States; see Refs. 27, 41, 54, and 55, where further references may be found. Under the name of “optimal time systems,” it has been extensively studied also in Russia; see Refs. 48 and 53, where references to the work of Pontrjagin, Gamkrelidze, and others are given.
The methods that have been used are a combination of classical analytical techniques and modern analytical-topological procedures based on moment-space concepts. Dynamic programming can also be used for both analytic and computational purposes.
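The structure of the solution is easy to exhibit in the simplest instance, a double integrator x″ = y with |y| ≤ m driven from rest at the origin to rest at a distance d: full thrust for half the time, full braking for the other half, a classical bang-bang result. The closed-form bookkeeping below is an illustrative sketch of that one special case, not the general theory.

```python
import math

def min_time(d, m):
    # time-optimal rest-to-rest transfer over distance d for x'' = u,
    # |u| <= m: accelerate at +m for T/2, brake at -m for T/2
    return 2.0 * math.sqrt(d / m)

def final_state(d, m):
    # verify the switching schedule by piecewise closed-form kinematics
    T = min_time(d, m)
    v = m * (T / 2)                       # velocity after the thrust phase
    x = 0.5 * m * (T / 2) ** 2            # position after the thrust phase
    x += v * (T / 2) - 0.5 * m * (T / 2) ** 2  # braking phase
    v -= m * (T / 2)
    return x, v
```

The control takes only its extreme values, with a single switch; the hard part of the general problem is locating the switching times for higher-order systems.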
8.8 Dynamic Programming
Let us now briefly outline the application of the theory of dynamic programming to control processes, referring the reader to Refs. 9 and 32 for more detailed discussions. In a number of cases, the techniques of dynamic programming furnish the only feasible computational solution of the variational problems that arise. We shall indicate the limitation of these methods below.
The basic idea of the dynamic-programming approach is to regard a control process of the type described, say, in Sec. 8.2, as a multistage decision process of continuous type. If we write

f(c,T) = max over v of J(v)

then the principle of optimality5,9 yields the nonlinear partial differential equation

∂f/∂T = max over v of [h(c,v) + g(c,v) ∂f/∂c]  (8.5)

If a constraint such as |v| ≤ k is added, then this becomes

∂f/∂T = max over |v| ≤ k of [h(c,v) + g(c,v) ∂f/∂c]

We can use Eq. (8.5) for computational purposes, invoking standard difference techniques, or we can use the approximation equation

f(c, T + Δ) ≅ max over v of [h(c,v)Δ + f(c + g(c,v)Δ, T)]

arising from a discrete version of the original process.
A discussion of the computational aspects will be found in the author’s book on adaptive control processes.19
The advantage of this approach is that it handles linearities or non-linearities, constraints or not, implicit or explicit functionals, in precisely the same manner, and with the same basic equation. The disadvantages arise from the requirement that the computer tabulate the function f(c,T). For the case of scalar variational problems, this is easily done. But in the general case, we face a function of the n variables ci, the components of the vector c.
With present computers, this limits routine solution to variational problems in which the dimension n of x is at most 2; and it limits feasible solution by various coarsenings of grid, and so on, to cases in which the dimension is at most 4. For problems of higher dimensions, we must use analytic devices,6,12 the method of successive approximations,13 and techniques of analytic representation.24
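The tabulation just described is easy to exhibit for a scalar problem. In the sketch below, the dynamics u_{k+1} = u_k + h v_k, the quadratic stage cost, the grid bounds, and the snapping of successor states to the nearest grid point are all illustrative assumptions; the point is that the table f_N(c) is built by one sweep of the grid per stage, and each additional state dimension multiplies the size of the grid.

```python
def tabulate(N, h=0.2, umax=2.0, du=0.1, vset=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    # tabulate f_N(c): minimal sum of u_k^2 + v_k^2 over N stages of the
    # dynamics u_{k+1} = u_k + h * v_k, with states snapped to a finite grid
    grid = [round(-umax + i * du, 10) for i in range(int(2 * umax / du) + 1)]
    index = {u: i for i, u in enumerate(grid)}

    def snap(u):
        # clamp to the grid's range, then round to the nearest grid point
        u = min(max(u, -umax), umax)
        return round(round(u / du) * du, 10)

    f = [0.0] * len(grid)  # f_0 = 0 for every starting state
    for _ in range(N):
        f = [min(u * u + v * v + f[index[snap(u + h * v)]] for v in vset)
             for u in grid]
    return grid, f
```

For a grid of G points per dimension, an n-dimensional state requires G to the power n entries per stage, which is precisely the dimensionality limitation described above.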
8.9 Trajectories
A very interesting type of control process arises from the determination of optimal trajectories for space rockets and satellites. A great deal of effort has been devoted to this field, using both conventional variational techniques47,49,57 and dynamic-programming techniques.37,42
Generalized trajectory problems arise in a number of different fields. Abstractly, we wish to transform a system from one point in phase space to another point in phase space at minimum cost. Here, the cost may be measured in terms of time, fuel, or other quantities.
Particular versions of this general problem are the actual trajectory problems, the bang-bang control problem mentioned above, shortest-route problems,14,31 some problems in chemical engineering,2 and a variety of problems in the field of sequential machines.18,30 All of these can be treated in a uniform fashion by means of the theory of dynamic programming.
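The shortest-route problem already displays the whole idea: the minimal cost f(v) of reaching node v satisfies a recurrence of dynamic-programming type, which can be solved by repeated relaxation of the edges. A Python sketch, with the graph and weights invented for illustration:

```python
def shortest_routes(n, edges, source):
    # dynamic-programming recurrence f(v) = min over edges (u, v, w) of
    # f(u) + w, solved by relaxing every edge n - 1 times
    INF = float("inf")
    f = [INF] * n
    f[source] = 0.0
    for _ in range(n - 1):
        for u, v, w in edges:
            if f[u] + w < f[v]:
                f[v] = f[u] + w
    return f
```

The n − 1 sweeps suffice because a shortest route uses at most n − 1 edges; this relaxation scheme is the familiar Bellman-Ford method.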
8.10 Computational Aspects
Those readers who are interested in the actual numerical solutions of variational problems by means of functional-equation techniques may wish to refer to the author’s book in collaboration with S. Dreyfus23 and to Refs. 22 and 42.
It is interesting to note that even in cases of linear equations and quadratic criteria, for which the variational equations are linear and hence capable of being solved explicitly, the functional-equation approach may still yield a simpler and quicker computational treatment; see Refs. 10 and 11.
STOCHASTIC CONTROL PROCESSES AND GAME THEORY
8.11 Stochastic Effects
A mathematical theory of stochastic processes can be introduced either on the grounds of expediency, as a mathematical device for circumventing analytic or conceptual difficulties, or on the grounds of realism, as a more accurate picture of the universe. The reader can take his choice of the philosophical basis he prefers, since the mathematical consequences are the same.
At the present time, although there is great interest in stochastic control processes, the problems that have been treated fall into only two categories, those specified by linear equations and quadratic criteria and those that have been approached by means of dynamic programming.
In place of the deterministic differential equations governing the process, we have stochastic equations of the form

du/dt = g(u, v, r(t)),  u(0) = c

where r(t) is a random variable with given distribution. In place of maximizing a prescribed functional, we maximize the expected value of a functional of the form

J(v) = ∫0T h(u, v, r(t)) dt
Since the mathematical theory of variational problems of this nature is in its infancy, it is well to begin with discrete versions that can be treated rigorously.
For the case of linear equations and quadratic criteria, however, a well-established theory exists; it is discussed in a number of readily available books.3,63
For the treatment of more general stochastic control processes by means of dynamic-programming techniques, see Ref. 15.
8.12 Games against Nature
Game theory enters into the consideration of control processes in several different ways: as a substitute for knowledge, in the study of pursuit processes, and as an analytic technique for treating some important classes of variational problems. As in the case of stochastic processes, the theory has barely been conceived and many fascinating and significant problems are as yet untouched. A number of interesting results may be found in Ref. 64.
Let us introduce the concept of a “game against Nature,” to begin our discussion. One of the objections that can be leveled against the theory of stochastic control processes, as outlined in the preceding section, is that the distribution of the random function may not be known. This is a question we shall examine in much greater detail in the following part on adaptive control processes. Here, we overcome the foregoing objection by supposing that the distribution function for r(t) will be chosen by someone completely opposed to our interests. This “someone” we call Nature. In this way, the mathematical theory of games enters the theory of control processes.
There are now two approaches, as in the theory of deterministic control processes, one based on an extension of the classical calculus of variations,38,43,64 and one based on the theory of dynamic programming. A game against Nature of this type is viewed as a multistage game, and accordingly it is susceptible to the functional-equation approach.7,20,35,58,59
All aspects of the problem are now very much more difficult. This is the price that we must pay for ignorance of the precise disturbing effects in the system that we wish to control.
8.13 Pursuit Processes
In various applications of control processes, we encounter situations in which one person or object is chasing another person or object. One is trying to effect capture, and the other is trying to evade capture. These processes may quite legitimately be regarded as two-person games of multistage type, and a great deal of effort has been devoted to their study. The problems are quite difficult to pin down precisely, and most of the significant problems not only remain open but have not even been precisely formulated.32,50
8.14 Analytic Techniques
It sometimes happens that variational problems that are originally maximization or minimization problems of rather complex type become quite simple when viewed as min-max problems. They can then be regarded as arising from the theory of games. Not only does this simplify our solution of the problem, but it enormously simplifies the proof of the extremum property. The technique is used in Ref. 28, and the idea of replacing an ordinary equation by one containing a maximization or minimization in order to obtain a stronger foothold is extensively discussed in Refs. 4 and 52.
ADAPTIVE CONTROL PROCESSES
8.15 Adaptive Systems
Let us now, again quite briefly, glance into one of the newest and most exciting domains of science: the theory of adaptive control processes. In treating stochastic control processes and multistage games, we have skirted the unknown; in discussing adaptive processes, we must boldly grapple with it.
The general problem that faces us is that of exerting control in some efficient way in situations in which we do not possess enough information for a classical formulation. This can occur if we do not know
a. The distribution of the random variables involved
b. The analytic form of the function g(u, v, r(t)) appearing in the describing equation
c. The state of the system at any particular time
d. The analytic form of the criterion function h(u, v, r(t))
e. The duration of the process
The type of process that confronts us is then a learning process, in which we must obtain the missing information, to the best of our ability, as the control process unfolds. The decisions at any stage may then consist of control operations and information probes.
The first theory of this type to be developed, from a statistical and economic background, was the theory of sequential analysis,65,66 although the first analytic efforts were in the field of biological control processes.61,62 For subsequent papers in this field, see Refs. 8 and 51. From the psychological side, the subject has been extensively studied; see Ref. 40, where many references are given. For a discussion of some experimental work, see Refs. 44 and 45.
In the following section, we shall present an approach to these problems based on dynamic-programming ideas.
8.16 Functional-equation Approach
We now characterize the state of a system in which uncertainties occur in terms of two quantities: a point p in phase space, describing the physical state of the system, and an information pattern P, describing what we know about the unknown features of the process.
The problem of making precise what we mean by the information pattern is a formidable one, and it is one of the principal obstacles to progress. One of the most frustrating aspects of these problems is that the very formulation of a problem can itself be a major task. The concept is best elucidated by particular examples. Thus, P may consist of a priori estimates for the distribution function of a random variable, or of parameters specifying the particular distribution function; or it may merely be a catalogue of past events of importance.
Our assumption will be that a decision modifies p and P in known fashion, dependent on the actual outcome of the decision; the new state p′ and the new information pattern P′ are thus known functions of p, P, the decision, and a random variable r, which we suppose to be governed by our a priori probability distributions. Thus the distribution function for r has the form
dG(r;p,P)
Let us now consider a typical terminal-control problem in which, starting in state (p,P), we wish to minimize the expected value of ϕ(pN), where pN is the state after N stages. Write

fN(p,P) = min E[ϕ(pN)]
where E denotes expected value.
The expected value E is to be computed according to the a priori probability distributions. Then, by the principle of optimality,5,9 the recurrence relationship we obtain is

fN(p,P) = min over decisions of E[fN−1(p′,P′)]

for N = 2, 3, …, with

f1(p,P) = min over decisions of E[ϕ(p′)]

where p′ and P′ denote the state and information pattern resulting from the decision.
8.17 Computational Aspects
The foregoing relationships provide a conceptual basis for a theory of adaptive control processes. More detailed discussions, and applications to feedback control, communication processes, and medical testing, will be found in Refs. 1, 8, 16, 33, 34, and 46.
A successful computational approach is dependent on characterizing P by means of numbers rather than functions. For this, one can draw upon a number of devices used by statisticians, devices that may be lumped under the heading of “sufficient statistics.” Applications of these ideas are given in the references cited in the first paragraph of this section.
AN ILLUSTRATIVE EXAMPLE
8.18 Formulation
In order to make some of the foregoing concepts precise, let us consider a particular control problem in three versions: deterministic, stochastic, and adaptive. As our differential equation, we shall take the van der Pol equation,

u″ + λ(u² − 1)u′ + u = r(t)  (8.7)

where r(t) is a forcing term. For the purposes of digital computation, we shall use a discrete version of this equation, namely,

un+1 = un + Δvn
vn+1 = vn − Δ[λ(un² − 1)vn + un] + Δrn  (8.8)

obtained from the system equivalent to Eq. (8.7),

du/dt = v,  dv/dt = −λ(u² − 1)v − u + r(t)
We shall suppose that our aim in introducing feedback control is to keep the system ruled by Eq. (8.7) as close to the equilibrium state
u = u′ = 0
as possible over a time interval [0,T]. For the sake of simplicity, we shall use the criterion

∫0T (|u| + |u′|) dt

as a measure of the deviation from equilibrium, or, in discrete terms, the criterion

Σn (|un| + |vn|)

Let us represent the effect of feedback control by a forcing term wn, so that Eq. (8.8) takes the form

un+1 = un + Δvn
vn+1 = vn − Δ[λ(un² − 1)vn + un] + Δ(rn + wn)

The cost of control will be measured by a function g(wn), so that the total measure of the control process will be given by the function

JN(w) = Σn [|un| + |vn| + g(wn)]
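A discrete version of this kind is immediate to program. The sketch below advances the van der Pol system u″ + λ(u² − 1)u′ + u = r(t) by explicit Euler steps; the step size, the parameter value, and the explicit-Euler form itself are illustrative assumptions.

```python
def vdp_step(u, v, lam, r, dt):
    # one explicit-Euler step of u'' + lam*(u**2 - 1)*u' + u = r(t),
    # written as the system u' = v, v' = -lam*(u**2 - 1)*v - u + r
    return u + dt * v, v + dt * (-lam * (u * u - 1.0) * v - u + r)

def simulate(u0, v0, lam, r, dt, steps):
    # trajectory of the discrete system from the state (u0, v0)
    traj = [(u0, v0)]
    u, v = u0, v0
    for _ in range(steps):
        u, v = vdp_step(u, v, lam, r, dt)
        traj.append((u, v))
    return traj
```

With no forcing, the equilibrium u = u′ = 0 is preserved exactly by the discrete scheme, while any nearby state drifts away from it, which is what makes the control problem nontrivial.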
8.19 Deterministic Case
Let us take rn to be independent of n, for the sake of simplicity, and suppose that we wish to choose the wn to minimize the function JN(w) appearing above. Write

fN(c1,c2) = min over w of JN(w)

where (c1,c2) denotes the initial state. Then we readily obtain the recurrence relationship

fN(c1,c2) = min over w of [|c1| + |c2| + g(w) + fN−1(c1′,c2′)]

for N ≥ 1, where (c1′,c2′) is the state following (c1,c2) under the control w, with

f0(c1,c2) = |c1| + |c2|
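The recurrence of this section can be carried out directly by memoized recursion. In the sketch below, the discrete van der Pol dynamics, the constant forcing r, the control cost g(w) = w², and the finite set of admissible control values are all illustrative assumptions; successor states are rounded so that the memo table stays finite.

```python
from functools import lru_cache

DT, LAM, R = 0.1, 1.0, 0.5        # step size, nonlinearity, constant forcing
W = (-1.0, -0.5, 0.0, 0.5, 1.0)   # admissible control values (illustrative)

def step(c1, c2, w):
    # discrete van der Pol step with the control w added to the forcing
    return (round(c1 + DT * c2, 6),
            round(c2 + DT * (-LAM * (c1 * c1 - 1.0) * c2 - c1 + R + w), 6))

@lru_cache(maxsize=None)
def f(N, c1, c2):
    # f_N(c) = min over w of [ |c1| + |c2| + g(w) + f_{N-1}(next state) ]
    if N == 0:
        return abs(c1) + abs(c2)
    return min(abs(c1) + abs(c2) + w * w + f(N - 1, *step(c1, c2, w))
               for w in W)
```

Starting near equilibrium the minimal cost stays small, since a small control can balance the forcing; starting far from equilibrium the first-stage deviation cost alone is already large.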
8.20 Stochastic Case
Let us now consider the case in which each rn is a random variable drawn from a common distribution dG(r). We then wish to choose the wn, in sequential fashion, to minimize the expected value of JN(w). Write

fN(c1,c2) = min E[JN(w)]

Then the recurrence relationship is very much as above,

fN(c1,c2) = min over w of [|c1| + |c2| + g(w) + ∫ fN−1(c1′(r),c2′(r)) dG(r)]  (8.18)

for N ≥ 1, where (c1′(r),c2′(r)) is the state following (c1,c2) under the control w and the disturbance r, with

f0(c1,c2) = |c1| + |c2|

For computational purposes, we replace dG(r) by a discrete probability distribution (p1,p2, … ,pM). For example, if each rn can assume only two values, 0 and 1, let p1 be the probability that rn = 0 and p2 the probability that it is 1. Then Eq. (8.18) takes the form

fN(c1,c2) = min over w of [|c1| + |c2| + g(w) + p1 fN−1(c1′(0),c2′(0)) + p2 fN−1(c1′(1),c2′(1))]
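With a two-point distribution for rn, the computation changes in only one place: the future cost becomes an average over the two possible disturbance values. A sketch under the same illustrative assumptions as before (Euler dynamics, control cost w², a finite control set), with p1 = P(rn = 0) and p2 = P(rn = 1):

```python
from functools import lru_cache

DT, LAM = 0.1, 1.0
P1, P2 = 0.7, 0.3                # probabilities that r_n = 0 and r_n = 1
W = (-1.0, 0.0, 1.0)             # admissible control values (illustrative)

def step(c1, c2, w, r):
    # discrete van der Pol step with random forcing r and control w
    return (round(c1 + DT * c2, 6),
            round(c2 + DT * (-LAM * (c1 * c1 - 1.0) * c2 - c1 + r + w), 6))

@lru_cache(maxsize=None)
def f(N, c1, c2):
    # minimize the EXPECTED cost: the future term is averaged over r
    if N == 0:
        return abs(c1) + abs(c2)
    return min(abs(c1) + abs(c2) + w * w
               + P1 * f(N - 1, *step(c1, c2, w, 0.0))
               + P2 * f(N - 1, *step(c1, c2, w, 1.0))
               for w in W)
```

The recursion tree now branches over disturbances as well as controls, but the memoized value function is still indexed only by the state.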
8.21 Adaptive Case
Consider now the case in which p1 is not known, but in which we do have an a priori estimate for the distribution function for p1, dH(p1). We agree to transform dH(p1) into

p1^m (1 − p1)^n dH(p1) / ∫01 p1^m (1 − p1)^n dH(p1)

if we observe m zero values and n unit values for the rk over the previous m + n stages. Write

p̄1(m,n) = ∫01 p1 · p1^m (1 − p1)^n dH(p1) / ∫01 p1^m (1 − p1)^n dH(p1)

the expected probability at the end of m + n stages.
The theory of sufficient statistics tells us that it is not necessary to keep track of the actual sequence of 0s and 1s, merely the number of each. Let us then introduce the function

fN(c1,c2,m,n) = the expected cost of an N-stage process starting in state (c1,c2) with information pattern (m,n), using an optimal policy

Then, quite analogous to the recurrence relation of the foregoing section, we obtain

fN(c1,c2,m,n) = min over w of [|c1| + |c2| + g(w) + p̄1(m,n) fN−1(c1′(0),c2′(0),m+1,n) + (1 − p̄1(m,n)) fN−1(c1′(1),c2′(1),m,n+1)]

for N ≥ 1, with
f0(c1,c2,m,n) = |c1| + |c2|
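Under a uniform a priori distribution dH(p1), an illustrative special case, the expected probability p̄1(m,n) has the closed form (m + 1)/(m + n + 2), and the adaptive recurrence becomes directly computable. A sketch under the same illustrative dynamics and costs as in the earlier versions:

```python
from functools import lru_cache

DT, LAM = 0.1, 1.0
W = (-1.0, 0.0, 1.0)             # admissible control values (illustrative)

def step(c1, c2, w, r):
    # discrete van der Pol step with forcing value r and control w
    return (round(c1 + DT * c2, 6),
            round(c2 + DT * (-LAM * (c1 * c1 - 1.0) * c2 - c1 + r + w), 6))

@lru_cache(maxsize=None)
def f(N, c1, c2, m, n):
    # (m, n) = counts of observed 0s and 1s; under a uniform prior the
    # expected probability of r = 0 is (m + 1) / (m + n + 2)
    if N == 0:
        return abs(c1) + abs(c2)
    q0 = (m + 1) / (m + n + 2)
    return min(abs(c1) + abs(c2) + w * w
               + q0 * f(N - 1, *step(c1, c2, w, 0.0), m + 1, n)
               + (1.0 - q0) * f(N - 1, *step(c1, c2, w, 1.0), m, n + 1)
               for w in W)
```

Note that the state of the computation has grown from (c1,c2) to (c1,c2,m,n): the information pattern is carried along as two extra arguments, which is precisely the source of the added computational burden mentioned below.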
The computational difficulties are, of course, much more severe in this case, and various special techniques, which we shall not enter into here, must be employed to obtain results, even when using the largest and most versatile currently existing computers.
REFERENCES
1. Aoki, M., “Adaptive Control for Nonlinear Systems,” Ph.D. Thesis, University of California, Los Angeles, 1959.
2. Aris, R., R. Bellman, and R. Kalaba, Dynamic Programming and Chemical Process Control (to appear).
3. Battin, R. H., and J. H. Laning, “Random Processes in Automatic Control,” McGraw-Hill Book Company, Inc., New York, 1956.
4. Bellman, R., Functional Equations in the Theory of Dynamic Programming—V: Positivity and Quasi-linearity, Proc. Natl. Acad. Sci. U.S.A., vol. 41, pp. 743–746, 1955.
5. ——, The Theory of Dynamic Programming, chap. 11 in “Modern Mathematics for the Engineer,” First Series, edited by E. F. Beckenbach, McGraw-Hill Book Company, Inc., New York, 1956.
6. ——, Dynamic Programming and Lagrange Multipliers, Proc. Natl. Acad. Sci. U.S.A., vol. 42, pp. 767–769, 1956.
7. ——, Functional Equations in the Theory of Dynamic Programming—III, Rend. Circ. Mat. Palermo, ser. 2, vol. 5, pp. 1–23, 1956.
8. ——, A Problem in the Sequential Design of Experiments, Sankhyā, vol. 16, pp. 221–229, 1956.
9. ——, “Dynamic Programming,” Princeton University Press, Princeton, N.J., 1957.
10. ——, On Some Applications of Dynamic Programming to Matrix Theory, Illinois J. Math., vol. 1, pp. 297–301, 1957.
11. ——, On the Computational Solution of Linear Programming Problems Involving Almost Block Diagonal Matrices, Management Sci., vol. 3, pp. 403–406, 1957.
12. ——, Some New Techniques in the Dynamic Programming Solution of Variational Problems, Quart. Appl. Math., vol. 16, pp. 295–305, 1958.
13. ——, Dynamic Programming, Successive Approximations and Monotone Convergence, Proc. Natl. Acad. Sci. U.S.A., vol. 44, pp. 578–580, 1958.
14. ——, On a Routing Problem, Quart. Appl. Math., vol. 16, pp. 87–90, 1958.
15. ——, Dynamic Programming and Stochastic Control Processes, Information and Control, vol. 1, pp. 228–239, 1958.
16. ——, Note on Matrix Theory—Multiplicative Properties from Additive Properties, Amer. Math. Monthly, vol. 65, pp. 693–694, 1958.
17. ——, Dynamic Programming, Invariant Imbedding and Two-point Boundary-value Problems, Proc. Symposium on Two-point Boundary-value Problems, University of Wisconsin, April, 1959.
18. ——, “Sequential Machines, Ambiguity and Dynamic Programming,” Paper P-1597, The RAND Corporation, Santa Monica, Calif., 1959.
19. ——, “Adaptive Control Processes: A Guided Tour,” Princeton University Press, Princeton, N.J., 1960.
20. —— and D. Blackwell, “On a Particular Non-zero-sum Game,” Research Memorandum RM-250, The RAND Corporation, Santa Monica, Calif., 1949.
21. —— and P. Brock, On the Concepts of a Problem and Problem-solving, Amer. Math. Monthly, vol. 67, pp. 119–134, 1960.
22. —— and S. Dreyfus, On the Computational Solution of Dynamic Programming Processes—I: On a Tactical Air Warfare Model of Mengel, J. Operations Res. Soc. Amer., vol. 6, pp. 65–78, 1958.
23. —— and ——, Computational Aspects of Dynamic Programming (to appear).
24. —— and ——, “Functional Approximations and Dynamic Programming,” Paper P-1176, The RAND Corporation, Santa Monica, Calif., 1959.
25. ——, W. Fleming, and D. V. Widder, Variational Problems with Constraints, Ann. Mat. Pura Appl., ser. 4, vol. 41, pp. 301–323, 1956.
26. ——, I. Glicksberg, and O. Gross, On Some Variational Problems Occurring in the Theory of Dynamic Programming, Rend. Circ. Mat. Palermo, ser. 2, vol. 3, pp. 1–35, 1954.
27. ——, ——, and ——, On the “Bang-bang” Control Problem, Quart. Appl. Math., vol. 14, pp. 11–18, 1956.
28. ——, ——, and ——, Some Non-classical Problems in the Calculus of Variations, Proc. Amer. Math. Soc., vol. 7, pp. 87–94, 1956.
29. ——, ——, and ——, “Some Aspects of the Mathematical Theory of Control Processes,” Report R-313, The RAND Corporation, Santa Monica, Calif., 1958.
30. ——, J. Holland, and R. Kalaba, “On an Application of Dynamic Programming to the Synthesis of Logical Systems,” Paper P-1551, The RAND Corporation, Santa Monica, Calif., 1958.
31. —— and R. Kalaba, “On kth Best Policies,” Paper P-1417, The RAND Corporation, Santa Monica, Calif., 1958.
32. —— and ——, “A Mathematical Theory of Adaptive Control Processes,” Paper P-1699, The RAND Corporation, Santa Monica, Calif., 1959.
33. —— and ——, On Adaptive Control Processes, IRE National Convention Record, 1959.
34. —— and ——, Dynamic Programming and Adaptive Processes—I: Mathematical Foundation, Proc. IRE (to appear).
35. —— and J. P. LaSalle, “On Non-zero-sum Games and Stochastic Processes,” Research Memorandum RM-212, The RAND Corporation, Santa Monica, Calif., 1949.
36. —— and S. Lehman, Studies in Bottleneck Processes (unpublished).
37. —— and J. M. Richardson, “On the Application of Dynamic Programming to a Class of Implicit Variational Problems,” Paper P-1374, The RAND Corporation, Santa Monica, Calif., 1958.
38. Berkovitz, L. D., and W. H. Fleming, “On Differential Games with Integral Payoff,” Paper P-717, The RAND Corporation, Santa Monica, Calif., 1955.
39. Bohnenblust, H. Frederic, The Theory of Games, chap. 9 in “Modern Mathematics for the Engineer,” First Series, edited by E. F. Beckenbach, McGraw-Hill Book Company, Inc., New York, 1956.
40. Bush, R. R., and F. Mosteller, “Stochastic Models for Learning,” John Wiley & Sons, Inc., New York, 1955.
41. Bushaw, D. W., “Differential Equations with a Discontinuous Forcing Term,” Ph.D. Thesis, Princeton University, 1952; Experimental Towing Tank, Stevens Institute of Technology, Report No. 469, 1953; Optimal Discontinuous Forcing Terms, in “Contributions to the Theory of Nonlinear Oscillations,” vol. 4, edited by S. Lefschetz, Annals of Mathematics Studies, Princeton University Press, Princeton, N.J., 1958.
42. Cartaino, T. F., and S. Dreyfus, “Applications of Dynamic Programming to the Airplane Minimum Time-to-climb Problem,” Paper P-834, The RAND Corporation, Santa Monica, Calif., 1956.
43. Fleming, W. H., “On a Class of Games over Function Space and Related Variational Problems,” Paper P-405, The RAND Corporation, Santa Monica, Calif., 1953.
44. Flood, M. M., “On Game-learning Theory and Some Decision-making Experiments,” Paper P-346, The RAND Corporation, Santa Monica, Calif., 1952.
45. ——, “On Stochastic Learning Theory,” Paper P-353, The RAND Corporation, Santa Monica, Calif., 1952.
46. Freimer, M., “A Dynamic Programming Approach to Adaptive Control Processes,” Lincoln Laboratory, Massachusetts Institute of Technology, 1959.
47. Fried, B. D., General Formulation of Powered Flight Optimization Problems, J. Appl. Phys., vol. 29, pp. 1203–1209, 1958.
48. Gamkrelidze, R. V., Theory of Time-optimal Processes for Linear Systems (in Russian), Izv. Akad. Nauk SSSR, Ser. Mat., vol. 22, pp. 449–474, 1958.
49. Hestenes, M. R., Elements of the Calculus of Variations, chap. 4 in “Modern Mathematics for the Engineer,” First Series, edited by E. F. Beckenbach, McGraw-Hill Book Company, Inc., New York, 1956.
50. Isaacs, R. P., “Differential Games I: Introduction,” Research Memorandum RM-1391; “Differential Games II: The Definition and Formulation,” Research Memorandum RM-1399; “Differential Games III: The Basic Principles of the Solution Process,” Research Memorandum RM-1411; “Differential Games IV: Mainly Examples,” Research Memorandum RM-1486; The RAND Corporation, Santa Monica, Calif., 1954–1955.
51. Johnson, S. M., and S. Karlin, “A Bayes Model in Sequential Design,” Paper P-328, The RAND Corporation, Santa Monica, Calif., 1954.
52. Kalaba, R., On Nonlinear Differential Equations, the Maximum Operation, and Monotone Convergence, J. Math. Mech., vol. 8, pp. 519–574, 1959.
53. Krasovskii, N. N., Concerning the Theory of Optimum Control (in Russian), Avtomat. i Telemeh., vol. 18, pp. 960–970, 1957.
54. LaSalle, J. P., Abstract 247t, Bull. Amer. Math. Soc., vol. 60, p. 154, 1954; “Study of the Basic Principle Underlying the Bang-bang Servo,” Report GER-5518, Goodyear Aircraft Corporation, Akron, Ohio, 1953.
55. ——, Time Optimal Control Systems, Proc. Natl. Acad. Sci. U.S.A., vol. 45, pp. 573–577, 1959.
56. Lehman, S., “On the Continuous Simplex Method,” Research Memorandum RM-1386, The RAND Corporation, Santa Monica, Calif., 1954.
57. Miele, A., and J. O. Cappellari, “Topics in Dynamic Programming,” Purdue University, Lafayette, Ind., 1958.
58. Milnor, J. W., and L. S. Shapley, “On Games of Survival,” Paper P-622, The RAND Corporation, Santa Monica, Calif., 1955.
59. Peisakoff, M. P., “More on Games of Survival,” Research Memorandum RM-884, The RAND Corporation, Santa Monica, Calif., 1952.
60. Robbins, H., Some Aspects of the Sequential Design of Experiments, Bull. Amer. Math. Soc., vol. 58, pp. 527–535, 1952.
61. Thompson, W. R., On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, Biometrika, vol. 25, pp. 285–294, 1933.
62. ——, On the Theory of Apportionment, Amer. J. Math., vol. 57, pp. 450–456, 1935.
63. Truxal, J. G., “Automatic Feedback Control System Synthesis,” McGraw-Hill Book Company, Inc., New York, 1955.
64. Tucker, A. W., et al. (eds.), “Contributions to the Theory of Games,” vols. 1–4, Annals of Mathematics Studies, studies 24, 28, 39, and 40, Princeton University Press, Princeton, N.J., 1950–1959.
65. Wald, A., “Sequential Analysis,” John Wiley & Sons, Inc., New York, 1947.
66. ——, “Statistical Decision Functions,” John Wiley & Sons, Inc., New York, 1950.
† A detailed discussion of the ideas and techniques of this article will be found in the author’s book “Adaptive Control Processes: A Guided Tour,” Princeton University Press, Princeton, N.J., 1960.