Proof

The Banach space $F$ can be identified with $F_1 \times F_2$, and since $Di(a)$ is an isomorphism from $E$ onto $F_1$, $E$ can be identified with $F_1$. By translation, we may assume that $a = 0$. Let

$$\varphi : U \times F_2 \to F_1 \times F_2 : (x, y_2) \mapsto i(x) + (0, y_2).$$

We have $\varphi(x, 0) = i(x)$ and $D\varphi(0, 0) = Di(0) + (0, 1_{F_2})$. Since $Di(0)$ is an isomorphism from $F_1$ onto $F_1 \times 0$, $D\varphi(0, 0)$ is an isomorphism from $F_1 \times F_2$ onto $F_1 \times F_2$. By Theorem 1.29, $\varphi$ is a local diffeomorphism of class $C^p$ that admits an inverse diffeomorphism $r$ of class $C^p$. Hence, there exists an open neighborhood $U$ of $a$ in $F_1$ such that, for all $x \in U$, $r(i(x)) = x$.

Recall that every closed subspace of a Hilbert space $E$ splits in $E$ ([P2], section 3.10.2(II), Theorem 3.147(2)). Corollary 1.33 shows that any immersion of class $C^p$ admits a local retraction $r$ of class $C^p$. Immersions are therefore local sections (see [P1], section 1.1.1 (III)).

([P2], section 3.2.2 (IV), Theorem 3.5(3)) implies the following result:

The composition of two immersions is an immersion, the composition of two submersions is a submersion and, given a subimmersion $f$, an immersion $i$, and a submersion $s$, the mapping $i \circ f \circ s$ is a subimmersion (exercise). However, the composition of two subimmersions is not always a subimmersion ([DIE 93], Volume 3, section 16.8, Problem 1(b)).

Theorem 1.35

(rank) Let $E$, $F$ be Banach spaces, $A$ some non-empty open subset of $E$, $a$ a point of $A$ and $f : A \to F$ a mapping of class $C^1$.

  i)  If $f$ is a subimmersion at the point $a$, there exists an open neighborhood $U \subset A$ of $a$ in $E$ such that $\mathrm{rk}_x(f) = \mathrm{rk}_a(f)$ for all $x \in U$.
  ii)  Conversely, if there exists an open neighborhood $U \subset A$ of $a$ in $E$ such that $\mathrm{rk}_x(f) = \mathrm{rk}_a(f)$ for all $x \in U$ and if the spaces $E$, $F$ are finite-dimensional, then $f$ is a subimmersion at the point $a$.
  iii)  Write $\mathrm{rk}_x(f) = +\infty$ if $\mathrm{rk}_x(f)$ is not finite. The mapping $x \mapsto \mathrm{rk}_x(f)$ from $A$ into the discrete subspace $\mathbb{N} \cup \{+\infty\}$ of the extended real line $\overline{\mathbb{R}}$ is lower semi-continuous ([P2], section 2.3.3 (III)) in $A$.

Proof

(i): By [1.18], $\mathrm{rk}_x(f) = \mathrm{rk}(\Phi)$ for all $x \in U$. (ii): See [DIE 93], Volume 1, (10.3). (iii): If $r = \mathrm{rk}_a(f) < +\infty$, we may assume that $E = \mathbb{K}^r$ and extract a square submatrix $M_x$ that has rank $r$ at $x = a$ from the Jacobian matrix of $f$ at the point $x$. The determinant $\Delta_x$ of this submatrix is therefore non-zero for $x = a$. But the mapping $x \mapsto \Delta_x$ is continuous, so there exists a neighborhood $U$ of $a$ in which $\Delta_x \ne 0$, and so $\mathrm{rk}_x(f) \ge r$. If $\mathrm{rk}_a(f) = +\infty$, then, for all $r \in \mathbb{N}$, there exist vectors $e_1, \dots, e_r \in E$ such that the vectors $Df(a).e_1, \dots, Df(a).e_r$ generate a subspace of $F$ of dimension $r$. Arguing by contradiction, we deduce that there exists a neighborhood $U$ of $a$ in which $\mathrm{rk}_x(f) = +\infty$.
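The lower semi-continuity in (iii) can be checked numerically on a small example. The sketch below is only an illustration: the map $f(x, y) = (x^2, y^2)$ on $\mathbb{R}^2$ and the helper names are assumptions chosen for this purpose, not taken from the text. It shows that the rank may jump up near a point but does not drop below its value at that point.

```python
# Illustrative map (an assumption, not from the text): f(x, y) = (x**2, y**2) on R^2.
def jacobian(x, y):
    """Jacobian matrix of f(x, y) = (x**2, y**2)."""
    return [[2.0 * x, 0.0], [0.0, 2.0 * y]]


def rank_2x2(m, tol=1e-12):
    """Rank of a 2x2 matrix, decided by its determinant and its entries."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) > tol:
        return 2
    if any(abs(v) > tol for row in m for v in row):
        return 1
    return 0


def rank_at(x, y):
    """Rank of f at the point (x, y)."""
    return rank_2x2(jacobian(x, y))


# Near the origin (rank 0) the rank can only stay equal or increase.
ranks_near_origin = [rank_at(1e-3, 0.0), rank_at(0.0, 1e-3), rank_at(1e-3, 1e-3)]
# Near (1, 0) (rank 1) the same holds.
ranks_near_1_0 = [rank_at(1.0, 1e-3), rank_at(1.0 + 1e-3, 0.0)]
```

Since the rank at the origin differs from the rank at nearby points, this $f$ is not a subimmersion at the origin, in line with (i).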

1.3 Other approaches to differential calculus

1.3.1 Lagrange variations and Gateaux differentials

In this section, $\mathbb{K} = \mathbb{R}$.

(I) Affine spaces Given a vector space $E$, an affine space $\mathcal{E}$ attached to the space $E$ is a homogeneous space of the additive group $E$ ([P1], section 2.2.8(II)) such that the (transitive) action of $E$ on $\mathcal{E}$ is faithful, i.e. such that the neutral element 0 is the only element of $E$ that fixes every element of $\mathcal{E}$. The action of $x \in E$ on $P \in \mathcal{E}$ is written as $P + x$. We say that $E$ is the space of translations of $\mathcal{E}$, its elements are the translations of $\mathcal{E}$ and, if $\dim(E) < \infty$, this quantity is called the dimension of $\mathcal{E}$. Given some origin $O$ chosen from $\mathcal{E}$, the elements of $\mathcal{E}$ are all of the form $O + x$ ($x \in E$).

Remark 1.36

The mapping $O + x \mapsto x$ is a bijection from $\mathcal{E}$ onto $E$ that allows these two sets to be identified.

If $Q = P + x$, we write $x = \overrightarrow{PQ}$ (the bipoint of origin $P$ and endpoint $Q$)⁷. If $E$ is a locally convex space, the sets $O + U = \{O + x : x \in U\}$, where the $U$ are the open sets of $E$, define a topology on $\mathcal{E}$. When equipped with this topology, $\mathcal{E}$ is called a locally convex affine space. Every point of such a space admits a fundamental system of convex neighborhoods. We can similarly define the concepts of affine topological space, affine normed space, affine pre-Hilbert space, etc.

(II) Lagrange variations Let $\mathcal{E} = O + E$ be a locally convex affine space, $F$ a locally convex space, $A$ some non-empty subset of $\mathcal{E}$, $a$ some point of $A$ and $f : A \to F$ a mapping.

We say that $f$ admits a Lagrange first variation $\delta f(a) : E \to F : h \mapsto \delta f(a)[h]$ at the point $a$ if, for all $h \in E$,

$$\lim_{t \to 0,\ t \ne 0} \frac{f(a + t.h) - f(a) - t.\delta f(a)[h]}{t} = 0.$$

If this condition is satisfied, we say that $f$ admits a Lagrange second variation $\delta^2 f(a) : E \to F$ if

$$\lim_{t \to 0,\ t \ne 0} \frac{f(a + t.h) - f(a) - t.\delta f(a)[h] - t^2.\delta^2 f(a)[h]}{t^2} = 0.$$

The Lagrange variation of order $n$, $\delta^n f(a) : E \to F : h \mapsto \delta^n f(a)[h]$, is defined inductively in the same way.

(III) Gateaux differentiability It is easy to show using the generalized Goursat theorem (section 1.2.5 (III)) that, if E and F are complex locally convex spaces and f admits a Lagrange first variation at the point a, then δ f (a) is linear ([HIL 57], Theorem 26.3.2). In general, we have the following result:

Lemma 1.37

If $f$ admits a Lagrange first variation $\delta f(a)$, then the mapping $\delta f(a) : E \to F$ is homogeneous, i.e. $\delta f(a)[\lambda.h] = \lambda.\delta f(a)[h]$ for all $\lambda \in \mathbb{R}$ and all $h \in E$ (exercise).
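Homogeneity does not imply linearity in the real case. As a minimal sketch, assume the function $f(x, y) = x^3/(x^2 + y^2)$ on $\mathbb{R}^2$ with $f(0, 0) = 0$ (a standard counterexample chosen here for illustration, not taken from the text). Its first variation at the origin exists and is homogeneous, but it is not additive.

```python
def f(x, y):
    """Hypothetical example: f(x, y) = x**3 / (x**2 + y**2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 / (x**2 + y**2)


def first_variation(h, t=1e-6):
    """Difference quotient (f(t.h) - f(0)) / t, approximating delta f(0)[h]."""
    return (f(t * h[0], t * h[1]) - f(0.0, 0.0)) / t


d1 = first_variation((1.0, 0.0))        # close to 1
d2 = first_variation((0.0, 1.0))        # close to 0
d_sum = first_variation((1.0, 1.0))     # close to 1/2, which differs from d1 + d2
d_scaled = first_variation((3.0, 3.0))  # close to 3 * d_sum (homogeneity)
```

Here the difference quotient does not depend on $t$, because $f(t.h) = t.f(h)$ for this particular $f$. The variation $\delta f(0)$ is therefore homogeneous without being linear, so this $f$ is not Gateaux differentiable at the origin in the sense of Definition 1.38.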

Definition 1.38

We say that $f$ is Gateaux differentiable (or G-differentiable) at the point $a$ if it admits a Lagrange first variation $\delta f(a)$ at $a$ and $\delta f(a)$ is a bounded linear mapping from $E$ into $F$ ([P2], section 3.4.4 (I), Definition 3.63). If so, $\delta f(a)$ is written as $D_G f(a)$ and is called the Gateaux differential of $f$ at the point $a$.

The set $\mathcal{D}_a^G(A; F)$ of G-differentiable mappings at the point $a$ is an affine space. A G-differentiable mapping is not necessarily continuous. If $E$ is a normed vector space and $f$ is differentiable at the point $a$, then it is also G-differentiable at this point and $Df(a) = D_G f(a)$ (exercise). On product spaces, we may define the Gateaux partial differential $D_1^G f(a_1, a_2)$ in the first variable and, similarly, in the second variable, etc. In the conditions of Theorem 1.9, where $f \in \mathcal{D}_a^G(A; F)$ and $g \in \mathcal{D}_{f(a)}(B; G)$ (and $B$ denotes an open subset of $F$ containing $f(A)$), we have $g \circ f \in \mathcal{D}_a^G(A; G)$ and (instead of [1.5])

$$D_G(g \circ f)(a) = Dg(f(a)) \circ D_G f(a)$$

(exercise *: see [ALE 87] section 2.2.2).

Let $\mathcal{E}$ be a normed affine space, $F$ a locally convex space, $A$ a non-empty open subset of $\mathcal{E}$, $[a, a + h]$ a segment contained in $A$ and $f : A \to F$ a G-differentiable mapping. Then, the mean value theorem (Theorem 1.13(i)) remains valid (exercise *: see [ALE 87], section 2.2.3) in the form

$$|f(a + h) - f(a)|_\gamma \le |h| \cdot \sup_{t \in [0, 1]} \|D_G f(a + t h)\|_\gamma,$$

where |.|γ is a continuous semi-norm on F. The claims (ii) and (iii) of this theorem also hold after making analogous adjustments. From this, we deduce the following result:

Corollary 1.39

Let $E$ be a normed vector space, $A$ some non-empty open subset of $E$, $F$ a locally convex space and $f : A \to F$ a mapping that is G-differentiable in $A$. If $D_G f : A \to \mathcal{L}(E; F)$ is continuous, then $f$ is differentiable in $A$.

Thus, we do not need to distinguish between Fréchet and Gateaux differentiation when talking about mappings of class C p (p > 0).

1.3.2 Calculus of variations: elementary concepts

In this section, $\mathbb{K} = \mathbb{R}$.

(I) Euler condition Let $\mathcal{E}$ be a locally convex affine space, $A$ some non-empty open subset of $\mathcal{E}$, $a$ some point of $A$ and $J : A \to \mathbb{R}$ a mapping that admits a Lagrange first variation $\delta J(a)$ at the point $a$.

Theorem 1.40

(Euler) For the mapping J to have a relative extremum (or local extremum) at the point a, the Euler stationarity condition δJ (a) = 0 must necessarily be satisfied.

Proof

By translating the origin, we may assume that $a = 0$. Similarly, we can assume that $J$ has a minimum at 0 for clarity. We will argue by contradiction. Assume that $\delta J(0) \ne 0$. There exists $h \in A$ such that $\beta := \delta J(0)[h] \ne 0$. By Lemma 1.37, we may assume without loss of generality that $\beta < 0$ (replacing $h$ by $-h$ if necessary). There exists a function $\varphi$ defined in some open neighborhood of 0 in $\mathbb{R}$ satisfying $\lim_{t \to 0} \varphi(t) = 0$ and $J(t.h) = J(0) + t.\beta + t.\varphi(t)$. Pick $t > 0$ sufficiently small that $\beta + \varphi(t) < 0$. Then, $J(t.h) < J(0)$, contradiction.

Definition 1.41

We say that a is an extreme point (or an extremal point) of J if δJ (a) = 0.

Theorem 1.42

Suppose that the Euler stationarity condition is satisfied. A necessary condition for $J$ to admit a relative minimum at the point $a$ is given by $\delta^2 J(a)[h] \ge 0$ for all $h \ne 0$.

Proof

Since $\delta J(a) = 0$, we have $J(a + t.h) - J(a) = t^2.\delta^2 J(a)[h] + o(t^2)$.

(II) Euler–Lagrange equation The Euler–Lagrange equation is essentially the Euler condition applied to calculus of variations. A full treatment would require another volume⁸; we will content ourselves with briefly mentioning the simplest part, which is sufficient for our purposes. Let $E$ be a Banach space and suppose that $t_1, t_2 \in \mathbb{R}$, $t_1 < t_2$.

Lemma 1.43

(fundamental lemma of the calculus of variations) Let $f : [t_1, t_2] \to E$ be a continuous mapping. If every mapping $h : [t_1, t_2] \to E$ of class $C^1$ such that $h(t_1) = h(t_2) = 0$ satisfies $\int_{t_1}^{t_2} \langle f(t), h(t) \rangle\,dt = 0$, then $f = 0$.

Proof

We will argue by contradiction from the assumption that $f(t_0) \ne 0$ for some $t_0 \in [t_1, t_2]$. There must therefore exist $z \in E$ such that $\langle f(t_0), z \rangle > 0$. Since $f$ is continuous, there exists an interval $[\alpha, \beta] \subset [t_1, t_2]$ containing $t_0$ such that $\alpha < \beta$ and $\langle f(t), z \rangle > 0$ for all $t \in [\alpha, \beta]$. There exists a function $\varphi : [t_1, t_2] \to \mathbb{R}$ of class $C^1$ that is zero in $[t_1, t_2] \setminus \,]\alpha, \beta[$ and which satisfies $\varphi(t) > 0$ for $t \in \,]\alpha, \beta[$; for example, $\varphi(t) = (t - \alpha)^2 (\beta - t)^2$ if $t \in \,]\alpha, \beta[$ and $\varphi(t) = 0$ if $t \in [t_1, t_2] \setminus \,]\alpha, \beta[$. Therefore, setting $h(t) = \varphi(t).z$, we have $\int_{t_1}^{t_2} \langle f(t), h(t) \rangle\,dt > 0$, contradiction.

Let $\Omega_1$, $\Omega_2$ be non-empty open subsets of $E$ and let

$$\mathcal{L} : [t_1, t_2] \times \Omega_1 \times \Omega_2 \to \mathbb{R} : (t, x, u) \mapsto \mathcal{L}(t, x, u)$$

be a mapping (known as the Lagrangian in mechanics) of class $C^1$ with partial differential $D_3 \mathcal{L} = \frac{\partial \mathcal{L}}{\partial u}$. Let $x_1, x_2 \in \Omega_1$ and write $\mathcal{X}$ for the set of mappings $x : [t_1, t_2] \to \Omega_1$ of class $C^1$ satisfying $x(t_1) = x_1$, $x(t_2) = x_2$ whose derivatives $\dot{x}$ take values in $\Omega_2$. The set $\mathcal{X}$ is an open subset of the normed affine space $O + X$, where $X$ is equipped with the norm $\|h\|_1 = \sup_{t \in [t_1, t_2]} \left( |h(t)| + |\dot{h}(t)| \right)$; $X$ is a Banach space whose elements satisfy the condition $h(t_1) = h(t_2) = 0$ (exercise). Let

$$J : \mathcal{X} \to \mathbb{R} : x \mapsto \int_{t_1}^{t_2} \mathcal{L}(t, x(t), \dot{x}(t))\,dt.$$

Theorem 1.44

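Let $x^* \in \mathcal{X}$ be a mapping of class $C^2$ ⁹. The relation $\delta J(x^*) = 0$ holds, i.e. $x^*$ is an extreme point of $J$, if and only if $x^*$ is a solution of the Euler–Lagrange equation:

$$\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot{x}} - \frac{\partial \mathcal{L}}{\partial x} = 0. \tag{1.21}$$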
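The stationarity expressed by the Euler–Lagrange equation can be observed on a discretized functional. The sketch below assumes the Lagrangian $\mathcal{L}(t, x, u) = u^2/2$ on $[0, 1]$ with $E = \mathbb{R}$, $x(0) = 0$ and $x(1) = 1$ (an example chosen for illustration, not taken from the text). The Euler–Lagrange equation then reads $\ddot{x} = 0$, with solution $x^*(t) = t$. The function names and the finite-difference discretization are likewise assumptions.

```python
import math


def action(x, t1=0.0, t2=1.0):
    """Discretized J(x) = integral of (1/2) * xdot**2 dt, for samples x[0..n]."""
    n = len(x) - 1
    dt = (t2 - t1) / n
    return sum(0.5 * ((x[k + 1] - x[k]) / dt) ** 2 * dt for k in range(n))


n = 200
ts = [k / n for k in range(n + 1)]
x_star = list(ts)  # x*(t) = t solves x'' = 0 with x(0) = 0, x(1) = 1


def perturbed(eps):
    """x* + eps * h, where h(t) = sin(pi t) vanishes at both endpoints."""
    return [t + eps * math.sin(math.pi * t) for t in ts]


j_star = action(x_star)
j_plus = action(perturbed(1e-2))
j_minus = action(perturbed(-1e-2))
```

In this example `j_plus` and `j_minus` both exceed `j_star` and agree with each other up to rounding. This reflects a vanishing first variation together with a positive second variation along $h$.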

The Lagrangian $\mathcal{L}$ is said to be regular at $x = x^*$ if $\frac{\partial^2 \mathcal{L}}{\partial \dot{x}^2}(t, x^*(t), \dot{x}^*(t))$ is invertible in $\mathcal{L}_{2,s}(E)$ for every $t \in [t_1, t_2]$.

Corollary 1.45

(Hilbert) If $x^* \in \mathcal{X}$ is a solution of the Euler–Lagrange equation [1.21] and the Lagrangian is regular at $x = x^*$, then $x^*$ is of class $C^2$.

Proof

The Euler–Lagrange equation can be written as $\varphi(t, x, u, q) = 0$, where $\varphi(t, x, u, q) = \frac{\partial \mathcal{L}}{\partial u}(t, x, u) - q$ and $q = \int \frac{\partial \mathcal{L}}{\partial x}\,dt + \text{const}$; $\varphi$ is of class $C^1$ and such that $\frac{\partial \varphi}{\partial u} = \frac{\partial^2 \mathcal{L}}{\partial u^2}$. If the Lagrangian is regular at the point $x^*$, then, for any $t \in [t_1, t_2]$, the implicit function theorem (Theorem 1.30) implies that there exists an open neighborhood of $(t, x^*(t), \dot{x}^*(t))$ in $[t_1, t_2] \times \Omega_1 \times \Omega_2$, in which the relation $\varphi(t, x, u, q) = 0$ is equivalent to $u = \psi(t, x, q)$, where $\psi$ is of class $C^1$; thus, $\dot{x}^*(t) = \psi(t, x^*(t), q(t))$ is of class $C^1$, and $x^*$ is of class $C^2$.

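(III) Legendre condition Suppose that $\mathcal{L}$ is of class $C^2$ and $x^*$ satisfies the Euler–Lagrange equation. Expanding to second order, we have

$$\delta^2 J(x^*)[h] = \int_{t_1}^{t_2} \left( \frac{\partial^2 \mathcal{L}}{\partial x^2}.(h, h) + 2 \frac{\partial^2 \mathcal{L}}{\partial x\,\partial \dot{x}}.(h, \dot{h}) + \frac{\partial^2 \mathcal{L}}{\partial \dot{x}^2}.(\dot{h}, \dot{h}) \right) dt = \int_{t_1}^{t_2} \left( \underbrace{\frac{\partial^2 \mathcal{L}}{\partial \dot{x}^2}}_{P}.(\dot{h}, \dot{h}) + \underbrace{\left( \frac{\partial^2 \mathcal{L}}{\partial x^2} - \frac{d}{dt} \frac{\partial^2 \mathcal{L}}{\partial x\,\partial \dot{x}} \right)}_{Q}.(h, h) \right) dt,$$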

where P and Q are evaluated at the point (t, x * (t)) and h is evaluated at the point t.

Remark 1.47

If $E$ is finite-dimensional, the strong Legendre condition can be stated as $P(t, x^*(t)) > 0$ ($t_1 \le t \le t_2$), i.e. $P(t, x^*(t)).(v, v) > 0$ for all $v \ne 0$. If the strong Jacobi condition¹⁰ is also assumed, we obtain the sufficient condition for a “weak” local minimum proved by Weierstrass in 1879.

1.3.3 “Convenient” differentials

(I) $c^\infty$ topology Let $E$ be a real locally convex space and suppose that $A$ is a non-empty subset of $E$. A smooth curve in $A$ is defined as a mapping $c : I \to A$ of class $C^\infty$, where $I$ is a non-empty open interval of $\mathbb{R}$, for example $]-1, 1[$. Any such curve is said to be analytic if it is of class $C^\omega$.

If E is a complex locally convex space, suppose again that A is a non-empty subset of E. Then, A can be viewed as a subset A 0 of the real locally convex space E 0 obtained from E by restriction of the field of scalars ([P2], section 3.2.2 (II)), and any smooth curve in A 0 is said to be a smooth curve in A.

Definition 1.48

[KRI 97] Let $E$ be a locally convex space. The topology $c^\infty$ of $E$ is the finest topology that makes all smooth curves in $E$ (section 1.3.3) continuous. When equipped with this topology, the space $E$ is denoted by $c^\infty E$.

The topology of $c^\infty E$ is finer than the topology of $E$, so every open subset of $E$ is open in $c^\infty E$. If the space $E$ is bornological ([P2], section 3.4.4 (I), Definition 3.61), it has the finest locally convex topology of all locally convex topologies coarser than the topology of $c^\infty E$ ([KRI 97], Corollary 4.6). “Convenient” differential calculus is performed with mappings defined in open sets of $c^\infty E$.

Recall that every Fréchet space and every Silva space is bornological and complete ([P2], sections 3.4.4 (I) and 3.8.2(II)). We have the following result ([KRI 97], Theorem 4.11):

Theorem 1.49

Let $E$ be a metrizable locally convex space or a Silva space. Then, the topology of $E$ coincides with the topology of $c^\infty E$.

Remark 1.50

If a locally convex space $E$ is bornological but non-normable, then $c^\infty(E \times E)$ is not a topological vector space ([KRI 97], Corollary 4.21).

(II) $\mathcal{K}$ spaces

Definition 1.51

[KRI 97] A locally convex space $F$ is said to be convenient if it is Mackey-complete, i.e. if the normed vector space $F_B$ (Definition 1.19) is a Banach space for every bounded, balanced and convex set $B \subset F$.

The following definition will be useful:

Definition 1.52

A locally convex space $E$ is called a $\mathcal{K}$ space if it is quasi-complete and bornological, and the topologies of $E$ and $c^\infty E$ coincide.

Theorem 1.49 shows that Fréchet spaces (and in particular Banach spaces) and Silva spaces are $\mathcal{K}$ spaces. Any quasi-complete locally convex space, and hence any $\mathcal{K}$ space, is convenient (exercise). Nonetheless, $\mathcal{K}$ spaces are sufficiently general for our purposes, so we will restrict attention to them to simplify the statements of results.

(II) Mappings of class $c^r$ ($r \in \{\infty, \omega\}$)

Definition 1.53

Let $E$ and $F$ be real locally convex $\mathcal{K}$ spaces and let $A$ be an open subset of $E$. A mapping $f$ from $A$ into $F$ is said to be of class $c^\infty$ if $f \circ c$ is a smooth curve in $F$ for any smooth curve $c$ in $A$. This mapping $f$ is said to be of class $c^\omega$ if it is of class $c^\infty$ and $f \circ c$ is an analytic curve in $F$ for every analytic curve $c$ in $A$.¹¹

If $E$ is a Banach space and $f$ is of class $C^r$ ($r \in \{\infty, \omega\}$), then $f$ is of class $c^r$. J. Boman showed the following result in 1967 ([KRI 97], Corollary 3.14):

Theorem 1.54

(Boman) Let $A$ be a non-empty open subset of $\mathbb{R}^n$. The mapping $f : A \to \mathbb{R}^m$ is of class $C^\infty$ if and only if it is of class $c^\infty$.

Theorem 1.55

Let $E$ and $F$ be $\mathcal{K}$ spaces and suppose that $A$ is an open subset of $E$.

  1)  Suppose that $f : A \to F$ is of class $c^\infty$. Then, $f$ has a Gateaux differential $D_G f(a) \in \mathcal{L}(E; F)$ at every point $a$ of $A$ and the mapping $D_G f : A \to \mathcal{L}(E; F)$ is of class $c^\infty$.
  2)  Let $G$ be a locally convex space and suppose that $g : B \to G$ is a mapping of class $c^\infty$, where $B$ is an open subset of $F$ containing $f(A)$. Then, the following chain differentiation rule holds (compare with [1.5] and [1.19]):

$$D_G(g \circ f)(a) = D_G g(f(a)) \circ D_G f(a).$$

Proof

(1) Let $h \in E$; there exists $\varepsilon > 0$ such that $a + t h \in A$ for all $t \in \,]-\varepsilon, +\varepsilon[$ and the curve $t \mapsto a + t h$ is smooth. Therefore, $f$ has a Lagrange first variation $\delta f(a)$. It can be shown that $\delta f(a)$ is bounded linear ([KRI 97], Chapter I, 3.18), so $\delta f(a) = D_G f(a)$. Furthermore, $D_G f(a)$ is continuous linear, since $E$ and $F$ are bornological ([P2], section 3.4.4 (I), Theorem 3.62). If $c$ is a smooth curve in $A$, then $\delta f \circ c$ is a smooth curve in $\mathcal{L}(E; F)$, which gives the stated result by induction (ibid.). (2): ibid.

In the real analytic case, the following result ([KRI 97], Chapter II, section 10.4) generalizes Boman’s theorem:

Theorem 1.56

Let $E$ and $F$ be real $\mathcal{K}$ spaces, $U$ a non-empty open subset of $E$ and $f : U \to F$ a mapping. The following conditions are equivalent: (i) $f$ is of class $c^\omega$; (ii) $f$ is of class $c^\infty$ and $\lambda \circ f \circ \mu$ is analytic from $\Omega$ into $\mathbb{R}$ for any linear form $\lambda \in F'$ and any affine mapping $\mu : \Omega \to U$ ($\Omega = \mu^{-1}(U)$).

(III) Holomorphic mappings Assume that $\mathbb{K} = \mathbb{C}$.

A holomorphic curve in a non-empty open subset $A$ of a complex $\mathcal{K}$ space $E$ is a holomorphic mapping from a disk in the complex plane, for example the open disk $\mathbb{D}$ of center 0 and radius 1, into $A$. A $c$-holomorphic mapping (or a mapping of class $c^\infty$) from $A$ into $F$, where $F$ is a complex $\mathcal{K}$ space, is a mapping that transforms the holomorphic curves in $A$ into holomorphic curves in $F$. With this notation, the mapping $f : A \to F$ is $c$-holomorphic if and only if $\lambda \circ f \circ c$ is a holomorphic function for every continuous linear form $\lambda \in F'$ and every holomorphic curve $c : \mathbb{D} \to A$. The following result ([KRI 97], Chapter II, section 7.9) generalizes the classical Hartogs theorem ([P2], section 4.3.2 (II), Corollary 4.80):

Theorem 1.57

(generalized Hartogs theorem) Let $E_1$, $E_2$, $F$ be complex $\mathcal{K}$ spaces and suppose that $A = A_1 \times A_2$ is a non-empty open subset of $E_1 \times E_2$. The mapping $f : A \to F$ is $c$-holomorphic if and only if it is $c$-holomorphic separately in each variable, i.e. $f(., z_2)$ and $f(z_1, .)$ are $c$-holomorphic for every $z_i \in A_i$ ($i = 1, 2$).

1.4 Smooth partitions of unity

In this section, $\mathbb{K} = \mathbb{R}$.

1.4.1 $C^\infty$-paracompactness of Banach spaces

Let E be a normed vector space. The topological notions of normal space and paracompact space ([P2], sections 2.3.10 and 2.3.11) inspire the following definitions:

Recall that $(\psi_i)_{i \in I}$ is a subordinate partition of unity of $(U_i)_{i \in I}$ if and only if $\mathrm{supp}(\psi_i) \subset U_i$, the family $(\mathrm{supp}(\psi_i))_{i \in I}$ is locally finite and $\sum_{i \in I} \psi_i = 1$ ([P2], section 2.3.12).

Theorem 1.59

The normed vector space $E$ is $C^r$-paracompact if and only if it is $C^r$-normal.

Proof

The necessary condition is clear. We will show the sufficient condition: let $(U_i)_{i \in I}$ be an open covering of $E$. If $E$ is paracompact, there exists a locally finite open covering $(V_j)_{j \in J}$ finer than $(U_i)_{i \in I}$ such that each $V_j$ is contained in some $U_{i(j)}$. There also exist other open refinements $(W_j)_{j \in J}$ and $(Z_j)_{j \in J}$ of $(V_j)_{j \in J}$ such that $\overline{Z_j} \subset W_j \subset \overline{W_j} \subset V_j$. If $E$ is $C^r$-normal, then, for all $j$, there exists a function $\psi_j : E \to [0, 1]$ of class $C^r$ that is equal to 1 in $\overline{Z_j}$ and 0 in $E \setminus V_j$. Let $\psi = \sum_j \psi_j$ and $\theta_j = \psi_j / \psi$; the sum is locally finite and $\psi > 0$, since the $Z_j$ cover $E$. Thus, $(\theta_j)_{j \in J}$ is a subordinate partition of unity of class $C^r$ of $(V_j)_{j \in J}$ and hence of $(U_{i(j)})_{j \in J}$, which shows that $E$ is $C^r$-paracompact.
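The normalization step $\theta_j = \psi_j / \psi$ can be illustrated on the real line. The sketch below is an assumption-laden toy case: $E = \mathbb{R}$, the covering $V_j = \,]j - 1, j + 1[$ ($j \in \mathbb{Z}$) and the bump function $\exp(-1/(1 - x^2))$ are chosen for illustration and are not taken from the text. Unlike the $\psi_j$ of the proof, these bumps are not equal to 1 on a smaller closed set, but the division by the locally finite sum works in the same way.

```python
import math


def bump(x):
    """Smooth function on R, positive on ]-1, 1[ and zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def psi(j, x):
    """psi_j is supported in the closure of V_j = ]j - 1, j + 1[."""
    return bump(x - j)


def theta(j, x):
    """theta_j = psi_j / psi, where psi is the locally finite sum of the psi_k."""
    k0 = math.floor(x)
    # Only the indices k with |x - k| < 1 contribute; these lie in k0 - 1 .. k0 + 2.
    total = sum(psi(k, x) for k in (k0 - 1, k0, k0 + 1, k0 + 2))
    return psi(j, x) / total


# The theta_j sum to 1 at every sample point.
samples = [0.0, 0.25, 0.5, 1.0, 2.75, 3.999]
sums = [sum(theta(j, x) for j in range(-2, 7)) for x in samples]
```

At most two of the $\psi_j$ are non-zero at any point, which is the local finiteness used when forming $\psi$.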

As a metrizable space, every Banach space is paracompact by Stone’s theorem ([P2], section 2.3.10, Theorem 2.57). The next result ([ABR 83], Propositions 5.5.18 and 5.5.19), whose proof was established by Bonic and Frampton in 1966, follows from the fact that any separable Banach space is a Lindelöf space ([P2], section 2.6.3) (exercise).

Theorem 1.61

For a separable Banach space to be C r -paracompact (1 ≤ r), it is sufficient for its norm |.| : x ↦ | x | to be of class C r in E − {0}.

Corollary 1.62

Every separable Hilbert space $E$ is $C^\infty$-paracompact.

Proof

Write $N$ for the norm of $E$. We have $N(x)^2 = \langle x \mid x \rangle$, so $2N(x)\,DN(x).h = 2 \langle x \mid h \rangle$, and if $x \ne 0$, $DN(x).h = \langle x \mid h \rangle / N(x)$. Hence, $N$ is differentiable in $E - \{0\}$ and $DN(x) = \langle x \mid . \rangle / N(x)$ ($x \ne 0$). This mapping $DN$ is continuous from $E - \{0\}$ into $E' \cong E$. It is easy to see that it is differentiable from $E - \{0\}$ into $E'$, and so we can deduce by induction (exercise) that $N$ is of class $C^\infty$.
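The formula $DN(x).h = \langle x \mid h \rangle / N(x)$ can be compared with a finite-difference quotient. The sketch below assumes $E = \mathbb{R}^3$ with its usual inner product; the vectors and the helper names are illustrative choices, not part of the text.

```python
import math


def inner(x, h):
    """Usual inner product of R^n."""
    return sum(a * b for a, b in zip(x, h))


def norm(x):
    """N(x) = sqrt(<x | x>)."""
    return math.sqrt(inner(x, x))


def dnorm(x, h):
    """DN(x).h = <x | h> / N(x), valid for x != 0."""
    return inner(x, h) / norm(x)


def central_difference(x, h, t=1e-6):
    """Symmetric difference quotient (N(x + t.h) - N(x - t.h)) / (2 t)."""
    xp = [a + t * b for a, b in zip(x, h)]
    xm = [a - t * b for a, b in zip(x, h)]
    return (norm(xp) - norm(xm)) / (2.0 * t)


x = [3.0, 0.0, 4.0]
h = [1.0, 2.0, -1.0]
exact = dnorm(x, h)                 # (3 - 4) / 5 = -0.2
approx = central_difference(x, h)   # agrees with `exact` up to discretization error
```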

It was shown in [TOR 73] that every reflexive (not necessarily separable) Banach space is C 1-paracompact and that the separability condition is not required in Corollary 1.62:

Theorem 1.63

Every Hilbert space $E$ is $C^\infty$-paracompact.

This implies the result stated in ([P2], section 4.4.1, Theorem 4.88)

Corollary 1.64

(Whitney’s theorem) The space $\mathbb{R}^n$ is $C^\infty$-paracompact.

1.4.2 $c^\infty$-paracompactness

We can define $c^\infty$-regularity, $c^\infty$-normality and $c^\infty$-paracompactness of a locally convex space in the obvious ways; the statement of Theorem 1.59 still holds if $C^\infty$ is replaced by $c^\infty$; moreover, if $c^\infty E$ is a $c^\infty$-regular Lindelöf space, then it is $c^\infty$-paracompact ([KRI 97], Proposition 16.2). We already know that any nuclear space $E$ has a topology defined by a family of pre-Hilbert semi-norms ([P2], section 3.11.3(III)). Each of these semi-norms is of class $c^\infty$ in $E - \{0\}$. Nuclear Fréchet spaces (also known as $\mathcal{N}$ spaces), nuclear Silva spaces (also known as $\mathcal{SN}$ spaces), and countable products of $\mathcal{N}$ spaces and $\mathcal{SN}$ spaces are paracompact Lindelöf spaces (ibid.). The following result is analogous to Corollary 1.62 ([KRI 97], Theorem 16.10):

Theorem 1.65

Every $\mathcal{N}$ space, every strict inductive limit of $\mathcal{N}$ spaces and every $\mathcal{SN}$ space is $c^\infty$-paracompact.

1.5 Ordinary differential equations

1.5.1 Existence and uniqueness theorems

(I) Notion of the solution of a differential equation Let $I$ be an interval of $\mathbb{R}$ with non-empty interior $\mathring{I}$, $\Omega$ a non-empty open subset of $E = \mathbb{R}^n$ and $f$ a mapping from $I \times \Omega$ into $E$. We say that a mapping $\varphi : I \to E$ is a solution (or integral) of the differential equation

$$\dot{x} = f(t, x) \tag{1.22}$$

if the conditions (ODE 1,2,3) below are satisfied:

  • (ODE 1) $\varphi(t) \in \Omega$ for all $t \in I$;
  • (ODE 2) $\varphi$ is locally absolutely continuous in $I$ (i.e. each of its components with respect to the canonical basis of $\mathbb{R}^n$ is locally absolutely continuous: see [P2], section 4.1.7(III));
  • (ODE 3) $\dot{\varphi}(t) = f(t, \varphi(t))$ λ-almost everywhere in $I$, where λ is the Lebesgue measure on $\mathbb{R}$ ([P2], section 4.1.1(II)).

A Cauchy problem involves determining a function φ that is a solution of [1.22] and that satisfies the Cauchy condition:

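$$\varphi(t_0) = x_0 \quad (t_0 \in I,\ x_0 \in \Omega). \tag{1.23}$$

If $\varphi$ satisfies the conditions (ODE 1,2,3) and [1.23], then, for all $t \in I$,

$$\varphi(t) = x_0 + \int_{t_0}^{t} f(\tau, \varphi(\tau))\,d\tau. \tag{1.24}$$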

Conversely, suppose that [1.24] holds, (ODE 1) is satisfied and $t \mapsto f(t, \varphi(t))$ is locally λ-integrable. Then, [1.23] holds; furthermore, Lusin’s measurability criterion ([P2], section 4.1.6(II)) implies that the function $t \mapsto f(t, \varphi(t))$ is λ-measurable for every locally absolutely continuous function $\varphi : I \to E$ if the conditions (Cat 1,2) below are satisfied:

  • (Cat 1) the function $x \mapsto f(t, x)$ from $\Omega$ into $E$ is continuous for every $t \in I$;
  • (Cat 2) the function $t \mapsto f(t, x)$ from $I$ into $E$ is λ-measurable for every $x \in \Omega$.

Suppose further that:

(Cat 3) For all $x_0 \in \Omega$ and every $r > 0$ such that $B_r(x_0) \subset \Omega$, where $B_r(x_0)$ is the open ball of center $x_0$ and radius $r$ in $E$, there exists a locally λ-integrable function $m$ from $I$ into $\mathbb{R}_+$ such that $|f(t, x)| \le m(t)$ for all $(t, x) \in I \times B_r(x_0)$.

Then, for every locally absolutely continuous function $\varphi : I \to E$ satisfying $\varphi(I) \subset B_r(x_0)$, the mapping $t \mapsto |f(t, \varphi(t))|$ is locally λ-integrable in $I$, so $t \mapsto f(t, \varphi(t))$ is locally λ-integrable in $I$ ([P2], section 4.1.2(I)).

Definition 1.66

The conditions (Cat 1,2,3 ) are known as the Carathéodory conditions.

(II) Existence theorem

Theorem 1.67

(Carathéodory) Suppose that the Carathéodory conditions (Cat 1,2,3) are satisfied. Then, for all $(t_0, x_0) \in I \times \Omega$, there exists an interval $J \subset I$ with the point $t_0$ in its interior and a mapping $\varphi : J \to E$ such that $\varphi(J) \subset B_r(x_0)$, $\varphi$ is a solution of [1.22] and this solution satisfies the Cauchy condition [1.23].

Proof

We will show that there exist an interval $J_\beta = [t_0, t_0 + \beta] \subset I$ ($\beta > 0$) and an absolutely continuous mapping $\varphi : J_\beta \to E$ such that $\varphi(J_\beta) \subset B_r(x_0)$ and $\varphi$ satisfies [1.24]. The same argument works on an interval $J_\alpha = [t_0 - \alpha, t_0]$ ($\alpha > 0$). Let $M : [t_0, +\infty[\, \cap\, I \to \mathbb{R}_+$, extended by zero for $t < t_0$, be the mapping:

$$M(t) = 0 \quad (t < t_0), \qquad M(t) = \int_{t_0}^{t} m(\tau)\,d\tau \quad (t \in [t_0, +\infty[\, \cap\, I).$$

Since $M$ is continuous and non-decreasing, there exists an interval $J_\beta$ as specified above satisfying the property that, for all $t \in J_\beta$,

$$0 \le M(t) < r. \tag{1.25}$$

We can now inductively define a sequence of absolutely continuous mappings $\varphi_i : J_\beta \to E$ using the conditions:

$$\varphi_i(t) = x_0 \quad (t_0 \le t \le t_0 + \beta/i), \qquad \varphi_i(t) = x_0 + \int_{t_0}^{t - \beta/i} f(\tau, \varphi_i(\tau))\,d\tau \quad (t_0 + \beta/i < t \le t_0 + \beta).$$

By (Cat 3), the second equation implies that, for all $t \in J_\beta$,

$$|\varphi_i(t) - x_0| \le \int_{t_0}^{t - \beta/i} |f(\tau, \varphi_i(\tau))|\,d\tau \le \int_{t_0}^{t_0 + \beta} m(\tau)\,d\tau < r,$$

so $\varphi_i(t) \in B_r(x_0)$ for all $t \in J_\beta$. If $t_1, t_2 \in J_\beta$, then

$$|\varphi_i(t_2) - \varphi_i(t_1)| \le |M(t_2 - \beta/i) - M(t_1 - \beta/i)|,$$

and $M$ is uniformly continuous in the compact set $J_\beta$ by Heine’s theorem ([P2], section 2.4.5, Theorem 2.86), so $|M(t_2 - \beta/i) - M(t_1 - \beta/i)| \to 0$ uniformly in $i$ if $t_2 - t_1 \to 0$, and the set $H = \{\varphi_i : i \in \mathbb{N}^\times\}$ is equicontinuous ([P2], section 2.7.3). Since $H(t) = \{\varphi_i(t) : i \in \mathbb{N}^\times\}$ is contained in $B_r(x_0)$ for all $t \in J_\beta$, the third Ascoli–Arzelà theorem (ibid.) implies that $H$ is relatively compact in $\mathcal{C}(J_\beta; E)$ equipped with the uniform structure of uniform convergence. Hence, there exists a subsequence $(\varphi_{i_k})$ that converges uniformly to some mapping $\varphi \in \mathcal{C}(J_\beta; E)$. Moreover, $|f(t, \varphi_{i_k}(t))| \le m(t)$ ($t_0 \le t \le t_0 + \beta$) and $f(t, \varphi_{i_k}(t)) \to f(t, \varphi(t))$ for $i_k \to \infty$ by (Cat 1); furthermore, as we saw earlier, (Cat 2) implies that $t \mapsto f(t, \varphi(t))$ is measurable. The Lebesgue dominated convergence theorem ([P2], section 4.1.2(II), Theorem 4.9) therefore implies that, for all $t \in J_\beta$,

$$\int_{t_0}^{t} f(\tau, \varphi_{i_k}(\tau))\,d\tau \to \int_{t_0}^{t} f(\tau, \varphi(\tau))\,d\tau \quad (i_k \to \infty).$$

But, for all $t \in J_\beta$,

$$\varphi_{i_k}(t) = x_0 + \int_{t_0}^{t} f(\tau, \varphi_{i_k}(\tau))\,d\tau - \int_{t - \beta/i_k}^{t} f(\tau, \varphi_{i_k}(\tau))\,d\tau,$$

and the second integral tends to 0 as $i_k \to \infty$, so the equality [1.24] is satisfied for all $t \in J_\beta$. Finally, $\varphi_{i_k} \to \varphi$ in the Banach space $AC(J_\beta; E)$ of absolutely continuous mappings from $J_\beta$ into $E$ ([P2], section 4.1.7(III)), so $\varphi$ is absolutely continuous.
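The delayed approximations $\varphi_i$ used in this proof are explicit: the value $\varphi_i(t)$ depends only on values of $\varphi_i$ at times at most $t - \beta/i$. The following sketch samples them on a uniform grid in the scalar case $E = \mathbb{R}$. The trapezoidal quadrature, the grid refinement parameter `substeps`, the function name and the test problem $\dot{x} = x$, $x(0) = 1$ are all assumptions made for illustration; only the recursion itself comes from the proof.

```python
import math


def delayed_approximation(f, t0, x0, beta, i, substeps=4):
    """Sample phi_i on a uniform grid of [t0, t0 + beta] (scalar case)."""
    n = i * substeps              # the delay beta / i spans `substeps` grid intervals
    h = beta / n
    phi = [x0] * (n + 1)          # phi_i(t) = x0 on [t0, t0 + beta / i]
    integral = [0.0] * (n + 1)    # integral[m] ~ int_{t0}^{t_m} f(tau, phi_i(tau)) dtau
    for j in range(1, n + 1):
        if j > substeps:
            # phi_i(t_j) = x0 + int_{t0}^{t_j - beta/i} f(tau, phi_i(tau)) dtau
            phi[j] = x0 + integral[j - substeps]
        # Trapezoidal update; it only uses values of phi_i that are already known.
        f_prev = f(t0 + (j - 1) * h, phi[j - 1])
        f_curr = f(t0 + j * h, phi[j])
        integral[j] = integral[j - 1] + 0.5 * h * (f_prev + f_curr)
    return phi


def f_lin(t, x):
    """Right-hand side of the illustrative problem xdot = x."""
    return x


phi10 = delayed_approximation(f_lin, 0.0, 1.0, 0.5, 10)
phi200 = delayed_approximation(f_lin, 0.0, 1.0, 0.5, 200)
err10 = abs(phi10[-1] - math.exp(0.5))
err200 = abs(phi200[-1] - math.exp(0.5))
```

For this Lipschitz right-hand side the whole sequence converges, so `err200` is smaller than `err10`. In the general Carathéodory setting the proof only guarantees convergence of a subsequence.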

Corollary 1.68

(Peano’s theorem) Suppose that $f$ is continuous in $I \times \Omega$. Let $J$ be a compact interval that is a neighborhood of $t_0$ in $I$ and let $m = \sup_{t \in J,\ x \in B_r(x_0)} |f(t, x)|$. For every compact interval $[t_0, t_0 + \beta]$ contained in $J$ satisfying $\beta < r/m$, there exists a solution of [1.24] that takes values in $B_r(x_0)$.

Proof

We have $M(t) = m.(t - t_0)$, so the inequality [1.25] is satisfied if and only if $\beta < r/m$.

Remark 1.69

Corollary 1.68 (and hence Theorem 1.67) fails if $E$ is replaced by an arbitrary Banach space ([BOU 76], Chapter 4, section 1, Exercise 18). We can define an absolutely continuous mapping $\varphi : J_\beta \to E$ as we did in ([P2], section 4.1.7(I)), but it is not true in general that $\varphi(t) - \varphi(t_0) = \int_{t_0}^{t} \dot{\varphi}(\tau)\,d\tau$ for all $t \in J_\beta$ (however, this property does hold if $E$ is assumed to be reflexive). Furthermore, since the ball $B_r(x_0)$ is not relatively compact when $E$ is infinite-dimensional, the proof of Theorem 1.67 no longer works.

(III) Uniqueness theorem The fourth Carathéodory condition can be stated as follows (reusing some of the notation of Cat 3):

(Cat 4) For all $x_0 \in E$ and every real number $r > 0$ such that $B_r(x_0) \subset \Omega$, there exists a locally λ-integrable function $k$ from $I$ into $\mathbb{R}_+$ such that

$$|f(t, x') - f(t, x'')| \le k(t).|x' - x''|, \quad (t, x'), (t, x'') \in I \times B_r(x_0).$$

Any locally Lipschitz function in the second variable clearly satisfies the conditions Cat 1,2,3,4, and we may therefore apply Theorem 1.70 to this function. We also have the following result:

Corollary 1.72

(Cauchy–Lipschitz theorem) If $f$ is Lipschitz with constant $k$ in the second variable in $I \times \Omega$, let $J$ be a compact interval contained in $I$ with non-empty interior, $t_0$ a point of $\mathring{I}$, $x_0$ a point of $E$, $r > 0$ a real number such that $B_r(x_0) \subset \Omega$, $m = \sup_{t \in J,\ x \in B_r(x_0)} |f(t, x)|$ and $\rho = \min\{r/m, 1/k\}$. For every compact interval $K$ contained in $J \cap \,]t_0 - \rho, t_0 + \rho[$, there exists a unique mapping $\varphi$ that is a solution of [1.22] and which satisfies the Cauchy condition [1.23].

Remark 1.73

  i)  Theorem 1.70 fails if $E$ is replaced by an arbitrary Banach space. However, Corollary 1.72 remains valid, with an identical proof. Furthermore, $\rho = r/m$ ([BOU 76], Chapter 4, section 1.5, Theorem 1)¹². Interested readers can find additional existence and uniqueness results for the solutions of [1.22] in infinite dimensions in [DEI 77], section 8, and [SCH 89]. Note, however, that “infinite-dimensional systems” are not governed by a functional differential equation of the form [1.22], where $E$ is a Banach space: see [HAL 77].
  ii)  By the mean value theorem (Theorem 1.13), in order for $f$ to be locally Lipschitz, it is sufficient for it to be of class $C^1$.

Theorem 1.74

Let $E$ be the space $\mathbb{R}^n$ (respectively any Banach space), $I \subset \mathbb{R}$ an interval with some interior point $t_0$, $\Omega$ a non-empty open subset of $E$ and $f : I \times \Omega \to E$ a mapping satisfying the conditions Cat 1,2,3,4 (respectively a locally Lipschitz function in the second variable). For all $x_0 \in \Omega$, there exists a maximal interval $J(t_0, x_0) \subset I$ with interior point $t_0$ in which [1.22] has a solution $\varphi$ satisfying the Cauchy condition [1.23] and such that $\varphi(J(t_0, x_0)) \subset \Omega$. This solution $\varphi(.; t_0, x_0)$ is unique.

Proof

Let $\mathfrak{M}$ be the set of intervals $L \subset I$ with non-empty interior containing $t_0$ as an interior point and such that there exists a solution $\varphi$ of [1.22] in $L$ satisfying the Cauchy condition [1.23] and $\varphi(L) \subset \Omega$. The set $\mathfrak{M}$ is not empty by Theorem 1.70 (respectively Corollary 1.72 and Remark 1.73). Let $L, L' \in \mathfrak{M}$ with $L \subset L'$. If $\varphi$, $\varphi'$ are solutions of [1.22] in $L$, $L'$, respectively, satisfying the Cauchy condition [1.23], then it follows from Theorem 1.70 (respectively Corollary 1.72 and Remark 1.73) (arguing by contradiction: exercise) that $\varphi'$ is a continuation of $\varphi$. Let $J(t_0, x_0) = \bigcup_{L \in \mathfrak{M}} L$; there must therefore exist a unique solution $\varphi$ of [1.22] in $J(t_0, x_0)$ satisfying the Cauchy condition [1.23] and $\varphi(J(t_0, x_0)) \subset \Omega$.

Definition 1.75

The solution $\varphi(.; t_0, x_0)$ defined in $J(t_0, x_0)$ is called the maximal integral of [1.22] satisfying [1.23].

Remark 1.76

Suppose that f is locally Lipschitz in the second variable.

  1. 1)  Let t f := sup (J (t 0, x 0)) ≤ +∞ (the argument below also works when t i := inf (J (t 0, x 0)) ≥ −∞, mutatis mutandis). If f (t, φ (.;t 0, x 0)) is bounded in J (t 0, x 0), then ψ (t; t 0, x 0) admits a limit c := φ (t f − 0; t 0, x 0); furthermore, c is a frontier point of Ω if J(t 0x 0) ∩ [t 0, +  ∞ [ ≠ I ∩ [t 0, +  ∞ [. This inequality, together with the condition Ω = E, implies that lim t → t f  | φ(tt 0x 0)| =  + ∞; by contrast, if t f I and Ω = E, then J(t 0x 0) ∩ [t 0, +  ∞ [ = I ∩ [t 0, +  ∞ [ ( [BOU 76] , Chapter 4 , section 1.5 , Theorems 2 and Corollaries 1 and 2). In addition to this remark, see Theorem 5.67 in section 5.7.1 .
  2. 2)  Let $(\tau, \xi)$ be an arbitrary point of $I \times \Omega$. There exist an interval $K \subset I$ that is a neighborhood of $\tau$ in $I$ and a neighborhood $S$ of $\xi$ in $\Omega$ such that, for every point $(t_0, x_0) \in K \times S$, there is a unique solution $\varphi(.; t_0, x_0)$ of [1.22] satisfying [1.23], defined in $K$ (i.e. $J(t_0, x_0) \supset K$). The mapping $(t, t_0, x_0) \mapsto \varphi(t; t_0, x_0)$ from $K \times K \times S$ into $\Omega$ is uniformly continuous ([BOU 76], Chapter 4, section 1.7, Theorem 4).

(IV) Differential equations in implicit form. Let $I$ be an open interval of $\mathbb{R}$, $t_0$ a point of $I$, $F$ a Banach space and $\Omega$ a non-empty open subset of $F^{n+1}$, where $n$ is a natural number. Let $g : I \times \Omega \to F$ be a mapping of class $C^1$ and consider the differential equation:

$$ g(t, y, \dot{y}, \ldots, y^{(n-1)}, y^{(n)}) = 0. \tag{1.26} $$

Let $\eta_0 = (\eta_0^0, \eta_0^1, \ldots, \eta_0^{n-1}) \in F^n$ and $\xi_0 \in F$ be such that $(\eta_0, \xi_0) \in \Omega$ and $g(t_0, \eta_0, \xi_0) = 0$, and suppose that the following condition (Inv) is satisfied:

(Inv) $\dfrac{\partial g}{\partial \xi}(t_0, \eta_0, \xi_0)$ is invertible in $\mathcal{L}(F)$.

By the implicit function theorem (Theorem 1.30), there exist an open neighborhood $J \times U$ of $(t_0, \eta_0)$ in $I \times F^n$, an open neighborhood $V$ of $\xi_0$ in $F$, where $U \times V \subset \Omega$, and a mapping $h : J \times U \to V$ of class $C^1$, such that, for all $(t, \eta, \xi) \in J \times U \times V$,

$$ g(t, \eta, \xi) = 0 \iff \xi = h(t, \eta). $$

Therefore, a mapping $\psi : J \to F$ such that $(\psi(t), \dot{\psi}(t), \ldots, \psi^{(n-1)}(t)) \in U$ and $\psi^{(n)}(t) \in V$ for all $t \in J$ is a solution of [1.26] in $J$ if and only if $\psi$ is a solution of the following differential equation on $J$, said to be in explicit form:

$$ y^{(n)} = h(t, y, \dot{y}, \ldots, y^{(n-1)}). \tag{1.27} $$

Let $x_i = y^{(i-1)}$ ($i = 1, \ldots, n$) and $x = (x_1, \ldots, x_n)$. For $(t, x) \in J \times U$, the differential equation [1.27] is equivalent to [1.22], where $f = (f_1, \ldots, f_n)$ and

$$ f_1(t, x) = x_2, \quad \ldots, \quad f_{n-1}(t, x) = x_n, \quad f_n(t, x) = h(t, x). $$

The mapping $f : J \times U \to F^n$ is of class $C^1$ and hence locally Lipschitz in $x$ by the mean value theorem (Theorem 1.13(i)). We can therefore apply Corollary 1.72 according to Remark 1.73. Let $x_0 = (\eta_0^0, \ldots, \eta_0^{n-1})$; a solution $\varphi = (\varphi_1, \ldots, \varphi_n)$ of [1.22] of class $C^1$ in $J$ satisfies the Cauchy condition [1.23] if and only if $\psi = \varphi_1$ is a solution of [1.26] of class $C^n$ in $J$ with $\psi(t_0) = \eta_0^0, \ldots, \psi^{(n-1)}(t_0) = \eta_0^{n-1}$.
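
The reduction above can be sketched numerically. The example is an assumption made for illustration only: $F = \mathbb{R}$, $n = 2$ and $g(t, y, \dot{y}, \ddot{y}) = \ddot{y} + 0.1\,\ddot{y}^3 + y$, for which $\partial g/\partial \xi = 1 + 0.3\,\xi^2$ never vanishes, so (Inv) holds everywhere. The function `h` below approximates the implicit function by Newton's method, `f` is the first-order form with $x = (x_1, x_2) = (y, \dot{y})$, and a classical Runge-Kutta step stands in for an exact solution; all names are hypothetical.

```python
def g(t, y, dy, d2y):
    """Implicit equation g(t, y, y', y'') = 0 (illustrative choice)."""
    return d2y + 0.1 * d2y ** 3 + y

def dg_dxi(d2y):
    """Partial derivative of g with respect to its last argument; always >= 1."""
    return 1.0 + 0.3 * d2y ** 2

def h(t, eta, tol=1e-13, max_iter=50):
    """Solve g(t, eta[0], eta[1], xi) = 0 for xi by Newton's method."""
    xi = -eta[0]  # root of the linear part of g, used as initial guess
    for _ in range(max_iter):
        step = g(t, eta[0], eta[1], xi) / dg_dxi(xi)
        xi -= step
        if abs(step) < tol:
            break
    return xi

def f(t, x):
    """First-order form: x = (y, y'), f = (f1, f2) = (x2, h(t, x))."""
    return (x[1], h(t, x))

def rk4_step(t, x, dt):
    """One classical Runge-Kutta step for x' = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + dt / 2, tuple(xi + dt / 2 * ki for xi, ki in zip(x, k1)))
    k3 = f(t + dt / 2, tuple(xi + dt / 2 * ki for xi, ki in zip(x, k2)))
    k4 = f(t + dt, tuple(xi + dt * ki for xi, ki in zip(x, k3)))
    return tuple(
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )

t, x, dt = 0.0, (1.0, 0.0), 0.01   # Cauchy data y(0) = 1, y'(0) = 0
for _ in range(200):
    x = rk4_step(t, x, dt)
    t += dt

# Residual of the implicit equation at the final point of the trajectory.
residual = g(t, x[0], x[1], h(t, x))
print(t, x, residual)
```

The residual only measures how well Newton's method solves $g = 0$ for the highest derivative; the accuracy of the trajectory itself depends on the step size `dt`.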

Remark 1.77

If the condition (Inv) is not satisfied, the differential equation [1.26] is singular. We already encountered this situation in the linear case ([P2], section 5.4.6), where it was necessary to introduce solutions in the form of hyperfunctions. The nonlinear case does not have an equivalent general theory.

1.5.2 Linear differential equations

(I) Let $I$ be an interval of $\mathbb{R}$ with non-empty interior $\mathring{I}$ and let $E = \mathbb{R}^n$. Consider the linear differential equation

$$ \dot{x} = A(t).x + b(t), \tag{1.28} $$

where $A : I \to \mathcal{L}(E)$ and $b : I \to E$ are locally λ-integrable. The Carathéodory conditions Cat 1,2,3,4 are all satisfied. (In particular, setting $f(t, x) = A(t).x + b(t)$, we have

$$ |f(t, x) - f(t, x')| \le \|A(t)\| \, |x - x'|, $$

which shows that Cat 4 is satisfied.) Hence, for all $t_0 \in \mathring{I}$ and every $x_0 \in E$, there exists a unique solution $\varphi(.; t_0, x_0)$ of [1.28] in $I$ satisfying [1.23].

The linear equation [1.28] is said to be homogeneous if b = 0, in which case

$$ \dot{x} = A(t).x. \tag{1.29} $$

(II) The set of solutions of [1.29] (where $A$ is locally λ-integrable) is an $\mathbb{R}$-vector space. Let $\varphi_0(.; t_0, x_0)$ be the solution of [1.29] in $I$ satisfying [1.23]. The mapping $x_0 \mapsto \varphi_0(t; t_0, x_0)$ is a bijective linear mapping $\Phi(t, t_0)$ from $E$ onto $E$, and $\Phi(., t_0)$ is identical to the solution of the differential equation

$$ \frac{dU}{dt} = A(t)\, U $$

with $U(t_0) = 1_E$. For all $t_1, t_2, t_3 \in I$, we have $\Phi(t_3, t_1) = \Phi(t_3, t_2) \circ \Phi(t_2, t_1)$ and $\Phi(t_1, t_2) = \Phi(t_2, t_1)^{-1}$.
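
The composition rule for the resolvent can be checked numerically. The sketch below assumes $E = \mathbb{R}^2$ and an arbitrarily chosen coefficient matrix $A(t)$ (not taken from the text); `resolvent` approximates $\Phi(t, t_0)$ by applying a Runge-Kutta scheme to $dU/dt = A(t)\,U$ with $U(t_0) = 1_E$. All function names are hypothetical.

```python
def A(t):
    """Time-varying coefficient matrix (illustrative choice)."""
    return [[0.0, 1.0], [-1.0, -t]]

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(P, c, Q):
    """Return P + c * Q entrywise."""
    return [[P[i][j] + c * Q[i][j] for j in range(2)] for i in range(2)]

def resolvent(t, t0, steps=2000):
    """Approximate Phi(t, t0) by RK4 applied to dU/dt = A(t) U, U(t0) = identity."""
    dt = (t - t0) / steps          # negative when t < t0 (backward integration)
    U, s = [[1.0, 0.0], [0.0, 1.0]], t0
    for _ in range(steps):
        k1 = matmul(A(s), U)
        k2 = matmul(A(s + dt / 2), axpy(U, dt / 2, k1))
        k3 = matmul(A(s + dt / 2), axpy(U, dt / 2, k2))
        k4 = matmul(A(s + dt), axpy(U, dt, k3))
        incr = axpy(axpy(k1, 2.0, k2), 1.0, axpy(k4, 2.0, k3))  # k1 + 2 k2 + 2 k3 + k4
        U = axpy(U, dt / 6, incr)
        s += dt
    return U

t1, t2, t3 = 0.0, 0.5, 1.0
direct = resolvent(t3, t1)
composed = matmul(resolvent(t3, t2), resolvent(t2, t1))
composition_error = max(abs(direct[i][j] - composed[i][j])
                        for i in range(2) for j in range(2))
round_trip = matmul(resolvent(t1, t2), resolvent(t2, t1))  # should be close to the identity
print(composition_error, round_trip)
```

Both checks hold only up to the discretization error of the integrator, so the comparison is made with a tolerance rather than exactly.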

Definition 1.78

The mapping Φ is called the resolvent of the equation [1.29]. The matrix representing this resolvent with respect to the canonical basis of E is called the transition matrix.

Theorem 1.79

We have

$$ \det \Phi(t, t_0) = \exp\left( \int_{t_0}^{t} \operatorname{Tr}(A(\tau)) \, d\tau \right). $$
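
As a sanity check of Theorem 1.79, consider the diagonal case $A(t) = \operatorname{diag}(a_1(t), a_2(t))$ on $E = \mathbb{R}^2$, with $a_1, a_2$ locally λ-integrable. This special case is an illustrative assumption, not part of the original statement. The two components decouple, and the determinant formula can be read off directly:

```latex
\Phi(t, t_0) =
\begin{pmatrix}
  \exp\int_{t_0}^{t} a_1(\tau)\, d\tau & 0 \\
  0 & \exp\int_{t_0}^{t} a_2(\tau)\, d\tau
\end{pmatrix},
\qquad
\det \Phi(t, t_0)
  = \exp\left( \int_{t_0}^{t} \bigl( a_1(\tau) + a_2(\tau) \bigr)\, d\tau \right)
  = \exp\left( \int_{t_0}^{t} \operatorname{Tr}(A(\tau))\, d\tau \right).
```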

(III) The above shows that the general solution of [1.29] is of the form $t \mapsto \Phi(t, t_0)\xi$. The “variation of constants” method (exercise) allows us to obtain the following solution (defined in $I$) of [1.28] satisfying the Cauchy condition [1.23]:

$$ \varphi(t; t_0, x_0) = \Phi(t, t_0).x_0 + \int_{t_0}^{t} \Phi(t, \tau).b(\tau)\, d\tau. $$
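
The variation of constants formula, $\varphi(t; t_0, x_0) = \Phi(t, t_0).x_0 + \int_{t_0}^{t} \Phi(t, \tau).b(\tau)\, d\tau$, can be tested on a scalar example. The sketch assumes $E = \mathbb{R}$, a constant coefficient $a = -1$ and $b(t) = \sin t$ (choices made for illustration, not taken from the text), so that $\Phi(t, \tau) = e^{a(t - \tau)}$; the integral is approximated by composite Simpson quadrature and compared with a solution computed by hand. All names are hypothetical.

```python
import math

a, x0, t0 = -1.0, 2.0, 0.0   # scalar equation x' = a*x + b(t), x(t0) = x0

def b(t):
    """Forcing term (illustrative choice)."""
    return math.sin(t)

def Phi(t, s):
    """Resolvent of x' = a*x: Phi(t, s) = exp(a*(t - s))."""
    return math.exp(a * (t - s))

def variation_of_constants(t, n=2000):
    """Phi(t,t0)*x0 + integral over [t0, t] of Phi(t,tau)*b(tau), by composite Simpson."""
    h = (t - t0) / n          # n must be even for Simpson's rule
    total = Phi(t, t0) * b(t0) + Phi(t, t) * b(t)
    for i in range(1, n):
        tau = t0 + i * h
        total += (4 if i % 2 else 2) * Phi(t, tau) * b(tau)
    return Phi(t, t0) * x0 + total * h / 3

def closed_form(t):
    """Hand-computed solution for a = -1, b = sin, t0 = 0."""
    return math.exp(-t) * x0 + (math.sin(t) - math.cos(t) + math.exp(-t)) / 2

print(variation_of_constants(3.0), closed_form(3.0))
```

Note that the kernel under the integral is $\Phi(t, \tau)$, which transports the contribution of $b(\tau)$ from time $\tau$ to time $t$.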

(IV) The integration of linear differential equations with constant coefficients is a classical problem and is performed using the Jordan normal form ([P1], section 3.4.3 (IV)): see, for example, [BOU 10], section 12.5.2.

1.5.3 Parameter dependence of solutions

Let $I$ be an interval of $\mathbb{R}$ with non-empty interior, $E$ a Banach space (see Remark 1.73), $\Omega$ a non-empty open subset of $E$, $\Lambda$ a topological space and $f$ a mapping from $I \times \Omega \times \Lambda$ into $E$. Write $f_\lambda(t, x)$ for the value of $f$ at the point $(t, x, \lambda) \in I \times \Omega \times \Lambda$. It is possible to show the following result ([BOU 76], Chapter 4, section 1.6, Theorem 3):

Theorem 1.80

(parameter dependence of solutions) Suppose that, for all $\lambda \in \Lambda$, $(t, x) \mapsto f_\lambda(t, x)$ is Lipschitz in the second variable $x$ in $I \times \Omega$ and that $f_\lambda(t, x) \to f_{\lambda_0}(t, x)$ uniformly in $I \times \Omega$ as $\lambda \to \lambda_0$. Let $\varphi_{\lambda_0}$ be a solution of $\dot{x} = f_{\lambda_0}(t, x)$ satisfying the Cauchy condition $\varphi_{\lambda_0}(t_0) = x_0$ ($t_0 \in I$, $x_0 \in \Omega$), defined on an interval $J = [t_0, t_0 + \beta[ \subset I$ and taking values in $\Omega$. For every compact interval $[t_0, t_1] \subset J$, there exists a neighborhood $V$ of $\lambda_0$ in $\Lambda$ such that, for all $\lambda \in V$, the differential equation

$$ \dot{x} = f_\lambda(t, x) \tag{1.31} $$

admits a unique solution $\varphi_\lambda$ defined in $[t_0, t_1]$ satisfying the Cauchy condition $\varphi_\lambda(t_0) = x_0$ and taking values in $\Omega$; furthermore, as $\lambda \to \lambda_0$, $\varphi_\lambda \to \varphi_{\lambda_0}$ uniformly in $[t_0, t_1]$.
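
The uniform convergence asserted by Theorem 1.80 can be observed on a small numerical experiment. The sketch assumes $E = \mathbb{R}$, $I = [0, 2]$, $\Omega = ]-2, 2[$, $\Lambda = \mathbb{R}$ and $f_\lambda(t, x) = -x^3 + \lambda \cos t$ (an example chosen for illustration, not taken from the text); then $f_\lambda - f_{\lambda_0} = (\lambda - \lambda_0)\cos t$ tends to $0$ uniformly, and $f_\lambda$ is Lipschitz in $x$ on the bounded set $\Omega$. Solutions are approximated by a Runge-Kutta scheme, and the names `f`, `solve` and `gaps` are hypothetical.

```python
import math

def f(t, x, lam):
    """Parametrised right-hand side f_lambda(t, x) (illustrative choice)."""
    return -x ** 3 + lam * math.cos(t)

def solve(lam, x0=1.0, t0=0.0, t1=2.0, steps=2000):
    """RK4 approximation of phi_lambda on [t0, t1]; returns the sampled values."""
    dt = (t1 - t0) / steps
    t, x, values = t0, x0, [x0]
    for _ in range(steps):
        k1 = f(t, x, lam)
        k2 = f(t + dt / 2, x + dt / 2 * k1, lam)
        k3 = f(t + dt / 2, x + dt / 2 * k2, lam)
        k4 = f(t + dt, x + dt * k3, lam)
        x += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
        values.append(x)
    return values

lam0 = 1.0
reference = solve(lam0)
gaps = []
for delta in (1e-1, 1e-2, 1e-3):
    perturbed = solve(lam0 + delta)
    # Discrete sup-norm distance between phi_lambda and phi_lambda0 on [t0, t1].
    gaps.append(max(abs(p - r) for p, r in zip(perturbed, reference)))
print(gaps)
```

The sampled sup-norm distance shrinks with $|\lambda - \lambda_0|$, as expected; the experiment illustrates the statement but of course proves nothing.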

Suppose now that $\Lambda$ is an open subset of a normed vector space $F$ and that the mapping $(t, x, \lambda) \mapsto f_\lambda(t, x)$ is continuous with continuous partial differentials $(t, x, \lambda) \mapsto \dfrac{\partial f_\lambda}{\partial x}(t, x)$ and $(t, x, \lambda) \mapsto \dfrac{\partial f_\lambda}{\partial \lambda}(t, x)$. Then, $f_\lambda$ is locally Lipschitz in the second variable $x$ (see section 1.5.1 (V)). Suppose further that the mappings $x_0 : \Lambda \to \Omega : \lambda \mapsto x_0(\lambda)$ and $t_0 : \Lambda \to I : \lambda \mapsto t_0(\lambda)$ are of class $C^1$ in $\Lambda$. For all $\lambda_0 \in \Lambda$, Theorem 1.80 implies that there exist an open neighborhood $V$ of $\lambda_0$ in $\Lambda$ and an open interval $J \subset I$ such that $t_0(\lambda_0) \in J$, where the sets $V$ and $J$ satisfy the following condition: for all $\lambda \in V$, $t_0(\lambda) \in J$ and there exists a solution $\varphi_\lambda = \varphi(.; t_0(\lambda), x_0(\lambda))$ of [1.31] in $J$ taking the value $x_0(\lambda) \in \Omega$ at the point $t_0(\lambda)$. Then, we have the following result:

Theorem 1.81

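For every $t \in J$, the mapping $\lambda \mapsto \varphi_\lambda(t)$ from $V$ into $\Omega$ has a differential $h(t) \in \mathcal{L}(F; E)$ at the point $\lambda_0$ of $V$; $h$ is of class $C^1$ in $J$ and is the unique solution of the linear equation:

$$ \dot{y} = A(t) \circ y + b(t) \tag{1.32} $$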

where

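$$ A(t) := \left. \frac{\partial f_\lambda}{\partial x}(t, \varphi_\lambda(t)) \right|_{\lambda = \lambda_0}, \qquad b(t) := \left. \frac{\partial f_\lambda}{\partial \lambda}(t, \varphi_\lambda(t)) \right|_{\lambda = \lambda_0}. $$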

This differential h satisfies the Cauchy condition:

$$ h(t_0(\lambda_0)) = Dx_0(\lambda_0) - \left. f_\lambda(t_0(\lambda), x_0(\lambda)) \right|_{\lambda = \lambda_0} \circ Dt_0(\lambda_0). $$

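Note that, in the linear equation [1.32], $A(t) \in \mathcal{L}(\mathcal{L}(F; E))$; moreover, $y$, $A(t) \circ y$, and $b(t)$ belong to $\mathcal{L}(F; E)$. In the Cauchy condition, $f_\lambda(t_0(\lambda), x_0(\lambda))|_{\lambda = \lambda_0} \in E \cong \mathcal{L}(\mathbb{R}; E)$, $Dx_0(\lambda_0) \in \mathcal{L}(F; E)$, $Dt_0(\lambda_0) \in \mathcal{L}(F; \mathbb{R})$ (the dual of $F$), and $f_\lambda(t_0(\lambda), x_0(\lambda))|_{\lambda = \lambda_0} \circ Dt_0(\lambda_0) \in \mathcal{L}(F; E)$.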
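
Theorem 1.81 can be illustrated on a scalar example, chosen here as an assumption and not taken from the text: $E = F = \mathbb{R}$, $f_\lambda(t, x) = -\lambda x$, $t_0(\lambda) = 0$ and $x_0(\lambda) = \lambda$. Then $A(t) = -\lambda_0$, $b(t) = -\varphi_{\lambda_0}(t)$ and the Cauchy condition reads $h(0) = Dx_0(\lambda_0) = 1$, since $Dt_0 = 0$. The sketch integrates the pair $(\varphi_{\lambda_0}, h)$ with a Runge-Kutta scheme and compares $h$ with the derivative of the closed-form solution $\varphi_\lambda(t) = \lambda e^{-\lambda t}$; all names are hypothetical.

```python
import math

lam0 = 0.7  # nominal parameter value (illustrative)

def rhs(t, state):
    """Augmented system: phi' = f(t, phi), h' = A(t) h + b(t) for f = -lam * x."""
    phi, h = state
    A = -lam0          # df/dx evaluated at lam = lam0
    b = -phi           # df/dlam evaluated along phi_lam0
    return (-lam0 * phi, A * h + b)

def integrate(t1, steps=1000):
    """RK4 from t0 = 0 with phi(0) = x0(lam0) = lam0 and h(0) = Dx0(lam0) = 1."""
    dt = t1 / steps
    t, s = 0.0, (lam0, 1.0)
    for _ in range(steps):
        k1 = rhs(t, s)
        k2 = rhs(t + dt / 2, tuple(x + dt / 2 * k for x, k in zip(s, k1)))
        k3 = rhs(t + dt / 2, tuple(x + dt / 2 * k for x, k in zip(s, k2)))
        k4 = rhs(t + dt, tuple(x + dt * k for x, k in zip(s, k3)))
        s = tuple(x + dt / 6 * (p + 2 * q + 2 * r + w)
                  for x, p, q, r, w in zip(s, k1, k2, k3, k4))
        t += dt
    return s

def phi_exact(t, lam):
    """Closed-form solution of x' = -lam * x with x(0) = lam."""
    return lam * math.exp(-lam * t)

t1 = 2.0
phi_num, h_num = integrate(t1)
eps = 1e-6
# Central finite difference of the closed-form solution with respect to lam.
h_fd = (phi_exact(t1, lam0 + eps) - phi_exact(t1, lam0 - eps)) / (2 * eps)
h_exact = (1 - lam0 * t1) * math.exp(-lam0 * t1)
print(h_num, h_fd, h_exact)
```

Integrating the linear equation alongside the nominal solution avoids differencing two nearby trajectories, which is the usual practical reason for working with $h$ directly.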

Corollary 1.82

(differentiability of the solution with respect to x 0) Suppose that F = E, Λ = Ω, and λ = x 0, and adopt the same hypotheses as Theorem 1.81 (mutatis mutandis). Write f and φ for f λ and φ λ respectively. Then:

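$$ \frac{\partial \varphi}{\partial x_0}(t; t_0, x_0) = \Phi(t, t_0) $$

where Φ is the resolvent (Definition 1.78) of the linear differential equation

$$ \dot{y} = A(t)\, y, \qquad A(t) := \frac{\partial f}{\partial x}(t, \varphi(t; t_0, x_0)). $$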

Proof

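Apply Theorem 1.81 with $\lambda = x_0$, $x_0(\lambda) = \lambda$ and $t_0$ constant. Since $f$ does not depend on $\lambda$, we have $b = 0$; moreover, $Dx_0(\lambda_0) = 1_E$ and $Dt_0(\lambda_0) = 0$. Hence $\dot{h}(t) = A(t) \circ h(t)$ and $h(t_0) = 1_E$, which characterizes $\Phi(., t_0)$.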
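
Corollary 1.82 can be verified by hand on the scalar equation $\dot{x} = x^2$ with $E = \Omega = \mathbb{R}$, an example assumed here for illustration and not taken from the text. The solution is known in closed form, and both sides of the identity can be computed independently:

```latex
\varphi(t; t_0, x_0) = \frac{x_0}{1 - x_0 (t - t_0)},
\qquad
\frac{\partial \varphi}{\partial x_0}(t; t_0, x_0) = \frac{1}{\bigl(1 - x_0 (t - t_0)\bigr)^{2}},
\\[1ex]
A(t) = 2\,\varphi(t; t_0, x_0) = \frac{2 x_0}{1 - x_0 (t - t_0)},
\qquad
\Phi(t, t_0) = \exp\left( \int_{t_0}^{t} A(\tau)\, d\tau \right)
  = \exp\Bigl( -2 \ln\bigl(1 - x_0 (t - t_0)\bigr) \Bigr)
  = \frac{1}{\bigl(1 - x_0 (t - t_0)\bigr)^{2}}.
```

Both expressions agree for every $t$ with $x_0 (t - t_0) < 1$, that is, on the maximal interval of existence.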

Corollary 1.83

(differentiability with respect to the initial time) Suppose that $F = \mathbb{R}$, $\Lambda = I$, and $\lambda = t_0$. With the notation of Corollary 1.82 and the same hypotheses (exercise):

$$ \frac{\partial \varphi}{\partial t_0}(t; t_0, x_0) = -\Phi(t, t_0)\, f(t_0, x_0). $$
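
The identity $\partial \varphi / \partial t_0 = -\Phi(t, t_0)\, f(t_0, x_0)$ can be checked by a finite difference on the scalar equation $\dot{x} = x^2$, an example assumed here for illustration. Its solution is $\varphi(t; t_0, x_0) = x_0 / (1 - x_0 (t - t_0))$ and its resolvent along this solution is $\Phi(t, t_0) = (1 - x_0 (t - t_0))^{-2}$; the function names below are hypothetical.

```python
def phi(t, t0, x0):
    """Closed-form solution of x' = x**2 with phi(t0) = x0, valid while x0*(t - t0) < 1."""
    return x0 / (1.0 - x0 * (t - t0))

def resolvent(t, t0, x0):
    """Phi(t, t0) for the linearised equation y' = 2*phi(t)*y along the solution."""
    return 1.0 / (1.0 - x0 * (t - t0)) ** 2

def f(t, x):
    """Right-hand side of the differential equation."""
    return x ** 2

t, t0, x0, eps = 0.5, 0.0, 1.0, 1e-6
# Central difference of the solution with respect to the initial time t0.
fd = (phi(t, t0 + eps, x0) - phi(t, t0 - eps, x0)) / (2 * eps)
predicted = -resolvent(t, t0, x0) * f(t0, x0)
print(fd, predicted)
```

The minus sign reflects the fact that delaying the initial time by a small amount shortens the time during which the solution evolves from $x_0$.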

Remark 1.84

Let $E$ be finite-dimensional. The statement of Corollary 1.82 still holds under the weaker condition ([ALE 87], Chapter 2, section 2.5.6) that $f$ satisfies the Carathéodory conditions (Cat 1,2,3) and Lusin’s condition (L) holds:

(L): For all $t \in I$, the mapping $x \mapsto f(t, x)$ is continuously differentiable on $\Omega$ and, for every compact set $K \subset \Omega$, there exists a locally λ-integrable function $k : I \to \mathbb{R}_+$ such that

$$ \left\| \frac{\partial f}{\partial x}(t, x) \right\| \le k(t), \quad \forall (t, x) \in I \times K. $$

It is clear that condition (L) implies (Cat 4) by the mean value theorem. This condition does not require f to be continuous in t, which is very important in optimal control theory (Pontryagin maximum principle): see op. cit. and [PON 62]. We can further weaken (L) by considering generalized gradients and differential inclusions ([CLA 90], Theorem 7.4.1), but this exceeds the scope of the present book.