7.1 Introduction
Bilevel optimization has been well recognized as a theoretically very challenging and practically important area of applied mathematics. We refer the reader to the monographs [2, 6, 14, 30, 41], the extensive bibliographies and commentaries therein, as well as to the advanced material included in this book for various approaches, theoretical and numerical results, and a variety of practical applications of bilevel optimization and related topics.
One of the characteristic features of bilevel optimization problems is their intrinsic nonsmoothness, even if their initial data are described by linear functions. This makes it natural to apply an approach of modern variational analysis and generalized differentiation to the study and applications of major models in bilevel optimization. This has been done in numerous publications, which are presented and analyzed in the author's recent book [30].
The main goal we pursue here is to overview this approach together with the corresponding machinery of variational analysis and to apply it to deriving necessary optimality conditions in optimistic bilevel models with Lipschitzian data, while also commenting on other versions of bilevel optimization and posing open questions. To make this chapter largely self-contained and more accessible for the reader, we present here the basic background from variational analysis and generalized differentiation that is needed for applications to bilevel optimization. For brevity and simplicity, we confine ourselves to problems in finite-dimensional spaces.
The rest of this work is organized as follows. In Sect. 7.2 we recall those constructions of generalized differentiation in variational analysis, which are broadly used in the subsequent text. Section 7.3 presents the fundamental extremal principle that is behind generalized differential calculus and applications to optimization in the geometric approach to variational analysis developed in [29, 30]. Section 7.4 is devoted to deriving—via the extremal principle—the two basic calculus rules, which are particularly useful for applications to optimality conditions. In Sect. 7.5 we establish subdifferential evaluations and efficient conditions that ensure the local Lipschitz continuity of optimal value functions in general problems of parametric optimization. These results are crucial for variational applications to bilevel programming.
To proceed with such applications, we first consider in Sect. 7.6 problems of nondifferentiable programming with Lipschitzian data. Subdifferential necessary optimality conditions for Lipschitzian programs are derived there by using the extremal principle and calculus rules. Section 7.7 contains the formulation of the bilevel optimization problems under consideration and the description of the variational approach to their study. Based on this approach and subdifferentiation of the optimal value functions for lower-level problems, we establish in Sect. 7.8 necessary optimality conditions for Lipschitzian bilevel programs. Other developments in this direction for bilevel optimization problems with Lipschitzian data are presented in Sect. 7.9 by using the subdifferential difference rule based on a certain variational technique. The concluding Sect. 7.10 discusses further perspectives of employing concepts and techniques of variational analysis in bilevel optimization with formulations of some open questions.
Throughout this chapter we use the standard notation and terminology of variational analysis and generalized differentiation; see, e.g., [29, 30, 40].
7.2 Basic Constructions of Generalized Differentiation
Here we present the basic definitions of generalized normals to sets, coderivatives of set-valued mappings, and subgradients of extended-real-valued functions initiated by the author [26] that are predominantly used in what follows. The reader is referred to the books [29, 30, 40] for more details. Developing a geometric approach to generalized differentiation, we start with normals to sets, then continue with coderivatives of (set-valued and single-valued) mappings, and finally pass to subgradients of extended-real-valued functions.
Given a nonempty closed set $\Omega\subset\mathbb{R}^n$ and a point $\bar{x}\in\Omega$, the Euclidean projector of $x\in\mathbb{R}^n$ to $\Omega$ is the nonempty set

$$\Pi(x;\Omega):=\big\{w\in\Omega\;\big|\;\|x-w\|=\min_{u\in\Omega}\|x-u\|\big\}.$$

The (basic, limiting) normal cone to $\Omega$ at $\bar{x}$ is then defined by

$$N(\bar{x};\Omega):=\Big\{v\in\mathbb{R}^n\;\Big|\;\exists\,x_k\to\bar{x},\;\exists\,w_k\in\Pi(x_k;\Omega),\;\exists\,\alpha_k\ge 0\ \mbox{ such that }\ \alpha_k(x_k-w_k)\to v\ \mbox{ as }\ k\to\infty\Big\}.\tag{7.2.1}$$
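Although the constructions above are purely analytic, they are easy to probe numerically. The following Python sketch is an illustration only (the set, the test points, and the tolerances are assumptions made for this example): it computes the Euclidean projector onto the closed unit disk in $\mathbb{R}^2$ and generates normal directions of the form $\alpha_k(x_k-w_k)$ from points approaching a boundary point.

```python
import numpy as np

def project_to_disk(x):
    """Euclidean projector onto the closed unit disk in R^2."""
    nx = np.linalg.norm(x)
    return x if nx <= 1.0 else x / nx

def normal_direction(x):
    """Normalized direction alpha*(x - w) with w in Pi(x; Omega)."""
    w = project_to_disk(x)
    d = x - w
    nd = np.linalg.norm(d)
    return d / nd if nd > 0 else d

# Points x_k -> x_bar = (1, 0) from outside the disk produce
# normal directions converging to the outward normal (1, 0).
xs = [np.array([1.0 + 1.0 / k, 1.0 / k**2]) for k in range(1, 100)]
dirs = [normal_direction(x) for x in xs]
print(dirs[-1])  # close to [1, 0]
```

For the smooth boundary of the disk, every sequence of such directions at $\bar{x}=(1,0)$ recovers the single outward normal, in accordance with the definition.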
In contrast to the case of convex sets $\Omega$, where (7.2.1) reduces to the normal cone of convex analysis, the basic normal cone is often nonconvex; this happens, e.g., at $\bar{x}=(0,0)\in\mathbb{R}^2$ for $\Omega:=\{(x_1,x_2)\in\mathbb{R}^2\,|\,x_2\ge-|x_1|\}$. Observe also that $N(\bar{x};\Omega)\neq\{0\}$ whenever $\bar{x}$ is a boundary point of $\Omega$, while $N(\bar{x};\Omega)=\{0\}$ if $\bar{x}$ is an interior point of $\Omega$.
Along with (7.2.1), we use its equivalent analytic representation via the sets of $\varepsilon$-normals

$$\hat{N}_\varepsilon(x;\Omega):=\Big\{v\in\mathbb{R}^n\;\Big|\;\limsup_{u\overset{\Omega}{\to}x}\frac{\langle v,u-x\rangle}{\|u-x\|}\le\varepsilon\Big\},\quad\varepsilon\ge 0,\tag{7.2.2}$$

where $u\overset{\Omega}{\to}x$ means that $u\to x$ with $u\in\Omega$, namely

$$N(\bar{x};\Omega)=\mathop{\mathrm{Lim}\,\mathrm{sup}}_{x\to\bar{x},\;\varepsilon\downarrow 0}\hat{N}_\varepsilon(x;\Omega)\tag{7.2.3}$$

in terms of the outer limit of sets. For $\varepsilon=0$ the set $\hat{N}(x;\Omega):=\hat{N}_0(x;\Omega)$ is known as the regular (Fréchet) normal cone to $\Omega$ at $x$; the enlargements $\hat{N}_{\varepsilon_k}$ along sequences $\varepsilon_k\downarrow 0$ appear in the limiting procedures below.
Consider next a set-valued mapping $F\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ with values $F(x)\subset\mathbb{R}^m$ and associate with it the domain and graph

$$\mbox{dom}\,F:=\big\{x\in\mathbb{R}^n\big|\;F(x)\neq\emptyset\big\}\ \mbox{ and }\ \mbox{gph}\,F:=\big\{(x,y)\in\mathbb{R}^n\times\mathbb{R}^m\big|\;y\in F(x)\big\}.$$

Given $(\bar{x},\bar{y})\in\mbox{gph}\,F$, the coderivative of $F$ at $(\bar{x},\bar{y})$ is defined via the normal cone (7.2.1) to the graph of $F$ by

$$D^*F(\bar{x},\bar{y})(w):=\big\{v\in\mathbb{R}^n\big|\;(v,-w)\in N\big((\bar{x},\bar{y});\mbox{gph}\,F\big)\big\},\quad w\in\mathbb{R}^m,\tag{7.2.4}$$

where $\bar{y}=F(\bar{x})$ is omitted in the coderivative notation when $F\colon\mathbb{R}^n\to\mathbb{R}^m$ is single-valued.
Thus $D^*F(\bar{x},\bar{y})\colon\mathbb{R}^m\rightrightarrows\mathbb{R}^n$ is a positively homogeneous set-valued mapping, which is single-valued and linear when $F\colon\mathbb{R}^n\to\mathbb{R}^m$ is single-valued and continuously differentiable around $\bar{x}$; in this case

$$D^*F(\bar{x})(w)=\big\{\nabla F(\bar{x})^*w\big\}\ \mbox{ for all }\ w\in\mathbb{R}^m,$$

where $\nabla F(\bar{x})^*$ stands for the adjoint (transposed) Jacobian matrix of $F$ at $\bar{x}$.
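For smooth single-valued mappings the coderivative can thus be computed directly from the transposed Jacobian. The following numerical check is illustrative only (the mapping $F$ and the test point are ad hoc choices for this sketch): it compares $\nabla F(\bar{x})^*w$ obtained from a finite-difference Jacobian with the exact adjoint Jacobian.

```python
import numpy as np

def F(x):
    """A smooth mapping F: R^2 -> R^2 used only as an example."""
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) * x[1]])

def jacobian_fd(F, x, h=1e-6):
    """Central finite-difference Jacobian of F at x."""
    n = len(x)
    cols = []
    for i in range(n):
        e = np.zeros(n); e[i] = h
        cols.append((F(x + e) - F(x - e)) / (2 * h))
    return np.column_stack(cols)

x_bar = np.array([0.5, -1.0])
w = np.array([2.0, 3.0])

J = jacobian_fd(F, x_bar)
coderiv = J.T @ w        # D*F(x_bar)(w) = {grad F(x_bar)^* w}

# Exact Jacobian of this particular F, for comparison.
J_exact = np.array([[2 * x_bar[0], 1.0],
                    [np.cos(x_bar[0]) * x_bar[1], np.sin(x_bar[0])]])
print(coderiv, J_exact.T @ w)
```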
Recall that a set-valued mapping $F\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ is Lipschitz-like (has the Aubin property) around $(\bar{x},\bar{y})\in\mbox{gph}\,F$ with modulus $\ell\ge 0$ if there exist neighborhoods $U$ of $\bar{x}$ and $V$ of $\bar{y}$ such that

$$F(x)\cap V\subset F(u)+\ell\|x-u\|\mathbb{B}\ \mbox{ for all }\ x,u\in U,\tag{7.2.5}$$

where $\mathbb{B}$ denotes the closed unit ball of the space in question. For $V=\mathbb{R}^m$ this property reduces to the classical (Hausdorff) local Lipschitz continuity of $F$ around $\bar{x}$. The coderivative criterion tells us that a mapping $F$ with locally closed graph is Lipschitz-like around $(\bar{x},\bar{y})$ if and only if

$$D^*F(\bar{x},\bar{y})(0)=\{0\}.\tag{7.2.6}$$

Moreover, the exact bound of Lipschitzian moduli $\ell$ in (7.2.5) is computed as the norm $\|D^*F(\bar{x},\bar{y})\|$ of the positively homogeneous mapping $w\mapsto D^*F(\bar{x},\bar{y})(w)$.
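The Lipschitz-like inclusion above can be checked by sampling for simple mappings. The sketch below is illustrative only (the mapping $F(x)=[x,\infty)$, the modulus $\ell=1$, and the sampling grids are assumptions made for this example): the distance from any $y\in F(x)$ near $\bar{y}=0$ to $F(u)$ never exceeds $\ell\|x-u\|$, in agreement with the coderivative criterion, since here $D^*F(0,0)(0)=\{0\}$.

```python
import numpy as np

def dist_to_F(y, u):
    """Distance from y to F(u) = [u, +inf)."""
    return max(0.0, u - y)

# Sample the Aubin inclusion F(x) ∩ V ⊂ F(u) + l*|x-u|*B with l = 1
# around (x_bar, y_bar) = (0, 0), taking V = (-inf, 1] near y_bar.
ell = 1.0
worst = 0.0
for x in np.linspace(-0.5, 0.5, 21):
    for u in np.linspace(-0.5, 0.5, 21):
        for y in np.linspace(x, 1.0, 50):   # y in F(x) ∩ V
            excess = dist_to_F(y, u) - ell * abs(x - u)
            worst = max(worst, excess)
print("worst violation:", worst)  # 0.0: the inclusion holds with l = 1
```

Indeed, for $y\ge x$ one has $u-y\le u-x\le|x-u|$, so no sampled point can violate the inclusion.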
Consider finally an extended-real-valued function $\varphi\colon\mathbb{R}^n\to\overline{\mathbb{R}}:=(-\infty,\infty]$, which is assumed to be lower semicontinuous around the reference point $\bar{x}$, with the domain and epigraph

$$\mbox{dom}\,\varphi:=\big\{x\in\mathbb{R}^n\big|\;\varphi(x)<\infty\big\}\ \mbox{ and }\ \mbox{epi}\,\varphi:=\big\{(x,\mu)\in\mathbb{R}^n\times\mathbb{R}\big|\;\mu\ge\varphi(x)\big\}.$$

Given $\bar{x}\in\mbox{dom}\,\varphi$, the (basic, limiting) subdifferential and the singular subdifferential of $\varphi$ at $\bar{x}$ are defined via the normal cone (7.2.1) to the epigraph of $\varphi$ at $(\bar{x},\varphi(\bar{x}))$ by, respectively,

$$\partial\varphi(\bar{x}):=\big\{v\in\mathbb{R}^n\big|\;(v,-1)\in N\big((\bar{x},\varphi(\bar{x}));\mbox{epi}\,\varphi\big)\big\},\tag{7.2.7}$$

$$\partial^\infty\varphi(\bar{x}):=\big\{v\in\mathbb{R}^n\big|\;(v,0)\in N\big((\bar{x},\varphi(\bar{x}));\mbox{epi}\,\varphi\big)\big\}.\tag{7.2.8}$$
If $\varphi$ is continuously differentiable around $\bar{x}$, then $\partial\varphi(\bar{x})$ reduces to the gradient singleton $\{\nabla\varphi(\bar{x})\}$. In general we have the coderivative representations $\partial\varphi(\bar{x})=D^*E_\varphi(\bar{x},\varphi(\bar{x}))(1)$ and $\partial^\infty\varphi(\bar{x})=D^*E_\varphi(\bar{x},\varphi(\bar{x}))(0)$ via the epigraphical multifunction $E_\varphi\colon\mathbb{R}^n\rightrightarrows\mathbb{R}$ with $E_\varphi(x):=\{\mu\in\mathbb{R}|\;\mu\ge\varphi(x)\}$. Combining the latter with the coderivative criterion (7.2.6) shows that a lower semicontinuous function $\varphi$ is locally Lipschitzian around $\bar{x}$ if and only if

$$\partial^\infty\varphi(\bar{x})=\{0\}.\tag{7.2.9}$$
For the indicator function $\delta(\cdot;\Omega)$ of a nonempty set $\Omega\subset\mathbb{R}^n$, which is equal to $0$ on $\Omega$ and to $\infty$ otherwise, both subdifferentials reduce to the normal cone:

$$\partial\delta(\bar{x};\Omega)=\partial^\infty\delta(\bar{x};\Omega)=N(\bar{x};\Omega)\ \mbox{ whenever }\ \bar{x}\in\Omega.\tag{7.2.10}$$
Finally, recall the analytic representations of both subdifferentials that are counterparts of the normal cone representation (7.2.3). Define the regular (Fréchet) subdifferential of $\varphi$ at $x\in\mbox{dom}\,\varphi$ by

$$\hat\partial\varphi(x):=\Big\{v\in\mathbb{R}^n\Big|\;\liminf_{u\to x}\frac{\varphi(u)-\varphi(x)-\langle v,u-x\rangle}{\|u-x\|}\ge 0\Big\}.\tag{7.2.11}$$

Then the subdifferentials (7.2.7) and (7.2.8) can be represented as the outer limits

$$\partial\varphi(\bar{x})=\mathop{\mathrm{Lim}\,\mathrm{sup}}_{x\overset{\varphi}{\to}\bar{x}}\hat\partial\varphi(x)\tag{7.2.12}$$

$$\partial^\infty\varphi(\bar{x})=\mathop{\mathrm{Lim}\,\mathrm{sup}}_{x\overset{\varphi}{\to}\bar{x},\ \lambda\downarrow 0}\lambda\hat\partial\varphi(x),\tag{7.2.13}$$

where $x\overset{\varphi}{\to}\bar{x}$ means that $x\to\bar{x}$ with $\varphi(x)\to\varphi(\bar{x})$. Analogous representations hold in terms of the $\varepsilon$-enlargements $\hat\partial_{\varepsilon_k}\varphi$ of (7.2.11) along sequences $\varepsilon_k\downarrow 0$, which are employed in the limiting procedures below.
7.3 Extremal Principle in Variational Analysis
In this section we recall, following [22], the notion of locally extremal points for systems of finitely many sets and then derive the fundamental extremal principle, which gives us necessary conditions for extremality of closed set systems in $\mathbb{R}^n$.
Definition 7.3.1 Let $\Omega_1,\ldots,\Omega_s$, $s\ge 2$, be nonempty subsets of $\mathbb{R}^n$, and let $\bar{x}$ be a common point of these sets. We say that $\bar{x}$ is a locally extremal point of the set system $\{\Omega_1,\ldots,\Omega_s\}$ if there exist a neighborhood $U$ of $\bar{x}$ and sequences $a_{ik}\in\mathbb{R}^n$, $i=1,\ldots,s$, with $a_{ik}\to 0$ as $k\to\infty$ such that

$$\bigcap_{i=1}^s\big(\Omega_i-a_{ik}\big)\cap U=\emptyset\ \mbox{ whenever }\ k=1,2,\ldots.\tag{7.3.1}$$
△
Geometrically, Definition 7.3.1 says that $\bar{x}$ is a locally extremal point of the set system $\{\Omega_1,\ldots,\Omega_s\}$ if these sets can be pushed apart from each other locally around $\bar{x}$. Observe also that in the case of two sets $\Omega_1,\Omega_2$ containing $\bar{x}$ the above definition can be equivalently reformulated as follows: there is a neighborhood $U$ of $\bar{x}$ such that for any $\varepsilon>0$ there exists a vector $a\in\mathbb{R}^n$ with $\|a\|\le\varepsilon$ and $(\Omega_1-a)\cap\Omega_2\cap U=\emptyset$.
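The two-set reformulation above is easy to verify computationally for simple sets. In the sketch below (the halfplanes and the sampling grid are assumptions of this example), $\Omega_1=\{x_2\le 0\}$ and $\Omega_2=\{x_2\ge 0\}$ in $\mathbb{R}^2$ share the boundary point $\bar{x}=(0,0)$, and the small shift $a=(0,\varepsilon/2)$ empties the local intersection, so $\bar{x}$ is a locally extremal point of $\{\Omega_1,\Omega_2\}$.

```python
import numpy as np

def in_omega1(p):  # Omega_1 = {x2 <= 0}
    return p[1] <= 0.0

def in_omega2(p):  # Omega_2 = {x2 >= 0}
    return p[1] >= 0.0

def shifted_intersection_empty(a, radius=1.0, n=101):
    """Grid check of (Omega_1 - a) ∩ Omega_2 ∩ U = empty, U a box around 0."""
    grid = np.linspace(-radius, radius, n)
    for x1 in grid:
        for x2 in grid:
            p = np.array([x1, x2])
            # p in Omega_1 - a  <=>  p + a in Omega_1
            if in_omega1(p + a) and in_omega2(p):
                return False
    return True

eps = 1e-2
a = np.array([0.0, eps / 2])               # push Omega_1 strictly below the axis
print(shifted_intersection_empty(a))       # True: (0,0) is locally extremal
print(shifted_intersection_empty(np.zeros(2)))  # False: without a shift, x2 = 0 survives
```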
In particular, every boundary point $\bar{x}$ of a closed set $\Omega$ is a locally extremal point of the pair $\{\Omega,\{\bar{x}\}\}$. Furthermore, the notion of locally extremal points encompasses standard concepts of optimality. Consider, e.g., the constrained minimization problem

$$\mbox{minimize }\ \varphi(x)\ \mbox{ subject to }\ x\in\Omega\subset\mathbb{R}^n,$$

and let $\bar{x}$ be a local solution to it. Then it is easy to check that $(\bar{x},\varphi(\bar{x}))$ is a locally extremal point of the set system $\{\Omega_1,\Omega_2\}$ in $\mathbb{R}^{n+1}$ with $\Omega_1:=\mbox{epi}\,\varphi$ and $\Omega_2:=\Omega\times\{\varphi(\bar{x})\}$.
Now we are ready to formulate and prove the basic extremal principle of variational analysis for systems of finitely many closed sets in $\mathbb{R}^n$ by using the normal cone construction (7.2.1).
Theorem 7.3.2 (Extremal Principle) Let $\bar{x}$ be a locally extremal point of the system $\{\Omega_1,\ldots,\Omega_s\}$ of closed sets $\Omega_i\subset\mathbb{R}^n$. Then there exist basic normals $v_i\in N(\bar{x};\Omega_i)$ for $i=1,\ldots,s$, not all zero, satisfying the relationship

$$v_1+\ldots+v_s=0.\tag{7.3.2}$$
△
Proof Without loss of generality, suppose that $U=\mathbb{R}^n$ in Definition 7.3.1 and, taking the sequences $a_{ik}\to 0$ therein, consider for each $k=1,2,\ldots$ the unconstrained problem

$$\mbox{minimize }\;\varphi_k(x):=\Big[\sum_{i=1}^s d^2(x+a_{ik};\Omega_i)\Big]^{1/2}+\|x-\bar{x}\|^2,\quad x\in\mathbb{R}^n,\tag{7.3.3}$$

where $d(\cdot;\Omega)$ stands for the Euclidean distance function. Since $\varphi_k$ is continuous with bounded level sets, each of these problems admits an optimal solution $x_k$ by the classical Weierstrass theorem. The extremality condition (7.3.1) ensures that

$$\gamma_k:=\Big[\sum_{i=1}^s d^2(x_k+a_{ik};\Omega_i)\Big]^{1/2}>0.\tag{7.3.4}$$

The optimality of $x_k$ in (7.3.3) yields

$$0<\gamma_k+\|x_k-\bar{x}\|^2=\varphi_k(x_k)\le\varphi_k(\bar{x})=\Big[\sum_{i=1}^s\|a_{ik}\|^2\Big]^{1/2}\downarrow 0,$$

which implies that $x_k\to\bar{x}$ as $k\to\infty$. Picking now projections $w_{ik}\in\Pi(x_k+a_{ik};\Omega_i)$ for each $i$ and $k$, observe that $x_k$ is also an optimal solution to the problem

$$\mbox{minimize }\;\psi_k(x):=\Big[\sum_{i=1}^s\|x+a_{ik}-w_{ik}\|^2\Big]^{1/2}+\|x-\bar{x}\|^2,\quad x\in\mathbb{R}^n,\tag{7.3.5}$$

where $\psi_k$ is continuously differentiable around $x_k$ due to (7.3.4). Applying the classical Fermat rule to $\psi_k$ at $x_k$ gives us

$$\nabla\psi_k(x_k)=\sum_{i=1}^s v_{ik}+2(x_k-\bar{x})=0\ \mbox{ with }\ v_{ik}:=\frac{x_k+a_{ik}-w_{ik}}{\gamma_k},\tag{7.3.6}$$

where the vectors $v_{ik}$ satisfy the nontriviality condition

$$\|v_{1k}\|^2+\ldots+\|v_{sk}\|^2=1\ \mbox{ for all }\ k=1,2,\ldots.\tag{7.3.7}$$

By the compactness of the unit sphere in $\mathbb{R}^n$, we find vectors $v_i$, not all zero due to (7.3.7), such that $v_{ik}\to v_i$ along a subsequence for each $i=1,\ldots,s$. Then the normal cone construction (7.2.1) ensures that $v_i\in N(\bar{x};\Omega_i)$ for all $i$, and the passage to the limit in (7.3.6) as $k\to\infty$ justifies (7.3.2). □
7.4 Fundamental Calculus Rules
Employing the extremal principle, we derive here two fundamental rules of generalized differential calculus, which are broadly used in this chapter and from which many other calculus rules follow; see [29, 30]. The first result is the intersection rule for basic normals (7.2.1).
Theorem 7.4.1 (Normal Cone Intersection Rule) Let $\Omega_1,\ldots,\Omega_s\subset\mathbb{R}^n$, $s\ge 2$, be closed around their common point $\bar{x}$, and let the qualification condition

$$\big[x_i\in N(\bar{x};\Omega_i),\;x_1+\ldots+x_s=0\big]\Longrightarrow x_i=0\;\mathit{\mbox{ for all }}\;i=1,\ldots,s\tag{7.4.1}$$

be satisfied. Then we have the intersection rule

$$N\Big(\bar{x};\bigcap_{i=1}^s\Omega_i\Big)\subset N(\bar{x};\Omega_1)+\ldots+N(\bar{x};\Omega_s).\tag{7.4.2}$$
△
Proof It suffices to examine the case of $s=2$ sets; the general case then follows by induction, as discussed at the end of the proof. Pick any $v\in N(\bar{x};\Omega_1\cap\Omega_2)$ and find, by the representation of basic normals as limits of regular ones, sequences $x_k\to\bar{x}$ with $x_k\in\Omega_1\cap\Omega_2$, $\varepsilon_k\downarrow 0$, and $v_k\to v$ with $v_k\in\hat{N}(x_k;\Omega_1\cap\Omega_2)$ for all $k$. Define the closed subsets of $\mathbb{R}^{n+1}$

$$\Theta_1:=\Omega_1\times\mathbb{R}_+\ \mbox{ and }\ \Theta_{2k}:=\big\{(x,\alpha)\big|\;x\in\Omega_2,\;\langle v_k,x-x_k\rangle-\varepsilon_k\|x-x_k\|\ge\alpha\big\}\ \mbox{ for }\ k=1,2,\ldots.\tag{7.4.3}$$

It follows from the construction of $\Theta_{2k}$ and the choice of $v_k$ that for each $k$ the pair $(x_k,0)\in\Theta_1\cap\Theta_{2k}$ is a locally extremal point of the system $\{\Theta_1,\Theta_{2k}\}$: there exist a neighborhood $U$ of $x_k$ and arbitrarily small numbers $\gamma>0$ such that

$$\Theta_1\cap\big(\Theta_{2k}-(0,\gamma)\big)\cap(U\times\mathbb{R})=\emptyset.$$

Applying the extremal principle of Theorem 7.3.2 to this system in $\mathbb{R}^{n+1}$ gives us pairs $(u_k,\lambda_k)$, which can be normalized by $\|(u_k,\lambda_k)\|=1$, such that

$$(u_k,\lambda_k)\in N\big((x_k,0);\Theta_1\big)\ \mbox{ and }\ (-u_k,-\lambda_k)\in N\big((x_k,0);\Theta_{2k}\big).\tag{7.4.4}$$

By the compactness of the unit sphere, suppose without loss of generality that $(u_k,\lambda_k)\to(u,\lambda)\neq 0$ as $k\to\infty$ for some $(u,\lambda)\in\mathbb{R}^{n+1}$. The structure of $\Theta_1$ in (7.4.3) ensures that $(u,\lambda)\in N((\bar{x},0);\Theta_1)$, and hence $u\in N(\bar{x};\Omega_1)$. Our next step is to verify, by passing to the limit in the second inclusion of (7.4.4), that

$$(-\lambda v-u,\lambda)\in N\big((\bar{x},0);\Omega_2\times\mathbb{R}_+\big).\tag{7.4.5}$$

To proceed, fix $k\in\mathbb{N}$ and deduce from the construction in (7.4.3) the representation

$$\Theta_{2k}=\big\{\big(x,\langle v_k,x-x_k\rangle-\varepsilon_k\|x-x_k\|-\alpha\big)\big|\;x\in\Omega_2,\;\alpha\in\mathbb{R}_+\big\}.$$

Employing for each fixed $k\in\mathbb{N}$ the representation of the basic normals in the second inclusion of (7.4.4) via regular ones, we find sequences $(x_{km},\alpha_{km})\to(x_k,0)$ and $(u_{km},\lambda_{km})\to(u_k,\lambda_k)$ as $m\to\infty$ with

$$(-u_{km},-\lambda_{km})\in\hat{N}\big((x_{km},\langle v_k,x_{km}-x_k\rangle-\varepsilon_k\|x_{km}-x_k\|-\alpha_{km});\Theta_{2k}\big)$$

for all $m\in\mathbb{N}$. Using the definition of regular normals together with the above representation of $\Theta_{2k}$ and elementary estimates, which hold for all $k\in\mathbb{N}$ and $m\in\mathbb{N}$, we arrive at

$$(-u_{km}-\lambda_{km}v_k,\lambda_{km})\in\hat{N}_{\varepsilon_{km}}\big((x_{km},\alpha_{km});\Omega_2\times\mathbb{R}_+\big)\ \mbox{ with }\ \varepsilon_{km}:=\frac{|\lambda_{km}|\varepsilon_k}{1+\|v_k\|+\varepsilon_k}.$$

Passing there to the limit first as $m\to\infty$ and then as $k\to\infty$ justifies the claimed inclusion (7.4.5).
Observe that having λ = 0 in (7.4.4) contradicts the qualification condition (7.4.1) for s = 2. Thus λ < 0, which readily implies that $v\in N(\bar{x};\Omega_1)+N(\bar{x};\Omega_2)$.
To proceed finally by induction for s > 2, we observe that the induction assumption for (7.4.2) in the previous step yields the validity of the qualification condition (7.4.1) needed for the current step of induction. This completes the proof of the theorem. □
Next we derive the subdifferential sum rules concerning both basic subdifferential (7.2.7) and singular subdifferential (7.2.8). For our subsequent applications to bilevel optimization, it is sufficient to consider the case where all but one of the functions involved in summation are locally Lipschitzian around the reference point. This case allows us to obtain the subdifferential sum rules without any qualification conditions.
Theorem 7.4.2 (Subdifferential Sum Rules) Let $\varphi_1\colon\mathbb{R}^n\to\overline{\mathbb{R}}$ be lower semicontinuous around $\bar{x}\in\mbox{dom}\,\varphi_1$, and let $\varphi_i\colon\mathbb{R}^n\to\overline{\mathbb{R}}$, $i=2,\ldots,s$, be locally Lipschitzian around $\bar{x}$. Then we have the sum rules

$$\partial\Big(\sum_{i=1}^s\varphi_i\Big)(\bar{x})\subset\sum_{i=1}^s\partial\varphi_i(\bar{x}),\tag{7.4.6}$$

$$\partial^\infty\Big(\sum_{i=1}^s\varphi_i\Big)(\bar{x})=\partial^\infty\varphi_1(\bar{x}).\tag{7.4.7}$$
△
We consider the case where only two functions are under summation since the general case of finitely many functions obviously follows by induction. Let us start with the basic subdifferential sum rule (7.4.6) for s = 2 therein.
Pick any $v\in\partial(\varphi_1+\varphi_2)(\bar{x})$ and get from definition (7.2.7) that

$$(v,-1)\in N\big((\bar{x},(\varphi_1+\varphi_2)(\bar{x}));\mbox{epi}\,(\varphi_1+\varphi_2)\big).$$

Consider the closed sets

$$\Omega_i:=\big\{(x,\mu_1,\mu_2)\in\mathbb{R}^n\times\mathbb{R}\times\mathbb{R}\big|\;\mu_i\ge\varphi_i(x)\big\}\ \mbox{ for }\ i=1,2.$$

Denoting $\bar\mu_i:=\varphi_i(\bar{x})$ for $i=1,2$, we have $(\bar{x},\bar\mu_1,\bar\mu_2)\in\Omega_1\cap\Omega_2$ and can directly check that $(v,-1,-1)\in N((\bar{x},\bar\mu_1,\bar\mu_2);\Omega_1\cap\Omega_2)$. The qualification condition (7.4.1) holds for the sets $\Omega_1,\Omega_2$ due to the local Lipschitz continuity of $\varphi_2$ and the subdifferential characterization (7.2.9). Applying now the intersection rule of Theorem 7.4.1 gives us pairs $(v_i,-\lambda_i)\in N((\bar{x},\bar\mu_i);\mbox{epi}\,\varphi_i)$, $i=1,2$, satisfying

$$(v,-1,-1)=(v_1,-\lambda_1,0)+(v_2,0,-\lambda_2),$$

which yields $\lambda_1=\lambda_2=1$ and $v=v_1+v_2$ with $v_i\in\partial\varphi_i(\bar{x})$ for $i=1,2$ by definition (7.2.7). This justifies the basic sum rule (7.4.6).
Next we verify the singular subdifferential sum rule (7.4.7) for $s=2$. Pick any $v\in\partial^\infty(\varphi_1+\varphi_2)(\bar{x})$ and find, by the limiting representation of singular subgradients via $\varepsilon$-normals to the epigraph, sequences $(x_k,\mu_k)\to(\bar{x},(\varphi_1+\varphi_2)(\bar{x}))$ with $(x_k,\mu_k)\in\mbox{epi}\,(\varphi_1+\varphi_2)$, $(v_k,\nu_k)\to(v,0)$, $\gamma_k\downarrow 0$, and $\eta_k\downarrow 0$ such that

$$\langle v_k,x-x_k\rangle+\nu_k(\mu-\mu_k)\le\gamma_k(\|x-x_k\|+|\mu-\mu_k|)$$

for all $(x,\mu)\in\mbox{epi}\,(\varphi_1+\varphi_2)$ with $x\in x_k+\eta_k\mathbb{B}$ and $|\mu-\mu_k|\le\eta_k$. Let $\ell\ge 0$ be a Lipschitz constant of $\varphi_2$ around $\bar{x}$, and define $\tilde{\eta}_k:=\eta_k/2(\ell+1)$ together with $\widetilde{\mu}_k:=\mu_k-\varphi_2(x_k)$, so that $(x_k,\tilde{\mu}_k)\in\mbox{epi}\,\varphi_1$. For any $(x,\mu)\in\mbox{epi}\,\varphi_1$ with $x\in x_k+\widetilde{\eta}_k\mathbb{B}$ and $|\mu-\tilde{\mu}_k|\le\widetilde{\eta}_k$ we get

$$(x,\mu+\varphi_2(x))\in\mbox{epi}\,(\varphi_1+\varphi_2),\quad |(\mu+\varphi_2(x))-\mu_k|\le\eta_k,$$

and hence substituting $(x,\mu+\varphi_2(x))$ into the above inequality and using the Lipschitz continuity of $\varphi_2$ brings us to

$$\langle v_k,x-x_k\rangle+\nu_k(\mu-\tilde{\mu}_k)\le\varepsilon_k(\|x-x_k\|+|\mu-\tilde{\mu}_k|)\ \mbox{ with }\ \varepsilon_k:=\gamma_k(1+\ell)+|\nu_k|\ell$$

valid whenever $x\in x_k+\tilde{\eta}_k\mathbb{B}$ and $|\mu-\tilde{\mu}_k|\le\tilde{\eta}_k$. This means that $(v_k,\nu_k)\in\hat{N}_{\varepsilon_k}((x_k,\tilde{\mu}_k);\mbox{epi}\,\varphi_1)$ with $\varepsilon_k\downarrow 0$. Since $(x_k,\tilde{\mu}_k)\to(\bar{x},\varphi_1(\bar{x}))$, passing to the limit as $k\to\infty$ yields $(v,0)\in N((\bar{x},\varphi_1(\bar{x}));\mbox{epi}\,\varphi_1)$, i.e., $v\in\partial^\infty\varphi_1(\bar{x})$. The opposite inclusion $\partial^\infty\varphi_1(\bar{x})\subset\partial^\infty(\varphi_2+\varphi_1)(\bar{x})$ follows by applying the obtained one to the sum $\varphi_1=(\varphi_1+\varphi_2)+(-\varphi_2)$, where $-\varphi_2$ is locally Lipschitzian around $\bar{x}$. This completes the proof of the theorem. □
7.5 Subdifferentials and Lipschitz Continuity of Value Functions
Consider the (optimal) value function $\vartheta\colon\mathbb{R}^n\to\overline{\mathbb{R}}$ defined by

$$\vartheta(x):=\inf\big\{\varphi(x,y)\big|\;y\in F(x)\big\},\quad x\in\mathbb{R}^n,\tag{7.5.1}$$

which is generated by the cost function $\varphi\colon\mathbb{R}^n\times\mathbb{R}^m\to\overline{\mathbb{R}}$ and the constraint mapping $F\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ of the parametric optimization problem

$$\mbox{minimize }\;\varphi(x,y)\;\mbox{ subject to }\;y\in F(x)$$

with the decision variable $y$ and the perturbation parameter $x$.
As seen below, functions of type (7.5.1) play a crucial role in applications to bilevel optimization while revealing intrinsic nonsmoothness of the latter class of optimization problems. This section presents evaluations of both basic and singular subdifferentials of (7.5.1), which are equally important for the aforementioned applications. Singular subdifferential evaluations are used for establishing the Lipschitz continuity of 𝜗(x) with respect to the parameter x that allows us to reduce the bilevel model under consideration to a single-level problem of Lipschitzian programming. On the other hand, basic subdifferential evaluations open the gate to derive in this way necessary optimality conditions for Lipschitzian bilevel programs.
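The intrinsic nonsmoothness of optimal value functions is visible already in the simplest examples. The sketch below is illustrative only (the data $\varphi(x,y)=xy$ and the constant mapping $F(x)\equiv[-1,1]$ are ad hoc choices): it computes $\vartheta(x)=\inf\{xy\,|\,y\in[-1,1]\}=-|x|$, which is Lipschitz continuous but nondifferentiable at $x=0$ even though $\varphi$ is smooth and $F$ is constant.

```python
import numpy as np

def value_function(x, ys=np.linspace(-1.0, 1.0, 201)):
    """theta(x) = inf{ phi(x, y) : y in F(x) } with phi(x, y) = x*y, F(x) = [-1, 1]."""
    return np.min(x * ys)

xs = np.linspace(-2.0, 2.0, 9)
thetas = np.array([value_function(x) for x in xs])
print(np.column_stack([xs, thetas]))  # theta(x) = -|x| on the grid

# One-sided difference quotients at 0 disagree: a downward kink at the origin.
h = 1e-6
right = (value_function(h) - value_function(0.0)) / h    # -> -1
left = (value_function(0.0) - value_function(-h)) / h    # ->  1
print(left, right)
```

The minimizing set here is $M(x)=\{-\mathrm{sign}(x)\}$ for $x\neq 0$ and $M(0)=[-1,1]$, so the kink of $\vartheta$ at the origin reflects the jump of the argminimum mapping.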
Associate with (7.5.1) the argminimum mapping $M\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ defined by

$$M(x):=\big\{y\in F(x)\big|\;\varphi(x,y)=\vartheta(x)\big\},\quad x\in\mathbb{R}^n.\tag{7.5.2}$$

Recall that, given $(\bar{x},\bar{y})\in\mbox{gph}\,M$, the mapping $M$ is inner semicontinuous at $(\bar{x},\bar{y})$ if for every sequence $x_k\to\bar{x}$ with $x_k\in\mbox{dom}\,M$ there exist optimal solutions $y_k\in M(x_k)$ converging to $\bar{y}$ as $k\to\infty$; this property postulates a certain stable behavior of the optimal solutions around $(\bar{x},\bar{y})$.
The following theorem gives us efficient upper estimates of both basic and singular subdifferentials of the optimal value function 𝜗 needed for subsequent applications. We confine ourselves to the case of local Lipschitz continuity of the cost function φ in (7.5.1) that is sufficient to derive necessary optimality conditions for bilevel programs in Sects. 7.8 and 7.9.
Let the argminimum mapping $M$ in (7.5.2) be inner semicontinuous at $(\bar{x},\bar{y})\in\mathit{\mbox{gph}}\,M$, and let the cost function $\varphi$ be locally Lipschitzian around this point. Then we have the basic subdifferential upper estimate

$$\partial\vartheta(\bar{x})\subset\bigcup_{(v,w)\in\partial\varphi(\bar{x},\bar{y})}\Big[v+D^*F(\bar{x},\bar{y})(w)\Big]\tag{7.5.3}$$

together with the singular subdifferential upper estimate

$$\partial^\infty\vartheta(\bar{x})\subset D^*F(\bar{x},\bar{y})(0).\tag{7.5.4}$$
△
Proof To verify first the basic subdifferential estimate (7.5.3), define the auxiliary function $\psi\colon\mathbb{R}^n\times\mathbb{R}^m\to\overline{\mathbb{R}}$ by

$$\psi(x,y):=\varphi(x,y)+\delta\big((x,y);\mbox{gph}\,F\big)\ \mbox{ for all }\ (x,y)\in\mathbb{R}^n\times\mathbb{R}^m\tag{7.5.5}$$

and show that

$$\partial\vartheta(\bar{x})\subset\big\{v\in\mathbb{R}^n\big|\;(v,0)\in\partial\psi(\bar{x},\bar{y})\big\}.\tag{7.5.6}$$
Pick any $v\in\partial\vartheta(\bar{x})$ and find, by the limiting representation of basic subgradients, sequences $x_k\to\bar{x}$ with $\vartheta(x_k)\to\vartheta(\bar{x})$, $\varepsilon_k\downarrow 0$, $\eta_k\downarrow 0$, and $v_k\to v$ with $v_k\in\hat{\partial}\vartheta(x_k)$ such that

$$\langle v_k,x-x_k\rangle\le\vartheta(x)-\vartheta(x_k)+\varepsilon_k\|x-x_k\|\;\mbox{ whenever }\;x\in x_k+\eta_k\mathbb{B},\quad k=1,2,\ldots.$$

The assumed inner semicontinuity of the argminimum mapping $M$ at $(\bar{x},\bar{y})$ allows us to select optimal solutions $y_k\in M(x_k)$ with $y_k\to\bar{y}$. Then the constructions of $\vartheta$ and $\psi$ ensure that

$$\langle(v_k,0),(x,y)-(x_k,y_k)\rangle\le\psi(x,y)-\psi(x_k,y_k)+\varepsilon_k\big(\|x-x_k\|+\|y-y_k\|\big)$$

for all $(x,y)\in(x_k,y_k)+\eta_k\mathbb{B}$, which tells us that $(v_k,0)\in\hat{\partial}_{\varepsilon_k}\psi(x_k,y_k)$ for each $k$. Since $(x_k,y_k)\to(\bar{x},\bar{y})$ and $\psi(x_k,y_k)=\vartheta(x_k)\to\vartheta(\bar{x})=\psi(\bar{x},\bar{y})$, the passage to the limit as $k\to\infty$ yields $(v,0)\in\partial\psi(\bar{x},\bar{y})$, and hence (7.5.6) holds. Applying now to $\psi$ in (7.5.6) the subdifferential sum rule of Theorem 7.4.2 together with (7.2.10) and the coderivative definition (7.2.4), we arrive at the claimed estimate (7.5.3).
Similarly we verify the singular subdifferential estimate

$$\partial^\infty\vartheta(\bar{x})\subset\big\{v\in\mathbb{R}^n\big|\;(v,0)\in\partial^\infty\psi(\bar{x},\bar{y})\big\}.$$

Picking any $v\in\partial^\infty\vartheta(\bar{x})$, find sequences $(x_k,\mu_k)\in\mbox{epi}\,\vartheta$ converging to $(\bar{x},\vartheta(\bar{x}))$ together with $(v_k,\nu_k)\to(v,0)$, $\varepsilon_k\downarrow 0$, and $\eta_k\downarrow 0$ such that

$$\langle v_k,x-x_k\rangle+\nu_k(\mu-\mu_k)\le\varepsilon_k\big(\|x-x_k\|+|\mu-\mu_k|\big)$$

for all $(x,\mu)\in\mbox{epi}\,\vartheta$ with $x\in x_k+\eta_k\mathbb{B}$ and $|\mu-\mu_k|\le\eta_k$. Choosing again $y_k\in M(x_k)$ with $y_k\to\bar{y}$ by the inner semicontinuity of M and observing that $(x,y,\mu)\in\mbox{epi}\,\psi$ yields $(x,\mu)\in\mbox{epi}\,\vartheta$, we arrive at

$$(v_k,0,\nu_k)\in\hat{N}_{\varepsilon_k}\big((x_k,y_k,\mu_k);\mbox{epi}\,\psi\big)\;\mbox{ for all }\;k=1,2,\ldots,$$

where $\hat N_{\varepsilon_k}$ stands for the collection of $\varepsilon_k$-regular normals. Passing there to the limit as $k\to\infty$ gives us $(v,0)\in\partial^\infty\psi(\bar{x},\bar{y})$, and the claimed inclusion (7.5.4) follows from the sum rule for singular subgradients combined with the coderivative definition. $\square$
As mentioned above, in our applications to bilevel programming we need to have verifiable conditions that ensure the local Lipschitz continuity of the optimal value function (7.5.1). This is provided by the following corollary, which is a direct consequence of Theorem 7.5.1 and the coderivative criterion (7.2.6) for the Lipschitz-like property.
Corollary 7.5.2 In addition to the assumptions of Theorem 7.5.1, suppose that the constraint mapping F is Lipschitz-like around the point $(\bar{x},\bar{y})\in\mbox{gph}\,M$ taken from (7.5.2). Then the optimal value function (7.5.1) is locally Lipschitzian around $\bar{x}$. △
Proof We know from the coderivative criterion (7.2.6) that F is Lipschitz-like around $(\bar{x},\bar{y})$ if and only if $D^*F(\bar{x},\bar{y})(0)=\{0\}$. Applying this to (7.5.4) tells us that the assumed Lipschitz-like property of the constraint mapping F in (7.5.1) ensures that $\partial^\infty\vartheta(\bar{x})=\{0\}$. Furthermore, it easily follows from the assumptions made that the optimal value function 𝜗 is l.s.c. around $\bar{x}$. Thus 𝜗 is locally Lipschitzian around $\bar{x}$ by the characterization of this property given in (7.2.9). □
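To see the corollary in action on the simplest possible data, consider a toy parametric program of our own making (the cost and constraint below are illustrative assumptions, not taken from the text): the lower-level cost φ(x, y) = xy over the fixed constraint set F(x) ≡ [−1, 1]. Its optimal value function is 𝜗(x) = −|x|, which is globally Lipschitzian with modulus 1 yet nonsmooth at x̄ = 0 — exactly the behavior the results above address. A brief numerical sketch:

```python
def vartheta(x, n_grid=2001):
    # Optimal value function of a toy lower-level problem (our illustration):
    # minimize phi(x, y) = x * y over the fixed constraint set F(x) = [-1, 1],
    # approximated by minimizing over a uniform grid of y-values.
    ys = [-1.0 + 2.0 * k / (n_grid - 1) for k in range(n_grid)]
    return min(x * y for y in ys)

# vartheta(x) = -|x|: locally Lipschitzian (modulus 1) but nonsmooth at x = 0.
xs = [-2.0 + 4.0 * k / 400 for k in range(401)]
vals = [vartheta(x) for x in xs]
assert all(abs(v + abs(x)) < 1e-9 for v, x in zip(vals, xs))

# Numerical Lipschitz check on the grid: |vartheta(x1) - vartheta(x0)| <= |x1 - x0|.
assert all(abs(v1 - v0) <= abs(x1 - x0) + 1e-9
           for (x0, v0), (x1, v1) in zip(zip(xs, vals), zip(xs[1:], vals[1:])))
```

At the nonsmooth point x̄ = 0 the basic subdifferential here is ∂𝜗(0) = {−1, 1}, while its convex hull [−1, 1] is Clarke's generalized gradient; this distinction between the two constructions reappears in Sect. 7.8.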
7.6 Problems of Lipschitzian Programming
Before deriving necessary optimality conditions in Lipschitzian problems of bilevel optimization in the subsequent sections, we devote this section to problems of single-level Lipschitzian programming. The results obtained here are based on the extremal principle and the subdifferential characterization of locally Lipschitzian functions, and they are instrumental for the applications to bilevel programs given in Sect. 7.8.
Consider the nonlinear program with inequality constraints described by

$$\begin{array}{ll} \mbox{minimize }\;\varphi_0(x)\;\mbox{ subject to}\\ \varphi_i(x)\le 0\;\mbox{ for all }\;i=1,\ldots,m, \end{array} \tag{7.6.1}$$

where the functions $\varphi_i\colon\mathbb{R}^n\to\mathbb{R}$, $i=0,\ldots,m$, are locally Lipschitzian around the reference point.

Theorem 7.6.1 Let $\bar{x}$ be a local optimal solution to (7.6.1), where the functions $\varphi_i$ are locally Lipschitzian around $\bar{x}$ for all $i=0,\ldots,m$. Then there exist multipliers $\lambda_0,\ldots,\lambda_m$ satisfying the sign conditions

$$\lambda_i\ge 0\;\mbox{ for all }\;i=0,\ldots,m, \tag{7.6.2}$$

the nontriviality condition

$$\lambda_0+\ldots+\lambda_m\neq 0, \tag{7.6.3}$$

the complementary slackness conditions

$$\lambda_i\varphi_i(\bar{x})=0\;\mbox{ whenever }\;i=1,\ldots,m, \tag{7.6.4}$$

and the generalized Lagrangian inclusion

$$0\in\displaystyle\sum_{i=0}^{m}\lambda_i\partial\varphi_i(\bar{x}). \tag{7.6.5}$$

Assume in addition that

$$\Big[\displaystyle\sum_{i\in I(\bar{x})}\lambda_i v_i=0,\;\lambda_i\ge 0\Big]\Longrightarrow\Big[\lambda_i=0\;\mbox{ for all }\;i\in I(\bar{x})\Big] \tag{7.6.6}$$

whenever $v_i\in\partial\varphi_i(\bar{x})$, where $I(\bar{x}):=\big\{i\in\{1,\ldots,m\}\big|\;\varphi_i(\bar{x})=0\big\}$ is the collection of active constraint indices at $\bar{x}$. Then the necessary optimality conditions formulated above hold with λ0 = 1. △
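A one-dimensional illustration of our own (not from the text) shows both conditions at work: minimize φ0(x) = x subject to φ1(x) = |x| − x ≤ 0, i.e., over the feasible set {x ≥ 0}, with optimal solution x̄ = 0.

```latex
% Toy illustration (ours): \varphi_0(x) = x, \varphi_1(x) = |x| - x on \mathbb{R}.
\[
\partial\varphi_0(0)=\{1\},\qquad
\partial\varphi_1(0)=\partial|\cdot|(0)-1=[-2,0],\qquad I(0)=\{1\}.
\]
% The generalized Lagrangian inclusion (7.6.5) holds in the normal form:
\[
0=1\cdot 1+\tfrac12\,(-2)\in\lambda_0\,\partial\varphi_0(0)+\lambda_1\,\partial\varphi_1(0)
\quad\text{with }\lambda_0=1,\;\lambda_1=\tfrac12 .
\]
% Yet the generalized MFCQ (7.6.6) fails here: v = 0 \in \partial\varphi_1(0)
% admits \lambda_1 > 0 with \lambda_1 v = 0.
```

Thus (7.6.6) is a sufficient but not necessary condition for obtaining the normal form λ0 = 1 of the optimality conditions.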
Proof Suppose without loss of generality that $\varphi_0(\bar{x})=0$ and consider the point $(\bar{x},0)\in\mathbb{R}^n\times\mathbb{R}^{m+1}$ together with the following sets in the space $\mathbb{R}^n\times\mathbb{R}^{m+1}$:

$$\Omega_i:=\big\{(x,\mu_0,\ldots,\mu_m)\in\mathbb{R}^n\times\mathbb{R}^{m+1}\big|\;(x,\mu_i)\in\mbox{epi}\,\varphi_i\big\},\quad i=0,\ldots,m,$$

$$\Omega_{m+1}:=\mathbb{R}^n\times\{0\}.$$

It is easy to see that $(\bar{x},0)\in\Omega_0\cap\ldots\cap\Omega_{m+1}$ and that $(\bar{x},0)$ is a locally extremal point of this set system. Indeed, the local optimality of $\bar{x}$ in (7.6.1) allows us to find a neighborhood U of $\bar{x}$ such that

$$\big(\Omega_0+a\big)\displaystyle\bigcap_{i=1}^{m+1}\Omega_i\cap\big(U\times\mathbb{R}^{m+1}\big)=\emptyset$$

for $a:=(0,\nu,0,\ldots,0)\in\mathbb{R}^n\times\mathbb{R}^{m+1}$ with $\nu>0$, where the first component of a is the origin $0\in\mathbb{R}^n$. Having verified the local extremality of $(\bar{x},0)$, we apply the extremal principle to the sets $\Omega_0,\ldots,\Omega_{m+1}$ at $(\bar{x},0)$ and find elements $v_0,\ldots,v_m\in\mathbb{R}^n$, $\lambda_0,\ldots,\lambda_m\in\mathbb{R}$, and $\xi=(\xi_1,\ldots,\xi_{m+1})\in\mathbb{R}^{m+1}$ satisfying the inclusions

$$(v_i,-\lambda_i)\in N\big((\bar{x},0);\mbox{epi}\,\varphi_i\big)\;\mbox{ for all }\;i=0,\ldots,m$$

together with the nontriviality condition

$$\displaystyle\sum_{i=0}^m\|v_i\|+\sum_{i=0}^m|\lambda_i|+\|\xi\|\neq 0$$

and the relationships

$$\displaystyle\sum_{i=0}^m v_i=0\;\mbox{ and }\;\lambda_i=\xi_{i+1}\;\mbox{ as }\;i=0,\ldots,m.$$

The structure of the epigraphical normals above ensures that $\lambda_i\ge 0$ and

$$v_i\in\lambda_i\partial\varphi_i(\bar{x})\;\mbox{ for all }\;i=0,\ldots,m,$$

which justifies (7.6.2) and (7.6.5). The nontriviality (7.6.3) follows as well, since having $\lambda_i=0$ for all i would force $v_i=0$ by the local Lipschitz continuity of $\varphi_i$ and then $\xi=0$, contradicting the displayed nontriviality condition.
To check further the complementary slackness conditions in (7.6.4), fix $i\in\{1,\ldots,m\}$ and suppose that $\varphi_i(\bar{x})<0$. Then the continuity of $\varphi_i$ at $\bar{x}$ ensures that the pair $(\bar{x},0)$ is an interior point of the epigraphical set $\mbox{epi}\,\varphi_i$. It readily implies that $N((\bar{x},0);\mbox{epi}\,\varphi_i)=\{0\}$, and hence $\lambda_i=0$. This yields $\lambda_i\varphi_i(\bar{x})=0$, which justifies (7.6.4).
To complete the proof of the theorem, it remains to check that the validity of (7.6.6) ensures that λ 0 = 1 in (7.6.5). We easily arrive at this assertion while arguing by contradiction. □
If the constraint functions $\varphi_i$, $i=1,\ldots,m$, are smooth around the reference point $\bar{x}$, condition (7.6.6) clearly reduces to the classical Mangasarian–Fromovitz constraint qualification. This suggests labeling condition (7.6.6) the generalized Mangasarian–Fromovitz constraint qualification, or the generalized MFCQ.
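In the smooth case just mentioned, the classical MFCQ amounts to the existence of a direction d with ∇φ_i(x̄)ᵀd < 0 for every active index i, which is equivalent (by Gordan's theorem) to positive linear independence of the active gradients. The following rough numerical sketch of ours probes this by random direction sampling; it can only certify success, never failure, and it does not treat the subgradient quantification of the full nonsmooth condition (7.6.6):

```python
import random

def mfcq_holds(active_gradients, n_samples=20000, seed=0):
    # Classical (smooth) MFCQ at a point: there is a direction d with
    # <g, d> < 0 for every active constraint gradient g.  We probe random
    # directions; a heuristic sketch, not a certified check.
    rng = random.Random(seed)
    dim = len(active_gradients[0])
    for _ in range(n_samples):
        d = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if all(sum(g_k * d_k for g_k, d_k in zip(g, d)) < 0.0
               for g in active_gradients):
            return True
    return False

# g1 = (1, 0), g2 = (0, 1): the direction d = (-1, -1) works, so MFCQ holds.
assert mfcq_holds([(1.0, 0.0), (0.0, 1.0)])
# g1 = (1, 0), g2 = (-1, 0): <g1, d> < 0 and <g2, d> < 0 are incompatible,
# and indeed g1 + g2 = 0 is a vanishing nonnegative combination.
assert not mfcq_holds([(1.0, 0.0), (-1.0, 0.0)])
```

A certified check would instead solve the small linear program behind Gordan's alternative; the sampling version above is kept dependency-free for illustration only.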
7.7 Variational Approach to Bilevel Optimization
This section is devoted to describing some models of bilevel programming and a variational approach to them that involves nondifferentiable optimal value functions in single-level problems of parametric optimization.
Let $y\in\mathbb{R}^m$ be the lower-level decision variable depending on the upper-level (parameter) variable $x\in\mathbb{R}^n$, and consider the parametric lower-level problem

$$\mbox{minimize }\varphi(x,y)\;\mbox{ subject to }\;y\in F(x)\;\mbox{ with fixed }\;x\in\mathbb{R}^n,$$

where $\varphi\colon\mathbb{R}^n\times\mathbb{R}^m\to\mathbb{R}$ is the lower-level cost function and $F\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ is the lower-level constraint mapping. Denote by

$$S(x):=\mbox{argmin}\big\{\varphi(x,y)\big|\;y\in F(x)\big\}$$

the parameterized solution set of the lower-level problem for each $x\in\mathbb{R}^n$. Given the upper-level cost function $\psi\colon\mathbb{R}^n\times\mathbb{R}^m\to\mathbb{R}$ and the lower-level solution map $S\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$, the optimistic bilevel programming problem is to

$$\mbox{minimize }\;\psi(x,y)\;\mbox{ subject to }\;y\in S(x)\;\mbox{ for each }\;x\in\Omega,$$

which can be equivalently written as

$$\mbox{minimize }\;\mu(x)\;\mbox{ subject to }\;x\in\Omega,\;\mbox{ where }\;\mu(x):=\inf\big\{\psi(x,y)\big|\;y\in S(x)\big\}$$

with the upper-level constraint set $\Omega\subset\mathbb{R}^n$. In contrast, the pessimistic version of the bilevel problem is defined by

$$\mbox{minimize }\;\eta(x)\;\mbox{ subject to }\;x\in\Omega,\;\mbox{ where }\;\eta(x):=\sup\big\{\psi(x,y)\big|\;y\in S(x)\big\}.$$
The main attention in this and the subsequent sections is paid to applying the machinery and results of variational analysis and generalized differentiation to problems of bilevel optimization by implementing the value function approach. This approach is based on reducing bilevel programs to single-level problems of mathematical programming by using the nonsmooth optimal value function 𝜗(x) of the lower-level problem defined in (7.5.1). Such a device was initiated by Outrata [35] for a particular class of bilevel optimization problems and was used by him for developing a numerical algorithm to solve bilevel programs. This approach was then substantially developed by Ye and Zhu [44], who employed it to derive necessary optimality conditions for optimistic bilevel programs by using Clarke's generalized gradients of optimal value functions. More advanced necessary optimality conditions for optimistic bilevel programs in terms of the author's generalized differentiation reviewed in Sect. 7.2 were developed in [11, 12, 30, 34, 46, 47]. Optimality and stability conditions for pessimistic bilevel models were derived in [13, 16].
We restrict ourselves in what follows to implementing variational analysis and the aforementioned machinery of generalized differentiation within the value function approach to optimistic bilevel models with Lipschitzian data in finite-dimensional spaces. This allows us to most clearly communicate the basic variational ideas behind this approach, without additional technical complications. The variational results presented in the previous sections make our presentation self-contained and complete.
In what follows we consider lower-level and upper-level constraints given by finitely many inequalities:

$$F(x):=\big\{y\in\mathbb{R}^m\big|\;f_i(x,y)\le 0\;\mbox{ for }\;i=1,\ldots,r\big\},$$

$$\Omega:=\big\{x\in\mathbb{R}^n\big|\;g_j(x)\le 0\;\mbox{ for }\;j=1,\ldots,s\big\}.$$
7.8 Optimality Conditions for Lipschitzian Bilevel Programs
In this setting the optimal value function (7.5.1) of the lower-level problem reads as

$$\vartheta(x)=\inf\big\{\varphi(x,y)\big|\;f_i(x,y)\le 0,\;i=1,\ldots,r\big\},\quad x\in\mathbb{R}^n, \tag{7.8.1}$$

and the value function reformulation of the optimistic bilevel program is to

$$\begin{array}{ll} &\mbox{minimize }\;\psi(x,y)\;\mbox{ subject to }\;g_j(x)\le 0,\;j=1,\ldots,s,\\ &f_i(x,y)\le 0,\;i=1,\ldots,r,\;\mbox{ and }\;\varphi(x,y)\le\vartheta(x). \end{array} \tag{7.8.2}$$

To deal with the troublesome last constraint therein, consider also the family of its perturbations

$$\begin{array}{ll} &\mbox{minimize }\;\psi(x,y)\;\mbox{ subject to }\;g_j(x)\le 0,\;j=1,\ldots,s,\\ &f_i(x,y)\le 0,\;i=1,\ldots,r,\;\mbox{ and }\;\varphi(x,y)-\vartheta(x)+\nu=0\;\mbox{ as }\;\nu\in\mathbb{R}. \end{array} \tag{7.8.3}$$

Recall from [44] that problem (7.8.2) is partially calm at its feasible solution $(\bar{x},\bar{y})$ if there exist a constant κ > 0 and a neighborhood U of the triple $(\bar{x},\bar{y},0)\in\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}$ such that

$$\psi(x,y)-\psi(\bar{x},\bar{y})+\kappa|\nu|\ge 0 \tag{7.8.4}$$

for all triples $(x,y,\nu)\in U$ feasible to (7.8.3).
There are various efficient conditions, which ensure the fulfillment of the partial calmness property for (7.8.2). They include the uniform sharp minimum condition [44], linearity of the lower-level problem with respect to the decision variable [9], the kernel condition [30], etc. On the other hand, partial calmness may fail in rather common situations; see [19, 30] for more results and discussions on partial calmness and related properties.
The main impact of partial calmness on deriving necessary optimality conditions for (7.8.2) is its equivalence to the possibility of transferring the troublesome constraint φ(x, y) ≤ 𝜗(x) into the penalized cost function as in the following proposition.
Proposition 7.8.1 Let $(\bar{x},\bar{y})$ be a partially calm locally optimal solution to problem (7.8.2) such that the upper-level cost function ψ is continuous at $(\bar{x},\bar{y})$. Then $(\bar{x},\bar{y})$ is a locally optimal solution to the penalized problem

$$\begin{array}{ll} \mbox{minimize }\;\psi(x,y)+\kappa\big(\varphi(x,y)-\vartheta(x)\big)\;\mbox{ subject to }\\ g_j(x)\le 0,\;j=1,\ldots,s,\;\mbox{ and }\;f_i(x,y)\le 0,\;i=1,\ldots,r, \end{array} \tag{7.8.5}$$

where the number κ > 0 is taken from (7.8.4). △
Proof Take κ > 0 and a neighborhood U of $(\bar{x},\bar{y},0)$ from the partial calmness of (7.8.2) at $(\bar{x},\bar{y})$, and then choose η, γ > 0 so small that $\tilde U:=[(\bar{x},\bar{y})+\eta\mathbb{B}]\times(-\gamma,\gamma)\subset U$ and, by the continuity of ψ,

$$|\psi(x,y)-\psi(\bar{x},\bar{y})|\le\kappa\gamma\;\mbox{ for all }\;(x,y)-(\bar{x},\bar{y})\in\eta\mathbb{B}.$$

We claim that

$$\psi(x,y)-\psi(\bar{x},\bar{y})+\kappa\big(\varphi(x,y)-\vartheta(x)\big)\ge 0\;\mbox{ if }\;g_j(x)\le 0,\;j=1,\ldots,s,$$

for all $(x,y)\in[(\bar{x},\bar{y})+\eta\mathbb{B}]\cap\mbox{gph}\,F$. Indeed, if $(x,y,\vartheta(x)-\varphi(x,y))\in\tilde U$, this inequality follows directly from the partial calmness. If $(x,y,\vartheta(x)-\varphi(x,y))\notin\tilde U$, then

$$\varphi(x,y)-\vartheta(x)\ge\gamma,\;\mbox{ and hence }\;\kappa\big(\varphi(x,y)-\vartheta(x)\big)\ge\kappa\gamma,$$

which ensures the claimed inequality due to $\psi(x,y)-\psi(\bar{x},\bar{y})\ge-\kappa\gamma$ for such (x, y). Since $(\bar{x},\bar{y})$ is feasible to (7.8.2) with $\varphi(\bar{x},\bar{y})-\vartheta(\bar{x})=0$, the obtained inequality tells us that $(\bar{x},\bar{y})$ is a locally optimal solution to the penalized problem (7.8.5). □
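The penalization above can be watched at work on a tiny instance of our own construction (all data below are illustrative assumptions, not taken from the text): lower-level cost φ(x, y) = (y − x)², constraint set F(x) ≡ [−1, 1], and upper-level cost ψ(x, y) = (x − 2)² + y², so that S(x) = {clip(x, −1, 1)} and the bilevel optimum is (x̄, ȳ) = (2, 1). With κ = 2 a crude grid search over the remaining constraint y ∈ [−1, 1] already recovers this solution:

```python
# Toy optimistic bilevel program (our own illustration):
#   lower level:  minimize  phi(x, y) = (y - x)^2  over  y in F(x) = [-1, 1],
#   upper level:  minimize  psi(x, y) = (x - 2)^2 + y^2  over  y in S(x).

def clip(t, lo=-1.0, hi=1.0):
    return max(lo, min(hi, t))

def phi(x, y):   return (y - x) ** 2
def psi(x, y):   return (x - 2.0) ** 2 + y ** 2
def vartheta(x): return phi(x, clip(x))          # lower-level optimal value

def penalized(x, y, kappa):
    # Objective of the partially penalized problem: the troublesome
    # constraint phi(x, y) <= vartheta(x) moves into the cost function.
    return psi(x, y) + kappa * (phi(x, y) - vartheta(x))

# Grid search over x in [-3, 3] and the remaining constraint y in [-1, 1].
grid_x = [i / 50.0 for i in range(-150, 151)]
grid_y = [j / 50.0 for j in range(-50, 51)]
best = min(((x, y) for x in grid_x for y in grid_y),
           key=lambda p: penalized(p[0], p[1], kappa=2.0))
assert abs(best[0] - 2.0) < 1e-9 and abs(best[1] - 1.0) < 1e-9
```

Note that the penalty term φ(x, y) − 𝜗(x) is nonnegative on gph F and vanishes exactly on the lower-level solution set, which is what makes the unconstrained-in-y search meaningful here.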
Thus the imposed partial calmness allows us to reduce the original problem of optimistic bilevel optimization to the single-level mathematical program (7.8.5) with conventional inequality constraints, where the troublesome term φ(x, y) − 𝜗(x) enters the penalized cost function. To derive necessary optimality conditions for (7.8.5), let us reformulate the generalized MFCQ (7.6.6) in conventional bilevel terms as in the case of bilevel programs with smooth data [6].
Definition 7.8.2 We say that the lower-level regularity condition holds at $(\bar{x},\bar{y})\in\mathbb{R}^n\times\mathbb{R}^m$ if

$$\Big[\displaystyle\sum_{i\in I(\bar{x},\bar{y})}\lambda_i v_i=0,\;\lambda_i\ge 0\Big]\Longrightarrow\Big[\lambda_i=0\;\mbox{ for all }\;i\in I(\bar{x},\bar{y})\Big]$$

whenever $(u_i,v_i)\in\partial f_i(\bar{x},\bar{y})$ with some $u\in\mathbb{R}^n$, where the collection of active lower-level constraint indices is

$$I(\bar{x},\bar{y}):=\big\{i\in\{1,\ldots,r\}\big|\;f_i(\bar{x},\bar{y})=0\big\}.$$

Similarly, the upper-level regularity condition holds at $\bar{x}\in\mathbb{R}^n$ if

$$\Big[\displaystyle 0\in\sum_{j\in J(\bar{x})}\mu_j\partial g_j(\bar{x}),\;\mu_j\ge 0\Big]\Longrightarrow\Big[\mu_j=0\;\mbox{ whenever }\;j\in J(\bar{x})\Big]$$

with the collection of active upper-level constraint indices $J(\bar{x}):=\big\{j\in\{1,\ldots,s\}\big|\;g_j(\bar{x})=0\big\}$. △
Now we are ready for the application of Theorem 7.6.1 to the optimistic bilevel program in the equivalent form (7.8.5). To proceed, we have to verify first that the optimal value function 𝜗(x) in the cost function of (7.8.5) is locally Lipschitzian around the reference point and then to upper estimate the basic subdifferential (7.2.7) of the function − 𝜗(⋅). Since the basic subdifferential does not possess the plus-minus symmetry while its convex hull $\mbox{co}\,\partial\vartheta(\bar{x})$ does, we need to convexify in the proof the set on the right-hand side of the upper estimate in (7.5.3). In this way we arrive at the following major result.
Theorem 7.8.3 Let $(\bar{x},\bar{y})$ be a partially calm locally optimal solution to the optimistic bilevel program (7.8.2) such that the cost functions φ, ψ and the constraint functions $f_i$, $g_j$ are locally Lipschitzian around $(\bar{x},\bar{y})$ and $\bar{x}$, respectively, the argminimum mapping M from (7.5.2) is inner semicontinuous at $(\bar{x},\bar{y})$, and both lower-level and upper-level regularity conditions of Definition 7.8.2 are satisfied at $(\bar{x},\bar{y})$ and $\bar{x}$. Then there exist multipliers $\lambda_1,\ldots,\lambda_r$, $\mu_1,\ldots,\mu_s$, and $\nu_1,\ldots,\nu_r$ satisfying the sign and complementary slackness conditions

$$\lambda_i\ge 0,\;\;\lambda_i f_i(\bar{x},\bar{y})=0\;\mbox{ for }\;i=1,\ldots,r,$$

$$\mu_j\ge 0,\;\;\mu_j g_j(\bar{x})=0\;\mbox{ for }\;j=1,\ldots,s,$$

$$\nu_i\ge 0,\;\;\nu_i f_i(\bar{x},\bar{y})=0\;\mbox{ for }\;i=1,\ldots,r,$$

together with a vector $u\in\mbox{co}\,\partial\vartheta(\bar{x})$ such that

$$(u,0)\in\mbox{co}\,\partial\varphi(\bar{x},\bar{y})+\displaystyle\sum_{i=1}^r\nu_i\,\mbox{co}\,\partial f_i(\bar{x},\bar{y}),$$

$$(u,0)\in\partial\varphi(\bar{x},\bar{y})+\kappa^{-1}\partial\psi(\bar{x},\bar{y})+\displaystyle\sum_{i=1}^r\lambda_i\partial f_i(\bar{x},\bar{y})+\sum_{j=1}^s\mu_j\Big(\partial g_j(\bar{x}),0\Big),$$

where κ > 0 is a constant of partial calmness taken from (7.8.4). △
Proof Proposition 7.8.1 tells us that $(\bar{x},\bar{y})$ is a locally optimal solution to the penalized problem (7.8.5). To apply Theorem 7.6.1 therein, we first check that the optimal value function 𝜗 is locally Lipschitzian around $\bar{x}$. Observe that 𝜗 is l.s.c. around $\bar{x}$ under the assumptions made, and hence by (7.2.9) its local Lipschitz continuity around $\bar{x}$ amounts to the condition $\partial^\infty\vartheta(\bar{x})=\{0\}$. The assumed inner semicontinuity of M at $(\bar{x},\bar{y})$ lets us use Theorem 7.5.1, which yields

$$\partial^\infty\vartheta(\bar{x})\subset D^*F(\bar{x},\bar{y})(0)\;\mbox{ with }\;F(x)=\big\{y\in\mathbb{R}^m\big|\;f_i(x,y)\le 0,\;i=1,\ldots,r\big\}.$$

Since the graph of this mapping $F\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ is

$$\mbox{gph}\,F=\big\{(x,y)\in\mathbb{R}^n\times\mathbb{R}^m\big|\;f_i(x,y)\le 0,\;i=1,\ldots,r\big\},$$

the assumed lower-level regularity ensures, via the normal cone estimate for such inequality systems, that $D^*F(\bar{x},\bar{y})(0)=\{0\}$, and thus 𝜗 is locally Lipschitzian around $\bar{x}$.

Applying now Theorem 7.6.1 to the penalized problem (7.8.5) with taking into account the upper-level regularity at $\bar{x}$, we find multipliers $\lambda_i$ and $\mu_j$ satisfying the sign and complementary slackness conditions of the theorem for which

$$0\in\partial\psi(\bar{x},\bar{y})+\kappa\partial\varphi(\bar{x},\bar{y})+\Big(\kappa\partial(-\vartheta)(\bar{x}),0\Big)+\displaystyle\sum_{i=1}^r\lambda_i\partial f_i(\bar{x},\bar{y})+\sum_{j=1}^s\mu_j\Big(\partial g_j(\bar{x}),0\Big).$$

To proceed, we evaluate $\partial(-\vartheta)(\bar{x})$ by using

$$\partial(-\vartheta)(\bar{x})\subset\overline\partial(-\vartheta)(\bar{x})=-\overline\partial\vartheta(\bar{x})=-\mbox{co}\,\partial\vartheta(\bar{x}),$$

where $\overline\partial$ stands for Clarke's generalized gradient of locally Lipschitzian functions. Combining the latter with the above inclusion gives us a vector $u\in\mbox{co}\,\partial\vartheta(\bar{x})$ such that

$$\kappa(u,0)\in\partial\psi(\bar{x},\bar{y})+\kappa\partial\varphi(\bar{x},\bar{y})+\displaystyle\sum_{i=1}^r\lambda_i\partial f_i(\bar{x},\bar{y})+\sum_{j=1}^s\Big(\mu_j\partial g_j(\bar{x}),0\Big).$$

Dividing here by κ and relabeling $\lambda_i/\kappa$ and $\mu_j/\kappa$ as $\lambda_i$ and $\mu_j$, which preserves the sign and complementary slackness conditions, justifies the second inclusion of the theorem. The first one follows from the convexification of estimate (7.5.3) of Theorem 7.5.1 combined with the coderivative upper estimate for the inequality-constrained mapping F, which generates the multipliers $\nu_i$ with the stated sign and complementary slackness conditions. □
The next section presents an independent set of necessary optimality conditions for optimistic bilevel programs with Lipschitzian data that are obtained without using any convexification while employing instead yet another variational device and subdifferential calculus rule.
7.9 Bilevel Optimization via Subdifferential Difference Rule
Considering the single-level problem (7.8.5), which we are finally dealing with while deriving necessary optimality conditions for optimistic bilevel programs, note that the objective therein contains the difference of two nonsmooth functions. The basic subdifferential (7.2.7) does not possess any special rule for differences of nonsmooth functions, but the regular subdifferential (7.2.11) does, as was first observed in [33] by using a smooth variational description of regular subgradients. Here we employ this approach to establish necessary optimality conditions for Lipschitzian bilevel programs that are different from those in Theorem 7.8.3.
The derivation of these necessary optimality conditions is based on the following two results, which are certainly of their independent interest. The first one provides a smooth variational description of regular subgradients of arbitrary extended-real-valued functions $\varphi\colon\mathbb{R}^n\to\overline{\mathbb{R}}$.

Lemma 7.9.1 Let $\varphi\colon\mathbb{R}^n\to\overline{\mathbb{R}}$ be finite at $\bar{x}$, and let $v\in\hat\partial\varphi(\bar{x})$. Then there exist a neighborhood U of $\bar{x}$ and a function $\psi\colon U\to\mathbb{R}$ such that $\psi(\bar{x})=\varphi(\bar{x})$, that ψ is Fréchet differentiable at $\bar{x}$ with $\nabla\psi(\bar{x})=v$, and that the difference ψ − φ achieves at $\bar{x}$ its local maximum on U. △
Proof Due to the regular subdifferential representation

$$\hat\partial\varphi(\bar{x})=\big\{v\in\mathbb{R}^n\big|\;(v,-1)\in\hat N\big((\bar{x},\varphi(\bar{x}));\mbox{epi}\,\varphi\big)\big\},$$

it suffices to verify the following geometric counterpart of the claimed result: given a set Ω with $\bar{z}\in\Omega$, we have $w\in\hat N(\bar{z};\Omega)$ if and only if there exists a function $\psi\colon\mathbb{R}^N\to\mathbb{R}$, which is Fréchet differentiable at $\bar{z}$ with $\nabla\psi(\bar{z})=w$ and achieves at $\bar{z}$ its global maximum relative to Ω. Indeed, having such a function $\psi\colon U\to\mathbb{R}$ gives us

$$\psi(z)=\psi(\bar{z})+\langle w,z-\bar{z}\rangle+o(\|z-\bar{z}\|)\le\psi(\bar{z})\;\mbox{ whenever }\;z\in\Omega\cap U,$$

and hence $\langle w,z-\bar{z}\rangle+o(\|z-\bar{z}\|)\le 0$ for such z, which means that $w\in\hat N(\bar{z};\Omega)$ by the definition of regular normals. Conversely, given $w\in\hat N(\bar{z};\Omega)$, it can be checked directly that a suitable function is

$$\psi(z):=\left\{\begin{array}{ll}\min\big\{0,\langle w,z-\bar{z}\rangle\big\}&\mbox{ if }\;z\in\Omega,\\ \langle w,z-\bar{z}\rangle&\mbox{ otherwise},\end{array}\right.$$

which is Fréchet differentiable at $\bar{z}$ with $\nabla\psi(\bar{z})=w$ and attains at $\bar{z}$ its global maximum relative to Ω. □
The second lemma gives us the aforementioned difference rule for regular subgradients.
Lemma 7.9.2 Let $\varphi_1,\varphi_2\colon\mathbb{R}^n\to\overline{\mathbb{R}}$ be finite at $\bar{x}$, and let $\hat\partial\varphi_2(\bar{x})\neq\emptyset$. Then we have the inclusions

$$\hat\partial(\varphi_1-\varphi_2)(\bar{x})\subset\bigcap_{v\in\hat\partial\varphi_2(\bar{x})}\Big[\hat\partial\varphi_1(\bar{x})-v\Big]\subset\hat\partial\varphi_1(\bar{x})-\hat\partial\varphi_2(\bar{x}).$$

In particular, if $\bar{x}$ is a local minimizer of the difference $\varphi_1-\varphi_2$, then

$$\hat\partial\varphi_2(\bar{x})\subset\hat\partial\varphi_1(\bar{x}).$$

△
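A one-dimensional illustration of our own shows that the first inclusion of the lemma can be much tighter than the outer difference of the two subdifferentials: take φ₁ = φ₂ = |·| at x̄ = 0.

```latex
% \varphi_1(x) = \varphi_2(x) = |x|, so \varphi_1 - \varphi_2 \equiv 0
% and \hat\partial(\varphi_1 - \varphi_2)(0) = \{0\}.
\[
\hat\partial\varphi_1(0)=\hat\partial\varphi_2(0)=[-1,1],\qquad
\bigcap_{v\in[-1,1]}\big([-1,1]-v\big)=\{0\},\qquad
\hat\partial\varphi_1(0)-\hat\partial\varphi_2(0)=[-2,2].
\]
% The intermediate intersection recovers \{0\} exactly, while the outer
% difference set [-2,2] overestimates it considerably.
```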
Proof Fix any $v\in\hat\partial\varphi_2(\bar{x})$ and employ Lemma 7.9.1 to find a neighborhood U of $\bar{x}$ and a function $\psi\colon U\to\mathbb{R}$, which is Fréchet differentiable at $\bar{x}$ and satisfies the relationships

$$\psi(\bar{x})=\varphi_2(\bar{x}),\quad\nabla\psi(\bar{x})=v,\;\mbox{ and }\;\psi(x)\le\varphi_2(x)\;\mbox{ whenever }\;x\in U.$$

Then picking any $w\in\hat\partial(\varphi_1-\varphi_2)(\bar{x})$ and any ε > 0, we find γ > 0 such that

$$\begin{array}{ll}\langle w,x-\bar{x}\rangle&\le\varphi_1(x)-\varphi_2(x)-\big(\varphi_1(\bar{x})-\varphi_2(\bar{x})\big)+\varepsilon\|x-\bar{x}\|\\ &\le\varphi_1(x)-\psi(x)-\big(\varphi_1(\bar{x})-\psi(\bar{x})\big)+\varepsilon\|x-\bar{x}\|\end{array}$$

whenever $\|x-\bar{x}\|\le\gamma$. Since ε > 0 was chosen arbitrarily and ψ is Fréchet differentiable at $\bar{x}$, this gives us

$$w\in\hat\partial(\varphi_1-\psi)(\bar{x})=\hat\partial\varphi_1(\bar{x})-\nabla\psi(\bar{x})=\hat\partial\varphi_1(\bar{x})-v,$$

which verifies the first inclusion of the lemma; the second one follows immediately. If finally $\bar{x}$ is a local minimizer of $\varphi_1-\varphi_2$, then $0\in\hat\partial(\varphi_1-\varphi_2)(\bar{x})$, and the first inclusion yields $v\in\hat\partial\varphi_1(\bar{x})$ for every $v\in\hat\partial\varphi_2(\bar{x})$. □
Now we are in a position to derive refined necessary optimality conditions for optimistic bilevel programs with Lipschitzian data.
Theorem 7.9.3 Let $(\bar{x},\bar{y})$ be a partially calm locally optimal solution to the optimistic bilevel program (7.8.2), considered for simplicity with no upper-level constraints, where the functions φ, ψ, and $f_i$ are locally Lipschitzian around $(\bar{x},\bar{y})$, where the lower-level regularity condition of Definition 7.8.2 holds at $(\bar{x},\bar{y})$, and where $\hat\partial\vartheta(\bar{x})\neq\emptyset$ for the optimal value function (7.8.1). Then for every regular subgradient $u\in\hat\partial\vartheta(\bar{x})$ there exist multipliers $\nu_1,\ldots,\nu_r$ and $\lambda_1,\ldots,\lambda_r$ satisfying the sign and complementary slackness conditions

$$\nu_i\ge 0,\;\;\nu_i f_i(\bar{x},\bar{y})=0\;\mbox{ and }\;\lambda_i\ge 0,\;\;\lambda_i f_i(\bar{x},\bar{y})=0\;\mbox{ for }\;i=1,\ldots,r$$

such that

$$(u,0)\in\partial\varphi(\bar{x},\bar{y})+\displaystyle\sum_{i=1}^r\nu_i\partial f_i(\bar{x},\bar{y}),$$

$$(u,0)\in\partial\varphi(\bar{x},\bar{y})+\kappa^{-1}\partial\psi(\bar{x},\bar{y})+\displaystyle\sum_{i=1}^r\lambda_i\partial f_i(\bar{x},\bar{y}),$$

where κ > 0 is a constant of partial calmness taken from (7.8.4). △
Proof. The imposed partial calmness property ensures that $(\bar{x},\bar{y})$ is a local optimal solution to the penalized problem

$$\mbox{minimize }\;\psi(x,y)+\kappa\big(\varphi(x,y)-\vartheta(x)\big)+\delta\big((x,y);\mbox{gph}\, F\big),$$

where $F\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ is the lower-level constraint mapping whose graph is described by the inequalities $f_i(x,y)\le 0$ as $i=1,\ldots,r$. Applying to this problem the generalized Fermat rule together with the difference rule obtained above yields

$$\big(\kappa\hat\partial\vartheta(\bar{x}),0\big)\subset\hat\partial\big(\psi(\cdot)+\kappa\varphi(\cdot)+\delta(\cdot;\mbox{gph}\, F)\big)(\bar{x},\bar{y}).$$

On the other hand, the very construction of the optimal value function $\vartheta$ gives us

$$\big(\hat\partial\vartheta(\bar{x}),0\big)\subset\hat\partial\big(\varphi(\cdot)+\delta\big(\cdot;\mbox{gph}\, F\big)\big)(\bar{x},\bar{y}).$$

Using the inclusion of the regular subdifferential $\hat\partial$ into the basic one together with the subdifferential sum rule, we arrive at

$$\begin{aligned}
\Big(\kappa\hat\partial\vartheta(\bar{x}),0\Big)&\subset \partial\psi(\bar{x},\bar{y})+\kappa\partial\varphi(\bar{x},\bar{y})+N\big((\bar{x},\bar{y});\mbox{gph}\, F\big),\\
\Big(\hat\partial\vartheta(\bar{x}),0\Big)&\subset\partial\varphi(\bar{x},\bar{y})+N\big((\bar{x},\bar{y});\mbox{gph}\, F\big).
\end{aligned}$$

Under the assumptions made, the normal cone to the graph of $F$ at $(\bar{x},\bar{y})$ admits the upper estimate

$$N\big((\bar{x},\bar{y});\mbox{gph}\, F\big)\subset\bigcup\Big\{\displaystyle\sum_{i=1}^r\lambda_i\partial f_i(\bar{x},\bar{y})\Big|\;\lambda_i\ge 0,\;\lambda_i f_i(\bar{x},\bar{y})=0\;\mbox{ as }\; i=1,\ldots,r\Big\}.$$

Combining this with the inclusions above, for each $u\in\hat\partial\vartheta(\bar{x})$ we find multipliers $\lambda_i$ as stated such that

$$\kappa(u,0)\in\partial\psi(\bar{x},\bar{y})+\kappa\partial\varphi(\bar{x},\bar{y})+\displaystyle\sum_{i=1}^r\lambda_i\partial f_i(\bar{x},\bar{y}).$$

Dividing the latter inclusion by $\kappa$ and treating the second inclusion above in the same way, we arrive at both necessary optimality conditions claimed in the theorem. □
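To see the conclusions of this theorem at work, consider the following toy smooth instance (our own illustration; all data are chosen by us): the lower level is to minimize $\varphi(x,y)=(x+y)^2$ subject to $f_1(x,y)=-y\le 0$, which gives $S(x)=\{\max(0,-x)\}$ and $\vartheta(x)=\max(0,x)^2$, while the upper level minimizes $\psi(x,y)=(x-1)^2+y^2$; the optimal solution is $(\bar{x},\bar{y})=(1,0)$, and all subdifferentials reduce to gradients:

```python
# Verifying the first optimality condition (u,0) ∈ ∇φ + Σ ν_i ∇f_i on a toy
# smooth bilevel program (illustrative data chosen by us, not from the chapter):
#   lower level: minimize φ(x,y) = (x+y)^2  s.t.  f1(x,y) = -y <= 0,
#   so S(x) = {max(0,-x)} and ϑ(x) = max(0,x)^2;
#   upper level: minimize ψ(x,y) = (x-1)^2 + y^2, optimal at (x̄,ȳ) = (1,0).

def grad_phi(x, y):           # ∇φ(x,y) = (2(x+y), 2(x+y))
    return (2.0*(x + y), 2.0*(x + y))

def grad_f1(x, y):            # ∇f1(x,y) = (0, -1)
    return (0.0, -1.0)

def theta_prime(x):           # ϑ(x) = max(0,x)^2 is C^1 with ϑ'(x) = 2 max(0,x)
    return 2.0*max(0.0, x)

xb, yb = 1.0, 0.0             # optimal solution of the bilevel program
u = theta_prime(xb)           # here the regular subdifferential of ϑ at 1 is {2}

gphi, gf1 = grad_phi(xb, yb), grad_f1(xb, yb)
nu = gphi[1] / (-gf1[1])      # second component of (u,0) = ∇φ + ν∇f1 gives ν
assert nu >= 0.0 and nu * (-yb) == 0.0   # ν >= 0 and complementarity ν f1 = 0
assert gphi[0] + nu*gf1[0] == u          # first component matches u
print("u =", u, "nu =", nu)              # → u = 2.0 nu = 2.0
```

The multiplier $\nu=2$ satisfies both the sign and complementary slackness conditions, so the inclusion holds exactly at the bilevel optimal solution.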
Observe that we always have $\hat\partial\vartheta(\bar{x})\neq\emptyset$ if the optimal value function (7.8.1) is convex, which is surely the case when all the functions $\varphi$ and $f_i$ therein are convex. If in addition the upper-level data are also convex, more general results were derived in [17] for problems of semi-infinite programming with an arbitrary number of inequality constraints in locally convex topological vector spaces by reducing them to problems of DC programming with objectives represented as differences of convex functions. Note further that the necessary optimality conditions obtained in Theorems 7.8.3 and 7.9.3 are independent of each other even in the case of bilevel programs with smooth data; we refer the reader to [30, Example 6.24] for an illustration of this statement and for the use of the obtained results in solving smooth bilevel programs. Finally, we mention the possibility of replacing the inner semicontinuity assumption on the solution map S(x) imposed in both Theorems 7.8.3 and 7.9.3 by the uniform boundedness of this map in finite dimensions, or by its inner semicompactness counterpart in infinite-dimensional spaces; cf. [11, 16, 30, 34] for similar transitions in various bilevel settings.
7.10 Concluding Remarks and Open Questions
In this largely self-contained chapter of the book we described a variational approach to bilevel optimization together with its implementation in deriving advanced necessary optimality conditions for optimistic bilevel programs in finite-dimensional spaces. The machinery of variational analysis and generalized differentiation needed for this purpose (including the fundamental extremal principle, major calculus rules, and subdifferentiation of optimal value functions) is presented here with complete proofs. The variational approach developed above has strong potential for further developments. Let us briefly discuss some open questions in this direction.
The major difference between the optimistic model (7.7.4) and the pessimistic model (7.7.5) in bilevel programming is that the latter invokes a marginal/optimal value function of supremum type instead of the infimum type appearing in (7.7.4). Subdifferentiation of supremum marginal functions is more involved than that of the infimum type. Some results in this vein for problems with Lipschitzian data can be distilled from the recent papers [31, 32, 38], while their implementation in the framework of pessimistic bilevel programs remains a challenging issue.
The given proof of the necessary optimality conditions for bilevel programs in Theorem 7.8.3 requires an upper estimate of $\partial(-\vartheta)(\bar{x})$, which cannot be directly derived from that for $\partial\vartheta(\bar{x})$ since $\partial(-\vartheta)(\bar{x})\neq-\partial\vartheta(\bar{x})$ in general. To obtain such an estimate, we used the subdifferential convexification and the fact that the convexified/Clarke subdifferential of Lipschitz continuous functions possesses the plus-minus symmetry. However, there is a nonconvex subgradient set that is much smaller than Clarke's one while still having this symmetry. It is the symmetric subdifferential $\partial^0\varphi(\bar{x}):=\partial\varphi(\bar{x})\cup\big(-\partial(-\varphi)(\bar{x})\big)$, which enjoys the full calculus induced by the basic one. Efficient evaluations of $\partial^0$ for infimum and supremum marginal functions would lead us to refined optimality conditions for both optimistic and pessimistic models in bilevel optimization.
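A standard two-dimensional example showing that the symmetric subdifferential can be strictly smaller than Clarke's one while keeping the plus-minus symmetry is $\varphi(x,y)=|x|-|y|$ at the origin:

```latex
% For \varphi(x,y)=|x|-|y| at (0,0):
\[
\partial\varphi(0,0)=[-1,1]\times\{-1,1\},\qquad
\partial(-\varphi)(0,0)=\{-1,1\}\times[-1,1],
\]
\[
\partial^0\varphi(0,0)
=\big([-1,1]\times\{-1,1\}\big)\cup\big(\{-1,1\}\times[-1,1]\big)
\subsetneq[-1,1]\times[-1,1]=\overline{\partial}\varphi(0,0),
\]
% i.e., the symmetric subdifferential is the boundary frame of the unit square,
% strictly inside the Clarke subdifferential, while the plus-minus symmetry
% \partial^0(-\varphi)(0,0)=-\partial^0\varphi(0,0) is preserved.
```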
The partial calmness property used in both Theorems 7.8.3 and 7.9.3 seems to be rather restrictive when the lower-level problem is nonlinear with respect to the decision variable. It is a challenging research topic to relax this assumption and to further investigate the uniform weak sharp minimum property and its modifications that yield partial calmness.
One of the possible ways to avoid partial calmness in bilevel programming is as follows. Having the solution map $S\colon\mathbb{R}^n\rightrightarrows\mathbb{R}^m$ of the lower-level problem, consider the constrained upper-level problem given by

$$\mbox{minimize }\;\Psi(x):=\psi\big(x,S(x)\big)\;\mbox{ subject to }\;x\in\Omega,$$

where $\psi\colon\mathbb{R}^n\times\mathbb{R}^m\to\overline{\mathbb{R}}$ is the upper-level cost function and $\Omega\subset\mathbb{R}^n$ is the upper-level constraint set. Since $S$ is generally set-valued, so is the composition $\Psi\colon\mathbb{R}^n\rightrightarrows\mathbb{R}$ with values in $\mathbb{R}$, and hence the above problem can be treated by means of set-valued optimization.
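To make the set-valued nature of $\Psi$ concrete, here is a small dependency-free sketch (the problem data and function names are ours, purely illustrative): the lower-level problem minimize$_y$ $(y^2-x)^2$ has $S(x)=\{\pm\sqrt{x}\}$ for $x>0$, so $\Psi(x)=\psi(x,S(x))$ genuinely takes multiple values:

```python
# Toy illustration (data and names are ours) of the set-valued composition
# Ψ(x) = ψ(x, S(x)): the lower-level problem  minimize_y (y^2 - x)^2  has
# S(x) = {+√x, -√x} for x > 0, so Ψ(x) may contain several values.

def lower_cost(x, y):
    return (y*y - x)**2

def solution_map(x, grid, tol=1e-9):
    """Approximate S(x) as the argmin set of the lower-level cost on a grid."""
    vals = [lower_cost(x, y) for y in grid]
    m = min(vals)
    return [y for y, v in zip(grid, vals) if v <= m + tol]

def upper_cost(x, y):
    return (x - 1.0)**2 + (y - 1.0)**2

def Psi(x, grid):
    """Ψ(x) = ψ(x, S(x)) as the set of attained upper-level values."""
    return sorted({round(upper_cost(x, y), 6) for y in solution_map(x, grid)})

grid = [i / 1000.0 for i in range(-2000, 2001)]   # y ∈ [-2, 2], step 1e-3
values = Psi(1.0, grid)   # S(1) = {-1, 1}
print(values)             # → [0.0, 4.0]: Ψ(1) is genuinely a set, not a number
```

Minimization of such a $\Psi$ must then be understood with respect to a suitable order on the value sets, which is exactly the setting of set-valued optimization.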
Note that a somewhat related approach to bilevel optimization was developed in [1], where the lower-level problem was replaced by the corresponding KKT system described by a certain generalized equation of the Robinson type [39]. Applying to the latter the necessary optimality conditions available for upper-level problems with such constraints made it possible to establish verifiable results for the original nonsmooth problem of bilevel programming.
Henrion and Surowiec suggested in [19] a novel approach to deriving necessary optimality conditions for optimistic bilevel programs with $\mathcal{C}^2$-smooth data and convex lower-level problems. Their approach uses a reduction to mathematical programs with equilibrium constraints (MPECs) and allowed them to significantly relax the partial calmness assumption. Furthermore, in this way they obtained new necessary optimality conditions for the bilevel programs under consideration, which are described via the Hessian matrices of the program data.
A challenging direction of future research is to extend the approach and results of [19] to nonconvex bilevel programs with nonsmooth data. It would be natural in this way to replace the classical Hessian in necessary optimality conditions by its generalized counterpart (known as the second-order subdifferential) introduced by the author in [27] and then broadly employed in variational analysis and its applications; see, e.g., [30] and the references therein.
As has long been realized, problems of bilevel optimization are generally ill-posed, which creates serious computational difficulties for their numerical solution; see, e.g., [5, 6, 47] for more details and discussions. Furthermore, various regularization methods and approximation procedures devised to avoid ill-posedness have serious drawbacks of their own and often end up with approximate solutions that may be rather far from optimal ones. Thus it seems appealing to deal with ill-posed bilevel programs as they are and to develop numerical algorithms based on the obtained necessary optimality conditions. Some results in this vein are presented in [47], involving necessary optimality conditions of the type discussed here, while much more work remains to be done in this very important direction and its practical applications.
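As a hedged sketch of such a penalization-based computation (a toy instance of ours, not an algorithm from [47]): take the lower-level cost $\varphi(x,y)=(x+y)^2$ over $y\ge 0$ and the upper-level cost $\psi(x,y)=(x-1)^2+y^2$, so that $\vartheta(x)=\max(0,x)^2$, and minimize the penalized objective $\psi+\kappa(\varphi-\vartheta)$ over the lower-level feasible region for increasing $\kappa$ by plain grid search:

```python
# Value-function penalization sketch on a toy bilevel program (illustrative only):
# minimize ψ(x,y) + κ(φ(x,y) - ϑ(x)) over the lower-level feasible set y >= 0.
phi   = lambda x, y: (x + y)**2             # lower-level cost
psi   = lambda x, y: (x - 1.0)**2 + y*y     # upper-level cost
theta = lambda x: max(0.0, x)**2            # lower-level optimal value function

xs = [i / 100.0 for i in range(-200, 201)]  # grid for x in [-2, 2]
ys = [i / 100.0 for i in range(0, 201)]     # grid for y in [0, 2], i.e. y >= 0

best = None
for kappa in (1.0, 10.0, 100.0):            # increasing penalty parameters
    best = min((psi(x, y) + kappa * (phi(x, y) - theta(x)), x, y)
               for x in xs for y in ys)
print(best[1], best[2])   # → 1.0 0.0, the bilevel optimal solution (x̄,ȳ) = (1,0)
```

On this instance the penalization term $\varphi-\vartheta$ is nonnegative on the feasible region, so the penalized minimizer coincides with the bilevel solution for every $\kappa>0$; in general, necessary optimality conditions of the type derived above serve as stopping tests for such schemes.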
This research was partly supported by the USA National Science Foundation under grants DMS-1512846 and DMS-1808978, by the USA Air Force Office of Scientific Research under grant 15RT04, and by Australian Research Council Discovery Project DP-190100555.
The author is indebted to an anonymous referee and the editors for their very careful reading of the original presentation and making many useful suggestions and remarks, which resulted in essential improvements of the paper.