© Springer Nature Switzerland AG 2020
S. Dempe, A. Zemkoho (eds.), Bilevel Optimization, Springer Optimization and Its Applications 161, https://doi.org/10.1007/978-3-030-52119-6_15

15. Methods for Multiobjective Bilevel Optimization

Gabriele Eichfelder1  
(1)
Institute for Mathematics, TU Ilmenau, Ilmenau, Germany
 
 

Abstract

This chapter is on multiobjective bilevel optimization, i.e. on bilevel optimization problems with multiple objectives on the lower or on the upper level, or even on both levels. We give an overview of the major optimality notions used in multiobjective optimization. We provide characterization results for the set of optimal solutions of multiobjective optimization problems by means of scalarization functionals and optimality conditions. These can be used in theoretical and numerical approaches to multiobjective bilevel optimization.

As multiple objectives arise in multiobjective optimization as well as in bilevel optimization problems, we also point out results on the connection between these two classes of optimization problems. Finally, we give references to numerical approaches which have been followed in the literature to solve this kind of problem. We concentrate in this chapter on nonlinear problems, while the results and statements naturally also hold for the linear case.

Keywords
Multiobjective bilevel optimization · Semivectorial bilevel problem · Scalarization · Optimistic approach · Numerical methods

15.1 Introduction

In bilevel optimization one typically assumes scalar-valued objective functions on the two levels of the bilevel program. These may be, for instance, costs or time, which have to be minimized. However, many real-world problems can only be modeled adequately by considering several, in most cases conflicting, objective functions simultaneously, such as weight and stability, or costs and time.

This occurs for instance in transportation planning [51]. We illustrate this with an example, cf. [17]. Consider a city bus transportation system financed by the public authorities. Their first target is to reduce the losses of this non-profitable business. As a second target they want to motivate as many people as possible to use the buses instead of their own cars, as it is a public mission to reduce the overall traffic. The public authorities can decide on the bus ticket price, but this will influence how the customers use the buses. The public has several, possibly conflicting, objectives, too, such as minimizing their transportation time and costs. Hence, the usage of the public transportation system can be modeled on the lower level with the bus ticket price as parameter, with the solutions influencing the objective values of the public authorities on the upper level. Thus, such a problem can be modeled as a multiobjective bilevel optimization problem.

A problem from medical engineering, the configuration of coils, was also discussed in [17, 40]. In its original version this was a standard scalar-valued optimization problem, which had to be reformulated as a bilevel optimization problem due to the need for real-time solutions and because of its structure. It turned out that the engineers would accept, or even prefer, solutions which do not satisfy a previous equality constraint strictly, in favor of incorporating some additional objectives. This led to the examination of two objective functions on both levels.

Dempe and Mehlitz consider in [11] a bilevel control problem, where the three objective functions are approximating a given target state as well as possible, minimizing the overall control effort, and promoting the sparsity of the chosen control.

If we have more than one objective function, then we speak of a multiobjective optimization problem. Collecting, for instance, m objective functions in an m-dimensional vector also leads to the name vector optimization, as now vector-valued maps have to be optimized. If we have a bilevel program with a vector-valued objective on one or both of the levels, we speak of a multiobjective bilevel optimization problem. If there are multiple objective functions on just one level, one also uses the name semivectorial bilevel problem.

For instance, Pilecka studies in [39] bilevel optimization problems with a single-objective lower level and a vector-valued function on the upper level. The same structure was assumed by Gadhi and Dempe in [23, 24] or Ye in [50]. Bonnel and co-authors [3] study semivectorial bilevel problems with a multiobjective optimization problem on the lower level. For optimality conditions for bilevel problems with such a structure (multiobjective on the lower level, single-objective on the upper level), see also Dempe and co-authors in [12].

In this chapter we will give a short introduction to multiobjective optimization. First, an optimality concept for such optimization problems has to be formulated. In doing so, one can define stricter or weaker concepts, which may have advantages from an application point of view or from a theoretical or numerical point of view, respectively.

We give scalarization results and optimality conditions which allow one to characterize the set of optimal solutions of a multiobjective optimization problem. We also point out some of the developments on numerical solvers. Throughout the chapter the emphasis is on nonlinear problems. For the linear case of a multiobjective bilevel optimization problem with multiple objective functions on each level see, for instance, [9] and the references therein. We will also study reformulations of the feasible set of the upper level problem, which can be used to formulate numerical solution approaches.

Multiple objectives also arise within bilevel optimization itself: a function on the upper level and a function on the lower level. Hence, we will also discuss the relationship between multiobjective and bilevel optimization from this perspective.

15.2 Basics of Multiobjective Optimization

As we will consider bilevel optimization problems with a multiobjective optimization problem on the lower and/or on the upper level, we start by giving an overview of the main optimality notions typically used in multiobjective optimization.

In multiobjective optimization one studies optimization problems formally defined by

$$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)=\left(f_1(x),\ldots,f_m(x)\right)^\top\\ \mbox{subject to the constraint}\\ x\in\Omega, \end{array} \end{aligned} $$
(MOP)
with a vector-valued objective function 
$$f\colon \mathbb {R}^n\to \mathbb {R}^m$$
(
$$m,\,n\in \mathbb {N},\ m\geq 2$$
) and a nonempty set of feasible points 
$$\Omega \subseteq \mathbb {R}^n$$
.
For defining minimality for the multiobjective optimization problem (MOP), we need a partial ordering in the image space. Thus one considers partial orderings introduced by closed pointed convex cones 
$$K\subseteq \mathbb {R}^m$$
. A set K is a convex cone if λ(x + y) ∈ K for all λ ≥ 0, x, y ∈ K, and K is a pointed convex cone if additionally K ∩ (−K) = {0}. A partial ordering introduced by a pointed convex cone is antisymmetric. The partial ordering is then defined by

$$\displaystyle \begin{aligned}x\leq_K y\qquad  \Leftrightarrow\qquad  y-x\in K\ . \end{aligned}$$
Hence, in the following, let 
$$K\subseteq \mathbb {R}^m$$
be a closed pointed convex cone.
For 
$$K=\mathbb {R}^m_+$$
we have for 
$$x,y\in \mathbb {R}^m$$

$$\displaystyle \begin{aligned}x\leq_{\mathbb{R}^m_+} y\qquad \Leftrightarrow\qquad  x_i\leq y_i,\ i=1,\ldots,m,\end{aligned}$$
i.e. the componentwise ordering. Throughout the chapter, we denote by ≤ without a subscript the componentwise ordering defined by 
$$\leq =\leq _{\mathbb {R}^m_+}$$
. For more details on ordering structures in vector optimization we refer to [21].

By using the concept of a partial ordering, we are now able to define two types of optimal solutions of (MOP).

Definition 15.2.1
A point 
$$\bar x\in \Omega $$
is called a K-minimal point, or efficient, for (MOP) if

$$\displaystyle \begin{aligned}\left(\{f(\bar x)\}-K\right)\cap f(\Omega)=\{f(\bar x)\}.\end{aligned}$$
Additionally, for int(K) ≠ ∅, a point 
$$\bar x\in \Omega $$
is called a weakly K-minimal point, or weakly efficient, for (MOP) if

$$\displaystyle \begin{aligned}\left(\{f(\bar x)\}-\mbox{int}(K)\right)\cap f(\Omega)=\emptyset.\end{aligned}$$

Of course, any efficient point is also weakly efficient, in case int(K) ≠ ∅. Whenever we speak of weakly efficient points, we assume that the ordering cone K has a nonempty interior. We denote the set of all K-minimal points, i. e. of all efficient points, as 
$$\mathcal {E}(f(\Omega ),K)$$
, and the set of all weakly efficient points as 
$$\mathcal {E}^w(f(\Omega ),K)$$
. The set 
$$\mathcal {N}(f(\Omega ),K):=\{f(x)\in \mathbb {R}^m\mid x\in \mathcal {E}(f(\Omega ),K)\}$$
is called the nondominated set and the set 
$$\mathcal {N}^w(f(\Omega ),K):=\{f(x)\in \mathbb {R}^m\mid x\in \mathcal {E}^w(f(\Omega ),K)\}$$
the weakly nondominated set.

For 
$$K=\mathbb {R}^m_+$$
the efficient points are also denoted as Edgeworth-Pareto (EP)-minimal points. Then the definition can also be formulated as follows: a point 
$$\bar x\in \Omega $$
is an EP-minimal point for (MOP) if there is no x ∈ Ω with

$$\displaystyle \begin{aligned}\begin{array}{rcl} f_i(x)&\leq& f_i(\bar x),\ i=1,\ldots,m,\ \mbox{ and }\\ f_j(x)&<& f_j(\bar x) \mbox{ for at least one } j\in\{1,\ldots,m\}. \end{array} \end{aligned}$$
Similarly, a point is called weakly EP-minimal or weakly efficient for (MOP) w.r.t. 
$$K=\mathbb {R}^m_+$$
if there is no x ∈ Ω with

$$\displaystyle \begin{aligned}\begin{array}{rcl} f_i(x)&<& f_i(\bar x),\ i=1,\ldots,m. \end{array} \end{aligned}$$
The weakly efficient points are in most cases of more interest from a theoretical or numerical point of view; see, for instance, Theorem 15.3.2, or the results on optimality conditions for a semivectorial bilevel optimization problem as given in [12, 23, 24]. From an application point of view one would in general only be interested in a weakly efficient point which is also efficient, as otherwise one could improve the value of one objective function without deteriorating the others. Nevertheless, in many studies on multiobjective bilevel optimization problems it is assumed that the lower level decision maker would also be satisfied with a weakly efficient solution, see, for instance, [12]. Note that in case Ω is convex and the functions f_i, i = 1, …, m, are strictly convex, i.e., if for all i = 1, …, m

$$\displaystyle \begin{aligned}f_i(\lambda\, x+(1-\lambda)\,y)<\lambda\,f_i(x)+(1-\lambda)\,f_i(y)\ \mbox{ for all }\ x,y\in\Omega,\ x\neq y,\ \lambda\in]0,1[,\end{aligned}$$
then any weakly efficient point for (MOP) w.r.t. 
$$K=\mathbb {R}^m_+$$
is also an efficient point of (MOP) w.r.t. 
$$K=\mathbb {R}^m_+$$
.
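For a finite set of objective vectors, EP-minimality can be checked by pairwise comparison. The following Python sketch (our illustration, not part of the original text; the sampling of Ω is ad hoc) filters the EP-minimal points of a sampled feasible set:

```python
import numpy as np

def ep_minimal(points):
    """Indices of the EP-minimal (efficient) rows of an (N, m) array of
    objective vectors, w.r.t. the componentwise ordering by R^m_+."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i in range(len(pts)):
        # x_j dominates x_i if f(x_j) <= f(x_i) componentwise,
        # with at least one strict inequality
        dominated = np.all(pts <= pts[i], axis=1) & np.any(pts < pts[i], axis=1)
        if not dominated.any():
            keep.append(i)
    return keep

# sample Omega = {x in R^2 : x >= 0, x1 + x2 >= 1} with f = identity;
# the surviving points cluster along the line x1 + x2 = 1
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(2000, 2))
X = X[X.sum(axis=1) >= 1.0]
print(X[ep_minimal(X)])
```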

For iterative numerical approaches the following result is useful: if the feasible set is enlarged and the efficient points of the smaller set have already been determined, then these can be reused when determining the efficient points of the larger set.

Lemma 15.2.2 ([17])
For two sets 
$$A^0, A^1\subseteq \mathbb {R}^n$$
, and a vector-valued function 
$$f\colon \mathbb {R}^n\to \mathbb {R}^m$$
, consider the sets

$$\displaystyle \begin{aligned}\begin{array}{rcl} A&=& A^0\cup A^1\qquad \mathit{\mbox{ and}}\\ \tilde A & = & \mathcal{E}(f(A^0),\mathbb{R}^m_+)\cup A^1. \end{array} \end{aligned}$$

Let f(A^0) be compact. Then it holds 
$$\mathcal {E}(f(A),\mathbb {R}^m_+)=\mathcal {E}(f(\tilde A),\mathbb {R}^m_+)$$
.△
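The lemma can be checked on random data; a minimal sketch (ours), working directly in the image space, i.e. with f the identity, and a naive dominance filter eff:

```python
import numpy as np

def eff(pts):
    # indices of the efficient (nondominated) rows w.r.t. R^2_+
    return [i for i in range(len(pts))
            if not any(np.all(p <= pts[i]) and np.any(p < pts[i])
                       for p in pts)]

rng = np.random.default_rng(1)
A0 = rng.normal(size=(150, 2))           # plays the role of f(A^0)
A1 = rng.normal(size=(150, 2)) + 0.5     # plays the role of f(A^1)

A = np.vstack([A0, A1])                  # A = A^0 union A^1
Atilde = np.vstack([A0[eff(A0)], A1])    # keep only the efficient part of A^0

lhs = sorted(map(tuple, A[eff(A)]))
rhs = sorted(map(tuple, Atilde[eff(Atilde)]))
print(lhs == rhs)                        # True, as claimed by the lemma
```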

For numerical calculations the notion of ε-minimal solutions can also be useful. We state here a definition for the ordering cone 
$$K=\mathbb {R}^m_+$$
, see also for instance [30, 31, 48]:

Definition 15.2.3
Let 
$$\varepsilon \in \mathbb {R}^m$$
with ε_i > 0, i = 1, …, m, be given. A point 
$$\bar x\in \Omega $$
is an ε-EP-minimal solution of the multiobjective optimization problem (MOP) if there is no x ∈ Ω with

$$\displaystyle \begin{aligned}\begin{array}{rcl} f_i(x)+\varepsilon_i&\leq & f_i(\bar x)\quad \mbox{ for all }\ i\in\{1,\ldots,m\}\\ \mbox{ and }\quad  f_j(x)+\varepsilon_j&<& f_j(\bar x) \quad \mbox{ for at least one }\ j\in\{1,\ldots,m\}. \end{array}\end{aligned}$$
Example on Optimality Notions
Let 
$$\Omega =\{x\in \mathbb {R}^2\mid x_1\geq 0,\ x_2\geq 0,\ x_1+x_2\geq 1\}$$
, 
$$f\colon \mathbb {R}^2\to \mathbb {R}^2$$
be defined by f(x) = x for all 
$$x\in \mathbb {R}^2$$
and 
$$K=\mathbb {R}^2_+$$
. Then the set of efficient points is

$$\displaystyle \begin{aligned}\mathcal{E}(f(\Omega),K)=\{x\in\mathbb{R}^2\mid x_1\geq0,\ x_2\geq 0,\ x_1+x_2= 1\}\end{aligned}$$
and the set of weakly efficient points is

$$\displaystyle \begin{aligned}\begin{array}{rcl} \mathcal{E}^w(f(\Omega),K)&=&\{x\in\mathbb{R}^2\mid x_1\geq0,\ x_2\geq 0,\ x_1+x_2= 1\}\\&&\cup\{(0,x_2)\in\mathbb{R}^2\mid x_2\geq 1\}\cup\{(x_1,0)\in\mathbb{R}^2\mid x_1\geq 1\}.\end{array}\end{aligned}$$
As the objective map f is the identity we have

$$\displaystyle \begin{aligned}\mathcal{N}(f(\Omega),K)=\mathcal{E}(f(\Omega),K)\ \mbox{ and }\ \mathcal{N}^w(f(\Omega),K)=\mathcal{E}^w(f(\Omega),K).\end{aligned}$$
For ε_i = 1/4, i = 1, 2, the point 
$$\bar x=(1/2+1/8, 1/2+1/8)$$
is an ε-EP-minimal solution of the multiobjective optimization problem 
$$\min _{x\in \Omega }f(x)$$
but not an efficient point.
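A brief numerical check of this claim (our sketch; the feasible set is sampled randomly):

```python
import numpy as np

eps = np.array([0.25, 0.25])
xbar = np.array([5/8, 5/8])              # (1/2 + 1/8, 1/2 + 1/8)

rng = np.random.default_rng(2)
X = rng.uniform(0.0, 2.0, size=(100000, 2))
X = X[X.sum(axis=1) >= 1.0]              # sample of Omega; f(x) = x

# eps-domination would require f_i(x) + eps_i <= f_i(xbar) for all i,
# i.e. x <= (3/8, 3/8); impossible since x1 + x2 >= 1
print(np.any(np.all(X + eps <= xbar, axis=1)))   # False: eps-EP-minimal

# ordinary domination, e.g. by (1/2, 1/2), does occur
print(np.any(np.all(X <= xbar, axis=1) & np.any(X < xbar, axis=1)))  # True
```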

For multiobjective optimization problems also more general optimality notions have been introduced, which use more general ordering structures. For instance, Ye discusses in [50] semivectorial bilevel problems where the preference relation is locally satiated and almost transitive. Moreover, the concept of using a fixed ordering cone can be replaced by using an ordering map which associates to each element y of the space an ordering cone K(y). This was done by Dempe and Gadhi in [10], who study semivectorial bilevel problems with a multiobjective optimization problem on the upper level, where the cones K(y) are assumed to be Bishop-Phelps cones. For such multiobjective optimization problems proper nonlinear scalarizations exist as well, see [19], which have also been used by Dempe and Gadhi in their approach for formulating necessary optimality conditions. Variable ordering structures have been studied intensively, cf. [18] and, within the context of set optimization, [20]. There is a strong relation between bilevel optimization and set optimization, i.e. optimization with a set-valued objective function. This relation arises whenever the optimal solution set of the lower level is not unique, see [39].

15.3 Characterization of Optimal Solutions of a Multiobjective Optimization Problem

Several methods for solving problem (MOP) are discussed in the literature; for surveys see [13, 29, 35, 42]. A frequently used approach is to replace the vector optimization problem by a suitable scalar-valued optimization problem. This is also exploited in numerical methods for multiobjective bilevel problems, which reformulate the lower level problem as a single-objective problem or replace it by the optimality conditions of this reformulation.

Because of its simplicity, a widely used scalarization for multiobjective optimization is linear scalarization. For that one uses elements of the dual cone

$$\displaystyle \begin{aligned}K^*:=\{w\in\mathbb{R}^m\mid w^\top y\geq 0\ \mbox{ for all }y\in K\}\end{aligned}$$
as well as of the quasi-interior of the dual cone

$$\displaystyle \begin{aligned}K^\#:=\{w\in\mathbb{R}^m\mid w^\top y> 0\ \mbox{ for all }y\in K\setminus\{0\}\}.\end{aligned}$$
For 
$$K=\mathbb {R}^m_+$$
we have 
$$K^*=\mathbb {R}^m_+$$
and 
$$K^\#=\mbox{int}(\mathbb {R}^m_+)=\{y\in \mathbb {R}^m\mid y_i>0,\ i=1,\ldots ,m\}$$
.

For convex multiobjective optimization problems one can use linear scalarization for a full characterization of the weakly efficient points.

Theorem 15.3.1
  (a)
    Let 
$$w\in K^*\setminus \{0\}$$
and 
$$\bar x\in \Omega $$
with
    
$$\displaystyle \begin{aligned}w^\top f(\bar x)\leq w^\top f(x)\ \mathit{\mbox{ for all }}\ x\in\Omega .\end{aligned}$$

    Then 
$$\bar x$$
is a weakly efficient point for (MOP). If, additionally, 
$$w\in K^\#$$
, then 
$$\bar x$$
is even efficient for (MOP).

  (b)
    Let f(Ω) + K be convex. Then for any weakly efficient point 
$$\bar x\in \Omega $$
of (MOP) (and thus for any efficient point) there exists some 
$$w\in K^*\setminus \{0\}$$
with
    
$$\displaystyle \begin{aligned}w^\top f(\bar x)\leq w^\top f(x) \ \mathit{\mbox{ for all }}\ x\in\Omega .\end{aligned}$$
     

Hence, in the convex case, we have a full characterization of the weakly efficient points. A similar result as in (b) does not exist for efficient points and w ∈ K^#. This is shown by the next example.

Example on Linear Scalarization
Let 
$$f\colon \mathbb {R}^2\to \mathbb {R}^2$$
with f(x) = x, 
$$K=\mathbb {R}^2_+$$
, and 
$$\Omega =\{x\in \mathbb {R}^2\mid \|x\|{ }_2\leq 1\}$$
. Then

$$\displaystyle \begin{aligned}\mathcal{E}(f(\Omega),\mathbb{R}^2_+)=\{x\in\mathbb{R}^2\mid \|x\|{}_2= 1, x_1\leq0,\ x_2\leq 0\}.\end{aligned}$$
Considering the point 
$$\bar x=(0,-1)\in \mathcal {E}(f(\Omega ),\mathbb {R}^2_+)$$
there exists no 
$$w\in K^\#=\mbox{int}(\mathbb {R}^2_+)$$
with

$$\displaystyle \begin{aligned}w^\top f(\bar x)=-w_2\leq w^\top f(x)=w_1x_1+w_2x_2\ \mbox{ for all }x\in \Omega.\end{aligned}$$
Only for w = (0, 1) is the point 
$$\bar x$$
a minimal solution of

$$\displaystyle \begin{aligned}\min_{x\in\Omega}w^\top f(x).\end{aligned} $$
This characterization is often used for solving multiobjective bilevel optimization problems. If we have multiple objectives and if the problem is convex, then the set

$$\displaystyle \begin{aligned} \{\bar x\in\Omega\mid \exists w\in K^*\setminus\{0\}:\ w^\top f(\bar x) \leq w^\top f(x) \mbox{ for all }x\in \Omega\} \end{aligned} $$
(15.3.1)
equals the set of weakly efficient points 
$$\mathcal {E}^w(f(\Omega ),K)$$
. Such linear scalarizations are also called the weighted-sum approach, and the components w_i of the vector w ∈ K* are interpreted as weights of the objective functions.
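As an illustration of the weighted-sum approach, consider the following Python sketch (problem data and solver choice are ours): for a convex biobjective problem, every weight vector w with positive components yields an efficient point by Theorem 15.3.1(a), and varying w traces out part of the efficient set.

```python
import numpy as np
from scipy.optimize import minimize

# convex biobjective example: f(x) = (||x||^2, ||x - (2, 1)||^2) on R^2
f1 = lambda x: x[0]**2 + x[1]**2
f2 = lambda x: (x[0] - 2)**2 + (x[1] - 1)**2

def weighted_sum(w):
    # minimize w1*f1(x) + w2*f2(x); unconstrained and smooth, so BFGS suffices
    res = minimize(lambda x: w[0] * f1(x) + w[1] * f2(x), x0=np.zeros(2))
    return res.x

# varying w over the simplex traces out (part of) the efficient set:
# here the segment between (0, 0) and (2, 1)
for w1 in np.linspace(0.1, 0.9, 5):
    x = weighted_sum(np.array([w1, 1.0 - w1]))
    print(np.round(x, 3), np.round([f1(x), f2(x)], 3))
```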

The approach in (15.3.1) is used, for instance, in [32, 33]. The authors reformulate a linear semivectorial bilevel programming problem (with a multiobjective optimization problem on the lower level) as a special bilevel programming problem where the lower level is a parametric linear scalar-valued optimization problem. For characterizing the set in (15.3.1), optimality conditions from scalar-valued optimization can be used. This weighted-sum approach is also examined in [11], where it is pointed out that it becomes a delicate issue if one studies only locally optimal solutions of the parameterized problem.

Also note that from an application point of view, the weakly efficient points are typically not of much interest. A point x ∈ Ω which is a weakly EP-minimal point of (MOP) but not an EP-minimal point of (MOP) means that there is still some 
$$\tilde x$$
which is at least as good with respect to all objective functions and strictly better with respect to at least one of them. By considering instead the set

$$\displaystyle \begin{aligned}\{\bar x\in\Omega\mid \exists w\in K^\#:\ w^\top f(\bar x) \leq w^\top f(x) \mbox{ for all }x\in \Omega\}\end{aligned}$$
one only obtains a subset of the set of efficient points of (MOP), as the example on linear scalarization above has shown. Note that there are numerical approaches for multiobjective bilevel problems which in fact aim at the efficient instead of the weakly efficient points, see for instance [25].

Theorem 15.3.1(b) requires the set f(Ω) + K to be convex. This holds in case the functions f_i are convex, 
$$K=\mathbb {R}^m_+$$
, and Ω is a convex set. For nonconvex multiobjective optimization problems one has to use nonlinear scalarizations. A widely used one is the scalarization by Pascoletti and Serafini [38], also known as the Tammer-Weidner functional [26, 27]. This scalarization approach allows one to determine weakly efficient points for an arbitrary ordering cone K, while many other well-known nonlinear scalarizations, such as the ε-constraint problem (see [13, 28, 34, 35]), were mainly developed for determining EP-minimal points only. Moreover, these can also be seen as special cases of the Pascoletti-Serafini scalarization, see [15]. Thus, results obtained for the Pascoletti-Serafini problem can be applied to these scalarization problems, too.

Hence we consider the parameter dependent scalarization problems (according to Pascoletti and Serafini)

$$\displaystyle \begin{aligned} \begin{array}{c} \min\limits_{(t,x)\in\mathbb{R}^{1+n}} \ t \\ \mbox{subject to the constraints}\\ a+t\,r-f(x)\in K,\\ x\in \Omega, \\ t\in\mathbb{R} \end{array} \end{aligned} $$
(15.3.2)
for parameters 
$$a,r\in \mathbb {R}^m$$
to the multiobjective optimization problem (MOP). The main properties of this scalarization approach are the following:
Theorem 15.3.2
  (a)
    Let 
$$\bar x$$
be an efficient point of (MOP); then 
$$(0,\bar x)$$
is a minimal solution of (15.3.2) with 
$$a=f(\bar x)$$
and 
$$r\in K\setminus \{0\}$$
.

  (b)
    Let 
$$(\bar t,\bar x)$$
be a minimal solution of (15.3.2) for any 
$$a,r\in \mathbb {R}^m$$
; then 
$$\bar x\in \mathcal {E}^w(f(\Omega ),K)$$
.

     

Thus all efficient points of the multiobjective optimization problem can be found, even for nonconvex problems, by choosing suitable parameters. We are even able to restrict the choice of the parameter a to a hyperplane in the image space, according to the following theorem ([16], Theorem 3.2).

Theorem 15.3.3

Let 
$$\bar x\in \mathcal {E}(f(\Omega ),K)$$
and r ∈ K be given. We define a hyperplane H by 
$$H=\{y\in \mathbb {R}^m\mid b^\top y=\beta \}$$
with 
$$b\in \mathbb {R}^m,\ b^\top r\neq 0,\ \beta \in \mathbb {R}$$
. Then there is a parameter a ∈ H and some 
$$\bar t\in \mathbb {R}$$
such that 
$$(\bar t,\bar x)$$
is a minimal solution of the problem (15.3.2).△
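The scalar problem (15.3.2) can be handed to any nonlinear programming solver. A minimal sketch (ours; SLSQP and the data of the example on optimality notions above are our choices) which also varies the parameter a along a line, in the spirit of Theorem 15.3.3:

```python
import numpy as np
from scipy.optimize import minimize

# (15.3.2) for f(x) = x, Omega = {x >= 0, x1 + x2 >= 1}, K = R^2_+:
#   min t  s.t.  a + t*r - f(x) >= 0 componentwise,  x in Omega
def pascoletti_serafini(a, r):
    cons = [{'type': 'ineq', 'fun': lambda z: a + z[0] * r - z[1:]},  # cone constraint
            {'type': 'ineq', 'fun': lambda z: z[1:]},                 # x >= 0
            {'type': 'ineq', 'fun': lambda z: z[1] + z[2] - 1.0}]     # x1 + x2 >= 1
    z0 = np.array([1.0, 1.0, 1.0])                                    # z = (t, x1, x2)
    res = minimize(lambda z: z[0], z0, method='SLSQP', constraints=cons)
    return res.x[0], res.x[1:]

# moving a along a line with fixed direction r yields different
# weakly efficient points (Theorem 15.3.2(b))
for s in np.linspace(-0.8, 0.8, 5):
    t, x = pascoletti_serafini(a=np.array([s, -s]), r=np.array([1.0, 1.0]))
    print(round(t, 3), np.round(x, 3))
```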

Also other scalarizations are used for deriving results on multiobjective bilevel optimization problems. For instance, Gadhi and Dempe use in [23] the scalarization known as signed-distance functional for deriving optimality conditions. This functional evaluates the distance of a point to the negative of the ordering cone as well as to the complement of the negative of the ordering cone.

We end this section by giving the KKT-type (i.e., Karush-Kuhn-Tucker-type) optimality conditions for constrained multiobjective optimization problems. KKT conditions, and optimality conditions in general, are often used to characterize the optimal solutions of the lower level of a bilevel optimization problem. Such an approach was, for instance, called a classical approach by Deb and Sinha in [6], and it was used by Chuong in [4] for a nonsmooth multiobjective bilevel optimization problem. As known from single-objective optimization, KKT conditions typically include complementarity conditions, which are difficult to handle mathematically. Moreover, the necessary conditions require some regularity assumptions (cf. (A2) in [33]). For more details on the sufficiency and necessity of the optimality conditions we refer, for instance, to [13, 29].

We require the feasible set to be given by inequality and equality constraints (basic convex sets could also be handled). For brevity we restrict ourselves here to inequality constraints. Thus, let 
$$g_k\colon \mathbb {R}^n\to \mathbb {R}$$
, k = 1, …, p be continuously differentiable functions such that

$$\displaystyle \begin{aligned} \Omega:=\{x\in\mathbb{R}^n\mid g_k(x)\leq 0,\ k=1,\ldots,p\}. \end{aligned} $$
(15.3.3)

In the following we give the necessary conditions, which also hold for merely locally efficient solutions. For a proof see for instance [13, Theorem 3.21].

Theorem 15.3.4 (Necessary Optimality Conditions)
Let Ω be given as in (15.3.3) and let f_j, j = 1, …, m, and g_k, k = 1, …, p, be continuously differentiable. Let 
$$\bar x$$
be a weakly efficient solution of (MOP) and assume that there exists some 
$$d\in \mathbb {R}^n$$
such that

$$\displaystyle \begin{aligned}\nabla g_k(\bar x)^\top d<0\ \mathit{\mbox{ for all }}\ k\in I(\bar x):=\{k\in\{1,\ldots,p\}\mid g_k(\bar x)=0\},\end{aligned}$$
i.e. a constraint qualification is satisfied. Then there exist Lagrange multipliers 
$$\lambda \in \mathbb {R}^m_+\setminus \{0\}$$
and 
$$\mu \in \mathbb {R}^p_+$$
such that

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{j=1}^m \lambda_j\nabla f_j(\bar x)+\sum_{k=1}^p\mu_k\nabla g_k(\bar x)&=&0,\\ \mu_kg_k(\bar x)&=&0,\ k=1,\ldots,p. \end{array} \end{aligned} $$
(15.3.4)

Under appropriate convexity assumptions we can also state sufficient conditions. We state here a simplified version; for weaker assumptions on the convexity of the functions, such as quasiconvexity of the constraint functions, we refer to [29].

Theorem 15.3.5 (Sufficient Optimality Conditions)

Let Ω be given as in (15.3.3) and let f_j, j = 1, …, m, and g_k, k = 1, …, p, be convex and continuously differentiable. Let 
$$\bar x\in \Omega $$
and assume that there exist Lagrange multipliers 
$$\lambda \in \mathbb {R}^m_+\setminus \{0\}$$
and 
$$\mu \in \mathbb {R}^p_+$$
such that (15.3.4) is satisfied. Then, 
$$\bar x$$
is a weakly efficient solution of (MOP).△
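At a given point, the multiplier system (15.3.4) is a linear feasibility problem in (λ, μ) and can be checked, for instance, with a linear programming solver. A minimal sketch under our own toy data; since the single constraint is active at the chosen point, the complementarity condition holds automatically, and multipliers of inactive constraints would simply be fixed to zero:

```python
import numpy as np
from scipy.optimize import linprog

# f(x) = (x1, x2) and Omega = {x : g1(x) = 1 - x1 - x2 <= 0};
# xbar lies on the line x1 + x2 = 1, so g1 is active
xbar = np.array([0.3, 0.7])
grad_f = np.array([[1.0, 0.0],
                   [0.0, 1.0]])      # rows: grad f_j(xbar)
grad_g = np.array([[-1.0, -1.0]])    # rows: grad g_k(xbar), active constraints only

m, p = len(grad_f), len(grad_g)
# stationarity sum_j lam_j grad f_j + sum_k mu_k grad g_k = 0, plus the
# normalization sum_j lam_j = 1 (which excludes lam = 0), with lam, mu >= 0
A_eq = np.vstack([np.hstack([grad_f.T, grad_g.T]),
                  np.concatenate([np.ones(m), np.zeros(p)])])
b_eq = np.array([0.0, 0.0, 1.0])
res = linprog(c=np.zeros(m + p), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.status == 0, np.round(res.x, 3))   # feasible: lam = (0.5, 0.5), mu = 0.5
```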

15.4 Connection Between Multiobjective Optimization and Bilevel Optimization

In bilevel optimization one has (at least) two objective functions: one on the upper and one on the lower level. In multiobjective optimization one also studies problems with two or more objective functions. This naturally leads to the question of how these problem classes are related, and whether bilevel problems can be reformulated as multiobjective optimization problems. This question was answered for nonlinear problems by Fliege and Vicente in [22]: only by introducing a very special order relation do the optimal solutions of the bilevel problem coincide with the efficient points of a multiobjective optimization problem. Only for problems with a continuously differentiable lower level can this order relation be formulated in a more tractable way. We give some of the basic ideas in the following, all based on [22]. For a more recent survey, with unifying results and a more general characterization of the connections between single-level and bilevel multiobjective optimization, we refer to [41], which also provides a thorough literature survey on this topic as well as the historic sources of this approach for bilevel linear problems.

We assume the bilevel problem to be of the form

$$\displaystyle \begin{aligned} \begin{array}{c} \min_{x_u\in\mathbb{R}^{n_u},\ x_\ell\in\mathbb{R}^{n_\ell}} f_u(x_u,x_\ell)\\ \mbox{ s.t.} \qquad  x_\ell\in\mbox{argmin}_{y\in\mathbb{R}^{n_\ell}}\{f_\ell(x_u,y)\}, \end{array} \end{aligned} $$
(15.4.1)
where 
$$f_u,f_\ell \colon \ \mathbb {R}^{n_u}\times \mathbb {R}^{n_\ell }\to \mathbb {R}$$
are the upper level and lower level objective functions, respectively. Hence, we assume that there are no constraints on either level, and we assume the optimistic approach. The handling of constraints is also briefly described in [22].

For a related multiobjective optimization problem it is not enough to study just a problem with the vector-valued objective function (f_u(x_u, x_ℓ), f_ℓ(x_u, x_ℓ)), as counterexamples in the literature demonstrate. Instead, one possibility to establish a relation is to use a map which includes the vector x_u as a component. But still, it is not enough to use the standard componentwise partial ordering. The authors of [22] present a complete reformulation with an ordering which is not numerically tractable. They also formulate a weaker relation, which we present in the following, but which gives a sufficient condition only.

The related vector optimization problem, which the authors of [22] finally propose, uses as objective map 
$$F\colon \mathbb {R}^{n_u}\times \mathbb {R}^{n_\ell }\to \mathbb {R}^{n_u+3}$$
defined by

$$\displaystyle \begin{aligned}F(x_u,x_\ell)=(x_u,f_u(x_u,x_\ell),f_\ell(x_u,x_\ell),\|\nabla_{x_\ell}f_\ell(x_u,x_\ell)\|{}_2). \end{aligned} $$
(15.4.2)
In the image space, a cone K is used to define an ordering (but not a partial ordering) by x ≤ y if and only if y − x ∈ K:

$$\displaystyle \begin{aligned} K=\{(x,f_1,f_2,d)\in \mathbb{R}^{n_u+3}\mid (x=0\ \wedge f_2>0)\ \vee\ (f_1>0\ \wedge\ d\geq 0)\}\cup\{0\}. \end{aligned} $$
(15.4.3)
Theorem 15.4.1 ([22, Corollary 4.2])
Let F and K be as in (15.4.2) and (15.4.3). Let a point 
$$\bar x=(\bar x_u,\bar x_\ell )\in \mathbb {R}^{n_u}\times \mathbb {R}^{n_\ell }$$
be such that 
$$F(\bar x)$$
is nondominated w.r.t. K, i.e.

$$\displaystyle \begin{aligned}\left(\{F(\bar x)\}-K\setminus\{0\}\right)\cap \{F(x)\mid x\in \mathbb{R}^{n_u}\times \mathbb{R}^{n_\ell}\}=\emptyset,\end{aligned}$$

then 
$$\bar x$$
is an optimal solution of the bilevel optimization problem (15.4.1).△
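On a finite grid, the nondominance condition of Theorem 15.4.1 can be tested by brute force. The following sketch (our toy construction with n_u = n_ℓ = 1, f_u(x_u, x_ℓ) = x_u² + x_ℓ² and f_ℓ(x_u, x_ℓ) = (x_ℓ − x_u)²) recovers the unique bilevel optimal solution (0, 0):

```python
import numpy as np

def F(xu, xl):
    # the map (15.4.2) for n_u = 1: (x_u, f_u, f_l, ||grad_{x_l} f_l||_2)
    return np.array([xu, xu**2 + xl**2, (xl - xu)**2, abs(2 * (xl - xu))])

def in_K_minus_0(v, tol=1e-9):
    # membership in K \ {0} for the cone K from (15.4.3), n_u = 1
    x, f1, f2, d = v
    return (abs(x) <= tol and f2 > tol) or (f1 > tol and d >= -tol)

grid = np.linspace(-1.0, 1.0, 21)
vals = [(xu, xl, F(xu, xl)) for xu in grid for xl in grid]
nondom = [(xu, xl) for xu, xl, Fv in vals
          if not any(in_K_minus_0(Fv - Gv) for _, _, Gv in vals)]
print(nondom)   # [(0.0, 0.0)], the bilevel optimal solution
```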

The study of the relation between multiobjective and bilevel optimization problems is continued and unified in [41], as already mentioned above. The authors thereby also study multiobjective bilevel optimization problems and their connection to multiobjective optimization. A thorough exploration of the use of these relations in numerical approaches for bilevel problems is an open research topic.

15.5 A Multiobjective Bilevel Program

In this section, we introduce the multiobjective bilevel optimization problem. We formulate the general bilevel optimization problem (see (15.5.3)). Then we introduce the so-called optimistic approach (see (15.5.4)), which is a special case of the general bilevel optimization problem. Next, we show how the feasible set of such an optimistic bilevel problem can be described as the solution set of a specific multiobjective optimization problem. This will be used in Sect. 15.6 within a numerical method. Then, with an example, we illustrate again what might happen in case the lower level problem does not have a unique solution, now in the setting where there is only one objective function on the lower level but several on the upper level.

Moreover, we mention a typical approach from the literature, namely scalarization. We discuss what happens in case one considers weakly efficient points of the lower level problem instead of efficient ones, which is often done when using scalarizations. We end the section by introducing coupling constraints between the upper and the lower level.

As mentioned in the introduction, in bilevel optimization we have a parameter dependent optimization problem on the lower level, which depends on the variable 
$$y\in \mathbb {R}^{n_2}$$
of the upper level. In the case of multiple objectives, this lower level problem reads as

$$\displaystyle \begin{aligned} \begin{array}{c} \min\limits_{x\in \mathbb{R}^{n_1}} f(x,y)\\ \mbox{subject to the constraint}\\ (x,y)\in G \end{array} \end{aligned} $$
(15.5.1)
with a vector-valued function 
$$f\colon \mathbb {R}^{n_1}\times \mathbb {R}^{n_2}\to \mathbb {R}^{m_1}$$
and a set 
$$G\subseteq \mathbb {R}^{n_1}\times \mathbb {R}^{n_2}$$
(
$$n_1,n_2,m_1\in \mathbb {N}$$
). The constraint (x, y) ∈ G can be replaced by

$$\displaystyle \begin{aligned} x\in G(y):=\{x\in\mathbb{R}^{n_1}\mid (x,y)\in G\}. \end{aligned} $$
(15.5.2)
The variables 
$$x\in \mathbb {R}^{n_1}$$
of this lower level problem are called the lower level variables. For fixed 
$$y\in \mathbb {R}^{n_2}$$
, let x(y) be a minimal solution of (15.5.1), hence

$$\displaystyle \begin{aligned}x(y)\in \Psi(y):=\mbox{argmin}_x\{f(x,y)\mid (x,y)\in G\}\subset\mathbb{R}^{n_1}.\end{aligned}$$
For a multiobjective optimization problem on the lower level it has to be clarified what a minimal solution is, i.e. how the set Ψ(y) is defined. An appropriate concept from a practical point of view might be the set of efficient elements w.r.t. a partial ordering, while from a theoretical point of view the set of weakly efficient points might be more suitable.
The optimization problem of the upper level is then given by

$$\displaystyle \begin{aligned} \begin{array}{c} \mathrm{'}\min\limits_{y\in\mathbb{R}^{n_2}}\mathrm{'}\, F(x(y),y)\\ \mbox{subject to the constraints}\\ x(y)\in \Psi(y),\\ y\in \tilde G\end{array} \end{aligned} $$
(15.5.3)
with a vector-valued function 
$$F\colon \mathbb {R}^{n_1}\times \mathbb {R}^{n_2}\to \mathbb {R}^{m_2}$$
, 
$$m_2\in \mathbb {N}$$
, and a compact set 
$$\tilde G\subset \mathbb {R}^{n_2}$$
. Here, the constraint 
$$y\in \tilde G$$
is uncoupled from the lower level variable. An even more general formulation of the bilevel problem is obtained by allowing the constraint 
$$y\in \tilde G(x)$$
to depend on the lower level variable x, which we will discuss later. We speak of a multiobjective bilevel optimization problem if m_1 ≥ 2 or m_2 ≥ 2.

If the minimal solution of the lower level problem (15.5.1) is not unique, i. e. if the set Ψ(y) consists of more than one point, the objective function F(x(⋅), ⋅) is not well-defined for 
$$y\in \mathbb {R}^{n_2}$$
. That is the reason why the word 
$$\mathrm{'}\min \mathrm{'}$$
is written in quotes in (15.5.3). In some works this difficulty is avoided by simply assuming that the solution of the lower level problem is unique. But in the case of a multiobjective optimization problem (m_1 ≥ 2) on the lower level this assumption is in general no longer reasonable.

Hence, a possible approach is to study the set-valued map

$$\displaystyle \begin{aligned}y\mapsto \{F(x,y)\mid x \in \Psi(y)\}.\end{aligned}$$
In the case of non-uniqueness, another common approach is the optimistic approach. There it is assumed that the decision maker of the lower level chooses, among all minimal solutions, the one which is best for the upper level, i. e., which is minimal for the objective function of the upper level. Hence, one solves

$$\displaystyle \begin{aligned} \begin{array}{c} \min\limits_{x,y} F(x,y)\\ \mbox{subject to the constraints} \\ x\in\Psi(y),\\ y\in\tilde G. \end{array} \end{aligned} $$
(15.5.4)

Thus, in the optimistic modification (15.5.4), the objective function of the upper level is minimized w. r. t. x and y, while in the general formulation (15.5.3) it is only minimized w. r. t. the upper level variable y. In the following, we consider the optimistic approach only, i. e. the bilevel optimization problem as in (15.5.4).

When assuming this optimistic approach, one assumes at the same time that the decision maker on the lower level allows the decision maker on the upper level to utilize any solution from the set of efficient points of the lower level, which might not be realistic. See also the discussion in [45, 47].

In [45], the case is also studied in which the lower level decision maker has sufficient power to make a deterministic choice from the set of efficient points of the lower level. This is formulated by using a value function on the objective functions of the lower level, which needs to be known in this setting. By using a value function, the lower level problem is in effect reduced to a single-objective problem. To overcome the difficulty of determining the lower level value function, Sinha and co-authors suggest in [46] to handle uncertainties in the lower level value function.

For the multiobjective optimization problem on the upper level we assume that the partial ordering is given by the closed pointed convex cone 
$$K^2\subseteq \mathbb {R}^{m_2}$$
and for the lower level by the closed pointed convex cone 
$$K^1\subseteq \mathbb {R}^{m_1}$$
. For the case m_1 ≥ 2 the set Ψ(y) is the solution set of a multiobjective optimization problem w. r. t. the ordering cone K^1. Thus, using (15.5.2), we write instead of x ∈ Ψ(y)

$$\displaystyle \begin{aligned} x\in\mathcal{E}(f(G(y),y),K^1)=:\mathcal{E}_y(f(G),K^1), \end{aligned} $$
(15.5.5)
with 
$$\mathcal {E}(f(G(y),y),K^1)$$
the set of K^1-minimal points of the multiobjective optimization problem (15.5.1) parameterized by y. Hence we study

$$\displaystyle \begin{aligned} \begin{array}{c} \min\limits_{x,y} F(x,y)\\ \mbox{subject to the constraints} \\ x\in \mathcal{E}_y(f(G),K^1),\\ y\in \tilde G.\end{array} \end{aligned} $$
(15.5.6)
A helpful result, which can be used in numerical approaches, is that the set of feasible points of the upper level problem of the considered multiobjective bilevel optimization problem can be expressed as the set of efficient points of a multiobjective optimization problem. We will use this in Sect. 15.6. The set of feasible points Ω of the upper level problem in (15.5.6), also called the induced set, is given by

$$\displaystyle \begin{aligned}\Omega=\{(x,y)\in\mathbb{R}^{n_1}\times \mathbb{R}^{n_2}\mid x\in \mathcal{E}_y(f(G),K^1),\ y\in \tilde G\}.\end{aligned}$$
We show that the set Ω equals the set of 
$$\hat K$$
-minimal points of the multiobjective optimization problem

$$\displaystyle \begin{aligned} \begin{array}{c} \min\limits_{x,y} \hat f(x,y):=\left( \begin{array}{c} f(x,y)\\ y\end{array} \right)\\ \mbox{subject to the constraints} \\ (x,y)\in G,\\ y\in \tilde G\end{array} \end{aligned} $$
(15.5.7)
w. r. t. the ordering cone

$$\displaystyle \begin{aligned}\hat{K}:=K^1\times \{0\}\subset\mathbb{R}^{m_1}\times \mathbb{R}^{n_2}.\end{aligned}$$
A point 
$$(\bar x,\bar y)$$
is thus an element of the set of feasible points Ω if and only if it is a 
$$\hat K$$
-minimal point of the multiobjective optimization problem (15.5.7). This is shown in the following theorem. There is also a close relation to the reformulation using the function F in (15.4.2), see [41].
Theorem 15.5.1

Let 
$$\hat {\mathcal {E}}$$
be the set of 
$$\hat K$$
-minimal points of the multiobjective optimization problem (15.5.7) with 
$$\hat {K}=K^1\times \{0\}$$
. Then it holds 
$$\Omega =\hat {\mathcal {E}}$$
.△

Proof
We have

$$\displaystyle \begin{aligned}\begin{array}{rl} &(\bar x,\bar y)\in\Omega\\ \Leftrightarrow & \bar x\in\mathcal{E}_{\bar y}(f(G),K^1) \quad  \wedge \quad \bar y\in\tilde G\\ \Leftrightarrow & \left( \not\exists \quad  x\in G(\bar y) \mbox{ with } f(\bar x,\bar y)\in f(x,\bar y)+K^1\setminus\{0\}\right)\\ &\wedge \quad \bar y\in\tilde G \quad \wedge \quad \bar x\in G(\bar y)\\ \Leftrightarrow & \left( \not\exists \quad  (x,y)\in G \mbox{ with } f(\bar x,\bar y) \in f(x,y)+K^1\setminus\{0\} \ \wedge\quad  y=\bar y\right) \\& \wedge\quad  \bar y\in\tilde G \quad \wedge \quad (\bar x,\bar y)\in G\\ \Leftrightarrow & \left( \not\exists \quad (x,y)\in G\mbox{ with }\right.\\ & \left.\left(\begin{array}{c}f(\bar x,\bar y)\\\bar y \end{array}\right)\in \left(\begin{array}{c}f(x,y)\\y\end{array}\right)+ (K^1\times\{0\})\setminus\{0\}\right) \\&\wedge\quad  \bar y\in\tilde G\quad \wedge \quad (\bar x,\bar y)\in G\\ \Leftrightarrow & \left( \not\exists (x,y)\in G \mbox{ with } \hat f(\bar x,\bar y)\in \hat f(x,y)+\hat K\setminus\{0\}\right) \\&\wedge\quad  \bar y\in\tilde G\quad \wedge\quad  (\bar x,\bar y)\in G\\ \Leftrightarrow & (\bar x,\bar y)\in\hat{\mathcal{E}}. \end{array} \end{aligned}$$
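On a finite sample, Theorem 15.5.1 turns the computation of the induced set Ω into a dominance filter w.r.t. K̂: since the y-component of K̂ = K¹ × {0} is {0}, points are only comparable within slices of equal y. A toy sketch (data ours; for the chosen lower level the efficient x for fixed y form the interval [0, y]):

```python
import numpy as np

def induced_set(samples, f):
    # Khat-minimal points of (15.5.7) for K^1 = R^2_+ on a finite sample:
    # (x2, y2) can dominate (x, y) only if y2 == y and f(x2, y2) <= f(x, y)
    vals = [(x, y, f(x, y)) for x, y in samples]
    return [(x, y) for x, y, fx in vals
            if not any(y2 == y and np.all(f2 <= fx) and np.any(f2 < fx)
                       for _, y2, f2 in vals)]

f = lambda x, y: np.array([x, (x - y)**2])        # toy lower level objectives
samples = [(x, y) for x in np.linspace(0, 1, 21) for y in np.linspace(0, 1, 5)]
Omega = induced_set(samples, f)
for y in np.linspace(0, 1, 5):   # per y-slice: the efficient x lie in [0, y]
    print(y, sorted(x for x, yy in Omega if yy == y))
```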

A multiobjective bilevel optimization problem is quite challenging in case there are multiple objectives on each level, as just discussed. But already when there are multiple objectives only on the upper level, such semivectorial bilevel problems become quite difficult. This is even more the case when the lower level problem does not have unique optimal solutions. We illustrate this with an example taken from [39, Ex. 4.14]. This example also gives insight into the optimistic approach.

Example on Semivectorial Bilevel Problem
We study a bilevel optimization problem as in (15.5.3) with the lower level problem

$$\displaystyle \begin{aligned}\min_{x\in\mathbb{R}} \{ \max\{ 0,-xy \}\mid x\leq 1 \}.\end{aligned} $$
Then we have for y ∈ [0, 1]

$$\displaystyle \begin{aligned}\Psi(y)= \left\{ \begin{array}{ll} ]-\infty, 1]& \mbox{ if }\ y=0,\\ {}  [0,1]& \mbox{ if }\ y\in]0,1]. \end{array}\right. \end{aligned}$$
For the upper level problem we are now interested in the problem

$$\displaystyle \begin{aligned}\begin{array}{c} \mathrm{'}\min\limits_{y\in\mathbb{R}}\mathrm{'}\, F(x(y),y)\\ \mbox{subject to the constraints}\\ x(y)\in \Psi(y),\\ y\in [0,1] \end{array} \end{aligned}$$
with

$$\displaystyle \begin{aligned}F(x,y)=\left( \begin{array}{c} x-y-1 \\ y-\max\{0,x\} \\ \end{array} \right). \end{aligned}$$
As the lower level solutions are not unique, we can associate to each y ∈ [0, 1] the set

$$\displaystyle \begin{aligned} \tilde F(y):=\bigcup_{x\in \Psi(y)}\{F(x,y)\} \end{aligned}$$
where

$$\displaystyle \begin{aligned}\tilde F(y)= \left\{ \begin{array}{ll} \mbox{conv}(\{(-1,0),(0,-1)\})\cup \{(u,0)\mid u\in ]-\infty,-1]\}& \mbox{ if }\ y=0,\\ {}  \mbox{conv}(\{(-y-1,y),(-y,y-1)\})& \mbox{ if }\ y\in]0,1]. \end{array}\right. \end{aligned}$$
Then 
$$(\hat x,\hat y)=(\hat x,1)$$
with 
$$\hat x\in \Psi (1)=[0,1]$$
and 
$$F(\hat x,\hat y)=(\hat x-2,1-\hat x)$$
is not an optimal solution of the optimistic formulation

$$\displaystyle \begin{aligned}\begin{array}{c} \min\limits_{x,y} F(x(y),y)\\ \mbox{subject to the constraints}\\ x(y)\in \Psi(y),\\ y\in [0,1]. \end{array} \end{aligned}$$
This can be seen by considering (x, y) = (x, 0) with x ∈ Ψ(0) ∩ ]−∞, −1]. These points have smaller objective function values: F(x, 0) = (x − 1, 0). By comparing just the nondominated elements of the sets 
$$\tilde F(y)$$
, i.e. by solving a set optimization problem, as proposed in [39, p. 60], there would also be solutions with 
$$\hat y=1$$
. This is a hint that studying bilevel problems with non-unique lower-level-optimal solutions by using techniques from set optimization might lead to other solutions, which might also be of interest.
Next, we will have a look at a typical approach to multiobjective bilevel problems, which is to scalarize the lower level problem. Thereby, one often considers weakly efficient points of the lower level problem instead of efficient ones. We will also discuss the impact of this. For 
$$K^1=\mathbb {R}^{m_1}_+$$
, Lv and Wan use in [32, 33] a weighted-sum approach, as discussed in Sect. 15.3, to characterize the optimal solutions of the lower level in case the lower level problem is convex and some regularity assumptions are satisfied. For the theoretical verification see Theorem 15.3.1. Then, instead of (15.5.6), they study

$$\displaystyle \begin{aligned}\begin{array}{c} \min\limits_{x,y,w} F(x,y)\\ \mbox{subject to the constraints} \\ y\in \tilde G,\\ \sum_{i=1}^{m_1} w_i=1,\\ w\in\mathbb{R}^{m_1}_+,\\ x\in\mbox{argmin}\left\{ w^\top f(x,y)\mid (x,y)\in G \right\}. \end{array} \end{aligned}$$
Hence, the weighting vector w, which is here the scalarization parameter, is interpreted as a new upper level variable, and thus the original semivectorial bilevel programming problem is transformed into a standard bilevel programming problem (cf. [11, Theorem 3.1]). Thereby it is assumed that one is interested in weakly efficient solutions of the lower level, and not only in efficient solutions. While this approach is sound in view of globally optimal solutions, one has to take care in case one uses locally optimal solutions on the lower level: Dempe and Mehlitz show in [11] that the reformulation using the weighted-sum scalarization and the additional variable w might have locally optimal solutions which are not locally optimal for the original problem.
In a second step, in [32, 33], the constraint

$$\displaystyle \begin{aligned}x\in\mbox{argmin}\left\{ w^\top f(x,y)\mid (x,y)\in G \right\}\end{aligned}$$
is replaced by the KKT-conditions of this scalar-valued optimization problem.
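A schematic sketch of this reformulation on a toy problem (entirely our construction, not the problems from [32, 33]): the weight w is treated as an additional upper level variable, the scalarized lower level is solved, here even in closed form, and the upper level objective is minimized over a grid of (w, y).

```python
import numpy as np

# toy semivectorial problem: lower level f(x, y) = ((x - y)^2, x^2), x in R,
# upper level F(x, y) = (x - 1)^2 + y^2, y in [0, 1]
def lower_argmin(w1, y):
    # argmin_x of w1*(x - y)^2 + (1 - w1)*x^2 is x = w1*y  (first-order condition)
    return w1 * y

best = None
for w1 in np.linspace(0.0, 1.0, 101):      # weight simplex: w = (w1, 1 - w1)
    for y in np.linspace(0.0, 1.0, 101):
        x = lower_argmin(w1, y)
        val = (x - 1.0)**2 + y**2          # upper level objective
        if best is None or val < best[0]:
            best = (val, x, y, w1)
print(best)   # approx. (0.5, 0.5, 0.5, 1.0): optimum at x = y = 1/2 with w1 = 1
```

Note that the optimal weight lies on the boundary of the simplex, i.e. it corresponds to a weakly efficient lower level solution, in line with the discussion above.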

As done in these reformulations, many authors take the weakly efficient elements instead of the efficient ones, i.e. they replace the condition 
$$x\in \mathcal {E}(f(G(y),y),K^1)$$
from the lower level problem by 
$$x\in \mathcal {E}^w(f(G(y),y),K^1)$$
. The next example shows that this can have a strong impact. For this example we use ideas from [11].

Example on Weakly Efficient and Efficient Solutions
We study the optimistic semivectorial bilevel optimization problem

$$\displaystyle \begin{aligned}\begin{array}{c} \min\limits_{x,y} x+2y\\ \mbox{subject to the constraints} \\ x\in\Psi(y)=\mbox{argmin} \left\{ (xy, 1-x)\mid x\in[0,1] \right\},\\ y\in [0,1]. \end{array} \end{aligned}$$
Then we have

$$\displaystyle \begin{aligned}\Psi(y)=\left\{ \begin{array}{ll} \{1\}&\mbox{ if }y=0,\\ {[} 0, 1 {]}&\mbox{ if }y\in ( 0 , 1 ]. \end{array} \right. \end{aligned}$$
The bilevel problem has no optimal solution: for y = 0 the objective function value is 1, and for y > 0 it is 2y. In case one takes the set of weakly efficient solutions of the lower level, which we denote by Ψ^w(y), one gets

$$\displaystyle \begin{aligned}\Psi^w(y)=[0,1]\ \mbox{ for }\ y\in[0,1]\end{aligned}$$
and as optimal value of the bilevel problem we obtain 0. The minimal solution is then (x, y) = (0, 0). This shows that it can make a huge difference whether weakly efficient or efficient solutions of the lower level problem are taken into account. If the efficient solutions are taken, the bilevel problem is not even solvable.

Instead of enlarging the feasible set by taking the weakly efficient solutions instead of the efficient ones, one could also use the so-called properly efficient solutions, which form a subset of the efficient solutions. These can be completely characterized by linear scalarizations, at least in the convex case, by using elements w ∈ K^#. Such an approach was discussed by Gebhardt and Jahn in [25]. The advantage is that the optimal values obtained for the bilevel problem are then upper bounds on the optimal values of the original problem, and, what is more, the obtained optimal solutions are in fact feasible for the original problem, as any properly efficient solution is also efficient.

Finally, we would like to draw attention to additional difficulties which can arise in case of constraints on the upper level which couple the upper and the lower level variables. Thus, in the remainder of this section, we study, instead of problem (15.5.6), the more general formulation

$$\displaystyle \begin{aligned} \begin{array}{c} \min\limits_{x,y} F(x,y)\\ \mbox{subject to the constraints} \\ x\in \mathcal{E}_y(f(G),K^1),\\ (x,y)\in \tilde G\end{array} \end{aligned} $$
(15.5.8)
with the constraint set 
$$\tilde G\subset \mathbb {R}^{n_1}\times \mathbb {R}^{n_2}$$
. This results in a coupling of the upper level variable y and the lower level variable x. Then

$$\displaystyle \begin{aligned}\Omega':=\{(x,y)\in\mathbb{R}^{n_1}\times \mathbb{R}^{n_2}\mid x\in \mathcal{E}_y(f(G),K^1),\ (x,y)\in \tilde G\}\end{aligned}$$
denotes the induced set of the problem (15.5.8).

First notice that the constraint 
$$(x,y)\in \tilde G$$
from the upper level cannot simply be moved to the lower level, as simple examples show, see for instance [14, Example 7.4]. This also has a practical interpretation: the same constraint has a different meaning on the lower level, where it restricts the constraint set before the minimal solutions are determined, than on the upper level, where it restricts feasibility only after the determination of the minimal solutions of the lower level and is thus an implicit constraint. For more details see [8, pp. 25].

A generalization of the results of Theorem 15.5.1 is not possible, as discussed in [14, Chapter 7]. We illustrate this with the following example.

Example on Coupled Constraints
We consider the biobjective bilevel optimization problem

$$\displaystyle \begin{aligned}\begin{array}{c} \min\limits_{y\in\mathbb{R}} F(x,y)= \left(\begin{array}{c}x_1-y \\x_2\end{array}\right) \\ {}  \mbox{subject to the constraints} \\[1.5ex] x=x(y)\in \mbox{argmin}_{x\in\mathbb{R}^2}\bigg\{ f(x,y)= \left( \begin{array}{c} x_1\\x_2 \end{array} \right) \bigg|\ (x,y)\in G\bigg\},\\[2.5ex] (x,y)\in\tilde G \end{array} \end{aligned}$$
with 
$$n_1=2,\ n_2=1,\ K^1=K^2=\mathbb {R}^2_+$$
m_1 = m_2 = 2,

$$\displaystyle \begin{aligned}G:=\{(x,y)\in\mathbb{R}^3\mid \|x\|{}_2^2\leq y^2\}\end{aligned}$$
and

$$\displaystyle \begin{aligned}\tilde G:=\{(x,y)\in\mathbb{R}^3\mid 0\leq y\leq 1,\ x_1+x_2\geq -1\}.\end{aligned}$$
Then

$$\displaystyle \begin{aligned}\mathcal{E}_y(f(G),K^1)=\{x=(x_1,x_2)\in\mathbb{R}^2\mid \|x\|{}^2_2=y^2,\ x_1\leq 0,\ x_2\leq 0\}\end{aligned}$$
and thus we obtain

$$\displaystyle \begin{aligned}\Omega'=\{(x,y)\in\mathbb{R}^3\mid \|x\|{}^2_2=y^2,\ x_1\leq 0,\ x_2\leq 0,\ 0\leq y\leq 1,\ x_1+x_2\geq -1\}.\end{aligned}$$
Let 
$$\hat {\mathcal {E}}'$$
be the set of (
$$\mathbb {R}^2_+\times \{0\}$$
)-minimal points of the tricriteria optimization problem

$$\displaystyle \begin{aligned}\begin{array}{c} \min\limits_{x,y} \left(\begin{array}{c}x_1 \\x_2\\y\end{array}\right) \\ {}  \mbox{subject to the constraints} \\[1.5ex] (x,y)\in G\cap \tilde G=\{(x,y)\in\mathbb{R}^3\mid 0\leq y\leq 1,\ x_1+x_2\geq -1,\ \|x\|{}_2^2\leq y^2\} \end{array} \end{aligned}$$
similar to (15.5.7). While it holds 
$$\Omega '\subset \hat {\mathcal {E}}'$$
(cf. [14, Theorem 7.5]) we show that it does not hold 
$$\hat {\mathcal {E}}'\subset \Omega '$$
.
For 
$$(\bar x,\bar y)=(-\frac {1}{2},-\frac {1}{2},1)^\top $$
we have 
$$(\bar x,\bar y)\in \hat {\mathcal {E}}'$$
because 
$$(\bar x,\bar y)\in G\cap \tilde G$$
and there is no 
$$(x,y)\in G\cap \tilde G$$
with

$$\displaystyle \begin{aligned}\begin{array}{rl} & \left(\begin{array}{c} \bar x_1 \\ \bar x_2 \\ \bar y \end{array} \right)\in \left\{\left( \begin{array}{c} x_1 \\ x_2 \\ y \\ \end{array} \right)\right\}+(\mathbb{R}^2_+\times\{0\})\setminus\{0\}\\ \\[0.5ex] \Leftrightarrow &\left( \begin{array}{c} \bar x_1 \\ \bar x_2 \\ \end{array} \right)\in \left\{\left( \begin{array}{c} x_1 \\ x_2 \\ \end{array} \right)\right\}+\mathbb{R}^2_+\setminus\{0\}\ \wedge\ y=\bar y. \end{array} \end{aligned}$$
The set 
$$\{x\in \mathbb {R}^2\mid (x,y)\in G\cap \tilde G,\ y=1\}$$
is illustrated in Fig. 15.1. However, 
$$(\bar x,\bar y)\not \in \Omega '$$
because 
$$\|\bar x\|{ }^2_2=\frac {1}{2}\neq \bar y^2$$
.
Fig. 15.1 The set 
$$\{x\in  \mathbb {R}^2\mid (x,y)\in G\cap \tilde G,\ y=1\}$$
of Example 15.5, cf. [14]

In this example, as the induced set can be determined explicitly, the minimal solution set can be calculated by solving the biobjective optimization problem min_{x,y}{F(x, y) ∣ (x, y) ∈ Ω′}. We obtain as solution set of the multiobjective bilevel optimization problem:

$$\displaystyle \begin{aligned}\mathcal{S}^{\min}=\bigg\{ (x_1,x_2,y)\ \bigg|\ x_1=-1-x_2,\ x_2=-\frac{1}{2}\pm\frac{1}{4}\sqrt{8y^2-4}, y\in\left[\frac{\sqrt{2}}{2}\,,\,1\right] \bigg\}.\end{aligned}$$
The image 
$$\left \{F(x_1,x_2,y)\mid (x_1,x_2,y)\in \mathcal {S}^{\min }\right \}$$
of this set is

$$\displaystyle \begin{aligned}\bigg\{(z_1,z_2)\in\mathbb{R}^2\,\bigg|\, z_1=-1-z_2-y,\ z_2=-\frac{1}{2}\pm\frac{1}{4}\sqrt{8y^2-4}, y\in\left[\frac{\sqrt{2}}{2}\,,\,1\right]\bigg\}.\end{aligned}$$
These sets are shown in Fig. 15.2.
Fig. 15.2 Solution set 
$$\mathcal {S}^{ \min }$$
of the biobjective bilevel optimization problem of Example 15.5 and the image 
$$F(\mathcal {S}^{ \min })$$

15.6 Numerical Methods

As stated in the introduction, we concentrate in this chapter on nonlinear problems. Procedures for solving linear multiobjective bilevel problems are presented e. g. in [36].

Far fewer papers deal with nonlinear multiobjective bilevel problems: Shi and Xia [43, 44] present an interactive method, also using a scalarization (the ε-constraint scalarization, cf. [47]). Osman, Abo-Sinna et al. [1, 37] propose the use of fuzzy set theory for convex problems. Teng et al. [49] give an approach for a convex multiperson multiobjective bilevel problem.

Bonnel and Morgan consider in [2] a semivectorial bilevel optimization problem and propose a solution method based on a penalty approach. However, no numerical results are given.

More recently, Theorem 15.5.1 was used in [14, 17] for developing a numerical algorithm. There, first the efficient set of the multiobjective optimization problem (15.5.7) is calculated (or at least approximated). According to Theorem 15.5.1, this efficient set equals the feasible set of the upper level problem of the original bilevel problem. As the efficient set of (15.5.7) is approximated by a finite set of points, these points can be evaluated and compared easily by means of the upper level functions. The approximation of the efficient set of (15.5.7) is then iteratively refined around points which have shown promising objective function values w.r.t. the upper level functions.

Note that in general it is not possible to determine the whole solution set of problem (15.5.7); we can only calculate a representation or approximation of this set. In [14, 17] care is taken that this approximation is generated with a high quality in the pre-image space. Then, in a second step, based on sensitivity information, this approximation is refined depending on the behavior of the upper level function. For determining single efficient points of problem (15.5.7) the scalarization according to Pascoletti and Serafini, see Theorem 15.3.2, is used. Thereby, one can also make use of the result given in Lemma 15.2.2. The approach is suitable only for low dimensions of the upper level variable. Summing up, the algorithm consists of the following steps (a toy Python sketch follows the list):
  1. Calculate an approximation of the efficient set of (15.5.7) w.r.t. the partial ordering defined by the cone 
$$\hat K$$
. This is done by determining equidistant representations of the efficient sets of the multiobjective problems (15.5.1) for various discretization points y in the set 
$$\tilde G$$
.

  2. Select all nondominated points of the lower level problems which are also nondominated for the upper level problem.

  3. Refine the approximation of the efficient set of the original problem by refining the approximation of the set of feasible points of the upper level problem.
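A compact toy version of these three steps (our construction; a sampled lower level replaces the Pascoletti-Serafini subproblems of [14, 17], and only one refinement round is indicated):

```python
import numpy as np

def pareto_idx(vals):
    # indices of the nondominated elements w.r.t. R^2_+
    return [i for i in range(len(vals))
            if not any(np.all(v <= vals[i]) and np.any(v < vals[i])
                       for v in vals)]

# toy data (ours): lower level f(x, y) = (x, (x - y)^2) for x in [0, 1],
# upper level F(x, y) = (x + y, (x - 1)^2 + (y - 1)^2)
f = lambda x, y: np.array([x, (x - y)**2])
F = lambda x, y: np.array([x + y, (x - 1)**2 + (y - 1)**2])

def lower_efficient(y, xs):
    # step 1 ingredient: efficient points of the lower level for fixed y
    return [xs[i] for i in pareto_idx([f(x, y) for x in xs])]

xs = np.linspace(0.0, 1.0, 41)
# step 1: approximate the induced set over a coarse y-grid
cand = [(x, y) for y in np.linspace(0.0, 1.0, 11) for x in lower_efficient(y, xs)]
# step 2: keep the candidates that are nondominated for the upper level
cand = [cand[i] for i in pareto_idx([F(x, y) for x, y in cand])]
# step 3: refine the y-discretization around the promising candidates
fine_ys = sorted({yy for y0 in {y for _, y in cand}
                  for yy in np.linspace(max(0, y0 - 0.05), min(1, y0 + 0.05), 5)})
cand = [(x, y) for y in fine_ys for x in lower_efficient(y, xs)]
cand = [cand[i] for i in pareto_idx([F(x, y) for x, y in cand])]
print(cand[:10])
```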
We give the numerical results of this algorithm for one example, cf. [14, Ch. 7].
Numerical Example
We consider the following multiobjective bilevel problem (assuming the optimistic approach) with n_1 = 2, n_2 = 1, m_1 = m_2 = 2 and 
$$K^1=K^2=\mathbb {R}^2_+$$
.

$$\displaystyle \begin{aligned}\begin{array}{c} \min\limits_{x,y} \left(\begin{array}{c} F_1(x,y)\\ F_2(x,y) \end{array}\right)= \left(\begin{array}{c} x_1+x_2^2+y+\sin^2(x_1+y)\\ \cos{}(x_2)\cdot(0.1+y)\cdot\left(\exp(-\frac{\displaystyle x_1}{\displaystyle 0.1+x_2})\right) \end{array}\right)\\ \mbox{subject to the constraints} \\ x\in\mbox{argmin}_{x\in\mathbb{R}^2}\left\{ \left(\begin{array}{c}f_1(x,y)\\f_2(x,y)\end{array}\right) \mid (x,y)\in G\right\},\\ y\in[0,10] \end{array} \end{aligned}$$
with 
$$f_1,\ f_2\colon \mathbb {R}^3\to \mathbb {R}$$
,

$$\displaystyle \begin{aligned}\begin{array}{rcl} f_1(x,y)&=&\frac{\displaystyle (x_1-2)^2+(x_2-1)^2}{\displaystyle 4}+ \frac{\displaystyle x_2y+(5-y)^2}{\displaystyle 16}+\sin\left(\frac{\displaystyle x_2}{\displaystyle 10}\right), \\[1.5ex] f_2(x,y)&=&\frac{\displaystyle x_1^2+(x_2-6)^4-2x_1y-(5-y)^2}{\displaystyle 80} \end{array}\end{aligned}$$
and

$$\displaystyle \begin{aligned}G=\{(x_1,x_2,y)\in\mathbb{R}^3\mid x_1^2-x_2\leq 0,\ 5x_1^2+x_2\leq 10, \ x_2-(5-y/6)\leq 0,\ x_1\geq 0\}. \end{aligned}$$
We apply the algorithm from [14, 17] with distance β = 0.6 for discretizing the interval 
$$\tilde G=[0,10]$$
, i.e. we choose y ∈{0, 0.6, 1.2, 1.8, …, 9.6, 10}. For example, for y = 1.8 one obtains the representation of the efficient set of problem (15.5.1) as shown in Fig. 15.3a. In the image space of the lower level problem this representation is shown in Fig. 15.3b.
Fig. 15.3 Representation (a) of the efficient set and (b) of the nondominated set of problem (15.5.1) for y = 1.8

The representation A^0 of the efficient sets of (15.5.1) for all discretization points y of the interval [0, 10] is shown in Fig. 15.4a. The set A^0 is at the same time an approximation of the solution set of problem (15.5.7) and hence of the set of feasible points of the upper level problem, see Theorem 15.5.1. In Fig. 15.4b the set F(A^0) is given. Here the image points under F of the set 
$$\mathcal {E}(F(A^0),\mathbb {R}^2_+)$$
are marked with circles and are connected with lines.
Fig. 15.4 (a) Approximation A^0 of the set of feasible points Ω and (b) the image F(A^0) of this set under F

In several iterations the discretization of the feasible set of the upper level problem is refined around the efficient points, which results in the representation A^f, see Fig. 15.5. The algorithm is then stopped, as the last iteration yielded only small improvements in the approximation of the efficient set of the multiobjective bilevel optimization problem.
Fig. 15.5 (a) Final approximation A^f of the set of feasible points Ω and (b) the image F(A^f) of this set under F

Later, another deterministic numerical approach for nonlinear problems was proposed by Gebhardt and Jahn in [25]. They propose to use a multiobjective search algorithm with a subdivision technique. The studied problems have multiple objectives on both levels. The authors also discuss optimality conditions for replacing the multiobjective problem on the lower level. Instead of taking the weakly efficient solutions of the lower level, they suggest to take the properly efficient solutions. By doing so, the feasible set of the upper level gets smaller (instead of larger, as when using weakly efficient elements), and thus one obtains upper bounds as well as minimal solutions which are in fact feasible for the original problem. However, the authors do not use this for their numerical approach. Instead, the main ingredient of their algorithm is a subdivision technique for multiobjective problems which uses randomly generated points in boxes whose size decreases as the algorithm proceeds. It is assumed that the feasible set of the upper level variable 
$$\tilde G$$
is such that one can easily work with a discretization of it, for instance an interval. The upper level variables are then fixed to the values coming from this discretization, and the lower level problems are solved with an algorithm from the literature which uses subdivision techniques. While the procedure is appropriate only for a small number of variables, it is capable of detecting globally optimal solutions even for highly nonlinear problems.

The construction of numerical test instances is also an important aspect. Test instances which are scalable in the number of variables and the number of objectives, and for which the set of optimal solutions is known, are important for evaluating and comparing numerical approaches. Deb and Sinha propose five such test problems in [5]. For these test instances the optimistic approach is chosen.

There have also been some numerical approaches for solving bilevel multiobjective optimization problems by using evolutionary algorithms. An attempt is, for instance, given by Deb and Sinha in [6]. A hybrid method combining an evolutionary algorithm with a local solver was proposed by the same authors in [7].