Multi-objective Optimisation

Modified

March 28, 2025

Introduction

Consider the following optimisation problem:

\[ \begin{aligned} & \underset{x}{\text{minimize}} & & length(x) = 100 x,~weight(x) = 100 x^2 \\ & \text{subject to} & & 0 \leq x \leq 1 \\ \end{aligned} \]

Here we have two objective functions. These two objective functions are not conflicting with each other as can be seen in Figure 1.

Code
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 20)
length = 100*x
weight = 100*x**2

# plt.rcParams.update({"font.size":17})
plt.figure(figsize=(6,6))
# plt.scatter(length, weight)
plt.plot(length, weight,'-b')
plt.xlabel("length")
plt.ylabel("weight")
plt.title("Non-conflicting objective functions")
plt.grid(True)
plt.show()

Figure 1

Hence, this will not lead to an interesting MOOP. So lets consider the following optimisation problem:

\[\begin{aligned} & \underset{x}{\text{minimize}} & & weight(x) = 100 x^2,~drag(x) = (x-2)^2 \\ & \text{subject to} & & 0 \leq x \leq 1 \\ \end{aligned} \]

Code
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 20)
weight = 100*x**2
drag = (x-2)**2

plt.figure(figsize=(6,6))
# plt.scatter(weight, drag)
plt.plot(weight, drag,'-b')
plt.xlabel("weight")
plt.ylabel("drag")
plt.title("Conflicting objective functions")
plt.grid(True)
plt.show()

Figure 2

Clearly, the two objective functions are conflicting with each other. Hence, this will lead to an interesting MOOP. We cannot minimise both the objective functions simultaneously. We have to make a trade-off between the two objective functions. This trade-off is called the Pareto optimal trade-off. The set of all Pareto optimal trade-offs is called the Pareto front.

Definitions

Design space

Design space is defined as the space spanned by the design variables. In this case, the design space is one dimensional. The design space is also called the parameter space. The design space is denoted by \(\mathbb{R}^n\) where \(n\) is the number of design variables. In this case, \(n = 1\).

Feasible design space

The feasible design space is the subset of the design space that satisfies all the constraints. In this case, the feasible design space is \([0,1]\). The feasible design space is denoted by \(S\). Clearly, the feasible design space is always a subset of the design space. The feasible design space is also called the search space.

Infeasible design space

The infeasible design space is the subset of the design space that does not satisfy all the constraints. In this case, the infeasible design space is \((-\infty,0) \cup (1,\infty)\). The infeasible design space is denoted by \(S^c\). Clearly, the infeasible design space is always a subset of the design space and \(S \cup S^c = \mathbb{R}^n\).

Objective space

Objective space is defined as the space spanned by the objective functions. In this case, the objective space is two dimensional. The objective space is also called the image space. The objective space is denoted by \(\mathbb{R}^m\) where \(m\) is the number of objective functions. In this case, \(m = 2\).

Feasible objective space

The feasible objective space is the subset of the objective space that satisfies all the constraints. In this case, the feasible objective space is \([0,\infty) \times [0,\infty)\). The feasible objective space is denoted by \(Z\).

Infeasible objective space

The infeasible objective space is the subset of the objective space that does not satisfy all the constraints. In this case, the infeasible objective space is \((-\infty,0) \times (-\infty,0)\). The infeasible objective space is denoted by \(Z^c\). Clearly, the infeasible objective space is always a subset of the objective space and \(Z \cup Z^c = \mathbb{R}^m\).

Feasible objective vector

The feasible objective vector is the vector of objective function values for a feasible design point. In this case, the feasible objective vector is \([0,\infty) \times [0,\infty)\). The feasible objective vector is denoted by \(z\).

Ideal objective vector (\(z^*\))

The ideal objective vector is the vector of objective function values for the best feasible design point. In this case, the ideal objective vector is \([0,0]\).

Utopian objective vector (\(z^{**}\))

The utopian objective vector is the vector of objective function values for the best possible design point. In this case, the utopian objective vector is \([0,0]\).

Nadir objective vector (\(z^{nad}\))

The nadir objective vector is the vector of objective function values for the worst feasible design point. In this case, the nadir objective vector is \([1,1]\).

Normalised objective function

\[f_i^{norm} = \frac{f_i - z^*_i}{z_i^{nad} - z_i^{*}}\]

A design point \(x^{(1)}\) is said to dominate \(x^{(2)}\), if both conditions are true.

  1. \(x^{(1)}\) is no worse than \(x^{(2)}\) in all objectives \(f_i,\ \ i \in [1,\ldots,m]\)

  2. \(x^{(1)}\) is strictly better than \(x^{(2)}\) in at least one objective \(f_i\)

Pareto Optimal Point

A point \(x^*\) in the feasible design space \(S\) is called Pareto optimal if there is no other point \(x\) in the set \(S\) that reduces at least one objective function without increasing another one.

The set of all pareto optimal points is called the Pareto front.

We need to formulate strategies that will yield one or all of the points on the pareto optimal front.

Strategy 1: Weighted Sum method

\[f(\mathbf{x}) = \underset{i}{\sum}~{w_i\,f_i(\mathbf{x})}\]

\[\underset{i}{\sum}~w_i = 1\]

Python code implementing the weighted sum method in non-normalised form is given below.

Code
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import minimize as optimize

x = np.linspace(0, 1, 20)
weight = lambda x: 100*x**2
drag = lambda x: (x-2)**2

plt.figure(figsize=(7,5))
plt.plot(weight(x), drag(x), zorder=1)

w1 = 0.5; w2 = 0.5
def f(x):
    return  np.sum(w1*weight(x) + w2*drag(x))

res = optimize(f, [w1,w2])
plt.scatter(weight(res.x), drag(res.x), color='r',marker='D', label=r"$w_1=0.5, \ w_2=0.5$",zorder=2)
w1 = 0.1; w2 = 0.9
res = optimize(f, [w1,w2])
plt.scatter(weight(res.x), drag(res.x), color='g',marker='D', label=r"$w_1=0.1, \ w_2=0.9$",zorder=2)
w1 = 0.01; w2 = 0.99
res = optimize(f, [w1,w2])
plt.scatter(weight(res.x), drag(res.x), color='b',marker='D', label=r"$w_1=0.01, \ w_2=0.99$",zorder=2)

plt.grid(zorder=0)
plt.xlabel("weight")
plt.ylabel("drag")
# plt.axis("image")
plt.legend(loc=[1.1,0.5])
plt.title("Weighted sum method")
plt.show()

Weighted sum method with unnormalised objective functions

As can be seen from the figure, the weighted sum method does not give the entire pareto front. It gives only one point on the pareto front at a time. By changing the weights, we can get different points on the pareto front. We can also observe that a uniform variation of the weights does not result in a uniform variation of the pareto points. Hence, the weighted sum method is not a good strategy to capture the entire pareto front.

Can we overcome this problem by normalizing the objectives?

Normalized objective functions \[\begin{aligned} & \underset{x}{\text{minimize}} & & weight(x) = x^2,~drag(x) = ((x-2)^2-1)/3 \\ & \text{subject to} & & 0 \leq x \leq 1 \\ \end{aligned} \]

Code
weight = lambda x: x**2
drag = lambda x: ((x-2)**2 - 1)/3.0

w1 = [0.0,0.1,0.5,0.9,1.0]
w2 = [1.0,0.9,0.5,0.1,0.0]

def f(x,wVal1,wVal2):
    return  np.sum(wVal1*weight(x) + wVal2*drag(x))

res_1 = optimize(f,0, args=(w1[0],w2[0]))
res_2 = optimize(f,0, args=(w1[1],w2[1]))
res_3 = optimize(f,0, args=(w1[2],w2[2]))
res_4 = optimize(f,0, args=(w1[3],w2[3]))
res_5 = optimize(f,0, args=(w1[4],w2[4]))

plt.figure(figsize=(7,5))
# plt.scatter(weight(x), drag(x),14)
x = np.linspace(0, 1, 20)
plt.plot(weight(x), drag(x),zorder=1)
plt.scatter(weight(res_1.x), drag(res_1.x), marker='D', label=r"$w_1=0.0, \ w_2=1.0$",zorder=2)
plt.scatter(weight(res_2.x), drag(res_2.x), marker='D', label=r"$w_1=0.1, \ w_2=0.9$",zorder=2)
plt.scatter(weight(res_3.x), drag(res_3.x), marker='D', label=r"$w_1=0.5, \ w_2=0.5$",zorder=2)
plt.scatter(weight(res_4.x), drag(res_4.x), marker='D', label=r"$w_1=0.9, \ w_2=0.1$",zorder=2)
plt.scatter(weight(res_5.x), drag(res_5.x), marker='D', label=r"$w_1=1.0, \ w_2=0.0$",zorder=2)
plt.xlabel("weight")
# plt.axis("image")
plt.grid(zorder=0)
plt.ylabel("drag")
plt.title("Weighted sum method")
plt.legend(loc=[1.1,0.5])
plt.show()

Weighted sum method with normalized objective functions
Note
  1. The objective functions were normalized between 0 and 1
  2. But the optimal value for drag function is in negative range and it achieves with x > 1
  3. Thus the last two weights (favouring drag function) goes out of range \(\in [0,1]\)

Strategy 2: Weighted Min-Max method

\[ f(\mathbf{x}) = \underset{i}{\text{max}}~{w_i(f_i(\mathbf{x}) - f_i^0)} \] where \(f_i^0\) is the minimum of the \(i\) objective function.

  • Provides complete Pareto optimal set
  • May provide non-Pareto points as well
  • Requires individual minimum of the objective functions \(f_i\)
Code
weight = lambda x: x**2
drag   = lambda x: ((x-2)**2-1)/3

w1=[0.0, 0.1, 0.5, 0.9, 1.0]
w2=[1.0, 0.9, 0.5, 0.1, 0.0]

def f_obj(x,W1,W2):
    return np.max([W1*(weight(x)-0),W2*(drag(x)-0)])

plt.figure(figsize=(7,5))
# plt.scatter(weight(x),drag(x),14)
plt.plot(weight(x),drag(x),'-b',zorder=0)
for i in range(5):
    res = optimize(f_obj,0.5,args=(w1[i],w2[i]))
    label = str(w1[i])+" - "+str(w2[i])
    plt.scatter(weight(res.x),drag(res.x),marker='D',label=label,zorder=1)
plt.grid()
plt.xlabel("weight")
plt.ylabel("drag")
plt.axis("image")
plt.title("Weighted Min-Max method")
plt.legend(loc=[1.1,0.5])
plt.show()

Weighted Min-Max method

Strategy 3: Weighted global criterion method

\[ f(\mathbf{x}) = \big\{\underset{i}{\sum}[w_i(f_i(\mathbf{x}) - f_i^0)]^p\big\}^{1/p} \] where, \(p \in [1,\infty)\).

Note
  1. Can you calculate the value of \(p\) so that strategy 3 reduces to strategy 1 & 2.
  2. What does the objective function \(f\) represents for \(p = 2\) ?}
Code
x = np.linspace(0,1,101)

f1 = lambda x: x
f2 = lambda x: 1-x


# weighted p-norm scalarization function
def weighted_pnorm_scalaraization(x,w,p):
    return (w*f1(x)**p + (1-w)*f2(x)**p)**(1/p)

# effect of changing weight to focus on f1 and f2
y1 = f1(x)
y2 = f2(x)
y3 = weighted_pnorm_scalaraization(x,0.5,2)
y4 = weighted_pnorm_scalaraization(x,0.1,2)
y5 = weighted_pnorm_scalaraization(x,0.9,2)

plt.figure(figsize=(6,8))
plt.plot(x,y1,label="f1(x)")
plt.plot(x,y2,label="f2(x)")
plt.plot(x,y3,label="p=2, w=0.5")
plt.plot(x,y4,label="p=2, w=0.1")
plt.plot(x,y5,label="p=2, w=0.9")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.axis("image")
plt.grid()
plt.legend(loc=(1.1,0.5))
plt.title("Effect of changing value of w")
plt.show()

Effect of changing w
Code
# effect of changing weight to focus on f1 and f2
y1  = f1(x)
y2  = f2(x)
y6  = weighted_pnorm_scalaraization(x,0.5,1)
y7  = weighted_pnorm_scalaraization(x,0.5,2)
y8  = weighted_pnorm_scalaraization(x,0.5,6)
y9  = weighted_pnorm_scalaraization(x,0.5,10)
y10 = weighted_pnorm_scalaraization(x,0.5,20)

plt.figure(figsize=(6,6))
plt.plot(x,y1,label="f1(x)")
plt.plot(x,y2,label="f2(x)")
plt.plot(x,y6,label="p=1, w=0.5")
plt.plot(x,y7,label="p=2, w=0.5")
plt.plot(x,y8,label="p=6, w=0.5")
plt.plot(x,y9,label="p=10, w=0.5")
plt.plot(x,y10,label="p=20, w=0.5")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.grid()
plt.legend(loc=(1.1,0.5))
plt.axis("image")
plt.title("Effect of changing value of p")
plt.show()

Effect of changing value of p

Drawbacks of the strategies used so far

  1. The objective functions have to be scaled properly. This is a non-trivial task for complex optimisation problems.

  2. Equi-spaced weights to no give equi-spaced pareto points as seen in the strategy two (even for scaled objective functions). Experimentation is required to capture the entire pareto front.

  3. Non-convex regions of the pareto front cannot be captured by these methods

  4. For a single weight \(\bar{w}\), multiple solutions may exist on the pareto front. Hence, a single \(\bar{w}\) may map to multiple points on the pareto front.

  5. On the other hand, different weight vectors may lead to the same pareto point. This usually happens because of constraints. Constraints may cut off some parts of the pareto front.

Strategy 4: \(\epsilon\) constraint method

The original single MOOP is solved as a series of \(m\) problems defined as:

\[\begin{aligned} & \underset{x}{\text{minimize}} & & f_{\mu}(x), \\ & \text{subject to} & & f_m(x) \leq \epsilon_m,\ \ \ m = [1,\ldots,M],\ m \neq \mu \\ & & & \bar{g}(\bar{x}) \leq 0 \\ & & & \bar{h}(\bar{x}) = 0 \\ & & & \bar{x}^{(L)} \leq x \leq \bar{x}^{(U)} \end{aligned}\]

Following is an example of two objective functions defined below

\[\begin{aligned} & \underset{x}{\text{minimize}} & & f_{obj_1}(x) = x,~f_{obj_2}(x) = 1-x-a \sin(b\pi x) \\ & \text{subject to} & & 0 \leq x \leq 1 \\ \end{aligned}\]

with \(a=0.1\) and \(b=2.5\). And the following code applies \(\epsilon\) constriant on the \(f_{obj_1}\) with varying values of \(\epsilon\).

\[ f_{obj_1} \le \epsilon \]

Performing weighted sum of objective functions with \(w=0.5\)

Code
from scipy.optimize import NonlinearConstraint

x = np.linspace(0,1,101)

f_obj1 = lambda x: x
f_obj2 = lambda x: 1-x-0.1*np.sin(2.5*np.pi*x)

# objective function
def f_obj(xVal,W1,W2):
    return W1*f_obj1(xVal)+W2*f_obj2(xVal)

# defining Epsilons
Epsilons = [0.2,0.4,0.5,0.6,0.7,0.8,1.0]

plt.figure(figsize=(7,5))
plt.plot(f_obj1(x),f_obj2(x),'-b')
for epsilon in Epsilons:
    # defining nonlinear inequality constraint
    constraint = NonlinearConstraint(f_obj1, #  constrained objective function
                                     lb=0.0, #  lower bound
                                     ub=epsilon) #  upper bound

    # optimization
    res = optimize(f_obj,10,args=(0.5,0.5),
                   constraints=constraint)

    label = r"$\epsilon = $"+str(epsilon)

    plt.plot(f_obj1(res.x),f_obj2(res.x),'o',label=label)
    plt.axvline(x=f_obj1(res.x),ls='--',color='y',alpha=0.5)
plt.legend(loc=(1.1,0.5))
plt.grid(alpha=0.3)
#plt.axis("image")
plt.xlabel(r"$f_{obj_1}$",fontsize=18)
plt.ylabel(r"$f_{obj_2}$",fontsize=18)
plt.title(r"$\epsilon$ constraint method")
plt.show()

Comments

  1. The big advantage of this method is that it yields the non-convex part of the pareto front.

  2. However, range of variation of the objective functions (\(f_i\)) on the pareto front has to be known apriori.

  3. The method is not suitable for large number of objectives.

Conclusion

There are other methods like Goal programming as well to calculate the entire pareto front. Apart from the gradient based methods discussed in this section, there are a class of population based evolutionary algorithms like NSGA-II that can calculate the entire pareto front in a single run. However, all the methods are prohibitively expensive for engineering optimisation problems. MOOP problem statements are mostly found in the preliminary design stage of the design process. At this stage, most of the analysis routines are low fidelity resulting in cheap function evaluations.