Multi-objective Optimisation

Modified

April 18, 2025

Introduction

Code
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 20)
length = 100*x
weight = 100*x**2

plt.scatter(length, weight)
plt.xlabel("length")
plt.ylabel("weight")
plt.title("Non-conflicting objective functions")
plt.grid(True)
plt.show()
Figure 1

Consider the following optimisation problem:

\[ \begin{aligned} & \underset{x}{\text{minimize}} & & length(x) = 100 x,~weight(x) = 100 x^2 \\ & \text{subject to} & & 0 \leq x \leq 1 \\ \end{aligned} \]

Here we have two objective functions. These two objective functions are not conflicting with each other as can be seen in Figure 1. Hence, this will not lead to an interesting MOOP. So lets consider the following optimisation problem:

\[\begin{aligned} & \underset{x}{\text{minimize}} & & weight(x) = 100 x^2,~drag(x) = (x-2)^2 \\ & \text{subject to} & & 0 \leq x \leq 1 \\ \end{aligned} \]

Code
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 20)
weight = 100*x**2
drag = (x-2)**2

plt.scatter(weight, drag)
plt.xlabel("weight")
plt.ylabel("drag")
plt.title("Conflicting objective functions")
plt.grid(True)
plt.show()

Clearly, the two objective functions are conflicting with each other. Hence, this will lead to an interesting MOOP. We cannot minimise both the objective functions simultaneously. We have to make a trade-off between the two objective functions. This trade-off is called the Pareto optimal trade-off. The set of all Pareto optimal trade-offs is called the Pareto front.

Definitions

Design space

Design space is defined as the space spanned by the design variables. In this case, the design space is one dimensional. The design space is also called the parameter space. The design space is denoted by \(\mathbb{R}^n\) where \(n\) is the number of design variables. In this case, \(n = 1\).

Feasible design space

The feasible design space is the subset of the design space that satisfies all the constraints. In this case, the feasible design space is \([0,1]\). The feasible design space is denoted by \(S\). Clearly, the feasible design space is always a subset of the design space. The feasible design space is also called the search space.

Infeasible design space

The infeasible design space is the subset of the design space that does not satisfy all the constraints. In this case, the infeasible design space is \((-\infty,0) \cup (1,\infty)\). The infeasible design space is denoted by \(S^c\). Clearly, the infeasible design space is always a subset of the design space and \(S \cup S^c = \mathbb{R}^n\).

Objective space

Objective space is defined as the space spanned by the objective functions. In this case, the objective space is two dimensional. The objective space is also called the image space. The objective space is denoted by \(\mathbb{R}^m\) where \(m\) is the number of objective functions. In this case, \(m = 2\).

Feasible objective space

The feasible objective space is the subset of the objective space that satisfies all the constraints. In this case, the feasible objective space is \([0,\infty) \times [0,\infty)\). The feasible objective space is denoted by \(Z\).

Infeasible objective space

The infeasible objective space is the subset of the objective space that does not satisfy all the constraints. In this case, the infeasible objective space is \((-\infty,0) \times (-\infty,0)\). The infeasible objective space is denoted by \(Z^c\). Clearly, the infeasible objective space is always a subset of the objective space and \(Z \cup Z^c = \mathbb{R}^m\).

Feasible objective vector

The feasible objective vector is the vector of objective function values for a feasible design point. In this case, the feasible objective vector is \([0,\infty) \times [0,\infty)\). The feasible objective vector is denoted by \(z\).

Ideal objective vector (\(z^*\))

The ideal objective vector is the vector of objective function values for the best feasible design point. In this case, the ideal objective vector is \([0,0]\).

Utopian objective vector (\(z^{**}\))

The utopian objective vector is the vector of objective function values for the best possible design point. In this case, the utopian objective vector is \([0,0]\).

Nadir objective vector (\(z^{nad}\))

The nadir objective vector is the vector of objective function values for the worst feasible design point. In this case, the nadir objective vector is \([1,1]\).

Normalised objective function

\[f_i^{norm} = \frac{f_i - z^*_i}{z_i^{nad} - z_i^{*}}\]

A design point \(x^{(1)}\) is said to dominate \(x^{(2)}\), if both conditions are true.

  1. \(x^{(1)}\) is no worse than \(x^{(2)}\) in all objectives \(f_i,\ \ i \in [1,\ldots,m]\)

  2. \(x^{(1)}\) is strictly better than \(x^{(2)}\) in at least one objective \(f_i\)

Pareto Optimal Point

A point \(x^*\) in the feasible design space \(S\) is called Pareto optimal if there is no other point \(x\) in the set \(S\) that reduces at least one objective function without increasing another one.

The set of all pareto optimal points is called the Pareto front.

We need to formulate strategies that will yield one or all of the points on the pareto optimal front.

Strategy 1: Weighted Sum method

\[f(\mathbf{x}) = \underset{i}{\sum}~{w_i\,f_i(\mathbf{x})}\]

#| fig-cap: "Weighted sum method with unnormalised objective functions"
f4(x) = w1*weight(x) + w2*drag(x) # Weighted objective function
w1 = 0.5; w2 = 0.5; res = optimize(f4, -1, 1)
p3 = p2 #| hide_line
Plots.plot!(p3, [ weight(res.minimizer[1]) ], [ drag(res.minimizer[1]) ],markershape=:diamond, label="50-50")#| hide_line
w1 = 0.1; w2 = 0.9; res = optimize(f4, -1, 1)
Plots.plot!(p3, [ weight(res.minimizer[1]) ], [ drag(res.minimizer[1]) ],markershape=:diamond, label="10-90")#| hide_line
w1 = 0.01; w2 = 0.99; res = optimize(f4, -1, 1)
Plots.plot!(p3, [ weight(res.minimizer[1]) ], [ drag(res.minimizer[1]) ],markershape=:diamond, label="1-99")#| hide_line

The above code implementing the weight sum method in non-normalised form is given in python.

Code
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import minimize as optimize

x = np.linspace(0, 1, 20)
weight = 100*x**2
drag = (x-2)**2

w1 = 0.5; w2 = 0.5;

def f(x): 
    w1*weight(x) + w2*drag(x)

res = optimize(f, 0, 1)

plt.scatter(weight, drag)
plt.scatter(weight(res.x), drag(res.x), marker='D')
plt.xlabel("weight")
plt.ylabel("drag")
plt.title("Weighted sum method")
plt.grid(True)
plt.show()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[6], line 14
     11 def f(x): 
     12     w1*weight(x) + w2*drag(x)
---> 14 res = optimize(f, 0, 1)
     16 plt.scatter(weight, drag)
     17 plt.scatter(weight(res.x), drag(res.x), marker='D')

File ~/anaconda3/lib/python3.12/site-packages/scipy/optimize/_minimize.py:708, in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
    706     res = _minimize_cg(fun, x0, args, jac, callback, **options)
    707 elif meth == 'bfgs':
--> 708     res = _minimize_bfgs(fun, x0, args, jac, callback, **options)
    709 elif meth == 'newton-cg':
    710     res = _minimize_newtoncg(fun, x0, args, jac, hess, hessp, callback,
    711                              **options)

File ~/anaconda3/lib/python3.12/site-packages/scipy/optimize/_optimize.py:1477, in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, finite_diff_rel_step, xrtol, c1, c2, hess_inv0, **unknown_options)
   1474 if maxiter is None:
   1475     maxiter = len(x0) * 200
-> 1477 sf = _prepare_scalar_function(fun, x0, jac, args=args, epsilon=eps,
   1478                               finite_diff_rel_step=finite_diff_rel_step)
   1480 f = sf.fun
   1481 myfprime = sf.grad

File ~/anaconda3/lib/python3.12/site-packages/scipy/optimize/_optimize.py:402, in _prepare_scalar_function(fun, x0, jac, args, bounds, epsilon, finite_diff_rel_step, hess)
    398     bounds = (-np.inf, np.inf)
    400 # ScalarFunction caches. Reuse of fun(x) during grad
    401 # calculation reduces overall function evaluations.
--> 402 sf = ScalarFunction(fun, x0, args, grad, hess,
    403                     finite_diff_rel_step, bounds, epsilon=epsilon)
    405 return sf

File ~/anaconda3/lib/python3.12/site-packages/scipy/optimize/_differentiable_functions.py:166, in ScalarFunction.__init__(self, fun, x0, args, grad, hess, finite_diff_rel_step, finite_diff_bounds, epsilon)
    163     self.f = fun_wrapped(self.x)
    165 self._update_fun_impl = update_fun
--> 166 self._update_fun()
    168 # Gradient evaluation
    169 if callable(grad):

File ~/anaconda3/lib/python3.12/site-packages/scipy/optimize/_differentiable_functions.py:262, in ScalarFunction._update_fun(self)
    260 def _update_fun(self):
    261     if not self.f_updated:
--> 262         self._update_fun_impl()
    263         self.f_updated = True

File ~/anaconda3/lib/python3.12/site-packages/scipy/optimize/_differentiable_functions.py:163, in ScalarFunction.__init__.<locals>.update_fun()
    162 def update_fun():
--> 163     self.f = fun_wrapped(self.x)

File ~/anaconda3/lib/python3.12/site-packages/scipy/optimize/_differentiable_functions.py:145, in ScalarFunction.__init__.<locals>.fun_wrapped(x)
    141 self.nfev += 1
    142 # Send a copy because the user may overwrite it.
    143 # Overwriting results in undefined behaviour because
    144 # fun(self.x) will change self.x, with the two no longer linked.
--> 145 fx = fun(np.copy(x), *args)
    146 # Make sure the function returns a true scalar
    147 if not np.isscalar(fx):

TypeError: f() takes 1 positional argument but 2 were given

As can be seen from the figure, the weighted sum method does not give the entire pareto front. It gives only one point on the pareto front. By changing the weights, we can get different points on the pareto front. We can also observe that a uniform variation of the weights does not result in a uniform variation of the pareto points. Hence, the weighted sum method is not a good strategy to capture the entire pareto front.

Can we overcome this problem by normalising the objectives?

#| fig-cap: "Weighted sum method with normalised objective functions"
## Strategy 1: Weighted Sum method with normalised objective functions
weight1(x) = x^2; drag1(x) = ((x-2)^2 - 1)/3;
w = map(weight1,x); dr = map(drag1,x) #| hide_line
p4 = Plots.plot(w, dr, seriestype=:scatter, xlabel="weight", ylabel="drag", label="") #| hide_line
f4(x) = w1*weight1(x) + w2*drag1(x) # Weighted objective function
w1 = 0.0; w2 = 1.0; res = optimize(f4, 0, 1)
Plots.plot!(p4, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="0-100") #| hide_line
w1 = 0.1; w2 = 0.9; res = optimize(f4, 0, 1)
Plots.plot!(p4, [weight1(res.minimizer[1])] , [drag1(res.minimizer[1])],markershape=:diamond, label="10-90") #| hide_line
w1 = 0.5; w2 = 0.5; res = optimize(f4, 0, 1)
Plots.plot!(p4, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="50-50") #| hide_line
w1 = 0.9; w2 = 0.1; res = optimize(f4, 0, 1)
Plots.plot!(p4, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="90-10") #| hide_line
w1 = 1.0; w2 = 0.0; res = optimize(f4, 0, 1)
Plots.plot!(p4, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="100-0") #| hide_line

Strategy 2: Weighted Min-Max method

\[ f(\mathbf{x}) = \underset{i}{\text{max}}~{w_i(f_i(\mathbf{x}) - f_i^0)} \] where \(f_i^0\) is the minimum of the \(i\) objective function.

  • Provides complete Pareto optimal set
  • May provide non-Pareto poitns as well
  • Requires individual minimum of the objective functions \(f_i\)
#| fig-cap: "Weighted Min-Max method"
## Strategy 2: Weighted Min-Max method
weight1(x) = x^2; drag1(x) = ((x-2)^2 - 1)/3;
w = map(weight1,x); dr = map(drag1,x) 
p5 = Plots.plot(w, dr, seriestype=:scatter, xlabel="weight", ylabel="drag", label="", grid = :on, frame = :box)
f5(x) = max(w1*(weight1(x) - 0), w2*(drag1(x) - 0)) # Weighted objective function
w1=[0.0, 0.1, 0.5, 0.9, 1.0]; w2=[1.0, 0.9, 0.5, 0.1, 0.0]
for i in 1:5
    f5(x) = max(w1[i]*(weight1(x) - 0), w2[i]*(drag1(x) - 0))
    res = optimize(f5, 0, 1)
    str = string(w1[i]*100, "-", w2[i]*100)
    Plots.plot!(p5, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label=str)
end
plot(p5)

# w1 = 0.0; w2 = 1.0; res = optimize(f5, 0, 1)
# Plots.plot!(p5, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="0-100")
# w1 = 0.1; w2 = 0.9; res = optimize(f5, 0, 1)
# Plots.plot!(p5, [weight1(res.minimizer[1])] , [drag1(res.minimizer[1])],markershape=:diamond, label="10-90")
# w1 = 0.5; w2 = 0.5; res = optimize(f5, 0, 1)
# Plots.plot!(p5, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="50-50")
# w1 = 0.9; w2 = 0.1; res = optimize(f5, 0, 1)
# Plots.plot!(p5, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="90-10")
# w1 = 1.0; w2 = 0.0; res = optimize(f5, 0, 1)
# Plots.plot!(p5, [weight1(res.minimizer[1])], [drag1(res.minimizer[1])],markershape=:diamond, label="100-0")

Strategy 3: Weighted global criterion method

\[ f(\mathbf{x}) = \big\{\underset{i}{\sum}[w_i(f_i(\mathbf{x}) - f_i^0)]^p\big\}^{1/p} \] where, $ p $.

Note
  1. Can you calculate the value of \(p\) so that strategy 3 reduces to strategy 1 & 2.
  2. What does the objective function \(f\) represents for \(p = 2\) ?}
#| fig-cap: "Weighted global criterion method"
## Strategy 3: Weighted global criterion method

# Code that shows the scalarization techniques for the multi-objective optimization problem
f1(x) = x; f2(x) = 1-x

# Weighted p-norm scalarization function
function weighted_pnorm_scalarize(x, w, p)
    return (w*f1(x)^p + (1- w)*f2(x)^p)^(1/p)
end

# Effect of changing the weight to focus on f1 or f2
x = 0:0.01:1
y1 = f1.(x)
y2 = f2.(x)
y3 = weighted_pnorm_scalarize.(x, 0.5, 2)
y4 = weighted_pnorm_scalarize.(x, 0.1, 2)
y5 = weighted_pnorm_scalarize.(x, 0.9, 2)

plot(x, y1, label="f1(x)", 
            xlabel="x", 
            ylabel="f(x)", 
            title="Effect of changing the value of p", 
            legend=:topleft,
            lw=2, 
            c=:black, 
            grid=:on,
            frame=:box,
            legendfontsize=10,
            tickfontsize=10,
            guidefontsize=10,
            titlefontsize=12)
plot!(x, y2, label="f2(x)")
plot!(x, y3, label="p=2, w=0.5")
plot!(x, y4, label="p=2, w=0.1")
plot!(x, y5, label="p=2, w=0.9")
#| fig-cap: "Effect of changing the value of p"

# Code that shows the scalarization techniques for the multi-objective optimization problem
f1(x) = x; f2(x) = 1-x

# Weighted p-norm scalarization function
function weighted_pnorm_scalarize(x, w, p)
    return (w*f1(x)^p + (1- w)*f2(x)^p)^(1/p)
end

# Plot the scalarization functions
using Plots

# Effect of changing the weight to focus on f1 or f2
x = 0:0.01:1
y1 = f1.(x)
y2 = f2.(x)
# Plot to show the effect of changing the value of p
y6 = weighted_pnorm_scalarize.(x, 0.5, 1)
y7 = weighted_pnorm_scalarize.(x, 0.5, 2)
y8 = weighted_pnorm_scalarize.(x, 0.5, 6)
y9 = weighted_pnorm_scalarize.(x, 0.5, 10)
y10 = weighted_pnorm_scalarize.(x, 0.5, 20)

plot(x, y1, label="f1(x)", 
            xlabel="x", 
            ylabel="f(x)", 
            title="Effect of changing the value of p", 
            legend=:topleft,
            lw=2, 
            c=:black, 
            grid=:on,
            frame=:box,
            legendfontsize=10,
            tickfontsize=10,
            guidefontsize=10,
            titlefontsize=12)

plot!(x, y2, label="f2(x)", lw=2, c=:black, ls=:dash)
plot!(x, y6, label="p=1, w=0.5", lw=2, c=:red)
plot!(x, y7, label="p=2, w=0.5", lw=2, c=:blue)
plot!(x, y8, label="p=6, w=0.5", lw=2, c=:green)
plot!(x, y9, label="p=10, w=0.5", lw=2, c=:purple)
plot!(x, y10, label="p=20, w=0.5", lw=2, c=:orange)

Drawbacks of the strategies used so far

  1. The objective functions have to be scaled properly. This is a non-trivial task for complex optimisation problems.

  2. Equi-spaced weights to no give equi-spaced pareto points as seen in the strategy two (even for scaled objective functions). Experimentation is required to capture the entire pareto front.

  3. Non-convex regions of the pareto front cannot be captured by these methods

  4. For a single weight \(\bar{w}\), multiple solutions may exist on the pareto front. Hence, a single \(\bar{w}\) may map to multiple points on the pareto front.

  5. On the other hand, different weight vectors may lead to the same pareto point. This usually happens because of constraints. Constraints may cut off some parts of the pareto front.

Strategy 4: \(\epsilon\) constraint method

The original single MOOP is solved as a series of \(m\) problems defined as:

\[\begin{aligned} & \underset{x}{\text{minimize}} & & f_{\mu}(x), \\ & \text{subject to} & & f_m(x) \leq \epsilon_m,\ \ \ m = [1,\ldots,M],\ m \neq \mu \\ & & & \bar{g}(\bar{x}) \leq 0 \\ & & & \bar{h}(\bar{x}) = 0 \\ & & & \bar{x}^{(L)} \leq x \leq \bar{x}^{(U)} \end{aligned}\]

#| fig-cap: "Epsilon constraint method"
using JuMP, Ipopt, Plots #| hide_line

obj1(x,y) = x
obj2(x,y) = 1 .+ y.^2 .- x .- a.*sin.(b.*π.*x)
obj_ws(x,y) = w.*x + (1 .- w).*(1 .+ y.^2 .- x .- a.*sin(b.*π.*x))

# Plotting pareto front

# Convex pareto front
#=a = 0.2; b = 1.0=#
# Concave pareto front
a = 0.1; b = 2.5
x₂=-2.0:0.05:2.0
p7=Plots.plot(seriestype=:scatter) 
for x₁=0.0:0.05:1.0
    tmp = minimum(obj2(x₁, x₂))
    println(x₁, tmp)
end

    #Plots.plot!(p7,[x₁],[tmp])
m = Model(Ipopt.Optimizer)

# Global design variables
@variable(m, x₁, start = 0.5)
@variable(m, x₂, start = 0.5)

# Parameter
@NLparameter(m, w == 0.0)

JuMP.register(m, :obj_ws, 2, obj_ws, autodiff=true)
# Box constraints
@constraint(m,  0 <= x₁ <= 1)
@constraint(m, -2 <= x₂ <= 2)

for i in 0.0:0.1:1.0
  println(i)
end
p6 = Plots.plot(xlabel="f₁", ylabel="f₂", title="Pareto Front using weighted sum method")
for i in 0.0:0.1:1.0
    @NLobjective(m, Min, w*x₁ + (1-w)*(1 + x₂^2 - x₁ - a*sin(b*π*x₁)))
    set_value(w,i)
    optimize!(m)
    x₁_sol = getvalue(x₁)
    x₂_sol = getvalue(x₂)
    Plots.plot!(p6, [obj1(x₁_sol,x₂_sol)], [obj2(x₁_sol, x₂_sol)], seriestype=:scatter, markershape=:circle) #hide
#    plot!(obj1(x₁_sol,x₂_sol), obj2(x₁_sol, x₂_sol),"o")
end
grid()

@NLobjective(m, Min, 1 + x₂^2 - x₁ - a*sin(b*π*x₁))
@constraint(m,  0 <= x₁ <= 0.4)
optimize!(m)
x₁_sol = getvalue(x₁)
x₂_sol = getvalue(x₂)
plot(obj1(x₁_sol,x₂_sol), obj2(x₁_sol, x₂_sol),"*")
xlabel("f₁"); ylabel("f₂"); title("Pareto Front using ϵ Constraint method"); grid()

Comments

  1. The big advantage of this method is that it yields the non-convex part of the pareto front.

  2. However, range of variation of the objective functions (\(f_i\)) on the pareto front has to be known apriori.

  3. The method is not suitable for large number of objectives.

Conclusion

There are other methods like Goal programming as well to calculate the entire pareto front. Apart from the gradient based methods discussed in this section, there are a class of population based evolutionary algorithms like NSGA-II that can calculate the entire pareto front in a single run. However, all the methods are prohibitively expensive for engineering optimisation problems. MOOP problem statements are mostly found in the preliminary design stage of the design process. At this stage, most of the analysis routines are low fidelity resulting in cheap function evaluations.

TODO: Add a section on Goal programming TODO: Add a section on NSGA-II