Surrogate Models

Categories: numerical analysis, surrogate models, mdo, statistics

Author: Devendra Ghate

Published: March 18, 2024

Problem Formulation

The standard optimization problem is given by \[ \min_{\scriptstyle \alpha\in \mathbb{R}^n} f( \alpha),~\text{subject to}~ \mathbf{h}(\alpha) = 0,~\mathbf{g}(\alpha) \leq 0 \]

where \(f\) is the objective function, \(\mathbf{h}\) are the equality constraints, \(\mathbf{g}\) are the inequality constraints, and \(\alpha\) are the design variables.

Typically, the objective function and the constraints are implicit functions of the design variables, requiring an iterative solution methodology.

\[ \begin{aligned} f &\equiv f(\alpha, \mathbf{x}(\alpha), \mathbf{u}(\mathbf{x}(\alpha)))\\ \mathbf{h}&\equiv \mathbf{h}(\alpha, \mathbf{x}(\alpha), \mathbf{u}(\mathbf{x}(\alpha)))\\ \mathbf{g}&\equiv \mathbf{g}(\alpha, \mathbf{x}(\alpha), \mathbf{u}(\mathbf{x}(\alpha))) \end{aligned} \]

A typical example of such a problem is aerodynamic wing design. The objective function is the drag, the equality constraint is the lift, and the inequality constraint is the wing tip deflection. Here, \(\alpha\) are the wing design variables, \(\mathbf{x}\) is the volume mesh, and \(\mathbf{u}\) is the flow field. The objective function and the constraints are implicit functions of the design variables, the volume mesh and the flow field. The wing surface geometry is created from \(\alpha\) using CAD software. A mesh generator then creates the volume mesh from the wing surface geometry. An iterative nonlinear PDE solver, typically solving the Reynolds-Averaged Navier-Stokes (RANS) equations, is used to compute the flow field. Drag, lift and surface forces are then calculated from the flow field.

The wing surface forces are then used to calculate the wing tip deflection using a structural solver that solves a set of PDEs iteratively.

The entire process is computationally expensive and time consuming.

Approximate Problem Formulation

Surrogate models are approximations to the objective function and the constraints. The idea is to construct an approximation to the objective function (\(\hat{f}\)) and the constraints (\(\hat{\mathbf{h}}\) and \(\hat{\mathbf{g}}\)) and use them to solve the approximate optimization problem instead of the original problem. The approximate optimization problem is given by

\[ \min_{\scriptstyle \alpha\in \mathbb{R}^n} \hat{f}(\alpha),~\text{subject to}~ \hat{\mathbf{h}}(\alpha) = 0,~\hat{\mathbf{g}}(\alpha) \leq 0 \]
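
As a minimal sketch of this idea (in Python with NumPy and SciPy; `expensive_f` is a hypothetical stand-in for the full simulation chain described above), one can sample the expensive function at a few design points, fit a cheap surrogate \(\hat{f}\), and hand the surrogate to the optimizer instead of the original function:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in for an expensive simulation (e.g. a RANS solve).
def expensive_f(alpha):
    return (alpha - 0.3)**2 + 0.1 * np.sin(5 * alpha)

# Sample the expensive function at a handful of design points.
alphas = np.linspace(-1.0, 1.0, 9)
f_vals = np.array([expensive_f(a) for a in alphas])

# Fit a cheap quadratic surrogate f_hat by least squares.
coeffs = np.polyfit(alphas, f_vals, deg=2)
f_hat = lambda a: np.polyval(coeffs, a)

# Optimize the surrogate instead of the original function.
res = minimize(lambda a: f_hat(a[0]), x0=[0.0])
print("surrogate minimizer:", res.x[0])
```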

When to use Surrogate Models?

In most engineering problems, the objective function and the constraints are expensive to evaluate (as illustrated by the aeroelastic wing optimisation problem above). Hence, it is useful to have a computationally efficient surrogate model with sufficient accuracy for the optimisation process. This is especially useful with non-gradient-based optimisation algorithms, which typically require a large number of function evaluations.

Secondly, we note that gradient calculation is often noisy and/or expensive. This is especially true in the case of CFD simulations, where the gradients are calculated using finite difference methods if an adjoint solver is not available. Surrogate models help alleviate this problem since an analytical surrogate is cheap and straightforward to differentiate.

Function evaluations may also require expensive experimental resources rather than computation alone. In such scenarios, the use of non-gradient algorithms on the original problem becomes prohibitively expensive, and in many cases infeasible. Surrogate models are also useful in multi-objective optimisation and robust optimisation, where large numbers of function evaluations are unavoidable.

We observe,

  • functional form of the functions may not be known
  • \(f\), \(\mathbf{g}\) and \(\mathbf{h}\) are expensive (computation or experimentation)
  • gradient calculation is noisy and/or expensive

To alleviate these problems, we want to construct approximate functions \(\hat{f}\), \(\hat{\mathbf{g}}\) and \(\hat{\mathbf{h}}\), which are

  • accurate,
  • computationally cheap, and
  • easy to differentiate.

Global picture

Classes of surrogates:

  • Interpolation or regression
  • Local or global
  • Linear, quadratic or nonlinear
  • Parametric
    • Polynomial
    • Radial Basis functions (Kernel methods)
  • Non-parametric
    • Neural networks
    • Support vector machines

Methodology

  • Model selection
  • Model training
  • Model testing
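
As a sketch of this workflow (Python with NumPy; the cubic ground truth, noise level and split sizes are purely illustrative), one might compare polynomial surrogates of increasing order on held-out test data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "expensive" data: cubic trend plus noise.
alpha = rng.uniform(-1, 1, 40)
f = alpha**3 - alpha + 0.05 * rng.standard_normal(40)

# Split into training and testing sets.
train, test = alpha[:30], alpha[30:]
f_train, f_test = f[:30], f[30:]

# Model selection: compare polynomial orders on held-out test error.
for m in (1, 2, 3, 5):
    w = np.polyfit(train, f_train, deg=m)                        # model training
    rmse = np.sqrt(np.mean((np.polyval(w, test) - f_test)**2))   # model testing
    print(f"order {m}: test RMSE = {rmse:.4f}")
```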

Model Selection

Polynomial Models

An \(m^{th}\)-order one-dimensional model:

\[ \hat{f}(\alpha, \mathbf{w}, m) = w_0 + w_1 \alpha + \ldots + w_m \alpha^m = \sum_{i=0}^{m} w_i \alpha^i \]

A \(2\)-dimensional quadratic model:

\[ \hat{f}(\alpha, \mathbf{w}) = w_0 + w_1 \alpha_1 + w_2 \alpha_2 + w_3 \alpha_1 \alpha_2 + w_4 \alpha_1^2 + w_5 \alpha_2^2 \]

Pascal's triangle can be used to understand which terms appear at each order. The weights \(\mathbf{w}\) can be found using a least squares approach.
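
A minimal sketch of this least squares fit for the two-dimensional quadratic above (NumPy; the sample data is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample points in two design variables and an illustrative response.
a1, a2 = rng.uniform(-1, 1, 20), rng.uniform(-1, 1, 20)
f = 1 + 2*a1 - a2 + 0.5*a1*a2 + 3*a1**2 + 0.05*rng.standard_normal(20)

# Columns follow the quadratic model: 1, a1, a2, a1*a2, a1^2, a2^2.
A = np.column_stack([np.ones_like(a1), a1, a2, a1*a2, a1**2, a2**2])

# Least squares estimate of the weights w0 ... w5.
w, *_ = np.linalg.lstsq(A, f, rcond=None)
print(w)
```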

Why regression?

Linear Regression

\[ f(\alpha) = \beta_0 + \beta_1 \alpha + \epsilon \]

\(\epsilon\) is the error term assumed to be normally distributed.

  • Relationship between the independent variable (\(\alpha\)) and the dependent variable (\(f\)) is assumed to be linear

  • Observed data points are statistically independent

  • \(\epsilon\) is i.i.d. with \(\epsilon_i \sim N(0,\sigma^2)\)

  • Because \(\epsilon_i\) has zero mean, the mean of \(f\) for a given \(\alpha\) is \(\beta_0 + \beta_1 \alpha\)

Given that \(\alpha\) and \(f\) are known at a finite number of points, the best estimates of the unknown parameters \(\beta_0\) and \(\beta_1\) are the ones with which the linear function \(\hat{f}\) explains the observed data with minimum error.

Least Squares model

\[ \min \sum_{i=1}^{n} \epsilon_i^2 = \min \sum_{i=1}^{n} (f_i - \hat{f}_i)^2 = \min \sum_{i=1}^{n} \left(f_i - (\hat{\beta}_0 + \hat{\beta}_1 \alpha_i)\right)^2 \]

  • \(\hat{\beta}_0\) is the estimated value of \(\beta_0\) and so on
  • The regression line always passes through the point \((\bar{\alpha}, \bar{f})\) of the observed data \(\alpha_i\) and \(f_i\)
  • The sum of the measured errors \(\epsilon_i\) is \(0\): \(\sum_{i=1}^{n} \epsilon_i = 0\)
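
A minimal sketch of the closed-form estimates (NumPy, with synthetic data), which also verifies the properties listed above:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = rng.uniform(0, 1, 25)
f = 1.5 + 2.0 * alpha + 0.1 * rng.standard_normal(25)

# Closed-form least squares estimates for the simple linear model.
a_bar, f_bar = alpha.mean(), f.mean()
beta1 = np.sum((alpha - a_bar) * (f - f_bar)) / np.sum((alpha - a_bar)**2)
beta0 = f_bar - beta1 * a_bar

resid = f - (beta0 + beta1 * alpha)
print(beta0, beta1)
print("residuals sum to ~0:", resid.sum())
print("line passes through means:", np.isclose(beta0 + beta1 * a_bar, f_bar))
```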

Multivariate linear regression

Linear model

\[ f(\alpha) = \beta_0 + \beta_1 \alpha_1 + \ldots + \beta_p \alpha_p + \epsilon \] \[ \mathbf{f} = \pmb{\alpha} \pmb{\beta} + \pmb{\epsilon} \]

  • Matrix solution of this system provides \(\pmb{\beta}\)
  • \(\pmb{\beta}\) is a \((p+1) \times 1\) vector
  • \(\pmb{\alpha}\) is a \(n \times (p+1)\) matrix

The solution can be found from the normal equations: \(\pmb{\hat{\beta}} = (\pmb{\alpha}^T\pmb{\alpha})^{-1} \pmb{\alpha}^T \mathbf{f}\)
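
A sketch of this matrix solution (NumPy; here `X` plays the role of the \(n \times (p+1)\) matrix \(\pmb{\alpha}\); in practice `np.linalg.lstsq` is preferred over forming the normal equations explicitly):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 30, 3

# Design matrix with a leading column of ones for beta_0.
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, (n, p))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
f = X @ beta_true + 0.05 * rng.standard_normal(n)

# Normal-equations solution: beta_hat = (X^T X)^{-1} X^T f.
beta_hat = np.linalg.solve(X.T @ X, X.T @ f)
print(beta_hat)
```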

Radial Basis functions

\[ \hat{f}(\alpha) = \mathbf{w}^T \pmb{\psi} = \sum_{i=1}^{m} w_i \psi_i(\| \alpha- c^{(i)}\|) \]

where \(c^{(i)}\) is the centre of the \(i^{th}\) basis function.

  • Linear \(\psi(r) = r\)
  • Cubic \(\psi(r) = r^3\)
  • Thin plate \(\psi(r) = r^2 \ln r\)
  • Multi-quadric \(\psi(r) = \sqrt{r^2 + \gamma^2}\)
  • Gaussian \(\psi(r) = \exp(-\gamma r^2)\)

Generally, RBFs decrease (or increase) monotonically with distance from the centre. The beauty of the RBF model is that it is linear in the basis function weights.
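
A minimal Gaussian RBF interpolation sketch (NumPy; the centres are taken at the sample points themselves, and the value of \(\gamma\) is an arbitrary choice):

```python
import numpy as np

def gaussian(r, gamma=4.0):
    return np.exp(-gamma * r**2)

centres = np.linspace(0, 1, 8)              # c^(i): here, the sample points
f = np.sin(2 * np.pi * centres)             # illustrative data to interpolate

# Interpolation: solve Psi w = f, where Psi_ij = psi(|alpha_i - c^(j)|).
Psi = gaussian(np.abs(centres[:, None] - centres[None, :]))
w = np.linalg.solve(Psi, f)

def f_hat(alpha):
    return gaussian(np.abs(alpha - centres)) @ w

print(f_hat(0.3), np.sin(2 * np.pi * 0.3))  # surrogate vs. truth
```

With as many centres as samples, the linear system interpolates the data exactly; using fewer centres than samples turns the same construction into a regression problem.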

Weighted Linear Regression

Geographically weighted Linear Regression

Nonlinear Regression

Kernel Regression

Parametric Regression

Applications