Assignment 4

Published

March 11, 2024

Abstract
Nonlinear Least Squares as an optimisation problem

Due: 03-04-2024

Problem Statement

In engineering, observations of a function are often available only at discrete points. For example, let \(y = f(x_1, x_2),\ x_1, x_2 \in [0,1]\) be the function. Observations of this function at \(n\) points in the domain yield an array \((x_{1}, x_{2}, y)_i,\ \ i \in [1,n]\). The question is then how best to approximate the original function \(f\) over the entire domain using these \(n\) observations, where \(n\) may range from tens to thousands or even a million.

Question

Is it always possible to find an interpolating polynomial through \(n\) points to get an approximation of the function? If yes, under what conditions? If no, why not? Consider the entire range of possible orders of magnitude of \(n\).

Since the data represent a physical process, an intelligent guess about the functional form can be made. For example, one can assume that \(C_L\) is a linear function of \(\alpha\) before the stall region, so the functional form is \(C_L(\alpha) = a \alpha + b\), where \(a\) and \(b\) are constants.
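As a warm-up, such a linear model can be fitted in one line. The sketch below uses hypothetical pre-stall lift data generated from a thin-airfoil-like slope of \(2\pi\) (the values are stand-ins, not measurements):

```python
import numpy as np

# Hypothetical pre-stall lift data: C_L assumed linear in alpha (radians).
alpha = np.deg2rad(np.array([0.0, 2.0, 4.0, 6.0, 8.0]))
cl = 2 * np.pi * alpha + 0.1  # stand-in values with slope 2*pi, offset 0.1

# Fit C_L(alpha) = a * alpha + b by linear least squares.
a, b = np.polyfit(alpha, cl, deg=1)
print(a, b)  # recovers the slope and offset used to generate the data
```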

Let us assume the following functional form:

\[y = f(x_1, x_2) \approx \hat{f}(x_1, x_2; \beta) = \beta_1 x_1^2 + \beta_2 x_2^2 + \beta_3 x_1 + \beta_4 x_2 + \beta_5\]

where \(\beta_i\) are unknown constants. For a given value of these constants, \(\hat{f}(x_1, x_2; \beta)\) represents an approximation to \(f(x_1, x_2)\). Clearly, at all the observation points, we should have

\[\displaystyle y_i - \hat{f}(x_1, x_2; \beta)|_i = 0,\ \forall \ i \in [1,n].\]

This curve-fitting problem can therefore be recast as the optimisation problem

\[\displaystyle \min_{\beta_i} \sum_{i=1}^{n} || y_i - \hat{f}(x_1, x_2; \beta)|_i ||^2\]

This is known as the nonlinear least squares method. Find \(\boldsymbol{\beta}\) for which this objective function attains its minimum. A set of \(121\) observations can be found in this csv file.
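A minimal sketch of the minimisation is given below. Since the csv contents are not reproduced here, it generates synthetic noise-free observations from a known \(\boldsymbol{\beta}\) on an assumed \(11 \times 11\) grid over \([0,1]^2\) (121 points) and recovers it with `scipy.optimize.least_squares`; in your code you would instead read \((x_1, x_2, y)_i\) from the csv file:

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical ground-truth coefficients; real y values come from the csv.
beta_true = np.array([1.0, -2.0, 0.5, 3.0, -1.0])

def f_hat(x1, x2, beta):
    """Assumed functional form, quadratic in x1 and x2."""
    return (beta[0] * x1**2 + beta[1] * x2**2
            + beta[2] * x1 + beta[3] * x2 + beta[4])

# Synthetic observations on an 11x11 grid over [0, 1]^2 (121 points).
g = np.linspace(0.0, 1.0, 11)
x1, x2 = (a.ravel() for a in np.meshgrid(g, g))
y = f_hat(x1, x2, beta_true)

def residuals(beta):
    """r_i = y_i - f_hat(x1_i, x2_i; beta); least_squares minimises sum r_i^2."""
    return y - f_hat(x1, x2, beta)

sol = least_squares(residuals, x0=np.zeros(5))
print(sol.x)  # recovers beta_true for noise-free synthetic data
```

Note that this particular \(\hat{f}\) is linear in \(\boldsymbol{\beta}\), so `numpy.linalg.lstsq` on the design matrix would also solve it directly; the `least_squares` formulation generalises to models that are genuinely nonlinear in the parameters.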

Question

You may choose to use only part of the data. Experiment with your code by increasing the number of data points you use, and comment on the total \(L_2\) norm of the error \(y - \hat{f}(x_1, x_2; \beta)\).
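One way to structure this experiment is sketched below. It uses a synthetic stand-in for the measured process (deliberately not in the model class, so the error does not vanish) in place of the csv data, fits the quadratic model to \(k\) randomly chosen points, and reports the \(L_2\) error over all 121 points:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the measured process (not exactly quadratic).
g = np.linspace(0.0, 1.0, 11)
x1, x2 = (a.ravel() for a in np.meshgrid(g, g))
y = np.sin(2 * x1) * np.cos(x2) + x1 * x2

# Design matrix for the assumed quadratic model (linear in beta).
A = np.column_stack([x1**2, x2**2, x1, x2, np.ones_like(x1)])

def fit_and_error(k):
    """Fit to k randomly chosen points; return the L2 error over all 121."""
    idx = rng.choice(y.size, size=k, replace=False)
    beta, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
    return np.linalg.norm(y - A @ beta)

for k in (10, 20, 40, 80, 121):
    print(k, fit_and_error(k))
```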

Question

Assume that we are using only 20 data points for the approximation. There are then \(\binom{121}{20} = \frac{121!}{(121-20)!\, 20!}\) ways of selecting those points. Experiment to see whether the choice of points changes the approximation. Can you think of a way to select these measurement points intelligently to reduce the approximation error?
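The spread in error across random 20-point subsets can be probed as follows. As before, a synthetic stand-in replaces the csv data; the gap between the best and worst subsets indicates how much the choice of points matters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data on the assumed 11x11 grid over [0, 1]^2.
g = np.linspace(0.0, 1.0, 11)
x1, x2 = (a.ravel() for a in np.meshgrid(g, g))
y = np.sin(2 * x1) * np.cos(x2) + x1 * x2

A = np.column_stack([x1**2, x2**2, x1, x2, np.ones_like(x1)])

def subset_error(idx):
    """Fit on the subset idx; return the L2 error over all 121 points."""
    beta, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
    return np.linalg.norm(y - A @ beta)

# Sample 200 of the C(121, 20) possible subsets and compare errors.
errs = [subset_error(rng.choice(121, size=20, replace=False))
        for _ in range(200)]
print(f"min {min(errs):.4f}  max {max(errs):.4f}")
```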

Question

If the engineer had instead assumed the functional form to be \(y = f(x_1, x_2) \approx \hat{f}(x_1, x_2; \beta) = \beta_1 x_1^2 + \beta_3 x_2 + \beta_5\), what would happen to the accuracy of our approximation?
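The two model choices can be compared directly. The sketch below (again with synthetic stand-in data) fits both design matrices to the same observations; because the reduced model's column space is a subspace of the full model's, its least-squares residual can never be smaller:

```python
import numpy as np

# Synthetic stand-in data on the assumed 11x11 grid over [0, 1]^2.
g = np.linspace(0.0, 1.0, 11)
x1, x2 = (a.ravel() for a in np.meshgrid(g, g))
y = np.sin(2 * x1) * np.cos(x2) + x1 * x2

def l2_error(A):
    """Least-squares fit with design matrix A; L2 norm of the residual."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.linalg.norm(y - A @ beta)

full = np.column_stack([x1**2, x2**2, x1, x2, np.ones_like(x1)])
reduced = np.column_stack([x1**2, x2, np.ones_like(x1)])  # beta_1, beta_3, beta_5 terms only
print(l2_error(full), l2_error(reduced))
```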

Deliverables

  • A PDF report with the source code and a discussion of your implementation methodology and results.