Very often, we have a set of n measurements (x_i, y_i) of a parameter y at different points x. However, describing the parameter by its n measurement points alone is generally not sufficient, especially if we wish to make estimates at points where no measurements are available.
To obtain an estimate of y at every possible point x, a mathematical model can be built from the available measurements. This model, denoted y = f(x), is called an estimator. It allows us to calculate the value of the parameter y for any chosen point x.
By adopting a linear regression, the sought model is of the form:

    y_M(x) = a x + b

where a is the slope of the regression line and b is its intercept (the value of y_M at x = 0).
Indeed, for the n available measurements (x_i, y_i), the model error at point i is defined as the difference, at this point, between the measured value of y and the model's value y_M:

    ε(i) = y_i − y_M(x_i)
Coefficients a and b are determined by minimizing the sum S of the squared errors:

    S = Σ_{i=1}^{n} ε(i)²
or, substituting for ε(i):

    S = Σ_{i=1}^{n} (y_i − y_M(x_i))²
and replacing y_M(x_i) by its expression:

    S = Σ_{i=1}^{n} (y_i − a x_i − b)²
Expanding the squared expression, we obtain:

    S = Σ y_i² − 2a Σ x_i y_i − 2b Σ y_i + a² Σ x_i² + 2ab Σ x_i + n b²
S is minimal when ∂S/∂a = 0 and ∂S/∂b = 0. Differentiating with respect to a gives:

    ∂S/∂a = −2 Σ x_i y_i + 2a Σ x_i² + 2b Σ x_i = 0

Likewise, differentiating with respect to b gives:

    ∂S/∂b = −2 Σ y_i + 2a Σ x_i + 2n b = 0

Hence, writing x̄ = (1/n) Σ x_i and ȳ = (1/n) Σ y_i for the means of the measurements, the second equation yields:

    b = ȳ − a x̄

We can then infer the expression of a by substituting b into the first equation:

    a = (Σ x_i y_i − n x̄ ȳ) / (Σ x_i² − n x̄²)
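As a check on the derivation above, here is a minimal sketch of these least-squares formulas in plain Python. The function name fit_linear and the data points are illustrative assumptions, not from the text:

```python
def fit_linear(xs, ys):
    """Return (a, b) minimizing S = sum((y_i - a*x_i - b)**2)."""
    n = len(xs)
    x_mean = sum(xs) / n          # x̄
    y_mean = sum(ys) / n          # ȳ
    # a = (Σ x_i y_i − n x̄ ȳ) / (Σ x_i² − n x̄²)
    a = (sum(x * y for x, y in zip(xs, ys)) - n * x_mean * y_mean) / \
        (sum(x * x for x in xs) - n * x_mean * x_mean)
    # b = ȳ − a x̄
    b = y_mean - a * x_mean
    return a, b

# Made-up example points lying exactly on y = 2x + 1:
a, b = fit_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)  # → 2.0 1.0
```

On noise-free data the formulas recover the generating line exactly, which is a quick sanity check on the signs in the normal equations.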
One way to assess the performance of an estimator is to compute the resulting sum of squared errors.
Recall that in the case of a linear regression, the model is:

    y_M(x) = a x + b

with:

    a = (Σ x_i y_i − n x̄ ȳ) / (Σ x_i² − n x̄²)

and:

    b = ȳ − a x̄
Once the parameters a and b of the model are calculated, we can determine, at every measurement point (x_i, y_i), the difference between the measured value of y and the value estimated by the model. This difference is given by:

    ε(i) = y_i − y_M(x_i)
The performance of the model is then measured by the sum of squared differences:

    S = Σ_{i=1}^{n} ε(i)²
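The performance measure above can be sketched as follows; the helper name sum_squared_errors and the noisy data points are illustrative assumptions:

```python
def sum_squared_errors(xs, ys, a, b):
    """S = Σ (y_i − (a x_i + b))² for the fitted line y_M(x) = a x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Made-up noisy measurements scattered around y = 2x + 1:
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
S = sum_squared_errors(xs, ys, 2.0, 1.0)
print(S)
```

A smaller S means the model tracks the measurements more closely; S = 0 only when every point lies exactly on the fitted line.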