The background break

Before we start, I would like to explain several fundamental terms such as variance, standard deviation, normal distribution, estimate, accuracy, precision, mean, expected value and random variable.

I suppose that many readers of this tutorial are familiar with basic statistics. However, at the beginning of this tutorial, I promised to supply the background required for understanding how the Kalman Filter operates. If you are familiar with this topic, feel free to skip it and jump to the next section.

Mean and Expected Value

Mean and Expected Value are closely related terms. However, they are different.

For example, given five coins – two 5 cent coins and three 10 cent coins, we can easily calculate the mean coin value by averaging the coins' values.

\[ V_{mean}= \frac{1}{N} \sum _{n=1}^{N}V_{n}= \frac{1}{5} \left( 5+5+10+10+10 \right) = 8cent \]

The above outcome can't be defined as the expected value, since the states of the system (the coin values) are not hidden, and we've used the entire population (all 5 coins) for the mean value calculation.

Now assume five different weight measurements of the same person: 79.8kg, 80kg, 80.1kg, 79.8kg, and 80.2kg.

Man on scales

The measurements are different due to the random measurement error of the scales. We don't know the true value of the weight, since it is a Hidden Variable. However, we can estimate the weight by averaging the scale measurements.

\[ W= \frac{1}{N} \sum _{n=1}^{N}W_{n}= \frac{1}{5} \left( 79.8+80+80.1+79.8+80.2 \right) =79.98kg \]

The outcome of the estimate is the expected value of the weight.

The mean is usually denoted by the Greek letter μ.

The expected value is usually denoted by the letter E.
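The two averages above can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names are illustrative and are not part of the tutorial.

```python
from statistics import mean

# Coin values in cents. The whole population is known, so this is a plain mean.
coin_values = [5, 5, 10, 10, 10]
coin_mean = mean(coin_values)

# Repeated scale readings in kg. The true weight is hidden, so the average
# of the readings serves as an estimate of it.
weight_readings = [79.8, 80.0, 80.1, 79.8, 80.2]
weight_estimate = mean(weight_readings)

print(f"mean coin value: {coin_mean} cents")
print(f"estimated weight: {weight_estimate:.2f} kg")
```

The arithmetic is identical in both cases; the difference lies only in whether the averaged values are the full, known population or noisy observations of a hidden quantity.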

Variance and Standard deviation

The Variance is a measure of how far the data set spreads from its mean.

The Standard Deviation is the square root of the variance.

The Standard Deviation is denoted by the Greek letter \( \sigma \) (sigma). Consequently, the variance is denoted by \( \sigma ^{2} \).

For example, we would like to compare the heights of two high school basketball teams. The following table provides the players' heights for both teams and their mean.

Team      Player 1   Player 2   Player 3   Player 4   Player 5   Mean
Team A    1.89m      2.1m       1.75m      1.98m      1.85m      1.914m
Team B    1.94m      1.9m       1.97m      1.89m      1.87m      1.914m

As we can see, the mean height of both teams is the same. Now let's examine the height variance.

Since the variance measures the spreading of the data set, we would like to know the data set deviation from its mean. We can calculate the distance from the mean for each variable by subtracting the mean from each variable.

We will denote the height by \( x \) and the mean of the heights by Greek letter \( \mu \). The distance from the mean for each variable would be:

\[ x_{n}- \mu = x_{n}-1.914m \]

The following table presents the distance from the mean for each variable.

Team      Player 1   Player 2   Player 3   Player 4   Player 5
Team A    -0.024m    0.186m     -0.164m    0.066m     -0.064m
Team B    0.026m     -0.014m    0.056m     -0.024m    -0.044m

Some of the values are negative. In order to get rid of the negative values, let's square the distance from the mean:

\[ \left( x_{n}- \mu \right) ^{2} = \left( x_{n}- 1.914m \right) ^{2} \]

The following table presents the squared distance from the mean for each variable.

Team      Player 1     Player 2     Player 3     Player 4     Player 5
Team A    0.000576m²   0.034596m²   0.026896m²   0.004356m²   0.004096m²
Team B    0.000676m²   0.000196m²   0.003136m²   0.000576m²   0.001936m²

In order to calculate the variance of the data set, we need to find the average value of all squared distances from the mean:

\[ \sigma ^{2}= \frac{1}{N} \sum _{n=1}^{N} \left( x_{n}- \mu \right) ^{2} \]

For Team A, the variance would be:

\[ \sigma _{A}^{2} = \frac{1}{N} \sum _{n=1}^{N} \left( x_{n}- \mu \right) ^{2}= \frac{1}{5} \left( 0.000576+ 0.034596+ 0.026896+ 0.004356+ 0.004096 \right) = 0.014m^{2} \]

For Team B, the variance would be:

\[ \sigma _{B}^{2} = \frac{1}{N} \sum _{n=1}^{N} \left( x_{n}- \mu \right) ^{2}= \frac{1}{5} \left( 0.000676+ 0.000196+ 0.003136+ 0.000576+ 0.001936 \right) = 0.0013m^{2} \]

We can see that although the mean of both teams is the same, the height spread of Team A is larger than the height spread of Team B. It means that the Team A players are more diverse: there are players for different positions, such as ball handler, center and guards, while the Team B players are similar in height and therefore less versatile.

The units of the variance are squared, so it is more convenient to look at the standard deviation. As I've already mentioned, the standard deviation is the square root of the variance.

\[ \sigma = \sqrt{\frac{1}{N} \sum _{n=1}^{N} \left( x_{n}- \mu \right) ^{2}} \]

The standard deviation of the Team A players' heights would be 0.12m.

The standard deviation of the Team B players' heights would be 0.036m.
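The team calculation above can be checked with Python's standard `statistics` module. This is a sketch only: `pvariance` and `pstdev` normalize by \( N \), matching the equations above, while the helper name `summarize` is made up for this illustration.

```python
from statistics import mean, pvariance, pstdev

team_a = [1.89, 2.10, 1.75, 1.98, 1.85]  # heights in meters
team_b = [1.94, 1.90, 1.97, 1.89, 1.87]

def summarize(heights):
    """Return (mean, variance, standard deviation), normalizing by N."""
    mu = mean(heights)
    return mu, pvariance(heights, mu), pstdev(heights, mu)

mean_a, var_a, std_a = summarize(team_a)
mean_b, var_b, std_b = summarize(team_b)

print(f"Team A: mean={mean_a:.3f} m, variance={var_a:.4f} m^2, std={std_a:.3f} m")
print(f"Team B: mean={mean_b:.3f} m, variance={var_b:.4f} m^2, std={std_b:.3f} m")
```

Both teams share the same mean, while the variance and the standard deviation of Team A come out roughly ten and three times larger, respectively.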

Now, assume that we would like to calculate the mean and variance of all basketball players in all high schools. It is a very hard task; we need to collect the data of all players from all high schools.

On the other hand, we can estimate the mean and the variance of the players by picking a large data set and making the calculations on this data set.

A data set of 100 randomly selected players can be sufficient for an accurate estimate.

However, when we estimate the variance, the equation for variance calculation is slightly different. Instead of normalizing by the factor \( N \) , we shall normalize by the factor \( N-1 \):

\[ \sigma ^{2}= \frac{1}{N-1} \sum _{n=1}^{N} \left( x_{n}- \mu \right) ^{2} \]

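The factor of \( N-1 \) is called Bessel's correction. Note that in this equation \( \mu \) is the mean computed from the selected data set itself, since the mean of the entire population is unknown.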

You can find the mathematical proof of the above equation on visiondummy or Wikipedia.
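The effect of Bessel's correction can also be checked numerically. The sketch below is an illustration under stated assumptions, not a proof: it treats the five Team A heights as a complete population, enumerates every possible sample of three players drawn with replacement, and averages the two variance estimates over all samples. The sample size of three is an arbitrary choice, and each sample's variance is computed about that sample's own mean.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Assumption for illustration: these five heights are the whole population.
population = [1.89, 2.10, 1.75, 1.98, 1.85]
true_variance = pvariance(population)

sample_size = 3
# Every possible sample of 3 drawn with replacement (5**3 = 125 samples).
samples = list(product(population, repeat=sample_size))

# pvariance divides by N, variance divides by N-1 (Bessel's correction).
avg_divide_by_n = mean(pvariance(s) for s in samples)
avg_divide_by_n_minus_1 = mean(variance(s) for s in samples)

print(f"population variance:         {true_variance:.6f}")
print(f"average estimate, 1/N:       {avg_divide_by_n:.6f}")
print(f"average estimate, 1/(N-1):   {avg_divide_by_n_minus_1:.6f}")
```

On average, dividing by \( N \) underestimates the population variance by the factor \( (N-1)/N \), while dividing by \( N-1 \) recovers it.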

Normal Distribution

It turns out that many natural phenomena follow the Normal Distribution. Continuing the example with the basketball players' heights, if we build a big data set of randomly selected players and plot the frequency of heights vs. heights, we will get a "bell" shaped curve, as shown in the following chart:


As you can see, the curve is symmetrical around the mean value, which is 1.9m. The frequency of the values around the mean is higher than the frequency of the distant values.

The standard deviation of the heights equals 0.2m. 68.26% of the values lie within one standard deviation of the mean. As you can see on the chart below, 68.26% of the values lie between 1.7m and 2.1m (the green area is 68.26% of the total area under the curve).

Standard Deviation

95.44% of the values lie within two standard deviations of the mean.
99.74% of the values lie within three standard deviations of the mean.
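These fractions can be computed rather than looked up. The following sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with the mean of 1.9m and the standard deviation of 0.2m from the example; the helper name `fraction_within` is made up for this illustration.

```python
from statistics import NormalDist

# Height distribution from the example: mean 1.9 m, standard deviation 0.2 m.
heights = NormalDist(mu=1.9, sigma=0.2)

def fraction_within(k):
    """Fraction of values that lie within k standard deviations of the mean."""
    lower = heights.mean - k * heights.stdev
    upper = heights.mean + k * heights.stdev
    return heights.cdf(upper) - heights.cdf(lower)

coverage = {k: fraction_within(k) for k in (1, 2, 3)}
for k, fraction in coverage.items():
    print(f"within {k} standard deviation(s): {fraction:.2%}")
```

Computed this way, the fractions round to 68.27%, 95.45% and 99.73%. Figures that differ in the last digit, such as 68.26%, typically come from doubling rounded entries of a printed normal table.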

The normal distribution, also known as the Gaussian distribution (named after the mathematician Carl Friedrich Gauss), is described by the following equation:

\[ f \left( x; \mu , \sigma ^{2} \right) = \frac{1}{\sqrt{2 \pi \sigma ^{2}}}e^{\frac{- \left( x- \mu \right) ^{2}}{2 \sigma ^{2}}} \]

The Gaussian curve is also called the Probability Density Function (PDF) for the normal distribution.
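As a rough illustration, the equation can be written as a small Python function. The function name `gaussian_pdf` and the midpoint-rule integration are choices made for this sketch; the mean and standard deviation are the ones from the height example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal probability density f(x; mu, sigma^2)."""
    variance = sigma ** 2
    exponent = -((x - mu) ** 2) / (2 * variance)
    return math.exp(exponent) / math.sqrt(2 * math.pi * variance)

mu, sigma = 1.9, 0.2  # height example: mean and standard deviation in meters

peak = gaussian_pdf(mu, mu, sigma)              # density at the mean
one_sigma = gaussian_pdf(mu + sigma, mu, sigma)  # density one sigma away

# Midpoint-rule integration over mu +/- 6 sigma: the area should be close to 1.
steps = 10_000
lower, upper = mu - 6 * sigma, mu + 6 * sigma
width = (upper - lower) / steps
area = sum(gaussian_pdf(lower + (i + 0.5) * width, mu, sigma)
           for i in range(steps)) * width

print(f"density at the mean: {peak:.4f}")
print(f"density at one standard deviation: {one_sigma:.4f}")
print(f"area under the curve: {area:.6f}")
```

Note that the density at the mean is larger than 1 here. A PDF value is not a probability; only the area under the curve over an interval is.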

Usually, the measurement errors are distributed normally. The Kalman Filter design assumes normal distribution of the measurement errors.

Random Variables

A mathematician, a physicist and an engineer are driving in a 60mph (miles per hour) zone. They are stopped by a policeman who measures the car speed with a laser speed gun.

The speed gun measurement is 70mph. The speed gun measurement is normally distributed with a standard deviation of 5mph.

The speed gun measurement is a Random Variable. We don't know the precise speed value; the Expected Value of the speed is 70mph.

The mathematician would say that the car velocity can be any number between negative infinity and positive infinity, while the probability that the velocity lies between 65mph and 75mph is 68.26%.

The physicist would say that the car velocity can be any number larger than the negative speed of light and smaller than the positive speed of light.

The engineer would say that the car velocity can be any number above zero and below 140mph (since the car movement direction is positive and the maximal speed of the car is 140mph).

The policeman would say that the car speed was 70mph and write a fine ticket.
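To get a feeling for what "normally distributed with a standard deviation of 5mph" means, we can simulate many speed gun readings. This is a sketch under an assumption that is not part of the story: the readings scatter around 70mph. The seed and the number of readings are arbitrary.

```python
import random

rng = random.Random(0)  # fixed seed so that the run is repeatable

# Assumption: readings scatter around 70 mph with a standard deviation of 5 mph.
readings = [rng.gauss(70.0, 5.0) for _ in range(100_000)]

average = sum(readings) / len(readings)
within_one_sigma = sum(65.0 <= r <= 75.0 for r in readings) / len(readings)

print(f"average reading: {average:.2f} mph")
print(f"fraction between 65 and 75 mph: {within_one_sigma:.2%}")
```

Each single reading is unpredictable, but the average of many readings settles near the expected value, and roughly two thirds of the readings fall within one standard deviation of it.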

The random variable can be continuous or discrete:

  • The battery charge time or marathon race time are continuous random variables.
  • The number of website visitors or the number of students in the class are discrete random variables, since they can be counted.

All measurements are continuous random variables.

Estimate, Accuracy and Precision

Estimate is about evaluating the hidden state of the system. For example, the true position of an aircraft is hidden from the observer. We can estimate the aircraft position using sensors, such as radar. The estimate can be significantly improved by using multiple sensors and applying advanced estimation and tracking algorithms (such as the Kalman Filter). Every measured or computed parameter is an estimate.

Accuracy indicates how close the measurement is to the true value.

Precision describes how much variability there is in a number of measurements of the same parameter. Accuracy and precision form the basis of the estimate.

The following figure illustrates accuracy and precision.

Accuracy and Precision

The high-precision systems have low variance in their measurements (i.e. low uncertainty), while the low precision systems have high variance in their measurements (i.e. high uncertainty). The variance is produced by the random measurement error.

The low accuracy systems are called biased systems, since their measurements have a built-in systematic error (bias).

The influence of the variance can be significantly reduced by averaging or smoothing measurements. For example, if we measure temperature using a thermometer with a random measurement error, we can make multiple measurements and average them. Since the error is random, some of the measurements would be above the true value and others below the true value. The estimate would be close to the true value. The more measurements we make, the closer the estimate would be.

On the other hand, if the thermometer is biased, the estimate will include a constant systematic error.
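The thermometer example can be simulated in a few lines. All numbers below (a true temperature of 25 degrees, a noise standard deviation of 0.5 degrees and a bias of 1 degree) are hypothetical values chosen for this sketch.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so that the run is repeatable

TRUE_TEMP = 25.0  # hypothetical true temperature, degrees
NOISE_STD = 0.5   # hypothetical standard deviation of the random error
BIAS = 1.0        # hypothetical systematic error of the biased thermometer

def measure(bias=0.0):
    """One reading: true value, plus optional bias, plus random error."""
    return TRUE_TEMP + bias + rng.gauss(0.0, NOISE_STD)

for n in (1, 10, 100, 1_000):
    print(f"average of {n:>5} unbiased readings: {mean(measure() for _ in range(n)):.3f}")

unbiased_avg = mean(measure() for _ in range(10_000))
biased_avg = mean(measure(BIAS) for _ in range(10_000))

print(f"unbiased thermometer, 10000 readings: {unbiased_avg:.3f}")
print(f"biased thermometer, 10000 readings:   {biased_avg:.3f}")
```

Averaging shrinks the random error, but the average of the biased thermometer stays about one degree above the true value regardless of the number of readings.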

All examples in this tutorial assume unbiased systems.


The following figure represents a statistical view of the measurement.

Statistical view of the measurement

The measurement is a random variable, described by the Probability Density Function (PDF).

The mean of the measurements is the Expected Value of the random variable.

The offset between the mean of the measurements and the true value is the measurement accuracy, also known as bias or systematic measurement error.

The dispersion of the distribution is the measurement precision, also known as the measurement noise or random measurement error or measurement uncertainty.
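As a small numeric illustration of these terms, the sketch below takes the five weight measurements from the beginning of this section and computes their offset and dispersion. The true weight of 80kg is an assumption made purely for this illustration, since in practice the true value is hidden. The dispersion is computed with `statistics.stdev`, which normalizes by \( N-1 \).

```python
from statistics import mean, stdev

readings = [79.8, 80.0, 80.1, 79.8, 80.2]  # weight measurements in kg

# Assumption for illustration only: in practice the true value is hidden.
ASSUMED_TRUE_WEIGHT = 80.0

measurements_mean = mean(readings)              # estimate of the expected value
bias = measurements_mean - ASSUMED_TRUE_WEIGHT  # accuracy (systematic error)
spread = stdev(readings)                        # precision (random error)

print(f"mean of the measurements: {measurements_mean:.2f} kg")
print(f"offset from the assumed true value: {bias:+.2f} kg")
print(f"standard deviation of the measurements: {spread:.3f} kg")
```

With only five readings, both numbers are rough estimates themselves; a small offset like this one can easily be caused by the random error alone.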
