Before we tackle the multidimensional Kalman Filter, we'll need to review some essential math topics:

- Matrix operations
- Covariance and covariance matrices
- Expectation algebra

If you are familiar with these topics, you can jump to the next chapter.

All you need to know are basic terms and operations, such as:

- Vector and matrix addition and multiplication
- Matrix transpose
- Matrix inverse (you don't need to invert matrices yourself; you only need to know what the inverse of a matrix is)
- Symmetric matrix

There are numerous Linear Algebra textbooks and web tutorials that cover these topics.

You can find a good tutorial on this topic on the visiondummy site:

https://www.visiondummy.com/2014/04/geometric-interpretation-covariance-matrix/

I am going to use the expectation algebra rules extensively in the derivations of the Kalman Filter equations. If you are interested in understanding the derivations, you need to master expectation algebra.

You already know what a random variable is and what an expected value (or expectation) is. If not, please read the previous background break page.

The expectation is denoted by the capital letter \( E \).

The expectation of a random variable \( X \), denoted \( E(X) \), equals the mean of the random variable:

\[ E(X) = \mu_{X} \]

where \( \mu_{X} \) is the mean of the random variable.

Here are some basic expectation rules:

| Rule | Equation | Notes |
| --- | --- | --- |
| 1 | \( E(X) = \mu_{X} = \Sigma xp(x) \) | \( p(x) \) is the probability of \( x \) (discrete case) |
| 2 | \( E(a) = a \) | \( a \) is a constant |
| 3 | \( E(aX) = aE(X) \) | \( a \) is a constant |
| 4 | \( E(a \pm X) = a \pm E(X) \) | \( a \) is a constant |
| 5 | \( E(a \pm bX) = a \pm bE(X) \) | \( a \) and \( b \) are constants |
| 6 | \( E(X \pm Y) = E(X) \pm E(Y) \) | \( Y \) is another random variable |
| 7 | \( E(XY) = E(X)E(Y) \) | if \( X \) and \( Y \) are independent |
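If you want to convince yourself of these rules, they can be checked numerically. The following sketch (not part of the original text; the discrete distribution is made up for illustration) verifies rules 1 and 3–7 with NumPy:

```python
import numpy as np

# A small made-up discrete random variable: values and their probabilities
x = np.array([1.0, 2.0, 3.0, 4.0])
p = np.array([0.1, 0.2, 0.3, 0.4])   # probabilities sum to 1

# Rule 1: E(X) = sum of x * p(x)
E_X = np.sum(x * p)

a, b = 5.0, 2.0

# Rule 3: E(aX) = a * E(X)
assert np.isclose(np.sum(a * x * p), a * E_X)

# Rule 4: E(a + X) = a + E(X)
assert np.isclose(np.sum((a + x) * p), a + E_X)

# Rule 5: E(a + bX) = a + b * E(X)
assert np.isclose(np.sum((a + b * x) * p), a + b * E_X)

# For rules 6 and 7 we need an independent Y:
# independence means the joint pmf is the product p(x) * p(y)
y = np.array([10.0, 20.0])
q = np.array([0.5, 0.5])
E_Y = np.sum(y * q)
joint = np.outer(p, q)
X_grid, Y_grid = np.meshgrid(x, y, indexing="ij")

# Rule 6: E(X + Y) = E(X) + E(Y)
assert np.isclose(np.sum((X_grid + Y_grid) * joint), E_X + E_Y)

# Rule 7: E(XY) = E(X)E(Y) for independent X and Y
assert np.isclose(np.sum(X_grid * Y_grid * joint), E_X * E_Y)
```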

The following table includes the variance and covariance expectation rules.

| Rule | Equation | Notes |
| --- | --- | --- |
| 8 | \( V(a) = 0 \) | \( V(a) \) is the variance of \( a \); \( a \) is a constant |
| 9 | \( V(a \pm X) = V(X) \) | \( V(X) \) is the variance of \( X \); \( a \) is a constant |
| 10 | \( V(X) = E(X^{2}) - \mu_{X}^{2} \) | \( V(X) \) is the variance of \( X \) |
| 11 | \( COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \) | \( COV(X,Y) \) is the covariance of \( X \) and \( Y \) |
| 12 | \( COV(X,Y) = 0 \) | if \( X \) and \( Y \) are independent |
| 13 | \( V(aX) = a^{2}V(X) \) | \( a \) is a constant |
| 14 | \( V(X \pm Y) = V(X) + V(Y) \pm 2COV(X,Y) \) | |
| 15 | \( V(XY) \neq V(X)V(Y) \) | |

The variance and covariance expectation rules are not straightforward. I will prove some of them.

\[ V(a) = 0 \]

A constant does not vary, so the variance of a constant is 0.

\[ V(a \pm X) = V(X) \]

Adding a constant to the variable does not change its variance.
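Rules 8 and 9 are easy to confirm numerically as well. A short NumPy sketch (the sample data is made up for illustration), using the population-variance definition \( V(X) = E((X-\mu_{X})^2) \):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=2000)   # a made-up sample of a random variable
a = 4.0

# Population variance of a sample: the mean squared deviation from the mean
def V(Z):
    return np.mean((Z - np.mean(Z)) ** 2)

# Rule 8: a constant does not vary, so its variance is zero
assert V(np.full(5, a)) == 0.0

# Rule 9: adding a constant shifts X but leaves its spread unchanged
assert np.isclose(V(a + X), V(X))
```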

\[ V(X) = E(X^{2}) - \mu_{X}^{2} \]

__The proof:__

| Derivation | Notes |
| --- | --- |
| \( V(X) = \sigma_{X}^2 = E((X - \mu_{X})^2) = \) | |
| \( E(X^2 - 2X\mu_{X} + \mu_{X}^2) = \) | Expand the square |
| \( E(X^2) - E(2X\mu_{X}) + E(\mu_{X}^2) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \) |
| \( E(X^2) - 2\mu_{X}E(X) + E(\mu_{X}^2) = \) | Applied rule number 3: \( E(aX) = aE(X) \) |
| \( E(X^2) - 2\mu_{X}E(X) + \mu_{X}^2 = \) | Applied rule number 2: \( E(a) = a \) |
| \( E(X^2) - 2\mu_{X}\mu_{X} + \mu_{X}^2 = \) | Applied rule number 1: \( E(X) = \mu_{X} \) |
| \( E(X^2) - \mu_{X}^2 \) | |
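The identity just proved can be verified on sample data. A minimal NumPy check (the sample is made up; both sides use the population formulas):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(loc=5.0, scale=2.0, size=1000)  # made-up sample

mu = np.mean(X)

# Left side: the definition V(X) = E((X - mu_X)^2)
var_def = np.mean((X - mu) ** 2)

# Right side: the identity V(X) = E(X^2) - mu_X^2
var_identity = np.mean(X ** 2) - mu ** 2

assert np.isclose(var_def, var_identity)
```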

\[ COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \]

__The proof:__

| Derivation | Notes |
| --- | --- |
| \( COV(X,Y) = E((X - \mu_{X})(Y - \mu_{Y})) = \) | |
| \( E(XY - X \mu_{Y} - Y \mu_{X} + \mu_{X}\mu_{Y}) = \) | Expand the product |
| \( E(XY) - E(X \mu_{Y}) - E(Y \mu_{X}) + E(\mu_{X}\mu_{Y}) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \) |
| \( E(XY) - \mu_{Y} E(X) - \mu_{X} E(Y) + E(\mu_{X}\mu_{Y}) = \) | Applied rule number 3: \( E(aX) = aE(X) \) |
| \( E(XY) - \mu_{Y} E(X) - \mu_{X} E(Y) + \mu_{X}\mu_{Y} = \) | Applied rule number 2: \( E(a) = a \) |
| \( E(XY) - \mu_{Y} \mu_{X} - \mu_{X} \mu_{Y} + \mu_{X}\mu_{Y} = \) | Applied rule number 1: \( E(X) = \mu_{X} \) |
| \( E(XY) - \mu_{X}\mu_{Y} \) | |
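The covariance identity holds for any pair of variables, correlated or not. A quick NumPy check on made-up correlated samples:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=1000)
Y = 0.5 * X + rng.normal(size=1000)   # made-up, correlated with X

mu_x, mu_y = np.mean(X), np.mean(Y)

# Definition: COV(X, Y) = E((X - mu_X)(Y - mu_Y))
cov_def = np.mean((X - mu_x) * (Y - mu_y))

# Identity: COV(X, Y) = E(XY) - mu_X * mu_Y
cov_identity = np.mean(X * Y) - mu_x * mu_y

assert np.isclose(cov_def, cov_identity)
```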

\[ V(aX) = a^{2}V(X) \]

__The proof:__

| Derivation | Notes |
| --- | --- |
| \( V(K) = \sigma_{K}^2 = E(K^{2}) - \mu_{K}^2 \) | Applied rule number 10 to a variable \( K \) |
| \( K = aX \) | |
| \( V(K) = V(aX) = E((aX)^{2}) - (a \mu_{X})^{2} = \) | Substitute \( K \) with \( aX \); \( \mu_{K} = a\mu_{X} \) by rule number 3 |
| \( E(a^{2}X^{2}) - a^{2} \mu_{X}^{2} = \) | |
| \( a^{2}E(X^{2}) - a^{2}\mu_{X}^{2} = \) | Applied rule number 3: \( E(aX) = aE(X) \) |
| \( a^{2}(E(X^{2}) - \mu_{X}^{2}) = \) | Factor out \( a^{2} \) |
| \( a^{2}V(X) \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^{2} \) |

For example, for constant velocity motion, the displacement is:

\[ x = v \Delta t \]

Where:

| Variable | Meaning |
| --- | --- |
| \( x \) | is the displacement of the body |
| \( v \) | is the velocity of the body |
| \( \Delta t \) | is the time interval |

Applying rule number 13 with \( a = \Delta t \), the displacement variance is:

\[ V(x) = V(v \Delta t) = \Delta t^{2} V(v) \]
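Rule 13 applied to the constant velocity relation \( x = v \Delta t \) gives \( V(x) = \Delta t^{2} V(v) \). A quick NumPy check on made-up velocity samples (the values and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
dt = 0.1                                         # made-up time interval (s)
v = rng.normal(loc=20.0, scale=1.0, size=1000)   # made-up noisy velocity samples

# Population variance of a sample
def V(Z):
    return np.mean((Z - np.mean(Z)) ** 2)

x = v * dt   # displacement for constant velocity motion

# Rule 13 with a = dt: V(x) = dt^2 * V(v)
assert np.isclose(V(x), dt ** 2 * V(v))
```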

\[ V(X \pm Y) = V(X) + V(Y) \pm 2COV(X,Y) \]

__The proof:__

| Derivation | Notes |
| --- | --- |
| \( V(X \pm Y) = \) | |
| \( E((X \pm Y)^{2}) - (\mu_{X} \pm \mu_{Y})^{2} = \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^{2} \) |
| \( E(X^{2} \pm 2XY + Y^{2}) - (\mu_{X}^{2} \pm 2\mu_{X}\mu_{Y} + \mu_{Y}^{2}) = \) | Expand the squares |
| \( \color{red}{E(X^{2}) - \mu_{X}^{2}} + \color{blue}{E(Y^{2}) - \mu_{Y}^{2}} \pm 2(E(XY) - \mu_{X}\mu_{Y}) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \) |
| \( \color{red}{V(X)} + \color{blue}{V(Y)} \pm 2(E(XY) - \mu_{X}\mu_{Y}) = \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^{2} \) |
| \( V(X) + V(Y) \pm 2COV(X,Y) \) | Applied rule number 11: \( COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \) |
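Rule 14 can be checked for both signs on made-up correlated samples, again using the population formulas for variance and covariance:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=1000)
Y = 0.3 * X + rng.normal(size=1000)   # made-up, correlated with X

# Population variance and covariance of samples
def V(Z):
    return np.mean((Z - np.mean(Z)) ** 2)

def COV(U, W):
    return np.mean((U - np.mean(U)) * (W - np.mean(W)))

# Rule 14, both signs: V(X +/- Y) = V(X) + V(Y) +/- 2 COV(X, Y)
assert np.isclose(V(X + Y), V(X) + V(Y) + 2 * COV(X, Y))
assert np.isclose(V(X - Y), V(X) + V(Y) - 2 * COV(X, Y))
```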

Assume vector \( \boldsymbol{x} \) with \( k \) elements:

\[ \boldsymbol{x} = \left[ \begin{matrix} x_{1}\\ x_{2}\\ \vdots \\ x_{k}\\ \end{matrix} \right] \]

The covariance matrix of the vector \( \boldsymbol{x} \) is the square matrix of all pairwise covariances of its elements. It can be derived as follows:

\[ COV(\boldsymbol{x}) = E \left( \left[ \begin{matrix} (x_{1} - \mu_{x_{1}})^{2} & (x_{1} - \mu_{x_{1}})(x_{2} - \mu_{x_{2}}) & \cdots & (x_{1} - \mu_{x_{1}})(x_{k} - \mu_{x_{k}}) \\ (x_{2} - \mu_{x_{2}})(x_{1} - \mu_{x_{1}}) & (x_{2} - \mu_{x_{2}})^{2} & \cdots & (x_{2} - \mu_{x_{2}})(x_{k} - \mu_{x_{k}}) \\ \vdots & \vdots & \ddots & \vdots \\ (x_{k} - \mu_{x_{k}})(x_{1} - \mu_{x_{1}}) & (x_{k} - \mu_{x_{k}})(x_{2} - \mu_{x_{2}}) & \cdots & (x_{k} - \mu_{x_{k}})^{2} \\ \end{matrix} \right] \right) = \]

\[ = E \left( \left[ \begin{matrix} (x_{1} - \mu_{x_{1}}) \\ (x_{2} - \mu_{x_{2}}) \\ \vdots \\ (x_{k} - \mu_{x_{k}}) \\ \end{matrix} \right] \left[ \begin{matrix} (x_{1} - \mu_{x_{1}}) & (x_{2} - \mu_{x_{2}}) & \cdots & (x_{k} - \mu_{x_{k}}) \end{matrix} \right] \right) = \]

\[ = E \left( \left( \boldsymbol{x - \mu_{x}} \right) \left( \boldsymbol{x - \mu_{x}} \right)^{T} \right) \]
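The compact form \( COV(\boldsymbol{x}) = E\left( (\boldsymbol{x} - \boldsymbol{\mu_{x}})(\boldsymbol{x} - \boldsymbol{\mu_{x}})^{T} \right) \) can be verified numerically: averaging the outer products of centered sample vectors reproduces the covariance matrix. A NumPy sketch (the mean and covariance of the made-up 3-element vector are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)

# 1000 samples of a made-up 3-element random vector (one sample per row)
samples = rng.multivariate_normal(
    mean=[0.0, 1.0, 2.0],
    cov=[[2.0, 0.5, 0.0],
         [0.5, 1.0, 0.3],
         [0.0, 0.3, 1.5]],
    size=1000,
)

mu = samples.mean(axis=0)
centered = samples - mu   # x - mu_x for every sample

# COV(x) = E((x - mu_x)(x - mu_x)^T): the average of the outer products
cov_outer = (centered.T @ centered) / len(samples)

# np.cov with bias=True uses the same population formula
assert np.allclose(cov_outer, np.cov(samples.T, bias=True))
```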