Before we tackle the multidimensional Kalman Filter, we'll need to review some essential math topics:

- Matrix operations
- Covariance and covariance matrices
- Expectation algebra

You can jump to the next chapter if you are familiar with these topics.

All you need to know are basic terms and operations, such as:

- Vector and matrix addition and multiplication
- Matrix transpose
- Matrix inverse (you don't need to invert matrices yourself; you just need to know what the inverse of a matrix is)
- Symmetric matrices

There are numerous Linear Algebra textbooks and web tutorials that cover these topics.

You can find a good tutorial on this topic on the visiondummy site:

https://www.visiondummy.com/2014/04/geometric-interpretation-covariance-matrix/

I make extensive use of expectation algebra rules in the derivations of the Kalman Filter equations. If you are interested in understanding the derivations, you need to master expectation algebra.

You already know what a random variable is and what an expected value (or expectation) is. If not, please read the previous background break page.

The expectation is denoted by the capital letter \( E \).

The expectation of a random variable \( X \), written \( E(X) \), equals the mean of the random variable:

\[ E(X) = \mu_{X} \]

Where \( \mu_{X} \) is the mean of the random variable.

Here are some basic expectation rules:

Rule | Equation | Notes
---|---|---
1 | \( E(X) = \mu_{X} = \Sigma xp(x) \) | \( p(x) \) is the probability of \( x \) (discrete case)
2 | \( E(a) = a \) | \( a \) is constant
3 | \( E(aX) = aE(X) \) | \( a \) is constant
4 | \( E(a \pm X) = a \pm E(X) \) | \( a \) is constant
5 | \( E(a \pm bX) = a \pm bE(X) \) | \( a \) and \( b \) are constant
6 | \( E(X \pm Y) = E(X) \pm E(Y) \) | \( Y \) is another random variable
7 | \( E(XY) = E(X)E(Y) \) | if \( X \) and \( Y \) are independent
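These rules are easy to verify numerically. The following sketch checks rules 1, 3, and 5 with NumPy; the values and probabilities of the discrete random variable below are arbitrary choices for illustration:

```python
import numpy as np

# A made-up discrete random variable: values x and their probabilities p(x)
x = np.array([1.0, 2.0, 3.0, 4.0])
p = np.array([0.1, 0.2, 0.3, 0.4])

E_X = np.sum(x * p)               # rule 1: E(X) = sum of x * p(x), here 3.0
a, b = 5.0, 2.0                   # arbitrary constants

E_aX = np.sum(a * x * p)          # rule 3: should equal a * E(X)
E_a_bX = np.sum((a + b * x) * p)  # rule 5: should equal a + b * E(X)

print(E_X, E_aX, E_a_bX)
```

Each expectation computed directly from the distribution matches the value predicted by the corresponding rule, up to floating-point rounding.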

The following table includes the variance and covariance expectation rules.

Rule | Equation | Notes
---|---|---
8 | \( V(a) = 0 \) | \( V(a) \) is the variance of \( a \); \( a \) is constant
9 | \( V(a \pm X) = V(X) \) | \( V(X) \) is the variance of \( X \); \( a \) is constant
10 | \( V(X) = E(X^{2}) - \mu_{X}^{2} \) | \( V(X) \) is the variance of \( X \)
11 | \( COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \) | \( COV(X,Y) \) is the covariance of \( X \) and \( Y \)
12 | \( COV(X,Y) = 0 \) | if \( X \) and \( Y \) are independent
13 | \( V(aX) = a^{2}V(X) \) | \( a \) is constant
14 | \( V(X \pm Y) = V(X) + V(Y) \pm 2COV(X,Y) \) | |
15 | \( V(XY) \neq V(X)V(Y) \) | |
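A quick sample-based sanity check of rules 9, 12, and 13; the distributions and constants below are arbitrary choices for illustration, and the independence check (rule 12) holds only approximately for a finite sample:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(2.0, 1.5, 1_000_000)   # samples of a random variable X
Y = rng.normal(-1.0, 0.5, 1_000_000)  # independent of X
a = 3.0                               # arbitrary constant

# rule 9: adding a constant does not change the variance
print(np.var(a + X), np.var(X))

# rule 12: the sample covariance of independent variables is near zero
print(np.cov(X, Y)[0, 1])

# rule 13: scaling by a scales the variance by a^2
print(np.var(a * X), a**2 * np.var(X))
```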

The variance and covariance expectation rules are not as straightforward, so I prove some of them.

\[ V(a) = 0 \]

A constant does not vary, so the variance of a constant is 0.

\[ V(a \pm X) = V(X) \]

Adding a constant to the variable does not change its variance.

\[ V(X) = E(X^{2}) - \mu_{X}^{2} \]

__The proof:__

Equation | Notes
---|---
\( V(X) = \sigma_{X}^2 = E((X - \mu_{X})^2) = \) |
\( E(X^2 - 2X\mu_{X} + \mu_{X}^2) = \) | Expand the square
\( E(X^2) - E(2X\mu_{X}) + E(\mu_{X}^2) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \)
\( E(X^2) - 2\mu_{X}E(X) + E(\mu_{X}^2) = \) | Applied rule number 3: \( E(aX) = aE(X) \)
\( E(X^2) - 2\mu_{X}E(X) + \mu_{X}^2 = \) | Applied rule number 2: \( E(a) = a \)
\( E(X^2) - 2\mu_{X}\mu_{X} + \mu_{X}^2 = \) | Applied rule number 1: \( E(X) = \mu_{X} \)
\( E(X^2) - \mu_{X}^2 \) |
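The identity can also be checked by direct computation on a small discrete distribution; the values and probabilities below are made up for illustration:

```python
import numpy as np

# A made-up discrete random variable: values x and their probabilities p(x)
x = np.array([0.0, 1.0, 2.0, 3.0])
p = np.array([0.4, 0.3, 0.2, 0.1])

mu = np.sum(x * p)                   # E(X)
V_def = np.sum((x - mu)**2 * p)      # definition: V(X) = E((X - mu)^2)
V_rule10 = np.sum(x**2 * p) - mu**2  # rule 10:    V(X) = E(X^2) - mu_X^2

print(V_def, V_rule10)               # both equal 1.0 for this distribution
```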

\[ COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \]

__The proof:__

Equation | Notes
---|---
\( COV(X,Y) = E((X - \mu_{X})(Y - \mu_{Y})) = \) |
\( E(XY - X \mu_{Y} - Y \mu_{X} + \mu_{X}\mu_{Y}) = \) | Expand the product
\( E(XY) - E(X \mu_{Y}) - E(Y \mu_{X}) + E(\mu_{X}\mu_{Y}) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \)
\( E(XY) - \mu_{Y} E(X) - \mu_{X} E(Y) + E(\mu_{X}\mu_{Y}) = \) | Applied rule number 3: \( E(aX) = aE(X) \)
\( E(XY) - \mu_{Y} E(X) - \mu_{X} E(Y) + \mu_{X}\mu_{Y} = \) | Applied rule number 2: \( E(a) = a \)
\( E(XY) - \mu_{Y} \mu_{X} - \mu_{X} \mu_{Y} + \mu_{X}\mu_{Y} = \) | Applied rule number 1: \( E(X) = \mu_{X} \)
\( E(XY) - \mu_{X}\mu_{Y} \) |
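The same identity can be checked on samples of two correlated random variables; the construction of \( Y \) from \( X \) below is an arbitrary choice that guarantees a nonzero covariance:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(1.0, 2.0, 1_000_000)
Y = 0.5 * X + rng.normal(0.0, 1.0, 1_000_000)  # correlated with X

# definition: COV(X, Y) = E((X - mu_X)(Y - mu_Y))
cov_def = np.mean((X - X.mean()) * (Y - Y.mean()))

# rule 11: COV(X, Y) = E(XY) - mu_X * mu_Y
cov_rule11 = np.mean(X * Y) - X.mean() * Y.mean()

print(cov_def, cov_rule11)  # agree to machine precision
```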

\[ V(aX) = a^{2}V(X) \]

__The proof:__

Equation | Notes
---|---
\( V(K) = \sigma_{K}^2 = E(K^{2}) - \mu_{K}^2 \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^{2} \)
\( K = aX \) | Define \( K \)
\( V(K) = V(aX) = E((aX)^{2}) - (a \mu_{X})^{2} = \) | Substitute \( K = aX \); note that \( \mu_{K} = a\mu_{X} \) by rule 3
\( E(a^{2}X^{2}) - a^{2} \mu_{X}^{2} = \) | Expand the squares
\( a^{2}E(X^{2}) - a^{2}\mu_{X}^{2} = \) | Applied rule number 3: \( E(aX) = aE(X) \)
\( a^{2}(E(X^{2}) - \mu_{X}^{2}) = \) | Factor out \( a^{2} \)
\( a^{2}V(X) \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^{2} \)


\[ V(X \pm Y) = V(X) + V(Y) \pm 2COV(X,Y) \]

__The proof:__

Equation | Notes
---|---
\( V(X \pm Y) = \) |
\( E((X \pm Y)^{2}) - (\mu_{X} \pm \mu_{Y})^{2} = \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^{2} \)
\( E(X^{2} \pm 2XY + Y^{2}) - (\mu_{X}^{2} \pm 2\mu_{X}\mu_{Y} + \mu_{Y}^{2}) = \) | Expand both squares
\( \color{red}{E(X^{2}) - \mu_{X}^{2}} + \color{blue}{E(Y^{2}) - \mu_{Y}^{2}} \pm 2(E(XY) - \mu_{X}\mu_{Y}) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \)
\( \color{red}{V(X)} + \color{blue}{V(Y)} \pm 2(E(XY) - \mu_{X}\mu_{Y}) = \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^{2} \)
\( V(X) + V(Y) \pm 2COV(X,Y) \) | Applied rule number 11: \( COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \)
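Rule 14 can be verified on samples for both signs at once; the correlated pair below is an arbitrary construction for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(0.0, 1.0, 1_000_000)
Y = 0.8 * X + rng.normal(0.0, 0.5, 1_000_000)  # correlated with X

cov = np.mean((X - X.mean()) * (Y - Y.mean()))  # COV(X, Y)

# rule 14 with the plus sign: V(X + Y) = V(X) + V(Y) + 2 COV(X, Y)
print(np.var(X + Y), np.var(X) + np.var(Y) + 2 * cov)

# rule 14 with the minus sign: V(X - Y) = V(X) + V(Y) - 2 COV(X, Y)
print(np.var(X - Y), np.var(X) + np.var(Y) - 2 * cov)
```

Without the covariance term, the two sides would disagree here, since \( X \) and \( Y \) are deliberately not independent.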

Assume a vector \( \boldsymbol{x} \) with \( k \) elements:

\[ \boldsymbol{x} = \left[ \begin{matrix} x_{1}\\ x_{2}\\ \vdots \\ x_{k}\\ \end{matrix} \right] \]

The covariance matrix of the vector \( \boldsymbol{x} \) is given by:

\[ COV(\boldsymbol{x}) = E \left( \left[ \begin{matrix} (x_{1} - \mu_{x_{1}})^{2} & (x_{1} - \mu_{x_{1}})(x_{2} - \mu_{x_{2}}) & \cdots & (x_{1} - \mu_{x_{1}})(x_{k} - \mu_{x_{k}}) \\ (x_{2} - \mu_{x_{2}})(x_{1} - \mu_{x_{1}}) & (x_{2} - \mu_{x_{2}})^{2} & \cdots & (x_{2} - \mu_{x_{2}})(x_{k} - \mu_{x_{k}}) \\ \vdots & \vdots & \ddots & \vdots \\ (x_{k} - \mu_{x_{k}})(x_{1} - \mu_{x_{1}}) & (x_{k} - \mu_{x_{k}})(x_{2} - \mu_{x_{2}}) & \cdots & (x_{k} - \mu_{x_{k}})^{2} \\ \end{matrix} \right] \right) = \]

\[ = E \left( \left[ \begin{matrix} (x_{1} - \mu_{x_{1}}) \\ (x_{2} - \mu_{x_{2}}) \\ \vdots \\ (x_{k} - \mu_{x_{k}}) \\ \end{matrix} \right] \left[ \begin{matrix} (x_{1} - \mu_{x_{1}}) & (x_{2} - \mu_{x_{2}}) & \cdots & (x_{k} - \mu_{x_{k}}) \end{matrix} \right] \right) = \]

\[ = E \left( \left( \boldsymbol{x - \mu_{x}} \right) \left( \boldsymbol{x - \mu_{x}} \right)^{T} \right) \]
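The last line says the covariance matrix is the expectation of the outer product of the deviation vector \( \boldsymbol{x - \mu_{x}} \) with itself. A sample-based sketch (with arbitrary simulated data for \( k = 3 \)) compares this outer-product average against NumPy's built-in estimator:

```python
import numpy as np

rng = np.random.default_rng(3)
# k = 3 elements; each column of x is one sample of the vector
x = rng.normal(size=(3, 100_000))
x[1] += 0.5 * x[0]                     # introduce correlation between elements

mu = x.mean(axis=1, keepdims=True)     # mu_x, shape (3, 1)
d = x - mu                             # x - mu_x for every sample

# E((x - mu_x)(x - mu_x)^T), estimated by averaging the outer products
cov_outer = (d @ d.T) / x.shape[1]

# NumPy's estimator; bias=True matches the 1/n normalization used above
cov_np = np.cov(x, bias=True)

print(np.allclose(cov_outer, cov_np))  # True
```

The result is a symmetric \( 3 \times 3 \) matrix with the variances on the diagonal and the pairwise covariances off the diagonal.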