Before we tackle the multidimensional Kalman Filter, we need to refresh some essential math topics:

- Matrix operations
- Covariance and covariance matrix
- Expectation algebra

If you are familiar with these topics, you can jump to the next chapter.

For matrix operations, all you need to know are basic terms and operations such as:

- Vector and matrix addition and multiplication
- Matrix Transpose
- Matrix Inverse (you don't need to invert matrices by yourself; you just need to know what the inverse of a matrix is)
- Symmetric Matrix

There are numerous Linear Algebra textbooks and web tutorials that cover these topics.

You can find a good tutorial on covariance and the covariance matrix on the visiondummy site:

https://www.visiondummy.com/2014/04/geometric-interpretation-covariance-matrix/

I use the expectation algebra rules extensively in the derivations of the Kalman Filter equations. If you want to understand the derivations, you need to master expectation algebra.

You already know what a random variable is and what an expected value (or expectation) is. If not, please read the previous background break page.

The expectation is denoted by the capital letter \( E \).

The expectation \( E(X) \) of the random variable \( X \) equals the mean of the random variable:

\[ E(X) = \mu_{X} \]

where \( \mu_{X} \) is the mean of the random variable.

Here are some basic expectation rules:

Rule | Formula | Notes |
---|---|---|
1 | \( E(X) = \mu_{X} = \sum x p(x) \) | \( p(x) \) is the probability of \( x \) (discrete case) |
2 | \( E(a) = a \) | \( a \) is constant |
3 | \( E(aX) = aE(X) \) | \( a \) is constant |
4 | \( E(a \pm X) = a \pm E(X) \) | \( a \) is constant |
5 | \( E(a \pm bX) = a \pm bE(X) \) | \( a \) and \( b \) are constant |
6 | \( E(X \pm Y) = E(X) \pm E(Y) \) | \( Y \) is another random variable |
7 | \( E(XY) = E(X)E(Y) \) | if \( X \) and \( Y \) are independent |
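
The rules above can be checked numerically. The following is a minimal sketch, assuming a fair six-sided die as the random variable; the helper `expectation`, the die distribution, and the constants `a` and `b` are illustrative choices, not part of the rules themselves. Exact fractions are used so the comparisons are not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import product


def expectation(pmf, f=lambda x: x):
    """E(f(X)) for a discrete random variable given as {value: probability}."""
    return sum(f(x) * p for x, p in pmf.items())


# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}
mu_x = expectation(die)  # rule 1: sum of x * p(x), equals 7/2

a, b = 3, 2  # arbitrary constants
rule_3 = expectation(die, lambda x: a * x) == a * mu_x
rule_5 = expectation(die, lambda x: a - b * x) == a - b * mu_x

# Two independent dice: the joint probability is the product of the marginals.
joint = {
    (x, y): px * py
    for (x, px), (y, py) in product(die.items(), die.items())
}
e_sum = sum((x + y) * p for (x, y), p in joint.items())
e_prod = sum(x * y * p for (x, y), p in joint.items())
rule_6 = e_sum == mu_x + mu_x
rule_7 = e_prod == mu_x * mu_x

print(mu_x, rule_3, rule_5, rule_6, rule_7)
```

Rule 7 holds here only because the two dice are independent; for dependent variables the joint probability is no longer the product of the marginals.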

The following table includes the variance and covariance expectation rules.

Rule | Formula | Notes |
---|---|---|
8 | \( V(a) = 0 \) | \( V(a) \) is the variance of \( a \); \( a \) is constant |
9 | \( V(a \pm X) = V(X) \) | \( V(X) \) is the variance of \( X \); \( a \) is constant |
10 | \( V(X) = E(X^{2}) - \mu_{X}^{2} \) | \( V(X) \) is the variance of \( X \) |
11 | \( COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \) | \( COV(X,Y) \) is the covariance of \( X \) and \( Y \) |
12 | \( COV(X,Y) = 0 \) | if \( X \) and \( Y \) are independent |
13 | \( V(aX) = a^{2}V(X) \) | \( a \) is constant |
14 | \( V(X \pm Y) = V(X) + V(Y) \pm 2COV(X,Y) \) | |
15 | \( V(XY) \neq V(X)V(Y) \) | in general |
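
Several of these rules can be verified on a small example. The sketch below assumes two independent fair coins with values 0 and 1; the joint distribution, the helper functions `E`, `V`, and `COV`, and the constant `a` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution: two independent fair coins with values 0 and 1.
joint = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}


def E(f):
    """Expectation of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


def V(f):
    """Variance via rule 10: V(Z) = E(Z^2) - E(Z)^2."""
    return E(lambda x, y: f(x, y) ** 2) - E(f) ** 2


def COV(f, g):
    """Covariance via rule 11: COV(Z, W) = E(ZW) - E(Z)E(W)."""
    return E(lambda x, y: f(x, y) * g(x, y)) - E(f) * E(g)


X = lambda x, y: x
Y = lambda x, y: y
a = 5  # arbitrary constant

var_x = V(X)                         # 1/4
var_shifted = V(lambda x, y: a + x)  # rule 9: equals V(X)
var_scaled = V(lambda x, y: a * x)   # rule 13: equals a^2 * V(X)
cov_xy = COV(X, Y)                   # rule 12: 0 for independent variables
var_prod = V(lambda x, y: x * y)     # rule 15: 3/16, while V(X)V(Y) = 1/16

print(var_x, var_shifted, var_scaled, cov_xy, var_prod)
```

Note that the variance of the product is not the product of the variances, even though the two coins are independent.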

The variance and covariance expectation rules are not straightforward. I will prove some of them.

\[ V(a) = 0 \]

A constant does not vary, so the variance of a constant is 0.

\[ V(a \pm X) = V(X) \]

Adding a constant to the variable does not change its variance.

\[ V(X) = E(X^{2}) - \mu_{X}^{2} \]

__The proof:__

Equation | Notes |
---|---|
\( V(X) = \sigma_{X}^2 = E((X - \mu_{X})^2) = \) | |
\( E(X^2 -2X\mu_{X} + \mu_{X}^2) = \) | |
\( E(X^2) - E(2X\mu_{X}) + E(\mu_{X}^2) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \) |
\( E(X^2) - 2\mu_{X}E(X) + E(\mu_{X}^2) = \) | Applied rule number 3: \( E(aX) = aE(X) \) |
\( E(X^2) - 2\mu_{X}E(X) + \mu_{X}^2 = \) | Applied rule number 2: \( E(a) = a \) |
\( E(X^2) - 2\mu_{X}\mu_{X} + \mu_{X}^2 = \) | Applied rule number 1: \( E(X) = \mu_{X} \) |
\( E(X^2) - \mu_{X}^2 \) | |

\[ COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \]

__The proof:__

Equation | Notes |
---|---|
\( COV(X,Y) = E((X - \mu_{X})(Y - \mu_{Y})) = \) | |
\( E(XY - X \mu_{Y} - Y \mu_{X} + \mu_{X}\mu_{Y}) = \) | |
\( E(XY) - E(X \mu_{Y}) - E(Y \mu_{X}) + E(\mu_{X}\mu_{Y}) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \) |
\( E(XY) - \mu_{Y} E(X) - \mu_{X} E(Y) + E(\mu_{X}\mu_{Y}) = \) | Applied rule number 3: \( E(aX) = aE(X) \) |
\( E(XY) - \mu_{Y} E(X) - \mu_{X} E(Y) + \mu_{X}\mu_{Y} = \) | Applied rule number 2: \( E(a) = a \) |
\( E(XY) - \mu_{Y} \mu_{X} - \mu_{X} \mu_{Y} + \mu_{X}\mu_{Y} = \) | Applied rule number 1: \( E(X) = \mu_{X} \) |
\( E(XY) - \mu_{X}\mu_{Y} \) | |

\[ V(aX) = a^{2}V(X) \]

__The proof:__

Equation | Notes |
---|---|
\( V(K) = \sigma_{K}^2 = E(K^{2}) - \mu_{K}^2 \) | Rule number 10 for a random variable \( K \) |
\( K = aX \) | Define \( K \) as \( aX \) |
\( V(K) = V(aX) = E((aX)^{2} - (a \mu_{X})^{2}) = \) | Substitute \( K \) by \( aX \); by rule number 3, \( \mu_{K} = a\mu_{X} \) |
\( E((aX)^{2}) - E(a^{2} \mu_{X}^{2}) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \) |
\( E((aX)^{2}) - a^{2} \mu_{X}^{2} = \) | Applied rule number 2: \( E(a) = a \) |
\( a^{2}E(X^{2}) - a^{2}\mu_{X}^{2} = \) | Applied rule number 3: \( E(aX) = aE(X) \) |
\( a^{2}(E(X^{2}) - \mu_{X}^{2}) = \) | |
\( a^{2}V(X) \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^2 \) |

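For constant velocity motion, the displacement is \( x = v \Delta t \), where \( \Delta t \) is constant, so:

\[ V(x) = \Delta t^{2} V(v) \]

Where: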

\( x \) | is the displacement of the body |

\( v \) | is the velocity of the body |

\( \Delta t \) | is the time interval |
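
As a rough numerical illustration of rule 13 in the constant velocity case, the sketch below assumes the displacement is \( x = v \Delta t \) with a fixed \( \Delta t \). The velocity distribution (Gaussian with a mean of 10 m/s and a standard deviation of 0.5 m/s), the value of `dt`, and the sample size are made-up values.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

dt = 2.0  # time interval in seconds (hypothetical value)

# Hypothetical noisy velocities in m/s.
velocities = [random.gauss(10.0, 0.5) for _ in range(10_000)]

# Constant velocity motion: displacement is velocity times the time interval.
displacements = [v * dt for v in velocities]

var_v = statistics.pvariance(velocities)
var_x = statistics.pvariance(displacements)

# Rule 13 with a = dt: V(x) = dt^2 * V(v)
print(var_x, dt ** 2 * var_v)
print(math.isclose(var_x, dt ** 2 * var_v, rel_tol=1e-9))
```

Doubling the time interval quadruples the displacement variance, which is why the uncertainty of an extrapolated position grows quickly with time.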

\[ V(X \pm Y) = V(X) + V(Y) \pm 2COV(X,Y) \]

__The proof:__

Equation | Notes |
---|---|
\( V(X \pm Y) = \) | |
\( E((X \pm Y)^{2}) - (\mu_{X} \pm \mu_{Y})^{2} = \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^2 \) |
\( E(X^{2} \pm 2XY + Y^{2}) - (\mu_{X}^2 \pm 2\mu_{X}\mu_{Y} + \mu_{Y}^2) = \) | |
\( \color{red}{E(X^{2}) - \mu_{X}^2} + \color{blue}{E(Y^{2}) - \mu_{Y}^2} \pm 2(E(XY) - \mu_{X}\mu_{Y} ) = \) | Applied rule number 6: \( E(X \pm Y) = E(X) \pm E(Y) \) |
\( \color{red}{V(X)} + \color{blue}{V(Y)} \pm 2(E(XY) - \mu_{X}\mu_{Y} ) = \) | Applied rule number 10: \( V(X) = E(X^{2}) - \mu_{X}^2 \) |
\( V(X) + V(Y) \pm 2COV(X,Y) \) | Applied rule number 11: \( COV(X,Y) = E(XY) - \mu_{X}\mu_{Y} \) |
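
The covariance term matters whenever the two variables are correlated. The following sketch checks rule 14 on simulated data; the model that makes `ys` depend on `xs`, the seed, and the sample size are assumptions made only for this illustration. Population (divide by \( n \)) moments are used, for which the identity holds up to floating-point rounding.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable
n = 5_000

xs = [random.gauss(0.0, 1.0) for _ in range(n)]
# Y is correlated with X by construction (hypothetical model).
ys = [0.6 * x + random.gauss(0.0, 0.8) for x in xs]


def mean(values):
    return sum(values) / len(values)


def var(values):
    """Population variance: mean of squared deviations from the mean."""
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])


def cov(us, vs):
    """Population covariance: mean of the products of deviations."""
    mu, mv = mean(us), mean(vs)
    return mean([(u - mu) * (v - mv) for u, v in zip(us, vs)])


var_sum = var([x + y for x, y in zip(xs, ys)])
var_diff = var([x - y for x, y in zip(xs, ys)])
rhs_sum = var(xs) + var(ys) + 2 * cov(xs, ys)
rhs_diff = var(xs) + var(ys) - 2 * cov(xs, ys)

print(var_sum, rhs_sum)
print(var_diff, rhs_diff)
```

Because the covariance is positive in this model, the variance of the sum is larger than \( V(X) + V(Y) \), and the variance of the difference is smaller.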

Assume vector \( \boldsymbol{x} \) with \( k \) elements:

\[ \boldsymbol{x} = \left[ \begin{matrix} x_{1}\\ x_{2}\\ \vdots \\ x_{k}\\ \end{matrix} \right] \]

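The covariance matrix of the vector \( \boldsymbol{x} \) is given by:

\[ COV(\boldsymbol{x}) = E \left( \left( \boldsymbol{x - \mu_{x}} \right) \left( \boldsymbol{x - \mu_{x}} \right)^{T} \right) \]

where \( \boldsymbol{\mu_{x}} \) is the vector of the means of the elements of \( \boldsymbol{x} \).

__The proof:__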

\[ COV(\boldsymbol{x}) = E \left( \left[ \begin{matrix} (x_{1} - \mu_{x_{1}})^{2} & (x_{1} - \mu_{x_{1}})(x_{2} - \mu_{x_{2}}) & \cdots & (x_{1} - \mu_{x_{1}})(x_{k} - \mu_{x_{k}}) \\ (x_{2} - \mu_{x_{2}})(x_{1} - \mu_{x_{1}}) & (x_{2} - \mu_{x_{2}})^{2} & \cdots & (x_{2} - \mu_{x_{2}})(x_{k} - \mu_{x_{k}}) \\ \vdots & \vdots & \ddots & \vdots \\ (x_{k} - \mu_{x_{k}})(x_{1} - \mu_{x_{1}}) & (x_{k} - \mu_{x_{k}})(x_{2} - \mu_{x_{2}}) & \cdots & (x_{k} - \mu_{x_{k}})^{2} \\ \end{matrix} \right] \right) = \]

\[ = E \left( \left[ \begin{matrix} (x_{1} - \mu_{x_{1}}) \\ (x_{2} - \mu_{x_{2}}) \\ \vdots \\ (x_{k} - \mu_{x_{k}}) \\ \end{matrix} \right] \left[ \begin{matrix} (x_{1} - \mu_{x_{1}}) & (x_{2} - \mu_{x_{2}}) & \cdots & (x_{k} - \mu_{x_{k}}) \end{matrix} \right] \right) = \]

\[ = E \left( \left( \boldsymbol{x - \mu_{x}} \right) \left( \boldsymbol{x - \mu_{x}} \right)^{T} \right) \]
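
To make the outer-product form concrete, here is a minimal sketch that builds the covariance matrix of a 2-element vector by averaging \( (\boldsymbol{x - \mu_{x}})(\boldsymbol{x - \mu_{x}})^{T} \) over a handful of equally likely realizations. The sample values and the helper `outer` are hypothetical; exact fractions are used so the result can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical equally likely realizations of a 2-element random vector x.
samples = [(1, 2), (2, 4), (3, 3), (4, 7)]
n = len(samples)
k = len(samples[0])

# Mean vector mu_x: the element-wise expectation.
mu = [Fraction(sum(s[i] for s in samples), n) for i in range(k)]


def outer(d):
    """Outer product d * d^T of a vector with itself, as a k x k nested list."""
    return [[d[i] * d[j] for j in range(k)] for i in range(k)]


# E((x - mu)(x - mu)^T): average the outer products of the deviations.
cov_matrix = [[Fraction(0)] * k for _ in range(k)]
for s in samples:
    deviation = [s[i] - mu[i] for i in range(k)]
    o = outer(deviation)
    for i in range(k):
        for j in range(k):
            cov_matrix[i][j] += o[i][j] / n

for row in cov_matrix:
    print([str(value) for value in row])
```

The diagonal entries are the variances of the individual elements, the off-diagonal entries are the covariances, and the matrix is symmetric because \( COV(x_{i}, x_{j}) = COV(x_{j}, x_{i}) \).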