Covariance Extrapolation Equation

We've already met the Covariance Extrapolation Equation (or Predictor Covariance Equation) in the "One-dimensional Kalman Filter" section. I assume that you are already familiar with the concept of covariance extrapolation (prediction). In this section, we will derive the Kalman Filter Covariance Extrapolation Equation in matrix notation.

The general form of the Covariance Extrapolation Equation is given by:

\[ \boldsymbol{P_{n+1,n} = FP_{n,n}F^{T} + Q} \]
Where:
\( \boldsymbol{P_{n,n}} \) is the uncertainty of an estimate: the covariance matrix of the current state
\( \boldsymbol{P_{n+1,n}} \) is the uncertainty of a prediction: the covariance matrix of the next state
\( \boldsymbol{F} \) is the state transition matrix that we derived in the "Modeling linear dynamic systems" section
\( \boldsymbol{Q} \) is the process noise matrix
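
The equation above can be sketched in a few lines of plain Python. This is only an illustration: the helper names (`matmul`, `transpose`, `extrapolate_covariance`), the two-state constant velocity \( \boldsymbol{F} \), and the numeric values of \( \boldsymbol{P_{n,n}} \) and \( \boldsymbol{Q} \) are assumptions chosen for the example.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def extrapolate_covariance(F, P, Q):
    """Return P_{n+1,n} = F P_{n,n} F^T + Q."""
    FPFt = matmul(matmul(F, P), transpose(F))
    return [[FPFt[i][j] + Q[i][j] for j in range(len(Q[0]))]
            for i in range(len(Q))]


dt = 1.0                              # assumed time step
F = [[1.0, dt], [0.0, 1.0]]           # assumed model: position and velocity
P = [[4.0, 0.0], [0.0, 1.0]]          # illustrative estimate covariance
Q = [[0.25, 0.5], [0.5, 1.0]]         # illustrative process noise matrix

P_pred = extrapolate_covariance(F, P, Q)
print(P_pred)  # [[5.25, 1.5], [1.5, 2.0]]
```

Note that the prediction uncertainty is larger than the estimate uncertainty, and that the position and velocity become correlated even though \( \boldsymbol{P_{n,n}} \) was diagonal.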

The estimate uncertainty without process noise

Let's assume that the process noise is equal to zero \( (Q=0) \), then:

\[ \boldsymbol{P_{n+1,n} = FP_{n,n}F^{T}} \]

The derivation is quite straightforward. I've shown in the "Background break" chapter that:

\[ COV(\boldsymbol{x}) = E \left( \left( \boldsymbol{x - \mu_{x}} \right) \left( \boldsymbol{x - \mu_{x}} \right)^{T} \right) \]

where \( \boldsymbol{x} \) is the system state vector.

Therefore:

\[ \boldsymbol{P_{n,n}} = E \left( \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \right) \]

According to the state extrapolation equation:

\[ \boldsymbol{\hat{x}_{n+1,n}=F\hat{x}_{n,n}+G\hat{u}_{n,n}} \]

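The control input \( \boldsymbol{\hat{u}_{n,n}} \) is treated as a known (deterministic) quantity, so the mean of the prediction is \( \boldsymbol{\mu_{x_{n+1,n}} = F\mu_{x_{n,n}} + G\hat{u}_{n,n}} \). Therefore: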

\[ \boldsymbol{P_{n+1,n}} = E \left( \left( \boldsymbol{\hat{x}_{n+1,n} - \mu_{x_{n+1,n}}} \right) \left( \boldsymbol{\hat{x}_{n+1,n} - \mu_{x_{n+1,n}}} \right)^{T} \right) = \]

\[ = E \left( \left( \boldsymbol{F\hat{x}_{n,n} + G\hat{u}_{n,n} - F\mu_{x_{n,n}} - G\hat{u}_{n,n}} \right) \left( \boldsymbol{F\hat{x}_{n,n} + G\hat{u}_{n,n} - F\mu_{x_{n,n}} - G\hat{u}_{n,n}} \right)^{T} \right) = \]

\[ = E \left( \boldsymbol{F} \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{F} \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \right)^{T} \right) = \]

Apply the matrix transpose property: \( \boldsymbol{(AB)^T = B^T A^T} \)

\[ = E \left(\boldsymbol{F} \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \boldsymbol{F^{T}} \right) = \]

\[ = \boldsymbol{F} E \left( \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \right) \boldsymbol{F^{T}} = \]

\[ = \boldsymbol{F P_{n,n} F^{T}} \]
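
The result can also be checked numerically. The sketch below is not part of the derivation: it takes a small, made-up set of state samples, propagates each one through an assumed constant velocity \( \boldsymbol{F} \) plus a constant offset that stands in for \( \boldsymbol{G\hat{u}_{n,n}} \), and compares the covariance of the propagated samples with \( \boldsymbol{FP_{n,n}F^{T}} \). All names and numbers are illustrative.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def covariance(samples):
    """Covariance matrix E((x - mu)(x - mu)^T) of a list of vectors."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
             for j in range(dim)] for i in range(dim)]


dt = 2.0
F = [[1.0, dt], [0.0, 1.0]]       # assumed constant velocity transition
offset = [0.3, 0.1]               # stands in for the deterministic term G*u

# Made-up [position, velocity] samples
samples = [[1.0, 0.5], [2.0, -0.5], [0.0, 1.5], [3.0, 0.5], [-1.0, -2.0]]

# x_next = F x + G u for every sample
propagated = [[sum(F[i][k] * s[k] for k in range(2)) + offset[i]
               for i in range(2)] for s in samples]

P = covariance(samples)
lhs = covariance(propagated)                  # covariance after propagation
rhs = matmul(matmul(F, P), transpose(F))      # F P F^T

print(lhs)
print(rhs)
```

The two printed matrices agree up to floating-point rounding. The constant offset has no effect on the covariance, which mirrors the cancellation of the \( \boldsymbol{G\hat{u}_{n,n}} \) terms in the derivation.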

Constructing the process noise matrix \( Q \)

As you already know, the system dynamics are described by:

\[ \boldsymbol{\hat{x}_{n+1,n}=F\hat{x}_{n,n}+G\hat{u}_{n,n}+w_{n}} \]

Where \( \boldsymbol{w_{n}} \) is the process noise at time step \( n \).

We've discussed the process noise and its influence on the Kalman Filter performance in the "One-dimensional Kalman Filter" section. In the one-dimensional Kalman Filter, the process noise variance is denoted by \( q \).

In the multidimensional Kalman Filter, the process noise is described by a covariance matrix denoted by \( \boldsymbol{Q} \).

We've seen that the process noise variance has a critical influence on the Kalman Filter performance. A \( q \) value that is too small causes a lag error (see Example 7). If the \( q \) value is too large, the Kalman Filter follows the measurements (see Example 8) and produces noisy estimates.

The process noise can be independent between different state variables. In this case, the process noise covariance matrix \( \boldsymbol{Q} \) is a diagonal matrix:

\[ \boldsymbol{Q} = \left[ \begin{matrix} q_{11} & 0 & \cdots & 0 \\ 0 & q_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & q_{kk} \\ \end{matrix} \right] \]

The process noise can also be dependent. Consider the constant velocity model as an example. The model assumes zero acceleration (\(a=0\)). However, a random variance in acceleration \( \sigma^{2}_{a} \) causes a variance in velocity and position. In this case, the process noise is correlated between the state variables.

There are two models for the environment process noise.

  • Discrete noise model
  • Continuous noise model

I will describe both models.

Discrete noise model

The discrete noise model assumes that the noise changes from one time period to the next but stays constant within each time period.

(Figure: discrete noise)

For the constant velocity model, the process noise covariance matrix looks like:

\[ \boldsymbol{Q} = \left[ \begin{matrix} V(x) & COV(x,v) \\ COV(v,x) & V(v) \\ \end{matrix} \right] \]

We will express the position and velocity variance and covariance in terms of the random acceleration variance of the model: \( \sigma^{2}_{a} \).

I will derive the matrix elements using the expectation algebra rules (you can find them in the "Background" section).

\[ V(v) = \sigma^{2}_{v} = E\left(v^{2}\right) - \mu_{v}^{2} = E \left( \left( a\Delta t\right)^{2}\right) - \left(\mu_{a}\Delta t\right)^{2} = \Delta t^{2}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \Delta t^{2}\sigma^{2}_{a} \]

\[ V(x) = \sigma^{2}_{x} = E\left(x^{2}\right) - \mu_{x}^{2} = E \left( \left( \frac{1}{2}a\Delta t^{2}\right)^{2}\right) - \left(\frac{1}{2}\mu_{a}\Delta t^{2}\right)^{2} = \frac{\Delta t^{4}}{4}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \frac{\Delta t^{4}}{4}\sigma^{2}_{a} \]

\[ COV(x,v) = COV(v,x) = E\left(xv\right) - \mu_{x}\mu_{v} = E\left( \frac{1}{2}a\Delta t^{2}a\Delta t\right) - \left( \frac{1}{2}\mu_{a}\Delta t^{2}\mu_{a}\Delta t\right) = \frac{\Delta t^{3}}{2}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \frac{\Delta t^{3}}{2}\sigma^{2}_{a} \]

Now we can substitute the results into the \( \boldsymbol{Q} \) matrix:

\[ \boldsymbol{Q} = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} \\ \end{matrix} \right] \]
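
As a quick numeric illustration of this matrix, here is a minimal Python sketch. The function name `discrete_q_constant_velocity` and the values of \( \Delta t \) and \( \sigma^{2}_{a} \) are assumptions made for the example.

```python
import math


def discrete_q_constant_velocity(dt, sigma_a2):
    """Discrete process noise matrix for a [position, velocity] state."""
    return [[sigma_a2 * dt**4 / 4, sigma_a2 * dt**3 / 2],
            [sigma_a2 * dt**3 / 2, sigma_a2 * dt**2]]


dt = 0.5          # assumed time step
sigma_a2 = 4.0    # assumed random acceleration variance

Q = discrete_q_constant_velocity(dt, sigma_a2)
print(Q)  # [[0.0625, 0.25], [0.25, 1.0]]

# Correlation coefficient between the position and velocity noise
correlation = Q[0][1] / math.sqrt(Q[0][0] * Q[1][1])
print(correlation)  # 1.0
```

The correlation coefficient of 1 reflects the fact that, in this model, both the position noise and the velocity noise come from the same random acceleration.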

I will also show two faster methods to construct the \( \boldsymbol{Q} \) matrix.

Projection using the state transition matrix

If the dynamic model doesn't include a control input, we can project the random variance in acceleration \( \sigma^{2}_{a} \) on our dynamic model using the state transition matrix.

Let us define a matrix \( \boldsymbol{Q}_{a} \):

\[ \boldsymbol{Q}_{a} = \left[ \begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \\ \end{matrix} \right] \sigma^{2}_{a} \]

The process noise matrix is:

\[ \boldsymbol{Q} = \boldsymbol{F}\boldsymbol{Q}_{a}\boldsymbol{F^{T}} \]

For the motion model, the \( \boldsymbol{F} \) matrix is given by:

\[ \boldsymbol{F} = \left[ \begin{matrix} 1 & \Delta t & \frac{\Delta t^{2}}{2} \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \\ \end{matrix} \right] \]

\[ \boldsymbol{Q} = \boldsymbol{F}\boldsymbol{Q}_{a}\boldsymbol{F^{T}} = \]

\[ = \left[ \begin{matrix} 1 & \Delta t & \frac{\Delta t^{2}}{2} \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \\ \end{matrix} \right] \left[ \begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \\ \end{matrix} \right] \left[ \begin{matrix} 1 & 0 & 0 \\ \Delta t & 1 & 0 \\ \frac{\Delta t^{2}}{2} & \Delta t & 1 \\ \end{matrix} \right] \sigma^{2}_{a} = \]

\[ = \left[ \begin{matrix} 0 & 0 & \frac{\Delta t^{2}}{2} \\ 0 & 0 & \Delta t \\ 0 & 0 & 1 \\ \end{matrix} \right] \left[ \begin{matrix} 1 & 0 & 0 \\ \Delta t & 1 & 0 \\ \frac{\Delta t^{2}}{2} & \Delta t & 1 \\ \end{matrix} \right] \sigma^{2}_{a} = \]

\[ = \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} & \frac{\Delta t^{2}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} & \Delta t \\ \frac{\Delta t^{2}}{2} & \Delta t & 1 \\ \end{matrix} \right] \sigma^{2}_{a} \]
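
The matrix product above is easy to reproduce numerically. The following sketch uses plain Python lists; the helper names and the values of \( \Delta t \) and \( \sigma^{2}_{a} \) are assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


dt = 0.5          # assumed time step
sigma_a2 = 4.0    # assumed random acceleration variance

# State transition matrix for a [position, velocity, acceleration] state
F = [[1.0, dt, dt**2 / 2],
     [0.0, 1.0, dt],
     [0.0, 0.0, 1.0]]

# Only the acceleration carries random variance
Q_a = [[0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, sigma_a2]]

Q = matmul(matmul(F, Q_a), transpose(F))  # Q = F Q_a F^T
for row in Q:
    print(row)
# [0.0625, 0.25, 0.5]
# [0.25, 1.0, 2.0]
# [0.5, 2.0, 4.0]
```

Each entry matches the closed-form result, for example \( \sigma^{2}_{a}\frac{\Delta t^{4}}{4} = 4 \cdot \frac{0.5^{4}}{4} = 0.0625 \).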

Projection using the control matrix

If the dynamic model includes a control input, we can compute the \( \boldsymbol{Q} \) matrix even faster. We can project the random variance in acceleration \( \sigma^{2}_{a} \) on our dynamic model using the control matrix.

\[ \boldsymbol{Q} = \boldsymbol{G}\sigma^{2}_{a}\boldsymbol{G^{T}} \]

where \( \boldsymbol{G} \) is the control matrix (or input transition matrix)

For the motion model, the \( \boldsymbol{G} \) matrix is given by:

\[ \boldsymbol{G} = \left[ \begin{matrix} \frac{\Delta t^{2}}{2} \\ \Delta t \\ \end{matrix} \right] \]

\[ \boldsymbol{Q} = \boldsymbol{G}\sigma^{2}_{a}\boldsymbol{G^{T}} = \sigma^{2}_{a}\boldsymbol{G}\boldsymbol{G^{T}} = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{2}}{2} \\ \Delta t \\ \end{matrix} \right] \left[ \begin{matrix} \frac{\Delta t^{2}}{2} & \Delta t \\ \end{matrix} \right] = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} \\ \end{matrix} \right] \]
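
In code, this projection is just a scaled outer product of \( \boldsymbol{G} \) with itself. The sketch below is illustrative only; the function name and the numeric values are assumptions.

```python
def q_from_control_matrix(g, sigma_a2):
    """Return Q = G * sigma_a^2 * G^T for a column vector G given as a list."""
    return [[sigma_a2 * gi * gj for gj in g] for gi in g]


dt = 0.5          # assumed time step
sigma_a2 = 4.0    # assumed random acceleration variance

G = [dt**2 / 2, dt]   # control matrix of the motion model, as a column vector

Q = q_from_control_matrix(G, sigma_a2)
print(Q)  # [[0.0625, 0.25], [0.25, 1.0]]
```

For these values the entries are \( \sigma^{2}_{a}\frac{\Delta t^{4}}{4} = 0.0625 \), \( \sigma^{2}_{a}\frac{\Delta t^{3}}{2} = 0.25 \), and \( \sigma^{2}_{a}\Delta t^{2} = 1 \).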

You can use any of the above methods to construct the discrete \( \boldsymbol{Q} \) matrix.

Continuous noise model

The continuous model assumes that the noise changes continuously over time.

(Figure: continuous noise)

In order to derive the process noise covariance matrix for the continuous model \( \boldsymbol{Q_{C}} \), we need to integrate the discrete process noise covariance matrix \( \boldsymbol{Q} \) over time.

\[ \boldsymbol{Q_{C}} = \int _{0}^{ \Delta t}\boldsymbol{Q}dt = \int _{0}^{ \Delta t} \sigma^{2}_{a} \left[ \begin{matrix} \frac{t^{4}}{4} & \frac{t^{3}}{2} \\ \frac{t^{3}}{2} & t^{2} \\ \end{matrix} \right] dt = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{5}}{20} & \frac{\Delta t^{4}}{8} \\ \frac{\Delta t^{4}}{8} & \frac{\Delta t^{3}}{3} \\ \end{matrix} \right] \]
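
The integral can be cross-checked numerically. The sketch below compares the closed-form \( \boldsymbol{Q_{C}} \) with a simple midpoint-rule integration of the discrete matrix. The function names, the number of integration steps, and the values of \( \Delta t \) and \( \sigma^{2}_{a} \) are assumptions made for the example.

```python
def discrete_q(t, sigma_a2):
    """Discrete process noise matrix evaluated at time t."""
    return [[sigma_a2 * t**4 / 4, sigma_a2 * t**3 / 2],
            [sigma_a2 * t**3 / 2, sigma_a2 * t**2]]


def continuous_q(dt, sigma_a2):
    """Closed-form result of integrating the discrete matrix from 0 to dt."""
    return [[sigma_a2 * dt**5 / 20, sigma_a2 * dt**4 / 8],
            [sigma_a2 * dt**4 / 8, sigma_a2 * dt**3 / 3]]


def integrate_q(dt, sigma_a2, steps=2000):
    """Midpoint-rule approximation of the same integral."""
    h = dt / steps
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        q = discrete_q((k + 0.5) * h, sigma_a2)   # sample at the interval midpoint
        for i in range(2):
            for j in range(2):
                total[i][j] += q[i][j] * h
    return total


dt = 1.0          # assumed time step
sigma_a2 = 1.0    # assumed random acceleration variance

closed_form = continuous_q(dt, sigma_a2)
numeric = integrate_q(dt, sigma_a2)
print(closed_form)
print(numeric)
```

The two results agree to several decimal places, which confirms the element-wise integration.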

Which model to choose?

Before answering this question, you need to select the right value for the process noise variance. You can calculate it using the stochastic statistics formulas or choose a reasonable value based on your engineering practice (which is preferable).

In the radar world, \( \sigma^{2}_{a} \) depends on the target characteristics and model completeness. For maneuvering targets, like airplanes, \( \sigma^{2}_{a} \) should be quite large. For non-maneuvering targets, like rockets, you can use a smaller \( \sigma^{2}_{a} \). Model completeness is also a factor in selecting the process noise variance. If your model includes environmental influences like air drag, then the degree of process noise randomness is smaller, and vice versa.

Once you've selected a reasonable process noise variance value, you have to choose the noise model. Should it be discrete or continuous?

There is no clear answer to this question. When \( \Delta t \) is very small, you can use the discrete noise model; when \( \Delta t \) is large, it is better to use the continuous noise model. I recommend trying both models and checking which one performs better with your Kalman Filter.
