Covariance Extrapolation Equation

I assume the reader is already familiar with the concept of covariance extrapolation (prediction). We've already met the Covariance Extrapolation Equation (or Predictor Covariance Equation) in the "One-dimensional Kalman Filter" section. In this section, we derive the Kalman Filter Covariance Extrapolation Equation in matrix notation.

The general form of the Covariance Extrapolation Equation is given by:

\[ \boldsymbol{P}_{n+1,n} = \boldsymbol{FP}_{n,n}\boldsymbol{F}^{T} + \boldsymbol{Q} \]
Where:
\( \boldsymbol{P}_{n,n} \) is the uncertainty of an estimate (covariance matrix) of the current state
\( \boldsymbol{P}_{n+1,n} \) is the uncertainty of a prediction (covariance matrix) for the next state
\( \boldsymbol{F} \) is the state transition matrix that we derived in the "Modeling linear dynamic systems" section
\( \boldsymbol{Q} \) is the process noise matrix
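As a minimal NumPy sketch, the covariance extrapolation step for a constant-velocity model might look as follows; the time step and the covariance values are illustrative assumptions, not values from the text.

```python
import numpy as np

dt = 1.0  # time step (illustrative assumption)

# State transition matrix for a constant-velocity model (position, velocity)
F = np.array([[1.0, dt],
              [0.0, 1.0]])

# Current estimate covariance P(n,n) and process noise Q (illustrative values)
P = np.diag([4.0, 1.0])
Q = np.diag([0.1, 0.1])

# Covariance extrapolation: P(n+1,n) = F P F^T + Q
P_pred = F @ P @ F.T + Q

print(P_pred)
```

Note that the predicted covariance stays symmetric, since both \( \boldsymbol{FP}_{n,n}\boldsymbol{F}^{T} \) and \( \boldsymbol{Q} \) are symmetric matrices.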

The estimate uncertainty without process noise

Let's assume that the process noise is equal to zero \( (Q=0) \), then:

\[ \boldsymbol{P}_{n+1,n} = \boldsymbol{FP}_{n,n}\boldsymbol{F}^{T} \]

The derivation is relatively straightforward. I've shown in the "Essential Background II" section that:

\[ COV(\boldsymbol{x}) = E \left( \left( \boldsymbol{x - \mu_{x}} \right) \left( \boldsymbol{x - \mu_{x}} \right)^{T} \right) \]

Where \( \boldsymbol{x} \) is the system state vector.

Therefore:

\[ \boldsymbol{P}_{n,n} = E \left( \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \right) \]

According to the state extrapolation equation:

\[ \boldsymbol{\hat{x}}_{n+1,n} = \boldsymbol{F\hat{x}}_{n,n} + \boldsymbol{G\hat{u}}_{n,n} \]

Therefore:

\[ \boldsymbol{P}_{n+1,n} = E \left( \left( \boldsymbol{\hat{x}}_{n+1,n} - \boldsymbol{\mu}_{x_{n+1,n}} \right) \left( \boldsymbol{\hat{x}}_{n+1,n} - \boldsymbol{\mu}_{x_{n+1,n}} \right)^{T} \right) = \]

\[ = E \left( \left( \boldsymbol{F\hat{x}}_{n,n} + \boldsymbol{G\hat{u}}_{n,n} - \boldsymbol{F\mu_{x}}_{n,n} - \boldsymbol{G\hat{u}}_{n,n} \right) \left( \boldsymbol{F\hat{x}}_{n,n} + \boldsymbol{G\hat{u}}_{n,n} - \boldsymbol{F\mu_{x}}_{n,n} - \boldsymbol{G\hat{u}}_{n,n} \right)^{T} \right) = \]

\[ = E \left( \boldsymbol{F} \left( \boldsymbol{\hat{x}}_{n,n} - \boldsymbol{\mu}_{x_{n,n}} \right) \left( \boldsymbol{F} \left( \boldsymbol{\hat{x}}_{n,n} - \boldsymbol{\mu}_{x_{n,n}} \right) \right)^{T} \right) = \]

Apply the matrix transpose property: \( \boldsymbol{(AB)}^T = \boldsymbol{B}^T \boldsymbol{A}^T \)

\[ = E \left(\boldsymbol{F} \left( \boldsymbol{\hat{x}}_{n,n} - \boldsymbol{\mu}_{x_{n,n}} \right) \left( \boldsymbol{\hat{x}}_{n,n} - \boldsymbol{\mu}_{x_{n,n}} \right)^{T} \boldsymbol{F}^{T} \right) = \]

\[ = \boldsymbol{F} E \left( \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \right) \boldsymbol{F}^{T} = \]

\[ = \boldsymbol{F} \boldsymbol{P}_{n,n} \boldsymbol{F}^{T} \]
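The result \( \boldsymbol{P}_{n+1,n} = \boldsymbol{FP}_{n,n}\boldsymbol{F}^{T} \) can also be checked numerically: draw many estimation-error samples with a known covariance, propagate each one through the dynamics, and compare the sample covariance with the closed-form expression. The matrices below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 1.0
F = np.array([[1.0, dt],
              [0.0, 1.0]])

# Estimation-error samples with a known covariance P(n,n) (illustrative values)
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
errors = rng.multivariate_normal([0.0, 0.0], P, size=200_000)

# Propagate each error through the dynamics and measure the sample covariance
propagated = errors @ F.T
P_pred_sampled = np.cov(propagated.T)

# It should closely match the closed-form result F P F^T
P_pred_exact = F @ P @ F.T
print(np.max(np.abs(P_pred_sampled - P_pred_exact)))
```

With 200,000 samples, the sampled covariance agrees with \( \boldsymbol{FP}\boldsymbol{F}^{T} \) to within a few hundredths.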

Constructing the process noise matrix \( Q \)

As you already know, the system dynamics is described by:

\[ \boldsymbol{\hat{x}}_{n+1,n} = \boldsymbol{F\hat{x}}_{n,n} + \boldsymbol{G\hat{u}}_{n,n} + \boldsymbol{w}_{n} \]

Where \( \boldsymbol{w}_{n} \) is the process noise at time step \( n \).

We've discussed the process noise and its influence on the Kalman Filter performance in the "One-dimensional Kalman Filter" section. In the one-dimensional Kalman Filter, the process noise variance is denoted by \( q \).

In the multidimensional case, the process noise is described by a covariance matrix denoted by \( \boldsymbol{Q} \).

We've seen that the process noise variance has a critical influence on the Kalman Filter performance. A \( q \) that is too small causes a lag error (see Example 7). A \( q \) that is too high makes the Kalman Filter follow the measurements (see Example 8) and produce noisy estimates.

The process noise can be independent between different state variables. In this case, the process noise covariance matrix \( \boldsymbol{Q} \) is a diagonal matrix:

\[ \boldsymbol{Q} = \left[ \begin{matrix} q_{11} & 0 & \cdots & 0 \\ 0 & q_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & q_{kk} \\ \end{matrix} \right] \]

The process noise can also be dependent. For example, the constant velocity model assumes zero acceleration (\(a=0\)). However, a random variance in acceleration \( \sigma^{2}_{a} \) causes variances in both velocity and position. In this case, the process noise terms of the different state variables are correlated.

There are two models for the environmental process noise.

  • Discrete noise model
  • Continuous noise model

Discrete noise model

The discrete noise model assumes that the noise is different at each time period but constant within each period.

(Figure: Discrete Noise)

For the constant velocity model, the process noise covariance matrix looks like the following:

\[ \boldsymbol{Q} = \left[ \begin{matrix} V(x) & COV(x,v) \\ COV(v,x) & V(v) \\ \end{matrix} \right] \]

We express the position and velocity variances and the covariance in terms of the model's random acceleration variance \( \sigma^{2}_{a} \).

We derived the matrix elements using the expectation arithmetic rules in the "Essential Background II" section.

\[ V(v) = \sigma^{2}_{v} = E\left(v^{2}\right) - \mu_{v}^{2} = E \left( \left( a\Delta t\right)^{2}\right) - \left(\mu_{a}\Delta t\right)^{2} = \Delta t^{2}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \Delta t^{2}\sigma^{2}_{a} \]

\[ V(x) = \sigma^{2}_{x} = E\left(x^{2}\right) - \mu_{x}^{2} = E \left( \left( \frac{1}{2}a\Delta t^{2}\right)^{2}\right) - \left(\frac{1}{2}\mu_{a}\Delta t^{2}\right)^{2} = \frac{\Delta t^{4}}{4}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \frac{\Delta t^{4}}{4}\sigma^{2}_{a} \]

\[ COV(x,v) = COV(v,x) = E\left(xv\right) - \mu_{x}\mu_{v} = E\left( \frac{1}{2}a\Delta t^{2}a\Delta t\right) - \left( \frac{1}{2}\mu_{a}\Delta t^{2}\mu_{a}\Delta t\right) = \frac{\Delta t^{3}}{2}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \frac{\Delta t^{3}}{2}\sigma^{2}_{a} \]

Now we can substitute the results into \( \boldsymbol{Q} \) matrix:

\[ \boldsymbol{Q} = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} \\ \end{matrix} \right] \]
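A small helper function makes this construction reusable; this is a sketch, with the function name and the example values chosen for illustration.

```python
import numpy as np

def discrete_Q(dt, sigma_a2):
    """Discrete process noise matrix for the constant-velocity model,
    driven by a random acceleration with variance sigma_a2."""
    return sigma_a2 * np.array([[dt**4 / 4, dt**3 / 2],
                                [dt**3 / 2, dt**2]])

# Example: dt = 0.5 s, acceleration variance = 2 (illustrative values)
Q = discrete_Q(dt=0.5, sigma_a2=2.0)
print(Q)
```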

There are two methods for faster construction of the \( \boldsymbol{Q} \) matrix.

Projection using the state transition matrix

If the dynamic model doesn't include a control input, we can project the random variance in acceleration \( \sigma^{2}_{a} \) onto our dynamic model using the state transition matrix.

Let us define a matrix \( \boldsymbol{Q}_{a} \):

\[ \boldsymbol{Q}_{a} = \left[ \begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \\ \end{matrix} \right] \sigma^{2}_{a} \]

The process noise matrix is:

\[ \boldsymbol{Q} = \boldsymbol{F}\boldsymbol{Q}_{a}\boldsymbol{F}^{T} \]

For the motion model, the \( \boldsymbol{F} \) matrix is given by:

\[ \boldsymbol{F} = \left[ \begin{matrix} 1 & \Delta t & \frac{\Delta t^{2}}{2} \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \\ \end{matrix} \right] \]

\[ \boldsymbol{Q} = \boldsymbol{F}\boldsymbol{Q}_{a}\boldsymbol{F}^{T} = \]

\[ = \left[ \begin{matrix} 1 & \Delta t & \frac{\Delta t^{2}}{2} \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \\ \end{matrix} \right] \left[ \begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \\ \end{matrix} \right] \left[ \begin{matrix} 1 & 0 & 0 \\ \Delta t & 1 & 0 \\ \frac{\Delta t^{2}}{2} & \Delta t & 1 \\ \end{matrix} \right] \sigma^{2}_{a} = \]

\[ = \left[ \begin{matrix} 0 & 0 & \frac{\Delta t^{2}}{2} \\ 0 & 0 & \Delta t \\ 0 & 0 & 1 \\ \end{matrix} \right] \left[ \begin{matrix} 1 & 0 & 0 \\ \Delta t & 1 & 0 \\ \frac{\Delta t^{2}}{2} & \Delta t & 1 \\ \end{matrix} \right] \sigma^{2}_{a} = \]

\[ = \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} & \frac{\Delta t^{2}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} & \Delta t \\ \frac{\Delta t^{2}}{2} & \Delta t & 1 \\ \end{matrix} \right] \sigma^{2}_{a} \]
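The matrix multiplication above is easy to reproduce numerically. The sketch below, with \( \Delta t = 1 \) and \( \sigma^{2}_{a} = 1 \) as illustrative values, confirms that \( \boldsymbol{F}\boldsymbol{Q}_{a}\boldsymbol{F}^{T} \) matches the closed-form result.

```python
import numpy as np

dt, sigma_a2 = 1.0, 1.0  # illustrative values

# State transition matrix for the (position, velocity, acceleration) model
F = np.array([[1.0, dt, dt**2 / 2],
              [0.0, 1.0, dt],
              [0.0, 0.0, 1.0]])

# The random variance enters only through the acceleration state
Q_a = sigma_a2 * np.diag([0.0, 0.0, 1.0])

# Project the acceleration variance onto the dynamic model
Q = F @ Q_a @ F.T

# Closed-form result from the derivation above
Q_expected = sigma_a2 * np.array([[dt**4 / 4, dt**3 / 2, dt**2 / 2],
                                  [dt**3 / 2, dt**2,     dt],
                                  [dt**2 / 2, dt,        1.0]])
print(np.allclose(Q, Q_expected))
```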

Projection using the control matrix

If the dynamic model includes a control input, we can compute the \( \boldsymbol{Q} \) matrix even faster. We can project the random variance in acceleration \( \sigma^{2}_{a} \) onto our dynamic model using the control matrix.

\[ \boldsymbol{Q} = \boldsymbol{G}\sigma^{2}_{a}\boldsymbol{G^{T}} \]

where \( \boldsymbol{G} \) is the control matrix (or input transition matrix).

For the motion model, the \( \boldsymbol{G} \) matrix is given by:

\[ \boldsymbol{G} = \left[ \begin{matrix} \frac{\Delta t^{2}}{2} \\ \Delta t \\ \end{matrix} \right] \]

\[ \boldsymbol{Q} = \boldsymbol{G}\sigma^{2}_{a}\boldsymbol{G^{T}} = \sigma^{2}_{a}\boldsymbol{G}\boldsymbol{G^{T}} = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{2}}{2} \\ \Delta t \\ \end{matrix} \right] \left[ \begin{matrix} \frac{\Delta t^{2}}{2} & \Delta t \\ \end{matrix} \right] = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} \\ \end{matrix} \right] \]
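This projection is a one-liner in NumPy. The sketch below uses \( \Delta t = 1 \) and \( \sigma^{2}_{a} = 1 \) as illustrative values and checks the result against the element-by-element derivation.

```python
import numpy as np

dt, sigma_a2 = 1.0, 1.0  # illustrative values

# Control (input) matrix of the constant-velocity model
G = np.array([[dt**2 / 2],
              [dt]])

# Project the acceleration variance through the control matrix
Q = sigma_a2 * (G @ G.T)

# Same result as the element-by-element derivation
Q_expected = sigma_a2 * np.array([[dt**4 / 4, dt**3 / 2],
                                  [dt**3 / 2, dt**2]])
print(np.allclose(Q, Q_expected))
```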

You can use the above methods to construct the discrete \( \boldsymbol{Q} \) matrix.

Continuous noise model

The continuous model assumes that the noise changes continuously over time.

(Figure: Continuous Noise)

To derive the process noise covariance matrix for the continuous model \( \boldsymbol{Q}_{c} \), we integrate the discrete process noise covariance matrix over the time interval \( \Delta t \).

\[ \boldsymbol{Q}_{c} = \int _{0}^{ \Delta t}\boldsymbol{Q}dt = \int _{0}^{ \Delta t} \sigma^{2}_{a} \left[ \begin{matrix} \frac{t^{4}}{4} & \frac{t^{3}}{2} \\ \frac{t^{3}}{2} & t^{2} \\ \end{matrix} \right] dt = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{5}}{20} & \frac{\Delta t^{4}}{8} \\ \frac{\Delta t^{4}}{8} & \frac{\Delta t^{3}}{3} \\ \end{matrix} \right] \]
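The integral can be verified numerically with a simple trapezoidal rule; \( \Delta t = 0.5 \) and \( \sigma^{2}_{a} = 1 \) are illustrative values, and the `integrate` helper is a hypothetical utility written for this check.

```python
import numpy as np

dt, sigma_a2 = 0.5, 1.0  # illustrative values

# Sample points over [0, dt]
t = np.linspace(0.0, dt, 100_001)

def integrate(values, t):
    # Trapezoidal rule (hypothetical helper for this check)
    return float(np.sum((values[1:] + values[:-1]) * np.diff(t) / 2))

# Integrate each entry of Q(t) numerically
Qc_numeric = sigma_a2 * np.array(
    [[integrate(t**4 / 4, t), integrate(t**3 / 2, t)],
     [integrate(t**3 / 2, t), integrate(t**2, t)]])

# Closed-form continuous process noise matrix
Qc_exact = sigma_a2 * np.array([[dt**5 / 20, dt**4 / 8],
                                [dt**4 / 8, dt**3 / 3]])
print(np.allclose(Qc_numeric, Qc_exact, atol=1e-9))
```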

Which model to choose?

Before answering this question, you need to select the right value for the process noise variance. You can calculate it using stochastic-process statistics, or you can choose a reasonable value based on engineering practice (which is usually preferable).

In the radar world, the \( \sigma^{2}_{a} \) depends on the target characteristics and model completeness. For maneuvering targets, like airplanes, the \( \sigma^{2}_{a} \) should be relatively high. For non-maneuvering targets, like rockets, you can use smaller \( \sigma^{2}_{a} \). The model completeness is also a factor in selecting the process noise variance. If your model includes environmental influences like air drag, then the degree of the process noise randomness is smaller and vice versa.

Once you've selected a reasonable process noise variance value, you should choose the noise model. Should it be discrete or continuous?

There is no clear answer to this question. I recommend trying both models and checking which one performs better with your Kalman Filter. When \( \Delta t \) is very small, you can use the discrete noise model; when \( \Delta t \) is large, it is better to use the continuous noise model.
