We've already met the Covariance Extrapolation Equation (or Predictor Covariance Equation) in the "One-dimensional Kalman Filter" section. I assume that you are already familiar with the concept of covariance extrapolation (prediction). In this section, we will derive the Kalman Filter Covariance Extrapolation Equation in matrix notation.

The general form of the Covariance Extrapolation Equation is given by:

\[ \boldsymbol{P_{n+1,n} = FP_{n,n}F^{T} + Q} \]

Where:

\( \boldsymbol{P_{n,n}} \) | is an estimate uncertainty (covariance) matrix of the current state |

\( \boldsymbol{P_{n+1,n}} \) | is a predicted estimate uncertainty (covariance) matrix for the next state |

\( \boldsymbol{F} \) | is a state transition matrix that we derived in the "Modeling linear dynamic systems" section |

\( \boldsymbol{B} \) | is an input matrix |

\( \boldsymbol{Q} \) | is a process noise matrix |

Let's assume that the process noise equals zero \( (\boldsymbol{Q}=0) \), then:

\[ \boldsymbol{P_{n+1,n} = FP_{n,n}F^{T}} \]

The derivation is quite straightforward. I've shown in the "Background break" chapter that:

\[ COV(\boldsymbol{x}) = E \left( \left( \boldsymbol{x - \mu_{x}} \right) \left( \boldsymbol{x - \mu_{x}} \right)^{T} \right) \]

where vector \( \boldsymbol{x} \) is a system state vector.

Therefore:

\[ \boldsymbol{P_{n,n}} = E \left( \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \right) \]

According to the state extrapolation equation:

\[ \boldsymbol{\hat{x}_{n+1,n}=F\hat{x}_{n,n}+G\hat{u}_{n,n}} \]

Therefore:

\[ \boldsymbol{P_{n+1,n}} = E \left( \left( \boldsymbol{\hat{x}_{n+1,n} - \mu_{x_{n+1,n}}} \right) \left( \boldsymbol{\hat{x}_{n+1,n} - \mu_{x_{n+1,n}}} \right)^{T} \right) = \]

\[ = E \left( \left( \boldsymbol{F\hat{x}_{n,n} + G\hat{u}_{n,n} - F\mu_{x_{n,n}} - G\hat{u}_{n,n}} \right) \left( \boldsymbol{F\hat{x}_{n,n} + G\hat{u}_{n,n} - F\mu_{x_{n,n}} - G\hat{u}_{n,n}} \right)^{T} \right) = \]

\[ = E \left( \boldsymbol{F} \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{F} \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \right)^{T} \right) = \]

Apply matrix transpose property: \( \boldsymbol{(AB)^T = B^T A^T} \)

\[ = E \left(\boldsymbol{F} \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \boldsymbol{F^{T}} \right) = \]

\[ = \boldsymbol{F} E \left( \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right) \left( \boldsymbol{\hat{x}_{n,n} - \mu_{x_{n,n}}} \right)^{T} \right) \boldsymbol{F^{T}} = \]

\[ = \boldsymbol{F P_{n,n} F^{T}} \]
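The result above can be checked numerically. The following is a minimal sketch, not a full filter implementation: it assumes a constant velocity model with state [position, velocity], and the values of `dt` and `P` are illustrative assumptions. The helper names (`matmul`, `transpose`, `extrapolate_covariance`) are hypothetical.

```python
# Minimal sketch: propagate a covariance matrix through F with zero process noise.
# F is the constant velocity transition matrix for the state [position, velocity];
# dt and P are illustrative values, not taken from the text.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def extrapolate_covariance(F, P):
    """Return F P F^T, the predicted covariance when Q = 0."""
    return matmul(matmul(F, P), transpose(F))

dt = 1.0
F = [[1.0, dt],
     [0.0, 1.0]]
P = [[4.0, 0.0],   # position variance 4, no initial correlation
     [0.0, 1.0]]   # velocity variance 1

P_pred = extrapolate_covariance(F, P)
print(P_pred)  # [[5.0, 1.0], [1.0, 1.0]]
```

Note that the prediction grows the position variance from 4 to 5 and introduces a correlation between position and velocity, even though the initial estimate had none.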

As you already know, the system dynamics are described by:

\[ \boldsymbol{\hat{x}_{n+1,n}=F\hat{x}_{n,n}+G\hat{u}_{n,n}+w_{n}} \]

Where \( w_{n} \) is the process noise at the time step \( n \).

We've discussed the process noise and its influence on the Kalman Filter performance in the "One-dimensional Kalman Filter" section. In the one-dimensional Kalman Filter, the process noise variance is denoted by \( q \).

In the multidimensional Kalman Filter, the process noise is a covariance matrix denoted by \( \boldsymbol{Q} \).

We've seen that the process noise variance has a critical influence on the Kalman Filter performance. A \( q \) value that is too small causes a lag error (see Example 7). If the \( q \) value is too large, the Kalman Filter will follow the measurements (see Example 8) and produce noisy estimates.

The process noise can be independent between different state variables. In this case, the process noise covariance matrix \( \boldsymbol{Q} \) is a diagonal matrix:

\[ \boldsymbol{Q} = \left[ \begin{matrix} q_{11} & 0 & \cdots & 0 \\ 0 & q_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & q_{kk} \\ \end{matrix} \right] \]
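As a small illustration of the independent case, here is a sketch that builds a diagonal \( \boldsymbol{Q} \) from per-variable noise variances and adds it to an already extrapolated covariance \( \boldsymbol{FP_{n,n}F^{T}} \). All numeric values and the helper names (`diagonal_q`, `add_matrices`) are assumptions made for this example.

```python
# Minimal sketch: independent process noise gives a diagonal Q,
# which is added to F P F^T to obtain the predicted covariance.

def diagonal_q(variances):
    """Build a diagonal process noise matrix from per-variable variances."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

# Illustrative value of F P F^T for a two-variable state (assumed, not derived here).
FPFt = [[5.0, 1.0],
        [1.0, 1.0]]

Q = diagonal_q([0.5, 0.25])      # q11 = 0.5, q22 = 0.25
P_pred = add_matrices(FPFt, Q)   # P_{n+1,n} = F P F^T + Q

print(Q)       # [[0.5, 0.0], [0.0, 0.25]]
print(P_pred)  # [[5.5, 1.0], [1.0, 1.25]]
```

A diagonal \( \boldsymbol{Q} \) only inflates the variances on the main diagonal; the off-diagonal covariances are left unchanged.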

The process noise can also be dependent. Consider the constant velocity model as an example. The model assumes zero acceleration (\(a=0\)). However, a random variance in acceleration \( \sigma^{2}_{a} \) will cause a variance in velocity and position. In this case, the process noise is correlated between the state variables.

There are two models of the environment process noise.

- Discrete noise model
- Continuous noise model

I will describe both models.

The discrete noise model assumes that the noise is different at each time period, but it is constant between time periods.

For the constant velocity model, the process noise covariance matrix looks like:

\[ \boldsymbol{Q} = \left[ \begin{matrix} V(x) & COV(x,v) \\ COV(v,x) & V(v) \\ \end{matrix} \right] \]

I will derive the matrix elements using the expectation algebra rules (you can find them in the "Background" section).

\[ V(v) = \sigma^{2}_{v} = E\left(v^{2}\right) - \mu_{v}^{2} = E \left( \left( a\Delta t\right)^{2}\right) - \left(\mu_{a}\Delta t\right)^{2} = \Delta t^{2}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \Delta t^{2}\sigma^{2}_{a} \]

\[ V(x) = \sigma^{2}_{x} = E\left(x^{2}\right) - \mu_{x}^{2} = E \left( \left( \frac{1}{2}a\Delta t^{2}\right)^{2}\right) - \left(\frac{1}{2}\mu_{a}\Delta t^{2}\right)^{2} = \frac{\Delta t^{4}}{4}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \frac{\Delta t^{4}}{4}\sigma^{2}_{a} \]

\[ COV(x,v) = COV(v,x) = E\left(xv\right) - \mu_{x}\mu_{v} = E\left( \frac{1}{2}a\Delta t^{2}a\Delta t\right) - \left( \frac{1}{2}\mu_{a}\Delta t^{2}\mu_{a}\Delta t\right) = \frac{\Delta t^{3}}{2}\left( E\left(a^{2}\right) - \mu_{a}^{2} \right) = \frac{\Delta t^{3}}{2}\sigma^{2}_{a} \]

Now we can substitute the results into the \( \boldsymbol{Q} \) matrix:

\[ \boldsymbol{Q} = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} \\ \end{matrix} \right] \]
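The derived matrix can be sanity-checked with a simple Monte Carlo experiment. The sketch below assumes the acceleration is zero-mean Gaussian (the derivation itself does not require a specific distribution) and uses illustrative values for \( \Delta t \) and \( \sigma_{a} \). The function name `sample_process_noise_covariance` is hypothetical.

```python
import random

def sample_process_noise_covariance(dt, sigma_a, n_samples=50000, seed=0):
    """Estimate the covariance of (x, v) caused by a random acceleration a,
    where x = 0.5 * a * dt**2 and v = a * dt."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs, vs = [], []
    for _ in range(n_samples):
        a = rng.gauss(0.0, sigma_a)  # assumption: zero-mean Gaussian acceleration
        xs.append(0.5 * a * dt ** 2)
        vs.append(a * dt)
    mean_x = sum(xs) / n_samples
    mean_v = sum(vs) / n_samples

    def cov(p, q, mean_p, mean_q):
        # COV(p, q) = E(pq) - mu_p * mu_q
        return sum(pi * qi for pi, qi in zip(p, q)) / n_samples - mean_p * mean_q

    return [[cov(xs, xs, mean_x, mean_x), cov(xs, vs, mean_x, mean_v)],
            [cov(vs, xs, mean_v, mean_x), cov(vs, vs, mean_v, mean_v)]]

dt = 0.5
sigma_a = 2.0

# Theoretical Q from the derivation above.
q_theory = [[sigma_a ** 2 * dt ** 4 / 4, sigma_a ** 2 * dt ** 3 / 2],
            [sigma_a ** 2 * dt ** 3 / 2, sigma_a ** 2 * dt ** 2]]
q_estimated = sample_process_noise_covariance(dt, sigma_a)

print(q_theory)     # [[0.0625, 0.25], [0.25, 1.0]]
print(q_estimated)  # close to q_theory, up to sampling error
```

The estimated matrix should approach the theoretical one as the number of samples grows.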

I will also show another method to construct the \( \boldsymbol{Q} \) matrix:

\[ \boldsymbol{Q} = \boldsymbol{G}\sigma^{2}_{a}\boldsymbol{G^{T}} \]

where \( \boldsymbol{G} \) is the control matrix (or input transition matrix).

For the motion model, the \( \boldsymbol{G} \) matrix is given by:

\[ \boldsymbol{G} = \left[ \begin{matrix} \frac{\Delta t^{2}}{2} \\ \Delta t \\ \end{matrix} \right] \]

\[ \boldsymbol{Q} = \boldsymbol{G}\sigma^{2}_{a}\boldsymbol{G^{T}} = \sigma^{2}_{a}\boldsymbol{G}\boldsymbol{G^{T}} = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{2}}{2} \\ \Delta t \\ \end{matrix} \right] \left[ \begin{matrix} \frac{\Delta t^{2}}{2} & \Delta t \\ \end{matrix} \right] = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{4}}{4} & \frac{\Delta t^{3}}{2} \\ \frac{\Delta t^{3}}{2} & \Delta t^{2} \\ \end{matrix} \right] \]

You can use any of the above methods to construct the discrete \( \boldsymbol{Q} \) matrix.
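Both constructions are easy to compare in code. This is a minimal sketch for the two-state motion model only; the function names and the numeric values of `dt` and `sigma_a2` are assumptions made for illustration.

```python
import math

def discrete_q_from_g(dt, sigma_a2):
    """Build Q = G * sigma_a^2 * G^T for G = [dt^2 / 2, dt]^T."""
    G = [dt ** 2 / 2, dt]
    # Outer product of G with itself, scaled by the acceleration variance.
    return [[sigma_a2 * gi * gj for gj in G] for gi in G]

def discrete_q_closed_form(dt, sigma_a2):
    """Build Q directly from the derived matrix elements."""
    return [[sigma_a2 * dt ** 4 / 4, sigma_a2 * dt ** 3 / 2],
            [sigma_a2 * dt ** 3 / 2, sigma_a2 * dt ** 2]]

dt = 0.5
sigma_a2 = 4.0  # acceleration variance (illustrative)

q_outer = discrete_q_from_g(dt, sigma_a2)
q_closed = discrete_q_closed_form(dt, sigma_a2)

print(q_outer)   # [[0.0625, 0.25], [0.25, 1.0]]
print(q_closed)  # [[0.0625, 0.25], [0.25, 1.0]]

# Both methods agree element by element.
same = all(math.isclose(q_outer[i][j], q_closed[i][j])
           for i in range(2) for j in range(2))
print(same)  # True
```

The outer-product form generalizes more easily: for a different motion model, only \( \boldsymbol{G} \) needs to change.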

The continuous model assumes that the noise changes continuously over time.

To derive the process noise covariance matrix for the continuous model \( \boldsymbol{Q_{C}} \), we need to integrate the discrete process noise covariance matrix \( \boldsymbol{Q} \) over time.

\[ \boldsymbol{Q_{C}} = \int _{0}^{ \Delta t}\boldsymbol{Q}dt = \int _{0}^{ \Delta t} \sigma^{2}_{a} \left[ \begin{matrix} \frac{t^{4}}{4} & \frac{t^{3}}{2} \\ \frac{t^{3}}{2} & t^{2} \\ \end{matrix} \right] dt = \sigma^{2}_{a} \left[ \begin{matrix} \frac{\Delta t^{5}}{20} & \frac{\Delta t^{4}}{8} \\ \frac{\Delta t^{4}}{8} & \frac{\Delta t^{3}}{3} \\ \end{matrix} \right] \]
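The integral can be verified numerically. The sketch below integrates each element of the discrete matrix with a plain midpoint rule and compares the result with the closed form above. The function names, the step count, and the values of `dt` and `sigma_a2` are assumptions chosen for illustration.

```python
import math

def discrete_q(t, sigma_a2):
    """Discrete-model Q evaluated at time t (the integrand)."""
    return [[sigma_a2 * t ** 4 / 4, sigma_a2 * t ** 3 / 2],
            [sigma_a2 * t ** 3 / 2, sigma_a2 * t ** 2]]

def continuous_q_closed_form(dt, sigma_a2):
    """Closed-form Q_C obtained by integrating the discrete Q from 0 to dt."""
    return [[sigma_a2 * dt ** 5 / 20, sigma_a2 * dt ** 4 / 8],
            [sigma_a2 * dt ** 4 / 8, sigma_a2 * dt ** 3 / 3]]

def continuous_q_numeric(dt, sigma_a2, steps=2000):
    """Approximate the same integral element-wise with the midpoint rule."""
    h = dt / steps
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        q = discrete_q((k + 0.5) * h, sigma_a2)  # integrand at the interval midpoint
        for i in range(2):
            for j in range(2):
                total[i][j] += q[i][j] * h
    return total

dt = 2.0
sigma_a2 = 0.25

q_c = continuous_q_closed_form(dt, sigma_a2)
q_c_numeric = continuous_q_numeric(dt, sigma_a2)

# The two results should match up to the integration error.
matches = all(math.isclose(q_c[i][j], q_c_numeric[i][j], rel_tol=1e-5)
              for i in range(2) for j in range(2))
print(q_c)
print(matches)
```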

Before choosing between the discrete and the continuous noise model, you need to select the right value for the process noise variance \( \sigma^{2}_{a} \). You can calculate it using stochastic statistics formulas or, preferably, choose a reasonable value based on your engineering practice.

In the radar world, \( \sigma^{2}_{a} \) depends on the target characteristics and the model completeness. For maneuvering targets, like airplanes, \( \sigma^{2}_{a} \) should be quite large. For non-maneuvering targets, like rockets, you can use a smaller \( \sigma^{2}_{a} \). The model completeness is also a factor in selecting the process noise variance. If your model includes environmental influences like air drag, then the degree of the process noise randomness is smaller, and vice versa.

Once you've selected a reasonable process noise variance value, you have to choose the noise model. Should it be the discrete model or the continuous one?

There is no clear answer to this question. When \( \Delta t \) is very small, you can use the discrete noise model; when \( \Delta t \) is large, it is better to use the continuous noise model. I recommend trying both models and checking which one performs better with your Kalman Filter.
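To make it easy to try both models, the noise model can be treated as a configuration switch. The following is one possible sketch for the constant velocity model. The function name `process_noise` and the `model` parameter are hypothetical, and the numeric values are illustrative.

```python
def process_noise(dt, sigma_a2, model="discrete"):
    """Return the 2x2 process noise matrix for the constant velocity model.

    model: "discrete" or "continuous" (hypothetical switch for experiments).
    """
    if model == "discrete":
        return [[sigma_a2 * dt ** 4 / 4, sigma_a2 * dt ** 3 / 2],
                [sigma_a2 * dt ** 3 / 2, sigma_a2 * dt ** 2]]
    if model == "continuous":
        return [[sigma_a2 * dt ** 5 / 20, sigma_a2 * dt ** 4 / 8],
                [sigma_a2 * dt ** 4 / 8, sigma_a2 * dt ** 3 / 3]]
    raise ValueError("model must be 'discrete' or 'continuous'")

# Compare the two matrices for the same dt and acceleration variance.
dt = 2.0
sigma_a2 = 1.0
q_discrete = process_noise(dt, sigma_a2, model="discrete")
q_continuous = process_noise(dt, sigma_a2, model="continuous")

print(q_discrete)    # [[4.0, 4.0], [4.0, 4.0]]
print(q_continuous)
```

Running the filter once with each matrix on the same recorded measurements is a simple way to see which model gives the better estimates for your system.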