A Guide to Kalman Filter

In this article, we’ll make a few notes on how Kalman filter works, and will show how to overcome most common problems that might appear while using it.

What is Kalman filter?

The Kalman filter is an algorithm that estimates the state of a dynamic system from a series of noisy measurements. Developed by Rudolf Kalman in the 1960s, it has become a core component of control systems, robotics, navigation, and finance for tracking and prediction. The Kalman filter is particularly useful because it provides an efficient computational means to make real-time predictions and corrections, minimizing the influence of noise.

Key Concepts in the Kalman Filter

  1. State Variables: Represent the underlying true state of the system, such as position and velocity in tracking problems. These variables evolve over time and are not directly measurable.
  2. Prediction and Update: The filter operates in a loop of two main steps:
  • Prediction Step: Predicts the next state based on the current state and a known model of system dynamics.
  • Update Step: Adjusts the prediction based on a new measurement, correcting for observed discrepancies.
  1. Noise: Assumes noise in both the process (system dynamics) and measurements. The Kalman filter uses statistical assumptions to model these uncertainties.
  2. Gaussian Assumptions: It assumes that all noise is Gaussian (i.e., follows a normal distribution), which allows it to rely on Gaussian statistics to make optimal estimations.

In order to use the Kalman filter to estimate the internal state of a process given only a sequence of noisy observations, one must model the process in accordance with the following framework. This means specifying the matrices, for each time-step k, following:

  • FkF_k, the state-transition model;
  • HkH_k, the observation model;
  • QkQ_k, the Variance and Covariance|covariance of the process noise;
  • RkR_k, the Variance and Covariance|covariance of the observation noise;
  • sometimes BkB_k, the control input model, only if it exists;
  • uku_k, the control vector, represents the controlling input into control-input sequence.

Predict step:

We are trying to predict how our state changes both for the model and for the

  • Predicted (a priori) state estimate: x^kk1=Fkx^k1k1+Bkuk\hat x_{k|k-1} = F_k \hat x_{k-1|k-1} + B_k u_k, or what measurement we predict.
  • Predicted (a priori) estimate covariance: Pkk1=FkPk1k1Fkt+QkP_{k|k-1}=F_k P_{k-1|k-1} F_k^t + Q_k, or the uncertainty of the predicted measurement. Works due to the covariance transformation rule.

Update step:

  • Innovation or measurement pre-fit residual: y~=zkHkx^kk1\widetilde y = z_k - H_k \hat x_{k|k-1}, where zkz_k is measurement
  • Innovation (or pre-fit residual) covariance: Sk=HkPkk1HkT+RkS_k = H_k P_{k|k-1} H_k^T + R_k
  • Optimal Kalman gain: Kk=Pkk1HkTSk1K_k = P_{k|k-1} H_k^T S_k^{-1}
  • Updated (a posteriori) state estimate: x^kk=x^kk1+Kky~k\hat x_{k|k} = \hat x_{k|k-1} + K_k \widetilde y_k
  • Updated (a posteriori) estimate covariance: Pkk=(IKkHk)Pkk1P_{k|k} = (I - K_k H_k) P_{k|k-1}
  • Measurement post-fit residual: y~=zkHkx^kk\widetilde y = z_k - H_k \hat x_{k|k}

Basic Kalman Filter implementation

Several lines of output:

prefit residual: -10.19742879 postfit residual: -7.31806471

prefit residual: -12.33359224 postfit residual: -8.85095993

prefit residual: -8.91856241 postfit residual: -6.40016151

prefit residual: -10.93745076 postfit residual: -7.84887742

Kalman filter without a model

We still can apply Kalman filter for measurements denoising when we do not have a process model, but only have measurements themselves.

Kalman filter with a complex transition model

In cases where the prediction model is complex or nonlinear, and it’s difficult or impossible to represent it with a simple matrix, you can use the predicted value from the model directly rather than relying on matrix calculations in the Kalman filter. This approach is common, especially in nonlinear Kalman filtering techniques like the Extended Kalman Filter (EKF) or the Unscented Kalman Filter (UKF).

Extended Kalman Filter (EKF)

This one can be used if you can calculate Jacobian of the transition function. In the standard (linear) Kalman filter, the state prediction step is:

xnn1=Fxn1 \mathbf{x}_{n|n-1} = F \mathbf{x}_{n-1}

where FF is the state transition matrix. However, if you have a nonlinear or complicated model, you can replace this matrix-based prediction with a custom prediction function ff that captures the complex behavior of your model.

  1. Define the Model Function: Let f(xn1)f(\mathbf{x}_{n-1}) represent your complex model that predicts the next state directly. Instead of computing xnn1\mathbf{x}_{n|n-1} using Fxn1F \mathbf{x}_{n-1}, compute it as:

xnn1=f(xn1) \mathbf{x}_{n|n-1} = f(\mathbf{x}_{n-1})

where ff could be any function (e.g., a nonlinear equation, simulation output, or a black-box prediction model).

  1. You’ll still need a linear approximation of ff to update the covariance matrix PP. This is done by computing the Jacobian matrix FnF_n of ff around the current state estimate:

Fn=fxxn1 F_n = \frac{\partial f}{\partial \mathbf{x}} \Big|_{\mathbf{x}_{n-1}}

This Jacobian matrix FnF_n replaces the original state transition matrix FF and is used to update the covariance matrix as:

Pnn1=FnPn1Fn+Q P_{n|n-1} = F_n P_{n-1} F_n^\top + Q

where QQ is the process noise covariance.

If calculating the Jacobian is difficult or impossible, you can use the Unscented Kalman Filter (UKF).

Unscented Kalman Filter (UKF)

The UKF doesn’t require an explicit Jacobian because it approximates the distribution of the state by propagating a set of “sigma points” through the nonlinear function ff. This makes it well-suited for complex models without linearization:

xnn1=f(sigma points of xn1) \mathbf{x}_{n|n-1} = f(\text{sigma points of } \mathbf{x}_{n-1})

Proceed with the Standard Update Step: After obtaining xnn1\mathbf{x}_{n|n-1} (either from ff in EKF/UKF or through matrix multiplication in the linear Kalman filter), you can proceed with the usual Kalman filter update step, where measurements are incorporated to correct the predicted state.

The Unscented Kalman Filter (UKF) propagates sigma points through the nonlinear model to approximate the state distribution after the prediction step. Instead of linearizing the model (as in the Extended Kalman Filter), the UKF uses a set of carefully chosen sigma points that capture the mean and covariance of the state. These sigma points are propagated through the nonlinear model, and the resulting transformed points are used to approximate the predicted mean and covariance of the state distribution.

Here’s a step-by-step explanation of how sigma points propagation works in the UKF.

Step 1: Generate Sigma Points

For a state vector xn1\mathbf{x}_{n-1} of dimension LL with mean xn1\mathbf{x}_{n-1} and covariance Pn1P_{n-1}, we create 2L+12L + 1 sigma points. These sigma points are chosen to capture the spread of the distribution around the mean.

  1. Calculate the square root of the scaled covariance matrix:

(L+λ)Pn1 \sqrt{(L + \lambda) P_{n-1}}

where λ\lambda is a scaling parameter, often chosen as λ=α2(L+κ)L\lambda = \alpha^2 (L + \kappa) - L, with tuning parameters α\alpha (often set to a small positive value, e.g., 10310^{-3}) and κ\kappa (often set to 0 or 3-L).

  1. Generate the sigma points:

X0=xn1 \mathbf{X}_{0} = \mathbf{x}_{n-1}

Xi=xn1+[(L+λ)Pn1]i,i=1,,L \mathbf{X}_{i} = \mathbf{x}_{n-1} + \left[ \sqrt{(L + \lambda) P_{n-1}} \right]_i, \quad i = 1, \dots, L

Xi+L=xn1[(L+λ)Pn1]i,i=1,,L \mathbf{X}_{i+L} = \mathbf{x}_{n-1} - \left[ \sqrt{(L + \lambda) P_{n-1}} \right]_i, \quad i = 1, \dots, L

where [(L+λ)Pn1]i\left[ \sqrt{(L + \lambda) P_{n-1}} \right]_i is the ii-th column of the matrix square root of (L+λ)Pn1(L + \lambda) P_{n-1}.

This results in a set of 2L+12L + 1 sigma points:

X0,X1,,X2L \mathbf{X}_{0}, \mathbf{X}_{1}, \dots, \mathbf{X}_{2L}

These points are distributed around the mean xn1\mathbf{x}_{n-1} and represent the original state distribution.

Step 2: Propagate Sigma Points through the Nonlinear Function

Each sigma point is then propagated through the nonlinear state transition function ff:

Xipred=f(Xi),i=0,1,,2L \mathbf{X}_{i}^{\text{pred}} = f(\mathbf{X}_i), \quad i = 0, 1, \dots, 2L

This gives a set of transformed sigma points Xipred\mathbf{X}_{i}^{\text{pred}} that represent the distribution of the predicted state after applying the nonlinear function.

Step 3: Compute the Predicted Mean and Covariance

Once we have the transformed sigma points, we calculate the predicted mean xnn1\mathbf{x}_{n|n-1} and predicted covariance Pnn1P_{n|n-1} of the state.

  1. Compute the predicted mean:

xnn1=i=02LWi(m)Xipred \mathbf{x}_{n|n-1} = \sum_{i=0}^{2L} W_{i}^{(m)} \mathbf{X}_{i}^{\text{pred}}

where Wi(m)W_{i}^{(m)} are the weights for the mean, which depend on the scaling parameter λ\lambda. Typically, W0(m)=λL+λW_0^{(m)} = \frac{\lambda}{L + \lambda} and Wi(m)=12(L+λ)W_i^{(m)} = \frac{1}{2(L + \lambda)} for i=1,,2Li = 1, \dots, 2L.

  1. Compute the predicted covariance:

Pnn1=i=02LWi(c)(Xipredxnn1)(Xipredxnn1)+Q P_{n|n-1} = \sum_{i=0}^{2L} W_{i}^{(c)} \left( \mathbf{X}_{i}^{\text{pred}} - \mathbf{x}_{n|n-1} \right) \left( \mathbf{X}_{i}^{\text{pred}} - \mathbf{x}_{n|n-1} \right)^\top + Q

where Wi(c)W_{i}^{(c)} are the weights for the covariance (similar to Wi(m)W_{i}^{(m)}, but can be adjusted if desired), and QQ is the process noise covariance.

Step 4: Predict the Measurement and Update (Correction Step)

The prediction is completed at this point. In the UKF, the correction step (updating the state estimate based on the measurement) also uses sigma points.

  1. Predict measurement sigma points: Transform the sigma points using the measurement function hh (if nonlinear):

Yi=h(Xipred),i=0,1,,2L \mathbf{Y}_{i} = h(\mathbf{X}_{i}^{\text{pred}}), \quad i = 0, 1, \dots, 2L

  1. Compute predicted measurement mean and covariance:

Calculate the predicted measurement mean ynn1\mathbf{y}_{n|n-1} and the measurement covariance PyyP_{yy}, as well as the cross-covariance PxyP_{xy} between the state and measurement. These are calculated using weighted sums similar to the predicted mean and covariance for the state.

  1. Compute Kalman gain:

K=PxyPyy1 K = P_{xy} P_{yy}^{-1}

  1. Update state and covariance:

Finally, use the Kalman gain to update the state estimate and covariance based on the measurement zn\mathbf{z}_n:

xn=xnn1+K(znynn1) \mathbf{x}_{n} = \mathbf{x}_{n|n-1} + K \left( \mathbf{z}_n - \mathbf{y}_{n|n-1} \right)

Pn=Pnn1KPyyK P_{n} = P_{n|n-1} - K P_{yy} K^\top

How to get required matrixes if you have only data samples

In the Kalman filter, the covariance matrices (particularly PP, the error covariance matrix, and QQ, the process noise covariance matrix) play essential roles in managing and representing uncertainties, but they are not always directly calculated from scratch. Here’s how each covariance matrix is determined and managed in a typical Kalman filter:

Error Covariance Matrix (P)

The error covariance matrix PP represents the uncertainty in the state estimate. It is initialized and then continuously updated by the Kalman filter.

Initialization of P

  • At the start, PP is usually initialized based on an initial guess about the uncertainties of the state estimates. For example, if the initial state variables are position and velocity, the initial PP matrix could be set with large values on the diagonal (to indicate high uncertainty if the initial estimate is rough) or small values if the initial state estimate is already quite accurate.

P0=[σposition200σvelocity2] P_0 = \begin{bmatrix} \sigma_{position}^2 & 0 \\ 0 & \sigma_{velocity}^2 \end{bmatrix}

where σposition2\sigma_{position}^2 and σvelocity2\sigma_{velocity}^2 represent initial variances (uncertainties) in position and velocity estimates, respectively.

Update of P during Prediction and Correction

After initialization, PP is updated as part of each prediction and correction cycle:

  1. Prediction Step:

During the prediction phase, the filter updates PP to account for the passage of time and the possibility of errors or variations in the system model. This is achieved by applying the following equation:

Ptt1=APt1t1AT+Q P_{t|t-1} = A \cdot P_{t-1|t-1} \cdot A^T + Q

  • AA: The state transition matrix, which models the dynamics of how each state variable (position, velocity, etc.) affects the next state.
  • Ptt1P_{t|t-1}: Predicted error covariance matrix, representing the updated uncertainty after accounting for system dynamics and added process noise.
  • QQ: Process noise covariance matrix, which introduces additional uncertainty to account for model imperfections.

This update increases PP to reflect the increasing uncertainty as we move forward in time without a new measurement.

  1. Correction Step:

When a new measurement is available, PP is updated to reduce uncertainty using the Kalman gain KK:

Ptt=(IKtH)Ptt1 P_{t|t} = (I - K_t \cdot H) \cdot P_{t|t-1}

  • KtK_t: Kalman gain, which determines how much the measurement influences the state estimate.
  • HH: Observation matrix, mapping state variables to the measurement space.

This step reduces the uncertainty in PP by combining the prediction with the new measurement, allowing the Kalman filter to become more certain about the updated state.

Process Noise Covariance Matrix (Q)

The process noise covariance matrix QQ represents the uncertainty in the system’s dynamics or model. It’s essentially a model of how much unpredicted variation or disturbance (process noise) we expect in each state variable over time.

Setting Q

In most cases, QQ is designed rather than calculated based on the following considerations:

  • Expected Random Disturbances: If we are tracking a moving object, for instance, we might set a higher QQ if we expect the object to experience random, unpredictable accelerations (like wind or terrain changes).
  • Model Imperfections: Higher values in QQ are used to account for limitations in the system model.

For example, if we are tracking both position and velocity, we might set QQ as:

Q=[σprocess position200σprocess velocity2] Q = \begin{bmatrix} \sigma_{process\ position}^2 & 0 \\ 0 & \sigma_{process\ velocity}^2 \end{bmatrix}

where:

  • σprocess position2\sigma_{process\ position}^2 and σprocess velocity2\sigma_{process\ velocity}^2 are the variances representing the expected uncertainty in position and velocity due to unknown disturbances or inaccuracies in the model.

In some cases, domain knowledge or experimentation is used to tune QQ to achieve desired performance. There are also advanced methods, such as adaptive filtering, where QQ can be adjusted dynamically based on observed errors.

Measurement Noise Covariance Matrix (R)

The measurement noise covariance matrix RR represents the uncertainty (noise) associated with the measurements received from sensors. Each element in RR quantifies how much noise is expected in each measurement.

Setting R

The values in RR are typically derived from:

  • Sensor Specifications: Often, the manufacturer provides information on the sensor’s accuracy, which can be used to set RR. For example, a GPS sensor may have an accuracy of ±3 meters, which would influence the variance (uncertainty) for position measurements.
  • Experimental Data: Sometimes RR is determined empirically by taking repeated measurements in a controlled environment to estimate the variance.

For example, if the Kalman filter is using a GPS sensor to measure position, RR might be set as:

R=[σmeasurement position2] R = \begin{bmatrix} \sigma_{measurement\ position}^2 \end{bmatrix}

where σmeasurement position2\sigma_{measurement\ position}^2 is the variance associated with the position measurement’s noise.

Matrices extraction from data summary

Here’s a summary of how each covariance matrix is determined in a Kalman filter:

Covariance Matrix

Covariance Matrix Initialization Update Method
PP (Error Covariance) Initialized based on initial state uncertainty, often chosen manually. Updated each cycle based on predictions and new measurements.
QQ (Process Noise Covariance) Set based on expected model uncertainties and external disturbances. Often chosen empirically or based on domain knowledge. Usually remains constant, though it can be adapted based on observed performance.
RR (Measurement Noise Covariance) Set based on sensor noise characteristics, from specifications or empirical measurements. Typically remains constant, but can also be adjusted if the sensor’s noise characteristics change.

In summary, these covariance matrices are often set based on known characteristics of the system and sensors, and then refined through the Kalman filter’s recursive process. The values in PP, QQ, and RR are crucial for determining how the Kalman filter balances the predicted state and measurement to achieve an optimal estimate.

Conclusion

The Kalman filter is a cornerstone algorithm for estimating the state of dynamic systems in the presence of noise and uncertainty. By leveraging a mathematical model of the system, it predicts future states and continuously refines these estimates using noisy measurements. At the heart of the Kalman filter are key components: the state transition matrix, observation model, and covariance matrices that describe the uncertainties in both the process and measurements. These matrices, derived from system dynamics and data, play a crucial role in balancing prediction accuracy and adaptability to real-world variability.

Understanding and fine-tuning these components based on system knowledge, sensor data, and empirical observations is essential for the filter to perform optimally. Whether tracking motion, filtering noisy signals, or managing uncertainties in complex systems, the Kalman filter provides a versatile framework that evolves as data is processed. This adaptability makes it invaluable across fields such as robotics, navigation, and finance, offering reliable insights even in unpredictable environments.

Banner that links to Serokell Shop. You can buy stylish FP T-shirts there!
More from Serokell
Past and Present of Haskell: Interview with Simon Peyton JonesPast and Present of Haskell: Interview with Simon Peyton Jones
TypescriptTypescript
Applications of AI in Oil and GasApplications of AI in Oil and Gas