In this article, we’ll make a few notes on how the Kalman filter works and show how to overcome the most common problems that can appear while using it.
What is a Kalman filter?
The Kalman filter is an algorithm that estimates the state of a dynamic system from a series of noisy measurements. Developed by Rudolf Kalman in the 1960s, it has become a core component of control systems, robotics, navigation, and finance for tracking and prediction. The Kalman filter is particularly useful because it provides an efficient computational means to make real-time predictions and corrections, minimizing the influence of noise.
Key Concepts in the Kalman Filter
- State Variables: Represent the underlying true state of the system, such as position and velocity in tracking problems. These variables evolve over time and are not directly measurable.
- Prediction and Update: The filter operates in a loop of two main steps:
- Prediction Step: Predicts the next state based on the current state and a known model of system dynamics.
- Update Step: Adjusts the prediction based on a new measurement, correcting for observed discrepancies.
- Noise: Assumes noise in both the process (system dynamics) and measurements. The Kalman filter uses statistical assumptions to model these uncertainties.
- Gaussian Assumptions: It assumes that all noise is Gaussian (i.e., follows a normal distribution), which allows it to rely on Gaussian statistics to make optimal estimations.
In order to use the Kalman filter to estimate the internal state of a process given only a sequence of noisy observations, one must model the process in accordance with the following framework. This means specifying the following matrices for each time-step k:
- F_k, the state-transition model;
- H_k, the observation model;
- Q_k, the covariance of the process noise;
- R_k, the covariance of the observation noise;
- sometimes B_k, the control-input model, only if it exists;
- u_k, the control vector, representing the control input applied to the system.
Predict step:
We are trying to predict how our state changes, both the state estimate itself and its uncertainty:
- Predicted (a priori) state estimate: x̂_{k|k-1} = F_k x̂_{k-1|k-1} + B_k u_k, or what state we predict before seeing the measurement.
- Predicted (a priori) estimate covariance: P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k, or the uncertainty of the predicted state. This works due to the covariance transformation rule.
Update step:
- Innovation or measurement pre-fit residual: ỹ_k = z_k − H_k x̂_{k|k-1}, where z_k is the measurement
- Innovation (or pre-fit residual) covariance: S_k = H_k P_{k|k-1} H_k^T + R_k
- Optimal Kalman gain: K_k = P_{k|k-1} H_k^T S_k^{-1}
- Updated (a posteriori) state estimate: x̂_{k|k} = x̂_{k|k-1} + K_k ỹ_k
- Updated (a posteriori) estimate covariance: P_{k|k} = (I − K_k H_k) P_{k|k-1}
- Measurement post-fit residual: ỹ_{k|k} = z_k − H_k x̂_{k|k}
Basic Kalman Filter implementation
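A minimal linear Kalman filter for a constant-velocity tracking problem might look like the following sketch (NumPy-based; the model matrices, noise levels, and simulated measurements are illustrative assumptions, so the exact residual values will differ from run to run):

```python
import numpy as np

rng = np.random.default_rng(42)

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])      # constant-velocity transition model
H = np.array([[1.0, 0.0]])                 # we observe position only
Q = np.array([[0.1, 0.0], [0.0, 0.1]])     # process noise covariance
R = np.array([[25.0]])                     # measurement noise covariance

x = np.array([[0.0], [1.0]])               # initial state: position 0, velocity 1
P = np.eye(2) * 100.0                      # high initial uncertainty

true_pos = 0.0
for k in range(50):
    true_pos += 1.0 * dt                   # ground truth moves at constant velocity
    z = np.array([[true_pos + rng.normal(0.0, 5.0)]])  # noisy position measurement

    # Predict step
    x = F @ x
    P = F @ P @ F.T + Q

    # Update step
    y_pre = z - H @ x                      # pre-fit residual (innovation)
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y_pre
    P = (np.eye(2) - K @ H) @ P
    y_post = z - H @ x                     # post-fit residual

    print(f"prefit residual: {y_pre[0, 0]:.8f} postfit residual: {y_post[0, 0]:.8f}")
```

The post-fit residual is always smaller in magnitude than the pre-fit residual, because the update pulls the state estimate toward the measurement.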
Several lines of output:
prefit residual: -10.19742879 postfit residual: -7.31806471
prefit residual: -12.33359224 postfit residual: -8.85095993
prefit residual: -8.91856241 postfit residual: -6.40016151
prefit residual: -10.93745076 postfit residual: -7.84887742
Kalman filter without a model
We can still apply the Kalman filter to denoise measurements when we do not have a process model but only the measurements themselves. In this case, the state-transition model is simply the identity: we assume the state stays approximately constant between measurements, and the process noise covariance controls how quickly we allow it to drift.
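A scalar sketch of this idea (the noise variances q and r are illustrative tuning knobs, and the test signal is simulated):

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar "model-free" Kalman filter: the state is the measured value itself,
# and the transition model is the identity (we assume the value drifts slowly).
q = 1e-2   # process noise variance: how fast we allow the value to drift
r = 0.5    # measurement noise variance: how noisy the sensor is

x, p = 0.0, 1.0                      # initial estimate and its variance
signal = np.sin(np.linspace(0, 6, 200))
measurements = signal + rng.normal(0.0, np.sqrt(r), size=signal.size)

filtered = []
for z in measurements:
    # Predict: state unchanged, uncertainty grows by q
    p = p + q
    # Update
    k = p / (p + r)                  # scalar Kalman gain
    x = x + k * (z - x)
    p = (1.0 - k) * p
    filtered.append(x)

filtered = np.array(filtered)
print("raw MSE:     ", np.mean((measurements - signal) ** 2))
print("filtered MSE:", np.mean((filtered - signal) ** 2))
```

The ratio q/r is the main tuning knob: a smaller q (relative to r) gives stronger smoothing but more lag behind real changes in the signal.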
Kalman filter with a complex transition model
In cases where the prediction model is complex or nonlinear, and it’s difficult or impossible to represent it with a simple matrix, you can use the predicted value from the model directly rather than relying on matrix calculations in the Kalman filter. This approach is common, especially in nonlinear Kalman filtering techniques like the Extended Kalman Filter (EKF) or the Unscented Kalman Filter (UKF).
Extended Kalman Filter (EKF)
This one can be used if you can calculate the Jacobian of the transition function. In the standard (linear) Kalman filter, the state prediction step is:
x̂_{k|k-1} = F_k x̂_{k-1|k-1}
where F_k is the state transition matrix. However, if you have a nonlinear or complicated model, you can replace this matrix-based prediction with a custom prediction function that captures the complex behavior of your model.
- Define the Model Function: Let f(x) represent your complex model that predicts the next state directly. Instead of computing x̂_{k|k-1} using F_k, compute it as:
x̂_{k|k-1} = f(x̂_{k-1|k-1})
where f could be any function (e.g., a nonlinear equation, simulation output, or a black-box prediction model).
- You’ll still need a linear approximation of f to update the covariance matrix P. This is done by computing the Jacobian matrix F_k of f around the current state estimate:
F_k = ∂f/∂x, evaluated at x̂_{k-1|k-1}
This Jacobian matrix replaces the original state transition matrix and is used to update the covariance matrix as:
P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k
where Q_k is the process noise covariance.
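As a sketch, here is an EKF prediction step for a hypothetical frictionless-pendulum model (the model f, its Jacobian, and all numbers are assumptions chosen for illustration):

```python
import numpy as np

dt = 0.1

def f(x):
    """Hypothetical nonlinear transition: a frictionless pendulum
    with state [angle, angular velocity]."""
    return np.array([x[0] + dt * x[1],
                     x[1] - dt * np.sin(x[0])])

def jacobian_f(x):
    """Analytic Jacobian of f, evaluated at the current estimate."""
    return np.array([[1.0,                 dt],
                     [-dt * np.cos(x[0]), 1.0]])

def ekf_predict(x, P, Q):
    x_pred = f(x)                 # propagate the state through the nonlinear model
    F = jacobian_f(x)             # linearize around the current estimate
    P_pred = F @ P @ F.T + Q      # covariance update uses the Jacobian
    return x_pred, P_pred

x = np.array([0.5, 0.0])
P = np.eye(2) * 0.1
Q = np.eye(2) * 1e-4

x_pred, P_pred = ekf_predict(x, P, Q)
print(x_pred)
print(P_pred)
```

The update step is then the same as in the linear filter, using the Jacobian of the measurement function in place of H if that function is nonlinear too.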
If calculating the Jacobian is difficult or impossible, you can use the Unscented Kalman Filter (UKF).
Unscented Kalman Filter (UKF)
The UKF doesn’t require an explicit Jacobian because it approximates the distribution of the state by propagating a set of “sigma points” through the nonlinear function f. This makes it well-suited for complex models without linearization.
Proceed with the Standard Update Step: After obtaining x̂_{k|k-1} (either from f in EKF/UKF or through matrix multiplication in the linear Kalman filter), you can proceed with the usual Kalman filter update step, where measurements are incorporated to correct the predicted state.
The Unscented Kalman Filter (UKF) propagates sigma points through the nonlinear model to approximate the state distribution after the prediction step. Instead of linearizing the model (as in the Extended Kalman Filter), the UKF uses a set of carefully chosen sigma points that capture the mean and covariance of the state. These sigma points are propagated through the nonlinear model, and the resulting transformed points are used to approximate the predicted mean and covariance of the state distribution.
Here’s a step-by-step explanation of how sigma-point propagation works in the UKF.
Step 1: Generate Sigma Points
For a state vector of dimension n with mean x̂ and covariance P, we create 2n + 1 sigma points. These sigma points are chosen to capture the spread of the distribution around the mean.
- Calculate the square root of the scaled covariance matrix:
√((n + λ) P)
where λ is a scaling parameter, often chosen as λ = α²(n + κ) − n, with tuning parameters α (often set to a small positive value, e.g., 1e-3) and κ (often set to 0 or 3 − n).
- Generate the sigma points:
χ_0 = x̂
χ_i = x̂ + (√((n + λ) P))_i, for i = 1, …, n
χ_{n+i} = x̂ − (√((n + λ) P))_i, for i = 1, …, n
where (√((n + λ) P))_i is the i-th column of the matrix square root of (n + λ) P.
This results in a set of 2n + 1 sigma points: {χ_0, χ_1, …, χ_{2n}}.
These points are distributed around the mean and represent the original state distribution.
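Step 1 can be sketched as follows (`sigma_points` is a hypothetical helper; a Cholesky factor serves as the matrix square root, a common choice):

```python
import numpy as np

def sigma_points(x, P, alpha=1e-3, kappa=0.0):
    """Generate the 2n+1 sigma points for mean x and covariance P,
    using the scaling parameter lam = alpha^2 * (n + kappa) - n."""
    n = x.size
    lam = alpha**2 * (n + kappa) - n
    # Matrix square root of (n + lam) * P via Cholesky decomposition
    sqrt_P = np.linalg.cholesky((n + lam) * P)
    points = np.zeros((2 * n + 1, n))
    points[0] = x
    for i in range(n):
        points[i + 1] = x + sqrt_P[:, i]       # "plus" points
        points[n + i + 1] = x - sqrt_P[:, i]   # mirrored "minus" points
    return points, lam

x = np.array([1.0, 0.5])
P = np.array([[0.2, 0.05], [0.05, 0.1]])
pts, lam = sigma_points(x, P)
print(pts.shape)   # 2n+1 = 5 sigma points of dimension 2
```

Because the plus and minus points are symmetric around x̂, their unweighted average is exactly the mean.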
Step 2: Propagate Sigma Points through the Nonlinear Function
Each sigma point χ_i is then propagated through the nonlinear state transition function f:
χ_i* = f(χ_i), for i = 0, …, 2n
This gives a set of transformed sigma points χ_i* that represent the distribution of the predicted state after applying the nonlinear function.
Step 3: Compute the Predicted Mean and Covariance
Once we have the transformed sigma points, we calculate the predicted mean and predicted covariance of the state.
- Compute the predicted mean:
x̂⁻ = Σ_{i=0}^{2n} W_i^m χ_i*
where W_i^m are the weights for the mean, which depend on the scaling parameter λ. Typically, W_0^m = λ / (n + λ) and W_i^m = 1 / (2(n + λ)) for i = 1, …, 2n.
- Compute the predicted covariance:
P⁻ = Σ_{i=0}^{2n} W_i^c (χ_i* − x̂⁻)(χ_i* − x̂⁻)^T + Q
where W_i^c are the weights for the covariance (similar to W_i^m, but the first weight is commonly adjusted to W_0^c = W_0^m + (1 − α² + β), with β = 2 a common choice for Gaussian distributions), and Q is the process noise covariance.
Step 4: Predict the Measurement and Update (Correction Step)
The prediction is completed at this point. In the UKF, the correction step (updating the state estimate based on the measurement) also uses sigma points.
- Predict measurement sigma points: Transform the sigma points using the measurement function h (if nonlinear):
Z_i = h(χ_i*), for i = 0, …, 2n
- Compute predicted measurement mean and covariance:
Calculate the predicted measurement mean ẑ = Σ_i W_i^m Z_i and the measurement covariance S = Σ_i W_i^c (Z_i − ẑ)(Z_i − ẑ)^T + R, as well as the cross-covariance P_xz = Σ_i W_i^c (χ_i* − x̂⁻)(Z_i − ẑ)^T between the state and measurement. These are calculated using weighted sums similar to the predicted mean and covariance for the state.
- Compute Kalman gain:
K = P_xz S^{-1}
- Update state and covariance:
Finally, use the Kalman gain to update the state estimate and covariance based on the measurement z:
x̂ = x̂⁻ + K (z − ẑ)
P = P⁻ − K S K^T
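Putting the four steps together, one UKF predict-and-update cycle might be sketched like this (a sketch, not a production implementation; f and h are placeholder functions, and this variant reuses the propagated sigma points for the measurement step, while some implementations redraw sigma points from the predicted covariance):

```python
import numpy as np

def ukf_step(x, P, z, f, h, Q, R, alpha=1e-3, beta=2.0, kappa=0.0):
    """One UKF predict + update cycle.
    x, P: prior state mean and covariance; z: measurement;
    f: transition function, h: measurement function."""
    n = x.size
    lam = alpha**2 * (n + kappa) - n

    # Weights for the mean (Wm) and covariance (Wc)
    Wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)

    # Step 1: sigma points around the prior mean
    L = np.linalg.cholesky((n + lam) * P)
    chi = np.vstack([x, x + L.T, x - L.T])        # rows are sigma points

    # Step 2: propagate through the transition model
    chi_f = np.array([f(c) for c in chi])

    # Step 3: predicted mean and covariance
    x_pred = Wm @ chi_f
    dx = chi_f - x_pred
    P_pred = dx.T @ (Wc[:, None] * dx) + Q

    # Step 4: push the propagated points through the measurement model,
    # then compute innovation covariance, cross-covariance, and gain
    Z = np.array([h(c) for c in chi_f])
    z_pred = Wm @ Z
    dz = Z - z_pred
    S = dz.T @ (Wc[:, None] * dz) + R
    P_xz = dx.T @ (Wc[:, None] * dz)
    K = P_xz @ np.linalg.inv(S)

    x_new = x_pred + K @ (z - z_pred)
    P_new = P_pred - K @ S @ K.T
    return x_new, P_new

# Toy usage with identity models, where the UKF reduces to the linear filter
x0 = np.array([0.0, 1.0])
P0 = np.eye(2)
xn, Pn = ukf_step(x0, P0, np.array([0.2, 0.9]),
                  lambda s: s, lambda s: s,
                  Q=np.eye(2) * 0.01, R=np.eye(2) * 0.1)
print(xn)
```

With linear f and h the sigma-point sums reproduce the linear Kalman filter exactly, which is a convenient sanity check when testing a UKF implementation.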
How to get the required matrices if you have only data samples
In the Kalman filter, the covariance matrices (particularly P, the error covariance matrix, and Q, the process noise covariance matrix) play essential roles in managing and representing uncertainties, but they are not always directly calculated from scratch. Here’s how each covariance matrix is determined and managed in a typical Kalman filter:
Error Covariance Matrix (P)
The error covariance matrix P represents the uncertainty in the state estimate. It is initialized and then continuously updated by the Kalman filter.
Initialization of P
- At the start, P is usually initialized based on an initial guess about the uncertainties of the state estimates. For example, if the initial state variables are position and velocity, the initial matrix could be set with large values on the diagonal (to indicate high uncertainty if the initial estimate is rough) or small values if the initial state estimate is already quite accurate:
P_0 = [[σ²_p, 0], [0, σ²_v]]
where σ²_p and σ²_v represent initial variances (uncertainties) in the position and velocity estimates, respectively.
Update of P during Prediction and Correction
After initialization, P is updated as part of each prediction and correction cycle:
- Prediction Step:
During the prediction phase, the filter updates P to account for the passage of time and the possibility of errors or variations in the system model. This is achieved by applying the following equation:
P_{k|k-1} = F P_{k-1|k-1} F^T + Q
- F: The state transition matrix, which models the dynamics of how each state variable (position, velocity, etc.) affects the next state.
- P_{k|k-1}: Predicted error covariance matrix, representing the updated uncertainty after accounting for system dynamics and added process noise.
- Q: Process noise covariance matrix, which introduces additional uncertainty to account for model imperfections.
This update increases P to reflect the increasing uncertainty as we move forward in time without a new measurement.
- Correction Step:
When a new measurement is available, P is updated to reduce uncertainty using the Kalman gain K:
P_{k|k} = (I − K_k H) P_{k|k-1}
- K_k: Kalman gain, which determines how much the measurement influences the state estimate.
- H: Observation matrix, mapping state variables to the measurement space.
This step reduces the uncertainty in P by combining the prediction with the new measurement, allowing the Kalman filter to become more certain about the updated state.
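The grow-and-shrink cycle of P can be observed numerically (a sketch; the matrices are illustrative):

```python
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity dynamics
H = np.array([[1.0, 0.0]])              # position-only measurement
Q = np.eye(2) * 0.01                    # small process noise
R = np.array([[4.0]])                   # measurement noise variance

P = np.diag([100.0, 100.0])             # high initial uncertainty

# Prediction: uncertainty grows
P_pred = F @ P @ F.T + Q

# Correction: uncertainty shrinks once a measurement arrives
S = H @ P_pred @ H.T + R
K = P_pred @ H.T @ np.linalg.inv(S)
P_upd = (np.eye(2) - K @ H) @ P_pred

print("trace before:", np.trace(P))
print("trace after predict:", np.trace(P_pred))
print("trace after update:", np.trace(P_upd))
```

With a measurement far more precise than the prior (R = 4 vs. an initial position variance of 100), a single update collapses the position uncertainty dramatically.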
Process Noise Covariance Matrix (Q)
The process noise covariance matrix Q represents the uncertainty in the system’s dynamics or model. It’s essentially a model of how much unpredicted variation or disturbance (process noise) we expect in each state variable over time.
Setting Q
In most cases, Q is designed rather than calculated, based on the following considerations:
- Expected Random Disturbances: If we are tracking a moving object, for instance, we might set a higher Q if we expect the object to experience random, unpredictable accelerations (like wind or terrain changes).
- Model Imperfections: Higher values in Q are used to account for limitations in the system model.
For example, if we are tracking both position and velocity, we might set Q as:
Q = [[σ²_{q,p}, 0], [0, σ²_{q,v}]]
where:
- σ²_{q,p} and σ²_{q,v} are the variances representing the expected uncertainty in position and velocity due to unknown disturbances or inaccuracies in the model.
In some cases, domain knowledge or experimentation is used to tune Q to achieve desired performance. There are also advanced methods, such as adaptive filtering, where Q can be adjusted dynamically based on observed errors.
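One common way to design Q for a [position, velocity] state is the piecewise-constant white-acceleration model, where an unknown random acceleration with variance σ²_a acts over each time step (a sketch; the numbers are illustrative):

```python
import numpy as np

def q_discrete_white_noise(dt, var_accel):
    """Process noise for a [position, velocity] state under the
    piecewise-constant white acceleration model: an unknown random
    acceleration with variance var_accel acts over each interval dt."""
    return var_accel * np.array([[dt**4 / 4, dt**3 / 2],
                                 [dt**3 / 2, dt**2]])

Q = q_discrete_white_noise(dt=1.0, var_accel=0.5)
print(Q)
```

This couples position and velocity noise through the off-diagonal terms, which is usually more realistic than an independent diagonal Q.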
Measurement Noise Covariance Matrix (R)
The measurement noise covariance matrix R represents the uncertainty (noise) associated with the measurements received from sensors. Each element in R quantifies how much noise is expected in each measurement.
Setting R
The values in R are typically derived from:
- Sensor Specifications: Often, the manufacturer provides information on the sensor’s accuracy, which can be used to set R. For example, a GPS sensor may have an accuracy of ±3 meters, which would influence the variance (uncertainty) for position measurements.
- Experimental Data: Sometimes R is determined empirically by taking repeated measurements in a controlled environment to estimate the variance.
For example, if the Kalman filter is using a GPS sensor to measure position, R might be set as:
R = [σ²_{gps}]
where σ²_{gps} is the variance associated with the position measurement’s noise.
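Estimating R empirically can be sketched like this (the calibration data is simulated here for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical calibration run: the sensor observes a known, fixed position,
# so all variation in the readings is measurement noise.
true_position = 10.0
readings = true_position + rng.normal(0.0, 3.0, size=1000)  # ~±3 m GPS-like noise

# Empirical measurement noise variance becomes R
sigma2 = np.var(readings, ddof=1)
R = np.array([[sigma2]])
print(R)
```

For multi-dimensional measurements, the same idea generalizes to the sample covariance matrix of the repeated readings.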
Summary of matrix extraction from data
Here’s a summary of how each covariance matrix is determined in a Kalman filter:
Covariance Matrix | Initialization | Update Method
---|---|---
P (Error Covariance) | Initialized based on initial state uncertainty, often chosen manually. | Updated each cycle based on predictions and new measurements.
Q (Process Noise Covariance) | Set based on expected model uncertainties and external disturbances. Often chosen empirically or based on domain knowledge. | Usually remains constant, though it can be adapted based on observed performance.
R (Measurement Noise Covariance) | Set based on sensor noise characteristics, from specifications or empirical measurements. | Typically remains constant, but can also be adjusted if the sensor’s noise characteristics change.
In summary, these covariance matrices are often set based on known characteristics of the system and sensors, and then refined through the Kalman filter’s recursive process. The values in P, Q, and R are crucial for determining how the Kalman filter balances the predicted state and measurement to achieve an optimal estimate.
Conclusion
The Kalman filter is a cornerstone algorithm for estimating the state of dynamic systems in the presence of noise and uncertainty. By leveraging a mathematical model of the system, it predicts future states and continuously refines these estimates using noisy measurements. At the heart of the Kalman filter are key components: the state transition matrix, observation model, and covariance matrices that describe the uncertainties in both the process and measurements. These matrices, derived from system dynamics and data, play a crucial role in balancing prediction accuracy and adaptability to real-world variability.
Understanding and fine-tuning these components based on system knowledge, sensor data, and empirical observations is essential for the filter to perform optimally. Whether tracking motion, filtering noisy signals, or managing uncertainties in complex systems, the Kalman filter provides a versatile framework that evolves as data is processed. This adaptability makes it invaluable across fields such as robotics, navigation, and finance, offering reliable insights even in unpredictable environments.