Failure Prediction System
Serokell developed an ML-based time-to-failure prediction system for electric locomotive engines for a company that provides innovative solutions in the rail transportation industry.
The objective was to develop a cutting-edge AI failure prediction system with high forecast accuracy to prevent costly breakdowns, minimize service-related downtime, and avoid frequent inspections. Another requirement was the system's ability to provide interpretable predictions.
A black-box AI failure predictor asks business owners to halt expensive locomotives for maintenance without explaining why, which can hinder effective decision-making.
By contrast, a failure prediction system that gives technicians and service engineers easily understandable, fact-supported alerts can gain the trust of transportation corporations and is highly sought after in the logistics market.
Specialized sensors were integrated into the engine system to collect raw data, so that traditional locomotive equipment could supply additional technical details.
We used big data analysis to identify critical points and enable the system to accurately forecast the time to failure.
Our solution combined the power of the internet of things (IoT) and machine learning.
Challenges
Designing a software solution for a unique and complex multivariate system is a challenge in itself. However, our main obstacle was the lack of continuous data, including data from the period preceding a breakdown.
The provided dataset contained data from several electric locomotives for the last two years, presented as 36-hour intervals with timestamped readings from each sensor every few seconds.
When working on the project, we faced several difficulties related to data quality: data biases, outliers, and missing values. An additional challenge was meeting the requirements to ensure the transparency and interpretability of our results.
- In some instances, missing values accounted for up to 16% of the total timestamps available.
- The recorded intervals were not consecutive: gaps of up to three weeks separated them.
- Up to 20% of intervals lacked enough metadata to determine their position in the timeline and had to be excluded from the analysis.
- Some sensors in locomotives did not work at all, resulting in missing or corrupted data.
- We had no opportunity to collect additional sensor data for the last hours, or even days, before an engine failure.
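The data-quality issues listed above can be quantified before any modelling. The following is a minimal sketch rather than the project's actual tooling: it assumes readings are (timestamp, value) pairs with None marking a missing value, and intervals are (start, end) pairs. The function names and sample data are hypothetical.

```python
from datetime import datetime, timedelta

def missing_fraction(readings):
    """Share of timestamps whose sensor value is missing (None)."""
    if not readings:
        return 0.0
    missing = sum(1 for _, value in readings if value is None)
    return missing / len(readings)

def gaps_between(intervals):
    """Gaps between consecutive recording intervals, ordered by start time.

    Each interval is a (start, end) pair of datetimes.
    """
    ordered = sorted(intervals, key=lambda iv: iv[0])
    return [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]

# Hypothetical data: one reading every 5 seconds, every 8th value missing.
t0 = datetime(2022, 1, 1)
readings = [
    (t0 + timedelta(seconds=5 * i), None if i % 8 == 0 else 20.0 + i)
    for i in range(100)
]

# Two hypothetical 36-hour intervals, supplied out of order.
intervals = [
    (t0 + timedelta(days=22, hours=12), t0 + timedelta(days=24)),
    (t0, t0 + timedelta(hours=36)),
]

print(f"missing: {missing_fraction(readings):.0%}")
print("largest gap:", max(gaps_between(intervals)))
```

Sorting by start time is only possible for intervals whose position in the timeline is known, which is why intervals without that metadata cannot take part in this kind of check.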
Because of the limited data availability, we had to rule out several machine learning algorithms that were highly sensitive to data quality and could not provide the expected level of accuracy. After thorough investigation, we developed a solution based on decision tree ensembles, specifically the XGBoost model.
Our tasks included extensive cleaning and preprocessing of the dataset to address the challenges above.
This involved:
- Handling missing values, outliers, and data biases.
- Transforming the dataset into a form suitable for the XGBoost decision tree ensemble.
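As an illustration of these two steps, the sketch below fills missing values, clips outliers, and turns a raw sensor series into fixed-size tabular rows of the kind tree ensembles consume. The specific techniques (forward fill, median-absolute-deviation clipping, non-overlapping windows) are assumptions chosen for the example; the source does not state which methods the project used.

```python
from statistics import mean, median

def forward_fill(values):
    """Replace None with the most recent observed value (leading Nones stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def clip_outliers(values, k=5.0):
    """Clip values lying more than k median absolute deviations from the median.

    Note: a constant series has a deviation of zero, so nothing is clipped.
    """
    observed = [v for v in values if v is not None]
    med = median(observed)
    mad = median(abs(v - med) for v in observed)
    lo, hi = med - k * mad, med + k * mad
    return [None if v is None else min(max(v, lo), hi) for v in values]

def window_features(values, size):
    """Summarise each non-overlapping window of `size` readings as one row."""
    rows = []
    for start in range(0, len(values) - size + 1, size):
        window = [v for v in values[start:start + size] if v is not None]
        if window:
            rows.append({"mean": mean(window), "min": min(window), "max": max(window)})
    return rows

# Hypothetical temperature readings with two gaps and one spike.
raw = [70.0, 71.0, None, 72.0, 500.0, 71.0, None, 70.0]
clean = clip_outliers(forward_fill(raw))
print(window_features(clean, size=4))
```

Each resulting row is a fixed-length feature vector, so series of different lengths and with different gaps end up in the same tabular shape.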
To handle time-to-failure prediction and meet the customer's requirement for interpretability, we incorporated the widely used Weibull distribution into the XGBoost model. The Weibull distribution is a continuous probability distribution commonly used in reliability analysis; depending on its shape parameter, it can describe failure rates that increase, decrease, or stay constant over time.
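To make the role of the Weibull parameters concrete, here is a small sketch of the standard two-parameter Weibull survival and hazard functions, plus the probability of failure within a given horizon. It does not show how the distribution was coupled to XGBoost, which the source does not describe; the shape and scale values are hypothetical.

```python
import math

def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape): probability of no failure before t."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def failure_within(t_now, horizon, shape, scale):
    """P(failure in (t_now, t_now + horizon] given survival up to t_now)."""
    return 1.0 - (
        weibull_survival(t_now + horizon, shape, scale)
        / weibull_survival(t_now, shape, scale)
    )

# Hypothetical parameters: scale in operating hours, three shape regimes.
scale = 1000.0
for shape, label in [(0.5, "decreasing"), (1.0, "constant"), (2.0, "increasing")]:
    early = weibull_hazard(100.0, shape, scale)
    late = weibull_hazard(500.0, shape, scale)
    print(f"shape={shape}: hazard {label} ({early:.5f} -> {late:.5f})")

# Chance of failure in the next 36 hours for an engine at 500 operating hours.
print(f"{failure_within(500.0, 36.0, shape=2.0, scale=scale):.3f}")
```

A shape below 1 gives a falling failure rate, a shape of 1 a constant one, and a shape above 1 a rising one, so the fitted parameters themselves tell an engineer something about how the engine is wearing.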
Results
We developed a transparent and interpretable ML-based time-to-failure prediction system for electric locomotive engines that provides breakdown forecasts with good accuracy up to 36 hours in advance and reasonably accurate predictions up to 7 days in advance.
The system predicts a fault 36 hours ahead with an F1 score above 0.8 and 7 days ahead with an F1 score of 0.7.
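For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for binary labels, assuming each label answers "did a fault occur within the horizon?"; that framing and the sample labels are hypothetical and are not the project's evaluation data.

```python
def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # no correct positive predictions
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 1 = fault occurred within the horizon, 0 = no fault.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(f1_score(y_true, y_pred))
```

Because faults are rare relative to normal operation, F1 is more informative here than plain accuracy, which a model could inflate by never predicting a fault.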
We have designed an interpretable AI model that uses the statistical Weibull probability density function, enabling engineers in the railway company to understand the factors underlying the predictions.
By preventing breakdowns, transportation companies can reduce the risk of accidents and minimize the cost of repairs, resulting in increased ROI. This project has significant benefits, both commercially and as a safety-protection measure for the rail transportation industry.