Data-Driven Remaining Useful Life (RUL) Prediction

Introduction

Remaining useful life (RUL) prediction is the task of estimating, from the present state of a component, how long it will keep working before it fails. The problem has a prophetic charm to it. A soothsayer can confidently make a prediction about almost anything (including the RUL of a machine), but many people will not accept the prediction because it lacks a scientific basis. Here, we will try to solve the problem with scientific reasoning.

A component (or a machine) is said to have failed when it can no longer perform its desired task to the satisfaction of the user. For example, the Li-ion battery of an electric vehicle is said to have failed when it requires frequent recharging to travel a small distance. Similarly, a bearing of a machine is said to have failed if the level of vibration produced at the bearing goes above some acceptable limit. Similar examples can be found in other applications. The goal, then, is to predict in advance when something is going to fail. Knowing a component’s expected time of failure helps us prepare for the inevitable. In an industrial setting, where an unplanned shutdown of a critical component carries a huge monetary cost, knowing when a machine is going to fail can result in significant savings.
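The threshold-based definition of failure above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rul_from_threshold`, the vibration values, and the threshold are all hypothetical, and RUL is counted in time steps until the first reading that exceeds the limit.

```python
import numpy as np

def rul_from_threshold(signal, threshold):
    """Return the RUL (in time steps) at each step up to the first threshold crossing.

    Returns None when the signal never exceeds the threshold, i.e. when no
    failure is observed in the record.
    """
    signal = np.asarray(signal, dtype=float)
    exceeded = np.flatnonzero(signal > threshold)
    if exceeded.size == 0:
        return None
    failure_step = int(exceeded[0])
    # RUL counts down to 0 at the step where the limit is first exceeded.
    return np.arange(failure_step, -1, -1)

# Hypothetical vibration levels and acceptable limit (made-up numbers).
vibration = [0.2, 0.3, 0.5, 0.9, 1.4, 2.1]
rul = rul_from_threshold(vibration, threshold=1.0)
print(rul)
```

In practice the failure time is not known in advance, which is exactly why RUL has to be predicted from the data observed so far.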

Many techniques have been developed over the years to predict the RUL of a component. They can be broadly divided into two categories:

• Model-Based Methods
• Data-Driven Methods

In model-based methods, we formulate a mathematical model of the system under consideration and use that model to predict the RUL of the component. Although model-based methods are used in some cases, there are many applications where formulating a full mathematical model of the system is extremely difficult. In some cases, the underlying physics is so complex that we have to make many simplifying assumptions, and whether those assumptions are justified can only be determined by collecting real data from the machine. Model-based methods therefore require extensive domain knowledge and remain the territory of a select few specialists.

In contrast, in data-driven methods all information about a machine is gained from the data collected from it. With readily available sensors we can collect huge amounts of data for almost any application. By analyzing that data we can get an idea of the condition of the machine, which helps us make an informed decision about its RUL. In this process we make no explicit modeling assumptions about the machine. Data-driven methods are also becoming increasingly reliable. As the name of the project suggests, we will focus only on data-driven methods for RUL prediction. The problem of RUL prediction is also known as prognosis in some fields, and some people call it prognostics. We will only use the term RUL prediction. In the beginning, we will mainly focus on predicting the RUL of mechanical components. Later we will explore other application areas.

Aim of the project

This is an ongoing project, and new techniques will be added and existing ones modified over time. Python and R are two popular programming languages used in machine learning applications. We will use Python to demonstrate our results. At a later stage we might add equivalent R code. To implement deep learning models, we will use TensorFlow.

Results using NASA’s Turbofan Engine Degradation Dataset

We will first apply classical machine learning methods (so-called shallow learning methods) to obtain results and then apply deep learning based methods. The dataset description and preprocessing steps can be found at this link. We will use the same preprocessing steps, with minor changes, in all notebooks. We strongly encourage readers to go over the data preparation notebook before using the results notebooks. In the table below, we report Root Mean Square Error (RMSE) values. Click on the numbers in the table to view the corresponding notebooks.

Note on the last column of the following table: The last column specifies the degradation model used in the notebooks. Two degradation models are commonly used for this turbofan dataset: the linear degradation model and the piecewise linear degradation model. For more details about both, see this. When we use the piecewise linear degradation model, we have to assume an early RUL value, that is, the RUL value assigned while the component is still relatively new. Different authors in the literature use different early RUL values. In our examples, a single early RUL value means that the same value is applied across all four datasets.
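The two degradation models can be sketched as functions that generate RUL targets for one run-to-failure engine. This is a minimal illustration, not necessarily the exact code used in the notebooks: the function names are hypothetical, and we assume RUL is measured in cycles and reaches 0 at the last recorded cycle.

```python
import numpy as np

def linear_rul(n_cycles):
    """Linear degradation: RUL decreases by one per cycle and is 0 at the last cycle."""
    return np.arange(n_cycles - 1, -1, -1, dtype=float)

def piecewise_linear_rul(n_cycles, early_rul=125):
    """Piecewise linear degradation: RUL is held at early_rul while the engine is
    relatively new, then decreases linearly to 0."""
    return np.minimum(linear_rul(n_cycles), early_rul)

# Hypothetical engine that runs for 200 cycles before failing.
targets = piecewise_linear_rul(200, early_rul=125)
print(targets[:3], targets[-3:])
```

With an early RUL of 125, every cycle whose linear RUL would exceed 125 gets the target 125, so the model is not asked to distinguish between degrees of "new".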

| Method | FD001 | FD002 | FD003 | FD004 | Degradation Model |
|---|---|---|---|---|---|
| Gradient Boosting | 19.06 | 28.97 | 20.55 | 29.49 | Piecewise Linear (Early RUL = 125) |
| Random Forest | 19.15 | 29.00 | 20.53 | 29.75 | Piecewise Linear (Early RUL = 125) |
| Support Vector Regression (SVR) | 18.28 | 30.50 | 21.37 | 34.11 | Piecewise Linear (Early RUL: 125 (FD001, FD003), 150 (FD002, FD004)) |
| Gradient Boosting | 33.24* | 29.88 | 47.94* | 40.34* | Linear |

*See the notebook to get a complete picture.
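RMSE, the metric reported in the tables, is the square root of the mean squared difference between true and predicted RUL. A minimal sketch follows; the function name and the RUL values are made up for illustration and are not results from the notebooks.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between true and predicted RUL values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up true and predicted RUL values (in cycles) for four test engines.
true_rul = [112, 98, 69, 82]
pred_rul = [110, 100, 75, 80]
score = rmse(true_rul, pred_rul)
print(round(score, 2))
```

Because the errors are squared before averaging, a single large miss raises RMSE more than several small ones.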

Enter Deep Learning

In this section, we will apply deep learning to predict RUL on the turbofan dataset. Because of the nondeterministic nature of operations used in deep learning and the dependence of libraries like TensorFlow on computer architecture, readers might obtain slightly different results from those in the notebooks. For reproducibility of our results, we also share the saved models of each notebook. All saved models for the turbofan dataset can be found at this link. A notebook describing the steps to use the saved models can be found here.
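Fixing random seeds is one common way to reduce run-to-run variation. It does not remove it entirely, which is why the saved models are shared. The helper below is a hypothetical sketch, not code from the notebooks. It seeds Python and NumPy, and also TensorFlow through `tf.random.set_seed` when TensorFlow is installed.

```python
import random

import numpy as np

def set_global_seed(seed=42):
    """Seed Python, NumPy and (if available) TensorFlow random number generators.

    Returns True when TensorFlow was also seeded, False otherwise.
    """
    random.seed(seed)
    np.random.seed(seed)
    try:
        import tensorflow as tf
    except ImportError:
        return False  # TensorFlow not installed; only Python and NumPy are seeded
    tf.random.set_seed(seed)
    return True

set_global_seed(123)
first = np.random.rand(3)
set_global_seed(123)
second = np.random.rand(3)
print(np.allclose(first, second))
```

Even with fixed seeds, some GPU operations may still be nondeterministic, so small differences from the reported numbers are to be expected.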

| Method | FD001 | FD002 | FD003 | FD004 | Degradation Model |
|---|---|---|---|---|---|
| LSTM | 15.16 | - | 15.54 | - | Piecewise Linear (Early RUL = 125) |
| 1D CNN | 15.84 | - | 15.78 | - | Piecewise Linear (Early RUL = 125) |

Attention Based RUL Prediction

What is attention? We recommend Chapter 10 of this book for more details. We provide notebooks that implement GRU-based additive attention for RUL prediction. For reproducibility of our results, we share trained weights. All trained weights can be found here. We also provide separate notebooks describing the steps to use the trained weights to reproduce our exact results.
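As a rough illustration of additive attention, the NumPy sketch below scores each hidden state of a sequence with a small feed-forward layer, turns the scores into weights with a softmax, and returns the weighted sum of the hidden states. It is a simplified variant without a separate query vector, and not necessarily the architecture used in the notebooks. The function name, the shapes, and the random arrays standing in for GRU outputs and learned parameters are all assumptions made for illustration.

```python
import numpy as np

def additive_attention(hidden_states, W, b, v):
    """Pool a sequence of hidden states into one context vector.

    hidden_states: (T, H) array, e.g. the outputs of a GRU over T time steps
    W: (H, A) projection, b: (A,) bias, v: (A,) scoring vector
    """
    scores = np.tanh(hidden_states @ W + b) @ v   # one score per time step, shape (T,)
    scores = scores - scores.max()                # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over time steps
    context = weights @ hidden_states             # weighted sum, shape (H,)
    return context, weights

# Random stand-ins for GRU outputs and learned parameters (hypothetical sizes).
rng = np.random.default_rng(0)
T, H, A = 30, 16, 8
hidden_states = rng.normal(size=(T, H))
W = rng.normal(size=(H, A))
b = np.zeros(A)
v = rng.normal(size=A)

context, weights = additive_attention(hidden_states, W, b, v)
print(context.shape, weights.shape)
```

In a trained model, `W`, `b` and `v` are learned together with the GRU, and the context vector is passed to a regression layer that outputs the RUL estimate.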

| Method | FD001 | FD002 | FD003 | FD004 | Degradation Model |
|---|---|---|---|---|---|
| GRU + Additive Attention | 14.21 | 27.99 | 14.64 | 26.77 | Piecewise Linear (Early RUL: 125 (FD001, FD003), 150 (FD002, FD004)) |

(This table will be updated gradually.)

Why have I used only Jupyter notebooks?

These notebooks are for educational purposes only. Our experiments are relatively small scale and can be run in a reasonable amount of time in a notebook. I personally love the interactive nature of Jupyter notebooks: we can see what we are doing. So the answer to the above question is: personal choice. I also don’t intend to deploy these in a production environment, at least for the time being. Readers who wish to build deployment-ready systems should bear in mind that this takes much more than running an algorithm in a Jupyter notebook.

For attribution, cite this project as

BibTeX citation
author = {Sahoo, Biswajit},
title = {Data-Driven Remaining Useful Life (RUL) Prediction},
url = {https://biswajitsahoo1111.github.io/rul_codes_open/},
year = {2018}
}

Readers should cite original datasets separately.

Biswajit Sahoo
PhD Student

My research interests include machine learning, deep learning, signal processing and data-driven machinery condition monitoring.