# Data-Driven Machinery Fault Diagnosis

## Introduction

Condition-based maintenance (CBM) is the practice of performing maintenance only when it is required. Adopting this strategy can yield significant cost savings because it avoids unnecessary periodic maintenance and reduces unplanned downtime. Condition-based maintenance is also commonly called predictive maintenance: as the name suggests, we predict in advance when maintenance should be performed. Maintenance is required if a fault has already occurred or is imminent, which leads us to the problems of fault diagnosis and fault prognosis.

In fault diagnosis, a fault has already occurred, and our aim is to identify its type and severity. In fault prognosis, our aim is to predict when a fault will occur in the future, given the machine's present state. These two problems are central to condition-based maintenance. The many methods available for solving them can be broadly divided into two groups:

- Model-based approaches
- Data-driven approaches

In the model-based approach, a complete model of the system is formulated and then used for fault diagnosis and prognosis. This approach has several limitations. First, modeling a system accurately is difficult, and it becomes even more challenging when working conditions vary. Second, different tasks require different models: for example, diagnosing a bearing fault and a gear fault requires two separate models. Data-driven methods provide a convenient alternative that avoids these problems.

In the data-driven approach, we use operational data from the machine to design algorithms that are then used for fault diagnosis and prognosis. The operational data may be vibration data, thermal imaging data, acoustic emission data, or something else. These techniques are robust to environmental variations. The accuracy obtained by data-driven methods is on par with, and sometimes better than, the accuracy obtained by model-based approaches. For these reasons, data-driven methods are becoming increasingly popular for diagnosis and prognosis tasks.

## Aim of the project

In this project, we apply some standard machine learning techniques to publicly available data sets and show their results with code. Few data sets in machinery condition monitoring are publicly available, so we work with the ones that are. Unlike in the machine learning community, where almost all data and code are open, very little is open in condition monitoring, though some people are gradually releasing their code. This project is a step in that direction, even though a tiny one.

This is an ongoing project, and new techniques will be added and existing ones modified over time. Python and R are two popular programming languages for machine learning applications, and we use both for our demonstrations. TensorFlow is used for deep learning applications. This page contains results on fault diagnosis only. Results on fault prognosis can be found here.

## Results using Case Western Reserve University Bearing Data*

We will first apply classical feature-based methods (so-called shallow learning methods) to obtain results and then apply deep learning based methods. In the feature-based methods, we make extensive use of wavelet packet energy features and wavelet packet entropy features, both calculated from raw time domain data. The dataset description and time domain preprocessing steps can be found here. Steps to compute time domain features are explained in this notebook. The procedure for calculating wavelet packet energy features can be found at this link, and the corresponding calculation for wavelet packet entropy features can be found at this link. Also see the following two notebooks for computing wavelet packet features in Python: Wavelet packet energy features in Python and Wavelet packet entropy features in Python.
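
To make the two feature types concrete, here is a minimal pure-Python sketch of wavelet packet energy and entropy features. It is not the code from the linked notebooks. It assumes a Haar wavelet, a decomposition level of 3, a signal whose length is divisible by 2**level, and one common definition of wavelet packet entropy (the Shannon entropy of the normalized squared coefficients within each node). The function names are hypothetical, and the notebooks may use a different wavelet, level, or entropy definition.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_nodes(signal, level):
    """Full wavelet packet tree: every node is split, not only the approximations."""
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_step(node))
        nodes = next_nodes
    return nodes  # 2**level coefficient lists, in natural (not frequency) order

def wp_energy_features(signal, level=3):
    """Energy (sum of squared coefficients) of each terminal node."""
    return [sum(c * c for c in node) for node in wavelet_packet_nodes(signal, level)]

def wp_entropy_features(signal, level=3):
    """Shannon entropy of the normalized squared coefficients of each terminal node.
    This definition is an assumption; other definitions exist."""
    features = []
    for node in wavelet_packet_nodes(signal, level):
        total = sum(c * c for c in node)
        if total == 0:
            features.append(0.0)
            continue
        probs = [c * c / total for c in node]
        features.append(-sum(p * math.log(p) for p in probs if p > 0))
    return features

# Synthetic test signal (not CWRU data): a sine wave with 64 samples.
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
print(len(wp_energy_features(signal)), "energy features")
print(len(wp_entropy_features(signal)), "entropy features")
```

Because the Haar transform used here is orthonormal, the node energies add up to the energy of the original signal, which is a convenient sanity check on any implementation. A level-3 decomposition gives 8 features per signal segment; each segment's feature vector then becomes one row of the input to a classifier.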

1. SVM on time domain features (10 classes, sampling frequency: 48k) (Overall accuracy: 96.5%) (Python code) (R code)

2. SVM on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 99.3%) (Python code) (R code)

3. SVM on wavelet packet entropy features (10 classes, sampling frequency: 48k) (Overall accuracy: 99.3%) (Python code) (R code)

4. SVM on time and wavelet packet features (12 classes, sampling frequency: 12k) (Achieves 100% test accuracy in one case) (Python code) (R code)

5. Multiclass Logistic Regression on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 98.5%) (Python code) (R code)

6. Multiclass Logistic Regression on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 99.7%) (Python code) (R code)

7. LDA on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 89.8%) (Python code) (R code)

8. LDA on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 99.5%) (Python code) (R code)

9. QDA on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 96.5%) (Python code) (R code)

10. QDA on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 99%) (Python code) (R code)

11. kNN on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 89.8%) (Python code) (R code)

12. kNN on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 99.5%) (Python code) (R code)

13. Decision tree on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 94.5%) (Python code) (R code)

14. Decision tree on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 99.7%) (Python code) (R code)

15. Bagging on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 97%) (Python code) (R code)

16. Bagging on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 100%) (Python code) (R code)

17. Boosting on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 99%) (Python code) (R code)

18. Boosting on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 100%) (Python code) (R code)

19. Random forest on wavelet packet energy features (10 classes, sampling frequency: 48k) (Overall accuracy: 98.1%) (Python code) (R code)

20. Random forest on wavelet packet energy features (12 classes, sampling frequency: 12k) (Overall accuracy: 100%) (Python code) (R code)
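
The "overall accuracy" figures above are the fraction of test samples assigned to the correct class. As an illustration of the train, predict, and score workflow behind each entry, here is a minimal pure-Python sketch of a k-nearest-neighbour classifier. It runs on synthetic two-dimensional clusters, not on CWRU features, and the function names, the value of k, and the data are all assumptions made for this example. The linked notebooks use library implementations rather than this code.

```python
import math
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    neighbours = sorted(
        (math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def make_synthetic_data(rng, n_per_class=30):
    """Three well-separated Gaussian clusters standing in for feature vectors."""
    centres = {0: (0.0, 0.0), 1: (10.0, 10.0), 2: (20.0, 0.0)}
    X, y = [], []
    for label, (cx, cy) in centres.items():
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y

rng = random.Random(0)
train_X, train_y = make_synthetic_data(rng)
test_X, test_y = make_synthetic_data(rng)
pred_y = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
print("overall accuracy:", overall_accuracy(test_y, pred_y))
```

The same pattern applies to the other classifiers in the list: fit on the training features, predict on held-out features, and report the fraction predicted correctly.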

* This hyperlink points to the original website of the CWRU bearing dataset. Unfortunately, the original website has recently been down most of the time. Because the dataset is well known, copies can still be found in various places on the internet. Interested readers who want to experiment with this dataset can find it here (if it's not down). The original data are stored in .mat format, whereas the data at the previous link have been extracted from the .mat files and stored individually as .csv files. Readers should first try to download the data from the original website and explore other options only if that attempt fails.

## Enter Deep Learning

In this section, we show fault diagnosis results obtained with deep learning on the same Case Western Reserve University bearing dataset. Because some operations used in deep learning are nondeterministic, and because libraries like TensorFlow depend on computer architecture, readers might obtain slightly different results from those in the notebooks. As a more reliable measure, we report results averaged over ten iterations. Our models are small enough to be run that many times in a reasonable amount of time. For reproducibility, we also share the saved models of each notebook. All saved models can be found at this link. A notebook describing the steps to use the saved models can be found here.
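
The averaging step can be sketched as follows. This is only an illustration of the bookkeeping, not code from the notebooks: `summarize_runs` and `dummy_run` are hypothetical names, and `dummy_run` returns made-up numbers in place of actually training and evaluating a model.

```python
import random
import statistics

def summarize_runs(run_fn, n_runs=10):
    """Call run_fn(seed) n_runs times; return (mean, sample std) of the accuracies."""
    accuracies = [run_fn(seed) for seed in range(n_runs)]
    return statistics.mean(accuracies), statistics.stdev(accuracies)

def dummy_run(seed):
    """Hypothetical stand-in for 'train and evaluate one model with this seed'.
    The returned value is a placeholder, not a result from the notebooks."""
    rng = random.Random(seed)
    return 0.5 + 0.1 * rng.random()

mean_acc, std_acc = summarize_runs(dummy_run, n_runs=10)
print(f"mean accuracy: {mean_acc:.3f}, std: {std_acc:.3f}")
```

Reporting the standard deviation alongside the mean shows how much a single run can deviate from the averaged figure.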

1. Fault diagnosis using convolutional neural network (CNN) on raw time domain data (10 classes, sampling frequency: 48k) (Overall accuracy: 98.7%)

2. CNN based fault diagnosis using continuous wavelet transform (CWT) of time domain data (10 classes, sampling frequency: 48k) (Overall accuracy: 99.1%)

(This list will be updated gradually.)
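
A CNN that works on raw time domain data needs inputs of a fixed length, so a long recording is typically cut into segments first. Here is a minimal sketch of that step. It assumes non-overlapping segments and discards any leftover samples at the end. The function name and the segment length are assumptions for illustration, not the settings used in the notebooks.

```python
def split_into_segments(signal, segment_length):
    """Cut a 1-D signal into consecutive, non-overlapping segments of equal length.
    Samples left over at the end (fewer than segment_length) are discarded."""
    n_segments = len(signal) // segment_length
    return [
        signal[i * segment_length:(i + 1) * segment_length]
        for i in range(n_segments)
    ]

# Toy example: a 10-sample "recording" cut into segments of 4 samples.
recording = list(range(10))
segments = split_into_segments(recording, segment_length=4)
print(len(segments), "segments of length", len(segments[0]))
```

Each segment, labelled with the fault class of the recording it came from, then serves as one training or test example.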

### Why have I used only Jupyter notebooks?

These notebooks are for educational purposes only. Our experiments are relatively small in scale and can be run in a reasonable amount of time in a notebook. I personally love the interactive nature of Jupyter notebooks: we can see what we are doing. So the answer to the above question is: personal choice. I also don't intend to deploy these in a production environment, at least for the time being. Readers who wish to build deployment-ready systems should bear in mind that this takes much more than running an algorithm in a Jupyter notebook.
