Accurate fault classification is important for condition-based maintenance (CBM) applications. The two main approaches to fault classification are model-based and data-driven. Data-driven approaches are becoming increasingly popular because they can be easily automated and often achieve high accuracy across a range of tasks. They can be based on shallow learning or deep learning. In shallow learning, useful features are first calculated from raw time-domain data. These features may pertain to the time domain, the frequency domain, or the time-frequency domain, and they are then fed into a machine learning algorithm that performs fault classification. In contrast, deep learning models do not require hand-crafted features, because representations are learned automatically from data. Deep learning models thus take raw time-domain data as input and produce classification results as output in an end-to-end manner, which makes them difficult to interpret. In this paper, we show that the classification ability of a deep neural network is derived from its hidden representations. These hidden representations can be used as features in classical machine learning algorithms for fault classification, which helps explain the classification ability of the representations learned at different layers of a deep network. We apply this technique to a real-world bearing dataset and obtain promising results.
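The idea described above, reusing hidden representations as features for a classical classifier, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic two-class sinusoidal signals stand in for bearing data, the network is a small one-hidden-layer model written in NumPy, and a nearest-centroid rule stands in for the classical machine learning algorithm. The helper names (`make_signals`, `train_mlp`, `hidden_features`, `fit_centroids`, `predict_centroids`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_signals(n_per_class=100, length=64):
    """Synthetic stand-in for raw time-domain segments (assumption, not bearing data)."""
    t = np.arange(length) / length
    X, y = [], []
    for label, freq in enumerate((3.0, 7.0)):  # one dominant frequency per class
        noise = 0.3 * rng.standard_normal((n_per_class, length))
        X.append(np.sin(2 * np.pi * freq * t) + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)

def train_mlp(X, y, hidden=8, epochs=200, lr=0.1):
    """Train a one-hidden-layer network (tanh hidden units, sigmoid output)
    with full-batch gradient descent on the mean cross-entropy loss."""
    n, d = X.shape
    W1 = 0.1 * rng.standard_normal((d, hidden))
    b1 = np.zeros(hidden)
    w2 = 0.1 * rng.standard_normal(hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                   # hidden representation
        p = 1.0 / (1.0 + np.exp(-(H @ w2 + b2)))   # predicted P(class 1)
        g = (p - y) / n                            # gradient w.r.t. output logit
        gH = np.outer(g, w2) * (1.0 - H ** 2)      # backpropagate through tanh
        w2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1

def hidden_features(X, W1, b1):
    """Map raw signals to the learned hidden-layer activations."""
    return np.tanh(X @ W1 + b1)

def fit_centroids(F, y):
    """Classical classifier on the features: one mean vector per class."""
    classes = np.unique(y)
    return classes, np.stack([F[y == c].mean(axis=0) for c in classes])

def predict_centroids(F, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    dist = np.linalg.norm(F[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dist.argmin(axis=1)]

# Split the data, train the network, then classify using hidden features only.
X, y = make_signals()
idx = rng.permutation(len(y))
train, test = idx[:150], idx[150:]

W1, b1 = train_mlp(X[train], y[train])
F_train = hidden_features(X[train], W1, b1)
F_test = hidden_features(X[test], W1, b1)

classes, centroids = fit_centroids(F_train, y[train])
acc = float(np.mean(predict_centroids(F_test, classes, centroids) == y[test]))
print(f"accuracy of nearest-centroid classifier on hidden features: {acc:.2f}")
```

In a deeper network, the same extraction step could be repeated for each layer, and comparing the classical classifier's accuracy across layers would give a rough indication of how much class information each representation carries.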