Deep Learning

IndexedSlices in Tensorflow

In this post, we will discuss the IndexedSlices class of Tensorflow. We will try to answer the following questions: What are IndexedSlices? Where do we get them? How do we convert IndexedSlices to tensors? What are IndexedSlices? According to the Tensorflow documentation, IndexedSlices are a sparse representation of a set of tensor slices at given indices.

Tensorflow 2 code for Attention Mechanisms chapter of Dive into Deep Learning (D2L) book

This code has been merged into the D2L book. See PRs 1756 and 1768. This post contains Tensorflow 2 code for the Attention Mechanisms chapter of the Dive into Deep Learning (D2L) book. The chapter has 7 sections, and the code for each section can be found at the following links.

Reading multiple files in Tensorflow 2 using Sequence

In this post, we will read multiple csv files using Tensorflow Sequence. In an earlier post, we demonstrated the procedure for reading multiple csv files using a custom generator. Though generators are convenient for handling chunks of data from a large dataset, they have limited portability and scalability (see the caution here).

Reading multiple csv files in PyTorch

In many engineering applications, data are usually stored in CSV (Comma Separated Values) files. In big data applications, it’s not uncommon to obtain thousands of csv files. As the number of files increases, at some point we can no longer load the whole dataset into the computer’s memory. In deep learning applications, it is increasingly common to come across datasets that don’t fit in the computer’s memory.

Efficiently reading multiple files in Tensorflow 2

Note: Whether this method is efficient or not is contestable. The efficiency of a data input pipeline depends on many factors. How efficiently are the data loaded? On what computer architecture are the computations being done? Is a GPU available? And the list goes on. So readers might get different performance results when they use this method in their own problems. For the simple (and small) problem considered in this post, we got no perceptible performance improvement.

Reading multiple files in Tensorflow 2

In this post, we will read multiple .csv files into Tensorflow using generators. The method we will discuss is general enough to work for other file formats as well. We will demonstrate the procedure using 500 .csv files. These files have been created using random numbers. Each file contains only 1024 numbers in one column.
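As a rough illustration of the idea, the sketch below creates a handful of one-column .csv files filled with random numbers and reads them back one file at a time with a generator. It uses only the Python standard library and does not show the Tensorflow side of the pipeline. The helper names `make_csv_files` and `csv_generator`, the file naming pattern, and the choice of 5 files (instead of the 500 used in the post) are assumptions made for brevity, not the post's actual code.

```python
import csv
import os
import random
import tempfile


def make_csv_files(folder, n_files=5, n_rows=1024, seed=0):
    """Write n_files one-column .csv files of random numbers (hypothetical helper)."""
    rng = random.Random(seed)
    paths = []
    for i in range(n_files):
        path = os.path.join(folder, f"file_{i:03d}.csv")
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            for _ in range(n_rows):
                writer.writerow([rng.random()])
        paths.append(path)
    return paths


def csv_generator(paths):
    """Yield the single column of each file as a list of floats, one file at a time."""
    for path in paths:
        with open(path, newline="") as f:
            yield [float(row[0]) for row in csv.reader(f) if row]


# Only one file's contents are held in memory at any moment.
with tempfile.TemporaryDirectory() as folder:
    paths = make_csv_files(folder)
    lengths = [len(column) for column in csv_generator(paths)]
```

Because the generator opens each file only when it is asked for the next item, the same pattern should carry over to other file formats by swapping the parsing line.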

Using Python Generators

In this post, we will discuss generators in Python. In this age of big data, it is not unlikely to encounter a large dataset that can’t be loaded into RAM. In such scenarios, it is natural to extract workable chunks of data and work on them. Generators help us do just that.
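A minimal sketch of the chunking idea, assuming the data can be consumed as any Python iterable. The function name `read_in_chunks` and the chunk size are illustrative choices, not taken from the post.

```python
from itertools import islice


def read_in_chunks(values, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # the source is exhausted, so the generator stops
        yield chunk


# range(10) stands in for a dataset too large to hold in memory at once.
chunks = read_in_chunks(range(10), chunk_size=4)
first = next(chunks)   # only the first chunk has been produced so far
rest = list(chunks)    # the remaining chunks are produced on demand
```

Nothing is computed until a chunk is requested, which is what keeps memory use bounded by the chunk size rather than the dataset size.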

Data-Driven Remaining Useful Life (RUL) Prediction

The aim of this project is to produce reproducible results in condition monitoring. We will apply some of the standard machine learning techniques to publicly available datasets and show the results, with code, for the remaining useful life (RUL) prediction task. This is an ongoing project and will evolve over time. Related notebooks can be found at this [GitHub page](https://biswajitsahoo1111.github.io/rul_codes_open/).

Data-Driven Machinery Fault Diagnosis

The aim of this project is to produce reproducible results in condition monitoring. We will apply some of the standard machine learning techniques to publicly available machinery datasets and show the results, with code, for the fault diagnosis task. This is an ongoing project and will evolve over time. Related notebooks and data can be found at this [GitHub page](https://biswajitsahoo1111.github.io/cbm_codes_open/).