Deep Learning

IndexedSlices in Tensorflow

In this post, we will discuss the IndexedSlices class of TensorFlow. We will try to answer the following questions: What are IndexedSlices? Where do we get them? How do we convert IndexedSlices to tensors? What are IndexedSlices? According to the TensorFlow documentation, IndexedSlices are a sparse representation of a set of tensor slices at given indices.
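
To make the definition concrete, here is a minimal pure-Python sketch of the idea. It is not TensorFlow's implementation. The names `values`, `indices`, and `dense_shape` mirror the attributes of `tf.IndexedSlices`, while the helper `to_dense` is hypothetical. The sketch assumes that slices sharing the same index are added together when the dense form is built.

```python
def to_dense(values, indices, dense_shape):
    """Expand indexed slices into a dense 2-D list (hypothetical helper)."""
    n_rows, n_cols = dense_shape
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    # values[i] is the slice that belongs at row indices[i] of the dense result.
    for row, row_values in zip(indices, values):
        for col, v in enumerate(row_values):
            dense[row][col] += v  # assumption: repeated indices accumulate
    return dense


values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three stored slices
indices = [0, 3, 0]                            # row 0 appears twice
dense_shape = (4, 2)                           # shape of the full tensor

dense = to_dense(values, indices, dense_shape)
for row in dense:
    print(row)
```

Only three of the four rows are stored explicitly, which is where the memory saving comes from. In TensorFlow itself, passing an IndexedSlices object to `tf.convert_to_tensor` should give the dense tensor without any hand-written loop.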

Reading multiple files in Tensorflow 2 using Sequence

In this post, we will read multiple csv files using TensorFlow Sequence. In an earlier post, we demonstrated the procedure for reading multiple csv files using a custom generator. Though generators are convenient for handling chunks of data from a large dataset, they have limited portability and scalability (see the caution here).

Reading multiple csv files in PyTorch

In many engineering applications, data are usually stored in CSV (Comma Separated Values) files. In big data applications, it’s not uncommon to obtain thousands of csv files. As the number of files increases, at some point we can no longer load the whole dataset into the computer’s memory. In deep learning applications, it is increasingly common to come across datasets that don’t fit in the computer’s memory.

Efficiently reading multiple files in Tensorflow 2

Note: Whether this method is efficient or not is contestable. The efficiency of a data input pipeline depends on many factors. How efficiently are the data loaded? What is the computer architecture on which the computations are being done? Is a GPU available? And the list goes on. So readers might get different performance results when they use this method in their own problems. For the simple (and small) problem considered in this post, we got no perceptible performance improvement.

Reading multiple files in Tensorflow 2

In this post, we will read multiple .csv files into TensorFlow using generators. But the method we will discuss is general enough to work for other file formats as well. We will demonstrate the procedure using 500 .csv files. These files have been created using random numbers. Each file contains only 1024 numbers in one column.
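
As a rough sketch of the approach described above, the following standard-library Python reads several single-column .csv files one at a time and yields their values in fixed-size chunks. The function name `csv_value_generator`, the chunk size of 256, and the three temporary stand-in files are assumptions made for illustration. They are not taken from the post, which uses 500 files.

```python
import csv
import os
import random
import tempfile


def csv_value_generator(file_paths, chunk_size=256):
    """Yield lists of up to `chunk_size` floats, reading one file at a time."""
    for path in file_paths:
        with open(path, newline="") as f:
            chunk = []
            for row in csv.reader(f):
                chunk.append(float(row[0]))  # single-column files
                if len(chunk) == chunk_size:
                    yield chunk
                    chunk = []
            if chunk:  # leftover values at the end of a file
                yield chunk


# Create a few small stand-in files: 1024 random numbers in one column each.
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(tmp_dir, f"file_{i}.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for _ in range(1024):
            writer.writerow([random.random()])
    paths.append(path)

chunks = list(csv_value_generator(paths))
print(len(chunks), len(chunks[0]))
```

Only one chunk is held in memory at a time while iterating. A generator of this kind is the sort of callable that `tf.data.Dataset.from_generator` accepts, although the exact pipeline used in the post may differ.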

Using Python Generators

In this post, we will discuss generators in Python. In this age of big data, it is not unlikely to encounter a large dataset that can’t be loaded into RAM. In such scenarios, it is natural to extract workable chunks of data and work on them. Generators help us do just that.
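
A minimal sketch of the idea, assuming a hypothetical helper named `read_in_chunks`: the generator hands out one chunk per request instead of building all chunks up front. A small in-memory list stands in for the large dataset here.

```python
def read_in_chunks(data, chunk_size):
    """Yield successive slices of `data`, each at most `chunk_size` long."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


numbers = list(range(10))        # stand-in for a much larger dataset
gen = read_in_chunks(numbers, 4)

first = next(gen)                # computed only when requested
rest = list(gen)                 # drains the remaining chunks
print(first, rest)
```

Because the function body runs only as far as the next `yield`, each chunk is produced on demand, and the last chunk may be shorter than the others.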