TensorFlow 2 code for the Attention Mechanisms chapter of the Dive into Deep Learning (D2L) book


This code has been merged into the D2L book. See PRs 1756 and 1768.

This post contains TensorFlow 2 code for the Attention Mechanisms chapter of the Dive into Deep Learning (D2L) book. The chapter has 7 sections, and the code for each section can be found at the links below. We have given only code implementations; for the theory, readers should refer to the book. A minimal illustrative sketch of the core attention computation appears after the list.

10.1. Attention Cues

10.2. Attention Pooling: Nadaraya-Watson Kernel Regression

10.3. Attention Scoring Functions

10.4. Bahdanau Attention

10.5. Multi-Head Attention

10.6. Self-Attention and Positional Encoding

10.7. Transformer
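
To give a flavor of what the notebooks cover, here is a minimal TensorFlow 2 sketch of scaled dot-product attention, the scoring function of Section 10.3 and the building block of multi-head attention in Section 10.5. The function name and toy tensor shapes are illustrative, not taken from the notebooks; see the book and the linked code for the full implementations.

```python
import tensorflow as tf

def scaled_dot_product_attention(queries, keys, values):
    """Scaled dot-product attention (illustrative sketch).

    queries: (batch, num_queries, d)
    keys:    (batch, num_kv, d)
    values:  (batch, num_kv, value_dim)
    """
    d = tf.cast(tf.shape(queries)[-1], tf.float32)
    # Scores: similarity of every query with every key,
    # scaled by sqrt(d) to keep the softmax well-behaved.
    scores = tf.matmul(queries, keys, transpose_b=True) / tf.sqrt(d)
    weights = tf.nn.softmax(scores, axis=-1)
    # Output: attention-weighted sum of the values.
    return tf.matmul(weights, values)

# Toy check: 2 queries attend over 10 key-value pairs.
q = tf.random.normal((1, 2, 4))
k = tf.random.normal((1, 10, 4))
v = tf.random.normal((1, 10, 6))
print(scaled_dot_product_attention(q, k, v).shape)  # (1, 2, 6)
```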

Additional sections:

9.7. Sequence to Sequence Learning

Additional Chapters:

Chapter 17: Generative Adversarial Networks

How to run this code:

The best way (in our opinion) is to clone the repo (or download and extract the zipped repo) and then run each notebook from the cloned (or extracted) folder. All the notebooks will run without any issues.

Note: We claim no originality for the code. Credit goes to the authors of this excellent book. However, all errors and omissions are my own, and readers are encouraged to bring them to my notice. Finally, no TensorFlow code was available (to the best of my knowledge) for the Attention Mechanisms chapter when this repo was first made public.

Biswajit Sahoo
Machine Learning Engineer

My research interests include machine learning, deep learning, signal processing and data-driven machinery condition monitoring.
