PyTorch Recipes

Pradeepta Mishra

PyTorch Recipes: A Problem-Solution Approach

Pradeepta Mishra

Bangalore, Karnataka, India

ISBN 978-1-4842-4257-5 e-ISBN 978-1-4842-4258-2

https://doi.org/10.1007/978-1-4842-4258-2

Library of Congress Control Number: 2018968538

Apress Standard

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

I would like to dedicate this book to my dear parents, my lovely wife, Prajna, and my daughter, Priyanshi (Aarya). This work would not have been possible without their inspiration, support, and encouragement.

Introduction

The development of artificial intelligence products and solutions has recently become the norm; hence, the demand for graph theory–based computational frameworks is on the rise. Deep learning models can be made to work in real-life applications when the modeling framework is dynamic, flexible, and adaptable to other frameworks.

PyTorch is a recent entrant to the league of graph computation tools and frameworks. Addressing the limitations of previous frameworks, PyTorch promises a better user experience both in deploying deep learning models and in creating advanced models that combine convolutional neural networks, recurrent neural networks, LSTMs, and deep neural networks.

PyTorch was created by Facebook’s Artificial Intelligence Research division to make the model development process simple, straightforward, and dynamic, so that developers do not have to worry about declaring objects before compiling and executing the model. It is based on the Torch framework and is used as a Python library.

This book is intended for data scientists, natural language processing engineers, artificial intelligence solution developers, existing practitioners working on graph computation frameworks, and researchers of graph theory. It starts with tensor basics and computation, arithmetic operations, matrix algebra, and statistical distribution-based operations using the PyTorch framework.

Chapters 3 and 4 provide detailed descriptions of neural network basics. Advanced neural networks, such as convolutional neural networks, recurrent neural networks, and LSTMs, are explored. Readers will be able to implement these models using PyTorch functions.

Chapters 5 and 6 discuss fine-tuning the models, hyperparameter tuning, and the refinement of existing PyTorch models in production. Readers learn how to choose the hyperparameters to fine-tune the model.

Chapter 7 explains natural language processing. Applying deep learning models to natural language processing and artificial intelligence is one of the most in-demand skill sets in the industry. Readers will be able to benchmark the execution and performance of PyTorch implementations of deep learning models that process natural language, and to compare PyTorch with other graph computation–based deep learning programming tools.

Acknowledgments

I would like to thank my wife, Prajna, for her continuous inspiration and support, and for sacrificing her weekends to sit alongside me and help me complete the book; my daughter, Aarya, for being patient throughout my writing time; and my father, for his eagerness to know how many chapters I had completed.

A big thank you to Nikhil, Celestin, and Divya, for fast-tracking the whole process and for helping and guiding me in the right direction.

I would like to thank my bosses, Ashish and Saty, for always being supportive of my initiatives in the AI and ML journey, and for their continuous motivation and inspiration in writing in the AI space.

Chapter 1: Introduction to PyTorch, Tensors, and Tensor Operations

What Is PyTorch?

PyTorch Installation

Recipe 1-1. Using Tensors

Problem

Solution

How It Works

Conclusion

Chapter 2: Probability Distributions Using PyTorch

Recipe 2-1. Sampling Tensors

Problem

Solution

How It Works

Recipe 2-2. Variable Tensors

Problem

Solution

How It Works

Recipe 2-3. Basic Statistics

Problem

Solution

How It Works

Recipe 2-4. Gradient Computation

Problem

Solution

How It Works

Recipe 2-5. Tensor Operations

Problem

Solution

How It Works

Recipe 2-6. Tensor Operations

Problem

Solution

How It Works

Recipe 2-7. Distributions

Problem

Solution

How It Works

Conclusion

Chapter 3: CNN and RNN Using PyTorch

Recipe 3-1. Setting Up a Loss Function

Problem

Solution

How It Works

Recipe 3-2. Estimating the Derivative of the Loss Function

Problem

Solution

How It Works

Recipe 3-3. Fine-Tuning a Model

Problem

Solution

How It Works

Recipe 3-4. Selecting an Optimization Function

Problem

Solution

How It Works

Recipe 3-5. Further Optimizing the Function

Problem

Solution

How It Works

Recipe 3-6. Implementing a Convolutional Neural Network (CNN)

Problem

Solution

How It Works

Recipe 3-7. Reloading a Model

Problem

Solution

How It Works

Recipe 3-8. Implementing a Recurrent Neural Network (RNN)

Problem

Solution

How It Works

Recipe 3-9. Implementing a RNN for Regression Problems

Problem

Solution

How It Works

Recipe 3-10. Using PyTorch Built-in Functions

Problem

Solution

How It Works

Recipe 3-11. Working with Autoencoders

Problem

Solution

How It Works

Recipe 3-12. Fine-Tuning Results Using Autoencoder

Problem

Solution

How It Works

Recipe 3-13. Visualizing the Encoded Data in a 3D Plot

Problem

Solution

How It Works

Recipe 3-14. Restricting Model Overfitting

Problem

Solution

How It Works

Recipe 3-15. Visualizing the Model Overfit

Problem

Solution

How It Works

Recipe 3-16. Initializing Weights in the Dropout Rate

Problem

Solution

How It Works

Recipe 3-17. Adding Math Operations

Problem

Solution

How It Works

Recipe 3-18. Embedding Layers in RNN

Problem

Solution

How It Works

Conclusion

Chapter 4: Introduction to Neural Networks Using PyTorch

Recipe 4-1. Working with Activation Functions

Problem

Solution

How It Works

Recipe 4-2. Visualizing the Shape of Activation Functions

Problem

Solution

How It Works

Recipe 4-3. Basic Neural Network Model

Problem

Solution

How It Works

Recipe 4-4. Tensor Differentiation

Problem

Solution

How It Works

Conclusion

Chapter 5: Supervised Learning Using PyTorch

Introduction to Linear Regression

Recipe 5-1. Data Preparation for the Supervised Model

Problem

Solution

How It Works

Recipe 5-2. Forward and Backward Propagation

Problem

Solution

How It Works

Recipe 5-3. Optimization and Gradient Computation

Problem

Solution

How It Works

Recipe 5-4. Viewing Predictions

Problem

Solution

How It Works

Recipe 5-5. Supervised Model Logistic Regression

Problem

Solution

How It Works

Conclusion

Chapter 6: Fine-Tuning Deep Learning Models Using PyTorch

Recipe 6-1. Building Sequential Neural Networks

Problem

Solution

How It Works

Recipe 6-2. Deciding the Batch Size

Problem

Solution

How It Works

Recipe 6-3. Deciding the Learning Rate

Problem

Solution

How It Works

Recipe 6-4. Performing Parallel Training

Problem

Solution

How It Works

Conclusion

Chapter 7: Natural Language Processing Using PyTorch

Recipe 7-1. Word Embedding

Problem

Solution

How It Works

Recipe 7-2. CBOW Model in PyTorch

Problem

Solution

How It Works

Recipe 7-3. LSTM Model

Problem

Solution

How It Works

Index

About the Author and About the Technical Reviewer

About the Author

Pradeepta Mishra is a data scientist and artificial intelligence architect. He currently heads NLP, ML, and AI initiatives at Lymbyc, a leading-edge innovator in AI and machine learning based out of Bangalore, India. He has expertise in designing artificial intelligence systems for performing tasks such as understanding natural language and recommendations based on natural language processing. He has filed three patents as an inventor and has authored and co-authored two books: R Data Mining Blueprints (Packt Publishing, 2016) and R: Mining Spatial, Text, Web, and Social Media Data (Packt Publishing, 2017). There are two courses available on Udemy based on these books.

Pradeepta presented a keynote talk on the application of bidirectional LSTM for time series forecasting at the Global Data Science Conference 2018. He delivered a TEDx talk titled “Can Machines Think?”, a session on the power of artificial intelligence in transforming industries and changing job roles across industries. He has also delivered more than 150 tech talks on data science, machine learning, and artificial intelligence at various meetups, technical institutions, universities, and community forums.

He is on LinkedIn at www.linkedin.com/in/pradeepta/.

About the Technical Reviewer

Shivendra Upadhyay has more than eight years of experience working for consulting and software firms. He has worked in data science with KPMG for more than three years, and has a firm grasp of machine learning and data science tools and technologies.
