Modern Computer Vision with PyTorch by V Kishore Ayyadevara

Index

Title Page
Copyright and Credits

Modern Computer Vision with PyTorch

Dedication
About Packt

Why subscribe?

Contributors

About the authors
About the reviewer
Packt is searching for authors like you

Preface

Who this book is for
What this book covers
To get the most out of this book

Download the example code files
Download the color images
Conventions used

Get in touch

Reviews

Section 1 - Fundamentals of Deep Learning for Computer Vision
Artificial Neural Network Fundamentals

Comparing AI and traditional machine learning
Learning about the artificial neural network building blocks
Implementing feedforward propagation

Calculating the hidden layer unit values
Applying the activation function
Calculating the output layer values
Calculating loss values

Calculating loss during continuous variable prediction
Calculating loss during categorical variable prediction

Feedforward propagation in code

Activation functions in code
Loss functions in code

Implementing backpropagation

Gradient descent in code
Implementing backpropagation using the chain rule

Putting feedforward propagation and backpropagation together
Understanding the impact of the learning rate
Summarizing the training process of a neural network
Summary
Questions

PyTorch Fundamentals

Installing PyTorch
PyTorch tensors

Initializing a tensor
Operations on tensors
Auto gradients of tensor objects
Advantages of PyTorch's tensors over NumPy's ndarrays

Building a neural network using PyTorch

Dataset, DataLoader, and batch size
Predicting on new data points
Implementing a custom loss function
Fetching the values of intermediate layers

Using a sequential method to build a neural network
Saving and loading a PyTorch model

state dict
Saving
Loading

Summary
Questions

Building a Deep Neural Network with PyTorch

Representing an image

Converting images into structured arrays and scalars

Why leverage neural networks for image analysis?
Preparing our data for image classification
Training a neural network
Scaling a dataset to improve model accuracy
Understanding the impact of varying the batch size

Batch size of 32
Batch size of 10,000

Understanding the impact of varying the loss optimizer
Understanding the impact of varying the learning rate

Impact of the learning rate on a scaled dataset

High learning rate
Medium learning rate
Low learning rate
Parameter distribution across layers for different learning rates

Impact of varying the learning rate on a non-scaled dataset

Understanding the impact of learning rate annealing
Building a deeper neural network
Understanding the impact of batch normalization

Very small input values without batch normalization
Very small input values with batch normalization

The concept of overfitting

Impact of adding dropout
Impact of regularization

L1 regularization
L2 regularization

Summary
Questions

Section 2 - Object Classification and Detection
Introducing Convolutional Neural Networks

The problem with traditional deep neural networks
Building blocks of a CNN

Convolution
Filter
Strides and padding

Strides
Padding

Pooling
Putting them all together
How convolution and pooling help in image translation

Implementing a CNN

Building a CNN-based architecture using PyTorch
Forward propagating the output in Python

Classifying images using deep CNNs
Implementing data augmentation

Image augmentations

Affine transformations
Changing the brightness
Adding noise
Performing a sequence of augmentations

Performing data augmentation on a batch of images and the need for collate_fn
Data augmentation for image translation

Visualizing the outcome of feature learning
Building a CNN for classifying real-world images

Impact on the number of images used for training

Summary
Questions

Transfer Learning for Image Classification

Introducing transfer learning
Understanding VGG16 architecture
Understanding ResNet architecture
Implementing facial key point detection

2D and 3D facial key point detection

Multi-task learning – Implementing age estimation and gender classification
Introducing the torch_snippets library
Summary
Questions

Practical Aspects of Image Classification

Generating CAMs
Understanding the impact of data augmentation and batch normalization

Coding up road sign detection

Practical aspects to take care of during model implementation

Dealing with imbalanced data
The size of the object within an image
Dealing with the difference between training and validation data
The number of nodes in the flatten layer
Image size
Leveraging OpenCV utilities

Summary
Questions

Basics of Object Detection

Introducing object detection
Creating a bounding box ground truth for training

Installing the image annotation tool

Understanding region proposals

Leveraging SelectiveSearch to generate region proposals
Implementing SelectiveSearch to generate region proposals

Understanding IoU
Non-max suppression
Mean average precision
Training R-CNN-based custom object detectors

Working details of R-CNN
Implementing R-CNN for object detection on a custom dataset

Downloading the dataset
Preparing the dataset
Fetching region proposals and the ground truth of offset
Creating the training data
R-CNN network architecture
Predict on a new image

Training Fast R-CNN-based custom object detectors

Working details of Fast R-CNN
Implementing Fast R-CNN for object detection on a custom dataset

Summary
Questions

Advanced Object Detection

Components of modern object detection algorithms

Anchor boxes
Region Proposal Network

Classification and regression

Training Faster R-CNN on a custom dataset
Working details of YOLO
Training YOLO on a custom dataset

Installing Darknet
Setting up the dataset format
Configuring the architecture
Training and testing the model

Working details of SSD

Components in SSD code

SSD300
MultiBoxLoss

Training SSD on a custom dataset
Summary
Test your understanding

Image Segmentation

Exploring the U-Net architecture

Performing upscaling

Implementing semantic segmentation using U-Net
Exploring the Mask R-CNN architecture

RoI Align
Mask head

Implementing instance segmentation using Mask R-CNN

Predicting multiple instances of multiple classes

Summary
Questions

Applications of Object Detection and Segmentation

Multi-object instance segmentation

Fetching and preparing data
Training the model for instance segmentation
Making inferences on a new image

Human pose detection
Crowd counting

Coding up crowd counting

Image colorization
3D object detection with point clouds

Theory

Input encoding
Output encoding

Training the YOLO model for 3D object detection

Data format
Data inspection
Training
Testing

Summary

Section 3 - Image Manipulation
Autoencoders and Image Manipulation

Understanding autoencoders

Implementing vanilla autoencoders

Understanding convolutional autoencoders

Grouping similar images using t-SNE

Understanding variational autoencoders

Working of VAE
KL divergence
Building a VAE

Performing an adversarial attack on images
Performing neural style transfer
Generating deep fakes
Summary
Questions

Image Generation Using GANs

Introducing GANs
Using GANs to generate handwritten digits
Using DCGANs to generate face images
Implementing conditional GANs
Summary
Questions

Advanced GANs to Manipulate Images

Leveraging the Pix2Pix GAN
Leveraging CycleGAN
Leveraging StyleGAN on custom images
Super-resolution GAN

Architecture
Coding SRGAN

Summary
Questions

Section 4 - Combining Computer Vision with Other Techniques
Training with Minimal Data Points

Implementing zero-shot learning

Coding zero-shot learning

Implementing few-shot learning

Building a Siamese network

Coding Siamese networks

Working details of prototypical networks
Working details of relation networks

Summary
Questions

Combining Computer Vision and NLP Techniques

Introducing RNNs

The idea behind the need for RNN architecture
Exploring the structure of an RNN
Why store memory?

Introducing LSTM architecture

The working details of LSTM
Implementing LSTM in PyTorch

Implementing image captioning

Image captioning in code

Transcribing handwritten images

The working details of CTC loss
Calculating the CTC loss value
Handwriting transcription in code

Object detection using DETR

The working details of transformers

Basics of transformers

The working details of DETR
Detection with transformers in code

Summary
Questions

Combining Computer Vision and Reinforcement Learning

Learning the basics of reinforcement learning

Calculating the state value
Calculating the state-action value

Implementing Q-learning

Q-value
Understanding the Gym environment
Building a Q-table
Leveraging exploration-exploitation

Implementing deep Q-learning
Implementing deep Q-learning with the fixed targets model

Coding up an agent to play Pong

Implementing an agent to perform autonomous driving

Installing the CARLA environment

Install the CARLA binaries
Installing the CARLA Gym environment

Training a self-driving agent

model.py
actor.py
Training DQN with fixed targets

Summary
Questions

Moving a Model to Production

Understanding the basics of an API
Creating an API and making predictions on a local server

Installing the API module and dependencies
Serving an image classifier

fmnist.py
server.py
Running the server

Moving the API to the cloud

Comparing Docker containers and Docker images
Creating a Docker container

Creating the requirements.txt file
Creating a Dockerfile
Building a Docker image and creating a Docker container

Shipping and running the Docker container in the cloud

Configuring AWS
Creating a Docker repository on AWS ECR and pushing the image
Creating an EC2 instance
Pulling the image and building the Docker container

Summary

Using OpenCV Utilities for Image Analysis

Drawing bounding boxes around words in an image
Detecting lanes in an image of a road
Detecting objects based on color
Building a panoramic view of images
Detecting the number plate of a car
Summary

Appendix

Chapter 1 - Artificial Neural Network Fundamentals
Chapter 2 - PyTorch Fundamentals
Chapter 3 - Building a Deep Neural Network with PyTorch
Chapter 4 - Introducing Convolutional Neural Networks
Chapter 5 - Transfer Learning for Image Classification
Chapter 6 - Practical Aspects of Image Classification
Chapter 7 - Basics of Object Detection
Chapter 8 - Advanced Object Detection
Chapter 9 - Image Segmentation
Chapter 11 - Autoencoders and Image Manipulation
Chapter 12 - Image Generation Using GANs
Chapter 13 - Advanced GANs to Manipulate Images
Chapter 14 - Training with Minimal Data Points
Chapter 15 - Combining Computer Vision and NLP Techniques
Chapter 16 - Combining Computer Vision and Reinforcement Learning

Other Books You May Enjoy

Leave a review - let other readers know what you think
