Title Page
Copyright and Credits
Modern Computer Vision with PyTorch
About Packt
Why subscribe?
About the authors
About the reviewer
Packt is searching for authors like you
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Section 1 - Fundamentals of Deep Learning for Computer Vision
Artificial Neural Network Fundamentals
Comparing AI and traditional machine learning
Learning about the artificial neural network building blocks
Implementing feedforward propagation
Calculating the hidden layer unit values
Applying the activation function
Calculating the output layer values
Calculating loss values
Calculating loss during continuous variable prediction
Calculating loss during categorical variable prediction
Feedforward propagation in code
Activation functions in code
Loss functions in code
Implementing backpropagation
Gradient descent in code
Implementing backpropagation using the chain rule
Putting feedforward propagation and backpropagation together
Understanding the impact of the learning rate
Summarizing the training process of a neural network
PyTorch Fundamentals
Installing PyTorch
PyTorch tensors
Initializing a tensor
Operations on tensors
Auto gradients of tensor objects
Advantages of PyTorch's tensors over NumPy's ndarrays
Building a neural network using PyTorch
Dataset, DataLoader, and batch size
Predicting on new data points
Implementing a custom loss function
Fetching the values of intermediate layers
Using a sequential method to build a neural network
Saving and loading a PyTorch model
state dict
Building a Deep Neural Network with PyTorch
Representing an image
Converting images into structured arrays and scalars
Why leverage neural networks for image analysis?
Preparing our data for image classification
Training a neural network
Scaling a dataset to improve model accuracy
Understanding the impact of varying the batch size
Batch size of 32
Batch size of 10,000
Understanding the impact of varying the loss optimizer
Understanding the impact of varying the learning rate
Impact of the learning rate on a scaled dataset
High learning rate
Medium learning rate
Low learning rate
Parameter distribution across layers for different learning rates
Impact of varying the learning rate on a non-scaled dataset
Understanding the impact of learning rate annealing
Building a deeper neural network
Understanding the impact of batch normalization
Very small input values without batch normalization
Very small input values with batch normalization
The concept of overfitting
Impact of adding dropout
Impact of regularization
L1 regularization
L2 regularization
Section 2 - Object Classification and Detection
Introducing Convolutional Neural Networks
The problem with traditional deep neural networks
Building blocks of a CNN
Strides and padding
Putting them all together
How convolution and pooling help in image translation
Implementing a CNN
Building a CNN-based architecture using PyTorch
Forward propagating the output in Python
Classifying images using deep CNNs
Implementing data augmentation
Image augmentations
Affine transformations
Changing the brightness
Adding noise
Performing a sequence of augmentations
Performing data augmentation on a batch of images and the need for collate_fn
Data augmentation for image translation
Visualizing the outcome of feature learning
Building a CNN for classifying real-world images
Impact on the number of images used for training
Transfer Learning for Image Classification
Introducing transfer learning
Understanding VGG16 architecture
Understanding ResNet architecture
Implementing facial key point detection
2D and 3D facial key point detection
Multi-task learning – Implementing age estimation and gender classification
Introducing the torch_snippets library
Practical Aspects of Image Classification
Generating CAMs
Understanding the impact of data augmentation and batch normalization
Coding up road sign detection
Practical aspects to take care of during model implementation
Dealing with imbalanced data
The size of the object within an image
Dealing with the difference between training and validation data
The number of nodes in the flatten layer
Image size
Leveraging OpenCV utilities
Basics of Object Detection
Introducing object detection
Creating a bounding box ground truth for training
Installing the image annotation tool
Understanding region proposals
Leveraging SelectiveSearch to generate region proposals
Implementing SelectiveSearch to generate region proposals
Understanding IoU
Non-max suppression
Mean average precision
Training R-CNN-based custom object detectors
Working details of R-CNN
Implementing R-CNN for object detection on a custom dataset
Downloading the dataset
Preparing the dataset
Fetching region proposals and the ground truth of offset
Creating the training data
R-CNN network architecture
Predict on a new image
Training Fast R-CNN-based custom object detectors
Working details of Fast R-CNN
Implementing Fast R-CNN for object detection on a custom dataset
Advanced Object Detection
Components of modern object detection algorithms
Anchor boxes
Region Proposal Network
Classification and regression
Training Faster R-CNN on a custom dataset
Working details of YOLO
Training YOLO on a custom dataset
Installing Darknet
Setting up the dataset format
Configuring the architecture
Training and testing the model
Working details of SSD
Components in SSD code
Training SSD on a custom dataset
Test your understanding
Image Segmentation
Exploring the U-Net architecture
Performing upscaling
Implementing semantic segmentation using U-Net
Exploring the Mask R-CNN architecture
RoI Align
Mask head
Implementing instance segmentation using Mask R-CNN
Predicting multiple instances of multiple classes
Applications of Object Detection and Segmentation
Multi-object instance segmentation
Fetching and preparing data
Training the model for instance segmentation
Making inferences on a new image
Human pose detection
Crowd counting
Coding up crowd counting
Image colorization
3D object detection with point clouds
Input encoding
Output encoding
Training the YOLO model for 3D object detection
Data format
Data inspection
Section 3 - Image Manipulation
Autoencoders and Image Manipulation
Understanding autoencoders
Implementing vanilla autoencoders
Understanding convolutional autoencoders
Grouping similar images using t-SNE
Understanding variational autoencoders
Working of VAE
KL divergence
Building a VAE
Performing an adversarial attack on images
Performing neural style transfer
Generating deep fakes
Image Generation Using GANs
Introducing GANs
Using GANs to generate handwritten digits
Using DCGANs to generate face images
Implementing conditional GANs
Advanced GANs to Manipulate Images
Leveraging the Pix2Pix GAN
Leveraging CycleGAN
Leveraging StyleGAN on custom images
Super-resolution GAN
Coding SRGAN
Section 4 - Combining Computer Vision with Other Techniques
Training with Minimal Data Points
Implementing zero-shot learning
Coding zero-shot learning
Implementing few-shot learning
Building a Siamese network
Coding Siamese networks
Working details of prototypical networks
Working details of relation networks
Combining Computer Vision and NLP Techniques
Introducing RNNs
The idea behind the need for RNN architecture
Exploring the structure of an RNN
Why store memory?
Introducing LSTM architecture
The working details of LSTM
Implementing LSTM in PyTorch
Implementing image captioning
Image captioning in code
Transcribing handwritten images
The working details of CTC loss
Calculating the CTC loss value
Handwriting transcription in code
Object detection using DETR
The working details of transformers
Basics of transformers
The working details of DETR
Detection with transformers in code
Combining Computer Vision and Reinforcement Learning
Learning the basics of reinforcement learning
Calculating the state value
Calculating the state-action value
Implementing Q-learning
Understanding the Gym environment
Building a Q-table
Leveraging exploration-exploitation
Implementing deep Q-learning
Implementing deep Q-learning with the fixed targets model
Coding up an agent to play Pong
Implementing an agent to perform autonomous driving
Installing the CARLA environment
Install the CARLA binaries
Installing the CARLA Gym environment
Training a self-driving agent
Training DQN with fixed targets
Moving a Model to Production
Understanding the basics of an API
Creating an API and making predictions on a local server
Installing the API module and dependencies
Serving an image classifier
Running the server
Moving the API to the cloud
Comparing Docker containers and Docker images
Creating a Docker container
Creating the requirements.txt file
Creating a Dockerfile
Building a Docker image and creating a Docker container
Shipping and running the Docker container in the cloud
Configuring AWS
Creating a Docker repository on AWS ECR and pushing the image
Creating an EC2 instance
Pulling the image and building the Docker container
Using OpenCV Utilities for Image Analysis
Drawing bounding boxes around words in an image
Detecting lanes in an image of a road
Detecting objects based on color
Building a panoramic view of images
Detecting the number plate of a car
Chapter 1 - Artificial Neural Network Fundamentals
Chapter 2 - PyTorch Fundamentals
Chapter 3 - Building a Deep Neural Network with PyTorch
Chapter 4 - Introducing Convolutional Neural Networks
Chapter 5 - Transfer Learning for Image Classification
Chapter 6 - Practical Aspects of Image Classification
Chapter 7 - Basics of Object Detection
Chapter 8 - Advanced Object Detection
Chapter 9 - Image Segmentation
Chapter 11 - Autoencoders and Image Manipulation
Chapter 12 - Image Generation Using GANs
Chapter 13 - Advanced GANs to Manipulate Images
Chapter 14 - Training with Minimal Data Points
Chapter 15 - Combining Computer Vision and NLP Techniques
Chapter 16 - Combining Computer Vision and Reinforcement Learning
Other Books You May Enjoy
Leave a review - let other readers know what you think