Index
Title Page
Copyright and Credits
Hands-On Computer Vision with TensorFlow 2
Dedication
About Packt
Why subscribe?
Contributors
About the authors
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download and run the example code files
Download the code files
Study and run the experiments
Study the Jupyter notebooks online
Run the Jupyter notebooks on your machine
Run the Jupyter notebooks in Google Colab
Download the color images
Conventions used
Get in touch
Reviews
Section 1: TensorFlow 2 and Deep Learning Applied to Computer Vision
Computer Vision and Neural Networks
Technical requirements
Computer vision in the wild
Introducing computer vision
Main tasks and their applications
Content recognition
Object classification
Object identification
Object detection and localization
Object and instance segmentation
Pose estimation
Video analysis
Instance tracking
Action recognition
Motion estimation
Content-aware image edition
Scene reconstruction
A brief history of computer vision
First steps to initial successes
Underestimating the perception task
Hand-crafting local features
Adding some machine learning on top
Rise of deep learning
Early attempts and failures
Rise and fall of the perceptron
Too heavy to scale
Reasons for the comeback
The internet – the new El Dorado of data science
More power than ever
Deep learning or the rebranding of artificial neural networks
What makes learning deep?
Deep learning era
Getting started with neural networks
Building a neural network
Imitating neurons
Biological inspiration
Mathematical model
Implementation
Layering neurons together
Mathematical model
Implementation
Applying our network to classification
Setting up the task
Implementing the network
Training a neural network
Learning strategies
Supervised learning
Unsupervised learning
Reinforcement learning
Teaching time
Evaluating the loss
Backpropagating the loss
Teaching our network to classify
Training considerations – underfitting and overfitting
Summary
Questions
Further reading
TensorFlow Basics and Training a Model
Technical requirements
Getting started with TensorFlow 2 and Keras
Introducing TensorFlow
TensorFlow's main architecture
Introducing Keras
A simple computer vision model using Keras
Preparing the data
Building the model
Training the model
Model performance
TensorFlow 2 and Keras in detail
Core concepts
Introducing tensors
TensorFlow graphs
Comparing lazy execution to eager execution
Creating graphs in TensorFlow 2
Introducing TensorFlow AutoGraph and tf.function
Backpropagating errors using the gradient tape
Keras models and layers
Sequential and functional APIs
Callbacks
Advanced concepts
How tf.function works
Variables in TensorFlow 2
Distribution strategies
Using the Estimator API
Available pre-made Estimators
Training a custom Estimator
The TensorFlow ecosystem
TensorBoard
TensorFlow Addons and TensorFlow Extended
TensorFlow Lite and TensorFlow.js
Where to run your model
On a local machine
On a remote machine
On Google Cloud
Summary
Questions
Modern Neural Networks
Technical requirements
Discovering convolutional neural networks
Neural networks for multidimensional data
Problems with fully connected networks
An explosive number of parameters
A lack of spatial reasoning
Introducing CNNs
CNN operations
Convolutional layers
Concept
Properties
Hyperparameters
TensorFlow/Keras methods
Pooling layers
Concept and hyperparameters
TensorFlow/Keras methods
Fully connected layers
Usage in CNNs
TensorFlow/Keras methods
Effective receptive field
Definitions
Formula
CNNs with TensorFlow
Implementing our first CNN
LeNet-5 architecture
TensorFlow and Keras implementations
Application to MNIST
Refining the training process
Modern network optimizers
Gradient descent challenges
Training velocity and trade-off
Suboptimal local minima
A single hyperparameter for heterogeneous parameters
Advanced optimizers
Momentum algorithms
The Ada family
Regularization methods
Early stopping
L1 and L2 regularization
Principles
TensorFlow and Keras implementations
Dropout
Definition
TensorFlow and Keras methods
Batch normalization
Definition
TensorFlow and Keras methods
Summary
Questions
Further reading
Section 2: State-of-the-Art Solutions for Classic Recognition Problems
Influential Classification Tools
Technical requirements
Understanding advanced CNN architectures
VGG – a standard CNN architecture
Overview of the VGG architecture
Motivation
Architecture
Contributions – standardizing CNN architectures
Replacing large convolutions with multiple smaller ones
Increasing the depth of the feature maps
Augmenting data with scale jittering
Replacing fully connected layers with convolutions
Implementations in TensorFlow and Keras
The TensorFlow model
The Keras model
GoogLeNet and the inception module
Overview of the GoogLeNet architecture
Motivation
Architecture
Contributions – popularizing larger blocks and bottlenecks
Capturing various details with inception modules
Using 1 x 1 convolutions as bottlenecks
Pooling instead of fully connecting
Fighting vanishing gradient with intermediary losses
Implementations in TensorFlow and Keras
Inception module with the Keras Functional API
TensorFlow model and TensorFlow Hub
The Keras model
ResNet – the residual network
Overview of the ResNet architecture
Motivation
Architecture
Contributions – forwarding the information more deeply
Estimating a residual function instead of a mapping
Going ultra-deep
Implementations in TensorFlow and Keras
Residual blocks with the Keras Functional API
The TensorFlow model and TensorFlow Hub
The Keras model
Leveraging transfer learning
Overview
Definition
Human inspiration
Motivation
Transferring CNN knowledge
Use cases
Similar tasks with limited training data
Similar tasks with abundant training data
Dissimilar tasks with abundant training data
Dissimilar tasks with limited training data
Transfer learning with TensorFlow and Keras
Model surgery
Removing layers
Grafting layers
Selective training
Restoring pretrained parameters
Freezing layers
Summary
Questions
Further reading
Object Detection Models
Technical requirements
Introducing object detection
Background
Applications
Brief history
Evaluating the performance of a model
Precision and recall
Precision-recall curve
Average precision and mean average precision
Average precision threshold
A fast object detection algorithm – YOLO
Introducing YOLO
Strengths and limitations of YOLO
YOLO's main concepts
Inferring with YOLO
The YOLO backbone
YOLO's layers output
Introducing anchor boxes
How YOLO refines anchor boxes
Post-processing the boxes
NMS
YOLO inference summarized
Training YOLO
How the YOLO backbone is trained
YOLO loss
Bounding box loss
Object confidence loss
Classification loss
Full YOLO loss
Training techniques
Faster R-CNN – a powerful object detection model
Faster R-CNN's general architecture
Stage 1 – Region proposals
Stage 2 – Classification
Faster R-CNN architecture
RoI pooling
Training Faster R-CNN
Training the RPN
The RPN loss
Fast R-CNN loss
Training regimen
TensorFlow Object Detection API
Using a pretrained model
Training on a custom dataset
Summary
Questions
Further reading
Enhancing and Segmenting Images
Technical requirements
Transforming images with encoders-decoders
Introduction to encoders-decoders
Encoding and decoding
Auto-encoding
Purpose
Basic example – image denoising
Simplistic fully connected AE
Application to image denoising
Convolutional encoders-decoders
Unpooling, transposing, and dilating
Transposed convolution (deconvolution)
Unpooling
Upsampling and resizing
Dilated/atrous convolution
Example architectures – FCN and U-Net
Fully convolutional networks
U-Net
Intermediary example – image super-resolution
FCN implementation
Application to upscaling images
Understanding semantic segmentation
Object segmentation with encoders-decoders
Overview
Decoding as label maps
Training with segmentation losses and metrics
Post-processing with conditional random fields
Advanced example – image segmentation for self-driving cars
Task presentation
Exemplary solution
The more difficult case of instance segmentation
From object segmentation to instance segmentation
Respecting boundaries
Post-processing into instance masks
From object detection to instance segmentation – Mask R-CNN
Applying semantic segmentation to bounding boxes
Building an instance segmentation model with Faster R-CNN
Summary
Questions
Further reading
Section 3: Advanced Concepts and New Frontiers of Computer Vision
Training on Complex and Scarce Datasets
Technical requirements
Efficient data serving
Introducing the TensorFlow Data API
Intuition behind the TensorFlow Data API
Feeding fast and data-hungry models
Inspiration from lazy structures
Structure of TensorFlow data pipelines
Extract, Transform, Load
API interface
Setting up input pipelines
Extracting (from tensors, text files, TFRecord files, and more)
From NumPy and TensorFlow data
From files
From other inputs (generator, SQL database, range, and others)
Transforming the samples (parsing, augmenting, and more)
Parsing images and labels
Parsing TFRecord files
Editing samples
Transforming the datasets (shuffling, zipping, parallelizing, and more)
Structuring datasets
Merging datasets
Loading
Optimizing and monitoring input pipelines
Following best practices for optimization
Parallelizing and prefetching
Fusing operations
Passing options to ensure global properties
Monitoring and reusing datasets
Aggregating performance statistics
Caching and reusing datasets
How to deal with data scarcity
Augmenting datasets
Overview
Why augment datasets?
Considerations
Augmenting images with TensorFlow
TensorFlow Image module
Example – augmenting images for our autonomous driving application
Rendering synthetic datasets
Overview
Rise of 3D databases
Benefits of synthetic data
Generating synthetic images from 3D models
Rendering from 3D models
Post-processing synthetic images
Problem – realism gap
Leveraging domain adaptation and generative models (VAEs and GANs)
Training models to be robust to domain changes
Supervised domain adaptation
Unsupervised domain adaptation
Domain randomization
Generating larger or more realistic datasets with VAEs and GANs
Discriminative versus generative models
VAEs
GANs
Augmenting datasets with conditional GANs
Summary
Questions
Further reading
Video and Recurrent Neural Networks
Technical requirements
Introducing RNNs
Basic formalism
General understanding of RNNs
Learning RNN weights
Backpropagation through time
Truncated backpropagation
Long short-term memory cells
LSTM general principles
LSTM inner workings
Classifying videos
Applying computer vision to video
Classifying videos with an LSTM
Extracting features from videos
Training the LSTM
Defining the model
Loading the data
Training the model
Summary
Questions
Further reading
Optimizing Models and Deploying on Mobile Devices
Technical requirements
Optimizing computational and disk footprints
Measuring inference speed
Measuring latency
Using tracing tools to understand computational performance
Improving model inference speed
Optimizing for hardware
Optimizing on CPUs
Optimizing on GPUs
Optimizing on specialized hardware
Optimizing input
Optimizing post-processing
When the model is still too slow
Interpolating and tracking
Model distillation
Reducing model size
Quantization
Channel pruning and weight sparsification
On-device machine learning
Considerations of on-device machine learning
Benefits of on-device ML
Latency
Privacy
Cost
Limitations of on-device ML
Practical on-device computer vision
On-device computer vision particularities
Generating a SavedModel
Generating a frozen graph
Importance of preprocessing
Example app – recognizing facial expressions
Introducing MobileNet
Deploying models on-device
Running on iOS devices using Core ML
Converting from TensorFlow or Keras
Loading the model
Using the model
Running on Android using TensorFlow Lite
Converting the model from TensorFlow or Keras
Loading the model
Using the model
Running in the browser using TensorFlow.js
Converting the model to the TensorFlow.js format
Using the model
Running on other devices
Summary
Questions
Migrating from TensorFlow 1 to TensorFlow 2
Automatic migration
Migrating TensorFlow 1 code
Sessions
Placeholders
Variable management
Layers and models
Other concepts
References
Chapter 1: Computer Vision and Neural Networks
Chapter 2: TensorFlow Basics and Training a Model
Chapter 3: Modern Neural Networks
Chapter 4: Influential Classification Tools
Chapter 5: Object Detection Models
Chapter 6: Enhancing and Segmenting Images
Chapter 7: Training on Complex and Scarce Datasets
Chapter 8: Video and Recurrent Neural Networks
Chapter 9: Optimizing Models and Deploying on Mobile Devices
Assessments
Answers
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Other Books You May Enjoy
Leave a review - let other readers know what you think