Hands-On Computer Vision with TensorFlow by Planche, Benjamin

Index

Title Page
Copyright and Credits

Hands-On Computer Vision with TensorFlow 2

Dedication
About Packt

Why subscribe?

Contributors

About the authors
About the reviewers
Packt is searching for authors like you

Preface

Who this book is for
What this book covers
To get the most out of this book

Download and run the example code files

Download the code files
Study and run the experiments

Study the Jupyter notebooks online
Run the Jupyter notebooks on your machine
Run the Jupyter notebooks in Google Colab

Download the color images
Conventions used

Get in touch

Reviews

Section 1: TensorFlow 2 and Deep Learning Applied to Computer Vision
Computer Vision and Neural Networks

Technical requirements
Computer vision in the wild

Introducing computer vision
Main tasks and their applications

Content recognition

Object classification
Object identification
Object detection and localization
Object and instance segmentation
Pose estimation

Video analysis

Instance tracking
Action recognition
Motion estimation

Content-aware image edition
Scene reconstruction

A brief history of computer vision

First steps to initial successes

Underestimating the perception task
Hand-crafting local features
Adding some machine learning on top

Rise of deep learning

Early attempts and failures

Rise and fall of the perceptron
Too heavy to scale

Reasons for the comeback

The internet – the new El Dorado of data science
More power than ever

Deep learning or the rebranding of artificial neural networks

What makes learning deep?
Deep learning era

Getting started with neural networks

Building a neural network

Imitating neurons

Biological inspiration
Mathematical model
Implementation

Layering neurons together

Mathematical model
Implementation

Applying our network to classification

Setting up the task
Implementing the network

Training a neural network

Learning strategies

Supervised learning
Unsupervised learning
Reinforcement learning

Teaching time

Evaluating the loss
Backpropagating the loss
Teaching our network to classify
Training considerations – underfitting and overfitting

Summary
Questions
Further reading

TensorFlow Basics and Training a Model

Technical requirements
Getting started with TensorFlow 2 and Keras

Introducing TensorFlow

TensorFlow's main architecture
Introducing Keras

A simple computer vision model using Keras

Preparing the data
Building the model
Training the model
Model performance

TensorFlow 2 and Keras in detail

Core concepts

Introducing tensors
TensorFlow graphs

Comparing lazy execution to eager execution
Creating graphs in TensorFlow 2
Introducing TensorFlow AutoGraph and tf.function

Backpropagating errors using the gradient tape
Keras models and layers

Sequential and functional APIs
Callbacks

Advanced concepts

How tf.function works
Variables in TensorFlow 2
Distribution strategies
Using the Estimator API

Available pre-made Estimators
Training a custom Estimator

The TensorFlow ecosystem

TensorBoard
TensorFlow Addons and TensorFlow Extended
TensorFlow Lite and TensorFlow.js
Where to run your model

On a local machine
On a remote machine
On Google Cloud

Summary
Questions

Modern Neural Networks

Technical requirements
Discovering convolutional neural networks

Neural networks for multidimensional data

Problems with fully connected networks

An explosive number of parameters
A lack of spatial reasoning

Introducing CNNs

CNN operations

Convolutional layers

Concept
Properties
Hyperparameters
TensorFlow/Keras methods

Pooling layers

Concept and hyperparameters
TensorFlow/Keras methods

Fully connected layers

Usage in CNNs
TensorFlow/Keras methods

Effective receptive field

Definitions
Formula

CNNs with TensorFlow

Implementing our first CNN

LeNet-5 architecture
TensorFlow and Keras implementations
Application to MNIST

Refining the training process

Modern network optimizers

Gradient descent challenges

Training velocity and trade-off
Suboptimal local minima
A single hyperparameter for heterogeneous parameters

Advanced optimizers

Momentum algorithms
The Ada family

Regularization methods

Early stopping
L1 and L2 regularization

Principles
TensorFlow and Keras implementations

Dropout

Definition
TensorFlow and Keras methods

Batch normalization

Definition
TensorFlow and Keras methods

Summary
Questions
Further reading

Section 2: State-of-the-Art Solutions for Classic Recognition Problems
Influential Classification Tools

Technical requirements
Understanding advanced CNN architectures

VGG – a standard CNN architecture

Overview of the VGG architecture

Motivation
Architecture

Contributions – standardizing CNN architectures

Replacing large convolutions with multiple smaller ones
Increasing the depth of the feature maps
Augmenting data with scale jittering
Replacing fully connected layers with convolutions

Implementations in TensorFlow and Keras

The TensorFlow model
The Keras model

GoogLeNet and the inception module

Overview of the GoogLeNet architecture

Motivation
Architecture

Contributions – popularizing larger blocks and bottlenecks

Capturing various details with inception modules
Using 1 x 1 convolutions as bottlenecks
Pooling instead of fully connecting
Fighting vanishing gradient with intermediary losses

Implementations in TensorFlow and Keras

Inception module with the Keras Functional API
TensorFlow model and TensorFlow Hub
The Keras model

ResNet – the residual network

Overview of the ResNet architecture

Motivation
Architecture

Contributions – forwarding the information more deeply

Estimating a residual function instead of a mapping
Going ultra-deep

Implementations in TensorFlow and Keras

Residual blocks with the Keras Functional API
The TensorFlow model and TensorFlow Hub
The Keras model

Leveraging transfer learning

Overview

Definition

Human inspiration
Motivation
Transferring CNN knowledge

Use cases

Similar tasks with limited training data
Similar tasks with abundant training data
Dissimilar tasks with abundant training data
Dissimilar tasks with limited training data

Transfer learning with TensorFlow and Keras

Model surgery

Removing layers
Grafting layers

Selective training

Restoring pretrained parameters
Freezing layers

Summary
Questions
Further reading

Object Detection Models

Technical requirements
Introducing object detection

Background

Applications
Brief history

Evaluating the performance of a model

Precision and recall
Precision-recall curve
Average precision and mean average precision
Average precision threshold

A fast object detection algorithm – YOLO

Introducing YOLO

Strengths and limitations of YOLO
YOLO's main concepts

Inferring with YOLO

The YOLO backbone
YOLO's layers output
Introducing anchor boxes
How YOLO refines anchor boxes
Post-processing the boxes
NMS
YOLO inference summarized

Training YOLO

How the YOLO backbone is trained
YOLO loss

Bounding box loss
Object confidence loss
Classification loss
Full YOLO loss

Training techniques

Faster R-CNN – a powerful object detection model

Faster R-CNN's general architecture

Stage 1 – Region proposals
Stage 2 – Classification

Faster R-CNN architecture
RoI pooling

Training Faster R-CNN

Training the RPN
The RPN loss
Fast R-CNN loss
Training regimen

TensorFlow Object Detection API

Using a pretrained model
Training on a custom dataset

Summary
Questions
Further reading

Enhancing and Segmenting Images

Technical requirements
Transforming images with encoders-decoders

Introduction to encoders-decoders

Encoding and decoding
Auto-encoding
Purpose

Basic example – image denoising

Simplistic fully connected AE
Application to image denoising

Convolutional encoders-decoders

Unpooling, transposing, and dilating

Transposed convolution (deconvolution)
Unpooling
Upsampling and resizing
Dilated/atrous convolution

Example architectures – FCN and U-Net

Fully convolutional networks
U-Net

Intermediary example – image super-resolution

FCN implementation
Application to upscaling images

Understanding semantic segmentation

Object segmentation with encoders-decoders

Overview

Decoding as label maps
Training with segmentation losses and metrics
Post-processing with conditional random fields

Advanced example – image segmentation for self-driving cars

Task presentation
Exemplary solution

The more difficult case of instance segmentation

From object segmentation to instance segmentation

Respecting boundaries
Post-processing into instance masks

From object detection to instance segmentation – Mask R-CNN

Applying semantic segmentation to bounding boxes
Building an instance segmentation model with Faster-RCNN

Summary
Questions
Further reading

Section 3: Advanced Concepts and New Frontiers of Computer Vision
Training on Complex and Scarce Datasets

Technical requirements
Efficient data serving

Introducing the TensorFlow Data API

Intuition behind the TensorFlow Data API

Feeding fast and data-hungry models
Inspiration from lazy structures

Structure of TensorFlow data pipelines

Extract, Transform, Load
API interface

Setting up input pipelines

Extracting (from tensors, text files, TFRecord files, and more)

From NumPy and TensorFlow data
From files
From other inputs (generator, SQL database, range, and others)

Transforming the samples (parsing, augmenting, and more)

Parsing images and labels
Parsing TFRecord files
Editing samples

Transforming the datasets (shuffling, zipping, parallelizing, and more)

Structuring datasets
Merging datasets

Optimizing and monitoring input pipelines

Following best practices for optimization

Parallelizing and prefetching
Fusing operations
Passing options to ensure global properties

Monitoring and reusing datasets

Aggregating performance statistics
Caching and reusing datasets

How to deal with data scarcity

Augmenting datasets

Overview

Why augment datasets?
Considerations

Augmenting images with TensorFlow

TensorFlow Image module
Example – augmenting images for our autonomous driving application

Rendering synthetic datasets

Overview

Rise of 3D databases
Benefits of synthetic data

Generating synthetic images from 3D models

Rendering from 3D models
Post-processing synthetic images

Problem – realism gap

Leveraging domain adaptation and generative models (VAEs and GANs)

Training models to be robust to domain changes

Supervised domain adaptation
Unsupervised domain adaptation
Domain randomization

Generating larger or more realistic datasets with VAEs and GANs

Discriminative versus generative models
VAEs
GANs
Augmenting datasets with conditional GANs

Summary
Questions
Further reading

Video and Recurrent Neural Networks

Technical requirements
Introducing RNNs

Basic formalism
General understanding of RNNs
Learning RNN weights

Backpropagation through time
Truncated backpropagation

Long short-term memory cells

LSTM general principles
LSTM inner workings

Classifying videos

Applying computer vision to video
Classifying videos with an LSTM

Extracting features from videos
Training the LSTM

Defining the model
Loading the data
Training the model

Summary
Questions
Further reading

Optimizing Models and Deploying on Mobile Devices

Technical requirements
Optimizing computational and disk footprints

Measuring inference speed

Measuring latency
Using tracing tools to understand computational performance

Improving model inference speed

Optimizing for hardware

Optimizing on CPUs
Optimizing on GPUs
Optimizing on specialized hardware

Optimizing input
Optimizing post-processing

When the model is still too slow

Interpolating and tracking
Model distillation

Reducing model size

Quantization
Channel pruning and weight sparsification

On-device machine learning

Considerations of on-device machine learning

Benefits of on-device ML

Latency
Privacy
Cost

Limitations of on-device ML

Practical on-device computer vision

On-device computer vision particularities
Generating a SavedModel
Generating a frozen graph
Importance of preprocessing

Example app – recognizing facial expressions

Introducing MobileNet
Deploying models on-device

Running on iOS devices using Core ML

Converting from TensorFlow or Keras
Loading the model
Using the model

Running on Android using TensorFlow Lite

Converting the model from TensorFlow or Keras
Loading the model
Using the model

Running in the browser using TensorFlow.js

Converting the model to the TensorFlow.js format
Using the model

Running on other devices

Summary
Questions

Migrating from TensorFlow 1 to TensorFlow 2

Automatic migration
Migrating TensorFlow 1 code

Sessions
Placeholders
Variable management
Layers and models
Other concepts

References

Chapter 1: Computer Vision and Neural Networks
Chapter 2: TensorFlow Basics and Training a Model
Chapter 3: Modern Neural Networks
Chapter 4: Influential Classification Tools
Chapter 5: Object Detection Models
Chapter 6: Enhancing and Segmenting Images
Chapter 7: Training on Complex and Scarce Datasets
Chapter 8: Video and Recurrent Neural Networks
Chapter 9: Optimizing Models and Deploying on Mobile Devices

Assessments

Answers

Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9

Other Books You May Enjoy

Leave a review - let other readers know what you think
