Index
Title Page
Copyright and Credits
Mastering Computer Vision with TensorFlow 2.x
About Packt
Why subscribe?
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Introduction to Computer Vision and Neural Networks
Computer Vision and TensorFlow Fundamentals
Technical requirements
Detecting edges using image hashing and filtering
Using a Bayer filter for color pattern formation
Creating an image vector
Transforming an image
Linear filtering – convolution with kernels
Image smoothing
The mean filter
The median filter
The Gaussian filter
Image filtering with OpenCV
Image gradient
Image sharpening
Mixing the Gaussian and Laplacian operations
Detecting edges in an image
The Sobel edge detector
The Canny edge detector
Extracting features from an image
Image matching using OpenCV
Object detection using Contours and the HOG detector
Contour detection
Detecting a bounding box
The HOG detector
Limitations of the contour detection method
An overview of TensorFlow, its ecosystem, and installation
TensorFlow versus PyTorch
TensorFlow installation
Summary
Content Recognition Using Local Binary Patterns
Processing images using LBP
Generating an LBP pattern
Understanding the LBP histogram
Histogram comparison methods
The computational cost of LBP
Applying LBP to texture recognition
Matching face color with foundation color – LBP and its limitations
Matching face color with foundation color – color matching technique
Summary
Facial Detection Using OpenCV and CNN
Applying Viola-Jones AdaBoost learning and the Haar cascade classifier for face recognition
Selecting Haar-like features
Creating an integral image
Running AdaBoost training
Attentional cascade classifiers
Training the cascade detector
Predicting facial key points using a deep neural network
Preparing the dataset for key-point detection
Processing key-point data
Preprocessing before being input into the Keras–Python code
Preprocessing within the Keras–Python code
Defining the model architecture
Training the model to make key-point predictions
Predicting facial expressions using a CNN
Overview of 3D face detection
Overview of hardware design for 3D reconstruction
Overview of 3D reconstruction and tracking
Overview of parametric tracking
Summary
Deep Learning on Images
Understanding CNNs and their parameters
Convolution
Convolution over volume – 3 x 3 filter
Convolution over volume – 1 x 1 filter
Pooling
Padding
Stride
Activation
Fully connected layers
Regularization
Dropout
Internal covariate shift and batch normalization
Softmax
Optimizing CNN parameters
Baseline case
Iteration 1 – CNN parameter adjustment
Iteration 2 – CNN parameter adjustment
Iteration 3 – CNN parameter adjustment
Iteration 4 – CNN parameter adjustment
Visualizing the layers of a neural network
Building a custom image classifier model and visualizing its layers
Neural network input and parameters
Input image
Defining the train and validation generators
Developing the model
Compiling and training the model
Inputting a test image and converting it into a tensor
Visualizing the first layer of activation
Visualizing multiple layers of activation
Training an existing advanced image classifier model and visualizing its layers
Summary
Section 2: Advanced Concepts of Computer Vision with TensorFlow
Neural Network Architecture and Models
Overview of AlexNet
Overview of VGG16
Overview of Inception
GoogLeNet detection
Overview of ResNet
Overview of R-CNN
Image segmentation 
Clustering-based segmentation
Graph-based segmentation
Selective search
Region proposal
Feature extraction
Classification of the image
Bounding box regression
Overview of Fast R-CNN
Overview of Faster R-CNN
Overview of GANs
Overview of GNNs
Spectral GNN
Overview of Reinforcement Learning
Overview of Transfer Learning
Summary
Visual Search Using Transfer Learning
Coding deep learning models using TensorFlow
Downloading weights
Decoding predictions
Importing other common features
Constructing a model
Inputting images from a directory
Loop function for importing multiple images and processing using TensorFlow Keras
Developing a transfer learning model using TensorFlow
Analyzing and storing data
Importing TensorFlow libraries
Setting up model parameters
Building an input data pipeline
Training data generator
Validation data generator
Constructing the final model using transfer learning
Saving a model with checkpoints
Plotting training history
Understanding the architecture and applications of visual search
The architecture of visual search
Visual search code and explanation
Predicting the class of an uploaded image
Predicting the class of all images
Working with a visual search input pipeline using tf.data
Summary
Object Detection Using YOLO
An overview of YOLO
The concept of IOU
How does YOLO detect objects so fast?
The YOLO v3 neural network architecture
A comparison of YOLO and Faster R-CNN
An introduction to Darknet for object detection
Detecting objects using Darknet
Detecting objects using Tiny Darknet
Real-time prediction using Darknet
YOLO versus YOLO v2 versus YOLO v3
When to train a model?
Training your own image set with YOLO v3 to develop a custom model
Preparing images
Generating annotation files
Converting .xml files to .txt files
Creating a combined train.txt and test.txt file
Creating a list of class name files
Creating a YOLO .data file
Adjusting the YOLO configuration file
Enabling the GPU for training
Start training
An overview of the Feature Pyramid Network and RetinaNet
Summary
Semantic Segmentation and Neural Style Transfer
Overview of TensorFlow DeepLab for semantic segmentation
Spatial Pyramid Pooling
Atrous convolution
Encoder-decoder network
Encoder module
Decoder module
Semantic segmentation in DeepLab – example
Google Colab, Google Cloud TPU, and TensorFlow
Artificial image generation using DCGANs
Generator
Discriminator
Training
Image inpainting using DCGAN
TensorFlow DCGAN – example
Image inpainting using OpenCV
Understanding neural style transfer
Summary
Section 3: Advanced Implementation of Computer Vision with TensorFlow
Action Recognition Using Multitask Deep Learning
Human pose estimation – OpenPose
Theory behind OpenPose
Understanding the OpenPose code
Human pose estimation – stacked hourglass model
Understanding the hourglass model
Coding an hourglass model
argparse block
Training an hourglass network
Creating the hourglass network
Front module
Left half-block
Connect left to right
Right half-block
Head block
Hourglass training
Human pose estimation – PoseNet
Top-down approach
Bottom-up approach
PoseNet implementation
Applying human poses for gesture recognition
Action recognition using various methods
Recognizing actions based on an accelerometer
Combining video-based actions with pose estimation
Action recognition using the 4D method
Summary
Object Detection Using R-CNN, SSD, and R-FCN
An overview of SSD
An overview of R-FCN
An overview of the TensorFlow object detection API
Detecting objects using TensorFlow on Google Cloud
Detecting objects using TensorFlow Hub
Training a custom object detector using TensorFlow and Google Colab
Collecting and formatting images as .jpg files
Annotating images to create a .xml file
Separating the file by train and test folders
Configuring parameters and installing the required packages
Creating TensorFlow records
Preparing the model and configuring the training pipeline
Monitoring training progress using TensorBoard
TensorBoard running on a local machine
TensorBoard running on Google Colab
Training the model
Running an inference test
Caution when using the neural network model
An overview of Mask R-CNN and a Google Colab demonstration
Developing an object tracker model to complement the object detector
Centroid-based tracking
SORT tracking
DeepSORT tracking
The OpenCV tracking method
Siamese network-based tracking
SiamMask-based tracking
Summary
Section 4: TensorFlow Implementation at the Edge and on the Cloud
Deep Learning on Edge Devices with CPU/GPU Optimization
Overview of deep learning on edge devices
Techniques used for GPU/CPU optimization
Overview of MobileNet
Image processing with a Raspberry Pi
Raspberry Pi hardware setup
Raspberry Pi camera software setup
OpenCV installation in Raspberry Pi
OpenVINO installation in Raspberry Pi
Installing the OpenVINO toolkit components
Setting up the environment variable
Adding a USB rule
Running inference using Python code
Advanced inference
Face detection, pedestrian detection, and vehicle detection
Landmark models
Models for action recognition
License plate, gaze, and person detection
Model conversion and inference using OpenVINO
Running inference in a Terminal using ncappzoo
Converting the pre-trained model for inference
Converting from a TensorFlow model developed using Keras
Converting a TensorFlow model developed using the TensorFlow Object Detection API
Summary of the OpenVINO model inference process
Application of TensorFlow Lite
Converting a TensorFlow model into tflite format
Python API
TensorFlow Object Detection API – tflite_convert
TensorFlow Object Detection API – toco
Model optimization
Object detection on Android phones using TensorFlow Lite
Object detection on Raspberry Pi using TensorFlow Lite
Image classification
Object detection
Object detection on iPhone using TensorFlow Lite and Create ML
TensorFlow Lite conversion model for iPhone
Core ML
Converting a TensorFlow model into Core ML format
A summary of various annotation methods
Outsource labeling work to a third party
Automated or semi-automated labeling
Summary
Cloud Computing Platform for Computer Vision
Training an object detector in GCP
Creating a project in GCP
The GCP setup
The Google Cloud Storage bucket setup
Setting up a bucket using the GCP API
Setting up a bucket using Ubuntu Terminal
Setting up the Google Cloud SDK
Linking your terminal to the Google Cloud project and bucket
Installing the TensorFlow object detection API
Preparing the dataset
TFRecord and labeling map data
Data preparation
Data upload
The model.ckpt files
The model config file
Training in the cloud
Viewing the model output in TensorBoard
The model output and conversion into a frozen graph
Executing export tflite graph.py from Google Colab
Training an object detector in the AWS SageMaker cloud platform
Setting up an AWS account, billing, and limits
Converting a .xml file to JSON format
Uploading data to the S3 bucket
Creating a notebook instance and beginning training
Fixing some common failures during training
Training an object detector in the Microsoft Azure cloud platform
Creating an Azure account and setting up Custom Vision
Uploading training images and tagging them
Training at scale and packaging
Application packaging
The general idea behind cloud-based visual search
Analyzing images and search mechanisms in various cloud platforms
Visual search using GCP
Visual search using AWS
Visual search using Azure
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think