Imperial Library

Index
Title Page Copyright and Credits
Deep Learning for Computer Vision
Packt Upsell
Why subscribe? PacktPub.com
Foreword Contributors
About the author About the reviewers Packt is searching for authors like you
Preface
Who this book is for What this book covers To get the most out of this book
Download the example code files Conventions used
Get in touch
Reviews
Getting Started
Understanding deep learning
Perceptron Activation functions
Sigmoid The hyperbolic tangent function The Rectified Linear Unit (ReLU)
Artificial neural network (ANN)
One-hot encoding Softmax Cross-entropy Dropout Batch normalization L1 and L2 regularization
Training neural networks
Backpropagation Gradient descent Stochastic gradient descent
Playing with TensorFlow playground Convolutional neural network
Kernel Max pooling
Recurrent neural networks (RNN) Long short-term memory (LSTM)
Deep learning for computer vision
Classification Detection or localization and segmentation Similarity learning Image captioning Generative models Video analysis
Development environment setup
Hardware and Operating Systems - OS
General Purpose - Graphics Processing Unit (GP-GPU)
Compute Unified Device Architecture - CUDA CUDA Deep Neural Network - CUDNN
Installing software packages
Python Open Computer Vision - OpenCV The TensorFlow library
Installing TensorFlow TensorFlow example to print Hello, TensorFlow TensorFlow example for adding two numbers TensorBoard The TensorFlow Serving tool
The Keras library
Summary
Image Classification
Training the MNIST model in TensorFlow
The MNIST datasets Loading the MNIST data Building a perceptron
Defining placeholders for input data and targets Defining the variables for a fully connected layer Training the model with data
Building a multilayer convolutional network
Utilizing TensorBoard in deep learning
Training the MNIST model in Keras
Preparing the dataset Building the model
Other popular image testing datasets 
The CIFAR dataset The Fashion-MNIST dataset The ImageNet dataset and competition
The bigger deep learning models
The AlexNet model The VGG-16 model The Google Inception-V3 model The Microsoft ResNet-50 model The SqueezeNet model Spatial transformer networks The DenseNet model
Training a model for cats versus dogs
Preparing the data Benchmarking with simple CNN Augmenting the dataset
Augmentation techniques 
Transfer learning or fine-tuning of a model
Training on bottleneck features
Fine-tuning several layers in deep learning
Developing real-world applications
Choosing the right model Tackling the underfitting and overfitting scenarios Gender and age detection from face Fine-tuning apparel models Brand safety
Summary
Image Retrieval
Understanding visual features
Visualizing activation of deep learning models Embedding visualization
Guided backpropagation
The DeepDream Adversarial examples
Model inference
Exporting a model Serving the trained model 
Content-based image retrieval
Building the retrieval pipeline
Extracting bottleneck features for an image Computing similarity between query image and target database
Efficient retrieval
Matching faster using approximate nearest neighbour
Advantages of ANNOY
Autoencoders of raw images
Denoising using autoencoders
Summary
Object Detection
Detecting objects in an image Exploring the datasets
ImageNet dataset PASCAL VOC challenge COCO object detection challenge Evaluating datasets using metrics
Intersection over Union The mean average precision
Localizing algorithms 
Localizing objects using sliding windows
The scale-space concept Training a fully connected layer as a convolution layer Convolution implementation of sliding window
Thinking about localization as a regression problem
Applying regression to other problems Combining regression with the sliding window
Detecting objects
Regions of the convolutional neural network (R-CNN) Fast R-CNN Faster R-CNN Single shot multi-box detector
Object detection API
Installation and setup Pre-trained models Re-training object detection models
Data preparation for the Pet dataset Object detection training pipeline Training the model Monitoring loss and accuracy using TensorBoard
Training a pedestrian detector for a self-driving car
The YOLO object detection algorithm Summary
Semantic Segmentation
Predicting pixels
Diagnosing medical images Understanding the earth from satellite imagery Enabling robots to see
Datasets Algorithms for semantic segmentation
The Fully Convolutional Network The SegNet architecture
Upsampling the layers by pooling Sampling the layers by convolution Skipping connections for better training
Dilated convolutions DeepLab RefineNet PSPNet Large kernel matters DeepLab v3
Ultra-nerve segmentation Segmenting satellite images
Modeling FCN for segmentation
Segmenting instances Summary
Similarity Learning
Algorithms for similarity learning
Siamese networks
Contrastive loss
FaceNet
Triplet loss
The DeepNet model DeepRank Visual recommendation systems
Human face analysis
Face detection Face landmarks and attributes
The Multi-Task Facial Landmark (MTFL) dataset The Kaggle keypoint dataset The Multi-Attribute Facial Landmark (MAFL) dataset Learning the facial key points
Face recognition
The Labeled Faces in the Wild (LFW) dataset The YouTube Faces dataset The CelebFaces Attributes dataset (CelebA) CASIA web face database The VGGFace2 dataset Computing the similarity between faces Finding the optimum threshold
Face clustering 
Summary
Image Captioning
Understanding the problem and datasets Understanding natural language processing for image captioning
Expressing words in vector form Converting words to vectors Training an embedding
Approaches for image captioning and related problems
Using a conditional random field for linking image and text Using RNN on CNN features to generate captions Creating captions using image ranking Retrieving captions from images and images from captions Dense captioning Using RNN for captioning Using multimodal metric space Using attention network for captioning Knowing when to look
Implementing attention-based image captioning Summary
Generative Models
Applications of generative models
Artistic style transfer Predicting the next frame in a video Super-resolution of images Interactive image generation Image to image translation Text to image generation Inpainting Blending Transforming attributes Creating training data Creating new animation characters 3D models from photos
Neural artistic style transfer
Content loss Style loss using the Gram matrix Style transfer
Generative Adversarial Networks
Vanilla GAN Conditional GAN Adversarial loss Image translation InfoGAN Drawbacks of GAN
Visual dialogue model
Algorithm for VDM
Generator Discriminator
Summary
Video Classification
Understanding and classifying videos 
Exploring video classification datasets
UCF101 YouTube-8M Other datasets
Splitting videos into frames Approaches for classifying videos
Fusing parallel CNN for video classification Classifying videos over long periods Streaming two CNNs for action recognition Using 3D convolution for temporal learning Using trajectory for classification Multi-modal fusion Attending regions for classification
Extending image-based approaches to videos
Regressing the human pose
Tracking facial landmarks
Segmenting videos Captioning videos Generating videos
Summary
Deployment
Performance of models
Quantizing the models MobileNets
Deployment in the cloud
AWS Google Cloud Platform
Deployment of models in devices
Jetson TX2 Android iPhone
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
