Preface
What’s in This Book?
Who Is “The Practitioner”?
Who Should Read This Book?
The Enterprise Machine Learning Practitioner
The practicing data scientist
The Java engineer
The Enterprise Executive
The Academic
Conventions Used in This Book
Using Code Examples
Administrative Notes
O’Reilly Safari
How to Contact Us
Acknowledgments
Josh
Adam
1. A Review of Machine Learning
The Learning Machines
How Can Machines Learn?
Biological Inspiration
What Is Deep Learning?
Going Down the Rabbit Hole
Framing the Questions
The Math Behind Machine Learning: Linear Algebra
Scalars
Vectors
Matrices
Tensors
Hyperplanes
Relevant Mathematical Operations
Dot product
Element-wise product
Outer product
Converting Data Into Vectors
Solving Systems of Equations
Methods for solving systems of linear equations
Iterative methods
Iterative methods and linear algebra
The Math Behind Machine Learning: Statistics
Probability
Conditional Probabilities
Posterior Probability
Distributions
Samples Versus Population
Resampling Methods
Selection Bias
Likelihood
How Does Machine Learning Work?
Regression
Setting up the model
Visualizing linear regression
Relating the linear regression model
Classification
Clustering
Underfitting and Overfitting
Optimization
Convex Optimization
Gradient Descent
Stochastic Gradient Descent
Mini-batch training and SGD
Quasi-Newton Optimization Methods
Generative Versus Discriminative Models
Logistic Regression
The Logistic Function
Understanding Logistic Regression Output
Evaluating Models
The Confusion Matrix
Sensitivity versus specificity
Accuracy
Precision
Recall
F1
Context and interpreting scores
Building an Understanding of Machine Learning
2. Foundations of Neural Networks and Deep Learning
Neural Networks
The Biological Neuron
Synapses
Dendrites
Axons
Information flow across the biological neuron
From biological to artificial
The Perceptron
History of the perceptron
Definition of the perceptron
The perceptron learning algorithm
Limitations of the early perceptron
Multilayer Feed-Forward Networks
Evolution of the artificial neuron
Artificial neuron input
Connection weights
Biases
Activation functions
Comparing the biological neuron and the artificial neuron
Feed-forward neural network architecture
Input layer
Hidden layer
Output layer
Connections between layers
Training Neural Networks
Backpropagation Learning
Algorithm intuition
A closer look at backpropagation
Understanding backpropagation pseudocode
Updating the output layer weights
Further expressing the error term
The new propagation rule for the error value
Updating the hidden layers
Activation Functions
Linear
Sigmoid
Tanh
Hard Tanh
Softmax
Rectified Linear
Leaky ReLU
Softplus
Loss Functions
Loss Function Notation
Loss Functions for Regression
Mean squared error loss
Other loss functions for regression
Mean absolute error loss
Mean squared log error loss
Mean absolute percentage error loss
Regression loss function discussion
Loss Functions for Classification
Hinge loss
Logistic loss
Negative log likelihood
Loss Functions for Reconstruction
Hyperparameters
Learning Rate
Regularization
Momentum
Sparsity
3. Fundamentals of Deep Networks
Defining Deep Learning
What Is Deep Learning?
Defining deep networks
Evolutionary progress and resurgence
Advances in network architecture
Advances in layer types
Advances in neuron types
Hybrid architectures
From feature engineering to automated feature learning
Feature engineering
Feature learning
Generative modeling
Inceptionism
Modeling artistic style
GANs
Recurrent Neural Networks
The Tao of deep learning
Organization of This Chapter
Common Architectural Principles of Deep Networks
Parameters
Layers
Activation Functions
Activation functions for general architecture
Hidden layer activation functions
Output layer for regression
Output layer for binary classification
Output layer for multiclass classification
Loss Functions
Reconstruction cross-entropy
Optimization Algorithms
First-order methods
Second-order methods
L-BFGS
Conjugate gradient
Hessian-free
Hyperparameters
Layer size
Magnitude hyperparameters
Learning rate
Nesterov’s momentum
AdaGrad
RMSProp
AdaDelta
ADAM
Regularization
Dropout
DropConnect
L1
L2
Mini-batching
Summary
Building Blocks of Deep Networks
RBMs
Network layout
Visible and hidden layers
Connections and weights
Biases
Training
Reconstruction
Other uses of RBMs
Autoencoders
Similarities to multilayer perceptrons
Defining features of autoencoders
Unsupervised learning of unlabeled data
Learning to reproduce the input data
Training autoencoders
Common variants of autoencoders
Compression autoencoders
Denoising autoencoders
Applications of autoencoders
Variational Autoencoders
4. Major Architectures of Deep Networks
Unsupervised Pretrained Networks
Deep Belief Networks
Feature Extraction with RBM Layers
Learning higher-order features automatically
Initializing the feed-forward network
Fine-tuning a DBN with a feed-forward multilayer neural network
Gentle backpropagation
The output layer
Current state of DBNs
Generative Adversarial Networks
Training generative models, unsupervised learning, and GANs
The discriminator network
The generative network
Building generative models and Deep Convolutional Generative Adversarial Networks
Conditional GANs
Comparing GANs and variational autoencoders
Convolutional Neural Networks (CNNs)
Biological Inspiration
Intuition
CNN Architecture Overview
Neuron spatial arrangements
Evolution of the connections between layers
Input Layers
Convolutional Layers
Convolution
Filters
Activation maps
Parameter sharing
Learned filters and renders
ReLU activation functions as layers
Convolutional layer hyperparameters
Filter size
Output depth
Stride
Zero-padding
Batch normalization and layers
Pooling Layers
Fully Connected Layers
Other Applications of CNNs
CNNs of Note
Summary
Recurrent Neural Networks
Modeling the Time Dimension
Lost in time
Temporal feedback and loops in connections
Sequences and time-series data
Understanding model input and output
3D Volumetric Input
Uneven time-series and masking
Why Not Markov Models?
General Recurrent Neural Network Architecture
Recurrent Neural Network architecture and time-steps
LSTM Networks
Properties of LSTM networks
LSTM network architecture
LSTM units
LSTM layers
Training
BPTT and truncated BPTT
Domain-Specific Applications and Blended Networks
Recursive Neural Networks
Network Architecture
Varieties of Recursive Neural Networks
Applications of Recursive Neural Networks
Summary and Discussion
Will Deep Learning Make Other Algorithms Obsolete?
Different Problems Have Different Best Methods
When Do I Need Deep Learning?
When to use deep learning
When to stick with traditional machine learning
5. Building Deep Networks
Matching Deep Networks to the Right Problem
Columnar Data and Multilayer Perceptrons
Images and Convolutional Neural Networks
Time-series Sequences and Recurrent Neural Networks
Using Hybrid Networks
The DL4J Suite of Tools
Vectorization and DataVec
Runtimes and ND4J
ND4J and the need for speed
JavaCPP
CPU backends
GPU backends
Benchmarking ND4J and DL4J
Basic Concepts of the DL4J API
Loading and Saving Models
Writing a trained model to disk
Writing to HDFS
Reading a saved model from disk
Reading from HDFS
Getting Input for the Model
Loading data during training
Setting Up Model Architecture
Building layer-oriented architectures
Hyperparameters
Training and Evaluation
Making a prediction
Training, validation, and test data
Modeling CSV Data with Multilayer Perceptron Networks
Setting Up Input Data
Determining Network Architecture
General hyperparameters
First hidden layer
Output layer for classification
Training the Model
Evaluating the Model
Modeling Handwritten Images Using CNNs
Java Code Listing for the LeNet CNN
Loading and Vectorizing the Input Images
Network Architecture for LeNet in DL4J
General hyperparameters
Convolution layers
Max-pooling layers
Output layer
Training the CNN
Modeling Sequence Data by Using Recurrent Neural Networks
Generating Shakespeare via LSTMs
High-level modeling workflow
Java code for modeling Shakespeare
Setting up input data and vectorization
LSTM network architecture
General comments on hyperparameters
Training the LSTM network
Generating Shakespeare samples
Classifying Sensor Time-series Sequences Using LSTMs
Java code listing for recurrent classification example
Setting up input data and vectorization
Network architecture and training
Using Autoencoders for Anomaly Detection
Java Code Listing for Autoencoder Example
Setting Up Input Data
Autoencoder Network Architecture and Training
Evaluating the Model
Using Variational Autoencoders to Reconstruct MNIST Digits
Code Listing to Reconstruct MNIST Digits
Examining the VAE Model
Understanding the scatterplot
Understanding the generated images
Applications of Deep Learning in Natural Language Processing
Learning Word Embedding Using Word2Vec
The Word2Vec model and algorithm
Modeling context
Learning similar meaning and semantic relationships
Vector arithmetic and word embedding
Java code listing for Word2Vec example
Understanding the Word2Vec example
Other practical uses of Word2Vec
Distributed Representations of Sentences with Paragraph Vectors
Building paragraph vectors
Understanding the paragraph vectors example
Using Paragraph Vectors for Document Classification
Understanding the paragraph vectors classification example
Further exploration of the Word2Vec approach
Extensions into specific domains: Gov2Vec
Graphs and Node2Vec
Recommendation engines and Item2Vec
Computer vision and FaceNet
6. Tuning Deep Networks
Basic Concepts in Tuning Deep Networks
An Intuition for Building Deep Networks
Building the Intuition as a Step-by-Step Process
Matching Input Data and Network Architectures
Summary
Relating Model Goal and Output Layers
Regression Model Output Layer
Classification Model Output Layer
Single-label classification models
Models with more than two labels
Multiclass classification models
Multilabel classification models
Working with Layer Count, Parameter Count, and Memory
Feed-Forward Multilayer Neural Networks
Determining hidden-layer count
Determining neuron count per layer
Controlling Layer and Parameter Counts
Getting the parameter count for a network
Estimating Network Memory Requirements
Weight Initialization Strategies
Using Activation Functions
Summary Table for Activation Functions
Applying Loss Functions
Understanding Learning Rates
Using the Ratio of Updates-to-Parameters
Specific Recommendations for Learning Rates
How Sparsity Affects Learning
Applying Methods of Optimization
SGD Best Practices
Using Parallelization and GPUs for Faster Training
Online Learning and Parallel Iterative Algorithms
Task parallelism
Data parallelism
Parallelizing SGD in DL4J
Parallel SGD execution
GPUs
Controlling Epochs and Mini-Batch Size
Understanding Mini-Batch Size Trade-Offs
How to Use Regularization
Priors as Regularizers
Max-Norm Regularization
Dropout
Issues with dropout
Other Regularization Topics
Working with Class Imbalance
Methods for Sampling Classes
Weighted Loss Functions
Dealing with Overfitting
Using Network Statistics from the Tuning UI
Detecting Poor Weight Initialization
Detecting Nonshuffled Data
Detecting Issues with Regularization
7. Tuning Specific Deep Network Architectures
Convolutional Neural Networks (CNNs)
Common Convolutional Architectural Patterns
Configuring Convolutional Layers
Setting the stride for filters
Using padding
Choosing the number of filters
Configuring filter size
Convolution mode and calculating spatial size of output volume
Configuring Pooling Layers
Transfer Learning
An alternative to training from scratch
When to consider trying transfer learning
Recurrent Neural Networks
Network Input Data and Input Layers
Output Layers and RnnOutputLayer
Training the Network
Initializing weights
Backpropagation through time
Regularization
Debugging Common Issues with LSTMs
Padding and Masking
Applying padding and masking to volumetric input
Evaluation and Scoring With Masking
Classification using the evaluation class
Scoring new data with MultiLayerNetwork
Variants of Recurrent Network Architectures
Restricted Boltzmann Machines
Hidden Units and Modeling Available Information
Using Different Units
Using Regularization with RBMs
DBNs
Using Momentum
Using Regularization
Dropout
Determining Hidden Unit Count
8. Vectorization
Introduction to Vectorization in Machine Learning
Why Do We Need to Vectorize Data?
Strategies for Dealing with Columnar Raw Data Attributes
Nominal
Ordinal
Interval
Ratio
Feature Engineering and Normalization Techniques
Feature copying
Normalization
Standardization and zero mean, unit variance
Min-max scaling
Whitening and principal component analysis
Applying normalization in Recurrent Neural Networks and CNNs
Normalization for regression models
Binarization
Using DataVec for ETL and Vectorization
Vectorizing Image Data
Image Data Representation in DL4J
Image Data and Vector Normalization with DataVec
Working with Sequential Data in Vectorization
Major Variations of Sequential Data Sources
Vectorizing Sequential Data with DataVec
Converting time-series to a single vector
Converting sequential data to a DataSet object in local mode
Building custom DataSets from sequential data
Working with Text in Vectorization
Bag of Words
TF-IDF
TF
IDF
Computing the full TF-IDF score
Comparing Word2Vec and VSM
Working with Graphs
9. Using Deep Learning and DL4J on Spark
Introduction to Using DL4J with Spark and Hadoop
Operating Spark from the Command Line
spark-submit
Working with Hadoop security and Kerberos
Uploading the Spark assembly
Initializing Kerberos
Configuring and Tuning Spark Execution
Running Spark on Mesos
Running Spark on YARN
Comparing Spark execution modes
General Spark Tuning Guide
Setting the number of executors
Spark executors and CPU cores
Spark executors and memory
Spark and YARN container resource allocation
Understanding executor memory requests in YARN
Understanding Spark, the JVM, and garbage collection
Dealing with slowing garbage collection efficiency or pauses
Selecting a garbage collector for the JVM and Spark
Tuning DL4J Jobs on Spark
Tuning the number of executors
Tuning the amount of memory for executors
Setting Up a Maven Project Object Model for Spark and DL4J
A pom.xml File Dependency Template
Setting Up a POM File for CDH 5.X
Setting Up a POM File for HDP 2.4
Troubleshooting Spark and Hadoop
Common Issues with ND4J
ND4J and Kryo serialization
jnind4j and java.library.path
DL4J Parallel Execution on Spark
A Minimal Spark Training Example
DL4J API Best Practices for Spark
Multilayer Perceptron Spark Example
Setting Up MLP Network Architecture for Spark
Distributed Training and Model Evaluation
Building and Executing a DL4J Spark Job
Generating Shakespeare Text with Spark and Long Short-Term Memory
Setting Up the LSTM Network Architecture
Training, Tracking Progress, and Understanding Results
Modeling MNIST with a Convolutional Neural Network on Spark
Configuring the Spark Job and Loading MNIST Data
Setting Up the LeNet CNN Architecture and Training
A. What Is Artificial Intelligence?
The Story So Far
Defining Deep Learning
Defining Artificial Intelligence
The study of intelligence
Cognitive dissonance and modern definitions
What AI is not
Moving the goal posts
Segmenting the definitions of AI
Critical commentary on segments
A fifth aspirational definition of AI
The AI winters
AI Winter I: 1974–1980
AI Winter II: Late 1980s
The common patterns of AI Winters
What Is Driving Interest Today in AI?
Winter Is Coming
B. RL4J and Reinforcement Learning
Preliminaries
Markov Decision Process
Terminology
Different Settings
Model-Free
Observation Setting
Single-Player and Adversarial Games
Q-Learning
From Policy to Neural Networks
Policy Iteration
Exploration Versus Exploitation
Bellman Equation
Initial State Sampling
Q-Learning Implementation
Modeling Q(s,a)
Experience Replay
Compression
Convolutional Layers and Image Preprocessing
Image processing
History Processing
Double Q-Learning
Clipping
Scaling Rewards
Prioritized Replay
Graph, Visualization, and Mean-Q
RL4J
Conclusion
C. Numbers Everyone Should Know
D. Neural Networks and Backpropagation: A Mathematical Approach
Introduction
Backpropagation in a Multilayer Perceptron
E. Using the ND4J API
Design and Basic Usage
Understanding NDArrays
ND4J General Syntax
The Basics of Working with NDArrays
The ND4J class
Nd4j.zeros( int ... )
Nd4j.ones( int ... )
Initializing with other values
Initializing with random numbers
Controlling the shape of NDArrays
Creating basic arrays
Example: create a 2 x 2 NDArray
Example: add two 2 x 2 NDArrays together
Creating NDArrays from Java arrays
Getting and setting individual NDArray values
Working with NDArray rows
Get a single row
Get multiple rows
Setting a single row
Quick reference for determining the size/dimensions of NDArrays
Dataset
Relationship to NDArray
Common uses
Creating Input Vectors
Basics of Vector Creation
Sizing the vector
Setting feature values
Setting the label
Single-label output
Multiple-label output
Regression output
Using MLLibUtil
Converting from INDArray to MLLib Vector
Converting from MLLib Vector to INDArray
Making Model Predictions with DL4J
Using DL4J and ND4J Together
Differences between output vectors depending on output layer type
Logistic output layer for binary classification
Softmax output layer for multilabel classification
Linear output layer for regression output
Getting the predicted label from the returned INDArray
F. Using DataVec
Loading Data for Machine Learning
Loading CSV Data for Multilayer Perceptrons
Loading Image Data for Convolutional Neural Networks
Loading Sequence Data for Recurrent Neural Networks
Transforming Data: Data Wrangling with DataVec
DataVec Transforms: Key Concepts
DataVec Transform Functionality: An Example
G. Working with DL4J from Source
Verifying Git Is Installed
Cloning Key DL4J GitHub Projects
Downloading Source via Zip File
Using Maven to Build Source Code
H. Setting Up DL4J Projects
Creating a New DL4J Project
Java
Working with Maven
A minimal Project Object Model file
Project Object Model explanation
IDEs
Quickstart a DL4J project by using IntelliJ
Setting Up Other Maven POMs
ND4J and Maven
I. Setting Up GPUs for DL4J Projects
Switching Backends to GPU
Picking a GPU
Training on a Multi-GPU System
CUDA on Different Platforms
Monitoring GPU Performance
NVIDIA System Management Interface
J. Troubleshooting DL4J Installations
Previous Installation
Memory Errors When Installing From Source
Older Versions of Maven
Maven and PATH Variables
Bad JDK Versions
C++ and Other Development Tools
Windows and Include Paths
Monitoring GPUs
Using JVisualVM
Working with Clojure
OS X and Float Support
Fork-Join Bug in Java 7
Precautions
Other Local Repositories
Check Maven Dependencies
Reinstall Dependencies
If All Else Fails
Different Platforms
OS X
Windows
Setting up Visual Studio
Working with Windows on 64-bit platforms
Linux
Ubuntu
CentOS
Index