Index
Title Page
Copyright and Credits
  Hands-On Intelligent Agents with OpenAI Gym
Dedication
Packt Upsell
  Why subscribe?
  PacktPub.com
Contributors
  About the author
  About the reviewer
  Packt is searching for authors like you
Preface
  Who this book is for
  What this book covers
  To get the most out of this book
    Download the example code files
    Download the color images
    Conventions used
  Get in touch
    Reviews
Introduction to Intelligent Agents and Learning Environments
What is an intelligent agent? Learning environments What is OpenAI Gym? Understanding the features of OpenAI Gym
Simple environment interface Comparability and reproducibility Ability to monitor progress
What can you do with the OpenAI Gym toolkit? Creating your first OpenAI Gym environment
Creating and visualizing a new Gym environment
Summary
Reinforcement Learning and Deep Reinforcement Learning
  What is reinforcement learning?
  Understanding what AI means and what's in it, in an intuitive way
    Supervised learning
    Unsupervised learning
    Reinforcement learning
  Practical reinforcement learning
    Agent
    Rewards
    Environment
    State
    Model
    Value function
      State-value function
      Action-value function
    Policy
  Markov Decision Process
  Planning with dynamic programming
  Monte Carlo learning and temporal difference learning
  SARSA and Q-learning
  Deep reinforcement learning
  Practical applications of reinforcement and deep reinforcement learning algorithms
  Summary
Getting Started with OpenAI Gym and Deep Reinforcement Learning
  Code repository, setup, and configuration
    Prerequisites
    Creating the conda environment
    Minimal install – the quick and easy way
    Complete install of OpenAI Gym learning environments
      Instructions for Ubuntu
      Instructions for macOS
      MuJoCo installation
      Completing the OpenAI Gym setup
  Installing tools and libraries needed for deep reinforcement learning
    Installing prerequisite system packages
    Installing Compute Unified Device Architecture (CUDA)
    Installing PyTorch
  Summary
Exploring the Gym and its Features
  Exploring the list of environments and nomenclature
    Nomenclature
    Exploring the Gym environments
  Understanding the Gym interface
  Spaces in the Gym
  Summary
Implementing your First Learning Agent - Solving the Mountain Car problem
  Understanding the Mountain Car problem
    The Mountain Car problem and environment
  Implementing a Q-learning agent from scratch
    Revisiting Q-learning
    Implementing a Q-learning agent using Python and NumPy
      Defining the hyperparameters
      Implementing the Q_Learner class's __init__ method
      Implementing the Q_Learner class's discretize method
      Implementing the Q_Learner's get_action method
      Implementing the Q_Learner class's learn method
      Full Q_Learner class implementation
    Training the reinforcement learning agent at the Gym
  Testing and recording the performance of the agent
  A simple and complete Q-Learner implementation for solving the Mountain Car problem
  Summary
Implementing an Intelligent Agent for Optimal Control using Deep Q-Learning
  Improving the Q-learning agent
    Using neural networks to approximate Q-functions
      Implementing a shallow Q-network using PyTorch
        Implementing the Shallow_Q_Learner
        Solving the Cart Pole problem using a Shallow Q-Network
    Experience replay
      Implementing the experience memory
      Implementing the replay experience method for the Q-learner class
    Revisiting the epsilon-greedy action policy
      Implementing an epsilon decay schedule
  Implementing a deep Q-learning agent
    Implementing a deep convolutional Q-network in PyTorch
    Using the target Q-network to stabilize an agent's learning
    Logging and visualizing an agent's learning process
      Using TensorBoard for logging and visualizing a PyTorch RL agent's progress
    Managing hyperparameters and configuration parameters
      Using a JSON file to easily configure parameters
      The parameters manager
  A complete deep Q-learner to solve complex problems with raw pixel input
    The Atari Gym environment
      Customizing the Atari Gym environment
        Implementing custom Gym environment wrappers
          Reward clipping
          Preprocessing Atari screen image frames
          Normalizing observations
          Random no-ops on reset
          Fire on reset
          Episodic life
          Max and skip-frame
        Wrapping the Gym environment
  Training the deep Q-learner to play Atari games
    Putting together a comprehensive deep Q-learner
    Hyperparameters
    Launching the training process
    Testing performance of your deep Q-learner in Atari games
  Summary
Creating Custom OpenAI Gym Environments - CARLA Driving Simulator
  Understanding the anatomy of Gym environments
    Creating a template for custom Gym environment implementations
    Registering custom environments with OpenAI Gym
  Creating an OpenAI Gym-compatible CARLA driving simulator environment
    Configuration and initialization
      Configuration
      Initialization
    Implementing the reset method
      Customizing the CARLA simulation using the CarlaSettings object
      Adding cameras and sensors to a vehicle in CARLA
    Implementing the step function for the CARLA environment
      Accessing camera or sensor data
      Sending actions to control agents in CARLA
        Continuous action space in CARLA
        Discrete action space in CARLA
        Sending actions to the CARLA simulation server
      Determining the end of episodes in the CARLA environment
    Testing the CARLA Gym environment
  Summary
Implementing an Intelligent and Autonomous Car Driving Agent using the Deep Actor-Critic Algorithm
  The deep n-step advantage actor-critic algorithm
    Policy gradients
      The likelihood ratio trick
      The policy gradient theorem
    Actor-critic algorithm
    Advantage actor-critic algorithm
    n-step advantage actor-critic algorithm
      n-step returns
      Implementing the n-step return calculation
    Deep n-step advantage actor-critic algorithm
  Implementing a deep n-step advantage actor-critic agent
    Initializing the actor and critic networks
    Gathering n-step experiences using the current policy
    Calculating the actor's and critic's losses
    Updating the actor-critic model
    Tools to save/load, log, visualize, and monitor
    An extension - asynchronous deep n-step advantage actor-critic
  Training an intelligent and autonomous driving agent
    Training and testing the deep n-step advantage actor-critic agent
    Training the agent to drive a car in the CARLA driving simulator
  Summary
Exploring the Learning Environment Landscape - Roboschool, Gym Retro, StarCraft II, DeepMind Lab
  Gym interface-compatible environments
    Roboschool
      Quickstart guide to setting up and running Roboschool environments
    Gym Retro
      Quickstart guide to setting up and running Gym Retro
  Other open source Python-based learning environments
    StarCraft II - PySC2
      Quickstart guide to setting up and running the StarCraft II PySC2 environment
        Downloading the StarCraft II Linux packages
        Downloading the SC2 maps
        Installing PySC2
        Playing StarCraft II yourself or running sample agents
    DeepMind Lab
      DeepMind Lab learning environment interface
        reset(episode=-1, seed=None)
        step(action, num_steps=1)
        observations()
        is_running()
        observation_spec()
        action_spec()
        num_steps()
        fps()
        events()
        close()
      Quickstart guide to setting up and running DeepMind Lab
        Setting up and installing DeepMind Lab and its dependencies
        Playing the game, testing a randomly acting agent, or training your own!
  Summary
Exploring the Learning Algorithm Landscape - DDPG (Actor-Critic), PPO (Policy-Gradient), Rainbow (Value-Based)
  Deep Deterministic Policy Gradients
    Core concepts
  Proximal Policy Optimization
    Core concepts
      Off-policy learning
      On-policy learning
  Rainbow
    Core concepts
      DQN
      Double Q-learning
      Prioritized experience replay
      Dueling networks
      Multi-step learning/n-step learning
      Distributional RL
      Noisy nets
    Quick summary of advantages and applications
  Summary
Other Books You May Enjoy
  Leave a review - let other readers know what you think