Index
Title Page
Copyright and Credits
Hands-On Reinforcement Learning for Games
Dedication
About Packt
Why subscribe?
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Exploring the Environment
Understanding Rewards-Based Learning
Technical requirements
Understanding rewards-based learning
The elements of RL
The history of RL
Why RL in games?
Introducing the Markov decision process
The Markov property and MDP
Building an MDP
Using value learning with multi-armed bandits
Coding a value learner
Implementing a greedy policy
Exploration versus exploitation
Exploring Q-learning with contextual bandits
Implementing a Q-learning agent
Removing discounted rewards
Summary
Questions
Dynamic Programming and the Bellman Equation
Introducing DP
Regular programming versus DP
Enter DP and memoization
Understanding the Bellman equation
Unraveling the finite MDP
The Bellman optimality equation
Building policy iteration 
Installing OpenAI Gym
Testing Gym
Policy evaluation
Policy improvement
Building value iteration
Playing with policy versus value iteration
Exercises
Summary
Monte Carlo Methods
Understanding model-based and model-free learning
Introducing the Monte Carlo method
Solving for π
Implementing Monte Carlo
Plotting the guesses
Adding RL
Monte Carlo control
Playing the FrozenLake game
Using prediction and control
Incremental means
Exercises
Summary
Temporal Difference Learning
Understanding the TCA problem
Introducing TDL
Bootstrapping and backup diagrams
Applying TD prediction
TD(0) or one-step TD
Tuning hyperparameters
Applying TDL to Q-learning
Exploring TD(0) in Q-learning
Exploration versus exploitation revisited
Teaching an agent to drive a taxi
Running off- versus on-policy
Exercises
Summary
Exploring SARSA
Exploring SARSA on-policy learning
Using continuous spaces with SARSA
Discretizing continuous state spaces
Expected SARSA
Extending continuous spaces
Working with TD (λ) and eligibility traces
Backward views and eligibility traces
Understanding SARSA (λ)
SARSA lambda and the Lunar Lander
Exercises
Summary
Section 2: Exploiting the Knowledge
Going Deep with DQN
DL for RL
DL frameworks for DRL
Using PyTorch for DL
Computational graphs with tensors
Training a neural network – computational graph
Building neural networks with Torch
Understanding DQN in PyTorch
Refreshing the environment
Partially observable Markov decision process
Constructing DQN
The replay buffer
The DQN class
Calculating loss and training
Exercising DQN
Revisiting the LunarLander and beyond
Exercises
Summary
Going Deeper with DDQN
Understanding visual state
Encoding visual state
Introducing CNNs
Working with a DQN on Atari
Adding CNN layers
Introducing DDQN
Double DQN or the fixed Q targets
Dueling DQN or the real DDQN
Extending replay with prioritized experience replay
Exercises
Summary
Policy Gradient Methods
Understanding policy gradient methods
Policy gradient ascent
Introducing REINFORCE
Using advantage actor-critic
Actor-critic
Training advantage AC
Building a deep deterministic policy gradient
Training DDPG
Exploring trust region policy optimization
Conjugate gradients
Trust region methods
The TRPO step
Exercises
Summary
Optimizing for Continuous Control
Understanding continuous control with Mujoco
Introducing proximal policy optimization
The hows of policy optimization
PPO and clipped objectives
Using PPO with recurrent networks
Deciding on synchronous and asynchronous actors
Using A2C
Using A3C
Building actor-critic with experience replay
Exercises
Summary
All about Rainbow DQN
Rainbow – combining improvements in deep reinforcement learning
Using TensorBoard
Introducing distributional RL
Back to TensorBoard
Understanding noisy networks
Noisy networks for exploration and importance sampling
Unveiling Rainbow DQN
When does training fail?
Exercises
Summary
Exploiting ML-Agents
Installing ML-Agents
Building a Unity environment
Building for Gym wrappers
Training a Unity environment with Rainbow
Creating a new environment
Coding an agent/environment
Advancing RL with ML-Agents
Curriculum learning
Behavioral cloning
Curiosity learning
Training generalized reinforcement learning agents
Exercises
Summary
DRL Frameworks
Choosing a framework
Introducing Google Dopamine
Playing with Keras-RL
Exploring RL Lib
Using TF-Agents
Exercises
Summary
Section 3: Reward Yourself
3D Worlds
Reasoning on 3D worlds
Training a visual agent
Generalizing 3D vision
ResNet for visual observation encoding
Challenging the Unity Obstacle Tower Challenge
Pre-training the agent
Prierarchy – implicit hierarchies
Exploring Habitat – embodied agents by FAIR
Installing Habitat
Training in Habitat
Exercises
Summary
From DRL to AGI
Learning meta learning
Learning 2 learn
Model-agnostic meta learning
Training a meta learner
Introducing meta reinforcement learning
MAML-RL
Using hindsight experience replay
Imagination and reasoning in RL
Generating imagination
Understanding imagination-augmented agents
Exercises
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think