Soft Actor Critic is Easy in PyTorch | Complete Deep Reinforcement Learning Tutorial
Soft actor critic (SAC) is an off-policy actor-critic method for reinforcement learning problems with continuous action spaces. It makes use of a novel framework that maximizes the entropy of the agent's policy alongside its expected return. We're going to write our very own SAC agent in PyTorch, starting from scratch.
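As a quick sketch of that idea, the maximum-entropy objective from the SAC paper augments the usual sum of rewards with an entropy bonus, weighted by a temperature parameter (written here as alpha; the notation follows the standard presentation, not necessarily the video's):

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
\Big[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big]
```

Larger alpha encourages more random (higher-entropy) policies, which helps exploration; alpha set to zero recovers the standard RL objective.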
We're going to need to implement several classes for this project:
A replay buffer to keep track of the states the agent encountered, the actions it took, and the rewards it received along the way.
A critic network that tells the agent how valuable it thinks its chosen actions were.
A value network that informs the agent how valuable each state is.
We will also make use of ideas from double Q-learning, such as taking the minimum of the estimates from two critics, in our update rules for the value and actor networks.
We will test our agent in the Inverted Pendulum environment from the PyBullet package, an open-source 3D rendering and physics engine.
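The replay buffer described above can be sketched as below. This is a minimal, illustrative version (class and attribute names are assumptions, not necessarily those used in the video), but it shows the core mechanics: preallocated arrays, circular overwriting once the buffer is full, and uniform random sampling of transition batches.

```python
import numpy as np

class ReplayBuffer:
    """Minimal sketch of an experience replay buffer for SAC."""
    def __init__(self, max_size, state_dim, n_actions):
        self.max_size = max_size
        self.counter = 0  # total transitions ever stored
        self.states = np.zeros((max_size, state_dim), dtype=np.float32)
        self.actions = np.zeros((max_size, n_actions), dtype=np.float32)
        self.rewards = np.zeros(max_size, dtype=np.float32)
        self.next_states = np.zeros((max_size, state_dim), dtype=np.float32)
        self.dones = np.zeros(max_size, dtype=bool)

    def store(self, state, action, reward, next_state, done):
        idx = self.counter % self.max_size  # overwrite oldest when full
        self.states[idx] = state
        self.actions[idx] = action
        self.rewards[idx] = reward
        self.next_states[idx] = next_state
        self.dones[idx] = done
        self.counter += 1

    def sample(self, batch_size):
        valid = min(self.counter, self.max_size)  # only sample filled slots
        batch = np.random.choice(valid, batch_size)
        return (self.states[batch], self.actions[batch], self.rewards[batch],
                self.next_states[batch], self.dones[batch])
```

During training, the agent stores one transition per environment step and periodically calls `sample` to draw a decorrelated minibatch for the network updates.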
Code for this video is here:
https://github.com/philtabor/Youtube-Code-Repository/tree/master/ReinforcementLearning/PolicyGradient/SAC
Learn how to turn deep reinforcement learning papers into code:
Deep Q Learning:
https://www.udemy.com/course/deep-q-learning-from-paper-to-code/?couponCode=DQN-JUNE-2021
Actor Critic Methods:
https://www.udemy.com/course/actor-critic-methods-from-paper-to-code-with-pytorch/?couponCode=AC-JUNE-2021
Natural Language Processing from First Principles:
https://www.udemy.com/course/natural-language-processing-from-first-principles/?couponCode=NLP1-JULY-2021
Reinforcement Learning Fundamentals
https://www.manning.com/livevideo/reinforcement-learning-in-motion
Come hang out on Discord here:
https://discord.gg/Zr4VCdv
Website: https://www.neuralnet.ai
Github: https://github.com/philtabor
Twitter: https://twitter.com/MLWithPhil
#SoftActorCritic #DeepReinforcementLearning #Pytorch
Video "Soft Actor Critic is Easy in PyTorch | Complete Deep Reinforcement Learning Tutorial" from the channel Machine Learning with Phil
Published: August 19, 2020, 23:40:40
Duration: 01:02:31