Deep Multiagent Reinforcement Learning for Partially Observable Parameterized Environments
As software and hardware agents begin to perform tasks of genuine interest, they will face environments too complex for humans to predetermine the correct actions to take. Three characteristics shared by many complex domains are 1) high-dimensional state and action spaces, 2) partial observability, and 3) multiple learning agents. To tackle such problems, I will describe algorithms that combine deep neural network function approximation with reinforcement learning. First, I will describe using recurrent neural networks to handle partial observability in Atari games. Next, I will describe a multiagent soccer domain, Half Field Offense (HFO), and approaches for learning effective policies in its parameterized continuous action space. I will conclude with ongoing work on multiagent learning in HFO.
Video: Deep Multiagent Reinforcement Learning for Partially Observable Parameterized Environments, from the Microsoft Research channel