Combining Deep Reinforcement Learning and Search for Imperfect-Information Games
The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prior algorithms of this form cannot cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game. In the simpler setting of perfect-information games, ReBeL reduces to an algorithm similar to AlphaZero. Results in two different imperfect-information games show ReBeL converges to an approximate Nash equilibrium. We also show ReBeL achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI.
Noam Brown*, Anton Bakhtin*, Adam Lerer, Qucheng Gong
NeurIPS 2020
https://arxiv.org/abs/2007.13544
Video "Combining Deep Reinforcement Learning and Search for Imperfect-Information Games" from the Noam Brown channel