Intuition of the DQN algorithm

Back in 2015, Deep Q Networks (DQN) created quite a splash in the AI world. DQN was certainly not the first to combine neural networks with Reinforcement Learning. However, it was the first to successfully learn to win at a range of 1980s arcade games using only high-dimensional information (i.e., images of the screen) as input.

This video aims to help build an intuition of how the algorithm works. We will look at:
- the schematics of the network,
- the role of the replay memory,
- the significance of Q-values,
- the loss for the gradient updates, and
- the need for a target network.

Oh, and the best part: all this comes with Garfield helping us through it. 🤪
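
For the curious, here is a rough sketch of how three of the pieces listed above fit together: the replay memory, the bootstrapped target, and the loss. It is not the code from the video or the papers. The names `ReplayMemory`, `td_target` and `squared_td_error`, the discount factor of 0.99, and the plain squared error are all assumptions made for illustration, and the Q-values are passed in as plain lists rather than coming from a real network.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor; an assumed value, not taken from the video


class ReplayMemory:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between
        # consecutive transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Bootstrapped label: r + gamma * max over a' of Q_target(s', a').

    next_q_values stands in for the target network's output for the
    next state (one value per action). At the end of an episode there
    is no next state, so the label is just the reward.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(q_value, target):
    """Squared difference between the label and the online network's estimate."""
    return (target - q_value) ** 2


# Tiny usage example with made-up numbers.
memory = ReplayMemory(capacity=1000)
memory.push(state=0, action=1, reward=1.0, next_state=1, done=False)
state, action, reward, next_state, done = memory.sample(1)[0]
label = td_target(reward, next_q_values=[0.2, 0.7], done=done)
loss = squared_td_error(q_value=0.5, target=label)
```

In a real agent, `next_q_values` would come from a separate, periodically updated copy of the network (the target network), and the loss would be averaged over a sampled mini-batch before the gradient step.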

---------------

Hey there, you wonderful being! 👋
Hope you are all doing well. Thanks for stopping by. Hope you enjoyed the video and found it helpful. Hope you have a magical day. ✨

---------------

// Resources related to this topic that you might find interesting: 📑
1. Human-level control through deep reinforcement learning (the 2015 Nature paper): https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf
2. Playing Atari with Deep Reinforcement Learning (the 2013 vanilla-DQN paper): https://arxiv.org/pdf/1312.5602.pdf
3. Formalisation of key concepts like return, state-action value function and so on: https://spinningup.openai.com/en/latest/spinningup/rl_intro.html#key-concepts-and-terminology
4. A series of posts, by me, on DQN: https://www.saashanair.com/dqn-theory/
5. The GitHub repo associated with my blog posts: https://github.com/saashanair/rl-series/tree/master/dqn

// Timestamps ⌛️
0:00 DQN wins at Breakout
0:21 Hello there 👋
0:43 Say hello to Garfield 😸
1:07 Garfield Gridworld and the need for DQN
2:08 But we don’t have a labelled dataset here… 🤔
2:49 What is a Replay Memory? 🧠
3:39 What do Q-values represent: Intuition 🪄
4:14 What do Q-values represent: Formalisation ➗
4:37 Wait, Q-values are recursive 🌀
5:02 NN estimates Q-value
5:26 We finally have labels… 🎉
5:35 Erm, but the targets are estimates… (Bootstrapping) 🧐
5:55 DQN Paper: 2013 vs 2015 (Introducing the Target Network)
7:05 Updated loss function
7:15 Don’t forget to explore during training
8:04 Let’s summarise this information overload 😰
8:42 Don’t forget to check out the blog posts
8:52 Seeeeee ya 🤗

// Who am I? 👩‍💻
I respond to the names: Saasha, Sash and Nair. Unless, of course, I am lost in my own world, which, I must warn you, happens quite often. 🤪
I am interested in all things AI, especially topics relating to Reinforcement Learning and Safety. I am currently living in Pisa, Italy, where I am pursuing a year-long research fellowship at the Sant'Anna School of Advanced Studies. My work focuses on improving the trustworthiness and robustness of the DL-based components in Autonomous Vehicles. 🚗

// Why this YouTube channel? 🎥
First of all, good on you for taking your learning into your own hands. I am proud of you for wanting to expand your limits and for putting in the effort. But self-learning can be quite a lonely journey. So, if you are interested, let's claim this little slice of the internet as our own and build a community where our quirky nerdy selves can shine. Let's support one another as we follow our curiosities and explore the vast and expansive world of AI. 💙

// Let's connect 📮
Website: https://www.saashanair.com
LinkedIn: https://www.linkedin.com/in/saashanair/
Twitter: https://twitter.com/nair_saasha
Mail: saasha.allthingsai@gmail.com

---------------

Credits

// Video snippets
Computer teaches itself to play games, BBC News: https://youtu.be/nwx96e7qck0
Deep Q-Network Plays Atari 2600 Pong: https://youtu.be/p88R2_3yWPA
Garfield VS Odie, Garfield & Friends: https://youtu.be/9BilR0kUkbs
No More Mondays!, Garfield & Friends: https://youtu.be/N-WE2KQT9lM
Garfield Eating Lasagna Compilation: https://youtu.be/8aKav5zEaTQ

// Music
Colorful Flowers by Tokyo Music Walker https://soundcloud.com/user-356546060
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: https://bit.ly/al-colorful-flowers
Music promoted by Audio Library https://youtu.be/vYp14UesizY

---------------

Subscriber count: 52

Video: Intuition of the DQN algorithm, from the channel Saasha Nair
