Deep Reinforcement Learning, part 1 - Doina Precup - MLSS 2020, Tübingen

Table of Contents (powered by https://videoken.com)
0:00:00 Speaker Introduction
0:01:22 Introduction to Reinforcement Learning: Part 1 - prediction, Value-Based, Model-free, Control (including DQN)
0:05:18 Reinforcement Learning
0:07:01 Example: AlphaGo & AlphaZero
0:12:39 Key Features of RL
0:14:03 Reinforcement Learning
0:14:36 Example: TD-Gammon
0:16:49 Some RL Successes
0:24:44 Computational framework
0:25:37 The Agent-Environment Interface
0:27:21 Supervised vs Reinforcement Learning
0:28:40 Agent's learning task
0:29:34 Return
0:30:59 Episodic Tasks
0:31:19 Example: Mountain Car
0:35:40 Continuing Tasks
0:40:58 4 value functions
0:44:24 Value function approximation
0:45:00 A natural objective in VFA is to minimize the Mean Square Value Error
0:46:02 Simple Monte Carlo
0:48:55 Gradient MC works well on the 1000-state random walk using state aggregation
0:51:09 Markov Decision Processes
0:53:24 Optimal Value Functions
0:54:40 What About Optimal Action-Value Functions?
0:55:20 Bellman Equation for a Policy
0:57:37 cf. Dynamic Programming
0:58:56 Recall: Monte Carlo
0:59:29 Simplest TD Method
1:01:34 TD Prediction
1:03:24 You are the Predictor
1:06:03 TD vs MC
1:07:22 Semi-gradient TD is less accurate than MC on the 1000-state random walk using state aggregation
1:09:00 n-step TD Prediction
1:11:13 Mathematics of n-step TD Targets
1:12:13 The λ-return is a compound update target
1:12:45 Unified View
1:22:48 Value function approximation (VFA) replaces the table with a general parameterized form
1:23:03 Stochastic Gradient Descent (SGD) is the idea behind most approximate learning
1:25:07 Geometric intuition
1:29:30 TD converges to the TD fixed point, θ_TD, a biased but interesting answer
1:32:35 Summing up policy evaluation
1:33:56 TD(λ) performance with a
1:34:31 Q&A

Video "Deep Reinforcement Learning, part 1 - Doina Precup - MLSS 2020, Tübingen" from the channel virtual mlss2020
Video information
July 9, 2020, 10:44:07
01:38:26