RLF S3L1: When the Map Runs Out — Why Model-Free RL?

This lecture motivates the entire section by showing where Dynamic Programming (DP) breaks down. DP needs a complete model of the environment, meaning its transition probabilities and rewards. Students learn why real-world environments such as Blackjack, Atari, and robotics make DP impractical or impossible to apply, and how Monte Carlo (MC) methods get around this by replacing the "model" with raw experience. We finish with the one-sentence summary of MC: play many episodes, average the returns you see.
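The one-sentence summary of MC can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the random-walk environment, the function names `sample_episode` and `mc_first_visit`, and the choice of first-visit averaging are all assumptions made for the example. The estimator only samples episodes and never reads transition probabilities.

```python
import random
from collections import defaultdict


def sample_episode(rng, n_states=5):
    """Hypothetical toy environment: a random walk over states 0..n_states-1.

    The walk starts in the middle and moves left or right with equal
    probability. Stepping off the right end pays reward 1, stepping off
    the left end pays 0. Returns a list of (state, reward) pairs.
    """
    state = n_states // 2
    trajectory = []
    while True:
        next_state = state + rng.choice((-1, 1))
        if next_state < 0:
            trajectory.append((state, 0.0))
            return trajectory
        if next_state >= n_states:
            trajectory.append((state, 1.0))
            return trajectory
        trajectory.append((state, 0.0))
        state = next_state


def mc_first_visit(num_episodes, gamma=1.0, seed=0):
    """Estimate state values by averaging sampled returns (first-visit MC)."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = sample_episode(rng)
        # Walk backwards to compute the return that follows each time step.
        g = 0.0
        returns_at = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            g = episode[t][1] + gamma * g
            returns_at[t] = g
        # Count only the first visit to each state within this episode.
        seen = set()
        for t, (state, _) in enumerate(episode):
            if state in seen:
                continue
            seen.add(state)
            returns_sum[state] += returns_at[t]
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}


values = mc_first_visit(20000)
for s, v in values.items():
    print(f"state {s}: estimated value {v:.3f}")
```

For this particular toy walk the exact values are (s + 1) / 6 for state s, so the printed estimates can be checked against them. With more episodes the averages tighten around those values.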

Full Course: https://quanzetta.com/courses/reinforcement-learning-foundation/

Video "RLF S3L1: When the Map Runs Out — Why Model-Free RL?" from the Quanzetta channel

Video information

May 22, 2026, 0:10:27

00:08:10

Other videos from the channel

Reinforcement Learning Foundation

RLF S3L2: MC Prediction — Two Ways to Count a Visit

RLF S1L3: Your Roadmap — Course Overview & Structure

RLF S4L3: TD(n) — Multi-Step Returns

RLF S1L1F: When You Can't Write the Rules — Introduction to RL - مقدمة التعليم المعزز

RLF S2L2: The Bellman Equation & Value Functions (V and Q)

RLF S1L2: The 8 Words That Unlock Everything — Core Vocabulary (Arabic - بالعربي)

RLF S1L4: From Zero to First Agent — Setup & Your First Code
