Q-Learning: Model Free Reinforcement Learning and Temporal Difference Learning
Here we describe Q-learning, which is one of the most popular methods in reinforcement learning. Q-learning is a type of temporal difference learning. We discuss other TD algorithms, such as SARSA, and connections to biological learning through dopamine. Q-learning is also one of the most common frameworks for deep reinforcement learning.
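As a rough illustration of the difference between Q-learning and SARSA mentioned above, here is a minimal tabular sketch. It is not taken from the lecture or the book: the Q table stored as a list of lists, the function names, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are all assumptions made for the example. Both methods apply the same temporal difference update and differ only in the target: Q-learning bootstraps from the best next action (off-policy), while SARSA bootstraps from the action actually taken next (on-policy).

```python
# Hypothetical tabular sketch: Q[state][action] holds the current value estimates.

def q_learning_target(Q, reward, next_state, gamma):
    """Off-policy TD target: bootstrap from the greedy (max) next action."""
    return reward + gamma * max(Q[next_state])


def sarsa_target(Q, reward, next_state, next_action, gamma):
    """On-policy TD target: bootstrap from the action actually taken next."""
    return reward + gamma * Q[next_state][next_action]


def td_update(Q, state, action, target, alpha):
    """Move Q[state][action] a fraction alpha toward the target; return the TD error."""
    td_error = target - Q[state][action]
    Q[state][action] += alpha * td_error
    return td_error


def demo():
    """Apply one update of each kind to the same toy transition."""
    gamma, alpha = 0.9, 0.5
    # Toy transition: state 0, action 1, reward 1.0, next state 1.
    Q_ql = [[0.0, 0.0], [1.0, 2.0]]
    Q_sarsa = [[0.0, 0.0], [1.0, 2.0]]

    td_update(Q_ql, 0, 1, q_learning_target(Q_ql, 1.0, 1, gamma), alpha)
    # Assume the behavior policy picked action 0 in the next state.
    td_update(Q_sarsa, 0, 1, sarsa_target(Q_sarsa, 1.0, 1, 0, gamma), alpha)
    return Q_ql[0][1], Q_sarsa[0][1]


if __name__ == "__main__":
    print(demo())
```

In this toy transition the two updates give different values because the next action assumed for SARSA is not the greedy one; if the policy had chosen the greedy action, the two targets would coincide.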
Citable link for this video: https://doi.org/10.52843/cassyni.ss11hp
This is a lecture in a series on reinforcement learning, following the new Chapter 11 from the 2nd edition of our book "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz.
Book Website: http://databookuw.com
Book PDF: http://databookuw.com/databook.pdf
Amazon: https://www.amazon.com/Data-Driven-Science-Engineering-Learning-Dynamical/dp/1108422098/
Brunton Website: eigensteve.com
This video was produced at the University of Washington
Video "Q-Learning: Model Free Reinforcement Learning and Temporal Difference Learning" from the Steve Brunton channel