09L – Differentiable associative memories, attention, and transformers
Course website: http://bit.ly/DLSP21-web
Playlist: http://bit.ly/DLSP21-YouTube
Speaker: Yann LeCun
Chapters
00:00:00 – Motivation for reasoning & planning
00:09:11 – Inference through energy minimization
00:18:08 – Disclaimer
00:19:02 – Planning through energy minimization
00:32:59 – Q&A Optimal control diagram
00:39:23 – Differentiable associative memory and attention
01:01:03 – Transformers
01:08:14 – Q&A Other differentiable attention architectures
01:10:32 – Transformer architecture
01:27:54 – Transformer applications: 1. Multilingual transformer architecture XLM-R
01:30:16 – 2. Supervised symbol manipulation
01:32:14 – 3. NL understanding & generation
01:36:51 – 4. DETR
01:46:47 – Planning through optimal control
01:55:37 – Conclusion
Video 09L – Differentiable associative memories, attention, and transformers, from the Alfredo Canziani channel