Training learned optimizers: VeLO paper EXPLAINED
Why tune optimizer hyperparameters (e.g., for Adam) by hand when you can train a neural network to behave like an optimizer and dynamically find the best update for your neural network’s weights?
In this video, we explain how VeLO trains an optimizer on data from previous training runs.
► Sponsor: Cohere 👉 https://t1p.de/22srn
Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
📜 VeLO paper: Metz, Luke, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal et al. "VeLO: Training Versatile Learned Optimizers by Scaling Up." https://arxiv.org/abs/2211.09760
Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Dres. Trost GbR, Siltax, Edvard Grødem, Vignesh Valliappan, Mutual Information, Mike Ton
Outline:
00:00 VeLO optimizer without any hyperparameters
01:13 Cohere [Sponsor]
02:27 What are optimizers?
04:37 VeLO idea and training data
06:43 VeLO model and training
10:15 What can VeLO do?
11:52 Limitations of VeLO
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Video editing: Nils Trost
Music 🎵 : Hey There - half.cool
Video information:
Date: January 10, 2023, 17:35:36
Duration: 00:12:56