Beyond neural scaling laws – Paper Explained
"Beyond neural scaling laws: beating power law scaling via data pruning" paper explained with animations. You do not need to train your neural network on the entire dataset!
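In one line, the claim (our simplified notation, not the paper's exact exponents): collecting ever more data only buys a power-law drop in test error, but the paper's theory predicts that pruning a large dataset down to the right examples can make the error fall off exponentially instead:

```latex
% Simplified sketch of the claim; \nu and c are dataset-dependent.
% Power-law scaling of test error with training-set size N:
\epsilon(N) \propto N^{-\nu}, \qquad \nu > 0
% vs., with a good pruning metric, error decaying exponentially
% in the number P of carefully kept examples:
\epsilon(P) \propto e^{-cP}, \qquad c > 0
```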
Sponsor: NVIDIA ❗ Use this link to register for the GTC 👉 https://nvda.ws/3p4E5K8
Google Form to enter DLI credits giveaway: https://forms.gle/rkZS4u5rfyySr6iP9
ERRATUM: See the pinned comment about which examples (easy vs. hard) are chosen.
📺 PaLM model explained: https://youtu.be/yi-A0kWXEO4
Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
Paper 📜: Sorscher, Ben, Robert Geirhos, Shashank Shekhar, Surya Ganguli, and Ari S. Morcos. "Beyond neural scaling laws: beating power law scaling via data pruning." arXiv preprint arXiv:2206.14486 (2022). https://arxiv.org/abs/2206.14486
Outline:
00:00 Stable Diffusion is a Latent Diffusion Model
01:43 NVIDIA (sponsor): Register for the GTC!
03:00 What are neural scaling laws? Power laws explained.
05:15 Exponential scaling in theory
07:40 What the theory predicts
09:50 Unsupervised data pruning with foundation models (see the code sketch below)
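For the curious, a minimal sketch of the pruning recipe discussed in the video, assuming you already have embeddings from a pretrained foundation model (here replaced by random vectors). The function name, cluster count, and keep fraction are our illustrative choices, not the authors' exact pipeline:

```python
# Sketch of self-supervised data pruning: cluster foundation-model
# embeddings with k-means and score each example by its distance to
# the nearest centroid ("easy" = close to a prototype, "hard" = far).
import numpy as np
from sklearn.cluster import KMeans

def prune_by_prototype_distance(embeddings, n_clusters=10,
                                keep_fraction=0.8, keep_hard=True):
    """Return indices of the examples to keep after pruning."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(embeddings)
    # Distance of each example to its assigned cluster centroid.
    dists = np.linalg.norm(
        embeddings - kmeans.cluster_centers_[kmeans.labels_], axis=1)
    order = np.argsort(dists)              # ascending: easy -> hard
    n_keep = int(keep_fraction * len(embeddings))
    # With abundant data the paper keeps hard examples; with scarce
    # data, easy ones (see the erratum note above for details).
    return order[-n_keep:] if keep_hard else order[:n_keep]

# Usage with stand-in embeddings; in practice these would come from
# a pretrained model (the paper uses SwAV features on ImageNet).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(1000, 128)).astype(np.float32)
kept = prune_by_prototype_distance(fake_embeddings)
print(f"kept {len(kept)} of {len(fake_embeddings)} examples")
```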
Thanks to our Patrons who support us in Tiers 2, 3, and 4: 🙏
Don Rosenthal, Dres. Trost GbR, Julián Salazar, Edvard Grødem, Vignesh Valliappan, Mutual Information, Mike Ton
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Video editing: Nils Trost