
Beyond neural scaling laws – Paper Explained

The paper "Beyond neural scaling laws: beating power law scaling via data pruning" explained with animations. You do not need to train your neural network on the entire dataset!
Sponsor: NVIDIA ❗ use this link to register for the GTC 👉 https://nvda.ws/3p4E5K8
Google Form to enter DLI credits giveaway: https://forms.gle/rkZS4u5rfyySr6iP9

ERRATUM: See the pinned comment for which easy/hard examples are chosen.

📺 PaLM model explained: https://youtu.be/yi-A0kWXEO4

Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
Paper 📜: Sorscher, Ben, Robert Geirhos, Shashank Shekhar, Surya Ganguli, and Ari S. Morcos. "Beyond neural scaling laws: beating power law scaling via data pruning." arXiv preprint arXiv:2206.14486 (2022). https://arxiv.org/abs/2206.14486

Outline:
00:00 Stable Diffusion is a Latent Diffusion Model
01:43 NVIDIA (sponsor): Register for the GTC!
03:00 What are neural scaling laws? Power laws explained. (rough formula sketched below the outline)
05:15 Exponential scaling in theory
07:40 What the theory predicts
09:50 Unsupervised data pruning with foundation models (toy code sketch below the outline)
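
For the curious, two quick sketches of the ideas behind the 03:00 and 09:50 chapters. First, the scaling-law picture in my own rough notation (not the paper's exact symbols): the usual empirical finding is that test error falls off as a slow power law in dataset size N, while the paper argues that a good pruning metric can push this towards exponential decay.

```latex
% Rough sketch, my notation; a, b, c, \nu, E_\infty are generic constants.
\[
  \underbrace{E(N) \;\approx\; a\,N^{-\nu} + E_\infty}_{\text{power-law scaling: } \nu \text{ is often small, so extra data helps slowly}}
  \qquad \text{vs.} \qquad
  \underbrace{E(N) \;\approx\; b\,e^{-cN} + E_\infty}_{\text{exponential scaling, argued to be reachable with a good pruning metric}}
\]
```

Second, a minimal toy sketch of the unsupervised pruning recipe from the 09:50 chapter. This is not the authors' code, and all names and defaults below are placeholders: embed the data with a pretrained foundation model (e.g. SwAV, not shown here), cluster the embeddings with k-means, and use each example's distance to its nearest centroid as a difficulty score.

```python
# Toy sketch of self-supervised data pruning; assumes `embeddings` already come
# from a pretrained foundation model (that embedding step is not shown here).
import numpy as np
from sklearn.cluster import KMeans

def prune_by_prototypicality(embeddings, keep_fraction, n_clusters=100,
                             keep_hard=True, seed=0):
    """Return indices of the examples to keep after pruning.

    keep_hard=True keeps examples far from their cluster centroid ("hard"),
    keep_hard=False keeps prototypical ones close to it ("easy"); which choice
    is better depends on how much data you have (see the video and the
    pinned erratum).
    """
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit(embeddings)
    # Distance of each example to its own (nearest) centroid = difficulty proxy.
    dist_to_centroid = km.transform(embeddings).min(axis=1)
    order = np.argsort(dist_to_centroid)  # sorted easy (small) -> hard (large)
    n_keep = int(keep_fraction * len(embeddings))
    return order[-n_keep:] if keep_hard else order[:n_keep]

# Example: keep the "hardest" 70% of a random toy embedding matrix.
rng = np.random.default_rng(0)
kept = prune_by_prototypicality(rng.normal(size=(1000, 64)),
                                keep_fraction=0.7, n_clusters=10)
print(len(kept), "examples kept")
```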

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Don Rosenthal, Dres. Trost GbR, Julián Salazar, Edvard Grødem, Vignesh Valliappan, Mutual Information, Mike Ton
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video editing: Nils Trost

Video "Beyond neural scaling laws – Paper Explained" from the channel AI Coffee Break with Letitia
Video information: published September 10, 2022, 13:00:16 · duration 00:13:16