PaLM Pathways Language Model explained | 540 Billion parameters can explain jokes!?

It’s time to explain PaLM, Google AI’s Pathways Language Model, in a coffee break! ☕ Are you ready for this dense, 540-billion-parameter model to explain jokes to you? And what about chain-of-thought reasoning? Or any of the other cool NLP tasks, like the crazy ones listed in BIG-bench?

SPONSOR: Weights & Biases 👉 https://wandb.me/ai-coffee-break

📺 Diffusion models and GLIDE explained: https://youtu.be/344w5h24-h8

Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

Paper 📜: Chowdhery, Aakanksha, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham et al. "PaLM: Scaling Language Modeling with Pathways." arXiv preprint arXiv:2204.02311 (2022). https://arxiv.org/abs/2204.02311
🔗 PaLM blog: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
🔗 Large language models are slightly boring: https://twitter.com/Thom_Wolf/status/1511642282788896769

Outline:
00:00 DALL-E 2 or PaLM?
01:14 Weights & Biases (Sponsor)
02:25 A brief history of boring large language models
03:43 What is PaLM?
05:11 Training PaLM on all TPUs
08:11 PaLM training data
08:49 What it can do
10:31 Few-shot learning explained
13:20 Explaining jokes and Outlook

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Don Rosenthal, Dres. Trost GbR, banana.dev -- Kyle Morris, Julián Salazar, Edvard Grødem, Vignesh Valliappan, Kevin Tsai, Mike Ton

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Music 🎵 : That's What It Takes (Instrumental) - NEFFEX
✍️ Arabic Subtitles by Ali Haidar Ahmad: https://www.linkedin.com/in/ali-ahmad-0706a51bb/
