Multitask Prompted Training Enables Zero-shot Task Generalization (Explained)
Can zero-shot generalization instead be directly induced by explicit multitask learning? Watch the video to find out!
0:00 - Intro
2:14 - Prompted training format
5:52 - Measuring generalization to unseen tasks
8:45 - Held-out tasks
10:45 - The future of NLP
11:48 - Model
12:17 - Experiment results
Connect
LinkedIn https://www.linkedin.com/in/xue-yong-fu-955723a6/
Twitter https://twitter.com/home
Email edwindeeplearning@gmail.com
Paper
https://arxiv.org/abs/2110.08207
Code
https://github.com/bigscience-workshop/promptsource/
Abstract
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks. It has been hypothesized that this is a consequence of implicit multitask learning in language model training. Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts written in varied natural language. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models 6x its size.
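The conversion step described in the abstract, rendering each supervised example through several natural-language prompt templates, can be sketched as follows. This is a minimal illustration of the idea; the template wording and function names are hypothetical, not the promptsource API:

```python
# Sketch of prompted-format conversion: one supervised NLI-style example
# is rendered through multiple templates, each yielding an
# (input_text, target_text) pair for multitask fine-tuning.
# Templates and verbalizers below are illustrative only.

TEMPLATES = [
    ("Premise: {premise}\nHypothesis: {hypothesis}\n"
     "Does the premise entail the hypothesis?",
     {True: "yes", False: "no"}),
    ("{premise}\nQuestion: is it true that \"{hypothesis}\"?",
     {True: "true", False: "false"}),
]

def to_prompted(example):
    """Render one labeled example with every template."""
    pairs = []
    for template, verbalizer in TEMPLATES:
        input_text = template.format(premise=example["premise"],
                                     hypothesis=example["hypothesis"])
        target_text = verbalizer[example["label"]]
        pairs.append((input_text, target_text))
    return pairs

example = {"premise": "A dog is running in the park.",
           "hypothesis": "An animal is outdoors.",
           "label": True}
prompted = to_prompted(example)
```

Each dataset contributes several such prompt variants, so the fine-tuning mixture sees the same task phrased many different ways, which is what lets the model generalize to unseen tasks described in natural language.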
Video "Multitask Prompted Training Enables Zero-shot Task Generalization (Explained)" from the channel Deep Learning Explainer
Uploaded: October 25, 2021, 4:05:29
Duration: 00:16:36