
Multitask Prompted Training Enables Zero-shot Task Generalization (Explained)

Can zero-shot generalization instead be directly induced by explicit multitask learning? Watch the video to find out!

0:00 - Intro
2:14 - Prompted training format
5:52 - Measuring generalization to unseen tasks
8:45 - Held-out tasks
10:45 - The future of NLP
11:48 - Model
12:17 - Experiment results
Connect
Linkedin https://www.linkedin.com/in/xue-yong-fu-955723a6/
Twitter https://twitter.com/home
email edwindeeplearning@gmail.com
Paper
https://arxiv.org/abs/2110.08207
Code
https://github.com/bigscience-workshop/promptsource/
Abstract
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks. It has been hypothesized that this is a consequence of implicit multitask learning in language model training. Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts using varying natural language. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models 6x its size.
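As a rough illustration of the prompted training format the abstract describes, the sketch below maps a single labeled example onto several differently worded natural-language prompts, which is the kind of conversion the promptsource toolkit linked above is built to do at scale. The dataset fields, templates, and wording here are invented for illustration only and are not the paper's or promptsource's exact prompts.

```python
# Minimal sketch of converting one supervised example into several
# "prompted" (input_text, target_text) pairs. The toy NLI-style example,
# field names, and templates are illustrative assumptions, not the
# paper's actual prompts.

example = {
    "premise": "A man is playing a guitar on stage.",
    "hypothesis": "A musician is performing.",
    "label": "entailment",
}

# Several natural-language phrasings of the same underlying task.
templates = [
    ("Premise: {premise}\nHypothesis: {hypothesis}\n"
     "Does the premise entail the hypothesis?", "{label}"),
    ("{premise}\nQuestion: is it true that \"{hypothesis}\"?", "{label}"),
    ("Suppose \"{premise}\". Can we infer that \"{hypothesis}\"?", "{label}"),
]

def to_prompted_pairs(ex, templates):
    """Map one labeled example to (input_text, target_text) pairs,
    one pair per prompt template."""
    return [(inp.format(**ex), tgt.format(**ex)) for inp, tgt in templates]

for inp, tgt in to_prompted_pairs(example, templates):
    print("INPUT:\n", inp, "\nTARGET:", tgt, "\n---")
```

Each resulting (input, target) pair can then be used as ordinary text-to-text fine-tuning data for a pretrained encoder-decoder model; at evaluation time the model is prompted on tasks whose datasets were entirely held out of training, which is how zero-shot generalization is measured in the paper.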

Video "Multitask Prompted Training Enables Zero-shot Task Generalization (Explained)" from the channel Deep Learning Explainer
Video information
Published: October 25, 2021, 4:05:29
Duration: 00:16:36
Other videos from the channel
ChatGPTs Take Over a Town: 25 Agents Experience Love, Friendships, and Life!
ChatGPT Plugins, Github Copilot X, Bard, Bing Image Creator - Crazy Week for AI
Can Machines Learn Like Humans - In-context Learning\Meta\Zero-shot Learning | #GPT3 (part 3)
Introduction of GPT-3: The Most Powerful Language Model Ever - #GPT3 Explained Series (part 1)
What Is A Language Model? GPT-3: Language Models Are Few-Shot Learners #GPT3 (part 2)
Question and Answer Test-Train Overlap in Open Domain Question Answering Datasets
Wav2CLIP: Connecting Text, Images, and Audio
Magical Way of Self-Training and Task Augmentation for NLP Models
Well read Students Learn Better: On The Importance Of Pre-training Compact Models
Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning (Paper Explained)
Vokenization Improving Language Understanding with Visual Grounded Supervision (Paper Explained)
Sandwich Transformer: Improving Transformer Models by Reordering their Sublayers
Too many papers to read? Try TLDR - Extreme Summarization of Scientific Documents
REALM: Retrieval-Augmented Language Model Pre-training | Open Question Answering SOTA #OpenQA
Teach Computers to Connect Videos and Text without Labeled Data - VideoClip
BART: Denoising Sequence-to-Sequence Pre-training for NLG & Translation (Explained)
GAN BERT: Generative Adversarial Learning for Robust Text Classification (Paper Explained) #GANBERT
Revealing Dark Secrets of BERT (Analysis of BERT's Attention Heads) - Paper Explained
Transformer Architecture Explained | Attention Is All You Need | Foundation of BERT, GPT-3, RoBERTa
Linkedin's New Search Engine | DeText: A Deep Text Ranking Framework with BERT | Deep Ranking Model