Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
This video explores T5, Google's large-scale study of Transfer Learning for NLP. The paper systematically compares the many factors in the pre-train-then-fine-tune pipeline: pre-training objectives (autoregressive language modeling vs. BERT-style masked language modeling vs. XLNet-style shuffling), pre-training dataset composition and size, and how to best use additional computation. Thanks for watching, and please check out Machine Learning Street Talk, where Tim Scarfe, Yannic Kilcher, and I discuss this paper!
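To make the "unified text-to-text" framing concrete, here is a minimal sketch (my own illustration, not code from the video or paper), assuming the Hugging Face transformers library and the public t5-small checkpoint: every task is expressed by feeding the model a plain text string with a task prefix and decoding the answer as text.

```python
# A minimal sketch of T5's text-to-text interface via Hugging Face
# transformers (assumed installed); "t5-small" is a public checkpoint.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is cast as text-to-text: a task prefix plus the input string.
# The same model, loss, and decoding handle translation, summarization,
# and classification alike.
inputs = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# Expected output (roughly): "Das Haus ist wunderbar."
```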
Machine Learning Street Talk: https://www.youtube.com/channel/UCMLtBahI5DMrt0NPvDSoIRQ
Paper Links:
T5: https://arxiv.org/abs/1910.10683
Google AI Blog Post on T5: https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html
Train Large, Then Compress: https://arxiv.org/pdf/2002.11794.pdf
Scaling Laws for Neural Language Models: https://arxiv.org/pdf/2001.08361.pdf
The Illustrated Transformer: http://jalammar.github.io/illustrated-transformer/
ELECTRA: https://arxiv.org/pdf/2003.10555.pdf
Transformer-XL: https://arxiv.org/pdf/1901.02860.pdf
Reformer: The Efficient Transformer: https://openreview.net/pdf?id=rkgNKkHtvB
The Evolved Transformer: https://arxiv.org/pdf/1901.11117.pdf
DistilBERT: https://arxiv.org/pdf/1910.01108.pdf
How to generate text (HIGHLY RECOMMEND): https://huggingface.co/blog/how-to-generate
Tokenizers: https://blog.floydhub.com/tokenization-nlp/
Thanks for watching! Please Subscribe!