
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

This video explores the T5 paper, a large-scale study of Transfer Learning for NLP. The paper takes apart many different factors of the Pre-Training then Fine-Tuning pipeline: the choice of pre-training objective (Auto-Regressive Language Modeling vs. BERT-style Masked Language Modeling vs. XLNet-style shuffling), the composition and size of the pre-training dataset, and how best to use additional computation. Thanks for watching, and please check out Machine Learning Street Talk, where Tim Scarfe, Yannic Kilcher, and I discuss this paper!
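As a concrete illustration of the unified text-to-text framing and the BERT-style span-corruption objective discussed in the video, here is a minimal sketch. It assumes the Hugging Face transformers library and the public "t5-small" checkpoint, neither of which is mentioned above; treat it as an illustrative example rather than the paper's own code.

```python
# Minimal sketch of T5's text-to-text framing (assumes: Hugging Face
# transformers and the public "t5-small" checkpoint).
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# 1) Every task is cast as text-to-text: a task prefix selects the task.
inputs = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# 2) BERT-style span corruption (the objective the paper ends up favoring):
# corrupted spans are replaced by sentinel tokens in the input, and the
# target reconstructs only the dropped-out spans.
input_ids = tokenizer(
    "The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt"
).input_ids
labels = tokenizer(
    "<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt"
).input_ids
loss = model(input_ids=input_ids, labels=labels).loss
print(loss.item())
```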

Machine Learning Street Talk: https://www.youtube.com/channel/UCMLtBahI5DMrt0NPvDSoIRQ

Paper Links:
T5: https://arxiv.org/abs/1910.10683
Google AI Blog Post on T5: https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html
Train Large, Then Compress: https://arxiv.org/pdf/2002.11794.pdf
Scaling Laws for Neural Language Models: https://arxiv.org/pdf/2001.08361.pdf
The Illustrated Transformer: http://jalammar.github.io/illustrated-transformer/
ELECTRA: https://arxiv.org/pdf/2003.10555.pdf
Transformer-XL: https://arxiv.org/pdf/1901.02860.pdf
Reformer: The Efficient Transformer: https://openreview.net/pdf?id=rkgNKkHtvB
The Evolved Transformer: https://arxiv.org/pdf/1901.11117.pdf
DistilBERT: https://arxiv.org/pdf/1910.01108.pdf
How to generate text (HIGHLY RECOMMEND): https://huggingface.co/blog/how-to-generate
Tokenizers: https://blog.floydhub.com/tokenization-nlp/

Thanks for watching! Please Subscribe!

Video from the Connor Shorten channel.