ELECTRA: Pre-Training Text Encoders as Discriminators Rather than Generators
This video explains Replaced Token Detection, the new pre-training objective introduced in ELECTRA. ELECTRA is much more compute-efficient than masked language modeling because its loss is defined over every token in the input sequence rather than only a masked subset, and because the discriminator is never shown the [MASK] token during the self-supervised learning task. ELECTRA-small, trained on 1 GPU for 4 days, outperforms GPT, which was trained with 30x more compute. ELECTRA performs on par with RoBERTa and XLNet using 1/4 of their compute and surpasses those models at the same level of compute!
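To make the objective concrete, here is a minimal sketch of Replaced Token Detection in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the helper names (`corrupt`, `rtd_labels`, `rtd_loss`) are hypothetical, the "generator" is a uniform random sampler standing in for a small masked language model, and the discriminator logits are supplied by hand rather than produced by a Transformer.

```python
import math
import random


def corrupt(tokens, vocab, replace_rate=0.15, seed=0):
    """Swap a random subset of tokens for tokens drawn from `vocab`.

    Assumption: a uniform sampler stands in for the generator network.
    Returns the corrupted sequence and the positions that were resampled.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    n_replace = max(1, round(replace_rate * len(tokens)))
    positions = rng.sample(range(len(tokens)), n_replace)
    for i in positions:
        corrupted[i] = rng.choice(vocab)
    return corrupted, positions


def rtd_labels(original, corrupted):
    """Label every position: 1 if the token was replaced, 0 if it is original.

    A sampled token that happens to equal the original counts as original.
    """
    return [int(o != c) for o, c in zip(original, corrupted)]


def rtd_loss(logits, labels):
    """Mean binary cross-entropy over ALL positions, not just the corrupted ones.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(logits, labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(labels)


if __name__ == "__main__":
    original = ["the", "chef", "cooked", "the", "meal"]
    corrupted = ["the", "chef", "ate", "the", "meal"]
    labels = rtd_labels(original, corrupted)
    # Hand-written logits: positive means "this token looks replaced".
    logits = [-2.0, -2.0, 3.0, -2.0, -2.0]
    print(labels, rtd_loss(logits, labels))
```

Every position contributes a term to `rtd_loss`, which is the contrast with masked language modeling, where only the masked positions (typically about 15% of the tokens) produce a training signal.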
Thanks for watching! Please Subscribe!
Paper Links:
ELECTRA: https://openreview.net/pdf?id=r1xMH1BtvB
BERT: https://arxiv.org/abs/1810.04805
Video "ELECTRA: Pre-Training Text Encoders as Discriminators Rather than Generators" from the Connor Shorten channel