Strategies for pre-training the BERT-based Transformer architecture – language (and vision)
What is masked language modelling? What is next sentence prediction? And why do they work so well? If you have ever wondered which tasks Transformer architectures are pre-trained on, and how the Multimodal Transformer learns the connection between images and text, then this is the right video for you!
🎬 Ms. Coffee Bean explained the Multimodal Transformer: https://youtu.be/dd7nE4nbxN0
🎬 She also explained the Language-based Transformer: https://youtu.be/FWFA4DGuzSc
Content:
* 00:00 Pre-training strategies
* 00:48 Masked language modelling
* 03:37 Next sentence prediction
* 04:31 Sentence image alignment
* 05:07 Image region classification
* 06:14 Image region regression
* 06:53 Pre-training and fine-tuning on the downstream task
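The masked language modelling objective (00:48) can be sketched in a few lines of plain Python: pick roughly 15% of the tokens as prediction targets, and of those, replace 80% with a [MASK] token, 10% with a random token, and leave 10% unchanged. This is a minimal, hypothetical sketch of the data preparation only, not the actual BERT training code:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """BERT-style masking: select ~15% of positions as prediction targets.
    Of the selected tokens, 80% become [MASK], 10% are replaced by a
    random token, 10% stay unchanged. Returns (masked_tokens, targets),
    where targets maps position -> original token to be predicted."""
    rng = rng or random.Random()
    masked = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok  # the model must recover the original token here
            roll = rng.random()
            if roll < 0.8:
                masked[i] = "[MASK]"
            elif roll < 0.9:
                masked[i] = rng.choice(tokens)  # random replacement
            # else: keep the original token (but it is still a target)
    return masked, targets

tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens, rng=random.Random(42))
```

The 80/10/10 split keeps the model from only ever seeing [MASK] at prediction positions, so its representations stay useful at fine-tuning time, when no [MASK] tokens appear.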
📄 This video has been enabled by the beautiful overview table in the Appendix of this paper:
VL-BERT: Su, Weijie, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, and Jifeng Dai. "VL-BERT: Pre-training of Generic Visual-Linguistic Representations." arXiv preprint arXiv:1908.08530 (2019). https://arxiv.org/pdf/1908.08530.pdf
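The next sentence prediction objective (03:37) is likewise mostly a data-construction trick: for each sentence in the corpus, the following sentence is kept 50% of the time (label IsNext) and swapped for a random one otherwise (label NotNext). A minimal, hypothetical sketch of that pair construction, not the actual BERT pipeline:

```python
import random

def make_nsp_pairs(sentences, rng=None):
    """Build next-sentence-prediction pairs: with probability 0.5 the
    second sentence truly follows the first (IsNext), otherwise a random
    sentence from the corpus is substituted (NotNext)."""
    rng = rng or random.Random()
    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            pairs.append((sentences[i], sentences[i + 1], "IsNext"))
        else:
            pairs.append((sentences[i], rng.choice(sentences), "NotNext"))
    return pairs
```

The model then classifies each pair from the [CLS] representation; the multimodal variants in the video (sentence–image alignment, 04:31) follow the same pattern with an image region taking the place of the second sentence.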
🔗 Links:
YouTube: https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA/
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
#AICoffeeBreak #MsCoffeeBean
Video and thumbnail contain emojis designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0
Video "Strategies for pre-training the BERT-based Transformer architecture – language (and vision)" from the channel AI Coffee Break with Letitia
Video information
July 15, 2020, 22:58:33
Duration: 00:08:23