The Narrated Transformer Language Model
Model quality in AI/ML has improved rapidly over the last few years, and most state-of-the-art models in the field are based on the Transformer architecture. Examples include BERT (which, when applied to Google Search, resulted in what Google calls "one of the biggest leaps forward in the history of Search") and OpenAI's GPT-2 and GPT-3 (which can generate coherent text and essays).
This video, by the author of the popular "Illustrated Transformer" guide, introduces the Transformer architecture and its various applications. It is a visual presentation accessible to people with various levels of ML experience.
Intro (0:00)
The Architecture of the Transformer (4:18)
Model Training (7:11)
Transformer LM Component 1: FFNN (10:01)
Transformer LM Component 2: Self-Attention (12:27)
Tokenization: Words to Token Ids (14:59)
Embedding: Breathe meaning into tokens (19:42)
Projecting the Output: Turning Computation into Language (24:11)
Final Note: Visualizing Probabilities (25:51)
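The chapters above follow a piece of text through the model: words become token ids, ids become embedding vectors, and the final vector is projected back onto the vocabulary as probabilities. Below is a minimal sketch of that path in plain Python. It is not the code from the video or the notebook. The vocabulary, the random embeddings, and all function names are hypothetical, the self-attention and FFNN layers are left out as a placeholder, and reusing the embedding table for the output projection is an assumption made to keep the sketch short.

```python
import math
import random

# Hypothetical toy vocabulary; real models use subword vocabularies
# with tens of thousands of entries.
VOCAB = ["<unk>", "the", "robot", "must", "obey", "orders"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}
EMBED_DIM = 4

random.seed(0)
# One vector per token id; random here, learned in a trained model.
EMBEDDINGS = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB]


def tokenize(text):
    """Words to token ids (whitespace split; unknown words map to <unk>)."""
    return [TOKEN_TO_ID.get(word, 0) for word in text.lower().split()]


def embed(token_ids):
    """Token ids to vectors via table lookup."""
    return [EMBEDDINGS[i] for i in token_ids]


def project_to_vocab(hidden):
    """Score every vocabulary entry against one hidden vector (dot products).

    Assumption: the embedding table doubles as the output projection.
    """
    return [sum(h * e for h, e in zip(hidden, row)) for row in EMBEDDINGS]


def softmax(logits):
    """Turn scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


def next_token_probabilities(text):
    """Run the toy pipeline on non-empty text and return a token -> probability map."""
    ids = tokenize(text)
    vectors = embed(ids)
    # Placeholder: a real model would pass `vectors` through stacked
    # self-attention and FFNN layers here. This sketch uses the last
    # token's embedding unchanged.
    hidden = vectors[-1]
    return dict(zip(VOCAB, softmax(project_to_vocab(hidden))))
```

Calling `next_token_probabilities("The robot must")` returns one probability per vocabulary entry. With random, untrained embeddings the values carry no meaning; the sketch only shows the shape of the computation.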
The Illustrated Transformer:
https://jalammar.github.io/illustrated-transformer/
Simple transformer language model notebook:
https://github.com/jalammar/jalammar.github.io/blob/master/notebooks/Simple_Transformer_Language_Model.ipynb
Philosophers On GPT-3 (updated with replies by GPT-3):
https://dailynous.com/2020/07/30/philosophers-gpt-3/
-----
Twitter: https://twitter.com/JayAlammar
Blog: https://jalammar.github.io/
Mailing List: http://eepurl.com/gl0BHL
More videos by Jay:
Jay's Visual Intro to AI
https://www.youtube.com/watch?v=mSTCzNgDJy4
How GPT-3 Works - Easily Explained with Animations
https://www.youtube.com/watch?v=MQnJZuBGmSQ
Making Money from AI by Predicting Sales - Jay's Intro to AI Part 2
https://www.youtube.com/watch?v=V4-lX...
Video: The Narrated Transformer Language Model, from the Jay Alammar channel