Transformer Model (1/2): Attention Layers
Next Video: https://youtu.be/J4H6A4-dvhE
The Transformer is a state-of-the-art language model architecture. It is built from attention and dense layers, without any RNN. Instead of studying every module of the Transformer at once, let us build the model from scratch. In this lecture, we eliminate the RNN while keeping attention, which yields an attention layer and a self-attention layer. In the next lecture, we combine attention, self-attention, and dense layers to build the deep neural network known as the Transformer.
Slides: https://github.com/wangshusen/DeepLearning
Reference:
Vaswani et al. Attention Is All You Need. In NIPS, 2017.
Video "Transformer Model (1/2): Attention Layers" from the Shusen Wang channel.
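For readers who want the idea in code before watching: below is a minimal NumPy sketch of the two layers the lecture derives. It follows the scaled dot-product formulation of Vaswani et al.; the parameter names (W_q, W_k, W_v) and the 1/sqrt(d_k) scaling are taken from the paper and may differ from the lecture's exact notation.

```python
# Minimal sketch (not the lecture's code) of an attention layer and a
# self-attention layer, following Vaswani et al. (2017).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X_dec, X_enc, W_q, W_k, W_v):
    """Encoder-decoder attention: queries come from the decoder inputs,
    keys and values from the encoder inputs."""
    Q = X_dec @ W_q                      # (t, d_k) queries
    K = X_enc @ W_k                      # (m, d_k) keys
    V = X_enc @ W_v                      # (m, d_v) values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (t, m) scaled dot products
    weights = softmax(scores, axis=-1)   # each row is a distribution over the m keys
    return weights @ V                   # (t, d_v) context vectors

def self_attention_layer(X, W_q, W_k, W_v):
    """Self-attention: queries, keys, and values all come from one sequence."""
    return attention_layer(X, X, W_q, W_k, W_v)

# Usage: a toy sequence of m=4 tokens with d=8 features per token.
rng = np.random.default_rng(0)
m, d, d_k = 4, 8, 8
X = rng.normal(size=(m, d))
W_q, W_k, W_v = (rng.normal(size=(d, d_k)) for _ in range(3))
print(self_attention_layer(X, W_q, W_k, W_v).shape)  # (4, 8)
```

The only difference between the two layers is where the queries come from: in the attention layer they come from the decoder sequence, while in self-attention the queries, keys, and values are all computed from the same sequence.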