
Transformer Model (1/2): Attention Layers

Next Video: https://youtu.be/J4H6A4-dvhE

Transformer models are state-of-the-art language models. They are based on attention and dense layers, with no RNNs. Instead of studying every module of the Transformer at once, let us try to build a Transformer model from scratch. In this lecture, we eliminate RNNs while keeping attention. We will build an attention layer and a self-attention layer. In the next lecture, we use attention, self-attention, and dense layers to build the deep neural network known as the Transformer.
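
To make the two layers concrete, here is a minimal NumPy sketch of scaled dot-product attention and a self-attention layer built on it. The names (attention, SelfAttention, W_q, W_k, W_v), shapes, and random initialization are illustrative assumptions, not the lecture's exact code; in a cross-attention layer the queries would come from the decoder while the keys and values come from the encoder, whereas self-attention derives all three from the same sequence.

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    # (Vaswani et al., 2017).
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (m, n) alignment scores
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V                   # one context vector per query

class SelfAttention:
    # Self-attention: queries, keys, and values are all learned
    # projections (W_q, W_k, W_v) of the same input sequence.
    def __init__(self, d_in, d_k, d_v, seed=0):
        rng = np.random.default_rng(seed)
        self.W_q = rng.normal(scale=d_in ** -0.5, size=(d_in, d_k))
        self.W_k = rng.normal(scale=d_in ** -0.5, size=(d_in, d_k))
        self.W_v = rng.normal(scale=d_in ** -0.5, size=(d_in, d_v))

    def __call__(self, X):
        # X: (seq_len, d_in) -> context vectors: (seq_len, d_v)
        return attention(X @ self.W_q, X @ self.W_k, X @ self.W_v)

# Usage: 5 input vectors of dimension 8 -> 5 context vectors of dimension 4.
X = np.random.default_rng(1).normal(size=(5, 8))
print(SelfAttention(d_in=8, d_k=16, d_v=4)(X).shape)  # (5, 4)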

Slides: https://github.com/wangshusen/DeepLearning

Reference:
Vaswani et al. Attention Is All You Need. In NIPS, 2017.

Video "Transformer Model (1/2): Attention Layers" from the Shusen Wang channel
Video info
Uploaded: April 17, 2021, 4:59:50
Duration: 00:32:59