
Self-Attention with Relative Position Representations – Paper explained

We help you wrap your head around relative positional embeddings as they were first introduced in the “Self-Attention with Relative Position Representations” paper.
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

Related videos:
📺 Positional embeddings explained: https://youtu.be/1biZfFLPRSY
📺 Concatenated, learned positional encodings: https://youtu.be/M2ToEXF6Olw
📺 Transformer explained: https://youtu.be/FWFA4DGuzSc

Papers:
📄 Shaw, Peter, Jakob Uszkoreit, and Ashish Vaswani. "Self-Attention with Relative Position Representations." In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pp. 464-468. 2018. https://arxiv.org/pdf/1803.02155.pdf

📄 Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." In Advances in Neural Information Processing Systems, pp. 5998-6008. 2017. https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

💻 Implementation for Relative Position Embeddings: https://github.com/AliHaiderAhmad001/Self-Attention-with-Relative-Position-Representations
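
For a quick feel of the mechanism before watching, here is a minimal single-head sketch of relative-position self-attention in the spirit of Shaw et al. (2018). The class name, shapes, and hyperparameters are illustrative assumptions and are not taken from the linked repository.

```python
# Minimal single-head sketch of relative-position self-attention (Shaw et al., 2018).
# Names, shapes, and the clipping distance are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelativeSelfAttention(nn.Module):
    def __init__(self, d_model: int, max_relative_distance: int = 4):
        super().__init__()
        self.k = max_relative_distance  # relative distances are clipped to [-k, k]
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        self.k_proj = nn.Linear(d_model, d_model, bias=False)
        self.v_proj = nn.Linear(d_model, d_model, bias=False)
        # One learned embedding per clipped relative distance, for keys and for values.
        self.rel_k = nn.Embedding(2 * self.k + 1, d_model)
        self.rel_v = nn.Embedding(2 * self.k + 1, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, n, d = x.shape
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Relative distances j - i, clipped to [-k, k] and shifted to [0, 2k] for indexing.
        pos = torch.arange(n, device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.k, self.k) + self.k
        a_k = self.rel_k(rel)  # (n, n, d): a^K_{ij}
        a_v = self.rel_v(rel)  # (n, n, d): a^V_{ij}

        # e_ij = q_i · (k_j + a^K_{ij}) / sqrt(d)
        scores = torch.einsum('bid,bjd->bij', q, k)
        scores = scores + torch.einsum('bid,ijd->bij', q, a_k)
        attn = F.softmax(scores / d ** 0.5, dim=-1)

        # z_i = sum_j alpha_ij * (v_j + a^V_{ij})
        out = torch.einsum('bij,bjd->bid', attn, v)
        out = out + torch.einsum('bij,ijd->bid', attn, a_v)
        return out

# Usage: RelativeSelfAttention(64)(torch.randn(2, 10, 64)).shape -> (2, 10, 64)
```

The only change versus vanilla self-attention is that a learned embedding of the clipped relative distance j - i is added to each key and value before the usual dot-product and weighted sum, so the attention pattern can depend on relative rather than absolute positions.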

Outline:
00:00 Relative positional representations
02:15 How do they work?
07:59 Benefits of relative vs. absolute positional encodings

Music 🎵 : Holi Day Riddim - Konrad OldMoney
✍️ Arabic subtitles by Ali Haidar Ahmad: https://www.linkedin.com/in/ali-ahmad-0706a51bb/

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video "Self-Attention with Relative Position Representations – Paper explained" from the AI Coffee Break with Letitia channel
Video info
Published: July 31, 2021, 23:15:49
Duration: 00:10:18