A brief history of the Transformer architecture in NLP
🏛️ The Transformer architecture has revolutionized Natural Language Processing, beating the state of the art on a remarkable number of tasks! Check out this video for a brief history of the Transformer's development.
Related video: How do we check if a neural network has learned a specific phenomenon? https://youtu.be/fL22NAtMNYo
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to boost our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Paper links in order of appearance:
* 00:29 ImageNet challenge SOTA -- https://paperswithcode.com/sota/image-classification-on-imagenet
* 00:58 Word2Vec -- Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in neural information processing systems. 2013. https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
* 03:29 The Transformer -- Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems. 2017. https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
* 04:30 Translating Programming Languages -- Lachaux, Marie-Anne, et al. "Unsupervised Translation of Programming Languages." arXiv preprint arXiv:2006.03511 (2020). https://arxiv.org/pdf/2006.03511.pdf
* 04:32 Symbolic Mathematics -- Lample, Guillaume, and François Charton. "Deep learning for symbolic mathematics." arXiv preprint arXiv:1912.01412 (2019). https://arxiv.org/pdf/1912.01412.pdf
* 04:37 Transformer Demo from Huggingface -- https://transformer.huggingface.co/
* 04:52 BERT -- Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018). https://arxiv.org/pdf/1810.04805.pdf
* 06:29 Image Transformer -- Parmar, Niki, et al. "Image transformer." arXiv preprint arXiv:1802.05751 (2018). https://arxiv.org/pdf/1802.05751.pdf
🔗 Links:
YouTube: https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA/
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research #TransformerinML
Video "A brief history of the Transformer architecture in NLP" from the channel AI Coffee Break with Letitia
Video information
Published: June 12, 2020, 18:30:03
Duration: 00:08:23