
An image is worth 16x16 words: ViT | Is this the extinction of CNNs? Long live the Transformer?

Mom, it's the Transformers again! They have come to ruin my CNN building blocks! 🥺 An Image is Worth 16x16 Words: paper explained.

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to boost our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

📺 Ms. Coffee Bean explains the TRANSFORMER: https://youtu.be/FWFA4DGuzSc
📺 Ms. Coffee Bean on the Multimodal Transformer: https://youtu.be/dd7nE4nbxN0

Outline:
* 00:00 Pure Transformer for vision
* 01:17 How does it work?
* 03:58 The CNN Armageddon?

📄 Paper (not anonymous anymore): "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby

📚 Check out this wonderful post by @JacobGildenblat : https://jacobgil.github.io/deeplearning/vision-transformer-explainability

-----------------------------------
🔗 Links:
YouTube: https://www.youtube.com/AICoffeeBreak
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/

#AICoffeeBreak #MsCoffeeBean #ComputerVision #ICLR2021 #MachineLearning #AI #research

Video contains emojis designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0

Video "An image is worth 16x16 words: ViT | Is this the extinction of CNNs? Long live the Transformer?" from the channel AI Coffee Break with Letitia
Video information
October 8, 2020, 18:59:58
00:05:26