Vision Transformer Quick Guide - Theory and Code in (almost) 15 min
▬▬ Papers / Resources ▬▬▬
Colab Notebook: https://colab.research.google.com/drive/1P9TPRWsDdqJC6IvOxjG2_3QlgCt59P0w?usp=sharing
ViT paper: https://arxiv.org/abs/2010.11929
Best Transformer intro: https://jalammar.github.io/illustrated-transformer/
CNNs vs ViT: https://arxiv.org/abs/2108.08810
CNNs vs ViT Blog: https://towardsdatascience.com/do-vision-transformers-see-like-convolutional-neural-networks-paper-explained-91b4bd5185c8
Swin Transformer: https://arxiv.org/abs/2103.14030
DeiT: https://arxiv.org/abs/2012.12877
▬▬ Support me if you like 🌟
►Link to this channel: https://bit.ly/3zEqL1W
►Support me on Patreon: https://bit.ly/2Wed242
►Buy me a coffee on Ko-Fi: https://bit.ly/3kJYEdl
►E-Mail: deepfindr@gmail.com
▬▬ Used Music ▬▬▬▬▬▬▬▬▬▬▬
Music from #Uppbeat (free for Creators!):
https://uppbeat.io/t/92elm/jasmine
License code: SMTWRWLNGHZHH0OC
▬▬ Used Icons ▬▬▬▬▬▬▬▬▬▬
All Icons are from flaticon: https://www.flaticon.com/authors/freepik
▬▬ Timestamps ▬▬▬▬▬▬▬▬▬▬▬
00:00 Introduction
00:16 ViT Intro
01:12 Input embeddings
01:50 Image patching
02:54 Einops reshaping
04:13 [CODE] Patching
05:35 CLS Token
06:40 Positional Embeddings
08:09 Transformer Encoder
08:30 Multi-head attention
08:50 [CODE] Multi-head attention
09:12 Layer Norm
09:30 [CODE] Layer Norm
09:55 Feed Forward Head
10:05 Feed Forward Head
10:21 Residuals
10:45 [CODE] final ViT
13:10 CNN vs. ViT
14:45 ViT Variants
▬▬ My equipment 💻
- Microphone: https://amzn.to/3DVqB8H
- Microphone mount: https://amzn.to/3BWUcOJ
- Monitors: https://amzn.to/3G2Jjgr
- Monitor mount: https://amzn.to/3AWGIAY
- Height-adjustable table: https://amzn.to/3aUysXC
- Ergonomic chair: https://amzn.to/3phQg7r
- PC case: https://amzn.to/3jdlI2Y
- GPU: https://amzn.to/3AWyzwy
- Keyboard: https://amzn.to/2XskWHP
- Bluelight filter glasses: https://amzn.to/3pj0fK2
Video: Vision Transformer Quick Guide - Theory and Code in (almost) 15 min, from the DeepFindr channel