
MAMBA and State Space Models explained | SSM explained

We explain and illustrate Mamba, State Space Models (SSMs) and Selective SSMs in simple terms.
SSMs match the performance of Transformers while being faster and more memory-efficient. This is crucial for long sequences!

AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/ To celebrate our merch launch, here is a limited-time offer! 👉 Get a 25% discount on AI Coffee Break Merch with the code MAMBABEAN.

This video also comes in blog post format: 👉 https://open.substack.com/pub/aicoffeebreakwl/p/mamba-and-ssms-explained?r=r8s20&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Dres. Trost GbR, Siltax, Vignesh Valliappan, Michael

Outline:
00:00 Mamba to replace Transformers!?
02:04 State Space Models (SSMs) – high level
03:09 State Space Models (SSMs) – more detail
05:45 Discretization step in SSMs
08:14 SSMs are fast! Here is why.
09:55 SSM training: Convolution trick
12:01 Selective SSMs
15:44 MAMBA Architecture
17:57 Mamba results
20:15 Building on Mamba
21:00 Do RNNs have a comeback?
21:42 AICoffeeBreak Merch

📄 Gu, Albert, and Tri Dao. "Mamba: Linear-time sequence modeling with selective state spaces." arXiv preprint arXiv:2312.00752 (2023). https://arxiv.org/abs/2312.00752
📄 MoE-Mamba https://arxiv.org/abs/2401.04081
📄 Vision Mamba https://arxiv.org/abs/2401.09417
📄 MambaByte https://arxiv.org/abs/2401.13660
🕊️ Mamba rejected from ICLR: https://twitter.com/srush_nlp/status/1750526956452577486
📖 Prefix sum (Scan) with Cuda: https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-39-parallel-prefix-sum-scan-cuda
📺 Transformer explained: https://www.youtube.com/playlist?list=PLpZBeKTZRGPNdymdEsSSSod5YQ3Vu0sKY

Great resources to learn about Mamba:
📙 Mamba: https://jameschen.io/jekyll/update/2024/02/12/mamba.html
📕 The Annotated S4: https://srush.github.io/annotated-s4/
📘 Mamba The Easy Way: https://jackcook.com/2024/02/23/mamba.html

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
Join this channel to get access to perks:
https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA/join
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Scientific advising by Mara Popescu
Video editing: Nils Trost
Music 🎵 : Sunny Days – Anno Domini Beats

Video "MAMBA and State Space Models explained | SSM explained" from the channel AI Coffee Break with Letitia
Video information
February 17, 2024, 19:22:21
00:22:27