Scaled Dot-Product Attention by hand ✍️

I explained attention in my recent AI seminar as follows:

Step 1 — Compare
We take each token as a query and compare it against every key using a dot product. Doing this for all queries gives a grid of similarity scores: every query against every key.
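As a rough illustration of the comparison step, the sketch below builds the score grid for a toy example. The matrices `Q` and `K` and their values are made up for this illustration and are not taken from the seminar spreadsheet.

```python
# Toy example: 3 tokens, query/key dimension 2 (all values are assumptions).
Q = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
K = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# scores[i][j] is the similarity of query i with key j.
scores = [[dot(q, k) for k in K] for q in Q]

for row in scores:
    print(row)
```

Each printed row corresponds to one query, and each column to one key, which is the same layout as a grid of cells in a spreadsheet.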

Step 2 — Scale
We divide those scores by √dₖ, the square root of the key dimension, so that their magnitude does not grow with dₖ and push the softmax into saturation.
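A minimal sketch of the scaling step, assuming a key dimension of 4 and a small made-up score grid (both are illustrative choices, not values from the video):

```python
import math

d_k = 4  # assumed key dimension for this toy example
scores = [[4.0, 0.0, 2.0],
          [0.0, 4.0, 2.0]]  # made-up raw dot-product scores

scale = math.sqrt(d_k)

# Divide every score by sqrt(d_k); the grid keeps its shape.
scaled = [[s / scale for s in row] for row in scores]

for row in scaled:
    print(row)
```

The ordering of the scores within each row is unchanged; only their spread shrinks, which makes the softmax in the next step less extreme.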

Step 3 — Normalize
We apply softmax to each row to turn the raw scores into a probability distribution: values between 0 and 1 that sum to 1. These are the attention weights.
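The normalization can be sketched row by row as follows. The helper name `softmax` and the input rows are assumptions for illustration. Subtracting the row maximum before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import math


def softmax(row):
    """Turn one row of scores into weights that sum to 1."""
    m = max(row)  # shift by the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


scaled = [[2.0, 0.0, 1.0],
          [0.0, 0.0, 0.0]]  # made-up scaled scores

weights = [softmax(row) for row in scaled]

for row in weights:
    print([round(w, 3) for w in row], "sum =", round(sum(row), 6))
```

A row of equal scores yields equal weights, while a row with one larger score concentrates most of the weight on that position.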

Step 4 — Combine
We use those weights to compute a weighted sum of the values. Each output row becomes a weighted combination of the value vectors (only those of the current and previous tokens if a causal mask is applied): maybe 37% from one, 23% from another, and a little from the rest.
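To make the weighted sum concrete, here is a small sketch for a single output row. The weights reuse the 37% and 23% figures from the text, with the remaining 40% split evenly as an assumption. The value vectors in `V` are made up.

```python
# Attention weights for one query; they must sum to 1.
weights = [0.37, 0.23, 0.20, 0.20]

# One value vector per token (values are assumptions for illustration).
V = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0],
     [0.0, 0.0]]

d_v = len(V[0])

# Output component c is the weighted sum of column c of V.
output = [sum(w * v[c] for w, v in zip(weights, V)) for c in range(d_v)]

print([round(x, 2) for x in output])
```

The first component collects 0.37 from the first token and 0.20 from the third, so it comes out near 0.57; the second component comes out near 0.43.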

No black box.
Just dot products, scaling, softmax, and weighted sums, built one row at a time in Excel.
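The four steps can also be put together in a few lines of plain Python, processing one query row at a time much like filling in spreadsheet rows. This is a sketch under stated assumptions, not the seminar's spreadsheet: the function names `softmax` and `attention` and the toy matrices are hypothetical, and no causal mask is applied.

```python
import math


def softmax(row):
    """Normalize one row of scores into weights that sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Unmasked scaled dot-product attention, one query row at a time."""
    scale = math.sqrt(len(K[0]))  # sqrt(d_k)
    d_v = len(V[0])
    output = []
    for q in Q:
        # Step 1: compare the query with every key.
        scores = [sum(a * b for a, b in zip(q, k)) for k in K]
        # Step 2: scale by sqrt(d_k).
        scaled = [s / scale for s in scores]
        # Step 3: normalize into attention weights.
        weights = softmax(scaled)
        # Step 4: combine the value vectors using those weights.
        output.append(
            [sum(w * v[c] for w, v in zip(weights, V)) for c in range(d_v)]
        )
    return output


# Toy inputs (made up for illustration).
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]

out = attention(Q, K, V)
for row in out:
    print([round(x, 4) for x in row])
```

With these inputs each query matches one key more strongly than the other, so each output row leans toward the corresponding value vector without ignoring the other one.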

Video: Scaled Dot-Product Attention by hand ✍️, from the channel AI by Hand

Video information

January 4, 2026, 7:27:07

00:00:30
