The 60-Year Hunt for AI's Most Important Function
Every modern AI model relies on activation functions to learn complex, nonlinear relationships.
But which activation functions work, and why?
In this video, join me to trace their full evolution.
00:00 Activation function intro
00:26 Feedforward networks in a Transformer
01:55 Heaviside step function
02:26 Sigmoid: differentiable and bounded
02:53 Sigmoid challenge 1: Constrained gradient updates
05:54 Sigmoid challenge 2: Vanishing gradient problem
07:49 Tanh: Zero-centering the signal
09:29 ReLU: the breakthrough
10:29 The "dying ReLU" problem
11:28 Leaky ReLU and Parametric ReLU
12:53 New perspective: content and gate
14:15 GELU and Swish
16:14 Gated Linear Unit: GLU, ReGLU, GEGLU, and SwiGLU
20:20 Adjusting hidden dimension for GLU-based FFN
21:55 Squared ReLU: Sparse and efficient
23:04 Implementation review
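
For readers who want the functions named in the chapters in one place, here is a minimal scalar sketch in plain Python. It uses the standard textbook definitions, not code taken from the video. The helper names (`gated`, `glu_hidden_dim`), the leaky slope of 0.01, and the 2/3 hidden-size scaling (a common convention for keeping a three-matrix GLU feedforward block at roughly the parameter count of a two-matrix one) are assumptions made for illustration.

```python
import math

def heaviside(x):
    # Step function: not differentiable at 0, zero gradient elsewhere.
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    # Numerically stable logistic function, bounded in (0, 1).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def tanh(x):
    # Zero-centered, bounded in (-1, 1).
    return math.tanh(x)

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    # A small negative slope keeps a nonzero gradient for x < 0.
    # Parametric ReLU has the same form with a learned slope.
    return x if x > 0 else slope * x

def gelu(x):
    # Exact GELU: x times the standard normal CDF of x.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def swish(x, beta=1.0):
    # Swish: x gated by a sigmoid of itself (beta = 1 is often called SiLU).
    return x * sigmoid(beta * x)

def squared_relu(x):
    return relu(x) ** 2

def gated(gate_pre, content_pre, act):
    # Scalar view of a gated linear unit: act(gate) * content.
    # act = sigmoid -> GLU, relu -> ReGLU, gelu -> GEGLU, swish -> SwiGLU.
    return act(gate_pre) * content_pre

def glu_hidden_dim(d_ff):
    # Assumed convention: shrink the hidden size to 2/3 so that three
    # weight matrices cost about as much as the original two.
    return (2 * d_ff) // 3
```

As a quick check of the hidden-size convention: with a model width of 512 and a standard hidden size of 2048, the two-matrix block has 2 × 512 × 2048 = 2,097,152 weights, while a gated block with `glu_hidden_dim(2048)` = 1365 hidden units has 3 × 512 × 1365 = 2,096,640.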
Video made with Manim: https://www.manim.community/
Video "The 60-Year Hunt for AI's Most Important Function" from the channel Jia-Bin Huang
Video information
May 15, 2026, 7:37:13
00:26:26