How On-Policy Distillation Changes LLM Weights
In this AI Research Roundup episode, Alex discusses the paper 'Dense Supervision, Sparse Updates: On the Sparsity and Geometry of On-Policy Distillation'. On-policy distillation (OPD) is a popular post-training method that combines on-policy student trajectories with dense teacher supervision, but its effects on model parameters have remained poorly understood. The paper analyzes several language and vision-language model pairs and finds that OPD updates are surprisingly small, coordinate-sparse, and concentrated within the Feed-Forward Network (FFN) modules. The researchers show that training only this discovered sparse subnetwork recovers almost all of the full-training performance. The study also reports that these updates are spectrally concentrated and fall primarily on coordinates where the source weights are close to zero, meaning OPD retains unique geometric signatures of on-policy post-training. Finally, the authors find that adaptive optimization such as AdamW remains crucial compared with SGD, as dense teacher supervision preserves essential momentum and scale structures.
Paper URL: https://arxiv.org/pdf/2606.13657
#AI #MachineLearning #DeepLearning #LLM #ModelDistillation #PostTraining #Optimization
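As a rough illustration of what "coordinate-sparse" updates and "training only the sparse subnetwork" could mean in practice, here is a minimal pure-Python sketch. It is not the paper's code: the function names (`update_sparsity`, `update_mask`, `masked_step`), the tolerance `tol`, the parameter names `ffn.up` and `attn.q`, and the toy weights are all hypothetical, chosen only to mirror the description above (changes landing in an FFN-like tensor, on coordinates that start near zero).

```python
from typing import Dict, List


def update_sparsity(before: List[float], after: List[float], tol: float = 1e-6) -> float:
    """Fraction of coordinates whose value changed by at most `tol`."""
    if len(before) != len(after):
        raise ValueError("weight vectors must have the same length")
    if not before:
        return 0.0
    unchanged = sum(1 for b, a in zip(before, after) if abs(a - b) <= tol)
    return unchanged / len(before)


def update_mask(before: List[float], after: List[float], tol: float = 1e-6) -> List[bool]:
    """True where a coordinate moved by more than `tol` (the 'sparse subnetwork')."""
    return [abs(a - b) > tol for b, a in zip(before, after)]


def masked_step(weights: List[float], grads: List[float],
                mask: List[bool], lr: float) -> List[float]:
    """One plain gradient step applied only to masked coordinates.

    Plain gradient descent is used here for brevity; it is an assumption of
    this sketch, not the optimizer setup discussed in the paper.
    """
    return [w - lr * g if m else w for w, g, m in zip(weights, grads, mask)]


# Toy, made-up weights (not data from the paper): only the FFN-like tensor
# changes, and only on coordinates that start at zero.
base: Dict[str, List[float]] = {
    "ffn.up": [0.0, 0.5, 0.0, -0.2],
    "attn.q": [0.3, -0.1, 0.7, 0.2],
}
tuned: Dict[str, List[float]] = {
    "ffn.up": [0.01, 0.5, -0.02, -0.2],
    "attn.q": [0.3, -0.1, 0.7, 0.2],
}

# Per-tensor fraction of untouched coordinates.
report = {name: update_sparsity(base[name], tuned[name]) for name in base}

# Mask of coordinates that moved, then a step restricted to that mask.
ffn_mask = update_mask(base["ffn.up"], tuned["ffn.up"])
stepped = masked_step(base["ffn.up"], [1.0, 1.0, 1.0, 1.0], ffn_mask, lr=0.1)
```

With real checkpoints, the same comparison would be run tensor by tensor over the base and distilled weights, and the resulting masks would decide which coordinates are allowed to update during retraining.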
Video "How On-Policy Distillation Changes LLM Weights" from the channel AI Research Roundup
Video information
June 15, 2026, 11:15:15
00:04:41