ScheduleFree+: Learning-Rate-Free LLM Training
In this AI Research Roundup episode, Alex discusses the paper "ScheduleFree+: Scaling Learning-Rate-Free & Schedule-Free Learning to Large Language Models".

Schedule-Free Learning has proven to be a practical anytime training method on standard benchmarks, but for large language models (LLMs) its effectiveness had previously been shown only at small scales. In this paper, researchers from Meta's FAIR Super-Intelligence Labs introduce ScheduleFree+, which scales learning-rate-free and schedule-free learning to larger batch sizes and model sizes. According to the paper, the method significantly outperforms traditional Warmup-Stable-Decay (WSD) schedules, especially in long-duration pretraining: at 1000 tokens per parameter, ScheduleFree+ outperforms state-of-the-art schedules by 31%. The approach also provides a theoretical foundation for model averaging and checkpoint merging during pretraining.

Paper URL: https://arxiv.org/pdf/2605.19095

#AI #MachineLearning #DeepLearning #LLMs #Optimization #MetaFAIR
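For context on the baseline mentioned above, here is a minimal sketch of a generic Warmup-Stable-Decay learning-rate schedule. It is not taken from the paper: the function name `wsd_lr`, the linear warmup and decay shapes, and all default values are assumptions chosen for illustration. Note that the schedule needs `total_steps` up front, which is the dependency that schedule-free methods aim to remove.

```python
def wsd_lr(step, total_steps, peak_lr=3e-4, warmup_steps=100,
           decay_steps=200, min_lr=0.0):
    """Hypothetical Warmup-Stable-Decay schedule (illustrative only).

    Linear warmup to peak_lr, a constant plateau, then a linear decay
    to min_lr over the last decay_steps of training.
    """
    # Warmup phase: ramp linearly up to peak_lr.
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps

    # Stable phase: hold the peak learning rate.
    decay_start = total_steps - decay_steps
    if step < decay_start:
        return peak_lr

    # Decay phase: interpolate linearly from peak_lr down to min_lr.
    progress = min((step - decay_start) / decay_steps, 1.0)
    return peak_lr + (min_lr - peak_lr) * progress


if __name__ == "__main__":
    for s in (0, 99, 500, 900, 1000):
        print(s, wsd_lr(s, total_steps=1000))
```

Changing the training length means moving the decay phase, so a run tuned for one budget cannot simply be extended.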
Video "ScheduleFree+: Learning-Rate-Free LLM Training" from the AI Research Roundup channel
Video information
1 hr 59 min ago
00:04:19