
HRM-Text: Efficient Pretraining Beyond Scaling

Paper: HRM-Text: Efficient Pretraining Beyond Scaling (2605.20613)
Published: 20 May 2026.

Learn more on Emergent Mind: https://www.emergentmind.com/papers/2605.20613
arXiv: https://arxiv.org/abs/2605.20613
Sign up for our free trending papers email digest: https://www.emergentmind.com/subscribe
Follow us on X: https://x.com/EmergentMind
Join our Discord: https://discord.gg/BhfTC4mTXq

This presentation explores HRM-Text, an approach to language model pretraining that achieves performance competitive with models 2 to 7 times its size while using up to 432 times less compute and 900 times fewer training tokens. HRM-Text combines a dual-timescale recurrent architecture, inspired by biological multi-timescale processing, with instruction-response training objectives and novel stabilization techniques, demonstrating that brute-force scaling is not the only path to capable language models. We examine the architectural innovations, training methodology, empirical results, and implications for democratizing large language model research.

Video: HRM-Text: Efficient Pretraining Beyond Scaling, from the Emergent Mind channel