Send us Fan Mail (https://www.buzzsprout.com/2207817/fan_mail/new)
ZAYA1-8B: Zyphra's Reasoning MoE Trained Entirely on AMD MI300X That Punches Far Above Its Weight Class - May 8, 2026
Zyphra released ZAYA1-8B this week, a mixture-of-experts reasoning model with fewer than one billion active parameters, pretrained end to end on AMD Instinct MI300X GPUs. It matches DeepSeek R1 on competition mathematics and approaches Claude 4.5 Sonnet under Zyphra's novel Markovian RSA test-time compute. Chris and Laura unpack the architecture innovations (Compressed Convolutional Attention, an MLP-based router, and learned residual scaling), the 14-trillion-token AMD-only training run, and what an Apache 2.0 frontier reasoning model on non-NVIDIA silicon means for the next twelve months of AI procurement.
Hosted by Chris and Laura.
The DX Today Podcast brings you daily deep dives into the most consequential stories in the AI ecosystem.
Send us fan mail: https://dxtoday.com/contact
#AI #ZAYA1 #AMD #ReasoningModels #OpenSourceAI
Video information
Date: May 8, 2026, 15:36:41
Duration: 00:12:37