
The latest LLM research shows how they are getting SMARTER and FASTER.

System Design Course at InterviewReady: https://interviewready.io/

This is how LLMs are scaling up their test-time compute to deliver better results.
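For a rough intuition of what "scaling test-time compute" means, here is a minimal Python sketch of best-of-N sampling with majority voting (self-consistency). It is a generic illustration, not code from any of the referenced papers; model_generate and the stub model are hypothetical placeholders.

from collections import Counter
from typing import Callable, List

def best_of_n(prompt: str,
              model_generate: Callable[[str, float], str],
              n: int = 16,
              temperature: float = 0.8) -> str:
    """Sample n candidate answers and return the most frequent one.

    Spending more compute at inference time (a larger n) tends to improve
    accuracy without changing the model's weights.
    """
    candidates: List[str] = [model_generate(prompt, temperature) for _ in range(n)]
    answer, _count = Counter(candidates).most_common(1)[0]
    return answer

if __name__ == "__main__":
    # Stub "model" that always returns the same answer, just to show the call shape.
    stub = lambda prompt, temp: "42"
    print(best_of_n("What is 6 * 7?", stub, n=8))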

00:00 Agenda
00:20 Scaling Law 1.0
01:48 Using smaller weights
03:44 Increasing model training
06:05 Google Titans
07:40 What's after transformers?
08:45 Neuromorphic Computing

References:
1.58-bit models:
ByteDance - https://chenglin-yang.github.io/1.58bit.flux.github.io/
Microsoft - https://arxiv.org/pdf/2402.17764
1B model outperforms 405B: https://arxiv.org/abs/2502.06703
Sakana AI Transformer² (Transformer-Squared): https://arxiv.org/pdf/2501.06252
Google Titans: https://arxiv.org/pdf/2501.00663
FlashAttention: https://arxiv.org/pdf/2407.08608
Memristor: https://www.nature.com/articles/s41586-024-07902-2

#AI #LLM #Scaling
