The latest LLM research shows how they are getting SMARTER and FASTER.
System Design Course at InterviewReady: https://interviewready.io/
This is how LLMs are scaling up their test-time compute to deliver better results.
00:00 Agenda
00:20 Scaling Law 1.0
01:48 Using smaller weights
03:44 Increasing model training
06:05 Google Titans
07:40 What's after transformers?
08:45 Neuromorphic Computing
References:
1.58-bit models:
ByteDance - https://chenglin-yang.github.io/1.58bit.flux.github.io/
Microsoft - https://arxiv.org/pdf/2402.17764
1B model outperforms 405B: https://arxiv.org/abs/2502.06703
Sakana AI Transformer²: https://arxiv.org/pdf/2501.06252
Google Titans: https://arxiv.org/pdf/2501.00663
FlashAttention-3: https://arxiv.org/pdf/2407.08608
Memristor: https://www.nature.com/articles/s41586-024-07902-2
#AI #LLM #Scaling
Video "The latest LLM research shows how they are getting SMARTER and FASTER." from the channel Gaurav Sen
Video information
February 26, 2025, 1:05:20
00:12:19