Agent Revolution Won't Crown NVIDIA — Who Actually Wins the Chip Boom

▶ AI Supply Chain playlist (binge the whole series): https://www.youtube.com/playlist?list=PLINEVGQA0sVKOKvCDo8fJOTCzXDc1uigX

Everyone is still asking who wins AI training. The agent revolution just made that the wrong question.

A traditional chatbot turn burns about 2,000 tokens. An autonomous agent task (the kind OpenClaw, Claude Code, or any other agent platform actually runs) burns 50,000 to 250,000 tokens. That's a 25–125× multiplier on inference volume. Per task. Per user. The chips that win training are not the chips that win this, and one company that the market has started to count out is going to be the biggest beneficiary of the entire shift.
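As a quick sanity check on that multiplier, the arithmetic can be sketched in a few lines of Python. The token counts are the ones quoted above; the constant and function names are hypothetical, chosen only for this illustration. Dividing the quoted agent range by the quoted chatbot figure puts the upper end at 125×.

```python
# Token figures quoted in the description (rough estimates, not measurements).
CHATBOT_TOKENS_PER_TURN = 2_000
AGENT_TOKENS_PER_TASK = (50_000, 250_000)  # (low, high)


def inference_multiplier(agent_tokens: int,
                         chatbot_tokens: int = CHATBOT_TOKENS_PER_TURN) -> float:
    """Return how many chatbot turns' worth of tokens one agent task consumes."""
    return agent_tokens / chatbot_tokens


low, high = (inference_multiplier(t) for t in AGENT_TOKENS_PER_TASK)
print(f"{low:.0f}x to {high:.0f}x")  # prints "25x to 125x"
```

The multiplier applies per task and per user, so total inference volume scales with both on top of this ratio.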

In this video I walk through why the inference workload is shaped completely differently from training, why hyperscalers are racing to route inference onto their own custom silicon (Google TPU, AWS Trainium2, Microsoft Maia, Meta MTIA), where the inference startups (Groq, Cerebras, SambaNova, Tenstorrent) actually fit in, and why, no matter who wins the logic-layer fight, the supply-chain bottleneck (the three HBM suppliers plus TSMC CoWoS-L) gets bought anyway.

Chapters
00:00 Hook — everyone's asking the wrong question
00:58 The token explosion — agent workload vs chatbot
02:12 Layer 1: The inference cliff (training-era silicon under-utilized)
03:48 Layer 2: The hyperscaler escape route (TPU, Trainium, Maia, MTIA)
05:42 Layer 3: The specialty inference startups (Groq, Cerebras, SambaNova)
07:43 Layer 4: The bottleneck nobody can route around (HBM + CoWoS)
09:58 The verdict — three layers, three time horizons, three winners
12:33 Next: Google TPU — the silicon spine of the agent revolution

This is the fifth video in the AI Supply Chain series:
1. GPU vs TPU — Who Actually Wins?
2. The HBM Cartel — How 3 Companies Control AI's Future
3. TSMC's Real Monopoly
4. ASML — The Single Point of Failure for AI
5. Agent Revolution — Who Actually Wins the Chip Boom (this video)

Topics covered: AI agents, autonomous agents, OpenClaw, Claude Code, inference economics, model FLOPs utilization (MFU), first-token latency, NVIDIA Blackwell, Rubin, Google TPU v7 Ironwood, AWS Trainium2, Microsoft Maia 100, Meta MTIA, Anthropic $21B TPU deal, Broadcom custom silicon, Groq LPU, Cerebras wafer-scale, SambaNova, Tenstorrent, HBM3E, HBM4, SK Hynix, Samsung HBM, Micron HBM, TSMC CoWoS-L advanced packaging, on-device NPU.

If this was useful, subscribe — this channel is where the actual semiconductor industry view of the AI race lives, not the marketing slides.

#AIAgents #OpenClaw #ClaudeCode #NVIDIA #TPU #Trainium #Maia #MTIA #Groq #Cerebras #HBM #CoWoS #TSMC #Broadcom #Semiconductors #AIChips #AISupplyChain

Video: Agent Revolution Won't Crown NVIDIA — Who Actually Wins the Chip Boom, from the channel VLSI Tech Explained
