Why LLM Inference Is Memory-Bound, Not Compute-Bound

The limiting factor in LLM inference isn't compute. It's how fast you can move weights from DRAM to the chip.
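A rough back-of-envelope sketch of that claim, not taken from the interview: during decoding, each generated token needs every weight read from memory once, while the arithmetic costs about 2 FLOPs per parameter. The function name and all hardware and model numbers below are illustrative assumptions, and KV-cache traffic and attention FLOPs are ignored.

```python
def decode_step_times(n_params, bytes_per_param, mem_bandwidth, peak_flops, batch_size=1):
    """Lower-bound times (seconds) for one decode step.

    Assumes every weight is read from memory once per step and that each
    parameter costs about 2 FLOPs (one multiply, one add) per sequence.
    KV-cache traffic and attention FLOPs are ignored.
    """
    memory_time = n_params * bytes_per_param / mem_bandwidth
    compute_time = 2 * n_params * batch_size / peak_flops
    return memory_time, compute_time


# Illustrative round numbers, not the specs of any particular chip or model.
N_PARAMS = 7e9          # parameters
BYTES_PER_PARAM = 2     # 16-bit weights
MEM_BANDWIDTH = 1e12    # bytes per second
PEAK_FLOPS = 100e12     # FLOP per second

memory_time, compute_time = decode_step_times(
    N_PARAMS, BYTES_PER_PARAM, MEM_BANDWIDTH, PEAK_FLOPS
)
print(f"weight streaming: {memory_time * 1e3:.2f} ms per token")
print(f"arithmetic:       {compute_time * 1e3:.2f} ms per token")
```

Under these assumed numbers, streaming the weights takes about 14 ms per token while the arithmetic takes about 0.14 ms, so the chip spends most of a single-sequence decode step waiting on memory. Raising `batch_size` increases the compute time but not the weight traffic, which is one reason batching helps throughput.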

In this interview, CTO Mathias Lechner speaks with Piotr Mazurek from Liquid AI's inference team about what's actually happening when an LLM handles a request: the prefill/decode distinction, multi-GPU parallelism strategies, and how to choose between inference frameworks like vLLM, SGLang, and TensorRT-LLM depending on latency and throughput requirements.

Liquid AI builds foundation models designed for efficiency and performance across a range of deployment contexts. This series features Mathias in conversation with researchers and engineers across the company.

Subscribe to follow every episode: https://www.youtube.com/@liquid-ai-inc

Careers at Liquid AI: https://www.liquid.ai/careers

Video information

May 27, 2026, 19:27:53

00:04:48
