llm-d Demo: Deploy Large Language Models on Kubernetes with Helm and Trace a Request
Watch this demo of llm-d, the open-source solution for running large language models on Kubernetes!
What You'll Learn:
Sample llm-d installation process using Helm charts
How to use the QuickStart CLI for easy all-in-one deployment
Setting up llm-d on different Kubernetes targets (Minikube, OpenShift)
Understanding the llm-d architecture and request flow
How pre-fill and decode disaggregation works
Redis KV cache integration and model service deployment
Key Features Covered:
✅ Quick start installation with dependencies
✅ Hugging Face model integration
✅ Gateway system setup and routing
✅ Inference scheduler and endpoint picker
✅ Pre-fill and decode pod management
✅ Request tracing and architecture overview
Prerequisites:
Kubernetes cluster (Minikube or OpenShift)
Helm installed
Hugging Face access token (authorized for model downloads)
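The prerequisites above can be checked before starting an install. The following is a minimal sketch, not part of llm-d or its QuickStart CLI. It assumes that `helm` and `kubectl` are the command-line tools in use (the description names only Helm; `kubectl` is an assumption) and that the Hugging Face token is exported in an environment variable named `HF_TOKEN` (also an assumption). The function name `missing_prerequisites` is hypothetical. It does not verify that the cluster is reachable or that the token is valid.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Assumed tool names: the description mentions Helm; kubectl is an assumption.
REQUIRED_TOOLS = ("helm", "kubectl")
# Assumed environment variable holding the Hugging Face token.
TOKEN_ENV_VAR = "HF_TOKEN"


def missing_prerequisites(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names of prerequisites that could not be found locally."""
    env = os.environ if env is None else env
    # A tool counts as missing when it is not found on PATH.
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    # The token counts as missing when the variable is unset or empty.
    if not env.get(TOKEN_ENV_VAR):
        missing.append(TOKEN_ENV_VAR)
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing prerequisites: " + ", ".join(problems))
    else:
        print("All checked prerequisites found.")
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment; called with no arguments, it inspects the current shell's PATH and variables.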
llm-d is open source under the Apache 2.0 license and makes deploying LLMs on Kubernetes straightforward with familiar tools and customizable configurations.
Join the llm-d community:
🌎 https://llm-d.ai
💬 https://inviter.co/llm-d-slack
💻 https://github.com/llm-d
Video "llm-d Demo: Deploy Large Language Models on Kubernetes with Helm and Trace a Request" from the llm-d Project channel
Video information
June 17, 2025, 21:07:17
00:04:03