
llm-d Demo: Deploy Large Language Models on Kubernetes with Helm and Trace a Request

Watch this demo of llm-d, the open-source solution for running large language models on Kubernetes!

What You'll Learn:

A sample llm-d installation using Helm charts
How to use the QuickStart CLI for easy all-in-one deployment
Setting up llm-d on different Kubernetes targets (Minikube, OpenShift)
Understanding the llm-d architecture and request flow
How prefill and decode disaggregation works
Redis KV cache integration and model service deployment
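The Helm-based install walked through above can be sketched roughly as follows; the chart repository URL, chart name, and value keys are assumptions for illustration, not confirmed by the video, so check the llm-d documentation for the actual names:

```shell
# Sketch of a Helm-based llm-d install (chart repo URL, chart name,
# and --set keys below are assumptions, not taken from the demo).
helm repo add llm-d https://llm-d.ai/charts   # assumed chart repo URL
helm repo update

# Install into a dedicated namespace against the current kube context.
helm install llm-d llm-d/llm-d \
  --namespace llm-d --create-namespace \
  --set huggingface.tokenSecret=huggingface-token   # assumed value key
```

The QuickStart CLI shown in the demo wraps steps like these into a single all-in-one command, including dependencies such as the Redis KV cache.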

Key Features Covered:
✅ Quick start installation with dependencies
✅ Hugging Face model integration
✅ Gateway system setup and routing
✅ Inference scheduler and endpoint picker
✅ Prefill and decode pod management
✅ Request tracing and architecture overview
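To trace a request the way the demo does, you can send an OpenAI-compatible completion request through the gateway; the gateway address and model name below are placeholders for whatever your deployment exposes:

```shell
# Hypothetical request traced through the llm-d gateway; hostname and
# model name are placeholders, supply the values from your deployment.
curl -s http://<gateway-address>/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<your-model-name>",
        "prompt": "Kubernetes is",
        "max_tokens": 32
      }'
# The inference scheduler's endpoint picker selects a decode pod for
# the request; with disaggregation enabled, the prefill pod computes
# the prompt's KV cache before decoding begins on the decode pod.
```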
Prerequisites:

Kubernetes cluster (Minikube or OpenShift)
Helm installed
A valid Hugging Face token (for model downloads)
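The Hugging Face token is typically supplied to the cluster as a Kubernetes secret; the namespace, secret name, and key below are assumptions, so match whatever your chart's values expect:

```shell
# Store the Hugging Face token as a Kubernetes secret (names assumed).
kubectl create namespace llm-d --dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic huggingface-token \
  --namespace llm-d \
  --from-literal=HF_TOKEN=<your-token>   # placeholder, supply your own
```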

llm-d is open source under Apache 2.0 license and makes deploying LLMs on Kubernetes straightforward with familiar tools and customizable configurations.

Join the llm-d community:
🌎 https://llm-d.ai
💬 https://inviter.co/llm-d-slack
💻 https://github.com/llm-d

Video "llm-d Demo: Deploy Large Language Models on Kubernetes with Helm and Trace a Request" from the llm-d Project channel.
