To Swap or Not To Swap: Memory Management Design Patterns for AI Workloads in Kubernetes 1.34+ - Nic Vermande, ScaleOps
Kubernetes swap support is now stable, reopening a debate the industry thought was settled: is swap still evil? For AI/ML workloads with 100GB+ memory footprints, the answer is nuanced.
This talk explores when swap helps and when it hurts GPU inference and training workloads. We'll cover three real production scenarios:
- Overcommitting Memory: Running multiple small models on shared nodes where occasional swap prevents OOMKills.
- Burst Traffic Handling: Using swap as a safety valve during traffic spikes when KV cache grows beyond predictions. Live demo with vLLM showing graceful degradation vs. pod eviction.
- When Swap Kills You: Training workloads and real-time inference where swap latency destroys performance.
By the end of this talk, you will know exactly when to enable swap and when to keep it disabled. Production-tested configs included!
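As a rough illustration of the enable-or-disable decision described above, here is a minimal Python sketch that builds a kubelet configuration with swap either limited or off, and estimates how much swap a container could use. It is not the production-tested configuration from the talk. The function names are hypothetical; the `failSwapOn` and `memorySwap.swapBehavior` fields and the proportional swap formula follow the upstream Kubernetes documentation as I understand it, where `LimitedSwap` applies to Burstable pods only.

```python
import json


def kubelet_swap_config(enable_swap: bool) -> dict:
    """Build a KubeletConfiguration fragment (hypothetical helper).

    LimitedSwap lets eligible pods use a bounded share of node swap;
    NoSwap keeps workloads off swap even if the node has it provisioned.
    """
    behavior = "LimitedSwap" if enable_swap else "NoSwap"
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        # Allow the kubelet to start on a node that has swap turned on.
        "failSwapOn": False,
        "memorySwap": {"swapBehavior": behavior},
    }


def limited_swap_bytes(container_request: int, node_memory: int, node_swap: int) -> int:
    """Estimate a container's swap limit under LimitedSwap.

    Assumed formula: (memory request / node memory) * node swap.
    """
    return container_request * node_swap // node_memory


if __name__ == "__main__":
    GiB = 1024 ** 3
    print(json.dumps(kubelet_swap_config(enable_swap=True), indent=2))
    # A container requesting 4 GiB on a 64 GiB node with 16 GiB of swap.
    print(limited_swap_bytes(4 * GiB, 64 * GiB, 16 * GiB) // GiB, "GiB of swap")
```

For latency-sensitive training or real-time inference nodes, the same helper with `enable_swap=False` yields the `NoSwap` setting.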
Video from the CNCF [Cloud Native Computing Foundation] channel
Video information
April 9, 2026, 10:25:42
00:34:45