SREcon16 - Performance Checklists for SREs
Brendan Gregg, Netflix
There's limited time for performance analysis in the emergency room. When there is a performance-related site outage, the SRE team must analyze and solve complex performance issues as quickly as possible, and under pressure. Many performance tools and techniques are designed for a different environment: an engineer analyzing their system over the course of hours or days, with time to try dozens of tools (profilers, tracers, monitoring tools, benchmarks) as well as different tunings and configurations. But when Netflix is down, minutes matter, and there's little time for such traditional systems analysis. As with aviation emergencies, short checklists and quick procedures can be applied by the on-call SRE staff to help solve performance issues as quickly as possible.
Video: SREcon16 - Performance Checklists for SREs, from the USENIX channel