Using Machine Learning and Observability Together to Reduce Incident Impact | DigitalOcean
Get the slides: https://www.datacouncil.ai/talks/the-observatorium-using-machine-learning-and-observability-together-to-reduce-incident-impact
ABOUT THE TALK
Service organizations often measure themselves on keeping customer downtime to a minimum. In the complex distributed architectures inherent to many modern tech companies, however, blips are bound to occur, rendering the effectiveness of incident response critical to the customer experience. KPIs such as MTTD and MTTR (Mean Time to Detection/Resolution, respectively) are used to better understand the efficiency of said incident response, and maturing organizations would be wise to leverage tooling to improve these metrics.
In a maturing global company such as DigitalOcean, distributed systems reign supreme, and with them the myriad microservices that generate metrics and data (and duly need to be observed effectively). Accordingly, we’ve built a platform named The Observatorium, whose primary goal is to reduce MTTD/MTTR across our cloud; we do so by curating and shepherding information in creative-yet-efficient ways, which I’ll discuss in more depth in this talk.
ABOUT THE SPEAKER
Alex Kass has worked at companies ranging from large financial institutions to early-stage startups, regularly building successful analytical models and systems of varying size. At DigitalOcean, a fast-growing global cloud hosting provider, he has at his disposal sufficient software and hardware firepower to experiment and build with both stable and cutting-edge technologies, delivering actionable statistical insights at scale. He currently runs the Observability Applications team, which focuses on shining spotlights of transparency onto the performance and reliability of the cloud.
Previous speaking credits include venues as diverse as Columbia University, Apache: Big Data Europe, and OSS: North America & EU.
ABOUT DATA COUNCIL:
Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.
FOLLOW DATA COUNCIL:
Twitter: https://twitter.com/DataCouncilAI
LinkedIn: https://www.linkedin.com/company/datacouncil-ai
Facebook: https://www.facebook.com/datacouncilai
Eventbrite: https://www.eventbrite.com/o/data-council-30357384520