How Lyft built a streaming data platform with Flink on Kubernetes - Micah Wylde
Access to real-time data is increasingly important for many organizations. This is particularly true for Lyft, which needs to respond immediately to changes in supply and demand in its marketplace, weather and traffic updates, fraud attempts, and dangerous driving situations. Doing so requires processing millions of events per second produced by our microservices, mobile apps, and IoT devices.

Lyft runs dozens of Apache Flink and Apache Beam pipelines. Flink provides a powerful framework that makes it easy for non-experts to write correct, high-scale streaming jobs, while Beam extends that power to Lyft’s large base of Python programmers. Lyft also built a real-time SQL engine called Dryft, primarily used by data scientists to power real-time machine learning models, and a near-real-time ad hoc querying system with Presto.

Historically, Lyft ran its Flink clusters on bare, custom-managed EC2 instances. To achieve greater elasticity and reliability, we rebuilt the platform on top of Kubernetes. This talk will cover how we designed and built an open source Kubernetes operator for Flink and Beam, some of the unique challenges of running a complex, stateful application on Kubernetes, and the lessons learned along the way.
Video: How Lyft built a streaming data platform with Flink on Kubernetes - Micah Wylde, from the Flink Forward channel