Dataflow: A Unified Model for Batch and Streaming Data Processing
Unbounded, unordered, global-scale datasets (e.g., web logs, mobile usage statistics, and sensor networks) are increasingly common in day-to-day business. At the same time, consumers of these datasets have developed sophisticated requirements, such as event-time ordering and windowing by features of the data themselves. On top of that, consumers want answers *now*. This talk will cover how Google has evolved its earlier work on batch and streaming systems (including MapReduce, FlumeJava, and MillWheel) into Dataflow, a new programming model that allows users to clearly trade off correctness, latency, and cost. It will provide an overview of the model, a demo of the fully managed service it enables, and a discussion of some of the many use cases that got Google here.
Presenter
Frances Perry
Video: "Dataflow: A Unified Model for Batch and Streaming Data Processing", from the @Scale channel