Presto and Apache Hudi
Speakers:
Bhavani Sudha Saktheeswaran, Software Engineer at Moveworks, Apache Hudi PMC, Ex-Uber
Brandon Scheller, Software Engineer at Amazon Web Services
Apache Hudi is a fast-growing data lake storage system that helps organizations build and manage petabyte-scale data lakes. Hudi brings stream-style processing to batch-like big data through primitives such as upserts, deletes, and incremental pulls. These features help surface faster, fresher data on a unified serving layer. Hudi runs on the Hadoop Distributed File System (HDFS) or cloud stores and integrates well with popular query engines such as Presto, Apache Hive, Apache Spark, and Apache Impala.
In this talk we introduce Hudi, discuss its different table and query types, and explain how Hudi integrates with Presto to support these queries. We would also like to share our experience of how this integration has evolved over time and discuss upcoming file listing and query planning improvements for Presto Hudi queries.
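To make the primitives named in the abstract concrete, here is a minimal sketch of what upserts, deletes, and incremental pulls mean for a keyed table. It is a toy in-memory model, not Hudi's API or storage layout: the class `ToyTable`, its method names, the integer commit counter, and the `trip-*` records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy keyed table with a commit counter (illustrative only, not Hudi)."""

    # record key -> (commit time of last write, payload)
    rows: dict = field(default_factory=dict)
    commit_time: int = 0

    def upsert(self, records):
        """Insert new keys and overwrite existing keys in one commit."""
        self.commit_time += 1
        for key, payload in records.items():
            self.rows[key] = (self.commit_time, payload)
        return self.commit_time

    def delete(self, keys):
        """Remove the given keys in one commit; unknown keys are ignored."""
        self.commit_time += 1
        for key in keys:
            self.rows.pop(key, None)
        return self.commit_time

    def snapshot(self):
        """Return the latest value of every live record."""
        return {key: payload for key, (_, payload) in self.rows.items()}

    def incremental_pull(self, since):
        """Return only live records written after commit `since`."""
        return {
            key: payload
            for key, (written_at, payload) in self.rows.items()
            if written_at > since
        }


table = ToyTable()
c1 = table.upsert({"trip-1": {"fare": 10}, "trip-2": {"fare": 20}})
c2 = table.upsert({"trip-2": {"fare": 25}, "trip-3": {"fare": 30}})
c3 = table.delete(["trip-1"])

latest = table.snapshot()              # trip-2 (updated) and trip-3
changed = table.incremental_pull(c1)   # only records written after c1
```

In this sketch, a snapshot read sees the merged latest state, while an incremental pull returns only what changed after a given commit, which is the property that lets downstream jobs avoid rescanning the whole table. This toy version does not surface deletes in the incremental result.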
prestodb.io
Video: Presto and Apache Hudi, from the Presto Foundation channel