
Presto and Apache Hudi

Speakers:
Bhavani Sudha Saktheeswaran, Software Engineer at Moveworks, Apache Hudi PMC, Ex-Uber
Brandon Scheller, Software Engineer at Amazon Web Services

Apache Hudi is a fast-growing data lake storage system that helps organizations build and manage petabyte-scale data lakes. Hudi brings stream-style processing to batch-like big data by using primitives such as upserts, deletes and incremental pulls. These features help surface faster, fresher data on a unified serving layer. Hudi can be operated on the Hadoop Distributed File System (HDFS) or cloud stores and integrates well with popular query engines such as Presto, Apache Hive, Apache Spark and Apache Impala.

In this talk we introduce Hudi, discuss its different table and query types, and explain how Hudi integrates with Presto to support these queries. We would like to share our experience of how this integration has evolved over time and also discuss upcoming file listing and query planning improvements for Presto Hudi queries.

prestodb.io

Video "Presto and Apache Hudi" from the Presto Foundation channel
Video information
August 18, 2020, 14:01:40
00:26:28