Reducing large S3 API costs using Alluxio at Datasapiens

Alluxio Global Online Meetup
August 4, 2020

For more Alluxio events: https://www.alluxio.io/events/

Speakers:
Koen Michiels, Datasapiens
Juraj Pohanka, Datasapiens
Bin Fan, Alluxio

Datasapiens is an international data-analytics startup based in Prague. We help our clients to uncover the value of their data and open up new revenue streams for them. We provide an end-to-end service that manages the data pipeline and automates the process of generating data insights.

In this talk, we describe how we resolved the high S3 API costs that Presto incurred at several levels of query concurrency by deploying Alluxio as a data orchestration layer between S3 and Presto. We also present the results of an experiment that estimates per-query S3 API costs using the TPC-DS dataset.

This talk will focus on:

- The Hadoop ecosystem at Datasapiens
- The drastic increase in S3 API costs during performance tests with Presto
- S3 API cost tests with TPC-DS
- Implications for the cloud data lake architecture
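
As a rough illustration of what estimating per-query S3 API costs can involve, the sketch below multiplies per-operation request counts by a per-1,000-request price. It is not the method used in the talk: the function name, the prices, and the request counts are all assumptions made for the example, and real prices depend on region and change over time.

```python
# Assumed per-1,000-request prices in USD (illustrative only; check current
# AWS pricing for your region before relying on these numbers).
PRICE_PER_1000 = {"GET": 0.0004, "LIST": 0.005}


def query_api_cost(request_counts, price_per_1000=PRICE_PER_1000):
    """Return the estimated S3 API cost in USD for a single query.

    request_counts maps an operation name (e.g. "GET") to the number of
    requests the query issued against S3.
    """
    return sum(price_per_1000[op] * n / 1000 for op, n in request_counts.items())


# Made-up request counts for one query, purely for illustration.
# A cache layer between the query engine and S3 would be expected to
# lower the number of requests that actually reach S3.
without_cache = {"GET": 200_000, "LIST": 1_000}
with_cache = {"GET": 10_000, "LIST": 50}

if __name__ == "__main__":
    print(f"without cache: ${query_api_cost(without_cache):.5f}")
    print(f"with cache:    ${query_api_cost(with_cache):.5f}")
```

Summing such per-query estimates over a benchmark run gives a first-order view of how request volume, rather than data volume, drives the API portion of the bill.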

Video "Reducing large S3 API costs using Alluxio at Datasapiens" from the Alluxio channel
Video information
August 5, 2020, 9:31:38
00:39:15