Reducing large S3 API costs using Alluxio at Datasapiens
Alluxio Global Online Meetup
August 4, 2020
For more Alluxio events: https://www.alluxio.io/events/
Speakers:
Koen Michiels, Datasapiens
Juraj Pohanka, Datasapiens
Bin Fan, Alluxio
Datasapiens is an international data-analytics startup based in Prague. We help our clients to uncover the value of their data and open up new revenue streams for them. We provide an end-to-end service that manages the data pipeline and automates the process of generating data insights.
In this talk, we will describe how we resolved the large S3 API costs that Presto incurred at several levels of concurrent usage by introducing Alluxio as a data orchestration layer between S3 and Presto. We will also present the results of an experiment that estimates per-query S3 API costs using the TPC-DS dataset.
This talk will focus on:
- The Hadoop ecosystem at Datasapiens
- Drastic increase in S3 API costs during performance tests with Presto
- S3 API cost tests with TPC-DS
- Implications for the cloud data lake architecture