Implementing Iceberg for Improved Data Management at Autodesk
#icebergSummit 2025 breakout session delivered by Bhavesh Jaisinghani and Anitha Matta from #Autodesk.
Session Description:
At Autodesk, our core infrastructure is built on a multi-tenant architecture, integrating #ApacheIceberg with Spark, Presto, and Snowflake. Our primary objectives were to enhance data reliability, streamline operations, improve query performance for both ad-hoc analysis and orchestrated jobs, and establish standardized catalog querying within our big data environment.
In this presentation, we will share our journey of transforming our data architecture by adopting Iceberg as the default catalog, supporting both Hive and Glue Data Catalogs. We will discuss how we seamlessly transitioned our internal stakeholders to Iceberg with minimal user impact.
We will guide you through our implementation process, presenting test results from running Spark pipelines on Iceberg tables. Additionally, we will provide insights into the challenges we faced and how we overcame them, offering valuable best practices for those looking to implement Iceberg. Finally, we will highlight the significant efficiencies gained, including improved data reliability, reduced execution times, and simplified data operations.
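As a rough illustration of the setup the description mentions (Iceberg as the default catalog in Spark, backed by either a Hive or a Glue Data Catalog), here is a minimal sketch of the Spark settings involved. It is not Autodesk's configuration: the helper `iceberg_spark_conf`, the catalog names, the warehouse paths, and the metastore URI are all hypothetical. Only the property keys and class names come from the Iceberg and Spark documentation.

```python
def iceberg_spark_conf(catalog_name, backend, warehouse, hive_uri=None, make_default=True):
    """Build Spark SQL settings for one Iceberg catalog (hypothetical helper)."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    conf = {
        # Enables Iceberg SQL extensions (MERGE INTO, CALL procedures, etc.).
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the named catalog as an Iceberg catalog.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.warehouse": warehouse,
    }
    if backend == "hive":
        # Hive Metastore-backed catalog.
        conf[f"{prefix}.type"] = "hive"
        if hive_uri is not None:
            conf[f"{prefix}.uri"] = hive_uri
    elif backend == "glue":
        # AWS Glue Data Catalog-backed catalog with S3 file I/O.
        conf[f"{prefix}.catalog-impl"] = "org.apache.iceberg.aws.glue.GlueCatalog"
        conf[f"{prefix}.io-impl"] = "org.apache.iceberg.aws.s3.S3FileIO"
    else:
        raise ValueError(f"unsupported backend: {backend!r}")
    if make_default:
        # Unqualified table names now resolve against this catalog.
        conf["spark.sql.defaultCatalog"] = catalog_name
    return conf


# Assumed example values, for illustration only.
glue_conf = iceberg_spark_conf("lake", "glue", "s3://example-bucket/warehouse")
hive_conf = iceberg_spark_conf(
    "lake_hive", "hive", "s3://example-bucket/warehouse",
    hive_uri="thrift://metastore.example.internal:9083", make_default=False,
)
```

Each key/value pair can then be passed to `SparkSession.builder.config(key, value)` before the session is created. Setting `spark.sql.defaultCatalog` is what lets existing queries keep using unqualified table names, which is one plausible way to limit user impact during a migration.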
Video "Implementing Iceberg for Improved Data Management at Autodesk" from the Apache Iceberg channel
Video information
April 30, 2025, 22:31:14
00:33:14