How to Maintain Apache Iceberg Tables — Monthly Webinar by OLake with Amit Gilad, CTO- Lakeops

We’re back with another monthly event by OLake!

This deep-dive covers Apache Iceberg table maintenance best practices to keep your lakehouse fast, cost-efficient, and production-ready.

Guest speaker Amit Gilad, CTO at LakeOps (8+ years in data systems and a frequent Apache Iceberg speaker), shares field-tested tactics for compaction, metadata hygiene, and file layout optimization: the exact knobs that reduce scan size, improve pruning, and stabilize p95 query latency across engines like Spark, Trino, Flink, DuckDB, and ClickHouse.

Here's what we dive into:

- Iceberg compaction strategies explained: bin-packing vs sort compaction vs Z-ordering, how to choose per workload, target file sizes (e.g., 256–512 MB), and when to rewrite data files.

- Fixing the small-file problem on object storage (Amazon S3, Azure ADLS, Google Cloud Storage): fewer files, lower planning overhead, better parallelism.

- Snapshot management & metadata hygiene: safe snapshot expiration policies, orphan file cleanup, and manifest & manifest-list rewrites to reduce planning time and metadata bloat.

- Partitioning & evolution: avoiding over/under-partitioning, and how day→hour partition evolution combined with sort compaction boosts predicate pruning without exploding costs.

- And, most importantly, the cost-vs-performance trade-off.
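As a rough illustration of the bin-packing idea in the first bullet (not taken from the webinar, and not how Iceberg's own rewrite planner is implemented), the sketch below groups small data files so that each group adds up to at most a target size. The 512 MB target is an assumption taken from the upper end of the range mentioned above; `CompactionGroup` and `plan_bin_pack` are hypothetical names, and the first-fit-decreasing heuristic is just one simple way to pack.

```python
from dataclasses import dataclass, field

# Assumed target: 512 MB, the upper end of the 256-512 MB range mentioned above.
TARGET_FILE_SIZE = 512 * 1024 * 1024


@dataclass
class CompactionGroup:
    """One planned output file: the small input files that would be merged into it."""
    file_sizes: list = field(default_factory=list)

    @property
    def total(self) -> int:
        return sum(self.file_sizes)


def plan_bin_pack(file_sizes, target=TARGET_FILE_SIZE):
    """Group file sizes (in bytes) with a first-fit-decreasing heuristic.

    Files already at or above the target are skipped, and every returned
    group has a combined size of at most `target`.
    """
    groups = []
    candidates = sorted((s for s in file_sizes if s < target), reverse=True)
    for size in candidates:
        for group in groups:
            if group.total + size <= target:
                group.file_sizes.append(size)
                break
        else:
            # No existing group has room, so start a new one.
            groups.append(CompactionGroup([size]))
    # Rewriting a group that holds a single file gains nothing, so drop it.
    return [g for g in groups if len(g.file_sizes) > 1]


if __name__ == "__main__":
    MB = 1024 * 1024
    sizes = [600 * MB, 300 * MB, 200 * MB, 100 * MB, 400 * MB, 10 * MB]
    for g in plan_bin_pack(sizes):
        print(len(g.file_sizes), "files ->", g.total // MB, "MB")
```

In this toy plan the 600 MB file is left untouched and the five smaller files collapse into two outputs near the target size. A real maintenance job would also weigh the cost of rewriting against the read-side savings, which is the trade-off the last bullet refers to.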
Who should watch: data engineers, platform teams, architects, and technical leaders running production Apache Iceberg who want lower cloud bills and stable performance under growing data volumes.

#ApacheIceberg #dataengineering #Lakehouse #spark #Trino #flink #duckdb #clickhouse

OLake website - https://olake.io
OLake GitHub - https://github.com/datazip-inc/olake
Join OLake Slack - https://olake.io/slack
Find us on LinkedIn - https://www.linkedin.com/company/datazipio

➡ MongoDB to Apache Iceberg getting started - https://olake.io/docs/getting-started/mongodb
➡ MySQL to Apache Iceberg getting started - https://olake.io/docs/getting-started/mysql
➡ Postgres to Apache Iceberg getting started - https://olake.io/docs/getting-started/postgres

olake apache-iceberg lakehouse cdc debezium


Video information

Date: September 26, 2025, 16:29:22

Duration: 01:00:38

Channel: OLake
