Iceberg Table Maintenance
Etleap automatically performs two types of maintenance operations on your Iceberg tables:
- Compaction.
- Snapshot Expiry.


Compaction
Compaction rewrites data and delete files in Iceberg, combining them into fewer, larger files. This reduces the number of files the Iceberg reader needs to scan and improves the performance of your Iceberg queries.
Etleap runs compactions on Iceberg tables when there have been more than 10,000 row operations (inserts, updates, or deletes) on that table since the previous compaction.
For more information on compaction, see the Iceberg spec.
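The trigger rule above can be pictured as a per-table counter. The following is only a minimal sketch, not Etleap's implementation: the 10,000 threshold comes from the text, while `CompactionTracker` and its methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Threshold taken from the text; everything else is a hypothetical illustration.
COMPACTION_THRESHOLD = 10_000


@dataclass
class CompactionTracker:
    """Counts row operations on one table since its previous compaction."""

    ops_since_compaction: int = 0

    def record(self, inserts: int = 0, updates: int = 0, deletes: int = 0) -> None:
        # Inserts, updates, and deletes all count as row operations.
        self.ops_since_compaction += inserts + updates + deletes

    def should_compact(self) -> bool:
        # "More than 10,000" is read as strictly greater than the threshold.
        return self.ops_since_compaction > COMPACTION_THRESHOLD

    def mark_compacted(self) -> None:
        # Reset the counter once a compaction has run.
        self.ops_since_compaction = 0


tracker = CompactionTracker()
tracker.record(inserts=9_000, updates=1_000)
print(tracker.should_compact())  # False: exactly 10,000 is not "more than"
tracker.record(deletes=1)
print(tracker.should_compact())  # True
```

Under this reading, a table with low write volume may go a long time between compactions, while a busy table is compacted often.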
Snapshot Expiry
Snapshot expiry removes old versions of the table from the Iceberg metadata, and cleans up any data files that are no longer referenced by the remaining snapshots. It does not affect any data in the snapshots that are kept, so all data in the active version of the table remains queryable.
This prevents bloating of the Iceberg table’s metadata and reduces both read and write times to the table. However, it does limit how far back you can time travel within the table’s history.
Etleap runs a snapshot expiry regularly whenever rows in the table have changed.
The two latest snapshots on the table’s main branch are always retained; any older snapshots are expired to limit the disk space used by the table.
For more information on snapshot expiry, see the Iceberg spec.
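To illustrate why expiry leaves the active table queryable, here is a small sketch of the idea: keep the newest snapshots, then delete only those data files that no retained snapshot still references. The `Snapshot` class and `expire_snapshots` function are hypothetical and simplified; real Iceberg metadata tracks files through manifests rather than a flat set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    data_files: frozenset[str]  # simplified: files reachable from this snapshot


def expire_snapshots(snapshots, keep_latest=2):
    """Return (retained snapshots, data files that are safe to delete)."""
    ordered = sorted(snapshots, key=lambda s: s.timestamp_ms, reverse=True)
    retained, expired = ordered[:keep_latest], ordered[keep_latest:]
    still_referenced = set().union(*(s.data_files for s in retained))
    expired_files = set().union(*(s.data_files for s in expired))
    # A file is deletable only if no retained snapshot references it.
    return retained, expired_files - still_referenced


history = [
    Snapshot(1, 100, frozenset({"a.parquet"})),
    Snapshot(2, 200, frozenset({"b.parquet"})),  # e.g. a.parquet rewritten into b.parquet
    Snapshot(3, 300, frozenset({"b.parquet", "c.parquet"})),
]
retained, deletable = expire_snapshots(history)
print([s.snapshot_id for s in retained])  # [3, 2]
print(deletable)  # {'a.parquet'}
```

After this expiry, time travel to snapshot 1 is no longer possible, but everything visible in snapshots 2 and 3 is untouched.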
Equality and Position Deletes for Update-Mode Pipelines
To ensure fast data ingestion while maintaining excellent query performance on your main table, Etleap uses a dual-branch architecture:
- real-time branch: Receives high-frequency streaming writes from your pipeline. This branch prioritizes ingestion speed using equality deletes.
- main branch: Contains the data you query in your warehouse. Etleap automatically converts data from the real-time branch to this optimized main branch using position deletes for better query performance.
Snapshot expiry maintenance will always preserve the two most recent snapshots on the main branch, along with any snapshots on the real-time branch that have not yet been converted.
All other snapshots will be removed.
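The retention rule for the two branches can be expressed as a simple selection over snapshots. This is a sketch under assumptions: `BranchSnapshot`, the `converted` flag, and `snapshots_to_retain` are hypothetical names, and how Etleap actually tracks conversion state is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchSnapshot:
    snapshot_id: int
    branch: str              # "main" or "real-time"
    timestamp_ms: int
    converted: bool = False  # hypothetical flag; only meaningful on real-time


def snapshots_to_retain(snapshots, keep_on_main=2):
    """Return the IDs of snapshots that expiry must preserve."""
    main = sorted(
        (s for s in snapshots if s.branch == "main"),
        key=lambda s: s.timestamp_ms,
        reverse=True,
    )
    # Rule 1: the two most recent snapshots on the main branch.
    keep = {s.snapshot_id for s in main[:keep_on_main]}
    # Rule 2: real-time snapshots that have not yet been converted.
    keep |= {
        s.snapshot_id
        for s in snapshots
        if s.branch == "real-time" and not s.converted
    }
    return keep


snaps = [
    BranchSnapshot(1, "main", 100),
    BranchSnapshot(2, "main", 200),
    BranchSnapshot(3, "main", 300),
    BranchSnapshot(10, "real-time", 250, converted=True),
    BranchSnapshot(11, "real-time", 350),
]
print(sorted(snapshots_to_retain(snaps)))  # [2, 3, 11]
```

Snapshots 1 and 10 fall outside both rules, so they would be removed.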
To enable this feature for your organization, please contact support@etleap.com.