Connect a compute engine (Spark, Flink, or Trino) to the Iceberg catalog that tracks the table; ensure it has write access to the table's storage location. The CALL statements in the steps below are Spark SQL; the syntax differs in other engines.
Run the rewrite_data_files procedure to compact small files: in Spark SQL, CALL catalog.system.rewrite_data_files(table => 'db.table_name'), optionally passing settings such as target-file-size-bytes through the options argument.
Run rewrite_manifests to consolidate manifest files: CALL catalog.system.rewrite_manifests(table => 'db.table_name').
Expire old snapshots to remove stale metadata and the data files that only those snapshots reference: CALL catalog.system.expire_snapshots(table => 'db.table_name', older_than => TIMESTAMP 'YYYY-MM-DD HH:MM:SS').
Remove orphan files left by failed operations: CALL catalog.system.remove_orphan_files(table => 'db.table_name', older_than => TIMESTAMP 'YYYY-MM-DD HH:MM:SS').
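The four procedure calls above can be generated in order by a small helper. This is a minimal sketch, not part of Iceberg: the function name maintenance_statements and its parameters are hypothetical, and it only builds the SQL strings, using the same procedure names and arguments as the steps above. The timestamp is formatted without a zone, so it is assumed that the Spark session interprets it in the time zone you intend.

```python
from datetime import datetime


def maintenance_statements(catalog, table, cutoff, target_file_size_bytes=None):
    """Build the Spark SQL CALL statements for the steps above, in order.

    `catalog`, `table`, and `cutoff` are caller-supplied (hypothetical helper;
    only the procedure names and arguments come from the steps above).
    """
    ts = cutoff.strftime("%Y-%m-%d %H:%M:%S")

    # Compaction: optionally pass target-file-size-bytes through `options`.
    rewrite_args = f"table => '{table}'"
    if target_file_size_bytes is not None:
        rewrite_args += (
            f", options => map('target-file-size-bytes', '{target_file_size_bytes}')"
        )

    return [
        f"CALL {catalog}.system.rewrite_data_files({rewrite_args})",
        f"CALL {catalog}.system.rewrite_manifests(table => '{table}')",
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', older_than => TIMESTAMP '{ts}')",
        f"CALL {catalog}.system.remove_orphan_files("
        f"table => '{table}', older_than => TIMESTAMP '{ts}')",
    ]
```

With an existing SparkSession named spark (assumed here), each statement could then be executed in sequence with spark.sql(stmt); running them one at a time makes it easy to stop if an earlier step fails.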
Known gotchas
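expire_snapshots removes old snapshots and deletes data files that no remaining snapshot references, so an expired snapshot can no longer be read or used for time travel; do not expire snapshots that ongoing reads or time-travel queries still reference, and set older_than so that at least your required retention window is kept.
remove_orphan_files scans the table's storage location and deletes files that no table metadata references; files from a write that has not yet committed look like orphans, so either ensure no concurrent writes are in flight or set older_than earlier than the start of the longest-running write.
Compaction does not modify data files in place: it writes new data files and commits a new snapshot that replaces the old ones, and the old files remain in storage until the snapshots that reference them are expired. Concurrent writes during compaction are generally safe due to snapshot isolation, but a very long compaction job rewrites large numbers of files, temporarily increases storage use, and can fail at commit time if it conflicts with a concurrent change.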
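The retention-window advice above can be turned into a guard that computes the older_than value instead of typing it by hand. This is a rough sketch under stated assumptions: retention_cutoff and as_timestamp_literal are hypothetical helpers, and the rule that the window must exceed the longest-running write is one conservative way to keep in-flight files out of scope, not an Iceberg requirement.

```python
from datetime import datetime, timedelta


def retention_cutoff(now, retention, longest_write=timedelta(0)):
    """Return the older_than cutoff: `now` minus the retention window.

    Raises ValueError when the window does not exceed `longest_write`
    (hypothetical safety rule), since a cutoff that recent could cover
    files from a write that is still running.
    """
    if retention <= longest_write:
        raise ValueError("retention window must exceed the longest-running write")
    return now - retention


def as_timestamp_literal(cutoff):
    """Format the cutoff as a TIMESTAMP 'YYYY-MM-DD HH:MM:SS' literal."""
    return "TIMESTAMP '" + cutoff.strftime("%Y-%m-%d %H:%M:%S") + "'"
```

For example, a 7-day window evaluated at 2024-01-10 12:00:00 yields the literal TIMESTAMP '2024-01-03 12:00:00', which can be passed as older_than to both expire_snapshots and remove_orphan_files.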