Define the table with a transform-based partition spec, for example PARTITIONED BY (hours(event_time), truncate(16, user_id)), so that the physical layout is derived from column values and stays hidden from query writers.
Write queries with plain predicates on the source columns, such as WHERE event_time BETWEEN '2024-01-01' AND '2024-01-02', rather than on derived partition columns; Iceberg applies the transform to the predicate and prunes partitions automatically.
Confirm that partition pruning is active by running EXPLAIN on the query and checking that the scan covers only data files from the relevant partitions.
Choose among the available transforms, identity, bucket(N, col), truncate(L, col), years(col), months(col), days(col), and hours(col), according to each column's query patterns and cardinality.
Inspect how the transforms map to physical partition values by running SELECT * FROM my_catalog.db.my_table.partitions.
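The way a plain range predicate becomes partition pruning can be sketched in a few lines of Python. This is a simplified model, not Iceberg's implementation: the names hours_transform, truncate_transform, and prune_by_hour are hypothetical, timestamps are assumed to be in UTC, and the list of partition values is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hours_transform(ts: datetime) -> int:
    """Model of hours(col): whole hours elapsed since the Unix epoch."""
    # timedelta // timedelta is integer floor division, so pre-epoch values round down.
    return (ts - EPOCH) // timedelta(hours=1)


def truncate_transform(width: int, value: str) -> str:
    """Model of truncate(L, col) for strings: keep at most `width` characters."""
    return value[:width]


def prune_by_hour(partition_hours, start: datetime, end: datetime):
    """Keep only the hour partitions that can contain rows with
    start <= event_time <= end (the BETWEEN predicate from the query)."""
    lo, hi = hours_transform(start), hours_transform(end)
    return [h for h in partition_hours if lo <= h <= hi]


# Hypothetical partition values present in a table.
existing = [473351, 473352, 473360, 473376, 473377]
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 1, 2, tzinfo=timezone.utc)

print(hours_transform(start))                 # 473352
print(prune_by_hour(existing, start, end))    # partitions outside the range are skipped
print(truncate_transform(16, "user-0123456789abcdef"))
```

The query author only supplies the two timestamps; the transform is applied to the predicate bounds, which is why the partition columns never need to appear in the WHERE clause.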
Known gotchas
Hidden partitioning does not prevent a full table scan when the query predicate cannot be mapped to a partition transform, so make sure predicates filter on the source columns of the transforms actually in use.
The bucket transform is not suitable for range queries because it hashes values, so adjacent values land in unrelated buckets; use it only for equality lookups or to distribute writes evenly.
Changing from identity partitioning to a transform-based spec requires partition spec evolution via ALTER TABLE; data written before the change keeps its old layout until it is rewritten.
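The bucket gotcha can be illustrated with a small Python sketch. It is a stand-in only: Iceberg's bucket transform uses a 32-bit Murmur3 hash, whereas this example uses zlib.crc32 purely to show the behaviour of any hash-based layout, and the function names are hypothetical.

```python
import zlib


def bucket_standin(num_buckets: int, value: int) -> int:
    """Stand-in for bucket(N, col). NOT Iceberg's hash (Iceberg uses Murmur3);
    crc32 is used here only because it ships with the standard library."""
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


def buckets_for_equality(num_buckets: int, value: int) -> set:
    """An equality predicate maps to exactly one bucket."""
    return {bucket_standin(num_buckets, value)}


def buckets_for_range(num_buckets: int, lo: int, hi: int) -> set:
    """A range predicate must consider the bucket of every value in the range,
    because hashing destroys the ordering of the original values."""
    return {bucket_standin(num_buckets, v) for v in range(lo, hi + 1)}


print(buckets_for_equality(16, 1500))            # a single bucket to read
print(sorted(buckets_for_range(16, 1000, 1999))) # many buckets, so little or nothing is pruned
```

An equality lookup narrows the scan to one bucket, while the range spreads across many of them, which is why bucket is better paired with a time or truncate transform when range filters are common.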