Create the Delta table with the CLUSTER BY clause specifying up to four clustering columns that reflect the most common filter predicates.
For an existing unpartitioned Delta table, use ALTER TABLE ... CLUSTER BY to declare clustering columns; this does not immediately recluster existing data.
Run OPTIMIZE on the table (with no ZORDER BY, which cannot be combined with liquid clustering) to physically recluster files; Delta uses Hilbert curve ordering across the clustering columns.
Schedule periodic OPTIMIZE runs to incrementally recluster data written since the last optimization; Delta tracks which files need reclustering via its transaction log.
Query the table normally; the engine reads the per-file min/max column statistics recorded in the transaction log and skips files whose value ranges do not overlap the query predicate.
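The steps above can be sketched as a small helper that renders the SQL for a new clustered table and its follow-up OPTIMIZE. This is a minimal sketch, not a definitive implementation: the table name `events`, its schema, and the helper names `cluster_by_clause` and `workflow_statements` are hypothetical, and the statements are only built as strings here. In a real session each one would be passed to something like `spark.sql(...)`.

```python
from typing import List, Sequence

MAX_CLUSTERING_COLUMNS = 4  # limit stated in the first step


def cluster_by_clause(columns: Sequence[str]) -> str:
    """Render a CLUSTER BY clause after checking the four-column limit."""
    if not 1 <= len(columns) <= MAX_CLUSTERING_COLUMNS:
        raise ValueError("liquid clustering takes between 1 and 4 columns")
    return "CLUSTER BY (" + ", ".join(columns) + ")"


def workflow_statements(table: str, schema: str, columns: Sequence[str]) -> List[str]:
    """Return the SQL for creating a clustered table and reclustering it.

    The same OPTIMIZE statement (no ZORDER BY) is what a scheduled job
    would rerun to recluster newly written data incrementally.
    """
    return [
        f"CREATE TABLE {table} ({schema}) USING DELTA {cluster_by_clause(columns)}",
        f"OPTIMIZE {table}",
    ]


# Hypothetical table and columns, chosen to match common filter predicates.
statements = workflow_statements(
    "events",
    "event_date DATE, user_id BIGINT, payload STRING",
    ["event_date", "user_id"],
)
for statement in statements:
    print(statement)
```

Keeping the column-count check in one place means a fifth clustering column is rejected before any SQL reaches the cluster.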
Known gotchas
Liquid clustering is incompatible with partitioned tables created with PARTITIONED BY; you must recreate the table without explicit partitioning to use CLUSTER BY.
Changing clustering columns with ALTER TABLE re-declares the target layout but does not recluster existing files until the next OPTIMIZE run.
Liquid clustering requires Delta Lake 3.1 or later (Databricks Runtime 13.3 LTS and above); older runtimes will fail to read or write tables that carry the clustering metadata.
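The first two gotchas can be turned into a small planning helper. The sketch below is illustrative only: `migration_statements`, the `_clustered` suffix, and the `sales` table are hypothetical, and the CREATE TABLE ... AS SELECT form is an assumption about what your runtime accepts, so check it against your platform's documentation before relying on it. The code only builds SQL strings and does not run them.

```python
from typing import List, Sequence


def migration_statements(
    table: str,
    partition_columns: Sequence[str],
    clustering_columns: Sequence[str],
) -> List[str]:
    """Plan the SQL needed to move an existing table to liquid clustering."""
    if not 1 <= len(clustering_columns) <= 4:
        raise ValueError("liquid clustering takes between 1 and 4 columns")
    cluster_by = "CLUSTER BY (" + ", ".join(clustering_columns) + ")"

    if partition_columns:
        # A PARTITIONED BY table cannot take CLUSTER BY, so copy the data
        # into a new, unpartitioned clustered table (hypothetical name).
        new_table = f"{table}_clustered"
        return [
            f"CREATE TABLE {new_table} USING DELTA {cluster_by} AS SELECT * FROM {table}",
            f"OPTIMIZE {new_table}",
        ]

    # ALTER TABLE only re-declares the target layout; the OPTIMIZE that
    # follows is what actually rewrites existing files.
    return [
        f"ALTER TABLE {table} {cluster_by}",
        f"OPTIMIZE {table}",
    ]


# Hypothetical example: a table currently partitioned by region.
partitioned_plan = migration_statements("sales", ["region"], ["order_date"])
# Hypothetical example: the same table if it had no partitioning.
unpartitioned_plan = migration_statements("sales", [], ["order_date", "customer_id"])
```

Both branches end with OPTIMIZE so the declared layout and the physical layout do not stay out of sync.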