Enable Change Data Feed (CDF) on an existing table with ALTER TABLE ... SET TBLPROPERTIES ('delta.enableChangeDataFeed' = 'true'), or at creation by setting the same property in the CREATE TABLE ... TBLPROPERTIES clause
Perform INSERT, UPDATE, and DELETE operations on the source table to generate change records
Run DESCRIBE HISTORY to identify the starting version, then read the change feed with table_changes('table_name', start_version) in Spark SQL or with the equivalent DataFrame reader options (readChangeFeed, startingVersion); the starting version is inclusive
Inspect the _change_type column values (insert, update_preimage, update_postimage, delete) to distinguish operation types in the downstream pipeline
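Persist the last-consumed version number in the downstream pipeline's checkpoint or state store and use that version plus one as the next start_version on the following run, because start_version is inclusive and reusing the same number would reprocess the last version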
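The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (enable_cdf_statement, next_start_version, read_changes, split_by_change_type) are hypothetical, the spark argument is assumed to be an existing SparkSession with Delta Lake configured, and first_version is assumed to be the version at which CDF was enabled, as found with DESCRIBE HISTORY.

```python
from typing import Optional


def enable_cdf_statement(table: str) -> str:
    """Build the SQL that enables the change data feed on an existing table."""
    return (
        f"ALTER TABLE {table} "
        "SET TBLPROPERTIES ('delta.enableChangeDataFeed' = 'true')"
    )


def next_start_version(last_consumed: Optional[int], first_version: int) -> int:
    """Version to start the next read from.

    The starting version of a change feed read is inclusive, so resuming
    means reading from one past the last version already consumed.
    """
    if last_consumed is None:
        return first_version  # first run: begin where the caller says CDF starts
    return last_consumed + 1


def read_changes(spark, table: str, start_version: int):
    """Read change rows from start_version onward with the DataFrame reader.

    `spark` is assumed to be a SparkSession with Delta Lake available.
    """
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", start_version)
        .table(table)
    )


def split_by_change_type(changes):
    """Separate rows to upsert from rows to delete, dropping pre-images."""
    change_type = changes["_change_type"]
    upserts = changes.filter(change_type.isin("insert", "update_postimage"))
    deletes = changes.filter(change_type == "delete")
    return upserts, deletes
```

A run would execute spark.sql(enable_cdf_statement("events")) once, then call read_changes with next_start_version(saved_version, first_version) on each run. The highest _commit_version in the returned rows is the value to persist for the next run. Dropping update_preimage rows is a design choice that suits upsert-style sinks; pipelines that need before/after comparisons would keep them.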
Known gotchas
CDF only captures changes made after enableChangeDataFeed is set; there is no retroactive change history for operations performed before enabling the property
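OPTIMIZE and other table maintenance operations create new table versions but do not emit change rows (_change_type is only ever insert, update_preimage, update_postimage, or delete), so downstream consumers should expect some versions in a range to return no change data rather than look for a maintenance change type to filter out
CDF change data files follow the table's file retention policy and are removed by VACUUM, and reading a version also requires its transaction log entry, which is kept according to delta.logRetentionDuration; if the downstream pipeline falls behind either retention window, it will lose access to intermediate versions and must reseed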
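The gotchas above suggest guarding each run before reading the feed. The following is a minimal sketch under stated assumptions: ReadPlan and plan_next_read are hypothetical names, and earliest_available is assumed to be supplied by the caller as the oldest version that is still readable (no earlier than the version at which CDF was enabled, and not yet aged out by retention or VACUUM). How that value is obtained is left to the pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReadPlan:
    action: str                  # "read", "noop", or "reseed"
    start: Optional[int] = None  # inclusive start version when action == "read"
    end: Optional[int] = None    # inclusive end version when action == "read"


def plan_next_read(last_consumed: int, earliest_available: int, latest: int) -> ReadPlan:
    """Decide what the next run should do.

    last_consumed:      last version the pipeline fully processed
    earliest_available: oldest version still readable (caller-supplied assumption)
    latest:             current table version, e.g. from DESCRIBE HISTORY
    """
    start = last_consumed + 1  # start version is inclusive, so skip the consumed one
    if start < earliest_available:
        # Intermediate versions are gone; incremental reads would skip changes.
        return ReadPlan("reseed")
    if start > latest:
        # Nothing new has been committed since the last run.
        return ReadPlan("noop")
    return ReadPlan("read", start, latest)
```

A "read" plan may still return zero rows when the versions in range contain only maintenance commits such as OPTIMIZE; the pipeline can still advance its saved version to plan.end in that case.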