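Enable Change Data Feed on the source table by setting the delta.enableChangeDataFeed table property to true (ALTER TABLE ... SET TBLPROPERTIES on an existing table, or TBLPROPERTIES in CREATE TABLE). Only changes committed after the property is set are recorded.
Read changes using table_changes in SQL or the readChangeFeed option in Spark, providing a start (startingVersion or startingTimestamp) and optionally an end (endingVersion or endingTimestamp). Both bounds are inclusive.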
Filter the _change_type column (insert, update_preimage, update_postimage, delete) to apply the appropriate upsert or delete logic to the target table.
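In Structured Streaming mode, set readChangeFeed to true and startingVersion to latest or to an explicit version number; the stream will emit new change rows as data is written. startingVersion applies only when a query first starts; after that, the query resumes from its checkpoint location.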
Propagate the _commit_version and _commit_timestamp metadata columns to the sink if the downstream system needs ordering or deduplication guarantees.
Persist the last processed version in the downstream system and, on restart, use that version plus one as startingVersion. The bound is inclusive, so passing the stored version itself would reprocess its changes.
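The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (cdf_read_options, plan_actions, next_starting_version, read_changes, start_change_stream), the id key column, and the sample rows are all hypothetical. It also assumes change rows are available as plain dictionaries, and that spark is an existing SparkSession with Delta Lake configured.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Any]


def cdf_read_options(starting_version: int, ending_version: Optional[int] = None) -> Dict[str, str]:
    """Reader options for a batch change feed read; both bounds are inclusive."""
    options = {"readChangeFeed": "true", "startingVersion": str(starting_version)}
    if ending_version is not None:
        options["endingVersion"] = str(ending_version)
    return options


def plan_actions(changes: Iterable[Row], key: str) -> Tuple[List[Row], List[Any]]:
    """Reduce change rows to one final action per key: (rows to upsert, keys to delete)."""
    latest: Dict[Any, Row] = {}
    for row in changes:
        if row["_change_type"] == "update_preimage":
            continue  # old values are not needed for a plain upsert
        seen = latest.get(row[key])
        # Assumption: _commit_version alone is enough to order the changes to a key.
        if seen is None or row["_commit_version"] >= seen["_commit_version"]:
            latest[row[key]] = row
    upserts = [r for r in latest.values() if r["_change_type"] != "delete"]
    deletes = [r[key] for r in latest.values() if r["_change_type"] == "delete"]
    return upserts, deletes


def next_starting_version(changes: Iterable[Row], last_processed: int) -> int:
    """Version to resume from: one past the highest version seen so far."""
    highest = max((r["_commit_version"] for r in changes), default=last_processed)
    return highest + 1


def read_changes(spark: Any, table_name: str, starting_version: int,
                 ending_version: Optional[int] = None) -> Any:
    """Batch read; `spark` is an existing SparkSession with Delta Lake configured."""
    options = cdf_read_options(starting_version, ending_version)
    return spark.read.format("delta").options(**options).table(table_name)


def start_change_stream(spark: Any, table_name: str, checkpoint_dir: str,
                        handle_batch: Callable[[Any, int], None]) -> Any:
    """Streaming read that hands each micro-batch of change rows to `handle_batch`."""
    stream = (spark.readStream.format("delta")
              .option("readChangeFeed", "true")
              .option("startingVersion", "latest")
              .table(table_name))
    return (stream.writeStream
            .option("checkpointLocation", checkpoint_dir)
            .foreachBatch(handle_batch)
            .start())


# Hypothetical change rows, shaped like the output of a change feed read.
sample = [
    {"id": 1, "name": "a", "_change_type": "insert", "_commit_version": 3},
    {"id": 1, "name": "a", "_change_type": "update_preimage", "_commit_version": 4},
    {"id": 1, "name": "b", "_change_type": "update_postimage", "_commit_version": 4},
    {"id": 2, "name": "x", "_change_type": "delete", "_commit_version": 5},
]
upserts, deletes = plan_actions(sample, key="id")
resume_from = next_starting_version(sample, last_processed=2)
```

The pure helpers run without Spark, so the merge logic can be checked against small fixtures before it is wired to a real table. On a large feed the same reduction would more likely be done in Spark itself, for example with a window over the key ordered by _commit_version, rather than on collected rows.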
Known gotchas
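CDF can serve only versions whose change data files and transaction log entries both still exist. VACUUM deletes change data files that fall outside the table's retention threshold, and log cleanup removes old log entries, so attempting to read from versions past either retention window will fail.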
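update_preimage and update_postimage are emitted as separate rows for the same row key; consumers that need the old values must pair the two rows (for example by key and _commit_version) to reconstruct the full update, while a plain upsert can use update_postimage alone.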
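CDF reads can be restricted on tables with column mapping enabled, particularly after non-additive schema changes such as renaming or dropping a column. The exact behavior depends on the Delta Lake or Databricks Runtime version, so check the limitations documented for the version in use.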
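For the preimage/postimage gotcha, one possible way to put the two rows back together is to pair them on the row key and _commit_version. The sketch below is illustrative only: pair_updates, CDF_METADATA, the id key column, and the sample rows are hypothetical, and it assumes a key is updated at most once per commit and that change rows are available as plain dictionaries.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]

# Metadata columns that the change feed adds to every row.
CDF_METADATA = {"_change_type", "_commit_version", "_commit_timestamp"}


def pair_updates(changes: Iterable[Row], key: str) -> List[Dict[str, Any]]:
    """Pair each update_preimage with the update_postimage of the same key and commit."""
    before: Dict[Tuple[Any, int], Row] = {}
    after: Dict[Tuple[Any, int], Row] = {}
    for row in changes:
        slot = (row[key], row["_commit_version"])
        if row["_change_type"] == "update_preimage":
            before[slot] = row
        elif row["_change_type"] == "update_postimage":
            after[slot] = row

    paired = []
    for slot, pre in before.items():
        post = after.get(slot)
        if post is None:
            continue  # postimage not in this batch; skip rather than guess
        # Keep only the data columns whose value actually changed: (old, new).
        changed = {
            col: (pre[col], post.get(col))
            for col in pre
            if col not in CDF_METADATA and pre[col] != post.get(col)
        }
        paired.append({"key": slot[0], "version": slot[1], "changed": changed})
    return paired


# Hypothetical change rows: one update to id 1 and an unrelated insert.
rows = [
    {"id": 1, "name": "a", "qty": 1, "_change_type": "update_preimage", "_commit_version": 4},
    {"id": 1, "name": "b", "qty": 1, "_change_type": "update_postimage", "_commit_version": 4},
    {"id": 2, "name": "x", "qty": 3, "_change_type": "insert", "_commit_version": 4},
]
pairs = pair_updates(rows, key="id")
```

Consumers that only upsert the latest state can skip this pairing and drop the update_preimage rows instead.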