Stage incoming records in a source DataFrame or temporary view with a business key and an updated_at timestamp
Write a MERGE INTO statement that matches on the business key and updates target columns WHEN MATCHED AND source.updated_at > target.updated_at
Add a WHEN NOT MATCHED BY TARGET THEN INSERT clause to insert net-new rows from the source
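Add WHEN NOT MATCHED BY SOURCE THEN DELETE to remove target rows that are absent from the source batch, representing hard deletes; this is only safe when the batch is a complete snapshot of the keys in scope, because every target row it omits is deleted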
Run DESCRIBE HISTORY after the merge to confirm that operationMetrics for the MERGE commit show the expected counts for numTargetRowsUpdated, numTargetRowsInserted, and numTargetRowsDeleted
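The steps above can be sketched as a single statement followed by a history check. This is a minimal sketch, not a definitive implementation: the table `customers`, the staged view `staged_customers`, and the key `customer_id` are hypothetical names, and it assumes the source and target share the same columns so that `UPDATE SET *` and `INSERT *` apply.

```sql
-- Hypothetical names: target table customers, staged view staged_customers,
-- business key customer_id. Assumes both sides have identical columns.
MERGE INTO customers AS target
USING staged_customers AS source
  ON target.customer_id = source.customer_id
-- Update only when the staged row is newer than the stored row.
WHEN MATCHED AND source.updated_at > target.updated_at THEN
  UPDATE SET *
-- Insert keys that exist only in the source.
WHEN NOT MATCHED BY TARGET THEN
  INSERT *
-- Delete keys missing from the source (assumes the source is a full snapshot).
WHEN NOT MATCHED BY SOURCE THEN
  DELETE;

-- Inspect operationMetrics on the most recent commit.
DESCRIBE HISTORY customers LIMIT 1;
```

When submitting through spark.sql, run the two statements as separate calls, since each call accepts a single statement.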
Known gotchas
WHEN NOT MATCHED BY SOURCE in SQL requires Delta Lake 2.4 or later (the DeltaTable APIs accept the clause from 2.3); an older version rejects it with a parse error that is easy to mistake for a typo in the statement
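MERGE on a large table with no predicate filtering scans the entire target; add a partition predicate to the merge condition (e.g., AND target.event_date = source.event_date, ideally with a literal bound such as AND target.event_date >= '2024-06-01') to limit the scan. When WHEN NOT MATCHED BY SOURCE is present, put the same bound on that clause, because target rows outside the filter count as unmatched and would otherwise be deleted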
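Delta MERGE does not take a table lock; Delta Lake uses optimistic concurrency control and checks for conflicts at commit time. Readers work from a consistent snapshot and are never blocked, but a long-running merge widens the window in which a concurrent writer touching the same files or partitions commits first, so one of the two operations fails with a conflict error such as ConcurrentAppendException; tables without row-level concurrency enabled are the most exposed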
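The scan-limiting gotcha can be illustrated with a bounded merge. This is a sketch under stated assumptions: the table `events`, the view `staged_events`, the key `event_id`, and the cutoff date '2024-06-01' are all hypothetical, and the staged batch is assumed to contain only rows on or after that cutoff.

```sql
-- Hypothetical names and cutoff; assumes events is partitioned by event_date
-- and staged_events holds only rows with event_date >= '2024-06-01'.
MERGE INTO events AS target
USING staged_events AS source
  ON target.event_id = source.event_id
  AND target.event_date = source.event_date
  -- Literal bound on the partition column so older partitions can be skipped.
  AND target.event_date >= '2024-06-01'
WHEN MATCHED AND source.updated_at > target.updated_at THEN
  UPDATE SET *
WHEN NOT MATCHED BY TARGET THEN
  INSERT *
-- Same bound on the delete clause, so rows in older partitions are left alone.
WHEN NOT MATCHED BY SOURCE AND target.event_date >= '2024-06-01' THEN
  DELETE;
```

Without the condition on the last clause, every target row older than the cutoff would count as unmatched by the source and be deleted.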