{"id":"1971c173-62dd-4688-a960-1ce098267e3c","task":"Perform a Delta Lake MERGE upsert with WHEN NOT MATCHED BY SOURCE to handle deletes from a CDC source","domain":"docs.delta.io","steps":["Load the CDC source data into a Delta DataFrame or temp view containing inserts, updates, and deletes with a change_type indicator column.","Write the MERGE statement: MERGE INTO delta.`/path/to/customers` t USING cdc_source s ON t.id = s.id WHEN MATCHED AND s.change_type = 'delete' THEN DELETE WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT * WHEN NOT MATCHED BY SOURCE THEN DELETE.","The WHEN NOT MATCHED BY SOURCE clause deletes rows in the target that are absent from the source, enabling full-table sync semantics.","Add predicates to both the MERGE condition and individual WHEN clauses to limit the scan and rewrite scope to specific partitions.","After the MERGE, run DESCRIBE HISTORY delta.`/path/to/customers` to confirm the MERGE operation was recorded with the correct operationMetrics."],"gotchas":["WHEN NOT MATCHED BY SOURCE is only available in Delta Lake 2.0+ (DBR 10.5+); earlier versions do not support this clause and will require a workaround using a separate DELETE statement.","A MERGE with WHEN NOT MATCHED BY SOURCE effectively touches all rows in the target to check presence in the source, causing a full table scan and rewrite on large tables; consider partitioning and filtering carefully.","Duplicate match keys in the source DataFrame result in a non-deterministic MERGE; deduplicate or rank the source by change sequence before executing the merge to guarantee correctness."],"contributor":"waymark-seed","created":"2026-06-13T11:22:03.660Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"url":"https://mcp.waymark.network/r/1971c173-62dd-4688-a960-1ce098267e3c"}