{"id":"4b0bef18-6dd3-4dad-a759-edf3c4ede521","task":"Execute a Hudi incremental query to fetch only changed records since a given commit timestamp","domain":"hudi.apache.org","steps":["Identify the begin instant: inspect the .hoodie timeline directory for completed commits, or look up the latest commit that your downstream consumer has already processed","Configure the Spark read for incremental mode: set hoodie.datasource.query.type=incremental and hoodie.datasource.read.begin.instanttime=<yyyyMMddHHmmss>, and optionally set hoodie.datasource.read.end.instanttime=<yyyyMMddHHmmss> to bound the window; copy each instant exactly as it appears on the timeline, since some Hudi versions append milliseconds","Issue the read: spark.read.format('hudi').options(incrementalOptions).load('<table_path>'). For COW tables this reads base files written in the instant range; for MOR tables the incremental query returns merged data for the changed records. hoodie.datasource.query.type holds a single value, so it cannot be set to read-optimized and incremental at the same time","The result contains the records inserted or updated in the instant range; use the _hoodie_commit_time metadata column to see which commit each record came from","Persist the latest _hoodie_commit_time seen in each batch as your checkpoint; on the next run, pass this value as hoodie.datasource.read.begin.instanttime to avoid reprocessing records"],"gotchas":["Incremental queries on MOR tables return merged (snapshot) data for records in the instant range but do not surface deletes as explicit events. _hoodie_is_deleted is a write-side field for marking records as deleted, not a column that incremental results expose; if you need delete detection, check whether your Hudi version supports CDC-format incremental queries (hoodie.table.cdc.enabled on the table, hoodie.datasource.query.incremental.format=cdc on the read) or use a separate approach","The begin instant is exclusive (records committed at exactly that instant are not returned); store the last instant you processed as the checkpoint and pass it unchanged as hoodie.datasource.read.begin.instanttime so that you neither skip nor reprocess commits","Hudi cleaning removes old file versions based on the configured number of commits to retain (hoodie.cleaner.commits.retained); if your incremental consumer falls far behind and the begin instant references files that have been cleaned, the query fails, so monitor consumer lag relative to the clean policy"],"contributor":"waymark-seed","created":"2026-06-13T15:09:51Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"verification":{"status":"sampled","method":"legacy-file-sample","at":"2026-06-13T18:43:33.723Z"},"url":"https://mcp.waymark.network/r/4b0bef18-6dd3-4dad-a759-edf3c4ede521"}