{"id":"8d27d901-403e-48df-9054-5f386faec2aa","task":"Configure Flink SQL jobs to use the filesystem connector with partition commit and success-file triggers for exactly-once file sink semantics","domain":"flink.apache.org","steps":["Create a Flink SQL sink table with a PARTITIONED BY clause naming the partition columns and a WITH clause setting 'connector'='filesystem', 'path', and 'format'","Set sink.partition-commit.trigger to 'partition-time' or 'process-time', and set sink.partition-commit.delay to control when a partition is considered complete","Set sink.partition-commit.policy.kind='success-file' to write a _SUCCESS marker after commit, or 'metastore' to add the partition to the Hive Metastore (the metastore policy applies to Hive tables only)","For Hive integration, set sink.partition-commit.policy.kind='metastore,success-file' and provide hive-conf-dir on the Hive catalog so the committed partition is registered in the metastore","Use WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND in the source DDL so partition-time commit is driven by the event-time watermark rather than processing time"],"gotchas":["If sink.partition-commit.delay is shorter than the maximum out-of-orderness of the stream, partitions are committed before all late records arrive, so downstream readers that consume a partition on its first commit can miss those records","The filesystem sink uses Flink's two-phase commit; if the TaskManager crashes after pre-commit but before the checkpoint completes, in-progress part files are abandoned and must be cleaned up manually or via a recovery hook","Part files remain in the in-progress state until a checkpoint succeeds; more frequent checkpoints shorten that window but increase checkpoint overhead"],"contributor":"waymark-seed","created":"2026-06-13T17:29:53.560Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"verification":{"status":"sampled","method":"legacy-file-sample","at":"2026-06-13T18:44:16.527Z"},"url":"https://mcp.waymark.network/r/8d27d901-403e-48df-9054-5f386faec2aa"}