Create a Flink SQL sink table with a PARTITIONED BY clause defining the partition columns, and set 'connector'='filesystem', 'path', and 'format' in the WITH clause
Set sink.partition-commit.trigger='partition-time' or 'process-time' and sink.partition-commit.delay to control when a partition is considered complete
Set sink.partition-commit.policy.kind='success-file' to write a _SUCCESS marker after commit, or 'metastore' to notify Hive Metastore
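For Hive integration, set sink.partition-commit.policy.kind='metastore,success-file' and give the Hive catalog a hive-conf-dir so the partition can be registered in the metastore; the metastore policy only applies to Hive tables, and the two policies are separate commit actions rather than a single atomic update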
Use WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND in the source DDL so partition-time commit uses event time rather than processing time
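The steps above can be sketched as a single Flink SQL script. This is a minimal illustration, not a reference setup: the table names (events_source, events_sink), the columns, the datagen source, the local path, the json format, and the one-hour delay are all assumptions chosen for the example. It also assumes the job runs in streaming mode with checkpointing enabled, since part files are only finalized on checkpoint.

```sql
-- Hypothetical source table; the datagen connector and all column names are placeholders.
CREATE TABLE events_source (
  user_id    BIGINT,
  payload    STRING,
  event_time TIMESTAMP(3),
  -- Watermark trails event time by 5 seconds, so partition-time commit follows event time.
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen'
);

-- Hypothetical sink table; the path and the dt/hr partition layout are placeholders.
CREATE TABLE events_sink (
  user_id BIGINT,
  payload STRING,
  dt      STRING,
  hr      STRING
) PARTITIONED BY (dt, hr) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/events_sink',
  'format' = 'json',
  -- Tells Flink how to derive a partition's timestamp from its dt and hr values.
  'partition.time-extractor.timestamp-pattern' = '$dt $hr:00:00',
  -- Commit once the watermark passes the partition time plus the delay.
  'sink.partition-commit.trigger' = 'partition-time',
  'sink.partition-commit.delay' = '1 h',
  -- Write a _SUCCESS marker into the partition directory on commit.
  'sink.partition-commit.policy.kind' = 'success-file'
);

-- Derive the partition columns from the event timestamp while writing.
INSERT INTO events_sink
SELECT
  user_id,
  payload,
  DATE_FORMAT(event_time, 'yyyy-MM-dd'),
  DATE_FORMAT(event_time, 'HH')
FROM events_source;
```

With hourly partitions, a delay of '1 h' means a partition is committed roughly when the watermark passes the end of that hour. A different partition granularity would need a matching timestamp pattern and delay.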
Known gotchas
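If sink.partition-commit.delay is shorter than the actual out-of-orderness of the stream, a partition is committed before its late records arrive. Flink still writes those records into the partition and triggers the commit again, but downstream readers that acted on the first commit will have processed incomplete data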
The filesystem sink uses Flink's two-phase commit; if the TaskManager crashes after pre-commit but before the checkpoint completes, in-progress part files are abandoned and must be cleaned up manually or via a recovery hook
Part files remain in in-progress state until the checkpoint succeeds; enabling frequent small checkpoints reduces exposure but increases checkpoint overhead
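As a rough sketch of the two knobs these gotchas refer to, the checkpoint interval and the commit delay can be adjusted from the SQL client. The values are illustrative only, and events_sink is a hypothetical table name standing in for your own sink table.

```sql
-- Checkpoint once a minute; part files are finalized only when a checkpoint completes.
-- Shorter intervals shrink the in-progress window at the cost of more checkpoint overhead.
SET 'execution.checkpointing.interval' = '1min';

-- Lengthen the commit delay on a hypothetical sink table so late records
-- have more time to land before the partition is announced as complete.
ALTER TABLE events_sink SET (
  'sink.partition-commit.delay' = '1 h'
);
```

The right delay depends on how late records arrive in practice, so it is worth measuring before settling on a value.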