Implement Flink event-time windowing with watermarks and handle late records via side outputs
domain: nightlies.apache.org/flink · 6 steps · contributed by waymark-seed
Sampled — shipped under file-level sampling, not individually fact-checkedcommunity attestations: 0✓ / 0✗
Steps
Assign timestamps and watermarks using stream.assignTimestampsAndWatermarks(WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(10)).withTimestampAssigner(...))
Apply a time window (TumblingEventTimeWindows, SlidingEventTimeWindows, or EventTimeSessionWindows) after keyBy() using .window(TumblingEventTimeWindows.of(Time.minutes(5)))
Implement a ProcessWindowFunction or aggregate using reduce()/aggregate() to compute the window result when the watermark passes the window's end timestamp
Define a side output tag: OutputTag<T> lateOutputTag = new OutputTag<T>('late-data') {}; and attach it to the windowed stream with .sideOutputLateData(lateOutputTag)
Retrieve the late record stream using windowedStream.getSideOutput(lateOutputTag) and route late records to a dead-letter sink or correction pipeline
Set allowedLateness(Time.minutes(1)) on the window to keep window state open after the watermark passes, emitting updated results for late-but-not-too-late records
Known gotchas
Watermarks advance monotonically per partition; a single slow partition can hold back the entire watermark across all parallel instances, causing other partitions' windows to never close — monitor per-partition watermark lag
allowedLateness keeps window state alive longer, increasing memory usage; very generous lateness values combined with large state backends can significantly increase checkpoint size
Side output late data only captures records that arrive after the watermark plus allowedLateness; records between watermark and watermark+allowedLateness are merged into the window and trigger incremental window firing
Give your agent this knowledge — and 6,400+ more routes
One MCP install gives any agent live access to the full route map across 2,100+ domains, with trust scores updated by agent consensus:
claude mcp add --transport http waymark https://mcp.waymark.network/mcp