Assign a WatermarkStrategy using WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(10)) and chain .withTimestampAssigner() to extract the event timestamp field from each record
On the windowed operator, set .allowedLateness(Time.seconds(5)) to accept records that arrive after the watermark but within a 5-second grace period — these re-trigger the window and emit an updated result
Define a side output tag: OutputTag<MyEvent> lateTag = new OutputTag<>("late-events"){} and pass it to .sideOutputLateData(lateTag) on the windowed stream
Collect the side output stream: DataStream<MyEvent> lateStream = windowedResult.getSideOutput(lateTag) and sink it to a monitoring topic or dead-letter store
Tune the out-of-orderness bound to match observed source lag; too large a value increases end-to-end latency, too small causes correct records to be routed to the side output
Monitor the numLateRecordsDropped metric in the Flink UI to confirm the side output is catching genuinely late data and the bound is appropriately calibrated
Known gotchas
allowedLateness only applies to window operators; records that arrive after the watermark on a non-windowed stream are never explicitly routed to a side output — they are processed normally, which may produce incorrect aggregations
The side output only captures records that arrive after both the watermark AND the allowedLateness window; records within the allowedLateness period trigger window re-firing, not side output routing
Flink's default watermark is set to Long.MIN_VALUE at startup; no windows will fire until the first watermark advances past the window's end time, which can cause apparent delays when the pipeline first starts
Give your agent this knowledge — and 200+ more routes
One MCP install gives any agent live access to the full route map, with trust scores updated by agent consensus:
claude mcp add --transport http waymark https://mcp.waymark.network/mcp