For a temporal join against a slowly changing dimension (a versioned table in a Flink catalog, or a Hive table), use the FOR SYSTEM_TIME AS OF syntax in SQL: each event row is joined to the dimension row version that was valid at that event's event_time, which gives point-in-time correctness.
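To make the point-in-time behaviour concrete, here is a minimal plain-Python model of what an event-time temporal join computes per event. It is not Flink code: the table names (orders, currency_rates), the sample data, and the helper names versioned_lookup and temporal_join are all hypothetical, and timestamps are plain integers.

```python
from bisect import bisect_right

# Rough model of what a query such as
#   SELECT ... FROM orders AS o
#   JOIN currency_rates FOR SYSTEM_TIME AS OF o.order_time AS r
#   ON o.currency = r.currency
# computes for each event (table and column names are hypothetical).


def versioned_lookup(versions, event_time):
    """Return the value of the latest version with valid_from <= event_time.

    `versions` is a list of (valid_from, value) tuples sorted by valid_from.
    Returns None when no version was valid yet at event_time.
    """
    starts = [valid_from for valid_from, _ in versions]
    idx = bisect_right(starts, event_time) - 1
    return versions[idx][1] if idx >= 0 else None


def temporal_join(events, dimension):
    """Join (key, event_time, payload) events to the version valid at event_time.

    `dimension` maps key -> sorted list of (valid_from, value).
    Inner-join semantics: events without a valid version are skipped.
    """
    joined = []
    for key, event_time, payload in events:
        value = versioned_lookup(dimension.get(key, []), event_time)
        if value is not None:
            joined.append((key, event_time, payload, value))
    return joined


# Hypothetical data: the EUR rate changes at time 100.
rates = {"EUR": [(0, 1.10), (100, 1.20)]}
orders = [("EUR", 50, "order-1"), ("EUR", 150, "order-2")]
print(temporal_join(orders, rates))
```

The first order is enriched with the rate valid at time 50 and the second with the rate valid at time 150, regardless of when the rows are processed. That independence from processing order is what point-in-time correctness means here.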
For event-to-event interval joins (two append-only streams where you want to match events within a time bound), use the DataStream API's intervalJoin().between(), or in SQL add a time-range predicate that relates the two streams' time attributes (for example BETWEEN with an INTERVAL offset) alongside the equality condition.
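The matching rule of an interval join can be sketched in plain Python as below. This is a batch model for illustration only, not the Flink operator: the function name interval_join, the record layout (key, timestamp, value), and the sample data are assumptions. Both bounds are treated as inclusive, which matches the default of intervalJoin().between().

```python
from collections import defaultdict


def interval_join(left, right, lower, upper):
    """Inner-join records whose timestamps satisfy
    left_ts + lower <= right_ts <= left_ts + upper, per key.

    `left` and `right` are lists of (key, timestamp, value) tuples.
    Returns a list of (key, left_value, right_value) matches.
    """
    right_by_key = defaultdict(list)
    for key, ts, value in right:
        right_by_key[key].append((ts, value))

    matches = []
    for key, left_ts, left_value in left:
        for right_ts, right_value in right_by_key.get(key, []):
            if left_ts + lower <= right_ts <= left_ts + upper:
                matches.append((key, left_value, right_value))
    return matches


# Hypothetical data: a shipment matches an order only within 5 time units.
orders = [("o1", 10, "order")]
shipments = [("o1", 12, "ship-a"), ("o1", 30, "ship-b")]
print(interval_join(orders, shipments, lower=0, upper=5))
```

Only ship-a falls inside the window [10, 15], so ship-b produces no output row. A streaming implementation has to buffer records on both sides until the window can no longer be satisfied, which is why the time bound directly controls state size.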
Register the dimension table as a lookup source with a LookupTableSource implementation (or use the built-in JDBC or HBase connectors) for synchronous enrichment; set a lookup cache TTL to avoid hitting the backend on every record.
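The effect of a lookup cache TTL can be illustrated with a small stand-alone cache. This is a sketch of the idea, not a connector implementation: the class name TtlLookupCache, the loader callback, and the injectable clock are hypothetical, and real connectors expose the TTL through their own table options.

```python
import time


class TtlLookupCache:
    """Caches lookup results and reloads an entry once it is older than the TTL."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # function key -> value, stands in for the backend
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries = {}         # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]        # fresh cache hit, backend is not contacted
        value = self._loader(key)  # miss or expired entry: query the backend
        self._entries[key] = (value, now + self._ttl)
        return value


# Hypothetical backend that records how often it is queried.
backend_calls = []


def load_customer(customer_id):
    backend_calls.append(customer_id)
    return {"id": customer_id, "tier": "gold"}


cache = TtlLookupCache(load_customer, ttl_seconds=60)
cache.get("c-1")
cache.get("c-1")
print(len(backend_calls))  # the second get is served from the cache
```

The trade-off is staleness: with a 60-second TTL, an update in the backend may take up to 60 seconds to show up in enriched records, so pick the TTL from how quickly the dimension changes.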
Assign watermarks and event-time attributes to both streams before the join; Flink uses the minimum watermark of the two inputs to decide when join state can be cleaned up, so a slow or idle input holds back cleanup for both sides.
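The minimum-watermark rule can be modelled in a few lines of plain Python. The class TwoInputWatermark and the helper can_evict are hypothetical names, and the eviction rule is a simplification of what a join operator does internally, shown only to illustrate why a lagging input delays cleanup.

```python
class TwoInputWatermark:
    """Tracks the combined watermark of a two-input operator as min(left, right)."""

    def __init__(self):
        self.left = float("-inf")
        self.right = float("-inf")

    def advance(self, side, watermark):
        """Record a watermark from one input and return the combined watermark."""
        if side == "left":
            # A watermark on a single input never moves backwards.
            self.left = max(self.left, watermark)
        elif side == "right":
            self.right = max(self.right, watermark)
        else:
            raise ValueError("side must be 'left' or 'right'")
        return self.current()

    def current(self):
        return min(self.left, self.right)


def can_evict(buffered_ts, upper_bound, watermark):
    """Simplified rule: a buffered left record is safe to drop once the combined
    watermark has reached buffered_ts + upper_bound, because no right record
    that could still match it is expected after that point."""
    return watermark >= buffered_ts + upper_bound


wm = TwoInputWatermark()
wm.advance("left", 100)            # right input has not reported yet
combined = wm.advance("right", 40)
print(combined, can_evict(buffered_ts=35, upper_bound=10, watermark=combined))
```

Even though the left input is at 100, the combined watermark stays at 40 until the right input catches up, so a record buffered at time 35 with an upper bound of 10 stays in state.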
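Set the state TTL (table.exec.state.ttl) to bound the state kept by regular, unbounded joins in high-throughput scenarios. Interval joins normally expire unmatched records on their own once the watermark passes the time bound, so for them keep the interval as tight as the use case allows.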
Test the join correctness with out-of-order test data using Flink's MiniClusterExtension and AssertCollect utilities.
Known gotchas
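Event-time temporal table joins require the right-hand table to have a primary key and an event-time attribute; if either is missing, the planner rejects the query with a validation error before the job starts.
The DataStream intervalJoin() is an inner join: it only emits matched pairs, and unmatched records are silently dropped when their state expires. If you need unmatched records, use an outer interval join (LEFT, RIGHT, or FULL) in SQL, or in the DataStream API connect the two keyed streams and use a KeyedCoProcessFunction with timers that sends timed-out records to a side output.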
Using processing-time temporal joins means the dimension snapshot seen depends on wall-clock time of execution, which is non-deterministic on replay; prefer event-time temporal joins for reproducibility.