Define a Dataset object with a URI string (e.g., Dataset('s3://bucket/path/to/table')) and import it in both the producer and consumer DAG files
In the producer DAG, set outlets=[dataset] on the task that writes the data, which causes Airflow to record a dataset event when that task completes successfully
In the consumer DAG, set schedule=[dataset] on the DAG definition; Airflow then triggers a run once every listed dataset has been updated at least once since the consumer's previous dataset-triggered run
Use DatasetAlias (added in Airflow 2.10; called AssetAlias in Airflow 3, where datasets are renamed to assets) to resolve the dataset URI at runtime when the exact path is not known at DAG parse time
Monitor dataset events in the Airflow UI's Datasets view (Assets in Airflow 3) to inspect which DAG runs produced each dataset update and which consumer runs they triggered
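The first three steps can be sketched as follows. This is a minimal illustration assuming Airflow 2.4 or later in the 2.x line (in Airflow 3 the equivalent class is Asset, imported from airflow.sdk). The DAG ids, task ids, and the write_orders / read_orders callables are hypothetical placeholders, and both DAGs are shown in one file only for brevity.

```python
from datetime import datetime

from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.python import PythonOperator

# Shared definition. In a real project this would live in one module that
# both DAG files import, so the URI string exists in exactly one place.
orders_dataset = Dataset("s3://bucket/path/to/table")


def write_orders():
    """Hypothetical placeholder for the code that writes the table."""


def read_orders():
    """Hypothetical placeholder for the code that reads the table."""


# Producer: a dataset event is recorded when write_orders succeeds.
with DAG(
    dag_id="produce_orders",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as producer_dag:
    PythonOperator(
        task_id="write_orders",
        python_callable=write_orders,
        outlets=[orders_dataset],
    )

# Consumer: scheduled on the dataset instead of a cron expression.
with DAG(
    dag_id="consume_orders",
    start_date=datetime(2024, 1, 1),
    schedule=[orders_dataset],
    catchup=False,
) as consumer_dag:
    PythonOperator(task_id="read_orders", python_callable=read_orders)
```

Keeping orders_dataset in a single shared module is a deliberate choice here: it avoids two hand-typed copies of the URI drifting apart.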
Known gotchas
Dataset URIs are matched by exact string equality; a trailing slash or case difference causes producer and consumer to reference different datasets and the trigger never fires
If a producer task fails after partially writing data, no dataset event is emitted and the consumer DAG is not triggered; make writes idempotent and configure task-level retries so a retry can safely overwrite the partial output and emit the event once the task succeeds
Dataset-triggered DAGs do not have a natural logical date derived from a cron schedule; the logical_date is set to the time of the triggering event, which can confuse backfill operations that expect schedule-aligned dates
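One way to guard against the URI-mismatch gotcha is to pass every URI through a single normalizing function before constructing the Dataset. The sketch below is not part of Airflow; canonical_dataset_uri is a hypothetical helper, and lowercasing the host part assumes a storage system whose bucket or host names are case-insensitive. The path's case is left untouched because object keys are often case-sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_dataset_uri(uri: str) -> str:
    """Return one canonical spelling of a dataset URI (hypothetical helper).

    Lowercases the scheme and host and drops trailing slashes from the path.
    The path's case is preserved on purpose.
    """
    parts = urlsplit(uri.strip())
    if not parts.scheme:
        raise ValueError(f"dataset URI needs a scheme: {uri!r}")
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, parts.fragment)
    )


# Two spellings that would otherwise name two different datasets.
producer_uri = canonical_dataset_uri("S3://Bucket/path/to/table/")
consumer_uri = canonical_dataset_uri("s3://bucket/path/to/table")
```

Both calls yield the same string, so a Dataset built from either value refers to the same dataset. Defining the Dataset once and importing it everywhere remains the simpler safeguard; the helper is mainly useful when URIs come from configuration or user input.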