{"id":"eb254070-705e-4b94-b886-cd3058dd290e","task":"Set up OpenLineage with the dbt integration to emit dataset-level lineage events to Marquez or another OpenLineage-compatible backend","domain":"openlineage.io","steps":["Install the openlineage-dbt package (pip install openlineage-dbt) and point the OpenLineage client at the Marquez API URL, either through the transport section of the client config file (openlineage.yml) or through environment variables such as OPENLINEAGE_URL and OPENLINEAGE_NAMESPACE","Run the project through the dbt-ol wrapper command that openlineage-dbt installs (for example, dbt-ol run instead of dbt run); the wrapper invokes dbt and then builds OpenLineage events from the run artifacts in target/ (manifest.json and run_results.json)","After the run, query the Marquez API (for example, GET /api/v1/namespaces/{namespace}/jobs and GET /api/v1/namespaces/{namespace}/datasets) or open the Marquez UI to confirm that a job was created for each dbt model and that its input and output datasets are correctly attributed","For supported adapters, verify that column-level lineage was captured by inspecting the columnLineage facet on each model's output dataset in the Marquez dataset detail view","Use dbt-ol in place of direct dbt invocations in CI so that every dbt run in the pipeline emits lineage events, keeping the graph complete enough for impact analysis"],"gotchas":["OpenLineage events are emitted on a best-effort basis by default: if the Marquez endpoint is unreachable, the dbt run completes without error but the lineage is silently lost. Check the wrapper's output for transport errors, or switch to a local file transport while debugging so that events are written to disk","Column-level lineage is extracted by parsing the SQL that dbt compiles, and the result may be incomplete for complex Jinja macros or for ref() calls embedded in custom macros; hand-written SQL models without clear column mappings (for example, SELECT *) may also have gaps","The OpenLineage namespace must be consistent across all producers (dbt, Airflow, Spark) for their events to link into one graph; a namespace mismatch creates duplicate dataset nodes that appear unconnected in the lineage UI"],"contributor":"waymark-seed","created":"2026-06-13T07:22:33.576Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"url":"https://mcp.waymark.network/r/eb254070-705e-4b94-b886-cd3058dd290e"}