Overview
Lineage and Impact Analysis
Track where every value came from and which downstream consumers depend on it.
Why it matters
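Lineage answers the question: "if I change or break this upstream column, what breaks downstream?" Tools such as OpenLineage, dbt, and Apache Atlas extract it automatically, whether by parsing SQL, reading the declared model DAG, or collecting events from the engine as jobs run.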
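Once lineage is stored as a graph, the impact question reduces to a reachability search. The sketch below is a minimal illustration, not any particular tool's API: the `LINEAGE` mapping, its column names, and the `downstream_impact` helper are all hypothetical.

```python
from collections import deque

# Hypothetical column-level lineage: each key maps to its direct downstream consumers.
LINEAGE = {
    "raw.orders.amount": ["staging.orders.amount_usd"],
    "staging.orders.amount_usd": [
        "marts.revenue.daily_total",
        "ml.features.avg_order_value",
    ],
    "marts.revenue.daily_total": ["looker.revenue_dashboard"],
}


def downstream_impact(edges, start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}  # guards against cycles and keeps `start` out of the result
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = downstream_impact(LINEAGE, "raw.orders.amount")
print(impacted)
```

Breadth-first order lists the nearest consumers first, which tends to match the order in which breakage would surface.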
Going deeper
Three lineage layers worth distinguishing:
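- Static lineage — parsed from SQL / dbt DAGs. Cheap and broad in coverage, but blind to what happens at run time: it cannot tell which CASE WHEN branch actually supplied a value, and it misses SQL that is generated dynamically.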
- Runtime lineage — emitted by the engine (OpenLineage events from Airflow, Spark, dbt). Captures what actually ran.
- BI / consumer lineage — extends lineage past the warehouse into Looker, Tableau, ML feature stores. The hardest tier, and the one that closes the loop.
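One way to see why the layers are worth keeping separate is to compare their edge sets. The sketch below assumes each layer has already been reduced to (upstream, downstream) pairs; every table and dashboard name is hypothetical, and real tools expose richer structures than plain tuples.

```python
# Hypothetical edges per layer, as (upstream, downstream) pairs.
static_edges = {
    ("raw.orders", "staging.orders"),
    ("staging.orders", "marts.revenue"),
    ("raw.refunds", "marts.revenue"),  # present in the code, but never executed
}
runtime_edges = {
    ("raw.orders", "staging.orders"),
    ("staging.orders", "marts.revenue"),
    ("raw.orders_eu", "staging.orders"),  # read through dynamically built SQL
}
consumer_edges = {
    ("marts.revenue", "looker.revenue_dashboard"),
}

# Edges the parser could not see, and edges that exist only on paper.
only_runtime = runtime_edges - static_edges
only_static = static_edges - runtime_edges

# The union is the graph an impact analysis would actually traverse.
full_graph = static_edges | runtime_edges | consumer_edges

print(sorted(only_runtime))
print(sorted(only_static))
print(len(full_graph))
```

Edges that appear only at run time point at dynamic SQL the parser missed; edges that appear only statically point at dead or conditional code paths.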