Data Contracts — Producers, Consumers, CI

Schemas, SLAs, and ownership, enforced in CI before a breaking change reaches production.

Theory

A contract turns 'oh sorry, we changed it' into a build failure

A data contract is a producer-owned, machine-readable declaration of:

  • Schema — columns, types, nullability, semantic meaning.
  • Quality expectations — uniqueness, ranges, distributions, freshness SLA.
  • Ownership — a team, an on-call rotation, a Slack channel.
  • Versioning policy — what's a breaking change, deprecation window.
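
To make "machine-readable declaration" concrete, here is a minimal sketch of the four parts above as plain Python dataclasses, plus a small checker for schema and uniqueness. Every name in it (Column, DataContract, validate_rows, the team, the channel, the SLA numbers, and the user_id column) is hypothetical and chosen for illustration; real projects would more likely use a YAML contract with a dedicated tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: type            # expected Python type of each non-null value
    nullable: bool = False
    unique: bool = False
    description: str = ""  # semantic meaning, for humans


@dataclass(frozen=True)
class DataContract:
    dataset: str
    version: str
    owner_team: str               # ownership
    slack_channel: str
    freshness_sla_hours: int      # quality expectation (declared, not checked here)
    deprecation_window_days: int  # versioning policy
    columns: tuple = ()


def validate_rows(contract, rows):
    """Return a list of violations; an empty list means the batch passes."""
    violations = []
    seen = {c.name: set() for c in contract.columns if c.unique}
    for i, row in enumerate(rows):
        for col in contract.columns:
            if col.name not in row:
                violations.append(f"row {i}: missing column '{col.name}'")
                continue
            value = row[col.name]
            if value is None:
                if not col.nullable:
                    violations.append(f"row {i}: '{col.name}' is null")
                continue
            if not isinstance(value, col.dtype):
                violations.append(
                    f"row {i}: '{col.name}' expected {col.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
                continue
            if col.unique:
                if value in seen[col.name]:
                    violations.append(
                        f"row {i}: duplicate '{col.name}' value {value!r}"
                    )
                seen[col.name].add(value)
    return violations


# Hypothetical contract for the prod_users table mentioned in this lesson.
USERS_CONTRACT = DataContract(
    dataset="prod_users",
    version="1.0.0",
    owner_team="identity-platform",
    slack_channel="#identity-data",
    freshness_sla_hours=24,
    deprecation_window_days=90,
    columns=(
        Column("user_id", int, unique=True, description="Surrogate key"),
        Column("email_v2", str, description="Normalized login email"),
    ),
)
```

Calling validate_rows(USERS_CONTRACT, batch) on a sample batch returns one message per violated expectation, which a pipeline step can log or turn into a failure.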

Enforcement happens in two places: in CI, where the producer's pipeline is tested against the contract before merge, and at the boundary, where consumers validate incoming data with tools such as Great Expectations, Soda, or dbt tests. The cultural shift is the hard part: contracts move responsibility for analytics-grade data from the data team to the team that owns the source system. Done right, a schema change surfaces as a failed build on the producer's side rather than as a 'who changed prod_users.email_v2?' phone call.
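
The producer-side CI gate can be sketched as a comparison between the published schema and the one proposed in a pull request. This is an illustration only: the function names, the dict layout, and the rule set (a removed column, a changed type, or a column becoming nullable counts as breaking; an added column does not) are assumptions, and each team defines its own rules in its versioning policy.

```python
def breaking_changes(old, new):
    """Compare schemas shaped like {column: {"type": str, "nullable": bool}}.

    Returns a list of breaking changes under the assumed rule set.
    """
    problems = []
    for name, spec in old.items():
        if name not in new:
            # A rename shows up here too: the old name disappears.
            problems.append(f"column removed: {name}")
            continue
        if new[name]["type"] != spec["type"]:
            problems.append(
                f"type changed: {name} {spec['type']} -> {new[name]['type']}"
            )
        if new[name]["nullable"] and not spec["nullable"]:
            problems.append(f"became nullable: {name}")
    return problems


def ci_gate(old, new):
    """Print any breaking changes and return a process exit code (0 = pass)."""
    problems = breaking_changes(old, new)
    for problem in problems:
        print(f"CONTRACT VIOLATION: {problem}")
    return 1 if problems else 0


# Hypothetical schemas: the pull request renames email_v2 to email_v3.
PUBLISHED = {
    "user_id": {"type": "bigint", "nullable": False},
    "email_v2": {"type": "string", "nullable": False},
}
PROPOSED = {
    "user_id": {"type": "bigint", "nullable": False},
    "email_v3": {"type": "string", "nullable": False},
}
```

A CI script would pass the return value of ci_gate(PUBLISHED, PROPOSED) to sys.exit, so a non-zero code blocks the merge until the change is versioned and announced.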

Analogy

A data contract is the nutrition label plus the supplier agreement on a packaged food. The label states exactly what's inside (schema), the freshness date (SLA), and who to call if it's wrong (ownership). The supplier also agrees not to quietly swap an ingredient without a relabel-and-notice period (versioning). Crucially, the factory's own quality gate rejects a bad batch before it ships (CI enforcement), instead of customers discovering it on the shelf. Responsibility for good data sits with whoever produces it.

Reflect

Contracts work when the documentation and the checks are both generated from the same file that CI enforces. They fail when they live on a Confluence page that nobody updates. Pick the tooling that fits the org's existing CI: Bitol/DataContract CLI, dbt model contracts, or Schema Registry.
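
One way to picture "one file" is a single contract document from which both the table definition and a check are derived, so the two cannot drift apart. The sketch below assumes a made-up JSON layout and hypothetical helper names (to_ddl, missing_columns); it does not follow the format of any particular tool.

```python
import json

# Hypothetical contract file contents, inlined as a string for the example.
CONTRACT_JSON = """
{
  "dataset": "prod_users",
  "columns": [
    {"name": "user_id", "type": "BIGINT", "nullable": false},
    {"name": "email_v2", "type": "VARCHAR", "nullable": false}
  ]
}
"""


def to_ddl(contract):
    """Generate a CREATE TABLE statement from the contract."""
    lines = []
    for col in contract["columns"]:
        null_sql = "" if col["nullable"] else " NOT NULL"
        lines.append(f"  {col['name']} {col['type']}{null_sql}")
    body = ",\n".join(lines)
    return f"CREATE TABLE {contract['dataset']} (\n{body}\n);"


def missing_columns(contract, actual_columns):
    """Return contract columns that are absent from a real table or file."""
    expected = [col["name"] for col in contract["columns"]]
    return [name for name in expected if name not in actual_columns]


contract = json.loads(CONTRACT_JSON)
ddl = to_ddl(contract)
```

Because to_ddl and missing_columns read the same parsed contract, editing the file changes what is generated and what is enforced in one step.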

  • Which one upstream table, if it had a contract today, would have prevented your last big incident?
  • Who would *own* the contract — the producing service team, or the data team? Why?
