Data Contracts & Schema Governance

A schema says *how* the data is structured; a data contract says *who owns it, what it means, and what you may do with it*.

Schema + ownership + meaning + rules

Beyond compatibility checks

Schema Registry's compatibility checks stop a producer from breaking the wire format. They do not stop a team from changing what a field means, deprecating a topic nobody knew you depended on, or leaking PII into a shared stream. That gap is what data contracts close. A data contract is the schema plus the organisational metadata around it:

  • Ownership — a named team, an on-call, an SLA for the topic.
  • Semantics — field-level docs, units, allowed ranges, enumerations.
  • Classification — which fields are PII / financial / public; retention rules.
  • Quality guarantees — freshness, completeness, the compatibility policy.
  • Change process — deprecation windows, who must approve a breaking change.

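Some modern registries, Confluent Schema Registry among them, let you attach metadata, tags and rules to a schema (CEL conditions for validation, JSONata expressions for transformation, field-level encryption directives), so the contract is enforced at serialize time, not just documented in a wiki nobody reads.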
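To make "enforced at serialize time" concrete, here is a minimal sketch in plain Python. It does not use a real registry client or a CEL engine: `Rule`, `ContractViolation`, `serialize_with_contract` and the "orders" fields are all hypothetical, and the rules are ordinary Python predicates standing in for the conditions a registry would store alongside the schema.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Sequence


class ContractViolation(ValueError):
    """Raised when a record breaks a contract rule before it is written."""


@dataclass(frozen=True)
class Rule:
    name: str                                   # human-readable rule id
    check: Callable[[Mapping[str, Any]], bool]  # stand-in for a CEL condition


# Hypothetical rules for an illustrative "orders" topic.
ORDER_RULES = [
    Rule("amount_is_positive", lambda r: r["amount_cents"] > 0),
    Rule("currency_is_known", lambda r: r["currency"] in {"EUR", "USD", "GBP"}),
]


def serialize_with_contract(record: Mapping[str, Any], rules: Sequence[Rule]) -> bytes:
    """Check every rule, then serialize; a bad record never reaches the topic."""
    failed = [rule.name for rule in rules if not rule.check(record)]
    if failed:
        raise ContractViolation(f"rules failed: {', '.join(failed)}")
    # JSON keeps the sketch dependency-free; a real producer would emit Avro.
    return json.dumps(dict(record), sort_keys=True).encode("utf-8")


good = {"order_id": "o-1", "amount_cents": 1250, "currency": "EUR"}
bad = {"order_id": "o-2", "amount_cents": -5, "currency": "XXX"}

payload = serialize_with_contract(good, ORDER_RULES)

try:
    serialize_with_contract(bad, ORDER_RULES)
except ContractViolation as exc:
    rejection = str(exc)  # names every rule the record failed
```

The design point is where the check runs: the producer rejects the record before it is written, so consumers never have to defend against it.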

Annotate Avro with contract metadata

Promote a bare schema into a contract-bearing one: add doc strings, units, an ownership block, and tag the PII field so downstream tooling can enforce masking and retention.
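One possible shape for this exercise is sketched below, with the Avro schemas written as Python dicts. The `Payment` record, its fields, the team names and the SLA figure are invented for illustration. `doc` is a standard Avro attribute; `owner`, `unit`, `tags` and `retention_days` are assumed custom attributes, which Avro permits as metadata but does not interpret, so their names are a convention your tooling would have to agree on.

```python
# Bare schema: enough for the wire format, silent on meaning and ownership.
BARE = {
    "type": "record",
    "name": "Payment",
    "namespace": "com.example.payments",
    "fields": [
        {"name": "payment_id", "type": "string"},
        {"name": "amount", "type": "long"},
        {"name": "customer_email", "type": "string"},
    ],
}

# Contract-bearing version: same fields and types, plus docs, units,
# an ownership block and a PII tag (all values hypothetical).
CONTRACT = {
    "type": "record",
    "name": "Payment",
    "namespace": "com.example.payments",
    "doc": "One settled card payment.",
    "owner": {
        "team": "payments-platform",
        "oncall": "payments-oncall",
        "freshness_sla_minutes": 5,
    },
    "fields": [
        {"name": "payment_id", "type": "string",
         "doc": "Unique payment identifier."},
        {"name": "amount", "type": "long",
         "doc": "Settled amount.", "unit": "minor currency units (cents)"},
        {"name": "customer_email", "type": "string",
         "doc": "Payer's email address.",
         "tags": ["PII"], "retention_days": 30},
    ],
}


def pii_fields(schema):
    """Return the names of top-level fields tagged as PII."""
    return [f["name"] for f in schema["fields"] if "PII" in f.get("tags", [])]


def mask(record, schema, placeholder="***"):
    """Return a copy of the record with PII-tagged fields replaced."""
    hidden = set(pii_fields(schema))
    return {k: (placeholder if k in hidden else v) for k, v in record.items()}


record = {"payment_id": "p-42", "amount": 1999,
          "customer_email": "ada@example.com"}
masked = mask(record, CONTRACT)  # customer_email is replaced, the rest kept
```

Because the annotated schema keeps the same fields and types, records serialize exactly as before; only the tooling that reads the extra attributes, such as the masking helper here, behaves differently.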

Reflect

Pick your most widely-consumed topic.

  • Who is the named owner, and is there an SLA — or is it orphaned infrastructure?
  • Which fields are PII, and is masking/retention enforced or merely hoped for?
  • What is the deprecation process for removing a field three other teams read?
