Webhooks, approvals, lineage
Treat the registry as the control plane
Three governance practices that pay off immediately:
- Webhooks on transitions — fire a Slack/Jira/PagerDuty event whenever a model moves to
Production. The act of shipping becomes auditable. - Two-person rule — registry permissions split between proposer and approver. Same pattern as protected branches.
- Lineage — keep
run_id,git_commit, anddata_urias tags on the version. Six months later you can answer: which data trained this model?
A real outage pattern
A model in
Productionstarted returning constant predictions. The registry alias pointed at version 14. Lineage tags showed the run used a data snapshot from a partition that had been dropped two days earlier. Rollback to version 12 took 30 sec.
Without lineage that diagnosis takes a day, not 30 seconds.