Theory — deployment & governance
What ships
- Ontop as a service (ontop endpoint --ontology=… --mapping=… --properties=…). HTTP SPARQL endpoint, JDBC connection pool to the warehouse, auth in front via the bank's IAM.
- The ontology at a stable PURL (https://atlas.example.com/onto/reporting/1.0.0), signed, mirrored to an internal artifact store.
- The R2RML mapping versioned alongside.
- A regression query suite — every regulator question answered to date becomes a CI assertion that re-runs on every release.
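A minimal sketch of how such a regression suite could be structured, assuming each past regulator question is stored as a SPARQL string plus its signed-off answer. The names RegressionCase, run_suite and rows are hypothetical, and run_query stands in for whatever client actually calls the SPARQL endpoint. Result sets are compared as sets, so the sketch assumes DISTINCT-style answers where duplicates and row order do not matter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

# A row is a tuple of (variable, value) pairs; a result set is an
# order-insensitive frozenset of rows (assumption: set semantics suffice).
Row = Tuple[Tuple[str, str], ...]
ResultSet = FrozenSet[Row]


@dataclass(frozen=True)
class RegressionCase:
    name: str            # hypothetical identifier for a past regulator question
    sparql: str          # the query exactly as it was answered
    expected: ResultSet  # the answer that was signed off at the time


def rows(*bindings: Dict[str, str]) -> ResultSet:
    """Build an order-insensitive result set from plain variable->value dicts."""
    return frozenset(tuple(sorted(b.items())) for b in bindings)


def run_suite(cases: List[RegressionCase],
              run_query: Callable[[str], ResultSet]) -> Dict[str, bool]:
    """Run every case through run_query and report pass/fail per case name."""
    return {case.name: run_query(case.sparql) == case.expected for case in cases}
```

In CI, a release would be blocked if any value in the returned dictionary is False. Injecting run_query keeps the suite testable without a live endpoint.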
Three governance habits
G1: Explain by default. For every regulator query, the rewritten SQL is stored alongside the result. An auditor can always retrace: SPARQL → SQL → row → DWH table → source system. The chain of evidence is the product.
G2: Mapping reviews are mandatory. Ontology PRs can be reviewed by an ontologist alone. Mapping PRs require both the ontologist AND the data engineer who owns the underlying table. A wrong mapping silently reclassifies real customers.
G3: Profile gate in CI. ontop validate --profile QL runs on every PR. The day someone adds an axiom that pushes the ontology out of OWL 2 QL into full DL (for example, an equivalentClass over a union or a universal restriction), the build fails, before a regulator submits a query that explodes into exponential rewriting.
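The G2 review rule lends itself to an automated merge check. The sketch below is one possible shape, not an existing tool: the role names, the file suffixes used to recognise mapping files, and both function names are assumptions made for illustration. It also simplifies "the data engineer who owns the underlying table" to a single role and does not resolve table ownership.

```python
from typing import Iterable, Set

ONTOLOGIST = "ontologist"
DATA_ENGINEER = "data-engineer"  # simplification: stands for the table owner

# Assumed naming convention for mapping files; adjust to the real repo layout.
MAPPING_SUFFIXES = (".r2rml.ttl", ".obda")


def required_roles(changed_files: Iterable[str]) -> Set[str]:
    """Return the reviewer roles a PR needs, given the paths it touches."""
    roles = {ONTOLOGIST}
    if any(path.endswith(MAPPING_SUFFIXES) for path in changed_files):
        # Mapping changes need the data engineer in addition to the ontologist.
        roles.add(DATA_ENGINEER)
    return roles


def review_satisfied(changed_files: Iterable[str],
                     approver_roles: Iterable[str]) -> bool:
    """True when every required role appears among the approvers."""
    return required_roles(changed_files) <= set(approver_roles)
```

A PR that touches only the ontology passes with an ontologist approval. One that touches a mapping file stays blocked until both roles have approved.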
Failure modes worth pre-mortemming
- Mapping drift: a DBA renames a column and the R2RML mapping breaks silently, surfacing only at the next query. Mitigation: a schema-check step in CI that diffs DWH metadata against the mapping.
- Over-broad templates: a template "…/party/{id}" where id is not unique across tables ⇒ IRI collisions. Mitigation: include the table in the template (/party/{table_name}/{id}).
- Cost explosions: a regulator submits an unconstrained ?s ?p ?o query. Mitigation: SPARQL complexity limits at the gateway.
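The schema-check mitigation for mapping drift can be sketched as a plain set comparison. This assumes two inputs that some earlier CI step has already extracted: the (table, column) pairs the R2RML mapping refers to, and the columns the DWH currently exposes per table. The function name and the input shapes are hypothetical. Parsing the mapping and querying the warehouse catalogue are out of scope here.

```python
from typing import Dict, List, Set, Tuple

Column = Tuple[str, str]  # (table, column)


def find_mapping_drift(mapped: Set[Column],
                       dwh: Dict[str, Set[str]]) -> List[str]:
    """List every table or column the mapping references but the DWH lacks."""
    problems: List[str] = []
    for table, column in sorted(mapped):
        if table not in dwh:
            problems.append(f"missing table: {table}")
        elif column not in dwh[table]:
            problems.append(f"missing column: {table}.{column}")
    # A missing table is reported once, even if several columns reference it.
    return list(dict.fromkeys(problems))
```

An empty list means the mapping and the warehouse agree. Any entry should fail the build, so that a renamed column is caught at review time and not at query time.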