1 · The Brief — Atlas Bank regulatory reporting

Why an investment bank with a perfectly good warehouse needs an ontology — and why duplicating its data into a triplestore is the wrong answer.

Theory — the brief

The setup

Atlas Bank (fictional) is a mid-tier investment bank. Reporting under MiFIR / EMIR / Dodd-Frank requires:

  • Mapping counterparties to a common legal-entity ontology (LEI, GLEIF, jurisdiction-of-incorporation…).
  • Producing daily aggregates the regulator can verify against the bank's own definitions.
  • Answering ad-hoc regulator queries within hours, not after weeks of consulting work.

The two wrong answers

Wrong answer 1 — ETL into a triplestore. Materialise the warehouse as RDF, refresh nightly. Now you have two systems of record, two latency horizons, and an auditor who wants to know which one is the source.

Wrong answer 2 — Bespoke per-regulator SQL views. What Atlas has today. Every new regulation = another six-month project. The vocabulary is implicit in column names that differ between systems.

The right answer — virtual RDF (OBDA)

OBDA stands for Ontology-Based Data Access. The warehouse stays the system of record. The ontology lives next to it as a thin semantic layer. An OBDA engine (Ontop is a widely used open-source one) rewrites each SPARQL query into SQL using an R2RML mapping. The triples are virtual: they are never stored, and are derived from SQL rows only while a query is being answered.
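To make the rewriting step concrete, here is a minimal sketch in Python. It is not Ontop and not real R2RML: the `counterparty` table, its columns, the `ex:` predicates, and the `MAPPING` dictionary are all hypothetical stand-ins, and the "query" is a single triple pattern rather than full SPARQL. It only illustrates the idea that a mapping lets an engine answer an RDF-shaped question by generating SQL against the live table.

```python
import sqlite3

# Hypothetical warehouse table; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE counterparty (cpty_id INTEGER PRIMARY KEY, lei TEXT, country TEXT);
    INSERT INTO counterparty VALUES (1, 'LEI-EXAMPLE-0001', 'GB');
    INSERT INTO counterparty VALUES (2, 'LEI-EXAMPLE-0002', 'DE');
""")

# Toy stand-in for an R2RML mapping:
# predicate -> (table, subject key column, object value column).
MAPPING = {
    "ex:hasLEI": ("counterparty", "cpty_id", "lei"),
    "ex:jurisdiction": ("counterparty", "cpty_id", "country"),
}


def rewrite(predicate: str) -> str:
    """Rewrite the triple pattern `?s <predicate> ?o` into a SQL query."""
    table, key, value = MAPPING[predicate]
    return (
        f"SELECT 'ex:counterparty/' || {key} AS s, {value} AS o "
        f"FROM {table} WHERE {value} IS NOT NULL ORDER BY {key}"
    )


def answer(predicate: str) -> list:
    """Answer the pattern by running the rewritten SQL; no triples are stored."""
    return conn.execute(rewrite(predicate)).fetchall()


print(rewrite("ex:jurisdiction"))
print(answer("ex:jurisdiction"))
```

The subject IRIs are built inside the SQL itself, so the only data that ever exists is the warehouse rows and the result set. A real engine does the same thing for whole SPARQL graph patterns, joining and optimising the generated SQL.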

Properties of this design:

  • One system of record (the DWH).
  • Zero ETL between SQL and RDF.
  • Fresh data (every query reads live SQL).
  • Reasoning is rewriting: it happens at query planning time, and its cost depends on the query and the ontology, not on the volume of data.

That last property is exactly what OWL 2 QL was designed for.
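A small sketch of what "reasoning is rewriting" can look like, again in Python and again under loud assumptions: the class names, the `SUBCLASS_OF` hierarchy, the `kind` column, and the per-class SQL fragments are all invented for illustration, and real engines handle far more than subclass axioms. The point is that a query for a superclass is expanded into a union over its mapped subclasses before any data is read.

```python
import sqlite3

# Hypothetical ontology fragment: child class -> parent class.
SUBCLASS_OF = {"ex:Bank": "ex:LegalEntity", "ex:Fund": "ex:LegalEntity"}

# Hypothetical mapping: class -> SQL that returns its instances.
CLASS_SQL = {
    "ex:Bank": "SELECT lei FROM counterparty WHERE kind = 'BANK'",
    "ex:Fund": "SELECT lei FROM counterparty WHERE kind = 'FUND'",
}


def subclasses(cls: str) -> list:
    """Return cls plus every class declared, directly or transitively, beneath it."""
    found = [cls]
    for child, parent in SUBCLASS_OF.items():
        if parent == cls:
            found.extend(subclasses(child))
    return found


def rewrite_class_query(cls: str) -> str:
    """Rewrite `?x a <cls>` into SQL: a UNION over all mapped subclasses."""
    parts = [CLASS_SQL[c] for c in subclasses(cls) if c in CLASS_SQL]
    if not parts:
        raise ValueError(f"no mapping covers {cls}")
    return " UNION ".join(parts)


# Hypothetical warehouse data, used only to show the rewritten SQL runs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE counterparty (cpty_id INTEGER PRIMARY KEY, lei TEXT, kind TEXT);
    INSERT INTO counterparty VALUES (1, 'LEI-EXAMPLE-0001', 'BANK');
    INSERT INTO counterparty VALUES (2, 'LEI-EXAMPLE-0002', 'FUND');
    INSERT INTO counterparty VALUES (3, 'LEI-EXAMPLE-0003', 'CORP');
""")

sql = rewrite_class_query("ex:LegalEntity")
print(sql)
print(sorted(conn.execute(sql).fetchall()))
```

Note that `rewrite_class_query` never touches the database: the inference that every bank and fund is a legal entity is applied to the query text alone, which is why the reasoning cost does not grow with the number of rows.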

Architecture map

How the pieces fit

Notice: no triplestore. Ontop sits between SPARQL and SQL. The warehouse never knows RDF exists.
