R2RML & Ontop — Virtual Graphs over the Warehouse

The W3C mapping standard that lets an engine expose relational tables as a SPARQL endpoint without ETL.

Overview

The 'no-ETL' bridge

R2RML (RDB to RDF Mapping Language) is a W3C Recommendation from 2012 that declaratively describes how to render rows of a relational table as RDF triples. Ontop is one of the most mature open-source R2RML engines; Stardog (commercial) and Morph-RDB (open source) also support R2RML mappings.

The trick: the engine does not materialise the triples. Instead, when a SPARQL query arrives, the engine rewrites it into a SQL query against the underlying tables, executes that SQL on the warehouse, and shapes the result rows back into RDF on the way out.

Why this is a bigger deal than it looks

  • The warehouse stays the system of record.
  • The ontology stays the conceptual layer consumers see.
  • No ETL pipeline copies tables into a triplestore — freshness equals warehouse freshness.
  • Existing analytics workloads keep running unchanged; SPARQL traffic simply becomes more SQL on the same warehouse, paid for the usual way (Snowflake credits, BigQuery scans).
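
The freshness point can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: an in-memory SQLite database stands in for the warehouse, the customers table has just id and country columns, and customer_triples is a hypothetical helper, not part of Ontop or any R2RML engine. It also reads the whole table, which a real engine would avoid by pushing the query down as SQL.

```python
import sqlite3

# Stand-in for the warehouse (assumption: a real one would be Snowflake, BigQuery, ...).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
warehouse.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "FR"), (2, "DE")])

EX = "https://example.org/"


def customer_triples(conn):
    """Hypothetical helper: render every customers row as triples at call time."""
    triples = set()
    for cid, country in conn.execute("SELECT id, country FROM customers"):
        subject = f"{EX}customer/{cid}"
        triples.add((subject, "rdf:type", f"{EX}Customer"))
        triples.add((subject, f"{EX}country", country))
    return triples


# ETL approach: copy once into a separate store, then answer from the copy.
materialised_copy = customer_triples(warehouse)

# The warehouse changes after the copy was taken.
warehouse.execute("INSERT INTO customers VALUES (3, 'FR')")

# Virtual approach: derive the answer from the live tables on every query.
virtual_view = customer_triples(warehouse)

new_subject = f"{EX}customer/3"
stale = new_subject not in {s for s, _, _ in materialised_copy}
fresh = new_subject in {s for s, _, _ in virtual_view}
print("copy is stale:", stale, "| virtual view is fresh:", fresh)
```

The copy only catches up when the ETL job runs again; the virtual view has no such gap because it never stores anything.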

The R2RML mapping shape

@prefix rr:  <http://www.w3.org/ns/r2rml#> .
@prefix ex:  <https://example.org/> .

<#CustomerMap> a rr:TriplesMap;
  rr:logicalTable [ rr:tableName "customers" ];
  rr:subjectMap [
    rr:template "https://example.org/customer/{id}";
    rr:class    ex:Customer
  ];
  rr:predicateObjectMap [
    rr:predicate ex:country;
    rr:objectMap [ rr:column "country" ]
  ].
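
To make the mapping's meaning concrete, here is a rough sketch of the triples it describes for individual rows. CUSTOMER_MAP is a hypothetical dictionary mirroring <#CustomerMap>, not a format any engine reads, and row_to_triples is an illustrative helper. It assumes the usual R2RML behaviour that template values are percent-encoded and that a NULL column produces no term; urllib's quote only approximates the specification's IRI-safe encoding for non-ASCII values.

```python
from string import Formatter
from urllib.parse import quote

EX = "https://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical in-memory mirror of <#CustomerMap>.
CUSTOMER_MAP = {
    "table": "customers",
    "subject_template": "https://example.org/customer/{id}",
    "subject_class": EX + "Customer",
    "predicate_object": [(EX + "country", "country")],  # (predicate, column)
}


def row_to_triples(row, mapping):
    """Return the (subject, predicate, object) triples one row gives rise to."""
    template = mapping["subject_template"]
    # Columns referenced by the template, e.g. ["id"].
    columns = [name for _, name, _, _ in Formatter().parse(template) if name]
    # A NULL in any template column means no subject, hence no triples.
    if any(row.get(col) is None for col in columns):
        return []
    # Percent-encode values so the generated IRI stays well-formed.
    safe = {col: quote(str(row[col]), safe="") for col in columns}
    subject = template.format(**safe)

    triples = [(subject, RDF_TYPE, mapping["subject_class"])]
    for predicate, column in mapping["predicate_object"]:
        value = row.get(column)
        if value is not None:  # a NULL column yields no triple
            triples.append((subject, predicate, str(value)))
    return triples


for row in [{"id": 1, "country": "FR"}, {"id": 2, "country": None}]:
    for triple in row_to_triples(row, CUSTOMER_MAP):
        print(triple)
```

Each row yields one rdf:type triple from rr:class and, when country is not NULL, one ex:country triple whose object is a literal.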

A SPARQL query like SELECT ?c WHERE { ?c a ex:Customer ; ex:country "FR" } is rewritten by Ontop into roughly SELECT id FROM customers WHERE country = 'FR'.
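
The rewrite, execute, and reshape steps for this one query can be sketched as a toy. This is not Ontop's algorithm, which also handles joins, optimisation, and SQL dialects. MAPPING is a hypothetical hand-written digest of <#CustomerMap>, rewrite and answer are illustrative names, and SQLite stands in for the warehouse.

```python
import sqlite3

EX = "https://example.org/"

# Hypothetical digest of <#CustomerMap>, written by hand for this sketch.
MAPPING = {
    "class": EX + "Customer",
    "table": "customers",
    "key_column": "id",
    "subject_prefix": EX + "customer/",
    "columns": {EX + "country": "country"},  # predicate IRI -> column name
}


def rewrite(class_iri, filters):
    """Turn the pattern '?c a <class_iri> ; <predicate> "value"' into SQL."""
    if class_iri != MAPPING["class"]:
        raise ValueError(f"no mapping for class {class_iri}")
    conditions, params = [], []
    for predicate, value in filters.items():
        column = MAPPING["columns"][predicate]
        conditions.append(f"{column} = ?")  # value is bound, not interpolated
        params.append(value)
    key, table = MAPPING["key_column"], MAPPING["table"]
    sql = f"SELECT {key} FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


def answer(conn, class_iri, filters):
    """Run the rewritten SQL and shape each result row back into a subject IRI."""
    sql, params = rewrite(class_iri, filters)
    rows = conn.execute(sql, params).fetchall()
    return sorted(MAPPING["subject_prefix"] + str(key) for (key,) in rows)


# Stand-in warehouse with three customers (sample data invented for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "FR"), (2, "DE"), (3, "FR")]
)

sql, params = rewrite(EX + "Customer", {EX + "country": "FR"})
print(sql, params)
print(answer(conn, EX + "Customer", {EX + "country": "FR"}))
```

No triples are stored at any point: the only state is the table itself, and the IRIs exist only in the result that is returned.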

Cross-link: see ontology-engineering Level 6 case studies for end-to-end Stardog/Anzo deployments.

Subtitles on the warehouse

R2RML works like the subtitles on a foreign film. The audio (SQL) never changes; a published mapping file translates every line into another language (RDF/SPARQL) on the fly. Two audiences sit in the same theatre and watch the same scene, but each understands it in its preferred vocabulary. No one re-films the movie.

Reflect

The strategic decision when introducing R2RML is who owns the ontology. If it's a small data team trying to do everything, the project sinks under modelling work. If it's a real Knowledge Engineering function (or you adopt an industry ontology like FIBO), R2RML becomes a force multiplier across every BI consumer.

  • Does your industry have a published ontology you could adopt instead of inventing one?
  • Where in your stack would an Ontop endpoint over the warehouse change which questions become askable?
