The mapping problem

Vendor SCADA produces JSON like:

{ "turbine_id":"T-042",
  "sensor_tag":"GBX-TEMP",
  "value": 87.4,
  "unit":"C",
  "ts":"2026-05-12T08:00:00Z" }

Our ontology is written in Turtle and models readings with SOSA. A mapping language closes the gap between the vendor JSON and that RDF model. The two dominant choices in 2026:

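RML (RDF Mapping Language): a generalization of R2RML developed in the W3C Knowledge Graph Construction Community Group; mature for SQL sources as well as document sources such as JSON, XML and CSV.
SPARQL-Generate / SPARQL-Anything: a good fit when the team already lives in SPARQL.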

We chose RML because the data team is JSON-native and CARML, a Java RML engine, integrates cleanly with our Kafka Streams app: it reads JSON in and emits RDF out to a Kafka topic that GraphDB consumes via its Kafka connector.
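To make the "JSON in, RDF out" step concrete, here is a minimal hand-written sketch in Python of the kind of output an RML mapping would produce for the sample reading above. It is not RML and not how CARML works internally; it only illustrates the target shape. The base IRI `https://example.org/windfarm/`, the IRI layout, and the function name `map_reading` are assumptions made for illustration. The SOSA terms are real, but which of them our ontology actually uses is also assumed here. The `unit` field is left out, since modelling units needs a vocabulary choice this lesson has not made.

```python
import json
from urllib.parse import quote

SOSA = "http://www.w3.org/ns/sosa/"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "https://example.org/windfarm/"  # hypothetical base IRI, not from the lesson


def iri(value):
    """Wrap an absolute IRI in N-Triples angle brackets."""
    return f"<{value}>"


def literal(value, datatype):
    """Build a typed N-Triples literal with an XSD datatype."""
    return f'"{value}"^^<{XSD}{datatype}>'


def seg(value):
    """Percent-encode a value before placing it in an IRI path segment."""
    return quote(str(value), safe="")


def map_reading(raw):
    """Map one SCADA JSON reading to a list of N-Triples lines (assumed IRI layout)."""
    r = json.loads(raw)
    turbine_id, tag, ts = seg(r["turbine_id"]), seg(r["sensor_tag"]), seg(r["ts"])
    obs = iri(f"{BASE}observation/{turbine_id}/{tag}/{ts}")
    sensor = iri(f"{BASE}sensor/{turbine_id}/{tag}")
    turbine = iri(f"{BASE}turbine/{turbine_id}")
    triples = [
        (obs, iri(RDF_TYPE), iri(SOSA + "Observation")),
        (obs, iri(SOSA + "madeBySensor"), sensor),
        (obs, iri(SOSA + "hasFeatureOfInterest"), turbine),
        (obs, iri(SOSA + "hasSimpleResult"), literal(r["value"], "double")),
        (obs, iri(SOSA + "resultTime"), literal(r["ts"], "dateTime")),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    sample = (
        '{"turbine_id":"T-042","sensor_tag":"GBX-TEMP",'
        '"value":87.4,"unit":"C","ts":"2026-05-12T08:00:00Z"}'
    )
    for line in map_reading(sample):
        print(line)
```

A declarative RML mapping expresses the same field-to-triple rules as data rather than code, which is the main reason to prefer it over a hand-rolled function like this one once the number of sensor types grows.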

What this lesson does NOT do

It does NOT re-teach Kafka. If brokers, topics, consumer groups and exactly-once semantics feel fuzzy, take the Apache Kafka & Streaming track first; that is its job. Here we focus on the RDF-shaped half of the pipeline.

4 · RML & Streaming — Kafka events become triples
