4 · When the JSON ontology is no longer enough — graduating to a KG

A thresholded decision for introducing a knowledge graph, what reasoning and identity the graph adds that JSON cannot, and a minimal phase-2 ontology starter.

Theory — the thresholds that justify a graph

The signal that JSON has hit its ceiling

JSON-first is right until it isn't. Graduate from a JSON ontology to a graph-backed one when at least two of these hold:

  1. Multi-hop questions are now the product. 'Show every open claim whose repair workshop also appears in a fraud ring connected to this policyholder' — customer → policy → claim → workshop → fraud pattern. JSON can store those links; it can't traverse them.
  2. Cross-system entity resolution is costing real hours. The same policyholder is cust_4821 in claims, P-99213 in billing, and a name string in the fraud tool. Reconciling them by hand, repeatedly, is the tax a shared identity (IRIs) removes.
  3. Regulators want relationship evidence, not flat logs. 'Prove this claim was never linked to a flagged workshop' is a path query, awkward over event logs.
  4. The rule set outgrows maintainable JSON branching and needs semantic reuse / inference (e.g. automatic hagelschaden ⊑ elementarschaden rollups everywhere).

If fewer than two of these bite, stay JSON-first: a graph you never query multi-hop is pure operational overhead.
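The "at least two of four" rule is simple enough to write down as a check you can rerun each quarter. The sketch below is one possible encoding, not a prescribed tool: the class name `GraduationSignals`, its field names, and the `should_graduate` helper are all hypothetical, and the default threshold of 2 simply mirrors the rule stated above.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class GraduationSignals:
    """One boolean per threshold from the list above (field names are illustrative)."""
    multi_hop_is_product: bool        # 1. multi-hop questions are now the product
    entity_resolution_costly: bool    # 2. cross-system reconciliation costs real hours
    regulators_need_paths: bool       # 3. regulators ask for relationship evidence
    rules_outgrow_json: bool          # 4. rule set needs semantic reuse / inference


def should_graduate(signals: GraduationSignals, threshold: int = 2) -> bool:
    """Return True when at least `threshold` signals hold."""
    active = sum(astuple(signals))  # True counts as 1, False as 0
    return active >= threshold


# Two signals hold: the rule says graduate.
two_hold = GraduationSignals(True, False, True, False)
# Only one signal holds: stay JSON-first.
one_holds = GraduationSignals(True, False, False, False)
```

Keeping the signals as explicit fields forces you to answer each one with evidence rather than a general feeling that "we probably need a graph".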

Theory — reasoning, identity, and SHACL

What the graph adds that JSON genuinely cannot

Be precise about what you're buying, so the upgrade is a decision, not fashion:

  • Traversal as a first-class operation. Variable-length path queries (customer → … → fraudRing) are native; in JSON they're N+1 lookups you hand-stitch.
  • Inference. Declare hagelschaden rdfs:subClassOf elementarschaden once and every query, report and constraint inherits it. In JSON you maintain the rollup in every place by hand.
  • Global identity. An IRI like …/policy/AB-483920 is the same node no matter which system wrote it, so resolution happens by design, not by nightly reconciliation.
  • Constraints that travel with the data (SHACL). The claim shape ships with the graph instead of living in one validating service.
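To make the first two bullets concrete, here is a minimal pure-Python sketch of what a graph engine does for you: a path search across links, and a subclass rollup declared once. It is an illustration under stated assumptions, not a graph database. The triples, the short node keys such as `customer:4821` (stand-ins for full IRIs), the predicate names, and the helper functions are all hypothetical. In a real triple store the same ideas would typically be expressed with SPARQL property paths and RDFS subclass reasoning.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples; every identifier is illustrative.
TRIPLES = [
    ("customer:4821", "holds", "policy:AB-483920"),
    ("policy:AB-483920", "hasClaim", "claim:C-1"),
    ("claim:C-1", "repairedAt", "workshop:W-7"),
    ("workshop:W-7", "memberOf", "fraudRing:R-2"),
    ("claim:C-1", "type", "hagelschaden"),
    ("hagelschaden", "subClassOf", "elementarschaden"),
]

SCHEMA_PREDICATES = {"type", "subClassOf"}


def find_path(start, goal, triples):
    """Breadth-first search along data edges; returns the node path or None."""
    edges = {}
    for s, p, o in triples:
        if p not in SCHEMA_PREDICATES:
            edges.setdefault(s, []).append(o)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def superclasses(cls, triples):
    """Transitive closure of subClassOf, including cls itself."""
    result, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == "subClassOf" and s == current and o not in result:
                result.add(o)
                frontier.append(o)
    return result


def instances_of(cls, triples):
    """Everything typed as cls or as any subclass of cls."""
    return {s for s, p, o in triples
            if p == "type" and cls in superclasses(o, triples)}
```

The point of the sketch is the shape of the work: `find_path` is the loop you would otherwise hand-stitch as repeated JSON lookups, and `instances_of` shows why declaring the subclass link once is enough for every consumer that asks about `elementarschaden`.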

Crucially, phase 2 keeps the JSON ontology — the lexical normalisation and STT hotwords still do the Swiss-German accuracy work. The graph sits downstream of extraction, on the already-canonicalised entities. You're adding a reasoning layer, not replacing the accuracy layer.
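The layering described above can be sketched in a few lines: the JSON ontology still maps a transcribed surface form to its canonical term, and only the canonical term is handed to the graph. Everything in this sketch is an assumption for illustration: the JSON layout, the example surface forms, the `claim:` key prefix (a stand-in for a full IRI), and the helper names `canonicalise` and `to_triple`.

```python
import json

# Hypothetical slice of a JSON ontology; the surface forms are made-up examples.
JSON_ONTOLOGY = json.loads("""
{
  "damage_types": {
    "hagelschaden": {"surface_forms": ["hagelschade", "hagel schaden"]}
  }
}
""")


def canonicalise(surface, ontology):
    """Accuracy layer: map a transcribed surface form to its canonical term."""
    needle = surface.strip().lower()
    for canonical, entry in ontology["damage_types"].items():
        if needle == canonical or needle in entry["surface_forms"]:
            return canonical
    return None  # unknown term: nothing is written to the graph


def to_triple(claim_id, surface, ontology):
    """Reasoning layer input: emit a triple only for canonicalised entities."""
    canonical = canonicalise(surface, ontology)
    if canonical is None:
        return None
    return (f"claim:{claim_id}", "type", canonical)
```

Because the graph only ever sees canonical terms, a change to the surface-form lists stays inside the JSON ontology and never forces a change to the graph schema.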

Reflect

Graduating early is as costly as graduating late — decide on evidence.

  • Write the single multi-hop question that, if asked weekly, would justify the graph. Are you being asked it yet?
  • How many hours per week does cross-system entity reconciliation cost you today? That number is the benefit side of your ROI calculation, to be weighed against what the graph costs to build and run.
  • If you moved to a graph tomorrow, which parts of the JSON ontology would you KEEP (hint: all the Swiss-German accuracy parts)?
