3 · Modelling Decisions — every fork, justified

OWL profile, reuse, defined-class strategy, n-ary patterns, open-world implications. The decision log a real ontology team keeps.

Theory — Decision log

Architecture Decision Record (ADR) Log

In industry, ontology development is not about finding the 'perfect' theoretical model; it's about making pragmatic engineering trade-offs. Each item below represents a fork in the road where we picked one path and formally documented why. This logging is critical for regulators and future engineers.

D1 — Profile: OWL 2 DL (Why not EL, RL, or QL?)

The Problem: We need equivalent-class definitions that use specific data values, such as an enumeration (owl:oneOf) that checks whether seriousness is exactly 'severe' or 'life-threatening'. The Decision: We selected OWL 2 DL. QL has no enumerations, EL allows them only with a single member, and RL restricts where they may appear in an axiom and is aimed at rule-based engines, whereas our enterprise architecture is already built around the HermiT reasoner, which implements OWL 2 DL. DL gives us the expressivity we need and remains decidable. Worst-case DL reasoning is not tractable, but in practice it is fast enough at our current dataset volume (< 1 million triples).

D2 — Reuse vs. Minting

The Rule: Never invent a concept if an accepted industry standard already exists.

  • Reuse (SKOS): We map our terms to MedDRA, the standard medical terminology for regulatory reporting, using skos:exactMatch. Because we do not own MedDRA, we reference its terms rather than redefining them.
  • Reuse (Alignment): We align drug compounds to the open ChEBI database via skos:closeMatch.
  • Minting: We minted the :CausalityAssessment class because existing pharmacovigilance (PV) ontologies lacked a structure that matched our mandatory internal SOPs. Minting is acceptable when the gap is documented and unavoidable.

D3 — The Defined Class Strategy for 'Reportable Case'

To flag a case as reportable, we faced two choices:

  • Option A (Imperative): Write a Python/SQL script that manually checks flags and explicitly asserts the :ReportableCase status into the database.
  • Option B (Declarative OWL): Define the reporting criteria logically in OWL (owl:equivalentClass) and let the reasoner automatically deduce it.

The Decision: We chose Option B. The regulatory rule (serious event + WHO-UMC causality ≥ possible) is effectively the legal definition. By embedding it directly in the ontology, we let regulators read the class definition in the OWL file to verify compliance. Furthermore, each inferred classification can be traced back to the axioms that justify it, which gives us the audit trail (satisfying CQ3).
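To make the Option B idea concrete, here is a minimal plain-Python sketch of what "define the criteria once and let a reasoner derive membership" amounts to. It is a stand-in for illustration only, not an OWL reasoner: the Case fields, the numeric causality ranks, and the function name are assumptions introduced here, while the rule itself (serious event and WHO-UMC causality of at least 'possible') comes from the decision above.

```python
from dataclasses import dataclass

# Rankable WHO-UMC causality categories, weakest to strongest.
# The numeric ranks are an assumption made for this sketch.
CAUSALITY_RANK = {"unlikely": 0, "possible": 1, "probable": 2, "certain": 3}

# Enumerated seriousness values, mirroring the owl:oneOf idea from D1.
SERIOUS_VALUES = {"severe", "life-threatening"}


@dataclass(frozen=True)
class Case:
    """Hypothetical intake record; field names are illustrative."""
    case_id: str
    seriousness: str
    causality: str


def classify_reportable(case: Case) -> tuple[bool, list[str]]:
    """Return (is_reportable, justification) for one case."""
    reasons = []
    serious = case.seriousness in SERIOUS_VALUES
    reasons.append(f"seriousness={case.seriousness!r} in enumeration: {serious}")
    rank = CAUSALITY_RANK.get(case.causality)  # None for unrankable categories
    causal = rank is not None and rank >= CAUSALITY_RANK["possible"]
    reasons.append(f"causality={case.causality!r} >= 'possible': {causal}")
    return serious and causal, reasons


reportable, why = classify_reportable(Case("C-001", "severe", "probable"))
print(reportable)  # True
for line in why:
    print(" ", line)
```

The returned justification list plays the role of the audit trail: the criteria live in one declarative place, and every classification carries the conditions that produced it.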

D4 — N-ary Relationships? Avoid Premature Complexity.

An n-ary relation connects more than two things at once, which RDF's binary triples can only express through an intermediate node. Some legacy PV ontologies model a drug exposure as a single four-way node connecting (Patient × Drug × Dose × Route of Administration). The Decision: We deferred modelling dose and route until v2.0. For v1.0, describing each :DrugExposure with just two properties, :involvesDrug and :exposureOf, is sufficient. Implementing n-ary patterns prematurely is a common cause of query bloat and stalled projects. Start simple.
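The v1.0 shape described above can be sketched with plain (subject, predicate, object) tuples. This is an illustration under stated assumptions: the instance identifiers (:exp1, :drugA, :patient42 and so on) are hypothetical, and the sketch assumes :exposureOf points from the exposure to the patient, which the text does not spell out.

```python
# v1.0 graph as plain (subject, predicate, object) tuples.
# Only :DrugExposure, :involvesDrug and :exposureOf come from the text;
# every instance identifier below is hypothetical.
TRIPLES = {
    (":exp1", "rdf:type", ":DrugExposure"),
    (":exp1", ":involvesDrug", ":drugA"),
    (":exp1", ":exposureOf", ":patient42"),
    (":exp2", "rdf:type", ":DrugExposure"),
    (":exp2", ":involvesDrug", ":drugB"),
    (":exp2", ":exposureOf", ":patient42"),
}


def objects(subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def drugs_for_patient(patient):
    """Drugs a patient was exposed to: one hop to exposures, one hop to drugs."""
    exposures = {s for s, p, o in TRIPLES if p == ":exposureOf" and o == patient}
    return sorted(d for e in exposures for d in objects(e, ":involvesDrug"))


print(drugs_for_patient(":patient42"))  # [':drugA', ':drugB']
```

The query needs only two hops. Because each exposure is already its own node, a later version could attach further properties to it without rewriting the existing triples.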

D5 — The Danger of the Open World Assumption (OWA)

OWL operates under the OWA: if your data is missing an :involvesDrug relationship, OWL does not conclude that the data is 'bad'. It treats the fact as unknown: 'There may be a drug, I just haven't been told about it yet.' The Decision: Because of the OWA, we cannot use OWL to reject an intake record for missing fields. Instead, we introduce SHACL (Shapes Constraint Language), which validates the data exactly as given, so a missing value counts as a violation. The two layers complement each other: SHACL acts as the bouncer at the door, and OWL acts as the detective inside.
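The difference between the two layers can be shown with a small plain-Python sketch. It does not use a SHACL engine or an OWL reasoner; the record layout, the REQUIRED tuple, and both function names are assumptions made for illustration.

```python
# Hypothetical list of properties an intake record must carry.
REQUIRED = (":involvesDrug", ":exposureOf")


def closed_world_violations(record: dict) -> list[str]:
    """SHACL-style check: anything not stated counts as missing."""
    return [prop for prop in REQUIRED if not record.get(prop)]


def open_world_has_drug(record: dict) -> str:
    """OWL-style answer: an absent statement is 'unknown', never 'false'."""
    return "true" if record.get(":involvesDrug") else "unknown"


# An intake record that names the patient but no drug.
record = {":exposureOf": [":patient42"]}

print(closed_world_violations(record))  # [':involvesDrug']
print(open_world_has_drug(record))      # unknown
```

The same incomplete record is rejected by the closed-world check and merely left undecided by the open-world one, which is why validation and inference sit in separate layers.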

Analogy

Think of OWL as the constitution (rare changes, big implications, drives inference) and SHACL as the customs form at the airport (every traveller fills it in or they're turned around). Confusing the two (using OWL to enforce required fields, or using SHACL to infer class membership) is a common beginner mistake in this stack.

Reflect

Decision logs feel like overhead until the day a regulator or a new team member asks 'why did you do it this way?'. Then they are the single most valuable artefact in the project.

  • What would D6 be if you had to add 'dose' and 'route' next quarter?
  • When would you switch from OWL 2 DL to OWL 2 EL or QL on this project?
