5 · SHACL Validation — the customs gate

Closed-world shape validation for inbound batches: required fields, controlled vocabularies, cardinality.

Theory — SHACL at the gate

Bridging the Gap: What SHACL Gives Us

As we discovered in the architecture log, OWL reasons under the Open-World Assumption (OWA): if a data field is missing, OWL does not conclude that the value is absent, only that it has not been asserted yet. This flexibility suits deep inference across sparse datasets, but it is a liability for strict data intake pipelines, where failing batches must be blocked.

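SHACL complements OWL by validating under closed-world-style semantics: the data graph is checked exactly as it stands, so a missing required value counts as a violation. When you define a sh:NodeShape that targets :AdverseEvent (for example via sh:targetClass), you are declaring: 'Any AdverseEvent entering this system must conform to these structural rules.' SHACL itself only produces a validation report; rejecting the entire transaction is the pipeline's job when that report shows non-conformance.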

The MedaCore SHACL Ruleset

The shapes file strictly enforces the following pipeline integrity checks:

  • Every :AdverseEvent must link to exactly one patient, no more and no less (sh:minCount 1 / sh:maxCount 1 / sh:class :Patient).
  • A timestamp (:onsetDateTime) is mandatory and must be typed as xsd:dateTime (sh:datatype xsd:dateTime).
  • The severity level (:seriousness) is restricted via sh:in to a controlled vocabulary of exactly four MedDRA strings.
  • Every :CausalityAssessment must link to both an event and an exposure.

With a pyshacl validation step in our CI/CD pipeline, malformed records from the partner systems are caught and rejected before they reach GraphDB storage. This puts a core data principle into practice: fail closed, fail early, fail safely.
