Theory — SHACL at the gate
Bridging the Gap: What SHACL Gives Us
As we discovered in the architecture log, OWL reasons strictly under the Open-World Assumption (OWA): if a specific data field is missing, OWL does not conclude that it is absent. It assumes the value may exist somewhere and simply hasn't been asserted yet. This flexibility is fantastic for deep inference across sparse datasets, but it is a disaster for strict data intake pipelines, where a missing field must block the failing batch.
SHACL complements OWL by validating with closed-world semantics: the data graph is checked exactly as it stands, and anything not asserted counts as missing. When you target a sh:NodeShape at our :AdverseEvent class, you are declaring: 'Any AdverseEvent entering this system must conform to these structural rules, or the entire transaction is rejected.'
The MedaCore SHACL Ruleset
The shapes file strictly enforces the following pipeline integrity checks:
- Every :AdverseEvent must link to exactly one patient. No more, no less. (sh:minCount 1 / sh:maxCount 1 / sh:class :Patient).
- A timestamp (:onsetDateTime) is mandatory and constrained to the xsd:dateTime datatype.
- The severity level (:seriousness) is locked into a controlled vocabulary of four exact MedDRA strings via sh:in.
- Every :CausalityAssessment must link to an event and an exposure.
With a pyshacl validation step inserted directly into our CI/CD pipeline, malformed records from the partner systems are caught and rejected before they reach GraphDB. This is the guiding principle: fail closed, fail early, fail safely.