Theory — the ontology, block by block
Anatomy of the lightweight ontology
Open the artefact below. It is small on purpose, and every block maps to one of the four ontology jobs from Lesson 1:
| Block | Ontology job | What it fixes in Swiss German |
|---|---|---|
| concepts.*.canonical_values | Name the concepts | Gives extraction a closed target set instead of free text |
| concepts.*.surface_forms | Synonyms / lexical layer | Collapses hagelschade / es het ghaglet → hagelschaden |
| concepts.IncidentType.broader | Taxonomy (SKOS-style) | Lets hagelschaden roll up to elementarschaden for reporting |
| spoken_number_forms | Datatype normalisation | Maps zwöitusig → 2000 |
| PolicyNumber.pattern | Value constraint | Rejects a misheard policy number outright |
| constraints.Claim | Shape / SHACL-lite | States what a valid claim must contain |
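
To make the table concrete, here is a minimal sketch of how these blocks could behave once loaded into Python. The dictionary is a hypothetical miniature of the artefact: the nesting, the eight-digit policy number pattern, and the helper names (`canonical_incident`, `roll_up`, `spoken_number`, `valid_policy_number`) are assumptions for illustration, and the real file may be laid out differently.

```python
import re

# Hypothetical miniature of the ontology; the real artefact is JSON and may differ.
ONTOLOGY = {
    "concepts": {
        "IncidentType": {
            "canonical_values": ["hagelschaden", "elementarschaden"],
            "surface_forms": {
                "hagelschade": "hagelschaden",
                "es het ghaglet": "hagelschaden",
            },
            "broader": {"hagelschaden": "elementarschaden"},
        },
        "PolicyNumber": {"pattern": r"\d{8}"},  # assumed format, not from the artefact
    },
    "spoken_number_forms": {"zwöitusig": 2000},
}


def canonical_incident(text):
    """Return the canonical IncidentType for a surface form, or None if unknown."""
    concept = ONTOLOGY["concepts"]["IncidentType"]
    key = text.strip().lower()
    if key in concept["canonical_values"]:
        return key
    return concept["surface_forms"].get(key)


def roll_up(value):
    """Follow broader links until the top of the hierarchy is reached."""
    broader = ONTOLOGY["concepts"]["IncidentType"]["broader"]
    seen = set()  # guards against an accidental cycle in the data
    while value in broader and value not in seen:
        seen.add(value)
        value = broader[value]
    return value


def spoken_number(text):
    """Map a spoken number form to its integer value, or None if unknown."""
    return ONTOLOGY["spoken_number_forms"].get(text.strip().lower())


def valid_policy_number(text):
    """Check a candidate policy number against the value constraint."""
    pattern = ONTOLOGY["concepts"]["PolicyNumber"]["pattern"]
    return re.fullmatch(pattern, text) is not None
```

Each helper corresponds to one row of the table: the synonym lookup collapses dialect variants, `roll_up` serves reporting, and the pattern check is where a misheard policy number would be rejected.
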
This is genuinely an ontology — concepts, labels, synonyms, a broader hierarchy, and constraints — just serialised as JSON instead of Turtle. The MVP pipeline (second artefact) consumes this: it feeds surface_forms keys as STT hotwords, uses surface_forms as the normaliser dictionary, and enforces constraints before anything reaches the CRM.
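
The three pipeline uses named above can be sketched roughly as follows. This is not the MVP pipeline itself, only an illustration under stated assumptions: the `required` list inside `constraints.Claim`, the eight-digit pattern, the field names `incident_type` and `policy_number`, and the functions `stt_hotwords`, `extract_claim`, and `violations` are all hypothetical.

```python
import re

# Hypothetical ontology fragment; key names beyond those in the text are assumptions.
ONTOLOGY = {
    "concepts": {
        "IncidentType": {
            "surface_forms": {
                "hagelschade": "hagelschaden",
                "es het ghaglet": "hagelschaden",
            },
        },
        "PolicyNumber": {"pattern": r"\d{8}"},  # assumed format
    },
    # Assumed shape: a plain list of required claim fields.
    "constraints": {"Claim": {"required": ["incident_type", "policy_number"]}},
}


def stt_hotwords(ontology):
    """Collect every surface form key so it can be passed to the STT engine."""
    words = []
    for concept in ontology["concepts"].values():
        words.extend(concept.get("surface_forms", {}))
    return sorted(words)


def extract_claim(transcript, ontology):
    """Normalise a transcript into claim fields via the surface form dictionary."""
    text = transcript.lower()
    claim = {}
    forms = ontology["concepts"]["IncidentType"]["surface_forms"]
    # Try longer phrases first so multi-word forms win over shorter overlaps.
    for form in sorted(forms, key=len, reverse=True):
        if form in text:
            claim["incident_type"] = forms[form]
            break
    match = re.search(r"\b\d+\b", text)  # first standalone digit run
    if match:
        claim["policy_number"] = match.group()
    return claim


def violations(claim, ontology):
    """List constraint violations; an empty list means the claim may proceed."""
    required = ontology["constraints"]["Claim"]["required"]
    problems = [f"missing {field}" for field in required if field not in claim]
    pattern = ontology["concepts"]["PolicyNumber"]["pattern"]
    number = claim.get("policy_number")
    if number is not None and not re.fullmatch(pattern, number):
        problems.append("policy_number does not match pattern")
    return problems


claim = extract_claim("Es het ghaglet, Police 12345678", ONTOLOGY)
if not violations(claim, ONTOLOGY):
    print("claim accepted:", claim)
```

Keeping validation as a separate step that returns a list of problems means a rejected claim can be sent back to the caller with a reason instead of silently reaching the CRM.
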