SPHN Metadata Catalog Schema
Overview
The SPHN Metadata Catalog Schema is a lightweight vocabulary for representing metadata in the SPHN Metadata Catalog. It covers metadata elements that fall outside the scope of the FAIR Data Point Ontology (FDP-O) while strictly adhering to the overall structure of the Data Catalog Vocabulary (DCAT).
Approach
- Gap analysis: Systematically compared the metadata requirements identified by SPHN data stewards and researchers against what FDP-O and HealthDCAT-AP already cover. Only elements with no suitable existing representation in those vocabularies were added to the schema.
- DCAT-aligned extension: New classes and properties were designed as extensions of DCAT and DCAT-AP concepts, preserving compatibility with existing tooling and ensuring that SPHN metadata remains interpretable by any standards-compliant catalog client.
- SHACL shapes: Defined SHACL shapes alongside the vocabulary to enable automated validation of dataset descriptions submitted by data providers, catching structural and semantic issues early in the metadata submission workflow.
- Tight integration with the catalog: Developed the schema in lockstep with the SPHN Metadata Catalog, so that each new metadata requirement identified during catalog development could be immediately reflected in the schema.
- Stakeholder iteration: Worked with data stewards and data managers across Swiss university hospitals to review and validate the schema, ensuring it accurately captures the metadata elements most relevant for dataset discoverability and reuse assessment.
- Documentation and publication: Published the schema with human-readable documentation at sphn.gitlab.io/sphn-metacat-schema, providing class and property definitions, usage examples, and a changelog for data providers to reference during metadata submission.
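The gap-analysis step above amounts to a set difference: take the metadata elements stakeholders asked for and subtract everything the existing vocabularies already represent. The following is a minimal sketch of that idea, not the procedure actually used in the project. The element names and the coverage mapping are hypothetical placeholders, and `find_gaps` is an illustrative helper.

```python
# Hypothetical requirement names; placeholders, not actual SPHN requirements.
required = {
    "title",
    "description",
    "publisher",
    "hypothetical_element_a",
    "hypothetical_element_b",
}

# Illustrative coverage only; this is NOT a real mapping of FDP-O or HealthDCAT-AP.
covered_by = {
    "FDP-O": {"title", "description", "publisher"},
    "HealthDCAT-AP": {"title", "description", "hypothetical_element_a"},
}


def find_gaps(required, covered_by):
    """Return the required elements that no existing vocabulary covers."""
    covered = set().union(*covered_by.values())
    return sorted(required - covered)


gaps = find_gaps(required, covered_by)
print(gaps)  # ['hypothetical_element_b']
```

Only the elements returned by such a comparison would be candidates for new terms in the schema. Everything else would be reused from the existing vocabularies.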
Tech Stack
- Semantic Web: RDF, RDFS, SHACL
- Vocabularies: DCAT, DCAT-AP, DCAT-AP CH 2.0, SPHN Metadata Catalog Schema
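To give a sense of what the SHACL layer in this stack does, the sketch below imitates cardinality constraints (in the spirit of `sh:minCount` and `sh:maxCount`) in plain Python over a list of triples. It is a simplified stand-in rather than SHACL itself, and a real setup would use a SHACL engine. The constraints in `DATASET_SHAPE`, the example dataset IRI, and the `validate` helper are assumptions made for illustration and are not taken from the published SPHN shapes.

```python
from collections import Counter

DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"

# Assumed constraints for illustration; not the actual SPHN shapes.
# "max": None means no upper bound.
DATASET_SHAPE = {
    DCT + "title": {"min": 1, "max": None},
    DCT + "description": {"min": 1, "max": None},
    DCT + "publisher": {"min": 1, "max": 1},
}


def validate(subject, triples, shape):
    """Check how often each constrained property occurs on `subject`."""
    counts = Counter(p for s, p, _ in triples if s == subject)
    violations = []
    for prop, bounds in shape.items():
        n = counts[prop]
        if n < bounds["min"]:
            violations.append(f"{prop}: expected at least {bounds['min']}, found {n}")
        if bounds["max"] is not None and n > bounds["max"]:
            violations.append(f"{prop}: expected at most {bounds['max']}, found {n}")
    return violations


# Hypothetical dataset description that lacks a description and a publisher.
ds = "https://example.org/dataset/1"
triples = [
    (ds, DCT + "title", "Example dataset"),
    (ds, DCAT + "keyword", "example"),
]

violations = validate(ds, triples, DATASET_SHAPE)
for message in violations:
    print(message)
```

Running a check of this kind at submission time lets a data provider see missing or duplicated properties before the description reaches the catalog.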