Bridging clinical and genomic knowledge in the Swiss Personalized Health Network
Overview
SWAT4HCLS (Semantic Web Applications and Tools for Health Care and Life Sciences) is an annual conference at the intersection of semantic web technologies and biomedical research. The 2024 edition was held in Leiden, the Netherlands.
I had the opportunity to present my work on bridging clinical and genomic knowledge in the context of the Swiss Personalized Health Network.
Background
The Swiss Personalized Health Network (SPHN) is a national research infrastructure initiative that facilitates the exchange of health-related data within Switzerland in a FAIR (Findable, Accessible, Interoperable, Reusable) manner. A key part of SPHN is its Semantic Interoperability Framework, which provides tools and resources for defining the semantics of clinical data representation and exchange. The SPHN Dataset (the concept definitions) and the SPHN RDF Schema (the formal representation of those definitions) are central to this framework: they provide a shared semantic layer that enables participating university hospitals to harmonize and exchange clinical data.
In 2023, the SPHN FAIR Data Team collaborated with The Hyve to extend the SPHN RDF Schema with concepts for representing genomic knowledge as it is available within Swiss university hospitals. Together with Eelke van der Horst, I presented this collaborative work at SWAT4HCLS 2024.
At the time of this work, the SPHN RDF Schema mostly covered concepts for clinical routine data. However, SPHN's National Data Streams (NDS) were producing a growing volume of omics data (genomic, transcriptomic, and proteomic) that the existing schema could not adequately represent. Bridging this gap was essential for enabling integrated analyses that combine clinical routine data with molecular data.
The Genomics Extension
The genomics extension was developed in close collaboration with clinicians, researchers, bioinformaticians, and data managers from Swiss university hospitals, academic research groups, and omics platforms.
The extension focuses on the general omics process flow:
- Sample Processing: representing biobanking, extraction, and preparation steps
- Library Preparation: capturing protocol details specific to sequencing workflows
- Sequencing Assay: describing the sequencing instrument, platform, and run parameters
- Sequencing Analysis: representing bioinformatics pipelines and data processing steps
- Quality Control: recording QC metrics at key stages of the workflow
The schema also captures important contextual metadata including information about standard operating procedures, sequencing instruments, and analysis parameters.
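To make the process flow above more concrete, here is a minimal Python sketch of how the steps and their contextual metadata could be chained together. It is not the SPHN RDF Schema itself: the class name `ProcessStep`, its field names, the step labels, and all identifier values are hypothetical and chosen only for illustration. Quality control is modeled here as metrics attached to each step, which is one possible reading of "QC at key stages".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProcessStep:
    """One step in the omics process flow (hypothetical structure, not the SPHN schema)."""
    kind: str                                   # e.g. "SampleProcessing"
    identifier: str                             # placeholder local identifier
    input_step: Optional["ProcessStep"] = None  # the step whose output this one consumes
    sop: Optional[str] = None                   # reference to a standard operating procedure
    instrument: Optional[str] = None            # sequencing instrument, where applicable
    parameters: dict = field(default_factory=dict)  # run or analysis parameters
    qc_metrics: dict = field(default_factory=dict)  # QC values recorded at this stage


def lineage(step):
    """Walk back from a step to the first one and return the step kinds in workflow order."""
    kinds = []
    while step is not None:
        kinds.append(step.kind)
        step = step.input_step
    return list(reversed(kinds))


# All values below are placeholders, not real SOPs, instruments, or metrics.
sample = ProcessStep("SampleProcessing", "sp-1", sop="SOP-PLACEHOLDER")
library = ProcessStep("LibraryPreparation", "lp-1", input_step=sample)
assay = ProcessStep(
    "SequencingAssay", "sa-1", input_step=library,
    instrument="instrument-placeholder",
    qc_metrics={"example_metric": 0.0},
)
analysis = ProcessStep(
    "SequencingAnalysis", "an-1", input_step=assay,
    parameters={"example_parameter": "value"},
)

print(lineage(analysis))
```

Linking each step to its predecessor keeps the provenance of an analysis result traceable back to the original sample, which is the property the process-flow view is meant to preserve.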
The extension reuses established biomedical vocabularies (including EDAM, OBI, GenEpiO, and FAIR Genomes) as value sets, which keeps it aligned with the broader biomedical data ecosystem and facilitates semantic interoperability.
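As a rough illustration of what reusing external vocabularies as value sets means in practice, the sketch below checks which vocabulary a coded value comes from by looking at its IRI namespace. This is a simplification: a real value set would restrict each attribute to specific terms, not to a whole namespace. The function name `source_vocabulary` and the example term identifiers are placeholders, and FAIR Genomes is left out because the sketch does not assume its namespace.

```python
# IRI prefixes of the reused vocabularies (OBI and GenEpiO use OBO PURLs).
ALLOWED_NAMESPACES = {
    "EDAM": "http://edamontology.org/",
    "OBI": "http://purl.obolibrary.org/obo/OBI_",
    "GenEpiO": "http://purl.obolibrary.org/obo/GENEPIO_",
}


def source_vocabulary(iri):
    """Return the name of the vocabulary an IRI belongs to, or None if it is not recognized."""
    for name, prefix in ALLOWED_NAMESPACES.items():
        if iri.startswith(prefix):
            return name
    return None


# Placeholder term identifiers, used only to exercise the check.
print(source_vocabulary("http://edamontology.org/format_0000"))
print(source_vocabulary("https://example.org/local/code"))
```

A check like this would flag locally invented codes early, before data from different hospitals is combined.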
The work is described in full in the CEUR-WS paper.