Abstract
Background:
SNOMED CT is a large terminology system designed to represent all aspects of healthcare. Its current form and content result from decades of bottom-up evolution. Due to SNOMED CT’s formal descriptions, it can be considered an ontology. The Basic Formal Ontology (BFO) is a foundational ontology that proposes a small set of disjoint, hierarchically ordered classes, supported by relations and axioms. In contrast, as a typical top-down endeavor, BFO was designed as a foundational framework for domain ontologies in the natural sciences and related disciplines. Whereas it is mostly assumed that domain ontologies should be created as extensions of foundational ontologies, a post-hoc harmonization of consolidated domain ontologies in use, such as SNOMED CT, is known to be challenging.
Methods:
We explored the feasibility of harmonizing SNOMED CT with BFO, with a focus on the SNOMED CT
Results:
Under a first scrutiny, the clinical intuition that diseases, disorders, signs and symptoms form a homogeneous ontological upper-level class appeared incompatible with BFO’s upper-level distinction into continuants and occurrents. The
Conclusion:
Our analyses resulted in the proposal of (i) equating SNOMED CT’s ‘role group’ property with the reflexive and transitive BFO relation ‘
Introduction
Standards of meaning
Standards are agreements that facilitate the exchange of products and the joint participation in practices and operations (International Standards Organisation, 2022). Whereas industry standards, i.e. standards for manufactured entities, are well established, the extension of the standardization idea to natural kinds and to basic categories of being has not met the same level of acceptance yet (Schulz et al., 2018).
Awareness is growing that the creation of industry standards is close to the practice of ontology and terminology engineering, and that both communities can learn from each other. Good practices developed by the Applied Ontology community (Guarino and Musen, 2015) could then inform discussions about ontology-based standards and support collaboration, with interoperability as the ultimate goal.
It often occurs that standards contradict each other. Then, users have the difficult task of using one standard and rejecting the others. But standards can also complement each other. Here, the task is more rewarding. It is then centered on the creation of links between the components of the respective standards, as well as the identification of mappings in their overlapping areas. However, it is common that any two standards to be compared and aligned bear implicit assumptions that challenge interoperation, particularly in cases where they represent different communities with different views regarding the purpose of standardization and their effects on downstream use cases. Here, additional effort is needed to view, understand and re-interpret one standard in the light of the other one. Such an analysis ideally fills interpretation gaps in either standard, and the harmonization task can be described as creating convergence between the standards.
This paper will scrutinize two ontology-based standards in natural science;
Interoperation between ontology standards is mostly driven by the need for interoperability of data that are annotated or coded by these standards. Lack of interoperability strongly affects the use of data in the biomedical field. More and more ontologies used for research use BFO as their foundational level, whereas SNOMED CT gains more ground as a healthcare ontology. The interest in harmonization between SNOMED CT and BFO is therefore motivated by the interest in closing the data gap between healthcare and biomedical research.
Although there are a number of foundational ontologies, which might be equally suited as a counterpart to SNOMED CT (see Applied Ontology issue “Foundational Ontologies in Action” (Borgo et al., 2022), presenting seven foundational ontologies), BFO was chosen for the following reasons:
Its high degree of consolidation, documented by the fact that it has recently become an ISO standard, committed to support the interchange of information among heterogeneous information systems (ISO/IEC, 2022);
Its focus on the representation of entities relevant to natural science, particularly life science including healthcare;
Its importance as an upper level of biomedical domain ontologies in the OBO Foundry (Smith et al., 2007);
The fact that both BFO and SNOMED CT adopted OWL (OWL, 2023) as one representational language (besides others).
This does not mean that a review of SNOMED CT against other foundational ontologies would be less fruitful; we even postulate (albeit without providing evidence in this paper) that most foundational ontologies that, like BFO, subscribe to a three-dimensionalist view of the world would lead to solutions very similar to what we propose here.
SNOMED CT
SNOMED CT is a large clinical terminology standard, proposing interoperable codes linked to clinical terms in several languages. The mission of SNOMED International, the non-profit standards development organization that owns and maintains SNOMED CT, is to support semantic interoperability between clinical care systems, as well as between clinical care systems and biomedical research environments, across institutions, jurisdictions and linguistic groups, by what it calls a “global language for health”. The roots of this global language, SNOMED CT, lie in a nomenclature, i.e. a multiaxial and hierarchical, albeit informal, compilation of English medical terms, driven by the College of American Pathologists (CAP). The early versions SNOMED, SNOMED II and SNOMED 3 were followed by SNOMED RT (Spackman et al., 1997), which underpinned the term collection with description logic axioms, with the aim of computing equivalence between the meaning of term compositions and pre-existing terms, thus addressing a desideratum formulated in 1994 (Campbell et al., 1994). After SNOMED RT was merged with CTV3 (“Clinical Terms Version 3”), a hierarchical and systematized terminology used in the U.K. National Health Service, it became SNOMED CT (“Clinical Terms”) in 2002. In 2007, the intellectual property rights to all versions of SNOMED were acquired by IHTSDO, now SNOMED International. Thus, SNOMED CT is the result of a long bottom-up terminology engineering process based on what clinicians recorded and wished to retrieve about patients. Each code represents a standardized meaning, called a SNOMED CT concept. SNOMED CT concepts are ordered in multiple hierarchies that extend a domain-specific upper level with foundational classes such as
Basic Formal Ontology (BFO)
In contrast, Basic Formal Ontology (BFO) (Arp et al., 2015; Basic Formal Ontology, 2022) is the result of a top-down process, carried out by a cross-disciplinary academic team with a strong anchoring in analytic philosophy. It resulted in a foundational ontology that proposes a small set of disjoint, hierarchically ordered types (universals), accompanied by formal binary and ternary relations, textual definitions and elucidations, as well as formal axioms. Foundational ontologies like BFO typically introduce upper-level distinctions by disjoint classes. The aim is to provide clear-cut dissections of reality, in terms of types, properties (relations) and constraints, for the benefit of domain ontologies that import them as their top layer. Domain ontologies that use the same upper layer are expected to interoperate better.
BFO uses first-order logic; an approximate rendering in OWL-DL description logics (Baader et al., 2008) is mostly finished. BFO has recently become an ISO/IEC standard (ISO/IEC, 2022). As a three-dimensionalist ontology, BFO is well known for its top-level bipartition into
Continuants are those things that exist in time and have no temporal parts. Material entities, spaces or qualities are typical continuants, e.g. an aspirin tablet, an operating theater, the cavity of a stomach, or a broken bone.
Occurrents are entities in time like processes and events with temporal parts, i.e. phases or temporal slices, such as July in a year, the opening of the chest in a heart transplant procedure, adolescence in a human’s life, or the event of a bone fracture or its healing process.
The difference is that you can never take away temporal parts from occurrents, e.g. July 2022 from the year 2022 or adolescence from one’s life (then it would no longer be the same), but in contrast you could take away parts of continuants, e.g. a tooth from one’s body without affecting its identity. Material continuants typically have a volume and/or a mass (which may vary every instant), as opposed to occurrents, which have a duration. Occurrents “happen”, whereas continuants “are there” and maintain identity across time. Typically, continuants participate in occurrents, e.g. a heart participates in a heart transplant, a person participates in an exam, but also a biological organism is a participant in this organism’s life. In BFO, an important descendant class of
BFO – SNOMED CT synopsis
Table 1 displays key features of SNOMED CT and BFO 2. Given that one is a domain ontology and the other is a top-level ontology, the table is not intended as a direct comparison. What can be sensibly compared between SNOMED CT and BFO is (i) SNOMED CT’s top hierarchy (the classes directly underneath “SNOMED CT concept”) with the BFO class hierarchy, (ii) the BFO relations with SNOMED CT’s linkage concepts (binary relations), and (iii) the SNOMED CT concept model (a set of domain and range constraints on SNOMED CT object properties (SNOMED International, 2023)) with BFO axioms.
In terms of scope, SNOMED CT would ideally fit underneath BFO, as there is no or minimal overlap (Fig. 1). There are, however, controversies regarding the possibility of aligning/harmonizing the two ontologies, especially due to BFO’s strict desiderata concerning domain ontologies linked to it (see OBO Foundry criteria (Smith et al., 2007)). Moreover, interoperation between the two artifacts does not stop at technicalities and alignment tasks, which are common in knowledge representation circles, where fitness for specific use cases is the criterion for achievement. BFO claims that it represents reality independent of any purpose because tailoring ontologies to address specific purposes would undermine their ability to serve interoperability (Smith, 2018).
SNOMED CT has never raised that universal claim. Throughout its history, it has been committed to clinical documentation tasks and therefore driven by the need to provide standardized meaning for human language expressions used in clinical care contexts and materialized in electronic health records (EHRs). Although this is nowhere explicitly stated in the SNOMED CT documentation, we assume in the further course of our deliberations that SNOMED CT, whenever used for the purpose of clinical documentation, categorizes things in reality. These range from physical entities and processes to qualities and information entities, and are placed under several distinct upper-level concepts.

Class hierarchies in SNOMED CT and BFO (complete). Indentations indicate subclasses.
A comprehensive agenda for BFO-SNOMED CT harmonization would require identifying appropriate BFO categories corresponding to each SNOMED CT hierarchy and, if needed, subdivisions thereof, by scrutinizing the current state of SNOMED CT in light of the precepts of formal-ontological analysis in general, and the foundational divisions proposed by the BFO ontology in particular.
Preliminary work showed that important parts of current SNOMED CT can easily be aligned with the upper-level classes of BioTopLite, an experimental domain upper level ontology, partially aligned with BFO (Schulz and Martínez-Costa, 2015), particularly organisms, devices, procedures and substances. For those other hierarchies where there are still open issues regarding BFO compatibility,
The goal of this paper is therefore to investigate the harmonization of the CF hierarchy with BFO. As a result, we expect a formal framework, formulated as a set of recommendations, that is consistent with the current content and useful for the modeling of new CF content. This first requires clarifying the ontological commitment of SNOMED CT by adding more precision to what is currently provided by the SNOMED CT Concept Model (SNOMED International, 2023).
Criteria of success would be (i) to reach a consensus among SNOMED CT users and maintainers on the resulting clarifications and recommendations, (ii) to facilitate the use of SNOMED CT together with other domain ontologies based on BFO, (iii) to facilitate its use with information models for clinical data interoperability, (iv) to better support biomedical data representation and management, and (v) finally to get closer to an answer to the question whether two ontologies with such different histories and criteria can be reconciled.
We highlight that this study is (i) preliminary, in the sense that it does not propose any experimental approach to assess the interoperability claims, and (ii) limited in scope, as it focuses on the CF hierarchy only. We also refrain from discussing possible ontological criteria for distinguishing (iii) between findings and disorders and (iv) between findings that are necessarily pathological and those that are pathological only in certain contexts, because these distinctions do not affect the ontological nature of CFs.
Resources and methods
Methodological considerations
Most of this section is devoted to an in-depth description and clarification of the resource SNOMED CT and its basic tenets, in the light of BFO. The notion of “Concept” in SNOMED CT and its boundaries regarding a BFO-compatible interpretation is analyzed and illustrated by means. SNOMED CT Clinical Findings (CF) are then discussed in light of the BFO
Modeling of the SNOMED CT class
Reduction of the impact on current SNOMED CT editorial principles to a minimum;
Testing the logical entailments of these patterns, using HermiT as a description logic reasoning engine;
Visualization of the modeling and reasoning results;
Discussion of the results, selection of a preferred pattern and analyzing its impact on the current state of the CF hierarchy;
Testing the preferred model for plausibility against a selection of active CF classes of the July 2022 release, i.e. those with the hierarchy tags “(disorder)” or “(finding)”. Out of a total of 115,998 classes, 62,875 had multiple stated and inferred parents. A random sample (
Formulation of recommendations for the future content development of the CF hierarchy.
Throughout the paper, class names are shown in
“Concepts” in a BFO-compatible interpretation of SNOMED CT
The term “concept” has repeatedly been the subject of heated discussions between ontologists and terminology builders. “Concept” was introduced as a cornerstone of terminology theory, with its ISO definition as “unit of knowledge created by a unique combination of characteristics” (ISO). SNOMED CT defines “concept” as “clinical idea”, which comes close to the ISO definition. In contrast, the creators of BFO have repeatedly rejected the notion of concept (Smith, 2004) as incompatible with the precepts of Scientific Realism they defend, in which an ontology’s representational units correspond to universals in the sense discussed in Western philosophy since Plato and Aristotle.
The attempt to harmonize BFO and SNOMED CT could already end at this point, but from a pragmatic point of view there is a common denominator, namely the set-theoretical semantics of OWL. Description logics specify only classes and properties and are therefore agnostic as to whether the members of a class belong to the extension of a concept or of a universal.
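The set-theoretical reading can be illustrated with a minimal sketch. The individuals and class names below are hypothetical and not taken from SNOMED CT or BFO. Subsumption is checked on class extensions within one interpretation, whatever a class is taken to stand for.

```python
# Hypothetical extensions in one interpretation: each class is a set of individuals.
FRACTURE_OF_BONE = frozenset({"f1", "f2", "f3"})
FRACTURE_OF_FEMUR = frozenset({"f1", "f2"})


def is_subclass_of(sub: frozenset, sup: frozenset) -> bool:
    """Set-theoretical reading of SubClassOf: every member of sub is in sup."""
    return sub <= sup


# The check does not depend on whether the classes are read as
# "concepts" or as "universals"; only the extensions matter.
FEMUR_IS_BONE_FRACTURE = is_subclass_of(FRACTURE_OF_FEMUR, FRACTURE_OF_BONE)
BONE_IS_FEMUR_FRACTURE = is_subclass_of(FRACTURE_OF_BONE, FRACTURE_OF_FEMUR)
print(FEMUR_IS_BONE_FRACTURE, BONE_IS_FEMUR_FRACTURE)
```

A reasoner establishes subsumption over all admissible interpretations; the sketch checks a single one.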
However, at the level of SNOMED CT itself, the word “concept” requires clarification because, unorthodoxly, any entity that has a SNOMED ID counts as a concept for SNOMED, which deliberately includes relations and metadata elements.
This is why the authors refrain from the use of the word “concept” in the remainder of this paper and introduce the following definitions, in accordance with SNOMED International (2023).
SNOMED CT concepts that correspond to OWL classes will be referred to by “SNOMED CT classes”.
Those SNOMED CT concepts that are descendants of “Concept model attribute” (and are, in fact, binary relations) correspond to OWL object and datatype properties. We will name these “SNOMED CT properties”.
Many SNOMED CT classes have formal definitions in OWL axioms. E.g. ‘
It should not be overlooked that there is a small number of SNOMED CT classes that are used for individual things, e.g.
Nevertheless, individuals – although they are rarely ever named in the biomedical domain – are fundamental, because OWL axioms are always quantifications over all individuals that belong to a class. Whereas
BFO’s view of an ontology as a system of universals leads to another limitation. The assumption that universals, by definition, exist in their instances (as they represent what individual things have in common) obviously precludes uninstantiated universals, but also universals defined by negation, such as representational units that correspond to the terms “non-smoker” or “absence of rib”. For BFO such terms would not have any relevance to an ontology. This has sparked controversies in the past, mostly with the argument that biomedical ontologies have to account for the entirety of scientific discourse. Here, terms that do not yet, or possibly do not, denote anything in reality, or that denote phenomena that might exist in the future or whose existence is disputed, cannot be avoided. Examples are whole-body transplants or Qi deficiency (Schulz et al., 2011). Again, the ontologically neutral ground of OWL shows a way out of this dilemma. That an ontology represents universals in the BFO sense and is implemented in OWL does not preclude enhancing it with additional classes that are defined on the basis of existing ones by using the logical constructors provided by the language, e.g. introducing a class
What might be less acceptable to BFO is the notion of mental constructs as defining principles for entities within a scientific reality. Medical practice and biomedical sciences have always been characterized by such constructs, rooted in natural language expressions and used in contexts in which they were related to some biological entity. Examples are disease entities like sepsis or rheumatoid arthritis, which are ill-defined or repeatedly re-defined. Their intensional aspects represent mental constructs such as scientific hypotheses or disease models, rather than things that can be pointed to such as a broken bone or a red eye. Whether such constructs are eventually confirmed by a material understanding of an underlying phenomenon or abandoned depends on the progress of science. Nevertheless, such terms with changing meanings, merely phenomenological descriptions and supposed phenomena of future obsolescence are and will be important elements of reasoning, communicating and decision-making in the clinical realm, as well as subject to scientific investigations. Traces of such “concepts” also persist in clinical records, and we have to acknowledge that, at the time they were used, they pointed to some instance in reality, most generically to be understood as some state, event or series thereof, characterizing presence during some part of a patient’s life.
Whereas a term like “Qi” (in the Traditional Medicine extension of SNOMED CT) in an ontology under BFO would conflict with BFO’s commitment to the exact sciences, this should be less of a problem with a class ‘
As a general and non-negotiable principle, SNOMED CT cannot ignore clinical terms with unclear, ill-defined or debatable meanings, because its mission is to represent the entirety of clinical discourse. Harmonization with BFO therefore means either identifying areas of SNOMED CT for which no attempt at harmonization should be made, or finding a mutual agreement on the referents (the entities in reality) that SNOMED CT classes denote.
SNOMED CT Clinical Findings (CF) in the light of BFO Continuant/Occurrent dichotomy
The CF hierarchy includes 119,833 classes (July 2023 release). Some of them have the hierarchy tag “finding”, the rest “disorder”. There is no clear-cut criterion that distinguishes findings from disorders, though some distinctions have been proposed, apart from the fact that due to taxonomic inheritance all disorders are taxonomic descendants of some finding. The distinction often depends on circumstances and individual judgment. SNOMED defines CFs as follows: “normal/abnormal observations, judgments, or assessments of patients”, whereas disorders are always and necessarily abnormal clinical conditions. This definition leaves several questions open, particularly regarding the ontological high-level classes to which CF content belongs. This matters not only for interoperation with BFO, but with any foundational ontology that has non-overlapping upper-level classes.
In the case of diseases/disorders, or more generally, clinically relevant body conditions, clinicians use, often interchangeably, terms like “disease”, “disorder”, “clinical course”, “clinical evolution”, “clinical picture”, or in other languages “sjukdom”, “Krankheit”, “maladie”, “enfermedad”, “disturbio”. In Chinese, they are the same: “疾病” (jí-bìng). Those terms and their hyponyms do not clearly denote types of entities that can be unequivocally put into the “continuant” or “occurrent” basket.
A tentative characterization of CF classes is that all of them are (mostly dynamic) bodily, mental, and social features that are subject to health-related investigation and scrutiny. To define them, the SNOMED Concept model provides a large part of following object properties: ‘
Most of these properties intuitively suggest that CF classes are occurrents. For instance, a strep throat can be seen as a process with an internal dynamic. A counterexample would be Trisomy 21, a material entity consisting of morphologically different chromosomes, which are, finally, bearers of dispositions that determine the known trisomy 21 phenotype. A supernumerary toe is, in contrast to the chromosomal condition, a macroscopic continuant, as are ulcers or hematomas, which undergo morphological change, so that the occurrents in which they participate are more in the foreground. Finally, there are conditions, e.g. a specific gait or speech pattern, in which the participating continuants (limbs, joints, bones, tongue, pharynx) do not exhibit, in a snapshot view, any particularity, but their time-dependent configurations and movements catch the eye of the observer. Here, the focus is on the occurrent entity only. It is noteworthy that for each and every continuant class
In the light of BFO, this flexibility regarding considering CF classes as continuants or occurrents not only clashes with the disjointness between the classes
This is also an issue in the following paragraphs where we will introduce two additional characteristic features of the CF hierarchy, viz. Role groups and Dispositions. The importance of both features will become apparent in our later modeling approaches, so that their introduction is justified at this place.
Role groups in the Clinical Finding (CF) hierarchy
Representing compound procedure and findings such as
However, to meet clinicians’ requirements, compound findings must appear classified and be retrieved under each of their component parts so that, for example, queries concerning the class
This modeling pattern formally corresponds to a defined class, but does not answer the question about the difference to its potential non-grouped model variant. Non-grouped variants do not occur in pre-coordinated definitions, but may result from post-coordination, i.e. by the creation of a compositional expression using SNOMED CT content and constructors. In OWL, a post-coordinated expression could result in the following:
A closer inspection of the children of
Dispositions in the Clinical Finding hierarchy
Dispositions, in BFO
In SNOMED CT, both dispositions and their manifestations are in the CF hierarchy, e.g.
Results
Basic BFO framework
We present several OWL models. Their basic elements required for the representation of central ontological aspects of CFs in BFO are depicted in Fig. 2. We use
But also:

The conceptual space of Clinical Findings (CFs) in the context of BFO. The example shows different entity types involved, typed by their BFO upper-level categories and OGMS (Ontology of General Medical Science) classes (Scheuermann et al., 2009). “Cancer genomic structure” relates to the germline genomic structure underlying a disposition for malignant growth.
The fact that clinical terminologies (in the sense of collection of human language terms) often do not distinguish between entities of the type #1, #2, and #3 on the one hand, and between entities of the type #4 and #5 on the other hand, does not mean that the referent of a term is a combination of them, e.g. that the referent of “cancer” is the intersection of a class under
Whenever there is a cancer occurrent, there is also a (material) cancer. Whenever there is a (material) cancer, a cancer occurrent exists (or existed – in case we also classify dead tissue as cancer) (#1 ↔ #2).
Whenever there is a material cancer, it exhibits a cancer quality and vice versa (#2 ↔ #3).
But also:
Whenever there is a genomic structure for cancer, there is a disposition for cancer and vice versa (#4 ↔ #5).
If clinicians speak or write “the patient has cancer”, whether they refer to the quality, the occurrent, or the material correlative thereof does not matter, because all of them exist whenever one of them exists. Only a more precise utterance like “the tumor has a size of 3 cm” disambiguates the term “cancer”, because only a material entity can have a size.
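The biconditionals listed above can be read as edges of a graph whose connected components are groups of entity types that necessarily co-exist. The following minimal sketch makes this explicit. The assignment of labels to #1 to #5 is an assumption inferred from the order of mention in the text, and the function name is hypothetical.

```python
from collections import defaultdict

# Assumed labels, inferred from the order of mention in the text.
ENTITY_TYPES = {
    1: "cancer occurrent",
    2: "material cancer",
    3: "cancer quality",
    4: "genomic structure for cancer",
    5: "disposition for cancer",
}

# Mutual dependences as stated: #1 <-> #2, #2 <-> #3, #4 <-> #5.
MUTUAL_DEPENDENCE = [(1, 2), (2, 3), (4, 5)]


def co_existence_groups(pairs):
    """Return the connected components of the mutual-dependence graph."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


GROUPS = co_existence_groups(MUTUAL_DEPENDENCE)
for group in GROUPS:
    print([ENTITY_TYPES[i] for i in group])
```

Under these assumptions the sketch yields two groups, one for the manifestation-related types and one for the disposition-related types, with no link between them.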
Note that there is no such mutual dependence between any of #1, #2, #3 and any #4, #5. Therefore, clinicians will always make a terminological difference between “risk of cancer” (#4, #5) and “cancer manifestation” (#1, #2, #3). This is depicted by the relation
Such mutually dependent entities are closely related in an ontological sense, but fall into different categories. This phenomenon has been termed “dot objects” and “logical polysemy” (Pustejovsky and Bouillon, 1995). Common examples of this are “University” (building vs. institution) or “Book” (printed copy vs. intellectual product), or in biology “Enzyme” (protein vs. function). Medical language abounds with this kind of polysemy, e.g., “inflammation” as denoting some morphologically altered tissue, which is the result of an inflammatory process, or “biopsy” for a procedure, but also for the outcome of this procedure, the biopsy sample (often called biopsy, too). Our built-in contextual awareness and extensive background knowledge explain why, across all realms of discourse, ontologically distinct but mutually dependent entities are often not distinguished by different words, even in technical language. Domain terms have different aspects of meaning, which need to be distinguished in some contexts but not in others. Each aspect, however, denotes a distinct entity in the domain, often belonging to disjoint upper-level types such as
The SNOMED CT concept model allows representation of these aspects in the CF hierarchy; by defining CFs in their relation to processes by the SNOMED CT relation ‘
Our modeling efforts resulted in the following three patterns. Pattern 1 defined the disjunctive class
Intensive discussions around the ontological interpretation of CF content in SNOMED CT took place more than ten years ago among SNOMED CT terminologists and members of the SNOMED International special interest group “Event-Condition-Episode”. They argued for a
All axioms on these disjunctive SNOMED CT classes should work for all interpretations, e.g.
This is plausible because the
Formula (7) introduces our running example, modelled along Pattern 1:
Relevant reasoning patterns (e.g. that a condition of a part is a condition of a whole) could be shown to work also without committing to the precise ontological nature of that condition (every osteosarcoma is located in a bone, regardless of whether we mean by “osteosarcoma” the malignant growth process, its material correlate or its morphological quality).
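The point that an axiom on a disjunctive class holds independently of the chosen reading can be sketched as follows. All individuals are hypothetical, and the generic `located_in` mapping is a placeholder of ours that deliberately ignores the question of which BFO relation could play this role.

```python
# Hypothetical individuals, one per ontological reading of "osteosarcoma".
READINGS = {
    "occurrent": {"growth_process_1"},
    "material entity": {"tumor_mass_1"},
    "quality": {"morphology_quality_1"},
}

# Placeholder location mapping (assumption, not a BFO or SNOMED CT relation).
located_in = {
    "growth_process_1": "Bone structure",
    "tumor_mass_1": "Bone structure",
    "morphology_quality_1": "Bone structure",
}

# Disjunctive class: the union of the extensions of all readings.
OSTEOSARCOMA = set().union(*READINGS.values())


def all_located_in(members, site):
    """True if every member is mapped to the given site."""
    return all(located_in.get(member) == site for member in members)


HOLDS_FOR_UNION = all_located_in(OSTEOSARCOMA, "Bone structure")
HOLDS_PER_READING = {
    name: all_located_in(members, "Bone structure")
    for name, members in READINGS.items()
}
print(HOLDS_FOR_UNION, HOLDS_PER_READING)
```

In this toy model a universal statement holds for the union exactly when it holds for each disjunct, which is why no commitment to a single reading is needed.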
The proposed approach leaves open how to interpret the SNOMED CT relation ‘
Whereas both cases would still preserve the ambiguity between
The corresponding OWL model is available as supplementary material (BFO_SCT_pattern_1a.owl).
As stated above, we could ignore qualities and represent morphologies as subclasses of ‘
See OWL model BFO_SCT_pattern_1b.owl.
These problems did not appear when the disjunctive approach was originally proposed, because it was modeled under BioTop (Schulz et al., 2017) as the ontological upper level, which has a broader notion of quality and provides a generic spatiotemporal inclusion relation. It therefore did not result in unintended models.
Back to BFO, the only compatible solution would be to introduce a new relation, e.g., ‘
There are several reasons for not favoring this modeling approach. Neither a new top-level object property ‘
Nevertheless, this model takes into account the undeniable fact that from a clinical-terminological point of view, many CF terms are polysemous under formal-ontological scrutiny, with their actual meanings (e.g. whether “tumor” means a lump of tissue or a growth process) only becoming clear from the context in which they are used.

Pattern 1: expressing SNOMED CT findings/disorders as clinical conditions yields an unintended model:
The problem of how to represent complex diseases and syndromes had already occupied the creators of the GALEN ontology (Rector et al., 1997) in the 1990s, who termed it

Pattern 2: introducing clinical life phases as the root of the SNOMED CT findings/disorders. The object property ‘
Figure 4 shows a modeling attempt under the class ‘
See OWL model BFO_SCT_pattern_2.owl.
Osteosarcoma would be modeled as follows:
The problem here is that because
The third pattern (cf. Figure 5) attempts to reconcile (i) parsimony, (ii) compatibility with BFO, (iii) expressibility in OWL-EL, and (iv) the redefinition of role groups as object properties that link occurrents to parts of occurrents. All of what is currently named ‘
In contrast to the former two models, the classes

Pattern 3: SNOMED CT findings/disorders as Clinical occurrents. Here, the reflexive and transitive object property ‘
In this modeling approach,
Opposed to ‘
Since nearly all definitional axioms of CFs use role groups, the implicit meaning of a CF class
The interpretation of CFs in the sense of “
It is to highlight that this interpretation of all findings and disorders includes
Another decision in this modeling strategy is to interpret ‘
The corresponding OWL model
See BFO_SCT_pattern_3.owl.

SNOMED CT CF classes as clinical occurrents and clinical occurrent states. Compliance with BFO classes and properties. It highlights the need to refer to dispositional findings via clinical occurrents (i.e. “states with a disposition”), which requires expressing the SNOMED CT object property ‘
Finally, bodily dispositions like allergy to pollen need to be fitted under ‘
Currently, SNOMED CT links allergic dispositions to allergic processes via the relation ‘
By using an OWL property path axiom, the disturbance caused by this difference can be minimized (with “
This is the price to be paid here, otherwise the CF hierarchy would never become “clean” under a BFO perspective. Due to the lack of equivalent property statements in OWL, the equivalence between (13) and (14) can only be approximated: the classifier identifies
See BFO_SCT_pattern_4.owl.
Finally, the ‘
The manual review of 100 CF classes randomly selected from the July 2022 release revealed that twenty of them (95% confidence interval: 13–29%) would lead to inconsistencies when interpreted literally.
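The reported interval can be reproduced approximately with a standard binomial interval. The text does not state which method was used; the sketch below assumes the Wilson score interval with z = 1.96, which for 20 of 100 rounds to the same bounds. The function name `wilson_interval` is introduced here for illustration only.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# 20 problematic classes in a random sample of 100 (figures from the text).
low, high = wilson_interval(20, 100)
print(f"{low:.1%} to {high:.1%}")  # bounds round to 13% and 29%
```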
SNOMED CT findings and disorders sample, only interpretable as clinical occurrents
Table 2 shows these 20 examples, classified by their characteristics. Only if analyzed under the assumption that each class has the meaning of
Our scrutiny of clinical finding and disorder classes in SNOMED CT, i.e. the descendants of ‘
Interpreting ‘
With ‘
The proposal would not be affected by a future re-interpretation of morphological abnormalities as qualities. In such a case the material correlate of a disorder such as the tumor mass would then be expressed as the bearer of tumor mass morphological quality, and ‘
Interpreting ‘
Expressing the SNOMED CT object property ‘
Thus, all instances of ‘
That this modification requires no structural redesign of the SNOMED CT CF hierarchy and its related axioms is all the more relevant in light of our analysis of a sample of classes in this hierarchy. According to this analysis, roughly between 8,200 and 18,200 classes would cause problems if their parents were interpreted literally. Because these classes are placed in a tightly woven multi-hierarchical network, much more SNOMED CT content would be affected.
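The class counts appear to follow from applying the sample-based 13–29% interval to the size of the CF hierarchy. That size is not stated in this section, so the sketch below does not assume it; it only back-calculates the hierarchy size implied by each reported bound as a consistency check. The helper name `implied_hierarchy_size` is hypothetical.

```python
def implied_hierarchy_size(class_count: int, rate: float) -> float:
    """Hierarchy size that would yield class_count at the given rate."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    return class_count / rate


# Reported bounds from the text: 8,200 classes at 13%, 18,200 classes at 29%.
size_from_lower = implied_hierarchy_size(8_200, 0.13)
size_from_upper = implied_hierarchy_size(18_200, 0.29)

print(f"implied size from lower bound: {size_from_lower:,.0f}")
print(f"implied size from upper bound: {size_from_upper:,.0f}")
```

Both bounds imply a hierarchy of roughly 63,000 classes, so the two reported figures are mutually consistent with a single underlying hierarchy size, up to rounding.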
According to the clinical committees advising SNOMED CT, an accompanying disorder is placed as a taxonomic parent of the disorder it accompanies, e.g., ‘
An equally typical and frequent pattern is the expression of both etiology and manifestation as taxonomic parents, e.g., in ‘
One could argue that this is the result of a long-standing neglect of ontological principles among SNOMED CT developers, who disregarded proposals already aligned with BFO such as OGMS with its tripartition between (i) clinical dispositions, (ii) clinical material entities and (iii) clinical processes (despite the often-questioned labeling with “disease”, “disorder”, and “disease course”). However, from a pragmatic, clinical documentation point of view, a dissection of, e.g.
What is, in contrast, not irrelevant in clinical documentation is the difference between dispositions and manifestations. Recently, SNOMED CT has strengthened this distinction, particularly with regard to allergy, where ‘
It would be interesting to compare the SNOMED CT CF pattern with patterns found in other ontologies that represent clinical occurrents. The Human Phenotype Ontology (HPO) represents phenotypes as qualities, e.g. fractured is the quality of a fractured object like a bone:
Although a combined forearm fracture is not contained in the ontology, it is assumed that this would be expressed as a conjunction of two clauses beginning with ‘
For HPO qualities that inhere in occurrents, HPO – BFO compatibility has to be established before discussing the harmonization of HPO with SNOMED CT.
Another interesting source is the openGALEN ontology (Rector et al., 2003), which grew out of the pioneering GALEN project (Rogers et al., 2001) in the 1990s.
The widespread use of this large ontology may have been hindered by its idiosyncratic naming conventions and object properties. It contains very few combined disorders, but the definition of this one also reveals a modular structure that is very close to SNOMED CT clinical findings (the openGALEN relation
Regarding the BFO compatibility of other SNOMED CT hierarchies, in many cases the effort seems to be rather straightforward. As investigated by Schulz and Martínez-Costa (2015), albeit with regard to the BTL2 ontology, the ones that require the most scrutiny are probably
Conclusion and outlook
The early versions of SNOMED set out to provide structure based on representations of organ systems and pathophysiology that were intended to be understandable, reproducible, and useful (Spackman and Reynoso, 2004). As SNOMED evolved, the intent was to add structure incrementally in order to provide incremental value. This bottom-up development contrasts with the top-down approach of BFO and other foundational ontologies, where a more comprehensive model is defined in advance.
In this paper, we reviewed the
We concluded that
Analyzing this phenomenon in more detail led us to the conclusion that
A result that reaches beyond the SNOMED CT use case is that this work provided evidence that harmonizing a terminology system that grew bottom-up over decades with a principled foundational ontology developed top-down does not necessarily require major redesign, but rather a thorough ontological analysis of the implicit assumptions of its curators and users. An important factor in the harmonization process is that logical polysemy (Pustejovsky and Bouillon, 1995), which pervades domain terminologies, poses problems to ontologists but usually not to terminology users such as clinicians. The work also suggests that common ground is possible between those who insist on the application of ontological rigor and formal methods and those who do the practical work of producing artifacts that fulfill concrete requirements of clinical users and use cases. The experience that, with some effort, a top-down and a bottom-up approach can be harmonized, resulting in a shared understanding and representation, is a powerful validation of both approaches.
These are, however, still hypotheses, which require more validation effort. In this sense, we recommend the following investigations:
Demonstrate the practical use of the harmonization with one or more of the BFO-compliant ontologies like HPO and OGMS, as well as with several disease ontologies that use BFO;
Perform a similar scrutiny of other SNOMED CT hierarchies known as heterogeneous, particularly
Scrutinize all SNOMED CT object properties, together with the constraints from the SNOMED CT concept model, against the BFO object properties and constraining axioms;
Provide evidence for the benefit of BFO-SNOMED harmonization and integration for the communities of Applied Ontology and Biomedical Informatics as well as for healthcare and biomedical research in the context of health data analytics and interoperability.
Future SNOMED CT content development decisions should be informed by the output of these investigations, regarding naming of SNOMED CT components, particularly hierarchy labels, but also regarding updates in the documentation for users and content developers.
Acknowledgements
We would like to express our appreciation to the reviewers Jim Campbell, Keith Campbell and Alan Rector. Their exceptional dedication and invaluable insights have greatly enriched the quality of this work. Their meticulous attention to detail, constructive criticism, and unwavering support have played an instrumental role in shaping this article into its final form. We extend our thanks to Barry Smith and Werner Ceusters, who provided valuable feedback on an earlier version of the manuscript. We are immensely grateful for the expertise of all of them and for the amount of time they invested. Their commitment to advancing knowledge in our field is truly commendable, and we sincerely thank them for their remarkable contributions.
