Generating vector representations (embeddings) of OWL ontologies is a growing task due to its applications in predicting missing facts and knowledge-enhanced learning in fields such as bioinformatics. The underlying semantics of OWL ontologies are expressed using Description Logics (DLs). Initial approaches to generate embeddings relied on constructing a graph out of ontologies, neglecting the semantics of the logic therein. Recent semantics-preserving embedding methods often target lightweight DL languages such as EL, ignoring more expressive information in ontologies. Although some approaches aim to embed more descriptive DLs such as ALC, those methods require the existence of individuals, while many real-world ontologies are devoid of them. We propose an ontology embedding method for the DL ALC that considers the lattice structure of concept descriptions. We use connections between DL and Category Theory to materialize the lattice structure and embed it using an order-preserving embedding method. We show that our method outperforms state-of-the-art methods in several knowledge base completion tasks. This is an extended version of our previous work, in which we incorporate saturation procedures that increase the information within the constructed lattices. We make our code and data available at https://github.com/bio-ontology-research-group/catE.
Ontologies are usually developed and maintained by manual curation by experts, and therefore the knowledge therein can be inconsistent or incomplete. Traditionally, symbolic reasoners are used to test the consistency of the knowledge within ontologies and to infer new statements. However, they are designed to infer statements that are logically entailed by the ontology or knowledge base; in some cases, it is useful to also suggest axioms that are probably true but not entailed, leading to the task of “ontology completion” or “knowledge base completion.”
From the viewpoint of knowledge graph completion (Chen et al., 2020), we can initially define knowledge base completion as the task of predicting “missing” or “novel” axioms in a knowledge base (or ontology). “Novel” may be understood temporally as axioms that are added at a later time to a knowledge base, or, more commonly, with respect to existing axioms in the knowledge base. However, unlike knowledge graphs, a knowledge base (ontology) has an infinitely large deductive closure of deductively entailed statements. Those statements can be considered “novel” because they do not exist in the knowledge base but can effectively be generated by a deductive reasoner. Therefore, knowledge base completion can have a two-fold presentation: (a) knowledge base completion as approximate entailment, where the completion system first generates the deductively entailed statements and then, with potentially lower confidence, generates the non-entailed but probable statements, and (b) the completion system generates only non-entailed statements and, optionally, has access to the deductive closure.
Transversally, knowledge base completion methods can be evaluated based on the type of axioms to complete. We distinguish between two sub-tasks: “TBox completion,” when the axioms to generate are terminological and are of the form C ⊑ D, and “ABox completion,” when the axioms to generate are assertional and are of the form C(a) or r(a, b). TBox completion systems have been proposed as supporting tools to assist or automate ontology curation procedures (Banerjee et al., 2023; Chen et al., 2023) or to match concepts between ontologies (Chen et al., 2023). ABox completion systems are evaluated alongside neuro-symbolic reasoners in challenges such as SemREC (Banerjee et al., 2023). Furthermore, ABox completion can be regarded as knowledge graph completion enhanced with ontological knowledge (Hao et al., 2019).
Several neuro-symbolic approaches have been developed to perform the knowledge base completion tasks (Chen et al., 2023, 2021; Jackermeier et al., 2023; Kulmanov et al., 2019), and most are based on generating embeddings that preserve some logical properties of a knowledge base. Methods that perform knowledge base completion follow different strategies. One type of method corresponds to transforming ontology axioms into graphs. Under this approach, axioms in a Description Logic (DL) knowledge base are transformed into a graph and then knowledge graph completion methods are applied (Chen et al., 2021). Although this strategy has proved to be useful, this set of methods does not capture all information in axioms and the embedding process is usually not invertible (Zhapa-Camacho & Hoehndorf, 2023); therefore, these methods do not allow exact inference of axioms and are often used for similarity-based tasks.
Another type of method for embedding DL knowledge bases constructs an approximate model of the knowledge base. ELEmbeddings (Kulmanov et al., 2019) represents concepts as n-dimensional balls and roles as geometric translations of concepts. By modifying the geometric structure from balls to boxes, methods such as BoxEL (Xiong et al., 2022) guarantee intersectional closure of concepts (i.e., the intersection of two boxes is a box). However, representing roles as translations can only encode one-to-one relations. Therefore, Box²EL (Jackermeier et al., 2023) represents roles as two boxes, representing the domain and the codomain of the role, respectively. This representation enables encoding many-to-many relations. However, all these methods target the EL language, a lightweight language that does not support the construction of axioms involving full negation or universal restrictions; therefore, they cannot leverage more expressive statements in DL knowledge bases. In this regard, methods such as FALCON (Tang et al., 2023), which is similar to Logic Tensor Networks (Badreddine et al., 2022), can construct an approximate model for ALC knowledge bases. FALCON represents concepts as fuzzy sets and treats logical connectives as fuzzy operators (van Krieken et al., 2022). However, FALCON requires the existence of individuals to populate the fuzzy sets, which is a limiting factor for knowledge bases without individuals such as the Gene Ontology (GO). Another approach for modeling the ALC language is found in Özcep et al. (2023), with a theoretical analysis on the use of axis-aligned cones to represent ontology concepts.
Proposed Approach: Lattice-Preserving Embeddings
To overcome the limitations of current ontology embedding approaches, we propose CatE, a lattice-preserving embedding method for the ALC language. Our approach relies on the fact that the concept descriptions in a DL knowledge base can be arranged in a lattice structure. The lattice construction of DL concepts can be formulated in the context of Formal Concept Analysis (Baader & Sertkaya, 2004), using connections between DL and Modal Logic (Baader & Lutz, 2007; Schild, 1991; Venema, 2007), or using connections between DL and Category Theory (Brieulle et al., 2022; Duc, 2021). We use the category-theoretical formulation and construct a lattice out of all concept descriptions that are sub-concepts of any concept description in the knowledge base. After materializing the lattice, we represent its elements as vectors in an ordered-vector space. To enforce the ordered structure of the vector space, we use an order-embedding method. We apply CatE and show that it can outperform state-of-the-art methods in the different forms of the knowledge base completion task. We extend our previous work (Zhapa-Camacho & Hoehndorf, 2024) by incorporating saturation procedures that can introduce new information to the lattice in terms of new elements and morphisms. We apply partial saturation to the lattice and show that these procedures can improve knowledge base completion performance on some metrics such as mean reciprocal rank (MRR). Our contributions are the following:
We propose an embedding method for knowledge bases that preserves the lattice structure of the semantics of concept descriptions.
We show that our method can perform competitively in generating statements in the deductive closure and generating probable statements.
We show that our method can perform competitively in both TBox and ABox completion tasks.
We show that partial saturation procedures can enhance the embedding representation of ontology concept descriptions.
Preliminaries
Description Logics (DLs)
A DL signature consists of a set of concept names, a set of role names, and a set of individual names. In the DL ALC, all concept names are concept descriptions; if C and D are concept descriptions, r a role name, and a, b individual names, then ¬C, C ⊓ D, C ⊔ D, ∃r.C, and ∀r.C are concept descriptions; C ⊑ D, C(a), and r(a, b) are axioms. A set of axioms is an ALC theory (Baader et al., 2003).
An interpretation I of an ALC theory consists of an interpretation domain Δ^I and an interpretation function ·^I such that for every concept name A, A^I ⊆ Δ^I; for every individual name a, a^I ∈ Δ^I; for every role name r, r^I ⊆ Δ^I × Δ^I; and, inductively: (¬C)^I = Δ^I \ C^I, (C ⊓ D)^I = C^I ∩ D^I, (C ⊔ D)^I = C^I ∪ D^I, (∃r.C)^I = {x ∈ Δ^I | ∃y. (x, y) ∈ r^I and y ∈ C^I}, and (∀r.C)^I = {x ∈ Δ^I | ∀y. (x, y) ∈ r^I implies y ∈ C^I}.
An interpretation I is a model for an axiom C ⊑ D iff C^I ⊆ D^I, for an axiom C(a) iff a^I ∈ C^I, and for an axiom r(a, b) iff (a^I, b^I) ∈ r^I (Baader et al., 2003). Given an ALC theory T, an axiom is entailed by T if it is true in all models of T.
Construction of the Lattice Structure
A preorder consists of a set equipped with a reflexive and transitive binary relation. A partial order is a preorder that is also antisymmetric. A lattice is a partially ordered set in which each two-element subset has a least upper bound and a greatest lower bound. If a lattice has a greatest element, it is denoted ⊤, and if it has a least element, it is denoted ⊥ (Davey & Priestley, 2002).
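The three order properties above can be checked mechanically on a finite relation. The following is a minimal illustrative sketch (not part of the CatE implementation), representing a binary relation as a set of pairs:

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check reflexivity, transitivity, and antisymmetry of a relation.

    `leq` is a set of (x, y) pairs meaning x <= y.
    """
    for x in elements:
        if (x, x) not in leq:
            return False  # not reflexive
    for (x, y), (y2, z) in product(leq, leq):
        if y == y2 and (x, z) not in leq:
            return False  # not transitive
    for (x, y) in leq:
        if x != y and (y, x) in leq:
            return False  # not antisymmetric
    return True

# A small divisibility order on {1, 2, 4}: a <= b iff a divides b.
elems = {1, 2, 4}
rel = {(a, b) for a in elems for b in elems if b % a == 0}
assert is_partial_order(elems, rel)
```

Dropping antisymmetry from the check yields a test for preorders, which is the structure the syntactic lattice representation initially satisfies.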
In an ALC theory, the set of concept names can be used to create arbitrarily complex and infinitely many concept descriptions. We consider only the concept descriptions occurring in the knowledge base together with their sub-expressions, and we extend this set with the top and bottom concepts ⊤ and ⊥.
This set of concept descriptions, together with the subsumption relation ⊑, can form a lattice where concept descriptions C and D stand in a relationship C ⊑ D if C^I ⊆ D^I. Within models of ALC theories, the ⊑ relation is reflexive and transitive. For a pair of concept descriptions C and D, the least upper bound is denoted C ⊔ D and the greatest lower bound is denoted C ⊓ D. Additionally, for any concept description C it holds that ⊥ ⊑ C ⊑ ⊤.
To represent the lattice, we use the syntactic representation of the axioms (where the operator is ⊑ and not ⊆) (Figure 1). The representation based on ⊑ does not satisfy all the properties of lattices; however, it is used as an intermediate structure between the lattice and the embedding space, which will be introduced later (Section 3.2).
Lattice representation. ⊥ is the bottom element and ⊤ is the top element. Arrows represent the ⊑ operator.
The concepts in the lattice are materialized following a recursive process and, depending on the type of concept description, the lattice can be extended with new elements. We rely on connections between DL and Category Theory described in Duc (2021).
Intersection of Concepts
Given a concept description C ⊓ D in the theory, we add the following relationships to the lattice: C ⊓ D ⊑ C and C ⊓ D ⊑ D. Additionally, for any concept X, if the relationships X ⊑ C and X ⊑ D are found in the lattice, we add the relationship X ⊑ C ⊓ D (Figure 2a). Concepts C and D are processed recursively.
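The intersection rule can be sketched as follows. This is an illustrative toy sketch rather than the paper's implementation: ⊑ morphisms are encoded as a set of (sub, sup) pairs, and the tuple encoding of complex concepts is an assumption of the example.

```python
def process_intersection(edges, c, d):
    """Add morphisms for the concept C ⊓ D to an edge set of (sub, sup) pairs."""
    below_c = {x for (x, y) in edges if y == c}
    below_d = {x for (x, y) in edges if y == d}
    meet = ("and", c, d)      # illustrative encoding of C ⊓ D
    edges.add((meet, c))      # C ⊓ D ⊑ C
    edges.add((meet, d))      # C ⊓ D ⊑ D
    # Any X already below both C and D is also below C ⊓ D.
    for x in below_c & below_d:
        edges.add((x, meet))
    return meet

edges = {("X", "C"), ("X", "D")}
m = process_intersection(edges, "C", "D")
assert ("X", m) in edges  # X ⊑ C ⊓ D was derived
```

The union rule of the next subsection is the exact dual: swap the direction of every added edge and intersect the sets of concepts above C and D.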
Lattice representations of complex concept descriptions.
Union of Concepts
Given a concept description C ⊔ D in the theory, we add the following relationships to the lattice: C ⊑ C ⊔ D and D ⊑ C ⊔ D. Additionally, for any concept X, if the relationships C ⊑ X and D ⊑ X are found in the lattice, we add the relationship C ⊔ D ⊑ X (Figure 2b). Concepts C and D are processed recursively.
Negation of Concepts
Given a concept ¬C, the elements C ⊓ ¬C and C ⊔ ¬C are added to the lattice. The relationships C ⊓ ¬C ⊑ ⊥ and ⊤ ⊑ C ⊔ ¬C are added to the lattice. Additionally, for any concept X, if the relationship X ⊑ C is found in the lattice, we add the relationship ¬C ⊑ ¬X, and if the relationship C ⊑ X is found, we add the relationship ¬X ⊑ ¬C (Figure 2c). The concept C is processed recursively.
Existential Restriction of Concepts
First, an auxiliary preorder is constructed for DL roles. In this preorder, roles r and s stand in a relationship s ⊑ r if s = r or if s ⊑ r is entailed. The role preorder is extended during the lattice construction process. For any role r represented in the preorder, the elements ∃r.⊥ and ∃r.⊤ are added to the lattice. Given a concept description ∃r.C, the relationship ∃r.C ⊑ ∃r.⊤ is added to the lattice. The relationships ∃r.⊥ ⊑ ∃r.C and ∃r.⊥ ⊑ ⊥ are added to the lattice. Additionally, if there are roles s with relationships s ⊑ r in the role preorder and X ⊑ C in the lattice, the relationship ∃s.X ⊑ ∃r.C is added. The concept C is processed recursively.
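The monotonicity part of this rule (∃ is monotone in both the role and the filler) can be sketched as follows. This is a toy sketch under the same assumed encodings as before; the exact set of bounding morphisms in the paper's construction is not reproduced here.

```python
def process_existential(edges, role_edges, r, c):
    """Add morphisms for ∃r.C, including role/filler monotonicity.

    `edges` holds concept morphisms as (sub, sup) pairs; `role_edges` holds
    the role preorder as (smaller, larger) pairs. Encodings are illustrative.
    """
    ex = ("exists", r, c)
    bottom, top = ("exists", r, "Bot"), ("exists", r, "Top")
    edges.add((bottom, ex))  # ∃r.⊥ ⊑ ∃r.C
    edges.add((ex, top))     # ∃r.C ⊑ ∃r.⊤
    # If s ⊑ r and X ⊑ C, then ∃s.X ⊑ ∃r.C.
    smaller_roles = {s for (s, t) in role_edges if t == r} | {r}
    fillers_below = {x for (x, y) in edges if y == c}
    for s in smaller_roles:
        for x in fillers_below:
            edges.add((("exists", s, x), ex))
    return ex
```

For example, with role_edges = {("s", "r")} and edges = {("X", "C")}, the call adds the morphism ∃s.X ⊑ ∃r.C.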
Universal Restriction of Concepts
Given a concept description ∀r.C, the element ¬∃r.¬C is added to the lattice with the relationships ∀r.C ⊑ ¬∃r.¬C and ¬∃r.¬C ⊑ ∀r.C. Furthermore, if there are roles s with relationships r ⊑ s in the role preorder and X ⊑ C in the lattice, the relationship ∀s.X ⊑ ∀r.C is added. Concepts C, ¬C, and ∃r.¬C are processed recursively.
Subsumption Axioms
Axioms C ⊑ D are incorporated directly into the lattice. Additionally, the relationship ¬D ⊑ ¬C is added to the lattice. Concepts C and D are processed recursively.
Class Assertion Axioms
Given an axiom C(a), we construct the element {a} in the lattice with the following relationships: {a} ⊑ C, ⊥ ⊑ {a}, and {a} ⊑ ⊤.
Role Assertion Axioms
Given an axiom r(a, b), we construct the elements {a}, {b}, and ∃r.{b} in the lattice with the following relationships: {a} ⊑ ∃r.{b}, ∃r.⊥ ⊑ ∃r.{b}, ∃r.{b} ⊑ ∃r.⊤, and {b} ⊑ ⊤.
Every operator (¬, ⊓, ⊔, ∃, ∀) introduces a constant number of elements into the lattice and a constant number of relationships. Therefore, for a formula in the knowledge base with n operators, the space and time complexity to process it is O(n).
Saturation Procedures
The lattice construction process is not complete in the sense that we consider only a subset of the infinite set of possible concept descriptions. Additional concept descriptions that can be generated by deduction rules will not have a representation in the lattice. While this extra information might improve the quality of the embeddings, the time and space requirements to construct and process the lattice will grow. We study the impact of adding new information to the lattice by introducing saturation procedures.
By saturation we refer to the process of adding new elements and morphisms to the lattice until a fixed point is reached. However, due to practical limitations, we apply the saturation rules partially and the fixed point might not actually be reached. Since the lattice is equipped with a transitive relation, an immediate saturation rule is to compute the transitive closure of the lattice. Additionally, as specified in Brieulle et al. (2022), certain deduction rules can be transformed into partial saturation procedures. We specify the rules below in the form of premises ⟹ conclusions, where the left-hand side denotes morphisms existing in the lattice and the right-hand side denotes the elements and morphisms to be added to the lattice:
Equations (1), (2), and (3) are applicable to a subset of the asserted morphisms in the lattice, and each of them introduces one new element into the lattice. Equation (4) corresponds to the transitive closure of the lattice and introduces only new morphisms, but no new elements. Equation (5) is applicable to all morphisms in the lattice and introduces several new elements per morphism. Due to the large space complexity required when implementing equation (5), we do not consider it in our analysis.
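The transitive-closure rule (equation (4)) amounts to composing morphisms until a fixed point. A naive illustrative sketch over an edge-set representation:

```python
def transitive_closure(edges):
    """Saturate a set of ⊑ morphisms under transitivity (rule (4)).

    Repeatedly composes pairs of morphisms until no new pair appears.
    Naive fixed-point iteration; for large lattices this is costly,
    which is why full saturation is applied only partially.
    """
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        new = {(x, z)
               for (x, y) in closure
               for (y2, z) in closure
               if y == y2 and (x, z) not in closure}
        if new:
            closure |= new
            changed = True
    return closure

assert ("A", "C") in transitive_closure({("A", "B"), ("B", "C")})
```

The quadratic join per iteration makes the cost grow quickly with the number of morphisms, which matches the observation that the Tr setting is only practical on the smaller ORE1 lattice.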
Embedding Into an Ordered-Vector Space
With the lattice structure in place, we proceed to embed it into an ordered-vector space. This step is crucial for preserving the hierarchical relationships within the lattice, ensuring that our embeddings reflect the inherent ordering of concept descriptions. We use an ordered-vector space (X, ⪯) over ℝⁿ where, for elements x, y ∈ X with x = (x₁, …, xₙ) and y = (y₁, …, yₙ), x ⪯ y if and only if xᵢ ≤ yᵢ for all i ∈ {1, …, n}. We show in the Appendix that (X, ⪯) is an ordered-vector space.
Consequently, we introduce a parameterized function f_θ which maps objects in the lattice to the ordered-vector space (X, ⪯) over ℝⁿ. In this way, we intend f_θ to be a lattice-preserving function. Since f_θ is unknown, our task is to find the set of parameters θ that accommodates the intended structure of the space. We optimize θ using gradient descent. We use the following order-preserving scoring function (Vendrov et al., 2016): score(c, d) = ‖max(0, f_θ(c) − f_θ(d))‖²
for elements c, d with a relationship c ⊑ d. If f_θ(c) ⪯ f_θ(d), then score(c, d) = 0, and otherwise score(c, d) > 0. We apply the following loss function to all relationships c ⊑ d in the lattice: ℒ = Σ_{c ⊑ d} ( score(c, d) + max(0, γ − score(c, d′)) )
Relationships c ⊑ d′ are called negative samples and are generated by replacing d in an existing relationship with a corrupted entity d′ obtained by sampling from a uniform distribution. The parameter γ is a margin enforcing a minimum score value for the negative samples.
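The order-violation score and margin loss can be sketched numerically as follows. This sketch assumes the coordinatewise convention u ⪯ v iff uᵢ ≤ vᵢ stated above (Vendrov et al. (2016) use the mirrored convention); it is an illustration, not the CatE training code.

```python
import numpy as np

def order_score(u, v):
    """Order-violation score: zero iff u ⪯ v holds coordinatewise."""
    return float(np.sum(np.maximum(0.0, u - v) ** 2))

def margin_loss(pos_pairs, neg_pairs, margin=1.0):
    """Positive pairs are pushed to score 0; negatives to score >= margin."""
    loss = sum(order_score(u, v) for u, v in pos_pairs)
    loss += sum(max(0.0, margin - order_score(u, v)) for u, v in neg_pairs)
    return loss

u, v = np.array([0.1, 0.2]), np.array([0.5, 0.9])
assert order_score(u, v) == 0.0  # u ⪯ v holds, no violation
assert order_score(v, u) > 0.0   # the reversed pair is penalized
```

When the loss reaches 0, every positive pair satisfies the coordinatewise order and every negative is separated by at least the margin, which is the situation analyzed in the theorem below.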
We show that the space acquires a partial-order structure whenever the loss function ℒ = 0.
Lattice-preserving embeddings
Let T be an ALC theory and let L(T) be the lattice of concept descriptions generated from T. Let (X, ⪯) be an ordered-vector space where, for elements x, y ∈ X with x = (x₁, …, xₙ) and y = (y₁, …, yₙ), x ⪯ y if and only if xᵢ ≤ yᵢ for all i. Let f_θ be a function mapping objects from L(T) to X. If ℒ = 0, then f_θ is a lattice-preserving function of L(T) into (X, ⪯).
Let us assume that ℒ = 0 and that there exists a relationship c ⊑ d in the lattice such that f_θ(c) ⋠ f_θ(d), meaning that the order is not preserved in the vector space. Rearranging the definition of ℒ in equation (7), we have that ℒ = score(c, d) + m, where m is a non-negative number. Therefore, since ℒ = 0, it follows that score(c, d) = 0. Consequently, we have that f_θ(c) ⪯ f_θ(d), which leads to a contradiction.
Now that we have shown that any relationship c ⊑ d in the lattice is preserved as f_θ(c) ⪯ f_θ(d) in X, we verify that f_θ preserves the partial-order properties:
Reflexivity: Let c ⊑ c be a relationship in the lattice. Since ℒ = 0, it implies that f_θ(c) ⪯ f_θ(c).
Transitivity: Let c ⊑ d and d ⊑ e be relationships in the lattice. Since ℒ = 0, it follows that f_θ(c) ⪯ f_θ(d) and f_θ(d) ⪯ f_θ(e) and, by the transitive property of ⪯, f_θ(c) ⪯ f_θ(e).
Antisymmetry: Let c ⊑ d and d ⊑ c be relationships in the lattice. Since ℒ = 0, it follows that f_θ(c) ⪯ f_θ(d) and f_θ(d) ⪯ f_θ(c) and, by the antisymmetry property of ⪯, f_θ(c) = f_θ(d).
The embedding preserves a lattice structure. The lattice CatE preserves is generated from the objects and morphisms of the category that provides the semantics for an ALC theory T. This category is shown to be compatible with classical semantics in the sense that the theory T is category-theoretically unsatisfiable if and only if it is set-theoretically unsatisfiable (Duc, 2021). Therefore, by preserving the lattice structure, f_θ also preserves the categorical semantics, and thereby the classical semantics.
Example in ALC
Consider the theories T₁ and T₂ and their lattices shown in Figures 3a and 3b, respectively. Figure 3c shows an example of embeddings in ℝ². In theory T₂, the concept C is unsatisfiable (C ⊑ ⊥), which implies that its sub-concepts are also unsatisfiable by the transitivity rule (equation (4)). This implies that, set-theoretically, these concepts are interpreted as the empty set in all models and, category-theoretically, the morphisms to and from ⊥ hold in both directions. By the antisymmetry property of ⪯, the only way to preserve the lattice structure is to make the embeddings of these concepts equivalent, which is reflected in Figure 3c, where, in the embedding space of theory T₂, the embeddings for the unsatisfiable concepts get closer and, based on equation (7), will eventually converge.
(a, b) Lattices generated from theories T₁ and T₂, respectively. (c) Generated embeddings for objects in both theories. Notice that in the embeddings for theory T₂, the unsatisfiable entities lie close to each other since they become semantically equivalent. (a) Lattice for theory T₁; (b) lattice for theory T₂; and (c) example of embeddings in ℝ² of the lattices generated by theories T₁ (left) and T₂ (right).
Evaluation
To show the effectiveness of our method, we evaluate it on the following tasks: (a) generation of entailed axioms and (b) generation of probable axioms. In the task of generating entailed axioms, we use the ORE1 dataset from SemREC (Banerjee et al., 2023) and generate axioms of the form C(a), where C is a concept name and a is an individual. In the case of generating probable axioms, we constructed datasets using GO (Ashburner et al., 2000) and FoodOn (Dooley et al., 2018) to generate axioms of the form C ⊑ D, where C and D are concept names. For each case, we also show how partially saturating the constructed lattice impacts the performance of axiom generation. Additionally, we applied our method to the biomedical task of predicting protein–protein interactions (PPIs). This task is a form of generation of probable statements of the form r(a, b), where r is a role and a, b are individuals. We show information about the datasets in Table 1.
Number of Axioms in Training, Validation, and Testing Ontologies, Number of Relationships in the Corresponding Training Lattices and DL Expressivity.
Dataset   Training   Validation   Testing   Lattice     Expressivity
ORE1      61,245     7,578        15,157    364,849
FoodOn    34,224     2,977        5,957     631,423
GO        81,844     7,260        14,521    1,257,443
PPI       351,435    12,038       12,040    4,479,085
Note. DL = Description Logic; FoodOn = food ontology; GO = gene ontology; PPI = protein–protein interaction.
Experimental Setup
To find the optimal hyperparameters for our method, we performed a grid search over the embedding dimension, margin (γ), number of negative samples, batch size, and learning rate. We used the Adam optimizer (Kingma & Ba, 2015) with a Cyclic Learning Rate scheduler (Smith, 2017).
As baseline methods, we selected approaches that use only the ontology axioms, without any external knowledge such as text (Chen et al., 2023, 2021). We selected ELEmbeddings (Kulmanov et al., 2019) and BoxEL (Xiong et al., 2022), and used the implementations provided in the mOWL library (Zhapa-Camacho et al., 2022). Since both methods can handle only axioms in EL, we normalized the ontologies into EL normal forms. In the case of FoodOn, we applied our method to both the normalized EL and the original ALC versions. To obtain optimal parameters for the baseline methods, we performed a grid search over the embedding dimension, margin, batch size, and learning rate. Additionally, we compared with FALCON (Tang et al., 2023); however, due to its high memory and time requirements, we were unable to test different hyperparameters for this method. All selected hyperparameters are provided in Table 6 of Appendix B.
We report a variety of rank-based metrics such as mean rank (MR), MRR, Hits@3, Hits@10, Hits@100, and receiver operating characteristic area under the curve (ROC AUC).
In all tasks, we report filtered metrics only, filtering statements from the training set. In the task of generating axioms C(a), we additionally filter statements in the deductive closure of the training set.
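Filtered rank-based evaluation can be sketched as follows. This is an illustrative sketch (not the evaluation code of the paper): known positives (from the training set and, where applicable, its deductive closure) are excluded from the candidate list before computing the rank of the true answer, and the ranks are then summarized into MR, MRR, and Hits@k.

```python
import numpy as np

def filtered_rank(scores, true_idx, known_idxs):
    """Rank of the true candidate after filtering other known positives.

    `scores`: one score per candidate, lower is better (order-violation scores).
    `known_idxs`: indices of candidates that are already known positives.
    """
    s = np.array(scores, dtype=float)
    mask = np.ones_like(s, dtype=bool)
    mask[list(known_idxs - {true_idx})] = False  # drop other known positives
    better = np.sum(s[mask] < s[true_idx])
    return int(better) + 1

def summarize(ranks, ks=(3, 10, 100)):
    """Aggregate a list of ranks into MR, MRR, and Hits@k."""
    ranks = np.array(ranks, dtype=float)
    out = {"MR": ranks.mean(), "MRR": (1.0 / ranks).mean()}
    for k in ks:
        out[f"Hits@{k}"] = (ranks <= k).mean()
    return out

# The true candidate (index 1) is ranked 3rd once index 0 is filtered out.
assert filtered_rank([0.1, 0.5, 0.3, 0.2], 1, {0}) == 3
```

Ties are counted optimistically here (only strictly better scores raise the rank); evaluation protocols differ on tie handling, so this is one possible convention.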
Generating Entailed Axioms
The SemREC challenge (Banerjee et al., 2023), which evaluates neuro-symbolic reasoners, provides a number of benchmark datasets. We selected a representative dataset called ORE1 and used it to test our method on the task of predicting axioms C(a), where C is a concept description and a is an individual. We perform a ranking-based evaluation, where we rank every testing statement C(a) against every statement C′(a), where C′ is a named concept. We show results in Table 2, where we can see that CatE performs better than the baseline methods across all metrics.
Prediction of Axioms C(a), Where C is a Concept and a is an Individual. We Selected the ORE1 Dataset Proposed in Banerjee et al. (2023).
Method         MR    MRR    Hits@3   Hits@10   Hits@100   AUC
ELEmbeddings   105   0.12   0.08     0.22      0.87       0.99
BoxEL          122   0.10   0.08     0.18      0.70       0.98
FALCON         603   0.02   0.00     0.02      0.34       0.92
CatE           37    0.18   0.10     0.51      0.96       0.99
Bold values indicate best performance, underlined values indicate second best performance.
Note. MR = mean rank; MRR = mean reciprocal rank; AUC = area under the curve.
Generating Probable Axioms
To evaluate the task of generating probable axioms, we generate two benchmark sets following procedures designed in previous methods (Chen et al., 2021; Mondal et al., 2021). We create two datasets using the GO (Ashburner et al., 2000) and the food ontology (FoodOn; Dooley et al., 2018). In each ontology, we remove 30% of the subsumption axioms uniformly at random and use 10% for validation and 20% for testing. The training set contains the remaining 70% of the subsumption axioms together with all other axioms existing in the ontology.
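The split described above can be sketched as follows; this is an illustrative sketch of the benchmark construction (the seed and encoding of axioms are assumptions), not the paper's dataset-generation script.

```python
import random

def split_axioms(subsumption_axioms, other_axioms, seed=42):
    """70/10/20 split of subsumption axioms, uniformly at random.

    All non-subsumption axioms stay in the training set.
    """
    rng = random.Random(seed)
    axioms = list(subsumption_axioms)
    rng.shuffle(axioms)
    n = len(axioms)
    n_valid, n_test = int(0.1 * n), int(0.2 * n)
    valid = axioms[:n_valid]
    test = axioms[n_valid:n_valid + n_test]
    train = axioms[n_valid + n_test:] + list(other_axioms)
    return train, valid, test
```

For 100 subsumption axioms and one additional axiom, this yields training, validation, and test sets of sizes 71, 10, and 20, respectively.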
We focus on the prediction of subsumption axioms C ⊑ D and perform a rank-based evaluation, ranking the scores of axioms of interest over all axioms C ⊑ D′, where D′ ranges over named concepts. Table 3 shows the results. We can see that CatE consistently outperforms the baselines in all metrics. In the case of FoodOn, we apply CatE to the EL (CatE-EL) and the ALC (CatE) versions of the ontology. CatE-EL outperforms the baselines, demonstrating that our method generates better embeddings for this specific task. Adding the extra information present in the ALC version of FoodOn further improves the Hits@k metrics.
TBox Completion Task Over Axioms C ⊑ D in GO and FoodOn.
Dataset   Method              MR      H@10   H@100   AUC
GO        ELEmbeddings        3,562   0.19   0.37    0.92
          BoxEL               6,621   0.01   0.07    0.85
          FALCON (5 models)   8,982   0.02   0.08    0.79
          CatE                2,968   0.22   0.58    0.93
FoodOn    ELEmbeddings        3,336   0.25   0.38    0.88
          BoxEL               2,763   0.06   0.19    0.90
          FALCON (5 models)   3,815   0.02   0.12    0.86
          CatE-EL             2,633   0.29   0.43    0.91
          CatE                2,764   0.30   0.47    0.90
Bold values indicate best performance, underlined values indicate second best performance.
Note. GO = gene ontology; FoodOn = food ontology; MR = mean rank; H@10 = Hits@10; H@100 = Hits@100; AUC = area under the curve.
PPI Prediction
PPIs are direct or indirect interactions between proteins, and information about PPIs is useful in systems biology and network-based bioinformatics methods. While PPIs can be investigated experimentally, several strategies have been developed to predict them using a variety of information, including the predicted or experimentally determined functions of proteins. The functions of proteins can be represented using the GO: if F is a class from the GO, the axiom P ⊑ ∃hasFunction.F asserts that the class of proteins P has function F. PPIs can be encoded in axioms P₁ ⊑ ∃interactsWith.P₂, where P₁ and P₂ are proteins. In order to apply our method, we need to ensure that elements ∃interactsWith.P exist in the lattice for any class of proteins P. Therefore, we added the relationships ∃interactsWith.⊥ ⊑ ∃interactsWith.P and ∃interactsWith.P ⊑ ∃interactsWith.⊤ to the lattice structure for all classes of proteins P. We used the PPI dataset provided in Zhapa-Camacho et al. (2022). We compare our method against state-of-the-art methods such as ELEmbeddings and BoxEL (Kulmanov et al., 2020; Xiong et al., 2022). We show the results in Table 4, where we can see that CatE is not able to outperform the baselines. The PPI benchmark relies on the assumption that the information in GO acts as background knowledge to predict PPIs. To further investigate this task, we evaluate how well the methods capture the hierarchy of GO functions, which are axioms of the type C ⊑ D. We compute the deductive closure of the subsumption axioms using the ELK reasoner (Kazakov et al., 2013), and evaluate the capability of each method to generate the axioms in this new set. We find that ELEmbeddings and BoxEL do not capture the semantics of the GO axioms at all, yet they can still perform PPI predictions. Originally, ELEmbeddings and BoxEL are trained with negative samples only for PPI axioms, which can cause the other axiom types to converge to a trivial solution. Since CatE uses negative samples for all relationships in the lattice, it can predict PPIs while capturing other types of information in GO.
Our analysis shows that predicting PPIs on its own is not sufficient to show that a particular embedding method is utilizing the background knowledge. Further analysis of embedding methods is required, which is out of the scope of this work.
PPI Prediction on Yeast. Left Side Shows the Results on PPI Axioms. Right Side Shows the Results on Subsumption Axioms That Are Learned During Training.
                PPI axioms                                    Subsumption axioms
Method          MR    MRR    H@3    H@10   H@100   AUC        MR       H@100   AUC
ELEmbeddings    289   0.10   0.09   0.25   0.73    0.95       23,812   0.00    0.53
BoxEL           188   0.17   0.19   0.43   0.81    0.97       23,234   0.00    0.54
CatE            223   0.08   0.07   0.18   0.69    0.96       8,936    0.28    0.82
Bold values indicate best performance, underlined values indicate second best performance.
Note. PPI = protein–protein interaction; MR = mean rank; MRR = mean reciprocal rank; H@3 = Hits@3; H@10 = Hits@10; H@100 = Hits@100; AUC = area under the curve.
Effect of Partial Saturation Procedures
To analyze the impact of the saturation procedures, we extend the lattices of the ORE1, GO, and FoodOn use cases. We first experiment with the ORE1 lattice, as it is the smallest one, and apply three types of saturation: (a) S1, which consists of applying equations (1), (2), and (3); (b) Tr, which consists of computing the transitive closure of the lattice; and (c) S1–Tr, which consists of performing S1 followed by Tr. For the GO and FoodOn use cases, which produce larger lattices, we only apply S1, because the other settings introduce a large number of elements and morphisms, which makes the optimization costly and also hinders the hyperparameter search. We show performance results in Table 5 and notice that the S1 procedure improves the MRR and Hits@3 metrics in all three use cases. Additionally, for ORE1, the Tr procedure improves metrics such as MR and Hits@100; however, the combination S1–Tr does not improve performance.
Impact of the Application of Saturation Procedures on the Performance of Axiom Generation.
Dataset   Method       MRR     H@3     H@10    H@100   MR
ORE1      CatE         0.175   0.097   0.505   0.958   37
          CatE-S1      0.176   0.115   0.426   0.884   46
          CatE-Tr      0.164   0.104   0.381   0.991   23
          CatE-S1–Tr   0.155   0.098   0.367   0.931   30
GO        CatE         0.062   0.008   0.216   0.578   2,968
          CatE-S1      0.066   0.011   0.226   0.595   3,002
FoodOn    CatE         0.087   0.023   0.298   0.473   2,764
          CatE-S1      0.094   0.121   0.238   0.419   3,310
Bold values indicate best performance, underlined values indicate second best performance.
Note. MRR = mean reciprocal rank; H@3 = Hits@3; H@10 = Hits@10; H@100 = Hits@100; MR = mean rank; GO = gene ontology; FoodOn = food ontology.
Effect of Hyperparameters
The time and space complexity of CatE increases linearly with the number of operators. However, the number of operators can be arbitrarily large for axioms in ALC. Furthermore, hyperparameters such as the embedding size and the number of negative samples can have an impact on training and/or inference time as well as on memory consumption. In Figure 4, we analyze how these hyperparameters impact performance. We chose the Hits@100 and ROC AUC metrics and show that while the embedding dimension has a direct impact on performance (the higher the dimension, the better the performance), the number of negative samples does not have a large effect (either positive or negative).
Impact of embedding size and number of negatives on the Hits@100 and receiver operating characteristic area under the curve (ROC AUC) over different datasets.
Discussion
We have developed a method named CatE that generates embeddings for the ALC language. CatE materializes the lattice structure of the concept descriptions found in a knowledge base. Furthermore, we use an order-preserving loss function to optimize the embedding space, and we show that when the loss function is minimized, the embedding space preserves partial-order properties. We applied our method to different forms of the knowledge base completion task and showed that it can outperform several state-of-the-art methods.
Additionally, we implemented saturation procedures to extend the lattices and the information therein. We showed that saturated versions of the lattices can improve performance on some metrics. However, not all saturation rules can be applied when the knowledge bases are large, because the size of the resulting lattice and the number of morphisms can hinder the optimization process. A potential direction for future work is to generate some concepts directly in the embedding space rather than explicitly materializing them within the lattice.
Current graph-based methods to embed DL knowledge bases (ontologies) construct graphs relying on the syntactic information therein, and the embedding process is not guaranteed to be invertible. On the other hand, methods such as ELEmbeddings, BoxEL, and FALCON are able to generate approximate models for DL knowledge bases. We argue that CatE stands at a midpoint between both types of methods: CatE uses the syntactic information in the knowledge base to construct a lattice and, consequently, an embedding space that is consistent with the semantics.
However, as in graph-based embeddings, CatE cannot make inferences over concepts that are not explicitly stated in the lattice. This is a limitation that was exposed in the PPI task, where we had to add concept descriptions a priori in order to be able to make inferences over them. To mitigate this issue, future work can focus on solutions based on inductive learning over knowledge graphs, which can be applicable in the context of lattices.
Conclusion
We developed an embedding method for the DL ALC that preserves the lattice structure of concept descriptions. Our method materializes the lattice structure following connections between DLs and Category Theory. The resulting lattice is embedded into an ordered-vector space. We provide empirical evidence that our method performs effectively across different knowledge base completion tasks.
Acknowledgements
We acknowledge support from the King Abdullah University of Science and Technology (KAUST) Supercomputing Laboratory.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work has been supported by funding from KAUST Office of Sponsored Research (OSR) under Award Nos. URF/1/4355-01-01, URF/1/4675-01-01, URF/1/4697-01-01, URF/1/5041-01-01, and REI/1/5334-01-01. This work was supported by the SDAIA–KAUST Center of Excellence in Data Science and Artificial Intelligence (SDAIA–KAUST AI), by funding from KAUST Center of Excellence for Smart Health (KCSH) under award number 5932, and by funding from KAUST Center of Excellence for Generative AI under award number 5940.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
ORCID iDs
Fernando Zhapa-Camacho
Robert Hoehndorf
The Ordered-Vector Space (X, ⪯)

Proposition: ⪯ is a partial order.

The pair (X, ⪯) over X = ℝⁿ, where for elements x, y ∈ ℝⁿ with x = (x₁, …, xₙ) and y = (y₁, …, yₙ), x ⪯ y if and only if xᵢ ≤ yᵢ for all i ∈ {1, …, n}, is a partial order.

We demonstrate each property of a partial order:

Reflexivity: Let x ∈ ℝⁿ. Since xᵢ ≤ xᵢ for any i ∈ {1, …, n}, we have x ⪯ x.

Transitivity: Let x, y, z ∈ ℝⁿ. If x ⪯ y and y ⪯ z, we have that xᵢ ≤ yᵢ and yᵢ ≤ zᵢ for any i ∈ {1, …, n}; therefore, xᵢ ≤ zᵢ for any i, which implies x ⪯ z.

Antisymmetry: Let x, y ∈ ℝⁿ. If x ⪯ y and y ⪯ x, it follows that xᵢ ≤ yᵢ and yᵢ ≤ xᵢ for any i ∈ {1, …, n}. Therefore, xᵢ = yᵢ for all i, and x = y.
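The three properties above can also be sanity-checked numerically. The following is a minimal sketch assuming the componentwise order on ℝⁿ; the helper `leq` is introduced here purely for illustration:

```python
import numpy as np

def leq(x, y):
    # Componentwise partial order on R^n: x ⪯ y iff x_i <= y_i for all i.
    return bool(np.all(x <= y))

rng = np.random.default_rng(0)
x, y, z = rng.random(4), rng.random(4), rng.random(4)

# Reflexivity: every vector is related to itself.
assert leq(x, x)

# Transitivity: whenever the premises hold, so does the conclusion.
if leq(x, y) and leq(y, z):
    assert leq(x, z)

# Antisymmetry: mutual relation forces equality.
if leq(x, y) and leq(y, x):
    assert np.array_equal(x, y)
```

Random vectors exercise only the cases where the premises happen to hold; the proof above covers all cases.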
Hyperparameter Selection
Selection of Hyperparameters for the Different Methods With Respect to the Dataset Used.
Dataset   Method         E.S.   L.R.      M      B.S.    N.N.
GO        ELEmbeddings   200    0.0001    0.1    20,000  1
GO        BoxEL          200    0.00001   0.1    20,000  1
GO        CatE           200    0.00001   1.0    32,768  4
FoodOn    ELEmbeddings   50     0.001     0.1    20,000  1
FoodOn    BoxEL          200    0.0001    0.1    40,000  1
FoodOn    CatE           200    0.0001    1.0    8,192   2
ORE1      ELEmbeddings   200    0.00001   0.01   4,096   1
ORE1      BoxEL          200    0.0001    0.0    8,192   1
ORE1      CatE           200    0.0001    1.0    32,768  4
PPI       CatE           256    0.0001    0.1    2,048   4
Note. E.S. = embedding size; L.R. = learning rate; M = margin; B.S. = batch size; N.N. = number of negative samples; GO = gene ontology; FoodOn = food ontology; PPI = protein–protein interaction.
References
1. Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M., & Sherlock, G. (2000). Gene ontology: Tool for the unification of biology. Nature Genetics, 25(1), 25–29. https://doi.org/10.1038/75556
2. Baader, F., Calvanese, D., McGuinness, D., Nardi, D., & Patel-Schneider, P. F. (Eds.). (2003). The description logic handbook: Theory, implementation, and applications. Cambridge University Press. https://doi.org/10.1017/CBO9780511711787
3. Banerjee, D., Usbeck, R., Mihindukulasooriya, N., Singh, G., Mutharaju, R., & Kapanipathi, P. (Eds.). (2023). Joint proceedings of Scholarly QALD 2023 and SemREC 2023, co-located with the 22nd International Semantic Web Conference (ISWC 2023), Athens, Greece, November 6–10, 2023. CEUR Workshop Proceedings (Vol. 3592). CEUR-WS.org. https://ceur-ws.org/Vol-3592
4. Brieulle, L., Duc, C. L., & Vaillant, P. (2022). Reasoning in the description logic ALC under category semantics (extended abstract). In O. Arieli, M. Homola, J. C. Jung, & M. Mugnier (Eds.), Proceedings of the 35th International Workshop on Description Logics (DL 2022), co-located with the Federated Logic Conference (FLoC 2022), Haifa, Israel, August 7–10, 2022. CEUR Workshop Proceedings (Vol. 3263). CEUR-WS.org. https://ceur-ws.org/Vol-3263/abstract-7.pdf
5. Chen, J., He, Y., Geng, Y., Jiménez-Ruiz, E., Dong, H., & Horrocks, I. (2023). Contextual semantic embeddings for ontology subsumption prediction. World Wide Web, 26(5), 2569–2591. https://doi.org/10.1007/s11280-023-01169-9
6. Davey, B. A., & Priestley, H. A. (2002). Ordered sets. In Introduction to lattices and order (2nd ed., pp. 1–32). Cambridge University Press. https://doi.org/10.1017/CBO9780511809088.003
7. Dooley, D. M., Griffiths, E. J., Gosal, G. S., Buttigieg, P. L., Hoehndorf, R., Lange, M. C., Schriml, L. M., Brinkman, F. S. L., & Hsiao, W. W. L. (2018). FoodOn: A harmonized food ontology to increase global food traceability, quality control and data integration. npj Science of Food, 2(1), 23. https://doi.org/10.1038/s41538-018-0032-6
8. Duc, C. L. (2021). Category-theoretical semantics of the description logic ALC. In M. Homola, V. Ryzhikov, & R. A. Schmidt (Eds.), Proceedings of the 34th International Workshop on Description Logics (DL 2021), Bratislava, Slovakia, September 19–22, 2021. CEUR Workshop Proceedings (Vol. 2954). CEUR-WS.org. https://ceur-ws.org/Vol-2954/paper-22.pdf
9. Hao, J., Chen, M., Yu, W., Sun, Y., & Wang, W. (2019). Universal representation learning of knowledge bases by jointly embedding instances and ontological concepts. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '19) (pp. 1709–1719). ACM. https://doi.org/10.1145/3292500.3330838
10. Jackermeier, M., Chen, J., & Horrocks, I. (2024). Dual box embeddings for the description logic EL++. In T. Chua, C. Ngo, R. Kumar, H. W. Lauw, & R. K. Lee (Eds.), Proceedings of the ACM Web Conference 2024 (WWW 2024), Singapore, May 13–17, 2024 (pp. 2250–2258). ACM. https://doi.org/10.1145/3589334.3645648
11. Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In Y. Bengio & Y. LeCun (Eds.), 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings. http://arxiv.org/abs/1412.6980
12. Kulmanov, M., Liu-Wei, W., Yan, Y., & Hoehndorf, R. (2019). EL embeddings: Geometric construction of models for the description logic EL++. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (pp. 6103–6109). IJCAI.org. https://doi.org/10.24963/ijcai.2019/845
13. Kulmanov, M., Smaili, F. Z., Gao, X., & Hoehndorf, R. (2020). Semantic similarity and machine learning with ontologies. Briefings in Bioinformatics, 22(4), bbaa199. https://doi.org/10.1093/bib/bbaa199
14. Mondal, S., Bhatia, S., & Mutharaju, R. (2021). EmEL++: Embeddings for EL++ description logic. In A. Martin, K. Hinkelmann, H. Fill, A. Gerber, D. Lenat, R. Stolle, & F. van Harmelen (Eds.), Proceedings of the AAAI 2021 Spring Symposium on Combining Machine Learning and Knowledge Engineering (AAAI-MAKE 2021), Stanford University, Palo Alto, CA, USA, March 22–24, 2021. CEUR Workshop Proceedings (Vol. 2846). CEUR-WS.org. https://ceur-ws.org/Vol-2846/paper19.pdf
15. Özcep, Ö. L., Leemhuis, M., & Wolter, D. (2023). Embedding ontologies in the description logic ALC by axis-aligned cones. Journal of Artificial Intelligence Research, 78, 217–267. https://doi.org/10.1613/jair.1.13939
16. Schild, K. (1991). A correspondence theory for terminological logics: Preliminary report. In J. Mylopoulos & R. Reiter (Eds.), Proceedings of the 12th International Joint Conference on Artificial Intelligence, Sydney, Australia, August 24–30, 1991 (pp. 466–471). Morgan Kaufmann. http://ijcai.org/Proceedings/91-1/Papers/072.pdf
17. Smith, L. N. (2017). Cyclical learning rates for training neural networks. In 2017 IEEE Winter Conference on Applications of Computer Vision (WACV 2017), Santa Rosa, CA, USA, March 2017 (pp. 464–472). IEEE Computer Society.
18. Tang, Z., Hinnerichs, T., Peng, X., Zhang, X., & Hoehndorf, R. (2023). FALCON: Sound and complete neural semantic entailment over ALC ontologies. arXiv:2208.07628. https://doi.org/10.48550/arXiv.2208.07628
19. Vendrov, I., Kiros, R., Fidler, S., & Urtasun, R. (2016). Order-embeddings of images and language. In Y. Bengio & Y. LeCun (Eds.), 4th International Conference on Learning Representations (ICLR 2016), San Juan, Puerto Rico, May 2–4, 2016, Conference Track Proceedings. http://arxiv.org/abs/1511.06361
20. Xiong, B., Potyka, N., Tran, T.-K., Nayyeri, M., & Staab, S. (2022). Faithful embeddings for EL++ knowledge bases. In Proceedings of the 21st International Semantic Web Conference (ISWC 2022) (pp. 1–18). Springer. https://doi.org/10.1007/978-3-031-19433-7_2
21. Zhapa-Camacho, F., & Hoehndorf, R. (2023). From axioms over graphs to vectors, and back again: Evaluating the properties of graph-based ontology embeddings. In Proceedings of the 17th International Workshop on Neural-Symbolic Learning and Reasoning, La Certosa di Pontignano, Siena, Italy, July 3–5, 2023. https://ceur-ws.org/Vol-3432/paper7.pdf
22. Zhapa-Camacho, F., & Hoehndorf, R. (2024). Lattice-preserving ontology embeddings. In T. R. Besold, A. d'Avila Garcez, E. Jimenez-Ruiz, R. Confalonieri, P. Madhyastha, & B. Wagner (Eds.), Neural-Symbolic Learning and Reasoning (pp. 355–369). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-71167-1_19
23. Zhapa-Camacho, F., Kulmanov, M., & Hoehndorf, R. (2022). mOWL: Python library for machine learning with biomedical ontologies. Bioinformatics, 39(1), btac811. https://doi.org/10.1093/bioinformatics/btac811