Abstract
In this paper, we study the problem of query rewriting for disjunctive existential rules. Query rewriting is a well-known approach for query answering on knowledge bases with incomplete data. We propose a rewriting technique that uses negative constraints and conjunctive queries to remove the disjunctive components of disjunctive existential rules. This process eventually generates new non-disjunctive rules, i.e., existential rules. The generated rules can then be used to produce new rewritings using existing rewriting approaches for existential rules. With the proposed technique we are able to provide complete
Introduction
Rules are very important elements in knowledge-based systems and incomplete databases [21]; they allow us to perform query answering over incomplete data and come up with complete answers. There are two main approaches to perform query answering in the presence of rules, which depend on the way we use the rules. The forward chaining approach [26] applies the rules on the facts in order to produce new facts. On the other hand, the backward chaining approach [20] uses the rules to translate the input query into a set of queries (called the rewriting of the initial query) that also encode answers of the initial query. Both approaches allow us to infer answers that cannot be extracted from the initial data.
Let us consider a rule that defines the
Forward chaining allows efficient query answering in systems where data is constant and queries change frequently. However, the size of the stored data can grow excessively, and the method is not appropriate for frequently changing data. For some ontologies, this approach does not always terminate [26], and may keep generating constantly new data. Backward chaining, on the other hand, is ideal for constant queries and changing data, although the size of the rewriting can be exponential with respect to the size of the initial query [19] or in some cases a finite rewriting may not exist. In both approaches, the application of the rules does not always terminate. Furthermore, having no restriction on the expressivity of the rules or having negated atoms on the query makes the query entailment problem undecidable [7].
In this paper, we focus on the entailment problem of conjunctive queries with negated atoms (
Disjunctive existential rules allow the representation of very expressive knowledge in FOL, e.g.,
Conjunctive queries with negation let us define the counterexamples of disjunctive rules, e.g.,
The entailment problem for
Having existential variables, disjunction and queries with negated atoms makes the entailment problem even more difficult. However, by using these expressive resources in a smart way we can get very interesting and useful decidable fragments.
Particularly, in this paper we are interested in solving the entailment of a union of conjunctive queries with universally quantified negation by rewriting it into a union of conjunctive queries without negation (
In [20], König et al. define a generic rewriting procedure for conjunctive queries with respect to existential rules which takes as a parameter a rewriting operator, i.e., a function which takes as input a
In [12], Bourhis et al. present a study of the complexity of query answering with respect to guarded disjunctive existential rules. The problem is 2
Some other research papers [3,9,11,16] focus on the transformation of the query answering problem with respect to guarded (disjunctive) existential rules into a query answering problem with respect to (disjunctive) Datalog programs by getting rid of existential variables. In [3], Ahmetaj et al. provide a transformation of the problem that yields a polynomial size (disjunctive) Datalog program when the maximal number of variables in the (disjunctive) existential rules is bounded by a constant. The translations proposed by other authors [9,11,16] are of exponential size.
Based on the constraint resolution method, we propose an algorithm to compute a
Additionally, we consider unions of conjunctive queries with negated atoms and answer variables (denoted as
Finally, we implemented the proposed algorithm for rewriting conjunctive queries with negated atoms and answer variables with respect to existential rules in the
In this section we introduce the basic concepts related to the FOL resolution process. Resolution is the base of all the reasoning processes we describe in this paper. All steps in high level reasoning processes can be tracked down to sequences of resolution steps that ensure the correctness. We also describe the framework of disjunctive existential rules and present the definition of conjunctive queries with negated atoms.
First-order logic resolution
We assume the reader is familiar with the standard definition of first-order logic formulas. In this paper, we focus on FOL formulas without function symbols over a finite set of predicate names and a finite set of constant symbols. We also adopt the standard definitions for the entailment and equivalence of formulas, as they are rarely modified in the literature. We refer the reader to [25] in case a background reading is needed.
Because in the following we will often need to modify formulas, in order to make such modifications more compact and easier to understand we introduce the definition of conjunctive (disjunctive) set formulas (
(Conjunctive (Disjunctive) Set Formula).
A
For a given set of formulas
Note that in case a
Set operators can then be used to combine set formulas of the same type and obtain a new set formula. Moreover, elements of a set formula that are equivalent can be
A
De Morgan’s Laws allow changing a
Let
We model the entailment operator in FOL assuming that on the left-hand side of the entailment symbol we have a
The right-hand side of the entailment operator is assumed to be a
In the following, we recall the definitions of some of the concepts that are needed for the theory presented in this paper.
A
An
A formula is
A
Let
A substitution
A
The
A
It follows that a

Hypergraph corresponding to a
The
A detailed proof is given by Tessaris [29]. However, the reader can clearly see that since no variable is shared between the connected components
It is easy to check that two
A
Every FOL formula can be transformed into an equisatisfiable CNF formula using variable standarization, Skolemization, De Morgan’s laws, and the distributivity of the conjunction and disjunction logical operators.
An
Let
It is easy to show that resolution is sound, i.e.,
(Resolution Derivation (Refutation)).
Let Σ be a set of clauses and
Sometimes it is useful to know which clauses were used to produce a resolvent
(Derivation (Refutation) Graph).
Let Σ be a set of clauses and
If we only include in the graph the last clause

A derivation tree.
This is a straightforward consequence of the soundness of the resolution rule. □
A clause logically implies any instance of it, possibly extended with more literals. This follows directly from the properties of the disjunction operator and the universal quantification.
Let
(Deduction).
Let Σ be a set of clauses and
Resolution steps or derivations involving ground instances of clauses (i.e., clause instances that do not contain variables) ensure that there are corresponding resolution steps or derivations involving the non-ground version of the clauses. This process is known as
In the text below we recall known theorems taken from the literature [25].
(Lifting Lemma).
(Derivation Lifting).
Resolution derivations allow us to infer clauses that are logical consequences of an initial knowledge in a complete way.
(Subsumption Theorem).
(Refutation Completeness of Resolution).
Performing resolution steps in a
The resolution operator
(Symmetry).
If
(Distributivity).
If
Note that on right-hand side of (1) the literals used in the resolution steps with respect to
(Commutativity).
If
Proving the above properties is straightforward if we consider them over ground instances of the clauses and track the set operations on ground literals. As a consequence of Theorem 2.3, the properties also hold for general resolution over non-ground clauses.
We will illustrate the distributivity property for
A
A
A closer look at the definition of facts reveals that a fact is equivalent to a Boolean conjunctive query. However, facts are used to express existing knowledge, while queries represent questions, so they have different roles in the process of reasoning.
A
A
We say that a
A
(Disjunctive Knowledge Base).
Let us define an example Facts:
Existential rules:
Negative constraints:
Disjunctive existential rules:
Note that the existential rule (4) has an existential variable
In this paper, we study the
In particular, we solve the entailment problem (10) by reducing it to the entailment of a
Finally, we may be interested in the values that some of the variables in
Backward chaining with disjunctive knowledge
In this section we first present
Constraint resolution
The entailment problem (10) can be transformed into a consistency check problem
To apply resolution, we need to convert first the facts
(Positive/Negative Charge).
The
According to the above definition, a
Properties of different types of clauses in the disjunctive existential rules framework
Properties of different types of clauses in the disjunctive existential rules framework
From the Fact clauses:
Rule clauses:
Constraint clauses:
Disjunctive rule clauses:
Table 2 shows the properties of the resolvent for different types of clauses when performing a resolution step. From these properties follows that a resolution refutation should involve resolution steps with respect to fact clauses, because they always produce clauses with a smaller negative charge. Such resolution steps can be arranged so that they are performed in the last part of the resolution derivation. Additionally, this type of resolution steps is generally linked to
Properties of the resolvent
As we can see from Table 2, in order to get derivations that produce clauses with a non-increasing positive charge, we need to avoid resolutions steps involving two
Let Σ be a set of clauses and
The subsumption theorem can be formulated using constraint deductions and consequences with no positive literals.
(Constraint Subsumption for Constraint Clause Consequences).
Based on the subsumption theorem for resolution derivations (Theorem 2.4), the fact that
The resolution derivation
This follows from Theorem 3.1 by taking as consequence the empty clause that has no positive literals and is only subsumed by itself. □
Conjunctive query rewriting is a process that mimics the constraint derivations introduced in the previous section. However, resolution steps involving Skolem functions are performed together in order to avoid introducing literals with Skolem functions that will not be able to be removed. For existential rules, the process of query rewriting is well known [20]. However, in most of the existing literature disjunctive rules are mainly used in a forward chaining manner [13,15] or to perform Disjunctive Datalog rewritings [3,9,11,16].
In Example 2.3, one could infer that two first-degree relatives that have the same age have to be siblings:
This rule can be obtained by a constraint derivation using (9) and the clauses corresponding to:
The new existential rule (13) can then be used in rewriting steps defined for existential rules.
A rewriting step as defined in the existential rules framework [20] corresponds to the resolution steps between
(Rewriting Step).
Let if if
then the query
If an existential rule
Definition 3.3 is an adaptation to our framework of the
For the resolution steps between some
(General (Disjunctive) Rewriting Step).
Let if if
then
A disjunctive rewriting step can yield a disjunctive rule with fewer disjunctive components, an existential rule in case
Consider a disjunctive existential rule:
On the other hand, if we have the negative constraint
Using the above-defined rewriting steps, we can now define rewriting for
Let
A
So far we have dealt with rewritings of conjunctive queries with respect to existential rules and disjunctive existential rules. However, we have not considered negative constraints and conjunctive queries with negated atoms. Negative constraints are transformed into queries in the rewriting process, i.e.,
In a similar way, if
(Soundness and Completeness of Rewritings).
The

Function to rewrite
Given a set of rules

Function to rewrite

Function to rewrite
Function
The
The
In both
All
The rewriting function for disjunctive existential rules (
Algorithm 2 describes the rewriting function for existential rules (
König et al. study the completeness of the
The termination of Algorithm 1 depends on the termination of Algorithms 2 and 3. Algorithm 3 always terminates because the produced rules contain less disjunctive components in the head. On the other hand, setting
If a set
For a set of existential rules
Let
The proof is based on being able to organize the application of the rules of
Property 3.1 [7] allows us to study the decidability of entailment when we combine two sets of rules for which the entailment problem is decidable.
In Algorithm 3, even if the resulting set of existential rules
Existential rules with one atom in the body ensure that the
A rule
Linear existential rules are a
Theorem 3.4 is closely related to Theorem 7.13 and Lemma 7.12 proposed by Bourhis et al. in [12]. However, it is still interesting to prove it considering the algorithm we have proposed so that the technique can be extended to the study of other fragments.
Rules that do not share variables between the head and the body produce rewritings where the introduced body of the rule is not connected to the remaining part of the query.
A rule
Disconnected rules add atoms to the rewritings that do not share variables with the remaining part of the query. It follows that the connected cardinality of
Using Lemma 2.2 we conclude that the number of rewritings produced by
The new existential rules produced by the function
Rewritings of disjunctive rules
Because
For other types of queries and knowledge bases there is no certainty that the algorithm will stop. However, we can still try to compute the rewritings up to a certain depth. Nevertheless, we should point that our algorithm stops if there is a finite and complete
The completeness of the definition of rewritings and the fact that we produce rewritings using all possible one-step rewritings ensures that if there is a finite
The algorithm can still produce new (disjunctive) existential rules, but since no new
Note that a subset of the existential rules needed to generate the finite and complete
To illustrate this we can consider the following Existential rules:
Disjunctive existential rules:
If we try to rewrite the
However, for
Therefore, the expansion process in Algorithm 2 needs to have a finite depth (i.e.,
Theorem 3.7 ensures that Algorithm 1 stops only if there is a finite and complete
While Theorems 3.4 and 3.6 impose rather strong restrictions on the disjunctive framework, they also suggest the existence of finite
A
Given a knowledge base
Consider the following three
Consider also a knowledge base
The certain answers of the queries with respect to
For queries with answer variables we focus on the query answering problem instead of the entailment problem. In theory we could try all possible assignments of constants in
The answer atoms in the elements of
The entailment of a
Algorithm 1 needs to be modified in order to avoid unnecessary rewritings of queries with answer variables. In Algorithms 2 and 3 we need to modify the functions
Consider a knowledge base without rules and the following
Answer variables play a different role when splitting
A
A
It follows directly from Lemma 2.1 after transforming (16) into:
Straightforward using Lemma 3.1 and similar arguments to the ones used in the proof of Lemma 2.2. □
Given a set of rules
If all the variables in the frontier of the
The variables that appear in negated atoms of the
In the presence of deterministic one-step rewriting functions, existential rules with all the frontier included in the set of answer variables (
Using Lemma 3.2 we can also affirm that there are finitely many rewritings that can be obtained. Therefore, every existential rule that is generated from the negated
Likewise, every existential rule that is generated from the negated
Based on Theorem 3.4, we can also define other restrictions on
Because
The deterministic
We conclude that Algorithm 2 terminates every time it is called in the main loop and also that the condition to continue executing the loop at some point will not hold because there are a finite number of rules and
Finally, if there is a finite and complete deterministic Deterministic one-step rewriting functions will only discard rewritings that are not deterministic. The algorithm produces rewritings using all possible deterministic one-step rewritings and this ensures that if there is a finite deterministic
In the second version of the system [23], queries with only one negated atom are answered by being transformed into rules. The approach is complete but termination is guaranteed only when the resulting set of rules is a
The current version of
To the best of our knowledge there is no other system that produces
For the experiments we used two ontologies that contain negative constraints and have been used in previous research papers based on queries with negation [22,23]. The first is the Lehigh University Benchmark (
We also prepared a query file with 500
Rewriting experiments results for the
Table 3 shows the size of the
For the
For the
For both ontologies the

Size of the

Size of the
Figures 3 and 4 show information about the

Cumulative distribution of the time needed to compute the

Cumulative distribution of the time needed to compute the
Figures 5 and 6 show the cumulative distribution of the rewriting runtime. Dashed horizontal lines represent the mean runtime. Each bar represents the number of queries that were rewritten in or faster than the corresponding time. Note that in both cases the runtime for more than 60% of the queries was smaller than the mean runtime.

Correlation matrix with different performance parameters for the

Correlation matrix with different performance parameters for the
Figures 7 and 8 show the correlation matrix with different performance parameters for the
One of the queries for the
In this paper, we studied the application of the query rewriting approach on the framework of disjunctive existential rules in order to produce complete
To ensure the completeness of our rewriting approach, we introduced a special case of first-order logic resolution (constraint resolution), where every resolution step involves one clause without positive literals, and the subsumption theorem holds when the consequence is a clause without positive literals. The resolution completeness theorem also holds, allowing constraint resolution to be used in refutation procedures for FOL formulas.
Based on the definition of constraint resolution we proposed an extension of the rewriting approach for existential rules in order to deal with disjunctive existential rules. The rewriting of a disjunctive existential rule produces disjunctive rules with less disjunctions in the head, and eventually produces an existential rule or a conjunctive query. The rules generated from disjunctive rules are then used in order to find additional rewritings of the initial conjunctive query rewriting. The proposed algorithm can be used for general knowledge bases with disjunctive existential rules; it terminates for the cases where there is a finite and complete
Moreover, we studied some of the sufficient conditions that ensure that the proposed algorithm terminates. One case requires atomic
Using the proposed algorithm and taking advantage of the stopping criteria, we implemented a sound and complete rewriting approach for unions of conjunctive queries with negated atoms and answer variables in the
The experimental results showed that the implementation is able to provide
In the future, we would like to focus on implementing the proposed algorithm in a more efficient way and also on studying other sufficient conditions that ensure the termination of our rewriting algorithm.
Footnotes
Acknowledgements
We would like to thank Lida Petrou for providing the queries for the experiments. Also, we thank Stathis Delivorias and Michael Giazitzoglou for their support and comments in the writing process of this paper.
