Abstract
When reusing existing ontologies for publishing a dataset in RDF (or developing a new ontology), preference may be given to those providing extensive subcategorization for important classes (denoted as focus classes). The subcategories may consist not only of named classes but also of compound class expressions. We define the notion of focused categorization power of a given ontology, with respect to a focus class and a concept expression language, as the (estimated) weighted count of the categories that can be built from the ontology’s signature, conform to the language, and are subsumed by the focus class. For the sake of tractable initial experiments we then formulate a restricted concept expression language based on existential restrictions, and heuristically map it to syntactic patterns over ontology axioms (so-called FCE patterns). The characteristics of the chosen concept expression language and associated FCE patterns are investigated using three different empirical sources derived from ontology collections: first, the concept expression pattern frequency in class definitions; second, the occurrence of FCE patterns in the Tbox of ontologies; and last, the ‘meaningfulness’ of class expressions generated from the Tbox of ontologies (through the FCE patterns), as assessed by different groups of users, yielding a ‘quality ordering’ of the concept expression patterns. The complementary analyses are then compared and summarized. To allow for further experimentation, a web-based prototype was also implemented, which covers the whole process of ontology reuse from keyword-based ontology search through the FCP computation to the selection of ontologies and their enrichment with new concepts built from compound expressions.
Introduction
The main motivation for providing machine-readable semantics to data on the web in the form of ontologies is to achieve interoperability of independently built data sources and applications. For example, if the same kind of product offered by different e-shops is semantically described using the same web ontology, comparison and automatic recommendation of these offers can be provided to customers.
Obviously, interoperability depends not only on the existence of ontologies but also on their
Since the majority of ontologies are nowadays published in the same standard language, OWL [27], reuse is easy from the technological point of view, whether the method is the direct reuse of existing ontology entities or their subsumption/equivalence mapping from the dataset schema or from the new standalone ontology. However, despite the general agreement on the benefits of ontology reuse, this best practice is not yet widely adhered to [3]. One reason might be that selecting an ontology, or a fragment of it, suitable for being reused, from a larger pool of ontologies (typically pre-selected via keyword-based search in ontology repositories) is a non-trivial task for which formal methods have emerged only recently. They mostly rely upon However, these approaches face the ‘cold start’ problem. Due to the rapid growth of the ontological ‘ecosystem’ on the web, many emerging ontologies relevant for a certain dataset (or, generally, a reuse case) might not yet have achieved significant popularity ratings. Furthermore, the fact that an ontology as a whole is thematically related to the reuse case does not mean that it structurally fits the data that is to be semantically described. All this calls for complementing such ‘extrinsic’ sources of evidence with ‘intrinsic’ ones, reflecting what is in the ontology itself: beyond mere unstructured keyword matching, the ontology’s axiom structures should be examined.
Our proposed approach to enhancing web ontology reuse is based on four assumptions:
The first two assumptions are supported, among others, by the findings of a 2014 study on competency questions by Ren et al. [17]. They collected 168 competency questions (CQs) from two ontology projects, and clustered them into twelve archetypes. Of these, at least three correspond to tasks that can be characterized as sub-categorization of instances of a relatively generic class (here, pizza or software), namely: #1: “Which [ #3: “What type of [CE] is [I]?” #4: “Is the [CE1] [CE2]?”
In archetype #1,
Note that while Assumption #2 gives a value to compound class expressions, we can realistically expect that for certain reuse cases (e.g., when the categories have to be imported into a static hierarchy such as a thesaurus or product catalog on a web page) the compound class expressions have to be structurally and lexically
The validity of Assumption #3 obviously depends on the meaning of “(ontology) provides (categories)”. Namely, an ontology is, structurally, a collection of
Assumption #4 brings a bottleneck: the respective notion of quality escapes an automatic empirical evaluation, as it depends on the subjective perception of users. We however believe that the direct human perception factor is indispensable for assuring the versatility of the reused schema with respect to different use cases. We could of course imagine some alternative, indirect quality measures, in particular, the distinction whether an expression would fit the
Let us now present a concrete motivation example, which we will use throughout the paper to illustrate various components of our approach.
A used vehicle retailer website is to be enhanced with RDF descriptions of the offered vehicles. The descriptions should refer to suitable ontologi/es whenever possible, in view of achieving interoperability with (presently unknown) applications of partners, search engines and aggregators. In other words, the ontologies will be reused as parts of the schema of the dataset (or, as recently called, knowledge graph) consisting of all the structured descriptions. Now, how can we assess the potential of different existing ontologies for being reused in this case? And how will the reuse itself then take place?
Various customers are interested in different Ontologi/es allowing to express many ‘meaningful’ categories of vehicles – whether as The whole or part of the reused ontologi/es (after a merging step, if more ontologies are reused for the ‘vehicle’ concept) would possibly give rise to a product taxonomy published as a navigation structure of the shop website.
To name some hypothetical applications (in the context of this particular domain) where the category-centric processing of data would make sense:
within the retailer website, the user could the designed categories would be considered as dimensional values in an internal meaningful sales segments to be coherently
We will now build on the presented example in order to explain the idea of our approach in intuitive terms, prior to formalizing it in Section 2. We will present the basic principles of focused categorization power computation, explain its broader ontology reuse context, and clarify the terminology used.
Formally we should use the term ‘OWL concept expression language’; we omit the ‘OWL’ attribute for brevity, since we do not go beyond the limits of OWL in any part of this research.
Depending on the chosen language, we can
These expressions can further subcategorize entities of which we already know their
A concept expression language and the signature of the ontology alone are not sufficient as input for building ‘meaningful’ categories. In most cases, the vast majority of the combinations of entities (fitting the language structure) would be nonsensical, as the entities would be mutually irrelevant. Therefore, we need one more ingredient: heuristic
The ability of the ontology to subcategorize the focus class, the
The research presented in this paper aims to bring the following contributions:
Introduce the notion of
Demonstrate this ontology analysis approach using a particular
The demonstration is not merely descriptive, but also features
Finally, further experimentation is made easier for other researchers by making available a prototype
The present paper is an evolution of a previous conference paper [22], which provided a brief explanation of the notion of FCP, informally proposed a concept expression language with FCE patterns, and provided the results of a first cognitive experiment. The present paper however extends the previous one along numerous axes, and the majority of its volume consists of entirely new content:
The
The survey on FCE pattern
A study on the presence of
The
The different analyses / experiments in the paper are now framed by a
The prototype Its early version was presented as a short demo paper [14]; the current version however has substantially enhanced functionality.
The rest of the paper is structured as follows. Section 2 provides a formalization of the focused categorization power framework (outlined in intuitive terms in Section 1.3). Section 3 introduces a CEL having suitable properties for an initial study. Section 4 complements this language with FCE patterns. Section 5 surveys the occurrence of concept expression patterns in ontology axioms. Section 6, analogously, surveys the occurrence of FCE patterns in the Tbox of ontologies. Section 7 describes a series of cognitive experiments in which humans provided an assessment of ‘meaningfulness’ of class expressions belonging to different patterns. Section 8 provides a comparison of those complementary surveys and experiments and a summarizing discussion. Section 9 describes a tentative operationalization of the results from the analyses, and explains the functionality of the implemented OReCaP prototype. Section 10 reviews some related methods and projects. Finally, Section 11 wraps up the paper and outlines the directions for future work.
The aim of this section is to formally underpin the whole approach as well as to motivate the empirical analyses to which most of the remainder of the paper is devoted. We will proceed from the notion of
Concept expression language
The use of the syntactic constructors in OWL can be restricted in different ways, producing a formal system in logic often called a fragment or sublanguage. There are a number of decidable fragments of description logics [8]. The so-called OWL Profiles, sublanguages of OWL defined in the current OWL 2 standard, are examples of such fragments [26]; however, non-standardized restricted sublanguages of OWL may also be useful for particular tasks, as here.
The notion of concept expression language, in our terms, is based upon a set of concept expression
Let us first recall or introduce a few preliminary notions. A
In order to bring in the notions of variable and substitutions, let us introduce a special signature
Further, given a signature
With preliminaries in place, we may proceed to our definitions.
A
One example of a CEL may be the trivial language of named concepts
Another CEL example is the language of existential restrictions to the top concept
For clarity, we explicitly define the notion of class expressions generated by a CEL: Given a CEL
Only some of the
Let For simplicity, we will sometimes simply write ‘category’ rather than
Within this paper we will sometimes informally refer to a DCE as to the
The capability of
Still remaining at the (non-operational) ‘guideline’ level, we should discuss some desiderata of the weight function just introduced. Intuitively, leveraging on some of the assumptions from Section 1.1, an ideal function
The weight
The weight
The weight
The weight Presumably, graph-based metrics relying on the number of different paths connecting the constituent entities might be applied.
Checking whether the
Before proceeding to an operational elaboration of the category weight function in FCP computation, we need to return to the problem of category generation. While the CE patterns defining a given CEL allow for substitutions producing class expressions to be ‘counted’ in the FCP computation, such expressions cannot be directly extracted from the OWL code, which consists of (TBox, and sometimes ABox)
The definition of a focused category extraction pattern relies on OWL
A For simplicity we only consider conjunctions of axioms. Disjunctions could be expressed by means of multiple FCE patterns for the same CE pattern, if needed.
Considering we have two kinds of patterns (CE and FCE), which both abstract over some OWL structures, a question naturally arises why these two could not be more integrated or based one upon the other, e.g., in the sense that both the LHS and RHS of an FCE pattern axiom template would be CE patterns. There are however two factors that would make such an integration cumbersome or impossible:
First, while the variables in CE patterns are strictly Second, the role of the FCE patterns is not merely in matching the OWL structures but also in constructing class expressions based on the match. The CE pattern of the constructed expressions will not always (probably, just rarely) correspond to the structure of expressions explicitly present in the axioms matched by the FCE pattern.
For these reasons, we keep the two notions as completely separate. In the rest of the paper we will carefully distinguish which type of pattern is being discussed (using the ‘CE’ and ‘FC’ acronyms) unless the type of pattern is entirely clear from the context.
Assignment of a weight
As pointed out above for the specific case of
Moreover, if
Frequency of instantiation of
Likelihood that the
Let us demonstrate these sources on the vehicle domain of our motivating example. A considered ontology contains in its signature a focus class For easy readability of the DL formulas, we will mostly use the simple DL notation without IRI prefixes in the examples. The namespace will be irrelevant for artificial examples and clear from the context in real-world examples.
As suggested in the previous subsection, the ultimate weight of a category can be influenced both by the factors specific for the particular category and by the CE pattern type of the DCE of this category. The category weight computation (for the sake of FCP computation, see Equation (1), i.e., still at the non-operational level) could be possibly decomposed as
⊗ is a function for combining the two partial weights into a single weight of the category.11 As we exemplified (on ‘cars with an accident’) in Section 2.5, the presence of (trustworthy) Abox data should overrule the assumptions made based on the specific Tbox information (within
As regards the CE RDF Existing ontology By the RHS we mean the syntactic RHS as indicated in the code. For the overwhelming majority of axioms inside the OWL code available in ontology documents, the syntactic LHS is a named class. Although the OWL grammar allows for general concept inclusions, having a compound concept on the left-hand side (see
Data collected from these sources then have to be
Orthogonally, we can also aggregate the occurrence counts of
In different sections of the paper we will empirically investigate three of the sources identified in this subsection: the CEs in axioms in Section 5, the FCE patterns in Section 6, and the human assessment of FC+category pairs in Section 7 (thus only deferring the Abox analysis to later research), and discuss their potential and limitations in more detail.
To avoid any mismatch of the presented four ‘weight sources’ list with the (also four) ‘weight sources’ from Section 2.5, note that the sources from Section 2.5 are applied ‘deductively’, to estimate the weight of a particular category, while the sources in this section serve for ‘inductive’ derivation of the (average) weight pertaining to a whole CE pattern.
We will now specifically consider the situation when the Abox information on the usage of a specific category is either unavailable or unreliable, and we thus have to derive the category weight from the
0 if
Let us then assume that we distinguish among
Note that in terms of Eq. (3), the multiplication by each CE weight were equal to its CE pattern weight, and, the FCE patterns were ‘perfect’ in the sense of generating exactly those CEs producing a non-zero weight (if used as the DCE of a category).
Notably though, these assumptions are entirely unrealistic in any practical setting. This entails that the resulting ranking of ontologies for reuse according to
Pattern-based CE extraction and approximate FCP computation algorithm
The computational process leading to extraction of class expressions using FCE patterns is rather straightforward with respect to the general description of the approach in Section 2.7, see the algorithm pseudo-code in Listing 1. Each call of the function
For completeness, we also include the approximate FCP computation algorithm, as Listing 2. It simply amounts to summation of CE pattern weights

Pattern-based CE extraction algorithm

Computation of approximate FCP
With the algorithm, we now have a general operational framework allowing us to approximately calculate the FCP of an ontology with respect to a focus class, relying purely on the axioms from the ontology.
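To make the summation step more tangible, the following is a minimal sketch of the approximate FCP computation, assuming that an extraction step has already produced class expressions tagged with their CE pattern. The function name `approximate_fcp`, the pattern identifiers, the vehicle-domain expressions and the weight values are all hypothetical and serve illustration only; they are not taken from the paper's listings.

```python
from collections import Counter
from typing import Iterable, Mapping, Tuple

# One extracted class expression: (CE pattern identifier, expression string).
# Both the identifiers and the string rendering are illustrative assumptions.
ExtractedCE = Tuple[str, str]


def approximate_fcp(extracted: Iterable[ExtractedCE],
                    pattern_weights: Mapping[str, float]) -> float:
    """Sum the CE pattern weights over the distinct CEs found for one focus class."""
    distinct = set(extracted)  # the same CE found twice is counted once
    counts = Counter(pattern for pattern, _ in distinct)
    # Patterns without a known weight contribute nothing.
    return sum(pattern_weights.get(p, 0.0) * n for p, n in counts.items())


# Hypothetical CEs for a focus class such as 'Vehicle'.
extracted = [
    ("named_class", "Van"),
    ("named_class", "Truck"),
    ("exists_top", "hasAccident some owl:Thing"),
    ("exists_class", "hasEngine some DieselEngine"),
    ("exists_class", "hasEngine some DieselEngine"),  # duplicate match
]
# Hypothetical weights; the paper derives tentative ones from its experiments.
weights = {"named_class": 1.0, "exists_top": 0.5, "exists_class": 0.75}

fcp = approximate_fcp(extracted, weights)
print(fcp)
```

The sketch keeps the weights in a separate mapping so that the same extracted expressions can be re-scored when the pattern weights are revised.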
The framework still needs to be instantiated with three components:
a concrete CEL, providing the CE patterns;
sets of FCE patterns, linked to these CE patterns via mapping functions; and, finally,
weights of these CE patterns.
In Section 3 we provide a CEL that is simple to work with but still sufficiently rich to demonstrate various aspects of the method. In Section 4 we suggest a set of FCE patterns for its CE patterns. Finally, tentative CE pattern weights are deduced from the experiments in Section 7, while the preceding sections, 5 and 6, provide some empirical insights on the proposed CE and FCE patterns, respectively.
Simple existential CEL:
Having explained the whole proposed general framework of FCP computation, let us now return to its most essential element: the concept expression language (CEL) that determines what kinds of CEs are considered for the differentiating class expressions (DCEs) specializing the focus class.
As set up by the Assumption #2 from the introduction (importance of both atomic and compound CEs for the FCP), for meaningful analysis of the FCP computation landscape we have to combine some variant of the language of named concepts ( The The number of matching CEs only grows linearly with the number of classes The existence of a property assertion (witnessing the validity of the existential restriction for the subject entity) can be easily
OWL concept constructors’ usability for focused categorization is analyzed in more detail in Section 5.
When considering the CE pattern
One of them is
The other is
Along with the CE patterns based on existential restriction, we will also consider the sole CE pattern of
Based on these considerations (and, additionally, eliminating some uninteresting edge cases) we defined a suitable CEL, for the sake of the empirical research described further in this paper, as follows.
The
Therefore, the CE patterns considered for the sake of the presented research are: a named class; an unqualified property restriction (i.e., one with ⊤ as the filler); a property restriction with a named class as the filler; and, an individual value restriction (a property restriction with a singleton class as the filler).
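As an illustration of these four CE patterns, the sketch below models a class expression as a small data structure and renders it in Manchester syntax. The class `ClassExpression`, its field names and the vehicle-domain entity names are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassExpression:
    """One class expression; the field names are illustrative assumptions."""
    named_class: Optional[str] = None        # set only for a plain named class
    prop: Optional[str] = None               # property of a restriction, if any
    filler_class: Optional[str] = None       # named class used as the filler
    filler_individual: Optional[str] = None  # individual used as the filler

    def pattern(self) -> str:
        """Return which of the four CE patterns this expression instantiates."""
        if self.prop is None:
            if self.named_class is None:
                raise ValueError("neither a named class nor a restriction")
            return "named class"
        if self.filler_individual is not None:
            return "individual value restriction"
        if self.filler_class is not None:
            return "restriction with named class filler"
        return "unqualified restriction"

    def manchester(self) -> str:
        """Render the expression in Manchester syntax."""
        kind = self.pattern()  # also validates the expression
        if kind == "named class":
            return self.named_class
        if kind == "individual value restriction":
            return f"{self.prop} value {self.filler_individual}"
        # An unqualified restriction has the top concept as its filler.
        filler = self.filler_class or "owl:Thing"
        return f"{self.prop} some {filler}"


# Hypothetical expressions specializing a focus class such as 'Vehicle'.
examples = [
    ClassExpression(named_class="Van"),
    ClassExpression(prop="hasAccident"),
    ClassExpression(prop="hasEngine", filler_class="DieselEngine"),
    ClassExpression(prop="hasColour", filler_individual="red"),
]
for ce in examples:
    print(ce.pattern(), "->", ce.manchester())
```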
Since
Table 1 gives an overview of the CE patterns:
The first column assigns the CE patterns a numbering local to the given CEL, for convenience.13 We denote these specific CE patterns in bold face, to differentiate them from CE pattern meta-variables using the math font with subscript notation (
The second column indicates the structure of the CE pattern itself in DL notation; the CE patterns substitutions are already restricted according to
The third column indicates which variables from the CE structure are to be substituted by corresponding
The fourth column measures the length of the Abox path (as number of triples) connecting the individual
The order of the CE patterns in the table reflects the increased complexity of their detection in the Tbox using FCE patterns designed for
Summary of CE patterns in
Looking back at the competency questions of Ren et al. [17] referenced in Section 1, we see that while the archetypes #3 (“What type of [CE] is [I]?”) and #4 (“Is the [CE1] [CE2]?”) assume named subclasses of the FC, i.e. are covered by the CE pattern
Having defined
Inventory for
FCE pattern axiom templates
In principle, the FCE patterns could feature any OWL constructs within their axiom templates ( Both the existential and universal restriction require three triples in the RDF representation; one of them in both cases relies on an auxiliary predicate
While the construction of the heuristic FCE patterns for
Existential restriction is also used, but it is in the constraint part of the FCE patterns rather than in the axiom template part. Therefore, while the pattern setting assumes the presence of both ‘global’ (domain and range) and ‘local’ (existential) restrictions, omission of the latter by the designers does not preclude the generation of CEs but only disables their pruning.
We present below five FCE patterns for
Also note that even such a simple FCE pattern might not be the only possible for the given CE pattern. If we accept the possibility that not all classes in different hierarchical paths are pairwise disjoint, we could also apply to
As we move from
The relationship between
The restriction can also be inherited from a superclass or part of a complete definition, or can have the form of a
In order to illustrate the joint application of the two FCE patterns introduced so far in the calculation of the approximate FCP (Eq. (4)), we will return to our running example. Unlike in the previous, artificial, examples, we will refer to a real-world ontology.
The Here and in the rest of the running example we omit the There is actually a tweak in this example.
The instantiated
Intuitively, while
Let the weights be, for example,
Let us now proceed to the FCE patterns for the remaining CE patterns included in
As this FCE pattern is already a bit more complex, we will also demonstrate its application (and the added value compared to

Tbox of
Reliance of FCE patterns on
The inferential closure is used as in
We can now illustrate the application of all considered FCE patterns on our used cars example.
The property Such special-purpose classes would have to be filtered out; typically they could be automatically detected by appearing in the range of a huge proportion of properties.
All five FCE patterns assure ( In contrast, this would for example not be the case for the alternative FCE pattern for
As regards the complexity of the FCE patterns in terms of (templated) RDF triples in the template axioms, it is as listed, for the respective CE patterns, in the fifth column of Table 1.
We have previously stated that the proposed FCE patterns are merely heuristic (approximate) with respect to the optimally chosen set of
First of all, the Assumption #4 from Section 1, i.e., that the quality of compound categories should be correlated with the degree to which users would consider a corresponding named class as meaningful, rules out any kind of optimality guarantee, since human judgment is always subjective. The designer wishing to reuse an ontology may even question the
The situation is analogous for the FCE patterns’
As regards the FCE patterns
The generic algorithm in Listing 1 ends by generating a list of CEs (plus their CE patterns) for a given focus class. However, to finalize our
Let us assume that the categories discovered for the purpose of the FCP computation are retained for future use. The vehicle retailer might then decide to rebuild the product catalog so as to cover the categories, even the compound ones, that are populated by a significant number of vehicle items. The categories derived from the same property, say,
This structure might either become materialized in the underlying ontology as such or might only be generated at the web engineering level. The taxonomy would naturally follow the specialization of CE patterns as in Table 1: the categories with DCE conforming to
FCE patterns are a crucial element of the operationalization of FCP computation over the ontology axioms. We therefore described the FCE patterns for
CE patterns in ontology axioms
We analyzed the axioms of publicly available ontologies as an empirical source for estimating the frequency of occurrence of CE patterns of the CEL
Ontology axiom sources
For our analysis we used three collections of ontologies:
The collection indexed by the
The BioPortal collection,24
A small experimental collection of ontologies having heterogeneous styles and relatively rich in axioms, from the domain of conference organization, called
The analysis took place in November 2017.
The impetus for this empirical analysis was the close relationship between the central motivating task of the research, that of using CEs (either named classes, or compound CEs that can possibly be transformed to named classes) for We can for now ignore general concept inclusions (having a compound CE as their left-hand side), which are allowed in some dialects of OWL but only used sparingly in real-world ontologies.
Note that the subsequent use of the constructed CEs is, in some aspects, similar in both tasks, too. Notably, in all cases, an aspect of
A compound CE in the RHS of an
A compound CE in the RHS of a
A compound CE that can be merely constructed from the
Technically, the expected outcomes of the analysis were the following:
Findings about the frequency of various concept expression patterns in the RHS of axioms in existing ontologies (possibly interesting for the community even beyond the focused categorization setting)
Positioning of the CEL
The considered list of CE patterns is, essentially, that of first-level constructors in the axiom RHS. However, since we were particularly interested in the compound CE patterns of Strictly speaking, the singleton enumeration
The frequency of CE pattern occurrence, in absolute counts, was classified by three dimensions:
By the analyzed ontology collection.
By the distinction of equivalence or subclass axioms (and the sum of both).
By the level of nesting: outermost constructor vs. further levels.
The results are in Table 2, however omitting the last dimension (nesting level) for brevity. The most frequent constructors are listed: 10 for LOV and BioPortal, and 5 for OntoFarm (where only the top of the ranking is relevant, due to very low counts). The constructors corresponding to
CE patterns in axiom RHS
As regards the
The other constructors appearing at the first level of the axiom RHS nesting are:
The contribution of this section was twofold. First, it featured the CE pattern occurrence frequencies as such, which could be of general interest to the ontology design community. Second, specifically in the line of the current paper, it provided intellectual analysis of axiom type roles in FCP.
The analysis suggested that the
FCE pattern occurrence in the Tbox of ontologies
The questions to be answered by this analysis were:
How many ontologies, and for how many FCs, provide a decent number of ‘categorizing’ CEs through heuristic mapping from the patterns from Section 4?
What are the differences in the occurrence of the individual FCE patterns overall and across different ontology collections?
The answers to these questions would help us estimate how likely it is that the particular FCE pattern would yield potential categories when a focus class is provided as an input. (They will not, though, provide an estimate of the quality of such categories; this question will be addressed in Section 7.)
Ontology sources and data aggregation method
In the analysis we made use of our
In order to provide aggregate results, we counted the occurrences of FCE patterns from Section 4 across all classes of all ontologies evaluated in the role of FC. We summed up these results at ontology level by identifying ‘categorizable’ classes, i.e. classes for which the pattern occurrence reached some threshold Pattern
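The ontology-level aggregation can be sketched as follows, assuming that the number of CEs generated by a given FCE pattern is already known for each class taken as FC. The function name, the class names, the counts and the threshold value are hypothetical; the sketch only shows the thresholding step, not the pattern matching itself.

```python
from typing import Dict


def categorizable_ratio(ce_counts_per_class: Dict[str, int],
                        threshold: int) -> float:
    """Share of classes whose CE count for one FCE pattern reaches the threshold."""
    if not ce_counts_per_class:
        return 0.0
    categorizable = sum(
        1 for count in ce_counts_per_class.values() if count >= threshold
    )
    return categorizable / len(ce_counts_per_class)


# Hypothetical per-class counts for one pattern in one ontology.
counts = {"Paper": 7, "Person": 3, "Review": 0, "Conference": 5}
ratio = categorizable_ratio(counts, threshold=5)
print(ratio)
```

Repeating the call for each FCE pattern and each ontology gives the per-pattern figures that can then be averaged over a collection.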
For illustration of the bottom-up calculation steps, let us take the example of an OntoFarm ontology called
Ratio of ontologies with a
Let us now try to answer the questions posed at the start of the section, by examining the result table.
Unsurprisingly, the percentage of ‘categorizable’ classes is in most cases highest for
For
Results for pattern p5
For completeness, we performed an analysis of ontologies which use SKOS concepts for entity categorization (pattern
Table 4 presents information about those 7 ontologies in terms of the number of SKOS concepts that can be used as the ‘categorization individual’ and the number of SKOS concept schemes from which those concepts come. Further, we include information about the date of the last modification of the ontology. In two cases, no concept schemes are available. For the other ontologies the number of SKOS concepts (and SKOS schemes, respectively) usable for categorization varied from 11 to 274 (from 1 to 16, respectively). Although the phenomenon captured by
Categorization via SKOS concepts (FCE pattern p5)
The findings about
Cognitive experiments: CE assessment
The previous analyses carried out over the ontology Tbox (axioms RHS in Section 5 and FCE pattern occurrence in Section 6) only indirectly contributed to the central question: whether the compound CEs from
In order to get finer insights, we proceeded to a detailed investigation of sample CEs by human ‘ontologists’, both experts and relative novices (students of relevant subjects). We performed two campaigns of experiments, the first in Spring 2016 (already described in our early publication [22]). Here we provide a synoptic view of both campaigns.
Provide the human assessors with a set of ‘focus class – subcategory’ pairs, such that the subcategories correspond to
Collect the
Seek
Examine the
A summary of the experimental setting in both campaigns is in Table 5.
Overview of cognitive experiments with students
In this section we write the example CEs in Manchester syntax, with the keyword
Ontologies tied to software applications, such as some OntoFarm ones (capturing the processes supported by conference software) use object properties to capture relationships that are only relevant within a
In some cases the use of inferential closure for the filler class in
Some CEs of
The questionnaire was in Czech. The English translation of a sample task is available in Appendix A.
The 59 tasks from the initial sample were randomly divided into three questionnaire versions (one
We aggregated the results by questionnaire task, and then both by the course and by CE pattern. The aggregation was carried out by simple summation over the answer values rescaled to the
A short digest of the results follows:
The average NS over all 60 tasks was 0.07, i.e. rather low, although positive. Of the 60 NS values, 28 were positive, 5 zero and 27 negative. The values strictly below 0.25 and above −0.25, possibly viewed as ‘borderline aggregates’, were 34 (57%).
Perhaps most important, the average NS was highest for
The cases33 Most namespace prefixes used can be expanded using the prefix.cc service. Prefixes unlisted by this service follow:
The average NS was higher for the OE students (0.12) than for the AI students (0.04), which might be attributed to more developed ‘ontologistic thinking’ of the former. The
CEs with highest and lowest average NS of student scores, 2016 campaign
In comparison with the ‘expert ontologist’ assessment:
The students gave a significantly lower score: only about a half of the tasks had a positive NS, compared to 92% (54/59) in the final consensus of experts. This can be explained by their lower ability to figure out specific situations in which less obvious categories might become meaningful.
If we apply the same method of average NS computation on the initial assessment of experts, the proportion of ‘borderline aggregates’ between −0.25 and 0.25 is only 14% (in contrast to 57% for the students’ values).
There is agreement on the less frequent ‘meaningfulness’ of
As regards the case-by-case comparison between the students and the experts, there is also a correlation in the sense that the 43 experts’ clear positives obtained a positive average NS from students (0.14), while the 14 initially ‘clash’ cases obtained a slightly negative average NS (−0.07) and the 3 negative cases obtained a clearly negative average NS (−0.24).
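As a quick consistency check on the figures above, the three group averages, weighted by group size, should reproduce the overall average NS of the student assessment. The following minimal sketch uses only the counts and (rounded) averages reported in the text; the variable names are illustrative, and the result is only approximate because the inputs are rounded to two decimals.

```python
# Group sizes and average NS values as reported in the text (rounded to two decimals).
groups = {
    "clear positive": (43, 0.14),
    "initial clash": (14, -0.07),
    "negative": (3, -0.24),
}

# Weighted mean of the group averages, with the group sizes as weights.
total_tasks = sum(count for count, _ in groups.values())
weighted_mean = sum(count * avg for count, avg in groups.values()) / total_tasks

print(total_tasks)              # 60
print(round(weighted_mean, 2))  # 0.07
```

The weighted mean agrees with the overall average NS of 0.07 reported for the 60 tasks.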
In the second campaign we tried to modify the setting so as to avoid some biases and gaps appearing in the first campaign, in particular:
The even distribution of tasks between LOV and OntoFarm was judged inadequate, as OntoFarm is by an order of magnitude smaller, addresses one domain only, and its ontologies have been created artificially, even if based on real-world non-ontological resources.
Assessing the CEs solely based on their formal representation risked suffering from a comprehension bottleneck.
There was no a priori expert assessment this time (assuming that the correlation of the expert and novice assessment had been adequately studied in the first campaign).
they had more than 90% of their classes equipped with the
they had at least 10 classes (to eliminate the long tail of very small ontologies).
The actual sampling was then performed on approx. 130 thousand CEs generated from the 72 ontologies that satisfied the above conditions. From this pool we randomly sampled ten tasks for every CE pattern (
selected
Unfortunately, the sampling results exhibited some potentially undesirable features, and we did not have time to redesign the sampling because of the planned experiment dates (within the schedule of both courses) that we were unable to shift. Namely:
Some domain ontologies contained links to upper-level ontologies. If the FC was then picked from an upper-level ontology, it was highly abstract (e.g. ‘Feature’, ‘Object’ or ‘Endeavor’), and its relationship to domain-specific concepts of the CE was hard to figure out. The assessment then had a ‘strong philosophical flavor’, and the setting was unrealistic wrt. our target use case, since upper-level entities would not typically be sought as focus classes when publishing linked datasets.
One of the tasks referred to an ontology in a language different from English (namely, Spanish).
Judging by the results and the students’ feedback, however, these infelicities do not seem to have seriously biased the experiment.
The structure of the CEs was
The questionnaire was again in Czech, except the CEs (verbalized in English, to avoid issues with the inflection grammar of Czech). The English translation of a sample task is available in Appendix B.
The task question was slightly modified: it explored to what degree the category is meaningful and
The questionnaire separated the meta-question on
The students could textually
For the compound CEs, the students could provide a
We will however not discuss the last two types of metadata elements (the unstructured ones) in the current paper, to avoid thematic dilution.
In addition to the FCP tasks assessment, the questionnaire also examined the students’ assessment of their own level of
We computed the normalized sum (NS) of the task assessment values, as in the 2016 campaign (using Equation (12)). The core results are as follows:
The average NS over all 40 tasks was 0.27, i.e. much higher than in the first campaign (0.07). Of the 40 NS values, 34 were positive, 2 zero, and only 4 negative. This can possibly be attributed to the longer time available for each task, to the higher amount of available documentation, and/or to the verbalization of the CEs.
The relative position of the compound CE patterns did not change from the 2016 campaign. The average NS was 0.33 for
The cases with highest positive and lowest negative values are in Table 7; both the FC and the subcategory are now shown at the level of labels, just as presented to the students. The property and its filler, whether a class or an individual, are separated with a colon. The underlying ontology is referenced in the third column, through its nickname, which can be resolved against the LOV portal by appending it to
CEs with highest and lowest average NS of student scores, 2018 campaign
We also computed the relative frequencies reflecting the impact of the (declared) Of the 168 assessments by students with excellent or very good English skills, 102 (61%) were positive (‘certainly’ or ‘perhaps’); in contrast, of the 80 assessments by students with fair or basic English skills, only 37 (46%) were positive. Of the 176 assessments where the students comprehended the meaning of the CE entities (‘quite familiar’ or ‘roughly’), 126 (72%) were positive (‘certainly’ or ‘perhaps’). In contrast, only 13 assessments of 72 (18%) where the students did not comprehend the meaning of the CE entities (‘pretty vague idea’ or ‘no clue’) were positive.
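The relative frequencies quoted above can be recomputed directly from the reported counts. The sketch below is only a check of that arithmetic; the group labels are shorthand introduced here for illustration, not the wording of the questionnaire.

```python
# (positive assessments, all assessments) per group, as reported in the text.
counts = {
    "English: excellent / very good": (102, 168),
    "English: fair / basic": (37, 80),
    "Entities comprehended": (126, 176),
    "Entities not comprehended": (13, 72),
}

# Share of positive assessments per group, in whole percent.
shares = {group: round(100 * pos / total) for group, (pos, total) in counts.items()}
for group, share in shares.items():
    print(f"{group}: {share}%")

# The two splits (by English skills, by comprehension) add up to the same total.
english_total = 168 + 80
comprehension_total = 176 + 72
print(english_total, comprehension_total)  # 248 248
```

The recomputed shares (61%, 46%, 72% and 18%) match the reported ones, and both splits cover the same 248 assessments.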
In order to reflect the degree of entity semantics comprehension in the assessment (with the assumption that more weight should be given to ‘more informed’ assessment), we also applied simple numerical weighting: the formula from Equation (12) was changed to
The role of the cognitive experiments was ultimately to assess the degree of reusability of individual CEs and thus (indirectly) of their patterns; this is tied to the central idea of the whole approach (in which, for example, the FCE patterns are merely instrumental).
Across the different campaigns and settings, the order of the CE patterns according to the average ‘meaningfulness’ of the categories remains stable: (
The description of each task (consisting in evaluation of the meaningfulness of a class expression) in each questionnaire variant
The table with calculation of aggregated results.
Discussion
Since the empirical part of the paper may appear a bit fragmented and the results hard to align, in this section we first provide an integrative meta-view of the settings and results of the surveys/experiments.
From there we proceed to a discussion of the limitations and open questions of the analysis.
Meta-view of the empirical analyses
In Table 8 we synoptically summarize the three empirical pillars of our research so far, as elaborated in Sections 5, 6 and 7. We see that the surveys/experiments are to a large degree complementary, differing in their features: in the
The last row in the table attempts to summarize the core findings of each analysis. At first sight the arrangement of the CE/FCE patterns might look incoherent. Especially, Section 5 considers the CEs within the RHS of axioms, where they primarily serve as a means for inferring the subordination of On the other hand, Sections 6 and 7 already study the categorization in the setting with a known
As regards the high ranking of
Summary of the complementary surveys / experiments, with core findings (last line)
Since the paper explores a substantially novel problem space, the coverage of its different corners is still rather limited. In this section, we discuss five limitations and/or open questions, in turn: the omission of Abox data in the whole process, the simplification made in the empirical analysis of CE pattern frequency, the reliance on particular Tbox design principles in the application of FCE patterns, the neglect of logical considerations when creating the definitions of new named classes, and, finally, the actual choice of existential restriction as the central primitive of the initial CEL.
The presented research currently ignores the role of While some early experiments are under way, their inclusion (in a low-maturity state) would have made the current paper lengthy and the contribution diluted.
The computation of the
The design of our Note that some best practices, e.g., in biomedicine, used to argue against domain/range axioms, see e.g. [16].
The generation of new named classes with their compound definitions in the reuse step, as described in Section 4.3 (as a follow-up technique, less central than the FCP computation, which is the main topic of the paper), is currently conceived rather naïvely from the logical semantics point of view. The definitions are assumed to be generated one by one locally, without taking into consideration the inferential structure of the ontology as a whole. This is probably harmless if the resulting ontology is to be used merely as a schema for data to be processed at the assertional level. However, possible exploitation of such an ontology by reasoners might call for more sophistication in the generation of new definitional axioms.
Finally, even the assumption under which we gave preference to the
While most of this paper is devoted to the description and formalization of the FCP ‘view’ of ontologies and to the associated empirical analyses, we also provide a tentative (or rather, illustrative) operationalization of the empirical findings into FCP computation weights, and describe an early prototype of an ontology search (for reuse) tool leveraging FCP.
Tentative operationalization of the empirical results
We understand the obtained insights into the usage of OWL class expressions and their perception by humans in general as a research achievement per se. However, the starting point for the overarching empirical study was an ‘engineering’ goal (possibly modest compared to the extent of the performed surveys and experiments): to propose
In these terms, based on the cognitive experiments in particular, we can see that
Alternative CE pattern weights derived from the cognitive experiments
Considering that the lowered average NS values of
The current version of the OReCaP tool (see the next subsection), when launched, proposes these values, i.e.
To demonstrate the whole focused categorization framework, specifically for The acronym refers to the terms ‘ontology reuse’ and ‘categorization power’.
The interaction workflow with the tool consists of several, possibly iterative, phases:
The process starts with a
The search returns a
The next step is to execute the
The calculated FCP score is then displayed below the ontology overview in the search results, and also saved to a
From the comparison list it is possible to proceed to the
The actual

OReCaP interface: two ontologies found via keyword search, with focus class and additional match.
Let us assume a user who wants to publish data about business contracts and their payment, and seeks a suitable ontology for their subcategorization. A partial screenshot showing the overall search results together with FCP scores for two ontologies is in Fig. 2. The focus class keywords (here, just one) are entered in the top-left field; their matches are therefore proposed by the tool as focus classes (highlighted in blue). The user has also provided additional keywords, which may improve the result ranking but do not produce further focus classes (unless the user pro-actively highlights them, too).
For the first ontology, PPROC, a snippet of the reuse summary window is then shown in Fig. 3; the numbers (e.g., “(1/5)”) indicate how many categories were chosen in the given sub-tree. The user has chosen two categories, corresponding to ‘contract that has been anyhow modified’ (CE pattern

OReCaP interface: the reuse summary, with user’s choices of two CEs.
OReCaP makes use of the Linked Open Vocabulary API38
Since the research described in this paper addresses the focused categorization power problem from various angles, multiple areas of related research can be identified. In this section we report on the following, in turn: abstract notions similar to our notion of FCP; empirical studies on presence of class expressions and structural patterns in ontology repositories; cognitive experiments on assessing ontological structures; concept learning in DL; ontology reuse metrics and methods.
Abstract notion of focusing or categorization (power) in ontologies
We are unaware of prior work on the same topic of FCP as we coin it in the current paper. We will however reference some related research that overlaps with ours at the abstract level.
The notion of focusing recently appeared in the work of Gogacz et al. [7]. The so-called focusing solution pairs a set of predicate symbols that describe a database schema (that is, a set of predicates) with a set of assumptions on the partial completeness of the data and the ontology (closed and fixed queries). In their approach, focusing is about choosing which parts of the data and ontology are to be declared complete, to allow for efficient reasoning. In our approach, focusing is about choosing ontology classes for whose instances in data we would like to obtain many meaningful categorization options; the categorization itself, however, need not rely on logical reasoning (let alone on reasoning using the ontology), but can be based on whatever kind of classification model or even made by humans.
The term classification/categorization power previously appeared in many scientific texts, however, rarely as a rigorously defined notion. For example, on many occasions, automated classifiers (typically, machine-learning-based) are reported to have certain ‘classification power’ with respect to classes from an ontology, which is merely an informal circumscription of measures such as accuracy or error rate. The ‘power’ also clearly pertains to the
Partially relevant is the analysis made by Giunchiglia & Zaihrayeu [6], who categorized ‘lightweight’ ontologies with respect to two dimensions: complexity of labels (simple noun phrases vs. use of connectives and prepositions) and use of ‘intersection’ operator allowing to combine atomic entities of different nature (e.g., the atomic concepts ‘Italy’ and ‘vacation’ implicitly combine into ‘vacation in Italy’). Maximal ‘classification power’ is obtained when both explicitly complex labels and implicit concept combinations are allowed. This however only applies to classifying documents extrinsic to the ontology, since ‘intersection’ of concepts of different nature is not coherent with the set-theoretic semantics of DL. Overall, their ‘classification power’ is a global property of the method by which the ontology has been built. In contrast, our notion of FCP applies to individuals intrinsic to the DL world of the ontology and is calculated with respect to a focus class.
A. Rector’s work on entangling hierarchies (normalization) [15] addressed a different problem than ours, but to some degree analogously considered the compound concepts as an alternative to named ones. This applies to ‘partitioning’ or ‘refining’ concepts, that only modify the ‘self-standing’ concepts; secondary partitioning aspects should not be expressed through subclassing (yielding a multi-hierarchy) but through existential restrictions filled with classes from separate ‘codelist’ taxonomies. For example, a class
Our own ongoing work on the PURO modeling language [21] deals with various options for how the same ‘background’ state of affairs can be expressed in OWL. PURO structurally resembles OWL but relaxes some of its modeling constraints. A library of transformation patterns allows one to proceed from one PURO model to alternative OWL ontologies in different encoding styles. An example relevant to our case is the notion of
Certain research in cognitive psychology might also be relevant wrt. the notion of FCP. In particular, the notion of
As regards the analysis of ontology repositories in terms of various aggregated features and metrics (logical, graph, lexical etc.), there has recently been renewed interest, following up on the early work of Tempich et al. [23] (aiming to build a benchmark for testing ontology tools). A large-scale study of OWL ontology metrics was carried out by Matentzoglu et al. [13]. However, the categorization power of ontologies has not, to our knowledge, been studied, let alone with the flavor presented here.
Our study on class expression frequency in axioms from Section 5 looks similar to the recent study carried out in the MontoloStats project [12]. Both studies essentially analyze the same ontology repositories (primarily, LOV and Bioportal), and refer to the suitability of ontologies for reuse. There is however a difference in the restrictions coverage. For an unclear reason, the MontoloStats study does not cover existential restrictions (which are central for our study, and also shown as empirically very frequent) at all, nor the conjunctive concepts. On the other hand, it covers (subclass axioms with) named class in the RHS, and also property axioms such as domain/range or functional property. Notably, even over the common subset of CE patterns in restrictions (such as disjoint, universal and cardinality restriction CEs), some differences in the computed ranking appear between MontoloStats and our research; these may be due to additional distinct features in the methodology used.
Yet another stream of empirical research aims to study ontologies not on their own but from the point of view of LOD datasets in which they are used. This was the subject of the project by Asprino et al. [1], which produced a condensed representation of the global, virtual, ‘LOD ontology’ in the form of so-called equivalent set graphs. Various metrics related to the connectedness and extensional size of ontology entities were computed; while this research only addresses compound concepts, it is in line with our ongoing activity in analyzing the Abox imprint of (named as well as compound) class expressions.
Cognitive experiments on assessing ontological structures
Several cognitive studies using ontologies as material have been published in the ontological engineering research. However, they primarily address the capability of humans themselves to carry out the categorization of objects to a set of classes or to understand the structure of OWL expressions. A recent example of the former is a study on classifying domain entities to upper-level ontology classes [20]. An example of the latter is an earlier study on the human capability of deriving useful information from differently verbalized OWL statements [24]. Our research in Section 7 of this paper differs in that the humans were to assess the automatically built concepts as more or less plausible, thus generating ground truth. (Semi-automated) verbalization was present, too, but only played an auxiliary role, the actual subject of assessment being the formal CEs themselves.
Concept learning in DL
The heuristic construction of compound CEs from the ontology axioms, triggered by the identification of a focus class, bears some resemblance to
Ontology reuse metrics and methods
The broad context of our research, the task of ontology reuse, was studied by Schaible et al. [18]: the users expressed their preferences on reuse strategy in a survey. The results indicate that reusing multiple entities from the same vocabulary may often be preferred; this corroborates the relevance of our approach to measuring the categorization power of ontologies with respect to focus classes.
Vocabulary reuse techniques similar to the use of FCP-based metrics also appeared in a recent project on combining popularity metrics with the credibility of the vocabulary designers [19]. As regards the designer credibility, this is a feature of the ontology itself similarly to FCP, but it is completely orthogonal.
Reuse support [5] is also systematically sought by the maintainers of LOV [25], primarily at keyword relevance level; we are in contact with them and will seek to integrate our complementary approaches.
Conclusions and future work
Ontologies are an important means of subcategorizing entities already known to belong to a general focus class. Ontologies with the best
Ongoing research addresses the analysis of the CE
Another area of ongoing research concerns the techniques of
In middle term, we plan to extend the
Within the scope of the different CE patterns, the syntactic
The pool of testing subjects in
Since the main foreseen practical application of the whole approach is the improvement of
Footnotes
Acknowledgements
The research has been supported by CSF 18-23964S, “Focused categorization power of web ontologies”, by projects ORBIS, funded by Slovak SRDA agency under contract No. APVV-19-0220, KATO, funded by Slovak VEGA agency under contract No. 1/0778/18, and by TAILOR, funded by EU Horizon 2020 research and innovation programme under GA No. 952215.
Questionnaire task description from the 2016 cognitive experiment campaign
FC and category in Manchester syntax:
?i is an instance of a class including all objects that are in the relationship ‘bornIn’ to at least one object.
Hint: the expression, e.g. (bornIn some Thing), suitable for categorization of the FC should satisfy the following conditions:
Possible values:
certainly (2)
perhaps (1)
borderline (0)
perhaps not (−1)
certainly not (−2)
no judgment, since I don’t understand the example (N)
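To illustrate how answers on this scale could be aggregated into a normalized sum (NS) per task, the following is a minimal sketch under stated assumptions. Equation (12) is not reproduced in this appendix, so the normalization used here (each answer value rescaled to [−1, 1] by dividing by 2, then averaged over the valid answers, with ‘N’ answers excluded) is an assumption and may differ from the actual formula. The function name and the sample answers are hypothetical.

```python
# Answer codes from the questionnaire scale; "N" means "no judgment".
ANSWER_VALUES = {
    "certainly": 2,
    "perhaps": 1,
    "borderline": 0,
    "perhaps not": -1,
    "certainly not": -2,
}


def normalized_sum(answers):
    """Assumed normalization (may differ from Equation (12)):
    rescale each value to [-1, 1] and average over the valid answers."""
    values = [ANSWER_VALUES[a] for a in answers if a != "N"]
    if not values:
        return None  # no valid judgment for this task
    return sum(v / 2 for v in values) / len(values)


# Hypothetical answers of five students to one task.
sample = ["certainly", "perhaps", "borderline", "perhaps not", "N"]
print(normalized_sum(sample))  # 0.25
```

Under this assumed normalization, a task answered ‘certainly’ by everyone would score 1 and a task answered ‘certainly not’ by everyone would score −1.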
Questionnaire task description from the 2018 cognitive experiment campaign
The focus class is from the ontology no. 162945,
Ontology name: Proton Ontology
Ontology description: “PROTON (PROTo ONtology) was developed in the SEKT project as a lightweight upper-level ontology, serving as a modeling basis for a number of tasks in different domains. To mention just a few applications: PROTON is meant to serve as a seed for ontology generation (new ontologies constructed by extending PROTON); it can be used for automatic entity recognition and more generally Information Extraction (IE) from text, for the sake of semantic annotation (metadata generation). PROTON was extended to cover the conceptual knowledge encoded within the most popular datasets from Linked Open Data like DBPedia, GeoNames, etc.”
Involved entities:
Focus class:
Property:
Target class:
Proposed category:
