Abstract
The use of ontologies to model human behaviours that affect health is challenging since this process involves data from multiple inter-related domains that unfold and evolve over time. However, while current ontology development methodologies are generic enough to model any domain of interest, they do not provide design guidelines for modelling time-related aspects. This paper proposes a methodology for ontology development that covers the requirements for behaviour modelling based on passive temporal data. Its main focus is on temporal representations of classes and their holistic relations since no other methodology approaches ontology design from a temporal perspective. We exemplify these ideas by modelling the sleep behaviour domain, its relations to other behavioural aspects, and its effects on health.
Introduction
Human behaviour is a frequent research theme in the physical, psychological, and social areas because it is known that daily life behaviours influence individuals’ health and life quality in the long term Mokdad et al. (2004); Ng et al. (2020). Behaviour definitions differ according to the perspective used to analyse such a phenomenon. Levitis et al. (2009), for example, define behaviour as “the internally coordinated responses (actions or inactions) of whole living organisms (individuals or groups) to internal and/or external stimuli, excluding responses more easily understood as developmental changes (e.g., ageing)”. Fassnacht (1982) complements this definition saying that these actions can be “externally perceived” and, consequently, measured.
One of the interests in modelling behaviours is understanding how their elements (constructors) jointly affect aspects of individuals’ physical, physiological state, and social life. For example, the study of Pereira et al. (2020) tries to understand how personal behaviours affect the learning process. Similarly, Carmeli et al. (2009) discuss how personal behaviours affect productivity in workplaces, while Funk et al. (2010) investigate the influence of human behaviour in the spread of infectious diseases and how spreading models incorporate these behaviours.
These studies rely on different domains of human behaviour. However, they share the same idea of identifying constructors for behaviours that are important to the domain of discussion and relating these constructors to a conclusion. Ontologies Guarino et al. (2009) are appropriate resources to support the representation of these domains because they can simplify such domains in terms of classes, properties, and rules. Moreover, ontologies are forms of knowledge representation that rely on Description Logic (i.e., a decidable fragment of First-Order Logic) as their formal basis Baader et al. (2008). Thus, representations can be used together with reasoning mechanisms to generate new knowledge.
While these examples involve simple properties, some domains require more complex holistic relations to support conclusions. This is the case when we try to understand the health of individuals by considering their daily behaviours that influence each other. According to studies such as the WHOQOL research program Kim (2020), from the World Health Organisation, most repetitive behaviours of individuals influence their quality of life and, in the long term, affect their health. For example, sleep behaviour (e.g., sleep duration Hirshkowitz et al. (2015)), a routine of physical activities (duration and intensity Wang and Boros (2021)), personal relationships (e.g., time dedicated to these relationships Carmeli et al. (2009)), negative feelings (e.g., exposure to stressful situations Stewart et al. (2011)), and others contribute to the probability of expression of chronic diseases in the long term Ng et al. (2020). Mokdad et al. (2004) also highlight the effects of behaviour on health. Overall, the literature indicates the importance of holistically monitoring our behaviours in order to manage the emergence of chronic illnesses Ng et al. (2020).
Besides the holistic nature of behaviour assessments, temporal aspects are also usually present in the health domain. For example, isolated behavioural episodes (e.g., sleeping for only two hours in one night) do not significantly interfere with health. However, experiencing this as a repetitive behaviour may be the reason for serious health problems. Thus, the representation of the behavioural influences on health must incorporate temporal aspects. This representation is not trivial to model using ontologies. Current methodologies assist in the task of developing ontologies. However, while the definition of their steps gives a good notion about what must be done, they do not specify ways to conduct these steps, specifically concerning the modelling of temporal aspects. Thus, they are proper guidelines for creating representations of concrete and static domains. However, further directions and details could better assist developers in covering more dynamic domains, such as behavioural modelling.
This paper describes a methodology, called Onto-mQoL, that relies on the approach of Noy et al. (2001) as a starting point, extending its steps with pragmatic actions and modelling details to cover the requirements for behaviour domains that rely on passive temporal data, that is, data that are mainly assessed through ubiquitous personal devices. As a case example, we apply this methodology to model individuals’ sleep and rest behaviour, adding practical aspects and design details to each step. Therefore, this paper contributes by supporting any knowledge engineer or project that intends to create models involving multiple inter-related domains that unfold and evolve over time.
Related work
This paper relies on two main definitions. Firstly, Fassnacht (1982) defines behaviour as an occurrence that emanates from an organism or may be detected by an observer outside of an organism. These occurrences (behaviours) are given, for example, in terms of sleep routines, tobacco and alcohol consumption, dietary behaviours, physical activity, and sexual practices. Moreover, such observable occurrences can be measured and generate data. For example, total sleep time and awakenings are features that characterise sleep routines, while the amount of ingested macronutrients characterises dietary behaviours.
Levitis et al. (2009) give an important second definition. Their work consolidated the results of a comprehensive study on published definitions for behaviour, together with survey responses from 174 members of three behaviour-focused scientific societies, for coining the behaviour definition as “the internally coordinated responses (actions or inactions) of whole living organisms (individuals or groups) to internal and/or external stimuli.”. This second definition emphasises the existence of internal and/or external stimuli, which can also be measured and generate data. Emotional states are instances of internal stimuli that can heavily influence behaviours Davis et al. (2015) and can be measured by psychology instruments (e.g., Daily Stress Inventory – DSI for stress assessment).
Observable occurrences and internal/external measurable stimuli associated with behaviours are referenced in our research as behaviour-related terms that characterise behaviours. Most ontologies that model behavioural aspects use subsets of these terms, such as the 15 ontologies identified in the scoping review of Norris et al. (2019). For example:
Health Behaviour Change Ontology (HBCO) Bickmore et al. (2011): specifies the conceptualisation of the conversational agent-based behaviour change interventions. It is a domain and theory-driven ontology since it only includes emotional states and therapeutic actions sanctioned by a theory or method of health behaviour change.
Neuro-Behaviour Ontology (NBO) Gkoutos et al. (2012): is designed to integrate behavioural observations in animal and human organisms. It also understands behaviour as an organism’s response to external or internal stimuli.
Mental Functioning Ontology (MF) Hastings et al. (2012): domain ontology aimed at representing all aspects of mental functioning, including mental processes such as cognitive processes and qualities such as intelligence.
Cognitive Paradigm Ontology (COGPO) Turner and Laird (2012): represents experimental conditions focused on the presented stimuli, the instructions, and the responses requested.
The review of Norris et al. (2019) concluded that none of these 15 identified ontologies captures the full breadth and detail required to adequately describe and explain temporal events, such as human behaviour changes. The main problem is the need for expressive temporal representations. HBCO Bickmore et al. (2011), for example, relies on the OWL-Time ontology to include temporal references. However, apart from the language constructs for the representation of time in ontologies, this standard does not contain mechanisms to represent the evolution of concepts (e.g., events) in time. Moreover, OWL-Time does not propose inference rules to infer new temporal data automatically. NBO Gkoutos et al. (2012) and MF Hastings et al. (2012) were structured using the Basic Formal Ontology (BFO) Spear et al. (2016), which is an upper-level ontology that provides a structure for development by dividing entities into two categories: ‘continuants’ (representing objects and spatial regions) and ‘occurrents’ (representing processes extending over time). Indeed, 10 of the 15 ontologies discussed in Norris et al. (2019) align their concepts to BFO. According to such works, BFO facilitates ontology development because one does not have to reinvent the wheel concerning basic categories and relations, and it improves the overall quality and interoperability. However, these ontologies did not take advantage of the abstract definitions of the BFO categories or their hierarchical components to detail or extend temporal representations. This limitation in modelling more expressive temporal representations is an important motivation for our research.
Requirements for behaviour modelling
Consider the following example in the mHealth area. A wearable device is used to collect data about the sleep and physical activity behaviours of an individual and represent such data using an ontology. A deductive system that relies on this ontology could then infer whether this individual has some type of sleep disorder, such as insomnia, and the causes of this disorder. Considering this scenario, an ontology for behaviour modelling should be able to represent descriptions such as the example below:
She sleeps five hours per night [precise interval]. She always runs for 5 km starting at 20:30 before going to bed [qualitative relation between moments]. She goes to bed at 22:00 [precise moment] but stays awake for around two hours before sleep [uncertain quantitative relation between interval and moment]. She wakes up three times during the night [precise quantitative relation between moments and interval] and wakes up around 5:00 [uncertain moment] in the morning. These bad nights have occurred daily during the last month [precise qualitative relation between intervals].
A methodology that guides the creation of an ontology, which aims at representing such a behavioural description, should detail the design of four particular requirements:
Passive and personal data representations and their temporal contexts: so far, the characterisation of behaviours mostly relies on self-reported tools. However, the passive assessment of observable and measurable events (e.g., wake-up time, running periods) is a trend. Methodologies should consider the data associated with these events as the basis of a bottom-up modelling strategy.
Holistic representations based on temporal relations: if we intend to investigate, for example, if the physical behaviour (e.g., running 5 km before sleeping) is affecting the sleep routine, then the methodology must indicate strategies to create object properties that relate the classes of these domains. These holistic representations differ from most of the current domain-specific ontologies.
Broad temporal representations: as indicated in the previous example, behaviour modelling requires the representation of a generic set of time concepts (moments and intervals), time concept properties (precise and uncertain), time relations (interval-interval, interval-moment, and moment-moment), and time relation properties (qualitative and quantitative). Current methodologies only approach static rather than temporal representations.
Space representations: while behaviours occur over time, they also occur in space. Moreover, space is an important part of the behaviour study since it affects the routine of individuals. For example, individuals usually present different sleep behaviours when they are at home and in a hotel. Like time, space representations for behaviours also require the use of ternary relations. The literature describes solutions for such representations, but they present limitations. For example, reification Noy et al. (2006) is commonly used to represent n-ary (e.g., ternary) relations. However, it restricts the OWL capabilities since relations are described as the object of a property. Thus, OWL property semantics, such as inverse, cannot be applied. As the requirements and design approach for modelling the time and space semantics are completely different, our current paper focuses only on the former, leaving the latter for future research.
Temporal representations are inherently embedded in behavioural domains. Thus, inferences and their refinements can only be managed by using broad and expressive temporal modelling. Temporal representations also affect the passive data representation, since data are assessed inside a temporal context, and the representation of holistic relations since several relations between classes of different domains are given in terms of time. For example, physical activities may improve sleep quality if conducted 90 minutes before sleep time. Thus, the relation “improves” has an associated temporal constraint.
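As a minimal sketch of such a temporally constrained relation, the Python fragment below checks whether a physical activity ends early enough before sleep onset. The function name `may_improve_sleep`, the reading of the constraint as "at least 90 minutes before sleep", and the sample timestamps are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

# Assumed threshold, taken from the "90 minutes" example in the text.
MIN_GAP = timedelta(minutes=90)


def may_improve_sleep(activity_end: datetime, sleep_onset: datetime,
                      min_gap: timedelta = MIN_GAP) -> bool:
    """Return True when the activity ends at least `min_gap` before sleep onset."""
    return sleep_onset - activity_end >= min_gap


# Hypothetical evening: a run ending at 21:00 and sleep onset at 23:00.
run_end = datetime(2023, 1, 10, 21, 0)
onset = datetime(2023, 1, 10, 23, 0)
print(may_improve_sleep(run_end, onset))
```

In an ontology, the same constraint would be attached to the relation itself rather than computed procedurally; the sketch only shows why the relation cannot be evaluated without the two time references.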
Recent proposals Michie et al. (2020) already consider the importance of these requirements in the context of their behavioural representations. Our efforts contribute to that and provide a more concrete approach to managing the requirements, followed by the design and implementation of the ontologies in that domain. However, traditional ontology development methodologies do not discuss practical ways to guide the definition of these requirements. The analysis of the attributes described in Table 1 emphasises this limitation. This table is part of our review that captured the main aspects of well-known methodologies from the perspective of their pragmatic use. We selected these methodologies for analysis since they have evolved over a long time and, thus, have been used and evaluated in different projects, providing diverse examples and documentation Aminu et al. (2020); Sattar et al. (2020).
Summary of the main existing methodologies
The number of steps indicates the level of detail that a methodology presents. However, even Methontology, which contains 11 steps, still presents abstract descriptions for its steps. For example, step 1: “builds a glossary of terms that includes all the relevant terms of the domain, their natural language descriptions, and their synonyms and acronyms”, and step 2: “build concept taxonomies to define the concept hierarchy when the glossary of terms contains a sizable number of terms”. Regardless of the methodology, their steps refer to what activities someone needs to do when building ontologies.
The third column (Tool) indicates that two methodologies are directly linked to tools that support their use. While WebODE Corcho et al. (2005) supports the Methontology approach, Protégé Noy et al. (2003) supports the development of ontologies that use the Noy and McGuinness methodology. However, such tools are mostly editors that also offer strategies for identifying consistency problems in their axioms (e.g., same individual as instances of two disjoint classes). Thus, the use of tools does not mean that a methodology indicates “how” (rather than “what”) their steps should be conducted.
The fourth column shows that most of the methodologies were defined as generic. This is certainly the main reason for their level of abstraction. In other words, they cannot be specific regarding how to identify ontology components since such a task may be domain specific. Differently, as the FAO-based methodology tries to define a more concrete process for the agriculture area, it presents a more detailed approach that uses a multilingual agricultural thesaurus as part of its process to construct agriculture-related ontologies.
The fifth column identifies the most prominent feature of each methodology, considering their practical use. We did not find, for example, a discussion of strategies that support the previous three requirements, apart from the FAO-based approach that integrates its steps with domain-specific resources. Moreover, the last column emphasises that no methodology embraces the representation of temporal models. This result was expected since ontologies do not naturally support this kind of ternary representation, which requires extensions in the ontology semantics Wang and Uz (2019).
Our proposal, in particular, considers the Noy and McGuinness approach as the basis for our extensions. Two main reasons support this choice. First, their methodology considers the traditional sequence of steps (scope definition, terms identification, specification of concepts, concepts hierarchy, properties, and individuals), which is particularly adequate for novice ontology developers. The steps are neither as granular as those of Methontology nor as abstract as those of Uschold and King. Moreover, Noy and McGuinness create an adequate division between concepts and properties. In other words, one can create a complete hierarchy of concepts and afterwards think about the properties (divide and conquer principle). This division is clear in Protégé, which has different environments to cover the concepts and properties specifications. The second reason is the ample support for Protégé regarding plugins, community, documents, tutorials, and reasoners. Despite these advantages, the Noy and McGuinness approach is still abstract, mainly when the domain requires advanced representations, such as temporal notions and relations. Onto-mQoL, a project that intends to develop knowledge representations for the mobile quality of life domain, tackles this issue.
This section refines each of the Noy and McGuinness steps using strategies that support the three main requirements defined in Section 3. Moreover, we align the core parts of our upper-level design ontologies to BFO, clarifying the distinction of concepts, such as processes and states. Figure 1 illustrates the methodological steps and their flow.

Onto-mQoL methodology, indicating the sections where each of its steps is detailed.
Consider the following question: How does behaviour X affect my health? For this paper, in particular, X is a behavioural routine that can be associated, for example, with physical activities, sleep, intake of specific foods, and social interactions. The instantiation of X defines a
After defining the domain, the next step is to define the
Reusing existing ontologies
Reusing previous ontologies is mainly important to identify ontological elements (classes and properties) already well-established in the literature. NESTORE Mastropietro et al. (2021), IEEE P1752 Open Mobile Health Bent et al. (2021), and SNOMED CT El-Sappagh et al. (2018) are examples of useful ontologies for the context of behaviour modelling based on mHealth data. These ontologies can be found in ontology repositories, such as the Behavioural and Social Sciences Ontology (BSSO) Foundry (2022) and BioPortal Algergawy and König-Ries (2019). Alignment with upper ontologies, such as Basic Formal Ontology (BFO) Spear et al. (2016), is also useful to improve interoperability and to reach early agreement on semantics (e.g., states and processes).
Enumerating the ontology terms
The next step of this methodology is to find important
These actions are valid for identifying terms of any behavioural domain. Examples of such actions are given in Section 5.
Defining classes and classes hierarchy
The bottom-up strategy is the adequate approach to organise the hierarchy of classes when the modelling process starts from the perspective of health data assessed from mHealth sources. This strategy defines the most specific classes while clustering these classes into more general upper classes. In other words, the process creates sub-trees that compose a tree. Therefore, health data represent the leaves of the hierarchy, and they support the construction of other classes, according to the information collected from the previous step. A typical and general hierarchy for behaviour modelling is illustrated in Fig. 2, which also shows the alignment of classes using BFO Spear et al. (2016).

Example of template for bottom-up modelling strategy: from data to person.
This schema has measurable observations (e.g., heart rate or the number of steps) at its bottom. These measurable observations are modelled as sub-classes of assessment that characterise a behaviour. For example, assessments of the number of steps and heart rate characterise a behaviour regarding the physical activities domain. Issues are derived from specific behaviours that satisfy predefined axioms. For example, a low number of steps may characterise a relaxed or disease-impaired behaviour. Observe that behaviours from different domains can mutually affect each other. The definition of these relations ensures the holistic property of the representation. Finally, at the top level, the schema presents the concept of Person, which has several types of behaviours in different domains. This schema works as the representation backbone of behaviours, and several other concepts can be integrated according to the applications’ requirements. Section 5.4 exemplifies the instantiation of this template using Description Logic sentences Baader et al. (2008).
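The layering of this schema (assessments, behaviours, issues, person) can be illustrated with a small Python sketch. It is not the ontology itself: the class and attribute names mirror the concepts in the text, while the 3000-step threshold and the issue label `LowActivity` are hypothetical stand-ins for a predefined axiom.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assessment:
    """A measurable observation, e.g. a daily step count."""
    name: str
    value: float


@dataclass
class Behaviour:
    """A behaviour in one domain, characterised by its assessments."""
    domain: str
    assessments: List[Assessment] = field(default_factory=list)

    def issues(self) -> List[str]:
        # Hypothetical axiom: fewer than 3000 steps flags a low-activity issue.
        return [
            "LowActivity"
            for a in self.assessments
            if a.name == "steps" and a.value < 3000
        ]


@dataclass
class Person:
    """Top-level concept: a person has behaviours in several domains."""
    name: str
    behaviours: List[Behaviour] = field(default_factory=list)


# Bottom-up construction: data first, then behaviour, then person.
steps = Assessment("steps", 1800)
heart_rate = Assessment("heart_rate", 64)
physical = Behaviour("physical_activity", [steps, heart_rate])
person = Person("P1", [physical])
print(physical.issues())
```

The construction order at the end follows the bottom-up strategy: the leaves (assessments) exist before the classes that aggregate them.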
The definition of properties is divided into two parts: static and temporal. We give special attention to this latter type of property since temporal-based modelling is not part of other methodologies (see Table 1), and it is a major contribution of Onto-mQoL.
Defining static properties
The modelling of static object properties follows the method defined in Noy et al. (2001). However, during their modelling, we must consider that object properties can define relations between two concepts of different domains, and these relations may rely on temporal aspects. The definition of these relations is the main strategy to configure holistic representations. Rather than only identifying the influences between domains, we must also understand how these influences semantically work to justify their relations. A simple strategy is to obtain the assistance of a specialist in the domain that is the object of study. For example, the specialist in physical activities can indicate the main correlations of this domain with others, such as dietary behaviour Gillman et al. (2001), sexual activity behaviour Dekker et al. (2020), and sleep behaviour Vanderlinden et al. (2020). Therefore, we could focus our investigation on these related domains and find a body of knowledge that supports the properties’ representation.
Defining temporal properties
Temporal relations compose a particular type of object property. They are common in health domains due to their dynamic representations (e.g., diseases, treatment) that unfold and evolve over time. However, as indicated previously, time representations are underused in ontologies since this form of representation uses Description Logic (DL) Baader et al. (2008) as its formal basis. DL is a decidable fragment of first-order logic that only uses unary and binary predicates. Therefore, ontologies do not naturally support representations for inputs such as “I slept well from 23:00 to 04:00”, “I had a post-traumatic stress disorder after my car accident”, and “I had a heart attack in 1999”. These temporal relations are difficult to represent, requiring ontological extensions, as proposed in Siebra and Wac (2022). While the work in Siebra and Wac (2022) provides the semantics for this type of extension, this section describes the ontological engineering to use its ideas in our methodology. Indeed, as knowledge engineering may be a complex process, including temporal notions tends to create further modelling challenges.
Onto-mQoL proposes the following strategy to reduce this complexity. The modelling process initially specifies the static version of the ontology. After that, apply one by one the following actions to transform the static version into its temporal version:
Action 1. Given the initial set of competency questions, identify the subset of such questions that require temporal notions to be answered.
For example, consider the simplified model below (Fig. 3), which indicates that a person performs a physical activity and drinks liquid, and the following competency questions: (1) What was the duration of the physical activity performed by a person? (2) Has a person drunk liquids during the physical activity? Note that this model is not able to answer such questions. Thus, it must be extended.

A simple example of representation, indicating that a person drinks liquids and performs physical activities. This type of representation does not directly allow the representation of the activities sequence.
Action 2. Create new temporal classes and properties that support the generation of answers for the previous questions.
As an example, two new classes are defined and included in the previous model: HydrationEvent and ActivityEvent. The first class refers to the moment when a person drinks liquid, while the second refers to the interval of the physical activity. Using these two new classes, we can state, via object properties (e.g., after, during, before), if a HydrationEvent occurs during an ActivityEvent, and also indicate the duration of the ActivityEvent. Thus, the ontology can answer the previous competency questions.
Action 3. Integrate the new classes into the original static ontology by modifying the domain and range of the object property that connects the classes involved in the temporal relation.
Following the temporal framework in Siebra and Wac (2022), we use the N-ary approach to integrate the new classes into the original ontology (Fig. 4).

Extension of the previous model (Fig. 3) with two new temporal concepts. In this case, the inclusion of a temporal property between these two new concepts can directly represent the activities sequence.
The domain and range of the object properties were updated as follows: drinks (domain: Person ∪ HydrationEvent, range: HydrationEvent ∪ Liquid) and performs (domain: Person ∪ ActivityEvent, range: ActivityEvent ∪ PhysicalActivity). The discussion about the semantics of this approach is out of the scope of this paper, but its details can be seen in Siebra and Wac (2022).
Action 4. Connect the temporal structure to each new temporal event using the “occurs” object property.
Concepts such as HydrationEvent and ActivityEvent inherit from the Event concept. This event has an object property called occurs, which relates events to their temporal notion, which can be a moment or an interval (Fig. 5). Moments are time points and do not present a duration. Intervals have initial and final moments. Thus, a duration can be derived from their definitions. Considering the previous model (Fig. 4), the class ActivityEvent will be connected to an interval, i.e., occurs(ActivityEvent,Interval-X). Meanwhile, such an interval will be defined using two moments, i.e., hasBeginning(Interval-X,PreciseMoment-A) and hasEnd(Interval-X,PreciseMoment-B). In this example, precise moments define the interval. However, the proposed representation is very flexible since it also allows the representation of uncertain time notions. Details of this representation, such as its semantics, are found in Siebra and Wac (2022).

Template for temporal classes and properties representations of an event, and their alignment with BFO classes.
Action 5. Update the knowledge base with the individuals (instances) and corresponding values for their object and data properties.
For example, when an instance of the HydrationEvent class is included (from Fig. 3 to Fig. 4), other instances representing the interval of this event and its initial and final moments must also be included. As a general rule, temporal versions of static ontologies have additional classes and properties (moments and intervals associated with events), and we must include their instances (individuals) in the ontology description.
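The outcome of these actions can be illustrated with a minimal Python sketch, assuming precise moments only. The names `Moment`, `Interval`, `Event`, `occurs`, `has_beginning`, and `has_end` mirror the concepts and properties discussed above; the `during` helper and the sample timestamps are hypothetical, and the sketch does not reproduce the N-ary semantics of Siebra and Wac (2022).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Union


@dataclass(frozen=True)
class Moment:
    """A time point without duration."""
    at: datetime


@dataclass(frozen=True)
class Interval:
    """A period delimited by an initial and a final moment."""
    has_beginning: Moment
    has_end: Moment

    def duration(self) -> timedelta:
        # The duration is derived from the two delimiting moments.
        return self.has_end.at - self.has_beginning.at


@dataclass(frozen=True)
class Event:
    """An event linked to its temporal notion through `occurs`."""
    label: str
    occurs: Union[Moment, Interval]


def during(moment_event: Event, interval_event: Event) -> bool:
    """True when a moment-based event falls inside an interval-based event."""
    m, i = moment_event.occurs, interval_event.occurs
    if not isinstance(m, Moment) or not isinstance(i, Interval):
        raise TypeError("during() expects a moment event and an interval event")
    return i.has_beginning.at <= m.at <= i.has_end.at


# Hypothetical individuals (Action 5): one activity and one hydration event.
activity = Event(
    "ActivityEvent",
    Interval(Moment(datetime(2023, 1, 10, 20, 30)),
             Moment(datetime(2023, 1, 10, 21, 15))),
)
hydration = Event("HydrationEvent", Moment(datetime(2023, 1, 10, 20, 50)))

print(activity.occurs.duration())   # competency question (1)
print(during(hydration, activity))  # competency question (2)
```

With these individuals in place, both competency questions from Action 1 become answerable, which the static model in Fig. 3 could not support.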
Evaluation of ontologies should be performed against a frame of reference, which can be represented by a set of predefined competency questions (CQ) Gómez-Pérez (1995); Bezerra et al. (2013); Fernández-López et al. (1997). According to this strategy, the questions defined in natural language during the initial step of the methodology are now considered as demands to verify requirements’ satisfiability by either knowledge retrieval or by entailment on its axioms and answer checking Bezerra et al. (2013). Such evaluation also requires a set of individuals (instances) to populate the knowledge base and criteria to qualify the ontology. The criteria commonly used are accuracy, clarity, relevance, and completeness. A deeper discussion on these and other criteria can be found in Amith et al. (2018).
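A rough sketch of evaluation by knowledge retrieval is given below, assuming a knowledge base simplified to subject-property-object triples. The individuals (`anna`, `run_1`, `water_1`, `interval_x`), the `match` helper, and the mapping from each question to a single triple pattern are all hypothetical; a real evaluation would query the ontology through a reasoner.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical individuals populating the knowledge base.
KB: Set[Triple] = {
    ("anna", "performs", "run_1"),
    ("anna", "drinks", "water_1"),
    ("run_1", "occurs", "interval_x"),
    ("water_1", "during", "run_1"),
}


def match(kb: Set[Triple], s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in kb
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Each competency question is paired with a retrieval over the triples.
COMPETENCY_QUESTIONS: Dict[str, Callable[[Set[Triple]], List[Triple]]] = {
    "What was the duration of the physical activity performed by a person?":
        lambda kb: match(kb, s="run_1", p="occurs"),
    "Has a person drunk liquids during the physical activity?":
        lambda kb: match(kb, p="during", o="run_1"),
}


def evaluate(kb: Set[Triple]) -> Dict[str, bool]:
    """Mark a question as satisfied when its retrieval returns an answer."""
    return {q: bool(query(kb)) for q, query in COMPETENCY_QUESTIONS.items()}


for question, satisfied in evaluate(KB).items():
    print(satisfied, question)
```

A question left unsatisfied by this kind of check points to a missing class, property, or individual rather than to an error in the data.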
Onto-mQoL methodology case example: The sleep and rest domain
Determining the domain and scope of the ontology
This case example considers the sleep and rest domain as the object of study. According to several papers Daza et al. (2020), Buysse (2014), repetitive behaviours of individuals regarding their sleep and rest actions/decisions (e.g., interval dedicated to sleep, time to wake up, time to go to bed and sleep) have a strong influence on their health in the long term and can be the main reason for physical and mental disorders. Considering our passive and personal data requirement, we are using mHealth data that can support outcomes for the competency questions (CQ) below:
(
(
(
(
(
(
(
These questions were based on hypotheses regarding the sleep quality domain. The codification of
Reusing existing ontologies
We are reusing the IEEE P1752 Open Mobile Health Bent et al. (2021) standard since it presents a specific vocabulary for mHealth data in the sleep domain. For example, this standard defines semantic schemas for the following sleep terms:
sleep-onset-latency: the amount of time between when a person starts to want to go to sleep and sleep onset.
total-sleep-time: duration after sleep onset in an entire sleep episode minus the duration of all awakenings.
wake-after-sleep-onset: summed duration of all awakenings (i.e., minimum 15 secs) after sleep onset in a sleep episode.
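The three definitions above translate into simple arithmetic over timestamps, sketched below in Python. The function names, the representation of awakenings as (start, end) pairs, the choice to subtract only awakenings of at least 15 seconds from the total sleep time, and the sample night are assumptions for illustration; they are not taken from the IEEE P1752 schemas.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Minimum length for a wake period to count as an awakening (from the text).
MIN_AWAKENING = timedelta(seconds=15)

Awakening = Tuple[datetime, datetime]  # (start, end), an assumed encoding


def sleep_onset_latency(intent_to_sleep: datetime,
                        sleep_onset: datetime) -> timedelta:
    """Time between wanting to go to sleep and sleep onset."""
    return sleep_onset - intent_to_sleep


def wake_after_sleep_onset(awakenings: List[Awakening]) -> timedelta:
    """Summed duration of awakenings lasting at least MIN_AWAKENING."""
    durations = (end - start for start, end in awakenings)
    return sum((d for d in durations if d >= MIN_AWAKENING), timedelta())


def total_sleep_time(sleep_onset: datetime, final_wake: datetime,
                     awakenings: List[Awakening]) -> timedelta:
    """Duration after sleep onset minus the duration of the awakenings."""
    return (final_wake - sleep_onset) - wake_after_sleep_onset(awakenings)


# Hypothetical night: in bed at 22:00, asleep at 00:00, final wake at 05:00.
intent = datetime(2023, 1, 10, 22, 0)
onset = datetime(2023, 1, 11, 0, 0)
final_wake = datetime(2023, 1, 11, 5, 0)
awakenings = [
    (datetime(2023, 1, 11, 1, 30), datetime(2023, 1, 11, 1, 40)),
    (datetime(2023, 1, 11, 3, 0), datetime(2023, 1, 11, 3, 5)),
    (datetime(2023, 1, 11, 4, 0, 0), datetime(2023, 1, 11, 4, 0, 10)),  # under 15 s
]

print(sleep_onset_latency(intent, onset))
print(wake_after_sleep_onset(awakenings))
print(total_sleep_time(onset, final_wake, awakenings))
```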
The NESTORE ontology Mastropietro et al. (2021) is another source for reuse since it presents a specific package to model sleep aspects. Its schemas are used for representing terms that are not described in the P1752 standard. Indeed, the scope of NESTORE is broader and also involves concepts of clinical-based measuring instruments (e.g., actigraphy, polysomnography) and questionnaire types (e.g., pittsburgh_sleep_quality and karolinska_sleep_diary). However, we are only interested in sleep quality concepts that mobile sensors can assess, mainly because they are enough to answer the above stated CQs. Thus, these other concepts are not part of our representation. Neither the IEEE P1752 standard nor NESTORE covers sleep disorder terminology and, consequently, they do not offer support for answering CQs such as
Enumerating the ontology terms
As discussed in the methodology, this step aims to identify the main terms that compose the ontology and support the inference of answers for CQs. This process is conducted using the four actions introduced in Section 4.3. As an example, the next Section 5.3.1 details the process of enumerating terms for one of the CQs (
Enumerating the ontology terms for
The first action (
In this case, “The American Academy of Sleep Medicine defines the disorder as a ⟨ “The term insomnia will be used as a disorder with the following diagnostic criteria: (1) ⟨
These definitions present several measurable observations (e.g., sleep initiation, sleep duration, sleep quality), and some of these observations (terms) have the same semantics (e.g., sleep initiation and falling asleep). Therefore, at this point, it is also important to unify the vocabulary using standard terms (reuse – IEEE P1752 Open Mobile Health and NESTORE ontology). Table 2 presents some terms used as constructors of the definitions, the standard term for such terms, and their semantics.
Measurable observations used as constructors of the insomnia definitions
The next action (
X = (“sleep onset latency” OR “sleep initiation” OR “falling asleep”):
“Devices that measured behavioural aspects of sleep onset consistently overestimated PSG-determined sleep onset latency, but to a comparatively low degree.” Scott et al. (2020)
X = (“total sleep time” OR “sleep duration”):
“The sleep-latency and sleep-duration models accounted for 42% and 84% of the variance in the data, respectively, and yielded acceptable average prediction errors for planning sleep schedules (4.0 min for sleep latency and 0.8 h for sleep duration).” Vital-Lopez, Balkin and Reifman (2021)
X = (“wake after sleep onset” OR “awakenings” OR “sleep consolidation” OR “staying asleep”):
“In conclusion, we found moderate agreement for total sleep time (TST) and WASO measured by six different commercial activity monitors (CAMs) as compared to an Actiwatch Spectrum (AW). Of the six CAMs, Samsung, followed by Fitbit and Jawbone, had the highest agreement with the AW across various indices when considering both TST and WASO.” Kubala et al. (2020)
X = (“sleep quality” OR “nonrestorative sleep”):
“Sleep efficiency (SE) is an overall measurement of a person’s sleep quality and is simply ratio of the time spent asleep (TST) to the amount of time spent in bed (SOL + WASO + TST). SE is normally 85 to 90% or higher for people who do not suffer sleep problems.” Lawson et al. (2013)
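The ratio quoted above can be sketched as a small function. This is a minimal illustration, assuming that all three durations are expressed in the same unit (minutes here); the function name and the sample values are hypothetical and are not taken from the MMASH dataset.

```python
def sleep_efficiency(tst: float, sol: float, waso: float) -> float:
    """Return sleep efficiency as TST / (SOL + WASO + TST).

    tst:  total sleep time
    sol:  sleep onset latency
    waso: wake after sleep onset
    All durations must use the same unit (assumed to be minutes).
    """
    time_in_bed = sol + waso + tst
    if time_in_bed <= 0:
        raise ValueError("time in bed must be positive")
    return tst / time_in_bed


# Hypothetical night: 420 min asleep, 20 min to fall asleep,
# 30 min awake after sleep onset.
se = sleep_efficiency(tst=420, sol=20, waso=30)
print(f"Sleep efficiency: {se:.1%}")
```

Under the 85 to 90% reference range mentioned in the quote, a value computed in this way could then be mapped to one of the value ranges of the corresponding sleep measure.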
The next action (
Terms of sleep measures according to their value/ranges
Table 4 shows the terms identified for the sleep and rest domain using
Terms of sleep domain and their relations to CQs
This table also includes terms from the negative feelings (supporting
The hierarchical organisation of the previous terms uses Description Logic (DL) as the representation language. Therefore, a parent concept must be created to cluster child concepts that share the same nature or have similar data or object properties. This strategy is analogous to the definition of classes and superclasses in object-oriented programming. Figure 6 shows a visual representation of the DL sentences (1) to (4) that organise the terms of Table 3.

The parent classes in Fig. 6 (e.g., SleepOnSetLatency) are types of sleep measures. Therefore, we create a new class called SleepMeasure (type of assessment) (5), which works as the parent class of these measures (Fig. 7).

Visual representation of DL sentence (5).
These measures characterise sleep behaviour (or quality of sleep behaviour). Thus, these two classes SleepBehaviour and SleepMeasure hold the following property (6):
Finally, the root concept of the Sleep and Rest domain is SleepBehaviour, which can be further associated with an enumeration of values (Very dissatisfied, Dissatisfied, Neither satisfied nor dissatisfied, Satisfied, Very Satisfied), which qualify this behaviour. This concept of quality in sleep behaviours is mainly represented in Kim (2020) and Mastropietro et al. (2021).
Static properties
This step identifies relations between classes (Section 4.5) according to the semantics provided in the competency questions. For example, all competency questions try to relate a person (e.g., participant, older adult, individual) to some type of sleep behaviour. Thus, Person must be defined as a class, and a property called hasSleepBehaviour must be specified to relate the SleepBehaviour and Person classes.
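Following the object-oriented analogy used for the class hierarchy, the static structure described so far can be sketched in Python. This is only an illustrative analogue, not the OWL specification of the ontology: the attribute names `value_minutes`, `quality`, and `characterised_by` are assumptions introduced for the example, while the class names and `hasSleepBehaviour` (written here as `has_sleep_behaviour`) come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SleepMeasure:
    """Parent class of all sleep measures (a type of assessment)."""
    value_minutes: float  # assumed unit, for illustration only


class SleepOnsetLatency(SleepMeasure):
    """Time between wanting to sleep and sleep onset."""


class TotalSleepTime(SleepMeasure):
    """Duration asleep in a sleep episode, excluding awakenings."""


class WakeAfterSleepOnset(SleepMeasure):
    """Summed duration of awakenings after sleep onset."""


@dataclass
class SleepBehaviour:
    """Root concept of the sleep and rest domain."""
    quality: str  # e.g. "Very dissatisfied" ... "Very Satisfied"
    characterised_by: List[SleepMeasure] = field(default_factory=list)


@dataclass
class Person:
    name: str
    has_sleep_behaviour: List[SleepBehaviour] = field(default_factory=list)


# Hypothetical individual with one assessed sleep behaviour.
behaviour = SleepBehaviour(
    quality="Dissatisfied",
    characterised_by=[TotalSleepTime(310), SleepOnsetLatency(45)],
)
participant = Person(name="participant_01", has_sleep_behaviour=[behaviour])
```

The sketch shows why a common parent is useful: any code (or, in the ontology, any rule) that refers to SleepMeasure applies to all of its children without enumerating them.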
The answers are still limited when we try to explain the causes of low-quality sleep behaviours using only sleep measures. For example, the WHOQOL instrument includes the following question as part of its survey: “How satisfied are you with your sleep?”. Consider that the answer is “Very dissatisfied”. The current representation can show that the reason may be the short sleep time or the high number of awakenings. However, we can ask for more details, such as: what could be the reasons for this short sleep time? In this case, the reason could be a simple individual decision or other non-controlled factors related to daily life circumstances. According to the specialists and grounded in the current literature, two domains are closely related to sleep behaviour: negative feelings Stewart et al. (2011) (
As discussed before, it is not our intention in this paper to demonstrate all the modelling processes of the negative feelings and activities of daily living domains. At the moment, we give a simple description of these domains (7)–(8) based on the attributes presented in the dataset of our case example Rossi et al. (2020a).
According to the respective instruments used in Rossi et al. (2020a) to assess Anxiety (State-Trait Anxiety Inventory), Anger (Positive and Negative Affect Schedule), and Stress (Daily Stress Inventory), these concepts can be further classified as (9)–(11):
Finally, sleep measures that characterise a sleep behaviour can also characterise a sleep disorder or issue (e.g., insomnia in
Sleep applications of all kinds are currently available in the market, offering diverse functionality features, …and even aiding healthcare professionals
As sleep is a risk factor for many chronic diseases, the momentary
Mobile health (mHealth) apps offer a scalable option for
These merits of commercial
Note that the semantics of these verbs (screening, tracking, or treating) always refer to the possibility of collecting data and characterising the existence of some disorder. A complete graphic view of all components, and their source file (owl), is available as a supplementary online resource (see appendix).
Temporal properties
The analysis of the competency questions also emphasises that sleep behaviours are associated with a period whose standard term is SleepEpisode Bent et al. (2021). Therefore, the analysis of several sleep episodes can indicate the presence of sleep disorders, such as insomnia. The classical solution for this representation is to create a data property (e.g., sleepEpisodeData) for SleepBehaviour (Fig. 8a).

Two different approaches to represent time using data property (a), and temporal classes (b).
However, this strategy limits the reasoning process since ontologies do not support the specification of relations (e.g., before or during) between data properties. Therefore, we employed the strategy described in Section 4.5.2, creating the SleepEpisodeEvent class (Fig. 8b), which occurs at some TimeNotion. This strategy allows the specification of temporal relations between this concept (SleepEpisodeEvent) and other temporal elements of the domain. The following schema (Fig. 9) shows the relations between Person and both DailyActivity and NegativeFeeling. Note the inclusion of the DailyActivityEvent and NegativeFeelingAssessmentEvent classes, which give the temporal notions for these relations. For example, this strategy enables explicitly representing negative feelings assessments and relating them and other events using temporal object properties such as before or during. This type of relationship is important for the reasoning process, as demonstrated in Section 5.6.

Relations between the Person, DailyActivity, and NegativeFeeling concepts.
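The benefit of treating events as first-class temporal elements can be sketched with two interval relations. The following is a minimal illustration, assuming that each event occupies a closed interval on the timeline in minutes and that before and during follow the usual interval-algebra reading; the `Interval` class and the sample values are hypothetical and do not reproduce the ontology's TimeNotion class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Time span of an event, in minutes from the start of the timeline."""
    start: int
    end: int

    def __post_init__(self) -> None:
        if self.end <= self.start:
            raise ValueError("an interval must end after it starts")


def before(a: Interval, b: Interval) -> bool:
    """True if a finishes strictly before b starts."""
    return a.end < b.start


def during(a: Interval, b: Interval) -> bool:
    """True if a lies strictly inside b."""
    return b.start < a.start and a.end < b.end


# Hypothetical events: an evening activity, a sleep episode, and an awakening.
daily_activity_event = Interval(start=1080, end=1140)
sleep_episode_event = Interval(start=1380, end=1860)
awakening = Interval(start=1500, end=1510)

print(before(daily_activity_event, sleep_episode_event))  # True
print(during(awakening, sleep_episode_event))             # True
```

With a plain data property, such comparisons would have to be computed outside the ontology; with event classes, relations of this kind can be asserted or inferred between individuals.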
Source of individuals
The ontology population with individuals is essential to its evaluation. There are two common strategies to create these individuals: conducting a longitudinal assessment and using a predefined dataset. This second approach is used in this section. Therefore, this case example relies on the Multilevel Monitoring of Activities and Sleep in Healthy people (MMASH) dataset Rossi et al. (2020a,b) since it covers a significant part of our ontology. That means MMASH provides the required data to answer the CQs. The MMASH dataset organises the data of 22 healthy participants in seven files according to the type of information (e.g., questionnaire answers, actigraph). During the test, participants wore two devices continuously for 24 hours (starting at about 10:00 am): a heart rate monitor (Polar H7 heart rate monitor – Polar Electro Inc., Bethpage, NY, USA) to record heartbeats and beat-to-beat interval, and an actigraph (ActiGraph wGT3X-BT – ActiGraph LLC, Pensacola, FL, USA) to record actigraphy information such as accelerometer data, sleep quality, and physical activity.
Regarding the ethical aspects, the MMASH data assessment complies with the Helsinki Declaration, as revised in 2013. The MMASH protocol was also approved by the Ethical Committee of the University of Pisa (#0077455/2018), and it complies with the General Data Protection Regulation: Regulation – EU 2016/679 of the European Parliament and of the Council 27/04/2016 – on the protection of private persons concerning the processing of personal data and on the free movement of such data. We consolidated a subgroup of the MMASH data in a single file as described below:
Five personal data items, which are biological gender (M or F), height (cm), weight (kg), age (years), and smokeDailyFrequency.
Four sleep-related assessment data items, which are TotalSleepTime (time unit), Awakenings, WakeAfterSleepOnset (time unit), and SleepOnsetLatency (time unit).
Seven psychological assessment data items, which are two Anxiety state values (integer from 20 to 80) before and after sleeping, one daily stress value (integer from 0 to 406), and four Anger values (integer from 5 to 50) at the time units 600, 1080, 1320 and 1980. The assessment values are defined according to the respective instruments used in Rossi et al. (2020a).
Activities assessment data with several entries (activities) that summarise the physical DailyActivity executed over the assessment period.
Sleep quality assessment, which represents the sleep quality defined by the individual.
The timeline uses continuous values in minutes, starting from zero (00:00, first day). These data are available in the supplementary online material as an Excel datasheet, which the Protégé ontology editor can import and automatically transform into ontology individuals using the Cellfie plugin.
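The minute-based timeline can be mapped back to a day and a clock time with simple integer arithmetic. The helper below is a minimal sketch; its name is hypothetical, and it only assumes the convention stated above (minute zero corresponds to 00:00 of the first day).

```python
from typing import Tuple

MINUTES_PER_DAY = 24 * 60


def timeline_to_clock(minutes: int) -> Tuple[int, str]:
    """Convert minutes since 00:00 of day 1 into (day number, 'HH:MM')."""
    if minutes < 0:
        raise ValueError("timeline values start at zero")
    day_index, remainder = divmod(minutes, MINUTES_PER_DAY)
    hours, mins = divmod(remainder, 60)
    return day_index + 1, f"{hours:02d}:{mins:02d}"


# The four time units at which the Anger values are recorded.
for t in (600, 1080, 1320, 1980):
    print(t, timeline_to_clock(t))
```

For instance, the time unit 1980 exceeds one day (1440 minutes) and therefore falls at 09:00 on the second day, which is consistent with a 24-hour recording that starts at about 10:00 am.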
Evaluation framework
As discussed in Section 4.6 and in Noy et al. (2001), CQs serve as a qualitative resource to test ontologies. Therefore, our case example verified if our ontology covers the required knowledge to answer the set of CQs defined in Section 5.1 (
This example shows that a SQWRL query has two parts. The antecedent rule (before the symbol “→”) is composed of a conjunction of unary (concepts) and binary (object properties) atoms. SQWRL treats this part as a pattern specification for a query. In this case, the rule says that the variable ?person must be an individual of the class Person with the quality of his/her sleep behaviour classified as poor. Meanwhile, this rule also verifies if this ?person carried out a physical activity ?da during the interval ?e2, and if its intensity was moderate (ModerateIntensity(?da)). Finally, the rule verifies if this physical activity occurred before the sleep episode. SQWRL returns all distinct individuals (instances) of Person that satisfy this pattern.
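The pattern that this rule specifies can also be read procedurally. The following Python sketch is not the SQWRL rule itself, nor the way a reasoner evaluates it; it is a simplified analogue that assumes a flat list of events and hypothetical field names, and it only mirrors the three conditions described above (poor sleep quality, a moderate-intensity activity, and the activity occurring before the sleep episode).

```python
from typing import Iterable, NamedTuple, Set


class Event(NamedTuple):
    person: str   # identifier of the Person individual
    kind: str     # "sleep" or "activity"
    label: str    # sleep quality, or activity intensity
    start: int    # minutes on the timeline
    end: int


def poor_sleep_after_moderate_activity(events: Iterable[Event]) -> Set[str]:
    """Return the distinct persons matching the pattern described above."""
    events = list(events)
    sleeps = [e for e in events if e.kind == "sleep" and e.label == "poor"]
    activities = [
        e for e in events if e.kind == "activity" and e.label == "moderate"
    ]
    matches = set()
    for sleep in sleeps:
        for activity in activities:
            same_person = activity.person == sleep.person
            occurs_before = activity.end < sleep.start
            if same_person and occurs_before:
                matches.add(sleep.person)
    return matches


# Hypothetical individuals, not taken from the MMASH dataset.
sample = [
    Event("p1", "activity", "moderate", 1080, 1140),
    Event("p1", "sleep", "poor", 1380, 1800),
    Event("p2", "sleep", "poor", 1380, 1800),
    Event("p2", "activity", "moderate", 1900, 1960),
    Event("p3", "activity", "moderate", 1000, 1060),
    Event("p3", "sleep", "good", 1400, 1850),
]
print(poor_sleep_after_moderate_activity(sample))
```

Returning a set corresponds to the selection of distinct individuals: a person who matches the pattern through several activities is reported only once.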
We suggest the paper of O’Connor and Das (2009) as a gentle and easy-to-follow introductory tutorial for the SQWRL syntax. The complete set of rules for this case example is available as supplementary material.
Evaluation criteria
We employed
The codification of
SPARQL queries results
The following questions verify if the ontology supports identifying the causes or the understanding of these potential sleep issues. The codification of
The codification of
The inference results (Table 5) demonstrate that the ontology supports accurate answers for the CQs used in this evaluation. We could create several other competency questions to explain or better understand the causes of poor sleep quality of individuals. These questions must always rely on knowledge captured from experts or specialised literature, while the role of ontologies is to bring the constructors to answer these questions. If the ontology fails to provide such a representation, this becomes an opportunity to extend it, as we did for
The previous case (Section 5) detailed the use of our modelling methodology to represent behaviour that could affect the sleep quality of individuals and to support reasoning over such knowledge. However, other behaviour-related domains can also make use of this methodology. Consider two of the ontologies discussed in Section 2: the Health Behaviour Change Ontology (HBCO) Bickmore et al. (2011) and the Cognitive Paradigm Ontology (COGPO) Turner and Laird (2012). The following schemes (Fig. 10) represent parts of these ontologies.

Fragments of the HBCO (a) and COGPO (b) ontologies.
This HBCO ontology fragment (Fig. 10a) represents therapeutic actions that are part of a counseling session and are appropriate for a specific mental state treatment. Such sessions compose an intervention that aims at, for example, changing the behaviour of patients. Suppose the following competency questions: Which sequence of therapeutic actions provides the fastest result as part of an intervention? and Which behavioural changes were observed during the intervention? Our methodology could be used to extend this ontology and answer these questions by including: (1) data-centric observations and their temporal notions to characterise the behavioural changes; (2) holistic representations for extending the patient state concept and verifying if these states evolve at the same pace for a given intervention; and (3) temporal relations to support the inference of answers for the first question.
The COGPO fragment (Fig. 10b) represents behavioural processes that are responses to some stimuli, which are given, for example, in the form of words delivered to trigger some reaction. A possible competency question in this domain is: Which features characterise the behavioural processes and their relative reaction times? Again, our methodology could extend this representation with data-centric observations and their temporal notions to compare the time and duration of reactions. For example, stimuli could present real-time or later effects, or long-term consequences in different behaviour domains (e.g., sleep quality, psychological disturbances).
Apart from their different aims, both examples emphasise the importance of temporal representations to support more expressive questions. Thus, using a methodology that guides the development or extensions of ontologies, from the perspective of such representations, may be an important resource for promoting new forms of inferences in health support systems.
This paper presented a proposal for creating ontological representations of human behaviours in the context of human health. It brings four main contributions. First, it provides a more pragmatic but still general approach to this type of modelling. This approach differs from current methodologies, which maintain a high level of abstraction: rather than only indicating what developers must do, we also explain strategies to conduct these steps. Second, we rely on the current trend of using personal health mobile technology (smartphones and wearable devices) as an assessment resource, instead of relying on self-reports. Therefore, we start modelling from the assessed data types, using a bottom-up modelling strategy. Third, we include temporal aspects in this modelling. The lack of a time notion is one of the main limitations of current ontologies, which usually do not model such aspects, limiting the reasoning abilities of ontology-based systems. Finally, we show the importance of holistic representations, which use knowledge of different domains, and how we could specify these representations, for example, using temporal relations between typical events of diverse life domains (e.g., physical health, psychological, social interactions, and environmental ones).
A further particular aspect of this approach is its strong reuse of existing knowledge. Our approach encourages the study of previous ontologies and terminologies, which could facilitate the future integration of systems. Indeed, a lack of reuse leads to several syntactically diverse terms that refer to the same semantics, as we have observed in our studies. Apart from reusing ontologies, we also encourage using previous knowledge when searching for terms and, consequently, concepts and properties, in specialised literature and reviews. Reviews, in particular, summarise domains and are the root of a tree of primary references or body of knowledge.
Deeper research on the ontology validation process is not part of our studies, and this aspect is a limitation of this paper. We recommend the use of competency questions as the basis for validation. However, using these questions is not straightforward regarding the evaluation criteria. For example, the clarity criterion requires a more systematic validation with the final users of the ontology Espinoza et al. (2021). On the other hand, the relevance criterion should be analysed by domain experts. The completeness criterion is especially hard to evaluate since it depends on the quality of the competency questions in terms of their coverage. This issue is similar to the traditional test coverage problem in the software testing area, where strategies include gathering information about which parts of a program are executed when running the test suite to determine which branches of conditional statements have been taken Malaiya et al. (2002). In the context of ontologies, a strategy to ensure coverage is to create competency questions related to ontological elements (e.g., classes/concepts, relations, properties, axioms) as soon as these elements are included in the specification. Thus, very simple competency questions are initially defined and tend to become more complex as the representation evolves.
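The analogy with test coverage can be made concrete with a simple measure: the fraction of ontological elements that are referenced by at least one competency question. The sketch below is only an illustration of this idea under that assumption; the function name, the CQ labels, and the mapping between questions and elements are hypothetical and are not part of the evaluation reported in this paper.

```python
from typing import Dict, Set, Tuple


def cq_coverage(
    ontology_elements: Set[str],
    cq_elements: Dict[str, Set[str]],
) -> Tuple[float, Set[str]]:
    """Return (coverage ratio, uncovered elements).

    ontology_elements: names of classes, properties, etc. in the ontology
    cq_elements: for each competency question, the elements it refers to
    """
    referenced = set().union(*cq_elements.values())
    covered = referenced & ontology_elements
    uncovered = ontology_elements - covered
    ratio = len(covered) / len(ontology_elements) if ontology_elements else 0.0
    return ratio, uncovered


# Hypothetical mapping between two competency questions and the elements they use.
elements = {
    "Person",
    "SleepBehaviour",
    "SleepMeasure",
    "hasSleepBehaviour",
    "SleepEpisodeEvent",
}
questions = {
    "CQ-a": {"Person", "SleepBehaviour", "hasSleepBehaviour"},
    "CQ-b": {"SleepBehaviour", "SleepMeasure"},
}
ratio, uncovered = cq_coverage(elements, questions)
print(f"coverage: {ratio:.0%}, uncovered: {sorted(uncovered)}")
```

An element that remains uncovered, such as SleepEpisodeEvent in this hypothetical case, signals that a new competency question should be written for it, in line with the incremental strategy described above.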
The lack of guidance on representing space and on integrating it with time is a further limitation of our study. Indeed, behaviours may be affected by the place (space) where individuals are at a certain moment. For example, someone may present different levels of sleep quality if she/he is sleeping at home or in an unfamiliar place. The importance of space to contextualise the analysis of events is already foreseen, for example, in upper-level ontologies such as BFO, which defines the class “spatiotemporal region” to represent occurrent entities (e.g., events, processes) that are part of spacetime. An example is the spatiotemporal region occupied by an evolving cancer tumor. Thus, these limitations are also opportunities for future research.
Finally, while the focus of our case example was on the sleep domain, modelling other behaviours in other domains can follow the same process. For example, we intend to extend this study using this process to model the psychological states and physical activities (quality of life) of cancer patients during treatment.
Acknowledgement
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement H2020-MSCA-IF2020-101024693. The authors would like to thank Dr. Pascale Gaudet, from the Swiss Institute of Bioinformatics, and Dr. Natasha Lino, from the Applied Artificial Intelligence Lab (UFPB), for the reviews and comments about this paper.
Online resources
The supplementary online material supports the understanding and replication of this study. It is composed of: (1) an OWL file representing the ontology without the individuals; (2) an OWL file representing the ontology with the individuals; (3) an Excel datasheet with the individuals; (4) rules in the JSON format to upload the individuals into the ontology using the Cellfie plugin; and (5) a text file with the SQWRL queries.
