Abstract
Keeping track of data semantics and data changes in databases is essential to support retrospective studies and the reproducibility of longitudinal clinical analyses, because it prevents false conclusions from being drawn from outdated data. A knowledge model combined with a temporal model plays an essential role in organizing the data and improving query expressiveness across time and multiple institutions. This paper presents a modelling framework for temporal relational databases that uses an ontology to derive a shareable and interoperable data model. The framework is based on Ontorelα, an ontology-driven database modelling approach, and the Unified Historicization Framework, a temporal database modelling approach. The method was applied to hospital organizational structures to show the impact of tracking organizational changes on data quality assessment, healthcare activities and data access rights. The paper demonstrates the usefulness of an ontology to provide a formal, interoperable, and reusable definition of entities and their relationships, as well as the adequacy of the temporal database to store, trace, and query data over time.
Introduction
Today, no health system can guarantee a high level of efficiency without storing real world data (RWD).1 RWD plays an increasing role in healthcare systems to foster clinical trials, observational studies or large-scale population monitoring, such as during the SARS-CoV-2 pandemic.2 Thus, modelling data from different organizations is required to enable appropriate and efficient data reuse. Such reuse has to cope with a high level of technical and semantic heterogeneity in the source systems.3,4
For handling technical heterogeneity, different approaches have been proposed, such as mediation architectures like the Fast Healthcare Interoperability Resources (FHIR) layer5,6 or common data warehouse models like i2b2 7 and the Observational Health Data Sciences and Informatics program with the Observational Medical Outcomes Partnership (OMOP).8 For semantic heterogeneity, many challenges remain, especially in modelling and tracking the evolution of data. In many heterogeneous environments, a knowledge model is needed to decipher the source structure, to identify the relevant data elements to extract and to combine them. For these purposes, an ontology can be used as a knowledge model to share and formalize relevant entities and their relations. In healthcare, biomedical ontologies have been used to formalize biomedical concepts and to support data integration in many projects.9–11 However, challenges remain in using ontologies over existing databases and in uniformly tracking data evolution across time.12,13
This paper presents an ontology-based data modelling framework for temporal relational databases to derive a shareable and interoperable data model. We then demonstrate the advantage of using an ontology and a temporal model through a use case about changes in hospital organizational structures across time. Hospital organizational changes occur frequently, for example, when services and departments are merged, when new departments are created or hospitals are united, or when a medicine department that included a unit for children and a unit for adults at one point becomes restricted to adult care only. Keeping track of these changes in the databases is essential to support retrospective studies and the reproducibility of longitudinal clinical analysis, as well as to prevent false conclusions from being drawn and to grant or deny access to the data appropriately.
Background
Many approaches that convert an ontology to a data model have been proposed, but limitations have been identified in the coverage of ontological constructs and the correctness of the generated relational databases, as demonstrated in the survey.14 First, including temporal concepts in an ontology requires substantial design work, decreases reasoning efficiency and may cause semantic errors due to the complexity of axioms.9,15 Second, designing a data model using an ontology requires formal algorithms to guarantee semantic conservation.11,16 Third, modelling and implementing a temporal data model in the database require advanced understanding of temporal representation, programming skills and design tools.17,18
Several ontologies of time have been defined, including the OWL-Time ontology developed by the World Wide Web Consortium,19 the SWRL Temporal Ontology 20 and SOWL,21 the Clinical Narrative Temporal Relation Ontology 22 and TimeML.23 However, a cross-disciplinary effort is required to harmonize the definition of temporal concepts9,13 and make them work in a practical context with current databases. In temporal databases, several models and temporal query languages have been defined since 1970 in order to simplify time management.17 Still, many models have limited scope and carry interoperability limitations.24 Also, some models offer modelling guidelines as well as constraints regarding temporal representation and querying.25,26 However, these models are still implemented manually and depend on the application domain.
Given the wide perimeter to be supported in the context of healthcare data reuse for research, management, and care delivery, it becomes essential to have a temporal model that stands on its own and provides intrinsically sound temporal concepts and computation independently of a specific domain. A more recent work,27 the Unified Historicization Framework (UHF), generalizes temporal models and combines them to enable automation of the modelling and to facilitate query construction. Moreover, because errors occur when feeding the data into the system, and because the real-world system modelled by the database may behave inconsistently, standard query operations are required to evaluate data quality and avoid misinterpretation in temporal data analysis. Especially in healthcare information systems, temporal anomalies in relational databases may be indicative of the poor quality of the temporal data. In a recent work,28 the authors defined generic queries to retrieve the anomalous tuples from a dataset or to label the tuples in a dataset with an indication of whether they are anomalous or not.
Objective and contributions
The aim of the paper is to present a modelling framework for building a temporal relational database using ontologies, with data reuse purposes such as data quality assessment, data privacy applications and epidemiological studies in mind. To illustrate our framework and show its value, we chose the case of hospital organizational structures in France, using a dataset from two hospitals. Finally, to evaluate the method, we created validation queries over the chosen dataset that detected temporal anomalies in the hierarchy of the hospital structure across time (see Table 1). The main contributions of this paper are (1) the integration of two modelling approaches: modelling a relational database from an ontology and modelling a temporal database; and (2) the evaluation of the approach on hospital organizational structures to highlight the importance of ontologies and temporal databases for data quality assessment.
Method
Building the ontological temporal relational database
The framework is built by combining two modelling approaches: Ontorelα, the ontology to relational conversion method 16 and UHF, the unified historicization framework for temporal data modelling. 27
The integration of Ontorelα and UHF is based on a common data model, the relational data model. Ontorelα provides a uniform data model in which the semantics is made explicit by the structure, and UHF provides a relational data model for a temporal database ensuring sound temporal semantics, data integrity and query expressiveness. Most importantly, the modelling framework uses an automated process that is independent of the domain and context, so it is applicable to many use cases and applications.
Ontorelα is an automated method that defines uniform and consistent conversion rules from ontology constructs (classes, properties, axioms, cardinality restrictions, data types and annotations) to relational constructs (relations, candidate keys, referential keys, general constraints). Briefly, a class, a datatype or a property (object property or data property) is converted into a relation with a candidate key. An axiom is converted to a join relation with referential keys defined towards the linked entities. A cardinality restriction is converted to a constraint that checks the number of participations of each individual in the axiom. Finally, an annotation is used to document the database and to provide multiple access interfaces in different languages using views. Thus, from an ontology, a shareable and interoperable relational database (OntoRel) is obtained. The OntoRel can be implemented in a relational database system to efficiently collect, store, and retrieve data with the ontological annotations for various applications. For formal details about the conversion rules and process, see reference 16.
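To make the first two conversion rules concrete, the following is a minimal sketch, not the Ontorelα implementation. It turns a class into a relation with a candidate key and an axiom into a join relation with referential keys. The `_iid` key suffix and the two class identifiers follow the example shown later in the paper; the table naming scheme, the UUID key type and all Python names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntoClass:
    iri: str    # short class identifier, e.g. "HORG-FR_000012"
    label: str  # annotation, emitted as an SQL comment


@dataclass(frozen=True)
class Axiom:
    subject: OntoClass
    prop: str        # object property, e.g. "hasUnit"
    target: OntoClass


def key_of(c: OntoClass) -> str:
    """Name of the candidate key attribute of a class relation."""
    return f"{c.iri}_iid"


def class_to_table(c: OntoClass) -> str:
    """A class becomes a relation holding only its candidate key."""
    return (
        f"-- {c.label}\n"
        f'CREATE TABLE "{c.iri}" (\n'
        f'  "{key_of(c)}" UUID PRIMARY KEY\n'  # key type is an assumption
        f");"
    )


def axiom_to_table(a: Axiom) -> str:
    """An axiom becomes a join relation referencing both linked classes."""
    name = f"{a.subject.iri}_{a.prop}_{a.target.iri}"  # hypothetical naming
    s, t = key_of(a.subject), key_of(a.target)
    return (
        f"-- {a.subject.label} {a.prop} some {a.target.label}\n"
        f'CREATE TABLE "{name}" (\n'
        f'  "{s}" UUID NOT NULL REFERENCES "{a.subject.iri}" ("{s}"),\n'
        f'  "{t}" UUID NOT NULL REFERENCES "{a.target.iri}" ("{t}"),\n'
        f'  PRIMARY KEY ("{s}", "{t}")\n'
        f");"
    )


mau = OntoClass("HORG-FR_000011", "Medico-Administrative unit")
hu = OntoClass("HORG-FR_000012", "Hospitalisation unit")
ddl = [
    class_to_table(mau),
    class_to_table(hu),
    axiom_to_table(Axiom(mau, "hasUnit", hu)),
]
print("\n\n".join(ddl))
```

Cardinality restrictions and annotation-driven views are left out of the sketch; in the method they are generated as constraints and views respectively.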
UHF defines deterministic conversion rules based on fundamental relational and temporal representation. The temporal model uses a discrete time model based on time points and time intervals [b,e] ([begin time point,end time point]). The temporal attribute used is the valid time. The valid time represents the period when a proposition is true in the modelled reality.29
The temporal schema creation process consists of generating, from each table (Ti), three temporal tables with temporal constraints and a “history” table. The three temporal relations store different time semantics:
• Ti@Vbe (the During table) is the extension of Ti with a valid time period attribute @Vbe = [begin,end] where the begin point and the end point values are known. Ti@Vbe represents propositions that are valid (considered true) during a period of time (the proposition is considered true for each point included in the period).
• Ti@Vbx (the Since table) is the extension of Ti with a valid time begin point @Vbx = [since,ufn] where since is the known begin point of the period and ufn (until further notice) is the unknown end point of a period. Ti@Vbx represents propositions that are valid since and after a point in time.
• Ti@Vxe (the Until table) is the extension of Ti with a valid time end point @Vxe = [faw,until] where faw (from a while) is the unknown begin point and until is the known end point. Ti@Vxe represents propositions that are valid before and until a point in time.
The history table (Ti@history) represents all data modifications of a table over time. The table can be stored or calculated over the union of the three temporal tables using a view.
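The partitioning described above can be illustrated with a small in-memory sketch. It is not UHF itself: the names `Fact`, `partition` and `history` are hypothetical, and an unknown bound (ufn or faw) is represented here by `None`. The point it shows is that the target table is chosen by which bounds are known, so the time semantics is carried by the structure rather than by sentinel values.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    key: str
    begin: Optional[date]  # None: unknown begin point ("faw")
    end: Optional[date]    # None: unknown end point ("ufn")


def partition(facts):
    """Route each fact to the During (Vbe), Since (Vbx) or Until (Vxe) table."""
    tables = {"Vbe": [], "Vbx": [], "Vxe": []}
    for f in facts:
        if f.begin is not None and f.end is not None:
            if f.begin > f.end:
                raise ValueError(f"empty period for {f.key}")
            tables["Vbe"].append((f.key, f.begin, f.end))
        elif f.begin is not None:
            tables["Vbx"].append((f.key, f.begin))
        elif f.end is not None:
            tables["Vxe"].append((f.key, f.end))
        else:
            raise ValueError(f"no known bound for {f.key}")
    return tables


def history(tables):
    """History as the union of the three tables, ordered by key then begin."""
    rows = [(k, b, e) for k, b, e in tables["Vbe"]]
    rows += [(k, b, None) for k, b in tables["Vbx"]]
    rows += [(k, None, e) for k, e in tables["Vxe"]]
    return sorted(rows, key=lambda r: (r[0], r[1] or date.min))


facts = [
    Fact("HU-A", date(2013, 1, 1), date(2018, 7, 31)),  # closed period
    Fact("HU-B", date(2018, 8, 1), None),               # still valid
    Fact("HU-C", None, date(2014, 12, 31)),             # begin not recorded
]
tables = partition(facts)
unit_history = history(tables)
```

In a relational implementation the same union would typically be expressed as a view over the three tables, with the temporal constraints enforced by the database.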
Furthermore, keeping history changes may introduce temporal inconsistencies: redundancy, contradiction, circumlocution, and non-denseness.26 These inconsistencies may happen when attributes in the same relation are modified independently. In UHF, the verification of these inconsistencies is expressed simply and efficiently, either at the time of import to prevent inconsistencies or at the time of quality check to detect temporal anomalies.28
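As an illustration of what such checks look for, here is a minimal sketch over closed day-granularity periods. The working definitions are assumptions chosen for the example and may differ in detail from those of the cited works: redundancy is taken as two rows for the same key with the same value and overlapping periods, contradiction as overlapping periods with different values, and circumlocution as the same value over two adjacent periods that could be stored as one. Non-denseness is omitted because it depends on the coverage expected for each relation.

```python
from datetime import date, timedelta
from itertools import combinations
from typing import NamedTuple


class Row(NamedTuple):
    key: str
    value: str
    begin: date
    end: date  # closed period [begin, end]


def overlaps(a: Row, b: Row) -> bool:
    return a.begin <= b.end and b.begin <= a.end


def meets(a: Row, b: Row) -> bool:
    """True when one period ends the day before the other begins."""
    one_day = timedelta(days=1)
    return a.end + one_day == b.begin or b.end + one_day == a.begin


def anomalies(rows):
    """Label every pair of rows for the same key that is inconsistent."""
    found = []
    for a, b in combinations(rows, 2):
        if a.key != b.key:
            continue
        if overlaps(a, b):
            kind = "redundancy" if a.value == b.value else "contradiction"
            found.append((kind, a, b))
        elif meets(a, b) and a.value == b.value:
            found.append(("circumlocution", a, b))
    return found


rows = [
    Row("FU-1", "nutrition", date(2016, 1, 1), date(2016, 6, 30)),
    Row("FU-1", "nutrition", date(2016, 7, 1), date(2016, 12, 31)),   # adjacent, same value
    Row("FU-2", "cardiology", date(2016, 1, 1), date(2016, 12, 31)),
    Row("FU-2", "nutrition", date(2016, 6, 1), date(2017, 5, 31)),    # overlap, other value
    Row("FU-3", "surgery", date(2016, 1, 1), date(2016, 12, 31)),
    Row("FU-3", "surgery", date(2016, 3, 1), date(2016, 4, 30)),      # overlap, same value
]
kinds = [kind for kind, _, _ in anomalies(rows)]
```

The pairwise loop is quadratic and only meant to show the conditions; in a database the same conditions would be written as constraints or set-based queries.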
Use case of hospital organizational structures in France
The rationale for choosing such an application was twofold. First, many studies in health services research evaluate, among other things, the effectiveness of clinical procedures, healthcare processes and management of healthcare facilities.30 All these studies require fine-grained organizational information rather than hospital-level analysis31,32 to be able to compare outcomes and costs associated with practice patterns and variations in care processes.33,34 Second, there are strict rules surrounding data access, data sharing and data reuse for research purposes. Data access policies are defined at the organization level, at the patient level, or as a combination of both. Many hospitals in Europe have adopted standard data access policies based on the organizational structures. By default, doctors are allowed to access and reuse the data of all patients who have been admitted to their department. At the patient level, “data access by project” is adopted by a data warehouse in the United States 35 and in Europe.36 At the organization level, data access is granted by certified structures or an ethics committee.37 As a result, for a retrospective study, such rules cannot be implemented without a deep understanding of the hospital organizational structures and their relationships across time. Usually, this is largely done with ad hoc methods developed manually.
An ontology for hospital organizational structures
Knowledge models for administrative data, especially for a hospital organizational structure, are rare or very specific. 38 For example, in the biomedical community, the Ontology of Medically Related Social Entities presents only general high-level concepts 39 (e.g., hospital, home care); the Ontology of organizational structures of trauma centres and trauma systems 40 describes the organizational structure of a trauma centre (e.g., paediatric trauma centre); and the Ontologized minimum information about biobank data sharing 38 describes the organizational structure of a biobank. In the semantic web community, we can find the Organization Ontology (ORG) 41 developed as part of the World Wide Web Consortium, which provides a basis to represent generic organizational structure. Thus, to represent hospital organizational structures for multiple institutions all these ontologies need to be tailored.
For the proof-of-concept, we concentrated our study on French hospitals. Thus, we worked with various stakeholders to describe the important entities for organizational structure in French hospitals (see Figure 1). Also, the model is derived from the principles of the French national management accounting for hospitals 42 that is linked to the Program of Medicalization of Information Systems, a legacy system based on the diagnosis-related groups that are used for billing in all French hospitals.43
Mainly, the organizational structure hierarchy has 4 levels with 2 elementary structures: the hospitalization unit (HU) and the functional unit (FU). An HU has an administrative responsibility scope, and an FU has a medical responsibility scope. The other levels are aggregates of HU or FU elements according to different roles (administrative, medical, or mixed).
Figure 1. Conceptual view of organizational structures in French hospitals.
We designed an ontology, the Hospital ORGanization Ontology in France (HORGO-FR), based on the concepts in Figure 1. HORGO-FR was designed using Protégé software.44 Starting from the top entities of the ORG ontology,4 we created new classes to represent the organizational structures in French hospitals, i.e., Hospital is defined as a subclass of ‘org:Organization’, and the units are subclasses of ‘org:OrganizationalUnit’. The hierarchy of units is described using the object properties ‘org:unitOf’ and ‘org:hasUnit’, and the links between two units are described using the ‘linkedTo’ object property. Moreover, some classes were added to represent information about entities, such as a unit identification (described as a ‘unit identifier’ and a ‘unit label’) and the medical speciality of a functional unit.
The reason for designing this ontology was threefold: (i) to provide a shareable definition of the organizational structures of French hospitals; (ii) to support reasoning about these structures, including for consistency checks;45 and (iii) to benefit from the pipeline already developed to translate ontologies into a relational data model.16 It should be noted that Ontorelα and the temporal approach presented here are neutral regarding the content of the ontology used, so the goal was not to focus on a global representation of the structures, but rather to model a correct representation that highlights the importance of the temporal data model in our case study.
A temporal data model for hospital organizational structures
HORGO-FR is used to generate a data model. Then, the data model is extended with temporal constructs to build the temporal database. The OntoRel can be converted to different data models, such as a graph model or a relational model.
Figure 2 shows part of the OntoRel as a graph. The nodes represent classes, the directed edges represent subclass-superclass relations between the classes, and the undirected edges represent an axiom, e.g., ‘Healthcare division’ ‘hasUnit’ some ‘Functional unit’.
Figure 2. Illustration of OntoRel of HORGO-FR as a graph.
Figure 3 shows part of the OntoRel as a relational schema. The rectangles represent tables derived from classes (rectangle with a single line) and tables derived from an axiom (rectangle with a double line). The lines with an arrow represent the referential keys derived from an axiom (an inheritance axiom or object property axiom).
Figure 3. Illustration of OntoRel of HORGO-FR as a relational model.
Then, for each table in the OntoRel, temporal tables are created according to UHF. Figure 4 shows the creation of the temporal tables: (a) shows the “Hospitalisation unit” table with the key attribute HORG-FR_000012_iid (HORG-FR_000012 corresponds to the Internationalized Resource Identifier of the ontology class); (b) shows the table representing the axiom “Medico-Administrative unit hasUnit some Hospitalization Unit” with the key attributes HORG-FR_000011_iid and HORG-FR_000012_iid. The initial table is partitioned into three tables and extended with temporal attributes; finally, a history view is defined to build the history of an entity. The motivation behind the three partitions is that the semantics of the data model must be defined by its structure and not by the values of the data. This modelling decision allows a common interpretation of the query definition and its result. Please refer to reference 27 for more details.
Figure 4. Illustration of the temporal schema creation process of a table: (a) temporal schema for the “Hospitalisation unit” relation, (b) temporal schema for the “Medico-Administrative has Unit Hospitalisation unit”.
Dataset and ethical aspects
The framework was evaluated at a public health establishment with seven affiliated universities that comprises 39 hospitals. Among them, one has an i2b2 clinical data warehouse (CDW) in operation since 2009. The CDW contains almost all the electronic health records data produced in the hospital since 2000,36 resulting in a total of 1,164,525 patients and 2,830,351 encounters as of 2019-01-08. Yet, the data related to the hospital organizational structures and their evolution are not part of the CDW. Thus, the evaluation presented here aims at exploring the possibility of the future integration of the clinical and the organizational structure data. The approach was evaluated using HORGO-FR and the data of organizational structures extracted from an institutional database 46 through the IBM InfoSphere MDM(TM) solution, which holds the organizational structure data. The use case was conducted under the methodological guidelines MR-005 of the French national data privacy authority (CNIL, Délibération n° 2018-256 du 7 juin 2018). All scripts are available on GitHub.
Results
General assessment
First, the ontology-based temporal database for the hospital organizational structure was generated from HORGO-FR. HORGO-FR contains 124 definitions, including 25 classes, 3 object properties, 4 data properties and 68 logical axioms. The consistency checking was performed using HermiT and Pellet reasoners installed in Protégé. Protégé is an open source and multi-platform ontology editor. HermiT and Pellet are publicly available reasoners for ontologies used to validate the consistency of axioms. The ontology-based temporal database was implemented in a PostgreSQL database. PostgreSQL is an open source and multi-platform database that fully supports temporal intervals and operators using powerful indexes required for temporal constraints and queries. 47
Second, the source data was acquired incrementally using transformation rules developed with SQL functions. We had one file of organizational structures per month (from 2013 to 2019). Thus, to feed the database, the transformation rules were applied to each file. The temporal relational database contains 656 relational constructs, including 153 temporal tables, 70 history views, 343 constraints and 80 procedures for data verification and data modification. As of 5 January 2019, the temporal relational database contained a total of 2985 different unit instances and 3529 different relationship instances for two hospitals, X and Y.
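The incremental loading of monthly files can be pictured as collapsing successive snapshots into valid-time periods. The sketch below is only an illustration of that idea and does not reproduce the transformation rules, which were written as SQL functions. The function name, the input shape (one set of unit identifiers per month) and the choice of the last month in which a unit was seen as its end point are all assumptions.

```python
from datetime import date


def snapshots_to_periods(snapshots):
    """Collapse monthly snapshots into (unit, begin, end) periods.

    snapshots maps the month of a file to the set of units listed in it.
    end is None for units still present in the latest file (Since table);
    otherwise end is the last month in which the unit was seen.
    """
    open_since = {}   # unit -> month it was first seen in the current run
    periods = []
    previous = None
    for month in sorted(snapshots):
        current = snapshots[month]
        for unit in list(open_since):
            if unit not in current:            # unit disappeared: close it
                periods.append((unit, open_since.pop(unit), previous))
        for unit in current:
            open_since.setdefault(unit, month)  # unit appeared: open it
        previous = month
    periods += [(unit, begin, None) for unit, begin in open_since.items()]
    return sorted(periods, key=lambda p: (p[0], p[1]))


snapshots = {
    date(2018, 1, 1): {"HU-A", "HU-B"},
    date(2018, 2, 1): {"HU-A"},
    date(2018, 3, 1): {"HU-A", "HU-B"},
}
periods = snapshots_to_periods(snapshots)
```

With monthly files the true change date inside a month is unknown, so any such derivation is accurate only to the granularity of the snapshots.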
Data quality assessment using ontology axioms and temporal features
The data was validated using a set of 21 validation queries. The validation queries (VQ) are defined according to the ontology axioms with the following object properties: hasUnit, unitOf and linkedTo. Each validation query tests two properties: (A) the cardinality restriction over the unit lifespan, and (B) the temporal inclusion between related units. The temporal cardinality test verifies whether a super-unit has fewer than the minimum or more than the maximum number of sub-units at any point in time during its lifespan. The temporal inclusion test verifies whether a sub-unit valid time period is included in the super-unit valid time period using Allen operators (equals, during, begins and finishes).
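The two tests can be sketched as follows over closed periods at day granularity. This is an illustrative reading of the tests, not the validation queries themselves, which run in the database; the function names and the example dates are hypothetical. Inclusion of a closed period in another is equivalent to the sub-unit period being related to the super-unit period by one of the four operators listed above.

```python
from datetime import date, timedelta


def included(sub, sup):
    """Temporal inclusion: sub equals, is during, begins or finishes sup."""
    return sup[0] <= sub[0] and sub[1] <= sup[1]


def cardinality_violations(sup, subs, min_n=1, max_n=None):
    """Days of the super-unit lifespan on which the number of valid
    sub-units is below min_n or above max_n."""
    bad = []
    day = sup[0]
    while day <= sup[1]:
        n = sum(1 for begin, end in subs if begin <= day <= end)
        if n < min_n or (max_n is not None and n > max_n):
            bad.append(day)
        day += timedelta(days=1)
    return bad


super_unit = (date(2018, 1, 1), date(2018, 1, 10))
sub_units = [
    (date(2018, 1, 1), date(2018, 1, 5)),
    (date(2018, 1, 8), date(2018, 1, 10)),
]
gaps = cardinality_violations(super_unit, sub_units, min_n=1)
late_sub = (date(2018, 1, 9), date(2018, 1, 12))  # outlives its super-unit
```

Here `gaps` lists the days on which the super-unit has no sub-unit, and `included(late_sub, super_unit)` is false because the sub-unit ends after the super-unit.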
Table 1. Results of the validation queries.
These tests demonstrated the usefulness of the ontology as a base model to formulate validation queries and the usefulness of a temporal database to verify the constraints over time. The detected anomalies were used to notify the persons in charge of the reorganization and the data access definition, as well as the researchers, so that they can take these anomalies into consideration while querying the data. Finally, the implementation of validation queries in the database could be an interesting way to detect potential errors generated by a reorganization of the hospital structures.
Querying organizational structure using ontology axioms and temporal features
We have evaluated the current approach from two perspectives: data reuse for clinical epidemiology and data access rights. Using the ontology-based temporal database, the analysis of organizational structures and healthcare activities over a long period of time and across multiple hospitals was straightforward. The ontology and the temporal relational database allowed us to construct semantic queries to extract the hierarchical structure over different periods.
The epidemiological use case
This use case focuses on the evolution of the organizational structures of units related to the nutrition specialty at hospitals X and Y. The results are illustrated in Figure 5.
Figure 5. Evolution of healthcare activity and structures related to obese patients in hospitals X and Y. (A) Relative proportion of obese patients per month computed as the ratio between the number of obese patients seen in hospital X (resp. hospital Y) and the total number of obese patients. (B) Evolution of the organizational structure related to the nutrition service at hospital X. (C) Evolution of the organizational structure related to the nutrition service at hospital Y.
First, data on the healthcare activities were extracted from the French hospitalization summary discharges database. The query used the discharge summary data with the E66 (“obesity and overweight”) ICD-10 code between 2013-01-01 and 2019-01-01. Figure 5(A) presents the relative proportion of obese patients per month computed as the ratio between the number of obese patients seen in hospital X (resp. hospital Y) and the total number of obese patients. The results contain respectively 16,899 and 26,212 hospitalization stays from hospital X and hospital Y with the E66 code, corresponding to 16,921 and 8,087 distinct patients (resp.). Six months after the creation of nutrition units (black transversal line), we observed a temporal shift of the relative proportion of patients at hospital X (from 9% to 12%). At hospital Y, the number remained stable.
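The ratio plotted in Figure 5(A) can be computed along the following lines. This is a minimal sketch under stated assumptions: the input is a hypothetical list of (hospital, month, patient identifier) tuples already filtered on the E66 code, and the denominator is taken as the sum of the distinct patients counted at each hospital for that month, which is one possible reading of “the total number of obese patients”.

```python
from collections import Counter, defaultdict


def monthly_share(stays):
    """Share of distinct patients per (month, hospital) among all hospitals."""
    patients = defaultdict(set)
    for hospital, month, patient_id in stays:
        patients[(month, hospital)].add(patient_id)  # several stays count once
    totals = Counter()
    for (month, _hospital), ids in patients.items():
        totals[month] += len(ids)
    return {
        (month, hospital): len(ids) / totals[month]
        for (month, hospital), ids in patients.items()
    }


stays = [
    ("X", "2016-07", "p1"),
    ("X", "2016-07", "p1"),  # second stay of the same patient
    ("Y", "2016-07", "p2"),
    ("Y", "2016-07", "p3"),
    ("Y", "2016-07", "p4"),
]
shares = monthly_share(stays)
```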
Second, the organizational changes were extracted from the ontology-based temporal database using a temporal query over the hierarchy of units. The query builds the hierarchical structure from 2013-01-01 to 2019-01-01 of all functional units having nutrition as the medical specialty for both hospitals. The hierarchy is built according to the ontology axioms using the “unitOf” property. Figure 5(B) presents the evolution of the organizational structure related to nutrition in hospital X. Figure 5(C) presents the evolution of the organizational structure related to nutrition in hospital Y. It should be noted that, while it may seem that new units are also appearing at hospital Y at the beginning of 2018, UTN units (Unité Transversale de Nutrition in French) are created to improve the quality of care by sharing medical resources between the nutrition specialists and other specialties of hospital Y. As they do not represent hospitalizations, their creation was not accompanied by increased activity.
To summarize, the results show a significant increase in the prevalence of obesity among hospital X patients after July 2016. Thanks to the temporalized organizational data and the ontology, this phenomenon can be easily interpreted with regard to changes in the organization: new units specialized in nutrition opened at hospital X in July 2016 and, at the same time, changes were made in hospital Y (Figure 5(C)), but in the latter, the activity pertaining to obesity remained stable despite the changes in the structure. Indeed, these changes correspond to the transfer of the head of the department of nutrition from hospital Y to hospital X in July 2016. This example demonstrates that: (1) data interpretation requires that the organizational structure history be integrated in the CDW; and (2) storing the data in a temporal database provides an adequate solution for automated queries that consider time aspects, e.g., search for all units that exhibit a significant increase in the prevalence of some ICD code and any organizational change for a certain period.
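A temporal query over the hierarchy can be pictured as walking the unitOf links that are valid at a given point in time. The following sketch is a simplification and not the query used in the study: the function name, the unit labels and the dates are hypothetical, an end point of `None` stands for “until further notice”, and each unit is assumed to have at most one super-unit at any time.

```python
from datetime import date


def ancestors_at(unit, when, unit_of):
    """Chain of super-units of `unit` valid on day `when`.

    unit_of is a list of (sub_unit, super_unit, begin, end) rows with a
    closed valid period; end is None when the link is still valid.
    """
    chain, seen, current = [], {unit}, unit
    while True:
        parents = [
            sup for sub, sup, begin, end in unit_of
            if sub == current and begin <= when and (end is None or when <= end)
        ]
        if not parents:
            return chain
        parent = parents[0]  # assumption: a single super-unit at a time
        if parent in seen:
            raise ValueError(f"cycle in unitOf links at {parent}")
        chain.append(parent)
        seen.add(parent)
        current = parent


unit_of = [
    ("FU nutrition", "Service A", date(2013, 1, 1), date(2016, 6, 30)),
    ("FU nutrition", "Service B", date(2016, 7, 1), None),
    ("Service A", "Division 1", date(2013, 1, 1), None),
    ("Service B", "Division 1", date(2016, 7, 1), None),
]
before = ancestors_at("FU nutrition", date(2015, 1, 1), unit_of)
after = ancestors_at("FU nutrition", date(2017, 1, 1), unit_of)
```

Evaluating the same function at successive dates yields the evolution of the hierarchy; in SQL the walk would typically be a recursive query restricted by the valid-time condition.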
The access right use case
This use case reports on the organizational changes in cardiovascular surgery and reanimation divisions at hospital X over 7 years and their impact on access rights. The results were obtained as follows (see Figure 6).
Figure 6. Evolution of activity related to four reanimation units in hospital X and the cardiology and the anaesthesia service and department structure.
First, the number of stays in the four specific hospitalization units (HU) was calculated based on the CDW: HU 339 (cardiovascular surgical reanimation unit), HU 681 (general surgical reanimation unit), HU 331 (surgical reanimation unit for transplantation) and HU 682 (surgical reanimation unit for transplantation). According to the administrative staff responsible for maintaining the organizational structure data, two new HUs were opened (HU 681 and HU 682) and two HUs were closed (HU 339 and HU 331). The query results reveal a potential transfer of activities between HUs from 2018-08-01 to 2019-02-01. However, the number of hospitalizations in HU 331 and HU 339 was still changing several months after the structure changes. This can indicate that the data about this structure was not updated in the electronic health records. As a result, the data access rights were not changed accordingly, causing a mismatch between administrative decisions and medical activities.
Discussion
Recently developed guidelines for improved data reusability, 37 which are referred to as the FAIR Data Principles (Findability, Accessibility, Interoperability, and Reusability), put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This work is a proof of concept of the adoption of such principles. Beyond data collection, data stewardship includes the notion of ‘long-term care’ of the data. Thus, the temporalization of the data and the metadata is essential to track the data evolution. The temporal database is used to store, to trace, and to find all data changes across time. The ontology provides a formal, interoperable and reusable definition of entities and their relationships. Then, combining the two models by generating an ontological-relational temporal schema offers a coherent axiomatic structure for temporal data aligned with the entities and axioms from the ontology and temporal validation rules using temporal constraints. Our use case highlights the impact of the evolution of structures on the secondary use of healthcare data. To our knowledge, it is the first time a method linking ontology and temporal database has been implemented.
Clinical significance
The medical informatics community pays limited attention to the issues raised by the representation of organizational structures in databases. The absence of comprehensive descriptions of hospital organizational structures has been previously acknowledged as a limitation in health services research.48 However, we believe this topic to be crucial for analysing “big longitudinal data” repositories. Some terminologies integrate entities about the organizational structures, e.g., the Medical Subject Headings (MeSH) defines “health facilities” as “institutions which provide medical or health-related services”. Regardless, the MeSH hierarchy reflects neither the granularity of the entities nor the hierarchy constraints needed to allow reasoning, e.g., “Hospitals” and “Hospital Units” are siblings in MeSH. Therefore, a more formal representation is needed. The generalization of the ontology to all hospitals in France is guaranteed by national accounting rules.42,43 Beyond France, evidence from the scientific literature suggests potential for broader applications, as seen in countries like the Netherlands,49 Denmark,50 and the United States.51 Nonetheless, the level of detail in existing publications about hospital structures is basic, with a predominant focus on clinical variables. A key benefit of our results is the potential to conduct validation studies using existing databases. Our approach could facilitate the extension or adaptation of HORGO to other countries. The first use case demonstrated that analysing the number of obese patients according to the evolution of organizational structures can bring relevant results. The second use case demonstrated the difficulty of implementing data access policies across time without deep investigation of the reorganization of the structures. A holistic approach could combine the organizational level and the patient level for data access policies.
According to Collins et al., 52 the data access policies of healthcare organizations could indirectly impact the analysis of data contained in patient health records, and consequences could reach beyond a specific institution. Yet, this approach is possible if we integrate an ontological and temporal description of the hospital organizational structure in a CDW with the consent registries. Thus, our method could make data access management easier, better controlled, and automated. This ensures that the rules governing the data access rights are rigorously followed even when the hospital organization changes.
Technical significance
The framework takes advantage of the extensibility of shared ontologies, the relational database maturity to define temporal queries and tools to automate the construction process. Structuring the data according to the ontology and storing it in a temporal relational database allowed the definition of temporal queries that reflected the organizational structure hierarchy.
The choice of an ontology is important. The ORG ontology developed by the W3C could be used for interoperability with the semantic web data, but it is not aligned with an upper-level ontology like the Basic Formal Ontology (BFO). This may restrict its ability to be assembled with biomedical ontologies, as many of them are based on the BFO. However, it is important to note that since no assumptions were made on the choice of an ontology, no intrinsic design choices would preclude the method from being used with a specific ontology or even being applied in other domains.
The choice of a temporal model is also important. Currently, there is no consensus on adding temporal concepts to an ontological model. However, there is a consensual temporal model in the database community that defines uniform structure patterns and constraints.29 Therefore, a temporal relational database is a more adequate system for storing data evolution over time. In our method, the temporal schema is implemented in a relational database and uses existing temporal features. Moreover, the temporal model represents three categories of valid time to handle future indeterminacy, period determinacy and past indeterminacy. This representation can be easily extended by adding another timeline, such as transaction time, without impacting the current structure.26 Finally, at this stage of the project, the data acquisition and the data exploration phases are still done manually. However, this limitation could be alleviated by using navigation tools and query builder applications that fully leverage this new approach to benefit from the full semantics while helping the user find the needed construct faster (e.g.,53,54).
Perspectives
The work presented in this article is part of three main convergent initiatives. First, extending HORGO to include structures from different countries, low-level units such as operating rooms and beds, as well as healthcare providers’ roles and functions to allow a wider variety of clinical studies across countries. Second, integrating this method into a mediation architecture and into the data warehouse design process. A fully integrated system, including a user-friendly interface for healthcare data exploration using ontologies, is under development. Third, integrating an automated data quality assessment plan for the routine care data generated in hospitals. Previous studies concluded that data quality control was crucial when data were integrated,55 and that several temporal events, which remained unrecorded in the CDW until now, might have an impact on data reuse.56
However, this requires investment in the development of ontologies and new tools to facilitate their definition and deployment. Moreover, an investment in training database designers and users on temporal concepts is a must to generate a better understanding of the advantages and the power of temporal queries.
Conclusion
Building a temporal database requires a knowledge model to formalize relevant entities and their relationships and an advanced understanding of temporal representation and interpretation. This paper presents the proof of concept of a framework that generates a temporal relational database from an ontology, the ontological-relational schema. The use of a formal ontological model and the resulting temporal schema makes it possible (1) to considerably improve the evaluation of the quality of the data using the defined axioms; and (2) to detect problematic situations much more systematically. The intended use of this approach is to facilitate and extend data reuse, in particular for data quality assessment and longitudinal studies in multisource projects.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Ministère de l’Économie et de l’Innovation – Québec.
