Abstract
Objective
Data-as-a-product (DaaP) treats data as a marketable asset by applying product management principles throughout the data life cycle. Despite the high value of healthcare data, poor data quality hinders the effective implementation of DaaP in the healthcare industry. Robust data quality assessment is necessary to ensure that data products meet stakeholder expectations. This study aims to develop a healthcare data quality indicator framework (HDQIF) combined with the perspective of DaaP, explore the interrelationships among various indicators within HDQIF, and apply the HDQIF in assessing healthcare data quality.
Methods
A three-staged hybrid method is adopted. A specialized HDQIF is initially constructed through grounded theory and revised by Delphi consultation. Then, the HDQIF is investigated to identify key indicators and disentangle intricate interrelationships among indicators using a Decision-Making Trial and Evaluation Laboratory and Adversarial Interpretive Structure Modeling (DEMATEL-AISM) method. Afterward, the HDQIF is applied to quantitatively assess healthcare data quality using the analytic network process and fuzzy comprehensive evaluation (ANP-FCE) approach, with a case study demonstrating its practical application.
Results
The HDQIF was established with 16 unique indicators that comprehensively captured both established and new aspects of healthcare data quality. The DEMATEL-AISM analysis produced a four-quadrant influential relation map that categorized the 16 indicators and identified the crucial ones, together with a nine-level topological hierarchical structure model that hierarchized the indicators and disentangled their interrelationships. The application of the ANP-FCE approach validated the framework's capacity to quantitatively assess healthcare data quality, with a case study confirming the practicability of the HDQIF.
Conclusions
The HDQIF offers a consolidated framework to support fit-for-use understandings of healthcare data quality within the DaaP context. Our findings provide several insights for healthcare organizations to improve healthcare data quality. Future work exploring context-driven adaptations of the HDQIF to facilitate the assessment of various types of healthcare data products is needed.
Introduction
Data-as-a-product (DaaP) is a methodology where data are managed as a unique asset and treated as a marketable product by applying product management principles to the data life cycle.1,2 In the healthcare industry, both practitioners and academics increasingly regard healthcare data as an invaluable asset and product characterized as additive, non-depletive, and replicable. 3 A practical example is the UK Biobank, which not only collects and stores biological information and health data from half a million UK participants but also provides researchers with access to biomedical research data and data analysis. 4 As healthcare data originate from diverse sources and exist in various formats and states, 5 the inherent complexity of healthcare data amplifies the potential value of data. The necessity to contextualize DaaP in the healthcare industry has motivated the emergence of Healthcare-DaaP. With healthcare data accumulating rapidly, Healthcare-DaaP has spurred healthcare organizations to prioritize effective data management and leverage healthcare data to harness greater economic value by sharing or selling data.
While the value of healthcare data has been widely recognized, low-quality data impede data-driven health research and decision-making5–7 and may thus disadvantage the effective practice of Healthcare-DaaP. Healthcare data quality is vulnerable to various issues introduced at different stages across the data life cycle, including data generation, data transformation, data reuse, and post-reuse data quality reporting. 8 Initially, healthcare data are generated from various data sources, including clinical trials, real-world studies, public health surveys, and genetic tests, 9 and stored in healthcare information technology systems. Raw data require transformation to be structured and stored in repositories. Storage infrastructures support the secure and scalable retention of data and enable reuse for downstream data analysis. Integrating data analytics unlocks data value and allows organizations to optimize operations and enhance business performance. 10 While data have been identified as a de facto asset with huge economic potential, attention should be paid to data quality issues that may compromise the utility and value realization of data.
Data quality is a multidimensional concept without a one-size-fits-all definition. 11 Wang and Strong 12 conceptualized data quality as "fit-for-use" and proposed a foundational framework that underpinned data quality studies across various fields. Inheriting and extending this viewpoint, research on healthcare data quality can be roughly categorized into two perspectives. One is the purpose-dependent perspective, where data quality reflects adequate fitness for use by stakeholders to achieve predetermined goals 13 ; the other is the characteristic-dependent perspective, where data quality comprises dimensions or indicators that reflect specific data quality requirements.14,15 Both perspectives appear to be dynamically integrated within the paradigm of DaaP, as DaaP not only prioritizes aligning data products with diverse stakeholder needs but also promotes data quality assessment to ensure that data products evolve in response to consumer feedback and business goals. However, assessing healthcare data quality is far from trivial. Critical difficulties in healthcare data quality assessment concern which indicators characterize data quality, which of them should be assessed, and what assessment methods are appropriate. 16
Considerable progress has been made toward a systematic framework with diverse indicators to characterize healthcare data quality, such as completeness, correctness, concordance, plausibility, and currency.5,17,18 The constructs of existing healthcare data quality frameworks can be bifurcated under the concept of “fit-for-use.” Some are object-oriented frameworks for diverse data sources such as electronic health record (EHR), electronic medical record (EMR), and wearable-device data.18–20 Indicators within these frameworks emphasize intrinsic data quality determined by inherent attributes independent of the context in which the data are used. Others are purpose oriented, designed around intended data use, such as evaluating data for secondary use, assessing data reuse for clinical research, and facilitating high-quality rare disease registries.5,18,21 In contrast to the intrinsic perspective, these frameworks highlight contextual data quality, which depends on the needs of data consumers who select and utilize data products. 22
Despite the recognized multidimensionality, inconsistencies in terminology pose challenges to applying these frameworks to assess data quality. This motivates the construction of a new healthcare data quality indicator framework (HDQIF) that integrates prior indicators with newly emerged indicators within the context of Healthcare-DaaP to facilitate data quality assessment. Moreover, the complex nature of healthcare data has left the interrelationships among data quality indicators under-explored. Effective healthcare data management needs to disentangle the intricate interrelationships among data quality indicators and identify the critical ones, yet few studies have endeavored to explore these interrelationships. An emphasis on the interrelationships of indicators within the HDQIF is believed to be pivotal for deriving tailored strategies for improving data quality. Overall, the study aims to address the following research questions to contribute to the existing knowledge and bridge the gap:
▪ What indicators should be included in developing the HDQIF combined with the perspective of Healthcare-DaaP?
▪ Which indicators are critical to healthcare data quality, and what are the interrelationships among all indicators within the HDQIF?
▪ How can the HDQIF be applied to assess data quality? What managerial insights for improving healthcare data quality can be derived from the HDQIF?
Methods
This study employs a hybrid methodology combining qualitative and quantitative approaches. It was conducted from September 2024 to February 2025 in Zhejiang Province, China. To address the aforementioned research objective, the study intended to construct an HDQIF that characterizes data quality with a set of indicators, to explore the indicators in their importance and interrelationships, and to apply the HDQIF to quantitatively assess data quality. Three research phases were organized into the methodology, whose full roadmap was delineated in Figure 1.

Figure 1. Roadmap of methodology.
Phase 1: Development of the HDQIF. Phase 1 was conducted from September 2024 to January 2025. A literature analysis initially identified existing healthcare data quality frameworks following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Grounded theory, with its three-staged coding of initial, intermediate, and advanced coding, was performed to extract indicators for building the HDQIF. In-depth expert interviews and two rounds of Delphi consultation were launched to refine the preliminary indicators, engaging 12 local experts drawn from two healthcare data technology companies, a certified data broker company, and a university. As a result, the HDQIF was established with diverse data quality indicators.
Phase 2: Exploration of the HDQIF. Conducted in January 2025, Phase 2 aimed to identify key indicators and disentangle the interrelationships among indicators within the HDQIF using the DEMATEL-AISM method. DEMATEL is capable of clarifying the interrelationships among factors within a system by structuring their pairwise influence. 23 To quantify the qualitative linguistic description of the term "influence," a fuzzy linguistic scale was adopted, with terms represented by triangular fuzzy numbers (TrFNs), to depict the pairwise influence relationships in DEMATEL. The AISM was applied to generate a multi-level topological hierarchical structure model in which significant interrelationships among indicators were retained while noncritical ones were eliminated.
Phase 3: Application of the HDQIF. Phase 3 applied the HDQIF to quantitatively assess healthcare data quality through a multi-criteria decision-making (MCDM) approach. MCDM can serve as a novel exploratory approach to assessing data quality by integrating multiple quality indicators and expert judgments. 24 The total influence matrix from DEMATEL defined indicator dependencies and served as input for the analytic network process (ANP) to assign weights. The fuzzy comprehensive evaluation (FCE) method combined with ANP was applied to assess healthcare data quality. A case study on the Chinese healthcare big-data company ZJSMH Co., Ltd was conducted in February 2025 to demonstrate the practical application of the HDQIF.
All research procedures were approved by the Ethical Review Organization at the Management School of Hangzhou Dianzi University (approved on 2 July 2024). Written informed consent was obtained from all individual experts and professional participants involved in the three-staged research investigations. Prior to their participation, all participants were fully briefed on the research purpose and potential risks (if any) to ensure voluntary and informed involvement.
Development of the HDQIF
Extracting preliminary indicators for HDQIF
To identify existing healthcare data quality frameworks, a literature analysis was conducted following the PRISMA guidelines. 25 The databases of PubMed and Web of Science were selected for retrieval of relevant papers. The search was conducted in December 2024 and covered the timeframe from January 2012 to December 2024. The timeframe started in January 2012 because the first systematic data quality framework focusing on the healthcare setting was proposed by Weiskopf and Weng. 18
To ensure the comprehensiveness of the analysis, search terms included “indicator,” “domain,” “dimension,” and “framework.” The specific search queries were as follows:
▪ PubMed: ("data quality"[Title/Abstract] OR "data accuracy"[MeSH Terms]) AND “health*"[Title/Abstract] AND ("dimension*"[Title/Abstract] OR "domain*"[Title/Abstract] OR "indicator*"[Title/Abstract] OR "framework"[Title/Abstract]) AND 2012/01/01:2024/12/31[Date – Publication] AND “English" [Language] ▪ Web of Science: TS = ("data quality") AND AB = (health*) AND (AB = (dimension* OR domain* OR indicator* OR framework*))
A total of 2691 papers were retrieved, with 1283 from PubMed and 1408 from Web of Science. In the identification stage, 1025 duplicates (45.52%) were removed, leaving 1666 (54.48%) papers for eligibility screening. Inclusion and exclusion criteria were then applied to scrutinize titles and abstracts. Papers were included for analysis if they met the three criteria listed in Table 1. Specifically, 50 papers were excluded under criterion 1. The remaining 1616 were screened for relevance to healthcare settings based on criterion 2, by independently reviewing titles and abstracts. Among these 1616 papers, 68 were outside healthcare contexts, and 1315 lacked a data quality focus, resulting in 233 papers relevant to healthcare data quality. A final full-text screening based on criterion 3 excluded additional papers, yielding 22 papers for indicator extraction.6,13–15,17–20,26–39 Figure 2 illustrates the literature analysis process following PRISMA.

Figure 2. Process of literature analysis in PRISMA.
Table 1. Inclusion and exclusion criteria for the literature analysis.
Grounded theory, a method commonly used in qualitative research, was applied to analyze the 22 identified papers. Following the grounded theory research paradigm, a three-staged coding of initial coding, intermediate coding, and advanced coding 40 was applied to extract indicators for developing the HDQIF. The coding was performed using NVivo 15. In initial coding, two researchers (R1 and R2) extracted "labels" from the 22 data quality frameworks, including the dimension terminology describing data quality and its definitions. A third researcher (R3) reviewed all labels and removed duplicates to resolve disagreements. As a result, 171 labels were generated in initial coding. During intermediate coding, all 171 labels were grouped into 74 subthemes and further consolidated into 32 major themes based on their similarities and commonalities. In classic grounded theory, the goal of advanced coding is to produce a theory grounded in the collected data with the storyline technique. However, this study did not aim to derive a theory on healthcare data quality, so the objective of advanced coding was adjusted to integrate the major themes into multiple preliminary data quality indicators.
As existing literature characterizes data quality in multidimensional terminology, inconsistencies were observed due to overlapping definitions and synonyms, particularly for the terms "accuracy" and "correctness." For example, Syed et al. 15 interpreted "accuracy" as "the extent to which data reveal the truth about the event being described," while Feder 39 referred to it as "the degree to which the value in the EHR is a true representation of the real-world value." Prior work advises that ambiguous terms such as accuracy, validity, and correctness be avoided because, although common in data quality terminology, they carry wide-ranging and competing interpretations. 6 Alternative terms were therefore considered for these indicators to reduce confusion.
Finally, three researchers independently proposed indicator categorizations and reached consensus through discussion, deriving a preliminary set of 17 indicators. Figure 3 summarizes the derivation of the 17 preliminary indicators by tracing the analytical flow through the three-staged coding. The flow starts with initial coding where 171 labels were extracted from the 22 studies. From these labels, a total of 43 distinct terms to characterize data quality were identified as a priori indicators, which were further consolidated into 28 major themes based on their similarities and commonalities. Due to space constraints, the 74 subthemes are not displayed in Figure 3. Instead, the detailed mapping between 74 subthemes and 28 major themes is documented in Supplementary Table S1.

Figure 3. Analytical flow for deriving the preliminary indicators.
Revising preliminary indicators with the Delphi method to establish the HDQIF
A two-round Delphi consultation was conducted to revise the 17 preliminary indicators, lasting for 6 weeks from December 2024 to January 2025. Twelve experts were engaged in the consultation, including five from ZJSMH Co., Ltd and three from HZGPT Co., Ltd, two healthcare data technology companies; two from ZJBDE Co., Ltd, an officially certified data broker that facilitates data trading; and two faculty members from A University who specialize in data science and digital health. The background of the experts is summarized in Table 2, and the consultations were held through face-to-face and online meetings. Their experience ranged from 3 to 13 years, with an average of approximately 11.3 years. Among the 12 experts, seven were male and five were female; nine held master's degrees or higher, and four had more than 10 years of experience in the healthcare industry.
Table 2. Demographic characteristics of the 12 experts.
The interview topics were carefully designed to concentrate on the 17 preliminary indicators and to gather expert insights on DaaP, including healthcare data governance, data quality management, and data pricing. The interviews at ZJSMH Co., Ltd; HZGPT Co., Ltd; and ZJBDE Co., Ltd lasted approximately 106, 125, and 73 min, respectively, with an average duration of about 101.3 min. All interviews were audio-recorded and transcribed into text for further analysis, and key viewpoints from the experts were documented in each interview. All 17 preliminary indicators were evaluated by the experts through anonymous questionnaires (see Supplementary Appendix SA1) distributed through Wenjuanxing, an online survey platform in China. Each indicator was scored on a 5-point Likert scale. The questionnaire consisted of three sections: expert background information, scores on indicators, and an assessment of expert authority based on judgment basis and familiarity with the content. In each round of consultation, the original data were exported to Excel and then imported into Python for further analysis. The data analysis was conducted using NumPy 41 and Pandas, 42 two third-party Python packages for data preprocessing and numerical computation.
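As a minimal illustration of the round-level computation, the Python sketch below mirrors this NumPy/Pandas pipeline: it derives per-indicator means and coefficients of variation and flags low-consensus indicators against the 0.25 threshold reported in the next section. The score matrix is simulated, and the judgment-basis coefficient Ca is hypothetical, since only Cs and Cr are reported in this study.

```python
import numpy as np
import pandas as pd

# Simulated Likert scores: 12 experts (rows) x 17 preliminary indicators
# (columns); the real responses came from the Wenjuanxing questionnaires.
rng = np.random.default_rng(2024)
scores = pd.DataFrame(
    rng.integers(2, 6, size=(12, 17)),
    columns=[f"ind_{i + 1:02d}" for i in range(17)],
)

# Per-indicator mean and coefficient of variation (CV = sample std / mean).
summary = pd.DataFrame({
    "mean": scores.mean(),
    "cv": scores.std(ddof=1) / scores.mean(),
})

# A CV above 0.25 signals low consensus; such indicators (e.g. "relational
# conformance" in round 1) are singled out for additional expert feedback.
low_consensus = summary[summary["cv"] > 0.25].index.tolist()

# Expert authority: Cr is conventionally the mean of the judgment-basis
# coefficient Ca and the familiarity coefficient Cs. With the reported
# Cs = 0.716 and Cr = 0.734, that convention would imply Ca of about 0.752.
Cs, Ca = 0.716, 0.752          # Ca is inferred and hypothetical
Cr = (Ca + Cs) / 2
print(low_consensus, round(Cr, 3))
```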
Results of the first-round Delphi consultation
In the first round of the Delphi consultation, the experts' familiarity coefficient Cs and the average expert authority coefficient Cr were recorded as 0.716 and 0.734, respectively. Both were greater than 0.70, indicating a moderate level of expertise and ensuring the reliability of the consultation. In Table 3, among the 17 indicators, "relational conformance," "value conformance," and "contextualization" exhibited coefficients of variation above 0.25, implying low consensus. Therefore, additional feedback was collected to address these indicators.
Table 3. Results of the Delphi consultations.
"Value conformance" and "relational conformance" were integrated into a new indicator, "metadata verification." Experts E8, E10, E11, and E12 agreed that both indicators reflect the requirement that data conform to predefined constraints but differ in scope: "value conformance" refers to internal constraints (e.g. data ranges and formats), while "relational conformance" addresses external constraints embedded in the physical data architecture (e.g. a database or data warehouse). 6 Given their shared reliance on metadata-defined rules, both indicators were revised into the distinct major themes of "metadata-driven constraint" and "architecture-driven constraint" and categorized under the indicator "metadata verification." "Contextualization," originally defined as the annotation of acquisition context to support interpretation, 26 was recategorized as a major theme under "interpretability" based on suggestions from experts E1, E2, and E5, who noted the similarity of "contextualization" and "interpretability" in emphasizing data applicability and task relevance.
Additionally, experts E6 and E12 proposed an indicator “scarcity,” arguing that the rarity of diseases might affect the value and sensitivity of clinical-related data. For example, while thalassemia major is locally listed as a rare disease due to extremely low prevalence, minor types of alpha- and beta-thalassemia are not. Expert E6 proposed that clinical data from rare diseases collected in real-world settings were seen as having higher clinical value. Accordingly, “scarcity” was added to the HDQIF to capture the importance of disease-specific rarity and heterogeneity.
Results of the second-round Delphi consultation
A second-round Delphi consultation was conducted after modifying the 17 preliminary indicators. The experts' familiarity coefficient Cs and the average expert authority coefficient Cr were 0.783 and 0.774, respectively, both higher than in the first round. None of the 16 indicators had a coefficient of variation above 0.25. The newly added indicators "metadata verification" and "scarcity" received high mean scores greater than 4.0 and acceptable coefficients of variation. Compared with the first round, 12 indicators showed decreased coefficients of variation, most notably "endurability" (from 0.179 to 0.106) and "relevancy" (from 0.167 to 0.101). However, the mean scores of "accessibility" and "uniqueness" declined to 3.250 and 3.083, respectively. Further feedback indicated that despite the lower scores, both were still regarded as essential aspects of data quality. Therefore, "accessibility" and "uniqueness" were retained in the HDQIF. As shown in Table 4, the final HDQIF comprises 16 indicators and 29 revised major themes.
Table 4. The HDQIF.
Exploration of the HDQIF
Identifying key indicators within HDQIF
After organizing the 16 indicators into the HDQIF, the pairwise influence relationships among indicators are evaluated by a group of experts, and the evaluated relationships are recorded in a Structural Self-Interaction Matrix (SSIM). In classic DEMATEL, experts assign binary values to indicate pairwise influence relationships in the SSIM. 23
However, the relationship may be uncertain since "influence" is a linguistically vague and qualitative description. Therefore, a multi-scale fuzzy linguistic terminology is employed to depict the pairwise influence relationship and deal with this vagueness. Each term is represented by a TrFN denoted as $\tilde{z} = (l, m, r)$, where $l \le m \le r$ are its lower, modal, and upper values (Table 5).
Table 5. Linguistic terms and assigned TrFNs.
Group decision-making is employed to achieve consensus on the relationship between each pair of indicators. Expert $k$ ($k = 1, \ldots, K$) assigns a linguistic term to the cell $(i, j)$ of the SSIM, yielding a TrFN $\tilde{z}_{ij}^{k} = (l_{ij}^{k}, m_{ij}^{k}, r_{ij}^{k})$. The Converting Fuzzy data into Crisp Scores (CFCS) defuzzification then proceeds in four steps, with $\Delta = \max_k r_{ij}^{k} - \min_k l_{ij}^{k}$:

▪ CFCS step 1: Normalize each TrFN: $xl_{ij}^{k} = (l_{ij}^{k} - \min_k l_{ij}^{k})/\Delta$, $xm_{ij}^{k} = (m_{ij}^{k} - \min_k l_{ij}^{k})/\Delta$, and $xr_{ij}^{k} = (r_{ij}^{k} - \min_k l_{ij}^{k})/\Delta$.
▪ CFCS step 2: Obtain left and right normalized values: $xls_{ij}^{k} = xm_{ij}^{k}/(1 + xm_{ij}^{k} - xl_{ij}^{k})$ and $xrs_{ij}^{k} = xr_{ij}^{k}/(1 + xr_{ij}^{k} - xm_{ij}^{k})$.
▪ CFCS step 3: Compute the total normalized crisp value: $x_{ij}^{k} = [xls_{ij}^{k}(1 - xls_{ij}^{k}) + (xrs_{ij}^{k})^{2}]/(1 - xls_{ij}^{k} + xrs_{ij}^{k})$.
▪ CFCS step 4: Integrate the total normalized crisp value of each expert: $z_{ij}^{k} = \min_k l_{ij}^{k} + x_{ij}^{k}\Delta$ and $z_{ij} = \frac{1}{K}\sum_{k=1}^{K} z_{ij}^{k}$, giving the crisp direct influence matrix $Z = (z_{ij})$.
Four metrics are introduced to measure the influential extent of an indicator within the HDQIF, based on the total influence matrix $T = N(I - N)^{-1}$ obtained from the normalized direct influence matrix $N$: the influential degree $d_i = \sum_{j} t_{ij}$ (the $i$th row sum of $T$), the influenced degree $r_i = \sum_{j} t_{ji}$ (the $i$th column sum), the centrality $d_i + r_i$, and the causality $d_i - r_i$.
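To make these steps concrete, the sketch below implements the CFCS defuzzification and the DEMATEL metrics in Python, the analysis environment used elsewhere in this study. The five-term TrFN scale and the toy judgments of six experts are illustrative assumptions; the actual scale is given in Table 5, and the real judgments come from the questionnaire in Supplementary Appendix SA2.

```python
import numpy as np

# Assumed five-term linguistic scale mapped to TrFNs (l, m, r); illustrative
# values only, since the scale actually used is defined in Table 5.
SCALE = {
    "NO": (0.00, 0.00, 0.25), "VL": (0.00, 0.25, 0.50),
    "L":  (0.25, 0.50, 0.75), "H":  (0.50, 0.75, 1.00),
    "VH": (0.75, 1.00, 1.00),
}

def cfcs(trfns):
    """CFCS: convert K experts' TrFNs for one SSIM cell into a crisp score."""
    l, m, r = (np.array(v, dtype=float) for v in zip(*trfns))
    lo, delta = l.min(), r.max() - l.min()
    if delta == 0:                                     # unanimous judgment
        return float(m.mean())
    xl, xm, xr = (l - lo) / delta, (m - lo) / delta, (r - lo) / delta
    xls = xm / (1 + xm - xl)                           # step 2: left value
    xrs = xr / (1 + xr - xm)                           # step 2: right value
    x = (xls * (1 - xls) + xrs**2) / (1 - xls + xrs)   # step 3
    return float((lo + x * delta).mean())              # step 4: integrate

def dematel(Z):
    """Total influence matrix T plus the centrality and causality metrics."""
    n = Z.shape[0]
    N = Z / max(Z.sum(axis=1).max(), Z.sum(axis=0).max())  # normalize Z
    T = N @ np.linalg.inv(np.eye(n) - N)                   # T = N(I - N)^-1
    d, r = T.sum(axis=1), T.sum(axis=0)    # influential / influenced degree
    return T, d + r, d - r                 # centrality d+r, causality d-r

# Toy example: six experts rate a 3-indicator SSIM with linguistic terms.
judgments = [[["NO", "H", "VL"], ["L", "NO", "VH"], ["VL", "H", "NO"]]] * 6
n = 3
Z = np.array([[0.0 if i == j else
               cfcs([SCALE[e[i][j]] for e in judgments])
               for j in range(n)] for i in range(n)])
T, centrality, causality = dematel(Z)
# Quadrant placement: the sign of causality and a comparison of centrality
# with its mean separate stake/critical from autonomous/dependent indicators.
```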
A questionnaire survey (see Supplementary Appendix SA2) was conducted in January 2025 to evaluate the interrelationships among the 16 indicators and to obtain an individual SSIM from each expert. The expert group involved two big-data engineers from ZJSMH Co., Ltd (E2 and E3 in Table 2); two healthcare professors from A University (E11 and E12 in Table 2); and two physicians (P1 and P2) from a local third-class (tertiary) hospital. Each expert assigned a linguistic term to evaluate the pairwise influence from indicator $i$ to indicator $j$ for every ordered pair of indicators.

Figure 4. Influential relation map.
The influential relation map is categorized into four quadrants based on the means of centrality and causality. 23 Quadrant I includes "promptness" and "accessibility," with positive causality and strong centrality. Both are denoted as stake indicators, as they exert influence on other indicators and are affected by others as well. Quadrant II contains "scarcity," "compliance," "metadata verification," "reliability," "interoperability," "value completeness," and "comparability." These indicators, exhibiting positive causality but low centrality, are identified as critical or influential indicators, as they predominantly exert influence on other indicators while remaining relatively unaffected by them. "Scarcity" exhibits the strongest causality, marking it as an important indicator in the HDQIF. Indicators in Quadrants I and II should be prioritized to improve healthcare data quality, as a higher absolute causality indicates greater influence. Conversely, Quadrants III and IV include indicators with negative causality. "Uniqueness" and "endurability," located in Quadrant III, are denoted as autonomous indicators because both their causality and centrality fall below the average; neither strongly influences other indicators nor is significantly affected by them. Quadrant IV includes the dependent indicators "sufficiency," "comprehensiveness," "interpretability," "traceability," and "relevancy," which show negative causality but strong centrality. Attention to these indicators is also essential for effective data quality management. Table 6 summarizes the causality, centrality, and categorization of all indicators.
Table 6. Causality, centrality, category, and weight for each indicator.
Disentangling interrelationships among indicators within HDQIF
The AISM method is adopted to analyze the interrelationships among HDQIF indicators. DEMATEL supports AISM by generating the total influence matrix as the basis for the reachability matrix.23,45 To better elucidate the interrelationships, AISM is modified by incorporating fuzzy theory and applying fuzzy matrix operations to derive a fuzzy reachability matrix.

A critical step is to convert the fuzzy reachability matrix into a crisp reachability matrix by intercepting it at a threshold $\lambda$: entries with membership of at least $\lambda$ are set to 1, and all others to 0. Three principles guide the choice of $\lambda$:
▪ Thresholds that yield more hierarchical levels are preferred. The number of levels exhibits a unimodal pattern as $\lambda$ increases, peaking at nine levels.
▪ Thresholds that result in fewer connected components indicate better structural cohesion of the HDQIF. Connected components reflect whether the HDQIF divides into disjoint subsystems; their number increases gradually at first and then rises sharply as the threshold increases.
▪ Thresholds that retain a higher number of loops are preferred. Loops indicate bidirectional feedback among indicators; for example, two loops are identified at some candidate thresholds.
According to the three principles, $\lambda = 0.14065$ was selected to generate the topological hierarchical structure model, as it was the first threshold to reach the highest level count of nine with two connected components and one loop. The $\lambda$-intercepted reachability matrix was then condensed into a general skeleton matrix, and the indicators were partitioned into hierarchical levels.
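The thresholding and level-partitioning logic can be sketched as follows, assuming the fuzzy reachability matrix has already been derived from the total influence matrix. The λ-cut binarizes the matrix (a Warshall closure restores transitivity after the cut), and a standard ISM-style partition assigns indicators to levels; the toy matrix and candidate thresholds are illustrative.

```python
import numpy as np

def lambda_cut(F, lam):
    """Intercept the fuzzy reachability matrix at threshold lam."""
    R = F >= lam
    np.fill_diagonal(R, True)            # reflexivity: each element reaches itself
    for k in range(R.shape[0]):          # Warshall closure: restore transitivity
        R = R | np.outer(R[:, k], R[k, :])
    return R

def level_partition(R):
    """ISM/AISM level extraction on a crisp reachability matrix.

    An element is extracted when its reachable set (within the remaining
    elements) is contained in its antecedent set; the dual pass with the
    roles swapped yields the other extraction diagram used by AISM.
    """
    remaining, levels = set(range(R.shape[0])), []
    while remaining:
        level = [i for i in sorted(remaining)
                 if {j for j in remaining if R[i, j]}
                 <= {j for j in remaining if R[j, i]}]
        levels.append(level)
        remaining -= set(level)
    return levels

# Toy fuzzy reachability matrix for four indicators (values illustrative).
F = np.array([[1.0, 0.3, 0.1, 0.2],
              [0.1, 1.0, 0.4, 0.1],
              [0.0, 0.1, 1.0, 0.3],
              [0.1, 0.0, 0.1, 1.0]])
for lam in (0.1, 0.2, 0.3):              # scan candidate thresholds
    levels = level_partition(lambda_cut(F, lam))
    print(lam, len(levels), levels)      # choose lam by the three principles
```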
The topological hierarchical structure model was obtained based on the general skeleton matrix and the partitioning, as illustrated in the UP-extraction diagram and the DOWN-extraction diagram of Figure 5. All indicators were arranged across levels consistent with the partitioning, and directed edges represented cause-and-effect relationships between indicators at different levels. Both diagrams had the same number of levels. The UP-extraction diagram sequentially extracted indicators that exerted influence on others and placed them from bottom to top, while the DOWN-extraction diagram sequentially extracted indicators that were affected by others and placed them from top to bottom.

Figure 5. Topological hierarchical structure model diagram.
Active element and loop analysis
Active elements are indicators situated at different levels in the UP- and DOWN-extraction diagrams, reflecting the topological extensiveness of the HDQIF. Figure 5 shows that "interoperability," "comparability," and "metadata verification" are identified as active elements and correspond to influential indicators in the influential relation map. Directed edges in both diagrams represent the interrelationships among indicators, while bidirectional edges indicate feedback loops. A notable loop exists between "comprehensiveness" and "sufficiency," characterizing their mutual dependence. Such loops should be treated as integrated units when designing strategies to improve healthcare data quality.
Hierarchical analysis
The 16 indicators are structured into a nine-level topological hierarchical structure model to delineate their interrelationships. The UP- and DOWN-extraction diagrams are largely consistent. "Uniqueness" is revealed as an isolated indicator with no connected edges, in line with its weak centrality and near-zero causality in the influential relation map. After removing "uniqueness," the system can be divided into three layers. The cause layer consists of the indicators at the bottom level (Level 9), including "scarcity" and "compliance." They serve only as source nodes of directed edges and hold a dominant position with significant influence on the HDQIF. The effect layer consists of "relevancy" and "traceability," located at the top level (Level 1) and influenced by "interpretability." The intermediate layer involves 11 indicators across Levels 2-8 that link the cause layer and the effect layer. Intermediate-layer indicators serve as contingent coordinators that support effective and context-driven data quality improvement, because data quality is shaped by the synergistic interrelationships among multiple indicators rather than by a few dominant ones.
Application of the HDQIF
Integrating DEMATEL and ANP enables effective indicator weighting, as DEMATEL captures total influence among indicators and simplifies ANP's computational complexity. 50
The total influence matrix $T$ is column-normalized to obtain the weighted supermatrix $W$. By raising $W$ to successive powers, the final weights are determined when the supermatrix converges to a stable limit matrix whose identical columns give the weight vector $w$.
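A minimal sketch of this limit-supermatrix computation is shown below; repeated squaring is one common way to reach the limit, and convergence is assumed to hold because the strictly positive total influence matrices produced by DEMATEL yield primitive column-stochastic supermatrices.

```python
import numpy as np

def anp_weights(T, tol=1e-12, max_iter=200):
    """Limit-supermatrix weighting: column-normalize T, then square W
    repeatedly until it stabilizes; any column of the limit matrix is w."""
    W = T / T.sum(axis=0, keepdims=True)       # weighted supermatrix
    for _ in range(max_iter):
        W2 = W @ W
        if np.abs(W2 - W).max() < tol:         # converged to the limit matrix
            return W2[:, 0]                    # identical columns at the limit
        W = W2
    raise RuntimeError("supermatrix did not converge; check primitivity")

# Usage with the total influence matrix T from the DEMATEL sketch:
# w = anp_weights(T)   # w sums to 1 and weights the HDQIF indicators
```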
FCE is applied to assess healthcare data quality using the HDQIF. The application steps of FCE are outlined as follows. 51
Step 1: Determining the evaluated indicator set for FCE. The evaluated indicator set $U = \{u_1, u_2, \ldots, u_n\}$ comprises the $n = 16$ indicators of the HDQIF.

Step 2: Setting up the category set for evaluating each of the indicators. The category set $V = \{v_1, v_2, \ldots, v_m\}$ contains the $m$ linguistic rating categories (the case study reports categories such as "pretty" and "fair").

Step 3: Establishing the single-factor evaluation matrix. One of the tasks assigned to the experts is to establish correspondences between the $n$ indicators and the $m$ categories. The fuzzy evaluation matrix $R$ is of the form $R = (r_{ij})_{n \times m}$, where $r_{ij}$ denotes the degree of membership of indicator $u_i$ in category $v_j$; each row of $R$ is the single-factor fuzzy evaluation vector of $u_i$.

Step 4: Producing the evaluation result. The evaluation result is produced by synthesizing the ANP-derived priority weight vector $w$ with $R$, that is, $B = w \circ R$; the final rating is read from $B$ by the maximum membership degree principle.
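The full ANP-FCE synthesis for one dataset can be sketched as follows. The category set, the score-band mapping from 0-100 ratings to membership vectors, and the uniform placeholder weights are assumptions for illustration; the actual mapping follows the scoring table in Supplementary Appendix SA3, and the real weights come from the limit supermatrix.

```python
import numpy as np

CATEGORIES = ["excellent", "pretty", "fair", "poor"]   # assumed category set
BANDS = [85, 70, 55]        # assumed cut-offs for mapping 0-100 scores

def membership(score):
    """Map one 0-100 score to a membership vector over the categories.

    The paper's actual score-to-membership mapping follows Supplementary
    Appendix SA3; a hard banding is assumed here for illustration.
    """
    v = np.zeros(len(CATEGORIES))
    idx = next((i for i, cut in enumerate(BANDS) if score >= cut),
               len(CATEGORIES) - 1)
    v[idx] = 1.0
    return v

def fce(scores, w):
    """ANP-FCE synthesis for one dataset.

    scores: (interviewees x indicators) array of 0-100 ratings.
    w:      ANP-derived priority weight vector over the indicators.
    """
    # Single-factor matrix R: row i = mean membership vector of indicator i.
    R = np.array([np.mean([membership(s) for s in scores[:, i]], axis=0)
                  for i in range(scores.shape[1])])
    B = w @ R                                  # B = w . R (weighted average)
    return CATEGORIES[int(np.argmax(B))]       # maximum membership principle

# Example: eight interviewees scoring 16 indicators for one dataset.
rng = np.random.default_rng(7)
scores = rng.integers(50, 96, size=(8, 16)).astype(float)
w = np.full(16, 1 / 16)                        # uniform placeholder weights
print(fce(scores, w))
```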
Case study
ZJSMH Co., Ltd is a healthcare big data company in Hangzhou, China. It specializes in providing medical institutions with innovative digital healthcare tools, including internet hospitals, patient follow-up platforms, and remote multidisciplinary collaboration systems. By collaborating with local medical institutions, ZJSMH has accumulated diverse healthcare data resources to enhance healthcare accessibility and patient-centered care for a broader population.
In February 2025, a three-hour site visit and face-to-face interview were conducted with a professional group of five data engineers (designated EG1-EG5), two product managers (PrD1-PrD2), and one project manager (PrJ1) from ZJSMH. Three structured datasets of type 2 diabetic patients, ZA1-D, ZA2-D, and ZH-D, served as representative data products obtained from three local medical institutions, ZA1, ZA2, and ZH, all of which have long-standing collaborative relationships with ZJSMH. The datasets varied in granularity, with differences in timeframe, attributes, and number of records. The intended use of these datasets was assumed to be as training sets for constructing machine-learning-based screening models for type 2 diabetes. Before the interview, a Data Protection and Confidentiality Protocol was signed with ZJSMH Co., Ltd. Restricted, temporary, on-site access to the de-identified ZA1-D, ZA2-D, and ZH-D was granted solely for healthcare data quality assessment. It was strictly ensured that no data copies were retained, and all access rights were terminated immediately upon completion of the study on the day of the site visit to ZJSMH.
At the beginning of the interview, engineers EG2 and EG5 introduced the three datasets so that all participants could gain an overview. Then, all participants were briefed on the HDQIF and its 16 indicators. The quality of each dataset was assessed by the professional group, who evaluated and scored each indicator from 0 to 100; the scoring table is provided in Supplementary Appendix SA3. Each score assigned by interviewee $k$ generated a membership vector over the category set, and these vectors were aggregated into the single-factor evaluation matrix $R$ for each dataset.
According to the maximum membership degree principle, 51 the evaluation results of ZA1-D, ZA2-D, and ZH-D are "fair," "pretty," and "fair," respectively. In Supplementary Table S4, where the evaluation scores across the three datasets are recorded, it is clear that ZA2-D scores higher than both ZA1-D and ZH-D on 10 out of 16 indicators, such as "relevancy," "reliability," and "interpretability," indicating adequate data quality to fulfill fit-for-use expectations. The better performance of ZA2-D may stem from a larger data volume, with 411 records and 32 clinical indicators including fasting plasma glucose, HbA1c, and fasting C-peptide. However, ZA2-D scored lower on "promptness" and "endurability," possibly for temporal reasons: it was developed about three years ago and has not been continuously or regularly updated. This suggests a trade-off between comprehensiveness and up-to-dateness in the quality of ZA2-D.
ZA1-D and ZH-D underperformed compared with ZA2-D. ZA1-D scored lowest on 12 out of 16 indicators, the exceptions including "promptness," "compliance," and "endurability." Despite being recently collected, ZA1-D's limited volume (262 records) compromises sufficiency and comprehensiveness, potentially weakening user confidence in fitness-for-use. ZH-D presented close degrees of membership to both "fair" and "pretty." It performed comparably to ZA2-D on "value completeness," "comprehensiveness," "compliance," "reliability," and "scarcity." Despite a notable gap in data quality between ZH-D and ZA2-D, ZH-D remains promising for meeting the fit-for-use expectations of data consumers. Among the three diabetes datasets, ZA2-D demonstrated the highest overall data quality, followed by ZH-D and ZA1-D.
The notion of "fit for use" emphasizes that data quality should be assessed according to the intended use of the data product, implying that data quality assessment is inherently relative. Assigning a universal value to define "data quality" is less meaningful, as such an approach neglects the contextual requirements that determine whether a dataset is truly fit for use. As existing studies have largely characterized data quality qualitatively through multiple indicators, this study proposes integrating MCDM methods with the HDQIF to transform healthcare data quality assessment into a comprehensive multi-criteria evaluation problem. By treating each quality indicator as a criterion, MCDM advances the assessment from a fragmented set of 16 qualitative indicators into a unified and operational process for systematic evaluation. The case study demonstrates that using the HDQIF to assess the quality of datasets ZA1-D, ZA2-D, and ZH-D within the MCDM context is a practical attempt. It enables data-product developers and stakeholders of ZJSMH to clearly understand both the strengths and limitations of a specific healthcare dataset, and the HDQIF may serve as a strategic intermediary for systematically assessing and monitoring healthcare data quality. As the HDQIF supports further specification of its indicators according to specific task requirements, explicit computational formulas for each indicator could be designed in the future according to task-oriented requirements.
Discussion
Implications of the proposed HDQIF
The proposed HDQIF systematically delineates the dimensions of healthcare data quality through 16 distinct indicators: "comprehensiveness," "sufficiency," "value completeness," "comparability," "compliance," "metadata verification," "accessibility," "interoperability," "interpretability," "relevancy," "endurability," "promptness," "reliability," "traceability," "uniqueness," and "scarcity." The HDQIF was grounded in a systematic literature analysis and further refined by a two-round Delphi consultation. The 22 studies span diverse scopes of data quality, including the secondary use of EHR/EMR,5,18,29,37 real-world-study data, 28 digital health, 15 clinical research,13,34 and wearable-device-generated data. 38 These scopes collectively underpin the HDQIF and lend it broad adaptability across healthcare contexts.
The HDQIF makes three contributions by harmonizing terminology, expanding indicator coverage, and operationalizing the DaaP paradigm. First, the framework harmonizes the inconsistent terminologies that have fragmented prior research, where semantic overlap among terms has led to conceptual ambiguity. For instance, as shown in Figure 3, "consistency," "conformance," and "concordance" were frequently highlighted yet share semantic similarity. These semantically overlapping terminologies were therefore re-categorized into distinct major themes according to their original definitions to retain and clarify subtle conceptual differences. Supplementary Table S5 maps the HDQIF against four well-recognized frameworks6,17,18,31 and the ALCOA++ guideline proposed by the European Medicines Agency. 52 It demonstrates that the HDQIF captures most terminology from these sources (except for "scarcity") while offering refined definitions. Extending Weiskopf and Weng's study, 18 the HDQIF refines "plausibility" as a subtheme under "compliance" and disentangles "endurability" from "currency" to highlight that data remain up to date for task-specific use. Similarly, the multifaceted nature of "consistency"6,17,31 is further elaborated through "comparability," "compliance," and "metadata verification." Furthermore, the successful alignment with ALCOA++, 52 a recognized data quality standard in clinical research, suggests that the HDQIF holds strong potential for practical application in healthcare DaaP practices.
The second contribution is that the HDQIF introduces new indicators that extend coverage to aspects of healthcare data quality previously neglected. Three indicators, "compliance," "interoperability," and "comprehensiveness," were newly conceptualized in the HDQIF. First, the term compliance is carefully selected and synthesized from the two major themes of "clinical make-sense" and "formality." In the HDQIF, "compliance" describes the processing of healthcare data in adherence to national or local standards, medical technical specifications, and clinical expert consensus. The description echoes the original notion of compliance as the extent to which patient behavior aligns with medical advice. 53 "Compliance" also aligns with the concept of "plausibility" described by Weiskopf and Weng 18 as well as the "value conformance" and "atemporal plausibility" proposed by Kahn et al. 6 In the healthcare context, compliance also carries security implications by stressing adherence to the regulations that safeguard patient privacy and ensure data security during processing, as exemplified by the Health Insurance Portability and Accountability Act (HIPAA) enacted in the United States to provide a secure framework that appropriately facilitates data access and control. 54
The second is "interoperability," referring to the capacity to exchange, access, and process healthcare data across disparate healthcare IT systems. Given the coexistence of numerous digital systems within a single facility, enabling effective communication among them marks a shift from the narrower concept of data portability toward broader interoperability. Human decision-making plays a role in achieving interoperability: a hospital information system (HIS) engineer may be required to establish mappings between two healthcare IT systems that use incompatible data formats. Poor interoperability leads to fragmented and inaccessible medical data, which contributes to information silos and degrades care quality. 3 The deliberate inclusion of interoperability as an indicator in the HDQIF therefore deserves consideration.
"Comprehensiveness" is defined as the breadth of data that includes all desired types of data necessary for the intended task. The nuance between the terms comprehensiveness and completeness should be articulated. Completeness is characterized by "a desired proportion of available data values" and primarily concerns data without missingness, 55 while "comprehensiveness" highlights task-dependence, ensuring that all types of data meet a desired coverage to fulfill the intended task. For example, a diagnostic model for classifying lung nodules as benign or malignant is mainly constructed from a dataset of numerous lung CT images. Yet collecting additional relevant data alongside the lung image sets, such as pathological data, clinical laboratory data, and genetic testing data, also helps ensure the intended use of the model.
Building upon the improved terminology and indicator coverage, the third contribution is that the HDQIF makes the idea of DaaP truly operational. While DaaP has received growing interest in industry, it has received limited attention in academic research. Theoretically, therefore, the HDQIF takes data quality as an entry point to understand how DaaP may work in the healthcare industry. DaaP consists of four core principles: Consumer-based quality, Life-cycle-oriented governance, Deliverability, and Marketability.1,2 As shown in Table 7, each principle embodies one or more indicators from the HDQIF. It is also recognized that an individual indicator can be associated with more than one DaaP principle. For example, "metadata verification" mainly supports life-cycle governance while also bearing relevance to deliverability. To handle this, each indicator is categorized under the principle where it plays its primary role. Such a best-fit assignment does not deny the multiple roles of indicators but provides a clear and organized way to display the connection between the DaaP principles and the HDQIF indicators.
Table 7. Alignment of indicators with the four DaaP principles.
The requirement of data quality reflects the DaaP principle of consumer-based quality. In data marketplaces, the pricing of data products is driven by customers who make purchasing decisions based on their specific business objectives and preferences. 56 Although the importance of data quality is particularly pronounced in the healthcare industry, inconsistent definitions of the various data quality indicators call for clearer articulation. Given that DaaP involves diverse stakeholders such as data producers, product developers, and consumers, each of them may bring different interpretations of data quality informed by their contextual knowledge, which can result in a degradation of data quality across life cycle stages. Because data quality problems may be introduced at each stage of the life cycle, a full-life-cycle data quality assessment is necessary; this requirement for continuous oversight across all life cycle stages directly echoes the principle of life-cycle-oriented governance. A consolidated HDQIF could facilitate a shared understanding and coordinated quality evaluation across all stakeholders, as stressed by the DaaP principle of deliverability.
Interpretation and validation of the topological hierarchical structure model
The topological hierarchical structure model provides a theoretical foundation for elucidating interrelationships among healthcare data quality indicators, an aspect under-explored in previous healthcare data quality studies. By mapping 19 directed edges among 16 indicators, the model visualizes how these relationships can inform targeted insights for improving data quality. Validation of the interrelationships through literature supports 14 of the 19 pairs, with detailed references in Supplementary Table S6. However, the interrelationships of (1) “metadata verification” to “comparability”; (2) “interpretability” to “traceability”; (3) “scarcity” to “metadata verification”; (4) “scarcity” to “promptness” are not directly supported by existing literature.
Figure 5 positions "scarcity" as a cornerstone indicator within the HDQIF, where it is conceptualized around the rarity and heterogeneity of specific diseases. Rarer clinical cases possess greater value since such cases often provide unique insights that inform medical research. 57 However, due to scarce clinical research and variable practices, rare diseases encounter significant data quality issues, including limited interoperability, inconsistent coding, and incomplete metadata. 58 In contrast, common diseases benefit from well-established research and metadata standards. This disparity accounts for the tentative associations from "scarcity" to "metadata verification" and "promptness," as rare disease data often lack sufficient standardization and are updated infrequently.
The relationship from "interpretability" to "traceability" remains ambiguous. Throughout the data life cycle, from creation and collection to extract-transform-load (ETL) processes, traceability is stressed as a means of feeding data quality information back to stakeholders for improvement. Data should remain traceable across the life cycle so that any changes to data and metadata can be audited from initial creation through all subsequent processes. This requirement for traceability necessitates comprehensive annotations that facilitate accurate interpretation of data within their acquisition context. Interpretability may thus be considered a prerequisite for achieving traceability.
The link from “metadata verification” to “comparability” can be interpreted in the context of healthcare practice. “Metadata verification” involves architecture- and metadata-driven constraints to ensure data conforms to predefined structural and semantic standards. Even with format mappings across different healthcare IT systems, the need for unified data standards may persist due to semantic inconsistency. Meanwhile, “comparability” refers to value-level consistency across data from multiple sources, which requires alignment with a reference dataset or gold standard dataset. As healthcare IT systems expand, balancing flexible data modeling and consistent comparability becomes critical for supporting reliable clinical decisions.
Key management insights for promoting data quality
The results of the study on HDQIF provide key managerial insights for promoting healthcare data quality across organizations engaging in Healthcare-DaaP practices. Although these insights are derived from a specific organizational context, they reflect the challenges and opportunities in data quality faced by similar organizations in the healthcare industry.
The significant role of data quality assessment in the DaaP paradigm should be reaffirmed. The case study on ZJSMH presents a practical attempt to use the HDQIF to assess data quality. However, it is noteworthy that characterizing data quality through multiple indicators is merely the first step of assessment. A key challenge lies in determining the appropriate method for quantitatively or qualitatively measuring each indicator. This difficulty is particularly pronounced in healthcare settings due to both the complexity of healthcare data and the limited expertise of stakeholders. Organizations should adopt a flexible and context-driven approach that aligns with the context-specific nature of the indicators within the HDQIF.
Priority should be given to establishing a systematic mechanism that ensures data processing complies with relevant regulations and standards. In healthcare, the security and privacy of data are undoubtedly among the most significant issues, and this concern makes ensuring compliance imperative when developing or sharing healthcare data. The topological hierarchical structure model also informs this imperative, as "compliance" is identified as a critical indicator influencing overall data quality. The evolving Healthcare-DaaP demands an adaptable compliance mechanism to help organizations proactively manage potential risks. Exemplifying early efforts in prioritizing compliance, ZJSMH Co., Ltd has explored a combined regulatory-technology approach to healthcare data governance, ensuring compliance while leveraging data accumulated through partnerships with medical institutions.
Additionally, aligning stakeholder expectations also warrants consideration. DaaP highlights close full-life-cycle collaboration among stakeholders to ensure that data products of sufficient quality remain aligned with business objectives. However, deficient collaboration among these stakeholders at any stage of the data life cycle can lead to a mismatch of expectations on data quality. 8 Indicators such as "relevancy," "interpretability," and "comprehensiveness" reflect this task-dependent focus. Indeed, healthcare data can be seen as a by-product of healthcare delivery engaging diverse stakeholders. Close engagement among stakeholders throughout the healthcare research process requires specialized expertise in data management and knowledge of regulatory frameworks 59 to maintain the proper use of the data. Such interrelated, dependent indicators may be influenced by stakeholder expectations and contextual understanding.
Limitations and future work
Several limitations warrant consideration. First, although the PRISMA guidelines were strictly followed in the systematic literature analysis, relevant indicators not extensively covered in prior studies may still have been omitted, owing to the database scope and search strategy limitations. Further studies could expand the scope of literature retrieval by incorporating additional databases and interdisciplinary sources to enrich the comprehensiveness of indicator coverage.
Second, the three-staged coding process may inevitably introduce subjective bias, as the extraction and consolidation of labels, subthemes, and themes are influenced by the perspectives and judgments of the researchers. Although the involvement of three independent researchers in the coding process enhanced reliability relative to a single researcher, some subjective bias nevertheless remains. Further work could involve more researchers in the coding process and explore the use of large language models (LLMs) to achieve semi-automated grounded theory coding. For example, an LLM such as GPT could act as an additional researcher working alongside human coders to conduct the three-staged coding of the literature. The outputs generated by the LLM could then be compared with those produced by human researchers to analyze similarities and differences across coding results.
Third, the limitations of the Delphi consultation should be acknowledged. The study invited 12 experts with substantial professional backgrounds to participate, and their expertise ensured that deficiencies during the construction of the HDQIF were effectively identified. However, the relatively small number of participants may constrain the representativeness and comprehensiveness of the results. Future research could expand the panel size and diversify expert backgrounds across disciplines and regions to ensure greater credibility and comprehensiveness of the HDQIF.
Lastly, the case study was limited to three dataset-type products provided by a single company. While this design illustrates the practical applicability of the HDQIF in a real-world context, the absence of comparative analysis with data products from other companies restricts the generalizability of the framework. Moreover, due to privacy and security concerns, the datasets used in this study were accessible only within the company and could not be disclosed. Variations in data product types, such as data services or data algorithms, may lead to divergent evaluation outcomes across institutions.
Conclusion
This study proposes a DaaP-oriented HDQIF with 16 indicators to characterize healthcare data quality. The indicators are categorized in a four-quadrant influential relation map according to the two DEMATEL metrics of causality and centrality. The interrelationships among the 16 indicators are disentangled and visualized in a nine-level topological hierarchical structure model using the AISM method. The application of the HDQIF in the case study demonstrates a feasible pathway to quantitatively assess healthcare data quality from the MCDM perspective. Significant implications in theory and practice are highlighted. Theoretically, the HDQIF advances the understanding of healthcare data quality within the Healthcare-DaaP context by integrating multi-dimensional indicators and clarifying ambiguous terminology. Our study resonates with the forward-looking focus on data-driven initiatives in the healthcare industry, such as the valuation, pricing, and assetization of healthcare data. Practically, the HDQIF supports integration into cost-, market-, and revenue-based valuation methods to quantify the monetary value of healthcare data, offering actionable insights for monetizing healthcare data in data marketplaces. Future research may expand the HDQIF by contextualizing diverse healthcare settings to capture a broader range of data quality indicators.
Acknowledgements
The authors would like to thank the members of ZJSMH Co., Ltd, ZJBDE Co., Ltd, and HZGPT Co., Ltd who engaged in our consultation. Their valuable and professional insights on the assetization and commodification of healthcare data have greatly contributed to our study. The authors would also like to thank the editor and the anonymous reviewers for their insightful comments and suggestions.
Author contributions
Min Cai: conceptualization, methodology, supervision, project administration, writing‒review and editing, and funding acquisition.
Xijie Huang: conceptualization, methodology, software, formal analysis, data curation, visualization, writing‒original draft and writing‒review and editing.
Yijie Cao: conceptualization and resources.
Xuan Shao: software and formal analysis.
Xueqi Xu: conceptualization, methodology, and writing‒review and editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Zhejiang Provincial Philosophy and Social Science Planning Project and the Humanities and Social Science Project of the Ministry of Education of China (grant numbers 23SYS11ZD and 24YJA630002).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References
