Abstract
Keywords
Introduction
In recent decades, systematic outcome measurement has become a fundamental condition for value-based healthcare, and quality improvement in pediatric healthcare seems to depend increasingly on the standardization of these outcome measurements.1,2 Patient-reported outcome measures (PROMs) play a key role in systematic outcome measurement, as they provide unique information on what matters most to patients.3,4
In cleft care, many calls and large initiatives to find or develop standardized outcome measures have been set up over the past decades, such as Eurocleft, Scandcleft, Americleft, and Eurocat.5-9 Unfortunately, no validated PROMs for children with a cleft lip and/or cleft palate (CL/P) were used in these initiatives. In 2017, however, the International Consortium for Health Outcomes Measurement (ICHOM) developed the ICHOM Standard Set for Cleft Lip and Palate (ICHOM Standard Set).10 This was an effort to reach a multidisciplinary, international consensus on a standard set of outcomes in cleft care.10 Concerning PROMs, the workgroup reached consensus on implementing several scales of the CLEFT-Q questionnaire in the ICHOM Standard Set.10
The CLEFT-Q questionnaire is a recently developed, cross-cultural PROM that was specifically designed and validated for patients with CL/P.11-13 Erasmus Medical Center (EMC, Rotterdam, The Netherlands) and University Medical Center Utrecht (UMCU, Utrecht, The Netherlands) have both implemented the ICHOM Standard Set. In clinical practice, it was noticed that caregivers were often involved in completing the CLEFT-Q. However, the questionnaire is validated only for completion by patients. Although the psychometric properties of the CLEFT-Q have been examined thoroughly in previous studies, no studies on the effect of reporter type have been conducted so far.13,14
Studies of other pediatric PROM instruments, such as the PedsQL, have reported discrepancies between patient- and parent-reported outcomes.15-17 Therefore, many validated generic pediatric PROM instruments offer scales specifically for patients or for parents, or use 2 completely separate versions of a questionnaire.18-25 Greater child-parent agreement has been reported in physical domains than in social and emotional domains, although other studies state that agreement depends on the clinical relevance to a given disease group.15,16
In order to encourage further standardization of outcome measures and to enable future research and benchmarking, it is essential to examine to what extent reporter type affects CLEFT-Q outcomes.
The objective of this multicenter study was to promote further standardization of pediatric PROMs by exploring possible discrepancies between reporter types in children with a cleft, using all CLEFT-Q scales included in the current ICHOM Standard Set. To this end, outcome scores were compared between reporter types. Since no randomization was applied, the reporter group sizes provide additional insight into how pediatric patients and their caregivers prefer to fill out the PROMs.
Methods
A retrospective study was conducted on patients with CL/P at the EMC and UMCU using the ICHOM Standard Set. The set was implemented in 2015 at the EMC and in 2021 at the UMCU, as part of the regular care protocol for patients with CL/P (ethical board approval nr. METC395478). According to the set, patients are evaluated from their initial visit to the cleft team until they are 22 years old, with the first visit usually occurring during the neonatal period. However, patients who were adopted or who switched cleft teams are included too, and might be older at the start of treatment in the cleft centers. The ICHOM Standard Set comprises various scales of the CLEFT-Q questionnaire. Which scales are assessed depends on the patient’s cleft type, and assessment takes place at the ages of 8, 12, and 22 years.10 At the UMCU, an additional assessment is conducted at age 15.
CLEFT-Q Questionnaire
The CLEFT-Q questionnaire is a PROM designed specifically for individuals with CL/P. It is internationally applicable, as it is available in various languages and has a cross-cultural character.14,26 The questionnaire consists of 12 scales and one checklist that cover the domains of facial function, health-related quality of life (HRQoL), and appearance. These scales can be used independently or in combination with other CLEFT-Q scales to gain insights into the Quality of Life (QoL) of individuals with CL/P. The ICHOM Standard Set includes 8 CLEFT-Q scales and one Eating and Drinking checklist. The 8 scales comprise one functional scale (CLEFT-Q Speech function), 3 appearance scales (CLEFT-Q Face, Teeth, Jaw), and 4 HRQoL scales that evaluate psychological and social domains (CLEFT-Q Speech distress, School function, Social function, Psychological function). The development and validation process of the CLEFT-Q questionnaire is described extensively elsewhere.11,12,14,27
Patient Population
The study included patients with CL/P and their caregivers who were evaluated according to the ICHOM Standard Set from its implementation (2015 at the EMC, 2021 at the UMCU) until January 2023. Cases with at least one completed CLEFT-Q scale and a registered reporter type were analyzed. Patients with cognitive impairments that prevented them from understanding the questionnaire items were not assessed with any PROM scales of the ICHOM Standard Set. The patient’s cognition was estimated by the specialized nurse of the cleft team in consultation with the patient’s caregivers. Patients who were not proficient in the Dutch language were not assessed either.
Data Collection
Data collection was part of the regular care protocol. Patients could complete the CLEFT-Q scales either at home, after receiving them via email, or during a regular visit to the cleft team. At the outpatient clinic, patients and their caregivers were asked who had responded to the CLEFT-Q items, and the responses were categorized as follows: (1) “Patient,” meaning the patient completed all items alone; (2) “Caregiver,” referring to the parent/caregiver(s) who responded to all items without the involvement of the patient; (3) “Together,” indicating the patient filled out the questionnaire with one or two caregivers. In this study, both (biological) parents and caregivers will be referred to as “caregivers.”
Baseline characteristics of all patients were recorded, including sex and cleft type. Data were collected at the following assessment time points: 8 years, 12 years, and 15 years (UMCU only). According to the measurement moments of the ICHOM Standard Set, 8-year-olds were evaluated with the CLEFT-Q Face, Teeth, and Social function scales. In patients who were 12 and 15 years old, assessment took place with the CLEFT-Q Face, Teeth, Jaw, Psychological function, School function, Speech function, and Speech distress scales. The 22-year time point was not included in the data collection, because the responder type was not recorded for 22-year-olds. During the data collection period, some scales of the CLEFT-Q were revised: outcome scores of the older version of the CLEFT-Q Social scale were not used for this study because new items had been added.
Data Analysis
The data were analyzed using R statistical software. The scales were previously validated according to Rasch measurement theory, which showed reliability and validity in all scales except for the Eating and Drinking checklist. Consequently, outcomes of this checklist have to be evaluated per item instead of using the total score per scale, and therefore the checklist was not included in the current study.13 For the included scales, scores on a 0 to 100 scale derived from the logit scores were used, with a higher score indicating a better outcome. The normality of the score distributions was checked first using skewness and kurtosis values. Values greater than +2 or lower than −2 were considered indicative of non-normality.28 Levene’s tests were performed to test the homogeneity of variances. To investigate possible differences between the scores of the 3 reporter groups for each scale separately at each time point, Kruskal-Wallis H tests were subsequently performed. In case of a significant test, multiple pairwise comparisons were performed using Dunn’s test with Bonferroni correction. Moreover, mean outcomes were visualized in bar plots, including the 2.5% to 97.5% quantiles, to map out response distributions for each scale. Finally, responder distributions were compared between ages to examine possible trends in responder preference.
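The normality screen described above can be illustrated with a minimal Python sketch. The study itself used R and does not state which skewness and kurtosis estimators were applied, so this sketch assumes the simple moment-based estimators and excess kurtosis (a normal distribution scores 0 on both). The function names and the example scores are hypothetical.

```python
def _central_moment(values, order):
    """Return the central moment of the given order (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n


def skewness(values):
    """Moment-based skewness: m3 / m2^1.5 (assumed estimator)."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (assumed estimator)."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0


def looks_non_normal(values, threshold=2.0):
    """Flag a distribution when skewness or excess kurtosis falls outside +/- threshold."""
    return (abs(skewness(values)) > threshold
            or abs(excess_kurtosis(values)) > threshold)


# Hypothetical 0-100 scale scores, not study data.
symmetric_scores = [30, 40, 50, 60, 70]
ceiling_scores = [100] * 19 + [0]  # strong ceiling effect with one outlier

print(looks_non_normal(symmetric_scores))
print(looks_non_normal(ceiling_scores))
```

A rule of this kind is a coarse screen rather than a formal test; it mainly catches strongly skewed or heavy-tailed score distributions, such as those with ceiling effects.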
Results
Baseline Descriptives and General Findings
A total of 567 patients were included in the study, with the majority being male (61.2%, Table 1).
Table 1. Baseline Characteristics of the Included Patients, Per Age.
The distribution of cleft types followed the epidemiology of CL/P, with most patients having a cleft lip, alveolus, and palate (CLAP, 47.6%), followed by cleft palate (CP, 31.7%), cleft lip and alveolus (CLA, 10.2%), and cleft lip (CL, 9.0%).29 Of the included patients, 268 were 8 years old at the time of assessment, 243 were 12 years old, and 56 were 15 years old.
A detailed overview of reporter types, age-groups, and cleft types is provided in Figure 1.

Figure 1. Overview of the participants per responder type.
In the majority of the cases, the CLEFT-Q scales were completed by patient and caregiver(s) together (n = 440), followed by the caregiver alone (n = 81) and the patient alone (n = 46).
Detecting Differences Per Age Group
All distributions of CLEFT-Q scores were considered non-normal based on the skewness and kurtosis values. Hence, the non-parametric Kruskal-Wallis H test was used to test for differences on all scales. None of the scales showed significant differences between the reporter types in any of the age groups (Appendix 1).
Scale outcomes at 8 years
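To make the comparison of the 3 reporter groups concrete, the following is a minimal Python sketch of the Kruskal-Wallis H test with tie correction. It is not the authors' R code. It is restricted to exactly 3 groups, because the chi-square reference distribution then has 2 degrees of freedom and its upper tail has the closed form exp(−H/2). The function names and the example scores are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(patient, caregiver, together):
    """Return (H, p) for 3 independent groups, using the chi-square approximation."""
    groups = [patient, caregiver, together]
    pooled = [score for group in groups for score in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of squared rank sums, each divided by its group size.
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all scores are identical; H is undefined")
    h /= correction

    # With 2 degrees of freedom the chi-square upper tail is exp(-H / 2).
    return h, math.exp(-h / 2)


# Hypothetical scale scores, not study data.
h_stat, p_value = kruskal_wallis_three_groups(
    [72, 80, 91, 65], [60, 58, 77, 70], [68, 75, 83, 79, 62]
)
print(round(h_stat, 3), round(p_value, 3))
```

A non-significant result from this omnibus test, as reported for every scale in the study, means that no pairwise follow-up (Dunn's test) is carried out.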
Two appearance scales and one psychosocial scale were completed at 8 years (Table 2).
Table 2. Mean Outcomes and 2.5% to 97.5% Quantiles of the CLEFT-Q Scales in Each Age, Categorized by Responder Type.
Only CLAP and CP.
The CLEFT-Q Face scale was completed 268 times, whereas the Teeth and Social function scales were each filled out 267 times.
Among the reporter types, “patient alone” formed the smallest group, followed by “caregiver alone.” The majority completed the scales together (75%, Table 2). Table 2 provides an overview of all mean outcomes and the 2.5% to 97.5% quantiles. No clear trends can be observed: the reporter type with the highest mean score differs for each scale (Figure 2).

Figure 2. Bar plots with the outcomes of each scale, categorized per age and responder type. Mean outcomes and 2.5% to 97.5% quantiles are visualized.
Scale outcomes at 12 years
The Face, Teeth, and Jaw scales were completed 240, 242, and 241 times, respectively. The School and Social function scales were both filled out 241 times. The CLEFT-Q Speech function and Speech distress scales were completed by 200 participants. Based on the mean outcome scores, the “patient” group shows the highest scores on most scales. All appearance scales are reported most positively by the “patient” group, although the Face scale shows similar outcomes in “patients” and “caregivers” (Figure 2). In the HRQoL scales, the highest mean score is found in the “patient” group in 2 out of the 3 scales. For the Speech function scale, “patients” are the highest-scoring group too (Figure 2).
Scale outcomes at 15 years
All scales were completed by 56 participants, except for Speech function and Speech distress, which were completed 45 times. In 2 out of the 3 appearance scales, “patients” reported the highest mean scores (Face, Teeth, Figure 2). In the HRQoL scales, the highest mean scores were also found in the “patient” group in 2 out of the 3 scales (Psychological function, Social function, Figure 2). In the Speech function scale, the highest mean score came from the “caregiver” group.
Detecting Trends in Reporter Types
In Figure 3, trends for the reporter types are given by showing the proportion of participants per respondent group at each age.

Figure 3. Trend analyses across ages, per responder type. Confidence intervals are visualized as well.
The confidence intervals per age group and responder type are given in Appendix 2.
In the “patient” group, the largest absolute number is found in the 12-year age group. In relative terms, however, 15-year-olds completed the CLEFT-Q alone most often (23.2% of 15-year-olds versus 8.5% of 12-year-olds, Table 2).
Figure 3 shows an increasing trend in the “patient” group. The proportion of patients who complete the scales alone is significantly higher in 15-year-olds than in 8-year-olds and 12-year-olds (Table 3).
Table 3. Two-Sample Test for Equality of Proportions Without Continuity Correction.
The opposite phenomenon is seen in the “caregiver” group, where a downward trend is observed (Figure 3). The proportion of caregivers who complete the scales alone is significantly lower in the 12-year age group than in the 8-year age group.
Overall, the majority of the participants completed the CLEFT-Q together, regardless of the age of the patient. This proportion peaks among 12-year-olds (82.3%, Table 2) and declines at age 15, when the “patient” group increases (Figure 3).
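The age-group comparisons above rest on a two-sample test for equality of proportions without continuity correction. A minimal Python sketch of that calculation is given here as an illustration; the study itself used R. It uses the pooled-proportion z statistic, whose square equals the chi-square statistic of the same test. The function name and the counts in the example are hypothetical and are not the study data.

```python
import math


def two_proportion_test(x1, n1, x2, n2):
    """Two-sided test of equal proportions without continuity correction.

    Returns (z, chi_square, p_value), where chi_square equals z squared.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    variance = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if variance == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = (p1 - p2) / math.sqrt(variance)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, z * z, p_value


# Hypothetical counts: 15 of 60 older patients versus 20 of 240 younger
# patients completing the questionnaire alone.
z, chi_square, p_value = two_proportion_test(15, 60, 20, 240)
print(f"z = {z:.2f}, chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

With small groups, such as the 56 patients assessed at 15 years, the normal approximation behind this test becomes less reliable, which is one reason to read the reported trends with some caution.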
Discussion and Clinical Interpretation
In this multicenter study of 567 participants, outcomes of all CLEFT-Q scales included in the current ICHOM Standard Set were examined to explore whether any trends in reporter types are present at the ages of assessment. Possible discrepancies between reporter types were examined at the same ages. In this way, preferences of patients and parents were mapped out, and new insights for further standardization of pediatric PROMs were provided.
Strengths and Limitations
This is the first study to explore possible discrepancies in CLEFT-Q scales based on type of reporter, and it thereby provides unique insights into and between these groups. Because the patients with CL/P and the caregivers included in the study decided for themselves who should complete the questionnaire, the study design has both advantages and disadvantages. Due to small sample sizes in the “patient” and “caregiver” groups, the statistical analyses were kept limited to avoid over-analyzing sparse data. Because outcomes from both patient and caregiver were not available for the same patients, correlations between reporter types could not be examined. Furthermore, a reporter bias may have occurred, as it was up to the patients and caregivers to decide who should fill out the questionnaire. On the other hand, the design provided practical and clinically meaningful information regarding the preferences of patients with CL/P and their caregivers.
Clinical Interpretations and Recommendations
At all ages, the majority of the participants completed the CLEFT-Q scales together. In younger patients, caregivers more often completed the scales alone. The opposite is observed in older patients, where the proportion of patients completing the scales alone exceeded that of the “caregiver” group.
This finding might indicate the added value of a parent version of the CLEFT-Q scales, especially for younger patients. Many pediatric QoL questionnaires have such a validated parental version available.25 The current results show that parental involvement is high, as the majority of the caregivers complete the questionnaire either with their child or alone; the availability of a parental version of the CLEFT-Q could encourage their involvement further. It might be a valuable addition to the ICHOM Standard Set as well, where the caregiver’s voice is currently captured by only one validated parent-reported outcome measure, which assesses speech function.10 Furthermore, a parental version would enable assessments before the age of 8 years, which is the current minimum age for patients to complete the PROM. It would enhance longitudinal data collection and minimize patient burden, and it is simply more user-friendly to let patients and parents decide who should fill out the PROM.
Besides the patients’ and parents’ preference to complete the CLEFT-Q together, parental involvement offers advantages from a clinical perspective as well. Wong Riff et al reported that a small number of participants in their study felt worse about how they look after filling out the appearance scales of the CLEFT-Q.30 Parental involvement could provide emotional support to pediatric patients who experience emotional impact due to their disease or disability. Moreover, the authors of the current study noticed in their outpatient clinic that patients with CL/P who filled out the scales together with a parent reported that the questions created a reason to openly discuss topics highlighted by the CLEFT-Q scales. The discrepancies in mean outcome scores found in the current study additionally suggest that parents might not always be aware of the emotional impact of the cleft on their child. Both findings suggest that some form of parental involvement should be preferred when a pediatric PROM is used in a clinical setting. To keep the outcomes unaffected by parental opinions, clinicians should encourage limiting parental involvement to an evaluation of the outcomes after the patient has filled out the PROM scales alone.
Concerning possible discrepancies between the reporter type outcomes in 12-year-olds and 15-year-olds, the “patient” group tends to score higher (thus better) on the scales in the psychosocial domain and the appearance domain. However, no significant differences between any of the reporter types were found in the current study, despite a difference in mean outcome score of almost 20 between “patient” and “caregiver” in the Speech function scale at 12 years. The lack of significance might be caused by the wide spread of outcome scores in the groups.
Discrepancies between reporter types have been examined in studies with other pediatric QoL questionnaires as well. Large systematic reviews of more generic pediatric QoL measurement instruments found that differences in outcomes between patient-reported and proxy-reported questionnaires are often larger in social and emotional domains, whereas functional domains seem to show more concordance.16,31,32 Volpicelli et al examined discrepancies between patient and parent reports regarding psychosocial function in pediatric patients with craniofacial anomalies, using outcome measurement instruments other than the CLEFT-Q, and included patients in similar age groups. In their findings, parents tended to score more negatively on the psychosocial domain than children in the older age groups (12 and 15 years).33 This tendency is in line with our findings, as most psychosocial scale outcomes were higher in the “patient” group than in the “caregiver” group at ages 12 and 15. However, the current study results should be interpreted with caution, as participants were able to decide by themselves who should fill out the PROM: it could be that patients who did not experience many problems were more prone to complete the CLEFT-Q scales by themselves, resulting in a reporter group with more optimistic outcomes.
Suggestions for Future Studies
Further, more in-depth research on this topic could examine discrepancies based on the gender of the parent or the child, as was done by Ooi et al in a study regarding obese children.17 Moreover, research concerning the reporter group that completes the questionnaire together should be conducted. Although studies often focus on outcome differences between patient reports and parent or proxy reports,15-17 the current study points out the importance of examining this third, large group. It is difficult to determine the exact share of the patient or the parent when the scales are filled out together, which is reflected in the results of this study: outcomes from the “together” reporter group are not simply the mean of the outcomes of the “patient alone” and “parent alone” groups. The presence of response sets within the responder groups, and differences between these response sets, might play a key role in differences in outcomes, as children might answer differently in order to please their parents or to appear competent.34,35
Although the diversity of this reporter group might complicate further psychometric assessment, the current study shows that in daily clinical practice, the large majority of the patients fill out the questionnaire together with at least one caregiver. Therefore, centers that have implemented a pediatric PROM are recommended to record the reporter type.
Furthermore, future studies with a prospective set-up could elucidate possible discrepancies between the outcomes of patients and caregivers, like the study of Volpicelli et al.33 Such a set-up would enable the collection of outcomes from both patient and parents concerning the same patients. However, a validated parental version should be available first. If a validated parental version is developed, it is of great importance to first examine outcomes in a set-up as mentioned above: as in other pediatric PROMs, (validated) outcomes of parent versions and patient versions have to be compared to check whether they are interchangeable, or whether they are only valid for comparison of outcomes within their own responder group.
From a more clinical perspective, future studies that focus in more depth on the implementation of both patient and (future) parent versions of the CLEFT-Q scales could be of great value for clinical use. Implementing PROM scales in an efficient and meaningful manner, even while a parent version is not yet available, is important to keep patient burden minimal while still retrieving all important information. Implementing the right CLEFT-Q scales at the optimal time points might improve patient care without “asking too much” of our patients and families, and several studies have already made recommendations to do so.36,37 A critical look at the establishment of sets of outcome measures should be maintained. Depending on the objective of the set of outcome measures, the potential availability of a parent version could help in deciding whether some scales would be better completed by parents or by patients.
Conclusion
Although the CLEFT-Q questionnaire is designed and validated for patient reports, the majority of patients with CL/P and parents prefer to fill out the CLEFT-Q scales together. More patients complete the questionnaire alone in older age groups, whereas the group of parents who complete the questionnaire for their child decreases with the age of the patient. Mean outcome scores per age suggest discrepancies between reporter types, in which patients tend to score more positively when completing the questionnaire alone. Therefore, caution should be taken concerning reporter type, and it is recommended to record reporter type in pediatric PROMs in order to promote standardization of outcome measures and optimize research possibilities. From a clinical perspective, parental involvement should be encouraged in the assessment of pediatric PROMs, as involvement can motivate families to openly discuss the impact on the QoL of the child. Therefore, a recommended alternative to completing the questionnaires together is to let parents evaluate the outcomes of the PROM with their child after the PROM is filled out by the patient alone. This way, parental support can be provided while the outcomes remain unaffected by parental opinions. In the future, a parental version of the CLEFT-Q could be of additional value, especially in younger patients.
Footnotes
Appendix
Confidence Intervals of the Trend Proportions Per Age Group, Per Responder Type.
| Age | Responder group | Lower bound | Upper bound |
|---|---|---|---|
| 8 y | Patient | 0.023 | 0.08 |
| 8 y | Caregiver | 0.16 | 0.26 |
| 8 y | Together | 0.69 | 0.80 |
| 12 y | Patient | 0.05 | 0.13 |
| 12 y | Caregiver | 0.06 | 0.13 |
| 12 y | Together | 0.77 | 0.87 |
| 15 y | Patient | 0.13 | 0.36 |
| 15 y | Caregiver | 0.02 | 0.17 |
| 15 y | Together | 0.56 | 0.81 |
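As an illustration of how bounds of this kind can be obtained, the following is a minimal Python sketch of the Wilson score interval for a single proportion. The text does not state which interval method or confidence level produced the table above, so both the method and the 95% level are assumptions, and the result need not reproduce the tabulated values. The function name is hypothetical, and the count of 13 out of 56 is derived from the reported 23.2% of 15-year-olds rather than taken directly from the source.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denominator = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denominator
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denominator
    )
    return center - half_width, center + half_width


# Assumed count: 13 of 56 patients aged 15 years completing the scales alone.
lower, upper = wilson_interval(13, 56)
print(f"{lower:.3f} to {upper:.3f}")
```

The width of such an interval shrinks with the square root of the group size, which is why the bounds for the 15-year age group are much wider than those for the two younger groups.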
Acknowledgements
We would like to thank all children and caregivers who participated in the study, and who contributed by filling out the questionnaires and allowing the research team to use their data for the current study. Furthermore, we would like to thank the cleft team clinicians from both participating centers, who contributed tremendously by collecting the data consistently and by advising the research team on important clinical considerations.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical Approval
Institutional Review Board approval was obtained (MEC-2016-156).
Data Availability
Use of the CLEFT-Q™ questionnaire, developed by Anne Klassen, PhD, and Karen Wong, MD, MS, was made under license from McMaster University, Hamilton, Ontario, Canada. The ICHOM Standard Set for Cleft Lip/Palate is maintained under the auspices of the International Consortium for Health-Outcome Measurement (ICHOM) and is available free at
.
