Abstract
In this study, we review the implementation, reliability, and validity of the Pediatric Quality of Life Inventory (PedsQL), a measure of health-related quality of life, in young children in rural Guatemala. Mothers of 842 children (
What Do We Already Know About This Topic?
Children living in low resource settings are exposed to many risks to their physical health and therefore to their overall health-related quality of life.
How Does Your Research Contribute to the Field?
The most commonly used measure of health-related quality of life in children, the PedsQL, did not perform as expected and did not discriminate between groups of healthy children and those with chronic or acute illness.
What Are Your Research’s Implications Toward Theory, Practice, or Policy?
Future studies in low resource settings, particularly those involving the youngest children, should describe implementation of the health-related quality of life measure and analysis of reliability and validity to ensure that data accurately reflect what is occurring in that population.
Background
The World Health Organization defines health-related quality of life (HRQOL) as not simply an indicator of the presence or absence of disease, but as a more comprehensive metric of physical, mental and social well-being. 1 There is a growing recognition of the importance of measuring quality of life in health outcomes, because children who suffer from health-related conditions are not only impacted physically. Emotional, social and school functioning are also affected, placing these children at a disadvantage across all areas of daily living compared to their healthy peers. 2
Measuring HRQOL is particularly important in low resource settings (LRSs) where children are exposed to many risks to their physical health.3,4 The most common health problems, unique to children living in LRSs, are stunting and wasting, with incidence rates as high as 1/3 to 1/2 in the most impacted parts of the world.5–8 Stunting and wasting are reported to be responsible for 2.2 million deaths a year and 35% of the disease burden in children under 5. 8 Stunting is understood to be a proxy for chronic undernutrition and related to compromised immunity and repeated illness.9–11 Wasting is thought to be a reversible condition related to acute undernutrition and can be seen in cases of food shortages or during outbreaks of disease, for example. 12 Not surprisingly then, several studies have shown stunting and wasting to be related to lower HRQOL13–15 and consequently, poor long-term outcome.16–21
While there is no agreed upon gold standard for the measurement of HRQOL, the PedsQL has emerged as one of the most commonly used tools in clinical practice and healthcare research. Many studies have shown it to be a valid instrument with good ability to discriminate between groups of healthy and ill children, differing levels of disease severity, and chronic and acute illness.22–25 While it was developed in the United States, the PedsQL is widely used throughout the world including in LRSs.22,26–28 However, the majority of studies outside of the US report PedsQL results without presenting data on whether it has been shown to be a reliable metric of HRQOL with good discriminability both cross-culturally and in that particular study setting.29–32 Of the studies that do report on the psychometric properties of the PedsQL in LRSs, some have shown that it can be used successfully13,33,34 while others have found that it did not discriminate as well as expected between groups of healthy and ill children and have called for more research when utilized in these settings.27,35,36
Additionally, most studies of the PedsQL in LRSs have been conducted on school-aged children, and there is less evidence for its use in the very young, particularly infants and toddlers.13,27,34,37,38 In Argentina, the PedsQL, adapted for use at the study site, worked well for children ages 5 years and older, but was less reliable for children 2 to 4 years of age. 39 Even in the US, the research has been equivocal, with some studies showing that the PedsQL can be used successfully in infancy23,26,40 and through the preschool years,23,41 and another study showing that it may not perform as well with the youngest children. 42 Few studies have focused on the reliability and discriminatory ability of the PedsQL to measure HRQOL in an infant/young child population in an LRS.43,44
The objective of the current study was to report on implementation, reliability and discriminant validity of the caregiver report PedsQL in order to advance the understanding of HRQOL of young children living in poverty in rural Guatemala. We hypothesized that PedsQL scores would remain stable over time in healthy children and would discriminate between healthy children (i.e., children without a known, active health problem) and those meeting criteria for stunting or wasting. A secondary objective was to examine the sensitivity of the PedsQL to acute illness. We hypothesized that PedsQL scores would be lower for children during the time of acute illness compared to when they were healthy.
Methods
The Parent Study enrolled children from birth to 5 years of age who were prospectively followed for 1 year to determine the incidence of postnatally acquired symptomatic and asymptomatic Zika virus (ZIKV) infection. Infant-mother pairs were eligible if the child was 0 to 2.9 months of age and the mother was ≥16 years of age at enrollment. They were screened and enrolled from the Center maternal and child health program, from referrals by community health workers of pregnant women who were close to delivery or had delivered an infant in the previous 2 months. Older children (<5 years of age) were eligible if they either participated in a prior 2015 to 2016 dengue acute febrile illness surveillance study or were a sibling of an enrolled infant. Informed consent was obtained by research nurses, who were available to discuss the details of the study. Informed consent forms were signed according to the National Regulations in Guatemala, and parents were supplied with a copy of the form. If the parent was illiterate, they provided a thumbprint on the consent document, and a witness who was not a member of the research team was present to countersign to attest that the study information was provided and the parent(s) understood the study procedures, risks, benefits and alternatives. There were no cases of acute ZIKV during the time of the study.
In addition to serial neurodevelopmental assessments, the PedsQL was administered to mothers at the following time points for each participating child: baseline (enrollment within the first 3 months of life), 3, 6, 9, and 12 months after enrollment for infants; baseline, 6 and 12 months for children under 36 months of age; and baseline and 12 months for children 36 months of age and older. For the purposes of consistency in comparing children for the current analyses, baseline, 6 months and 12 months PedsQL scores were used. Height and weight measurements were taken at each visit by study nurses. Salter and hanging scales were used to weigh infants and floor scales were used to weigh children. Seca infantometers were used to measure infants’ length and stadiometers were used to record children’s height.
Weekly illness surveillance was conducted, in which mothers were contacted by phone and asked about whether or not their child had suffered from fever >38.0°C, rash, conjunctivitis (non-purulent/hyperemic), arthralgia, myalgia, or peri-articular edema for more than 1 day. Any child with at least 2 of these symptoms met the case definition for a flavivirus-like illness (FLI), which prompted a case investigation and diagnostic testing for ZIKV, dengue, and chikungunya. Mothers were administered the PedsQL for the child within 4 weeks after every FLI episode in order to capture any effects of the recent illness.
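The case definition above amounts to a simple counting rule. The following is a minimal sketch of that rule, not the study's actual surveillance software; the symptom identifiers and the function name are hypothetical, and the sketch assumes the caller has already applied the fever threshold and any duration requirement before listing a symptom as reported.

```python
# Symptom labels mirror the surveillance list in the text; the identifiers
# themselves are illustrative and not taken from the study.
FLI_SYMPTOMS = frozenset({
    "fever",
    "rash",
    "conjunctivitis",
    "arthralgia",
    "myalgia",
    "periarticular_edema",
})


def meets_fli_case_definition(reported_symptoms):
    """Return True when at least 2 of the listed symptoms were reported.

    Symptoms outside the surveillance list are ignored.
    """
    qualifying = FLI_SYMPTOMS & set(reported_symptoms)
    return len(qualifying) >= 2
```

Under this sketch, a report of fever alone would not trigger a case investigation, while fever together with rash would.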
The PedsQL Generic Core Scales 4.0, designed for use in both healthy and patient populations, was administered.24,47 Mothers were asked how much of a problem each item had been during the past month and to respond on a 5-point Likert scale (0 = never, 1 = almost never, 2 = sometimes, 3 = often, and 4 = almost always). Items are reverse scored and transformed to a scale of 0 to 100 points with higher scores indicating a better HRQOL. Three summary scores are derived from the PedsQL: Psychosocial Health (comprised of Emotional, Social and Cognitive Functioning for children <24 months and Emotional, Social, and School Functioning for children age 2 and older), Physical Health (comprised of Physical Functioning and Physical Symptoms for children <24 months and Physical Functioning for children 2 years and older) and Overall Health Related Quality of Life (all scales combined). Scores are raw scores and are not standardized. Composite data from 9430 school-aged children in the US showed a mean total score of 82.70 (SD = 15.40) on caregiver report with >1 standard deviation below the mean indicating at-risk status for impaired HRQOL. 25 A US sample of 246 healthy infants showed a similar mean total score of 82.47 (SD = 9.95) for healthy children. 26
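As an illustration of the reverse scoring and 0 to 100 transformation described above, the sketch below assumes responses coded 0 through 4 and a linear mapping (0 becomes 100, 4 becomes 0). The function names are hypothetical, and the handling of unanswered items (simply skipping them) is an assumption of this sketch rather than the instrument's official missing-data rule.

```python
def transform_item(raw):
    """Reverse-score a 0-4 response onto the 0-100 scale (0 -> 100, 4 -> 0)."""
    if raw not in (0, 1, 2, 3, 4):
        raise ValueError(f"response must be an integer from 0 to 4, got {raw!r}")
    return 100 - 25 * raw


def scale_score(responses):
    """Mean of the transformed items, skipping unanswered items (None).

    Returns None when no item was answered. Assumption: no minimum number
    of answered items is enforced here.
    """
    answered = [transform_item(r) for r in responses if r is not None]
    if not answered:
        return None
    return sum(answered) / len(answered)
```

For example, responses of "never", "almost always", and "sometimes" (0, 4, 2) transform to 100, 0, and 50, giving a scale score of 50. Using the US caregiver-report figures quoted above, a score more than 1 standard deviation below the mean would fall below 82.70 − 15.40 = 67.30.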
Literacy: A 2011 community needs survey found 21% of mothers reported having no formal education. 46 Overall, 52% of mothers completed school beyond second grade and only 7.5% of mothers attended school beyond the 6th grade. Therefore, given that it was likely that a sizable proportion of mothers may have struggled with the literacy demands of the PedsQL and were demonstrating difficulty during pretesting, we administered it orally to all mothers to ensure equal access to the test and consistency in test administration across the Parent Study.
Question stem: The question stem on the PedsQL refers to problem severity (i.e., “How much of a problem has your child had with..?”), but requires a response regarding frequency (e.g., never, almost always). While this seems to work in English, it appeared to create substantial confusion in our study. To address this, we modified the stem to “¿Qué tan seguido ha tenido problemas con. . .?,” which directly translates to “How often has he/she had problems with. . .?” This improved the ability to connect the stem to item responses and reduced the need for clarification from mothers.
Likert scale: During the pretesting, mothers also frequently expressed confusion about the 5-point Likert scale, its meaning and the variety of response options available. To address this, we worked to develop a visual prompt to represent each of the 5 options. Several visuals were attempted and failed during pilot testing. In the end, we adapted an unpublished illustration by Perez-Condor and Kohrt 54 of a Guatemalan woman burdened by a heavy sack, which illustrates that having “more” of something was negative. We created a visual of a child with an increasing number of mosquitos on his body (Figure 1). The artist, who is a nurse coordinator at the site (EEB), added the written answers above each drawing for those mothers who could rely on those as well. In introducing the questionnaire, we provided a verbal explanation about the frequency and not the burden/severity of the mosquitos. So, for example, we described that he “never” or “always” has mosquito bites. This resulted in improved ability to answer questions and in better variability in responding qualitatively.

Figure 1. Visual of PedsQL Likert scale.
A Pearson’s product moment correlation was computed to examine the correlation between PedsQL scores among children categorized as healthy at all time points. We expected to see consistent scores over time given normal growth indicators for height and weight and the absence of acute illness. Next, multivariable mixed models were used to compare PedsQL scores over time between healthy children and children meeting criteria for stunting and between healthy children and children meeting criteria for wasting. PedsQL data were tested for normality using the Shapiro-Wilk test at each time point for infants and young children separately.
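The test-retest correlation can be illustrated with a short sketch. This is a plain implementation of the Pearson product moment formula, not the study's analysis code; the function name and the idea of passing two equal-length lists (one score per child at each of two visits) are assumptions for illustration.

```python
import math


def pearson_r(scores_visit_a, scores_visit_b):
    """Pearson product moment correlation between two paired score lists."""
    n = len(scores_visit_a)
    if n != len(scores_visit_b) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_a = sum(scores_visit_a) / n
    mean_b = sum(scores_visit_b) / n
    # Sum of cross-products and sums of squares around each mean.
    cross = sum((a - mean_a) * (b - mean_b)
                for a, b in zip(scores_visit_a, scores_visit_b))
    ss_a = sum((a - mean_a) ** 2 for a in scores_visit_a)
    ss_b = sum((b - mean_b) ** 2 for b in scores_visit_b)
    if ss_a == 0 or ss_b == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / math.sqrt(ss_a * ss_b)
```

A value near 1 would indicate that children kept the same relative ranking across visits, which is what stable HRQOL in healthy children would be expected to produce.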
The “minimal clinically important difference” for the PedsQL was described in Varni et al. (2003) as the smallest difference in a score that would trigger different patient management or would be perceived by parents to be beneficial. 24 Therefore, we also examined mean scores for healthy children and children meeting criteria for stunting or wasting to assess whether these differences met criteria for a minimal clinically important difference. Lastly, effect sizes were computed to analyze PedsQL scores at baseline, immediately after a FLI event, and then in the months following. We hypothesized that scores during a FLI event would be lower than before or several months later for a given child.
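The two comparisons described above can be sketched as follows. The text does not state which effect size statistic was used, so Cohen's d with a pooled standard deviation is an assumption here; the function names are hypothetical, and the threshold for a minimal clinically important difference is left as a caller-supplied parameter rather than a fixed value.

```python
import math
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 scores")
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2
                  + (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def meets_mcid(mean_a, mean_b, threshold):
    """True when the absolute difference in means reaches the threshold."""
    return abs(mean_a - mean_b) >= threshold
```

A positive d from `cohens_d(healthy_scores, stunted_scores)` would indicate higher scores in the healthy group. A comparison within the same child across visits would call for a paired variant instead, which this sketch does not cover.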
Results
Analysis of infants consisted of 486 subjects for whom anthropometric measurements and linked PedsQL data were available for at least 1 time point (enrollment, 6 months visit, or 12 months visit). Of these 486 subjects, 309 were healthy (no stunting or wasting and no FLI events) at all visits, 13 met criteria for stunting or wasting at all visits, and 159 met criteria for stunting or wasting at some but not all timepoints. Infant subjects were 47% female, and an average of 1.1 (SD 1.0) months old at first anthropometric measurement and PedsQL administration. Analysis of older children consisted of 357 subjects for whom anthropometric measurements and linked PedsQL data were available for at least 1 time point (enrollment, 6 months visit, or 12 months visit). Of these 357 subjects, 182 were healthy at all visits, 114 met criteria for stunting or wasting at all visits, and 61 met criteria for stunting or wasting at some but not all timepoints. Older children were 48% female, and an average of 35.9 (SD 10.3) months old at first anthropometric measurement and PedsQL administration.
Test-retest reliability of the PedsQL for infants and children categorized as healthy (never meeting criteria for stunting or wasting or a FLI during the time of the study,
Pearson Correlation Coefficients of PedsQL Scores Across Visits in Healthy Infants and Older Children.
Infants: children 0-3 months at the time of enrollment.
Older children: siblings and community controls ages 1 to 5 years at the time of enrollment.
Comparison of Mean PedsQL Scores Between Infants and Children Healthy at All Time Points Compared to Those Who Met Criteria for Wasting or Stunting at All Time Points.
Statistically significant
Met threshold for the minimal clinically significant difference (
Multivariable mixed models compared PedsQL scores at visits in which an infant or child met criteria for stunting with visits of those that were healthy, as well as visits in which an infant or child met criteria for wasting with healthy infants and children. Those who met criteria for a FLI at any point in the study were not included in this analysis. The models were adjusted for gender. At clinic visits in which an infant met criteria for stunting, the infant had significantly lower scores on Psychosocial Health (
Association Between PedsQL Scores Collected at Visits During Which the Infant or Child Met Criteria for Stunting Versus Visits at Which the Infant or Child Did Not Meet Stunting Criteria*, Adjusted for Relevant Confounders.
Adjusted for gender; those with FLI events not included. Visits at which the infant or child also met criteria for wasting are not included.
Statistically significant (
Association Between PedsQL Scores Collected at Visits During Which the Infant or Child Met Criteria for Wasting Versus Visits at Which the Infant or Child Did Not Meet Wasting Criteria*, Adjusted for Relevant Confounders.
Adjusted for gender; those with FLI events not included. Visits at which the infant or child also met criteria for stunting are not included.
Statistically significant (
Effect sizes were small to medium and frequently not in the expected direction for infants and children before, during and after a FLI event (Table 4).
Effect Sizes of the Means of PedsQL Scores for Children with a FLI Event.
Discussion
Low to moderate test-retest reliability was found, demonstrating modest support for the PedsQL as a reliable measure of HRQOL in infants and children without known chronic or acute health problems. While the difference in scores between those meeting criteria for stunting and healthy infants and children was statistically significant when multiple test administrations were analyzed over time, this difference was not evident for wasting. Differences were also not consistently evident when stunting and wasting were combined and analyzed at specific time points for older children and were not evident at any time point in infants, although this latter analysis was limited by small sample size for the stunting/wasting group. For the time points that met statistical significance, the difference in means did not consistently meet the threshold for a minimal clinically important difference, 24 potentially limiting any real-world implications. Lastly, mothers did not report statistically significantly lower scores in the weeks following an acute illness episode when compared to time points in which the same child was healthy. Therefore, as with wasting, the PedsQL did not appear to capture the impact of acute illness in this population.
While we found modest support for the PedsQL as a reliable measure of HRQOL in young children without known, active health problems, it did not function reliably as a discriminatory tool between healthy children and children meeting criteria for wasting, nor did it discriminate reliably when an individual child was experiencing acute illness. PedsQL scores were lower in certain indices for infants and children with stunting, but only when multiple assessments over time were included, which is likely not feasible for many smaller studies or for those planned to assess children at a single time point. While there is a growing body of literature supporting the use of measures developed in high-income countries to assess children in LRSs when carefully translated, adapted, and applied59–61 and we have successfully shown other U.S.-developed measures to work reliably in our population,62,63 our data suggest that the PedsQL did not function sufficiently as a reliable indicator of HRQOL in these groups of young children in rural Guatemala.
There were limitations of our study that may partially explain these results. In pretesting, there were significant problems with literacy and comprehension of test items, the question stem and the Likert scale of the PedsQL. Therefore, modifications to administration were necessary to address these problems. While some studies have shown that mode of administration (paper, computer, telephone interview) did not impact PedsQL results,64,65 our additional modifications to administration, while minor, potentially changed the measure such that it performed differently than the original version. However, again, given the extent of problems evidenced during pretesting, we believe we would not have been able to use the PedsQL at all without modifying how it was administered.
Additionally, there were limitations to how we categorized children into the healthy group. Due to the lack of available health care specialists in the area, there were likely children who would have met criteria for a chronic medical condition yet were undiagnosed and so were included in the healthy sample. There are also high background rates of infectious disease and other acute illnesses in the community45,46,66 which may mean that even infants and children not meeting criteria for stunting or wasting or with an active FLI event were not truly “healthy,” also complicating any comparisons between groups and potentially attenuating our results.
It should be noted that in other analyses, our translated and adapted performance-based neurodevelopmental measure, the Mullen Scales of Early Learning, 67 did discriminate between children meeting criteria for stunting or wasting and children we categorized as “healthy” with the former performing more poorly across domains. 68 It is possible that a measure focused on HRQOL, like the PedsQL, picks up on the elevated background rates of health problems in all children in a high risk community such as this and so fine distinctions between groups of children are more difficult to capture than differences between children on measures focused on developmental skill domains.
Presumably, this problem with high background rates of disease and unknown rates of other chronic and acute medical problems would be common in other LRSs around the world, as well. It is possible then that these factors would make it difficult for the PedsQL or any other health-focused measure to isolate the impact of any 1 disease or factor in these higher risk communities. It may also be challenging for caregivers to recognize and report a symptom as problematic that is normative in that particular high-risk community. For example, if there are high rates of stunting in a community, then low energy and lethargy may be commonly observed in children and so the mother of a child with stunting may not view her child as different in this way.
The PedsQL may also have functioned differently than the Mullen Scales of Early Learning in our study because the latter is a performance-based assessment and the PedsQL is caregiver report. In the field during administration of the PedsQL, we repeatedly observed incongruencies between what mothers reported during administration of the PedsQL and what was reported to clinic nurses in informal interview or observed by the team. For example, on 1 FLI visit, the mother reported a very recent hospitalization of the infant for bronchitis. The baby continued to have loud and labored breathing. Yet, on the PedsQL, the caregiver responded that the baby had not been struggling to breathe and was not making any sounds while breathing over the past 2 weeks. Therefore, while the PedsQL has been shown to be reliable in other settings and despite our efforts to modify administration to improve comprehension of the measure (described in detail above), we often struggled to obtain answers consistent with what mothers reported on more informal interview. Again, this may suggest some specific challenges, cultural and otherwise, to caregiver report in our study community that we have yet to fully understand and not a problem with the PedsQL in particular.
Conclusion
Understanding the impact of HRQOL in populations of children exposed to elevated and cumulative risk factors to their health and well-being is of utmost importance.4,22 However, ensuring that these data accurately reflect what is occurring so that targeted and effective interventions can be developed is critical. Any future studies in LRSs, particularly those involving the youngest children, should describe implementation of the measure and analysis of reliability and validity in that population. Sharing this information will help other research and clinical teams apply lessons learned and more accurately assess the health of the pediatric populations they are supporting.
Footnotes
Author Contributions
AKC directed the local assessment team, efforts to adapt, pilot and implement the measure, and led manuscript preparation.
MML performed the data analysis and interpretation.
AMC assisted in supervision of the local assessment team, data analysis and interpretation.
DB, SH, PA, MAM were the local psychologists involved in piloting and adaptation of the PedsQL, test administration, data input and scoring.
EEB assisted in the piloting and adaptation of the PedsQL, as well as in the development and artwork for the visual Likert scales.
HME administered overall study procedures and provided oversight for data collection and quality control.
APA, MC, GAB, and DB were involved in the creation of study procedures, piloting and adaption of the measure, quality control and data collection.
EJA and FM are the co-PIs of the parent study, were involved in the development and oversight of all study procedures, as well as in data interpretation and initial to final manuscript development.
All authors read and approved the final manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The National Institute of Allergy and Infectious Diseases at the National Institutes of Health provided contract funding for this project to the Vaccine and Treatment Evaluation Units (VTEUs) at Baylor College of Medicine [HHSN272201300015I. Task Order No. HHSN27200013-16-0057.C1D1.0058].
