Abstract
Since the beginning of the coronavirus disease 2019 pandemic, the number of surveys conducted remotely by mobile phone in low-income and middle-income countries has increased rapidly. This shift has helped sustain data collection despite restrictions on mobility and interactions. It might also allow collecting data more frequently on important demographic and socioeconomic topics. However, conducting interviews by mobile phone might affect the accuracy of reported data, for example, if respondents have difficulties understanding questions asked remotely, or data collectors have less time to probe and cross-check answers. In this visualization, the authors explore time trends in age heaping, a strong signal of reporting errors, in six African countries. They show that mobile phone surveys have generated noisier data on age than recent household surveys and censuses, thus possibly affecting researchers’ understanding of demographic processes and confounding multivariate analyses of socioeconomic outcomes.
During the coronavirus disease 2019 (COVID-19) pandemic, surveys conducted in low- and middle-income countries (LMICs) have increasingly been administered remotely by mobile phone. This pivot away from more traditional modes of data collection (e.g., household visits) was made necessary by restrictions on gatherings and mobility. It made it possible to document in near real time how people understood and navigated new health risks, how poor households coped with increasingly unstable livelihoods, and how communities perceived their governments’ responses to the spread of COVID-19 (e.g., Egger et al. 2021; Kohler et al. 2022).
Besides the fact that they can be sustained during epidemics and other crises (e.g., natural disasters, conflicts), the appeal of mobile phone surveys (MPSs) in LMICs stems from (1) more convenient and cheaper logistics than household surveys, (2) rapidly increasing access to mobile phones, and (3) participation rates that remain much higher than in phone surveys conducted in high-income countries (Gibson et al. 2017). MPSs might thus provide an opportunity to collect data more frequently in LMICs on key topics such as poverty, education, fertility, and health (Hensen et al. 2021). They will likely remain prevalent, even after disruptions caused by the COVID-19 pandemic have subsided.
MPSs, however, risk misrepresenting the population of interest in LMICs, because disadvantaged groups continue to have more limited access to mobile phones (Hersh et al. 2021). MPSs might also generate less accurate data than household surveys and censuses, for example, if respondents have difficulties understanding questions asked remotely or data collectors have less time to probe and cross-check answers during interviews conducted by mobile phone. It is unclear whether MPSs meet data quality standards set by household surveys and censuses in LMICs.
We compared age data generated by MPSs with data obtained from household surveys and censuses in six African LMICs where MPSs have recently been conducted. Age is central in analyses of demographic processes (e.g., fertility, mortality). It is also frequently included as a covariate in regression models of many socioeconomic outcomes. The presence of heaping (i.e., excess numbers of individuals with ages ending in 0 and 5) is a strong signal of errors in age data. In each country, we assembled a data set that included household surveys and censuses collected since 1990, along with available MPSs. We then calculated Whipple’s index, a standard measure of heaping at adult ages (Ewbank 1981). Finally, we visualized differences in this index between recent MPSs and the long-term trend observed in household surveys and censuses.
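As an illustration of the index used here, the following minimal sketch computes Whipple's index from a list of reported ages. It assumes the conventional definition, in which the number of respondents aged 23 to 62 who report an age ending in 0 or 5 is divided by one fifth of all respondents aged 23 to 62 and multiplied by 100. The function name `whipple_index` and the default age bounds are illustrative choices, not taken from the analysis reported in this article, and the sketch ignores survey weights.

```python
def whipple_index(ages, lo=23, hi=62):
    """Whipple's index for integer ages in completed years.

    Assumes the conventional 23-62 age range. A value of 100 indicates
    no preference for ages ending in 0 or 5; 500 indicates that every
    reported age ends in 0 or 5.
    """
    in_range = [a for a in ages if lo <= a <= hi]
    if not in_range:
        raise ValueError("no ages within the requested range")
    # Ages ending in 0 or 5 are exactly those divisible by 5.
    heaped = sum(1 for a in in_range if a % 5 == 0)
    # Without digit preference, one fifth of ages would end in 0 or 5.
    return 100.0 * heaped / (len(in_range) / 5)


# One respondent at every single age from 23 to 62: no heaping.
uniform_ages = list(range(23, 63))
print(whipple_index(uniform_ages))  # 100.0
```

Under this definition, the index ranges from 100 to 500, so the thresholds of 110 and 125 mentioned in the text correspond to 10 percent and 25 percent more reports at ages ending in 0 or 5 than would be expected without digit preference.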
In virtually all instances, Whipple’s index was higher in MPSs than in recent household surveys and censuses (Figure 1). In Burkina Faso, two MPSs yielded age data that displayed much more severe heaping than household surveys and censuses. In that country, the MPS with the lowest value of Whipple’s index asked respondents about their dates of birth, rather than age. It was thus unlikely to be affected by patterns of age heaping measured by Whipple’s index. In Côte d’Ivoire, MPSs showed heaping levels comparable with those observed in newly launched household surveys, whereas in Ghana, a recent MPS elicited more severe heaping than the latest population and housing census. In Malawi and Rwanda, where household surveys and censuses often generate accurate age data (Whipple’s index < 110), recent MPSs have generated rough data (Whipple’s index > 125). In Senegal, where the quality of age data collected in household surveys and censuses has improved over the past 15 years, an MPS conducted at the beginning of the COVID-19 pandemic showed particularly high levels of heaping.

The figure shows how surveys conducted remotely by mobile phone were frequently affected by higher levels of age heaping than recent household surveys and censuses conducted in person. The dashed line represents the time trend in age heaping observed in household surveys and censuses, whereas the shaded area surrounding that line is the 95 percent confidence interval; they were both obtained from locally estimated scatterplot smoothing.
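The smoothed trend described in the caption can be approximated with a simple local linear regression. The sketch below is a simplified, pure-Python stand-in for locally estimated scatterplot smoothing, using tricube weights over the nearest fraction of points. It is not the procedure used to produce the figure, it does not compute the confidence interval, and the function name `loess_trend`, the `frac` parameter, and the example values are hypothetical.

```python
def loess_trend(xs, ys, grid, frac=0.75):
    """Local linear fit with tricube weights, evaluated at each grid point.

    xs, ys: survey years and the corresponding index values.
    frac: share of observations that defines the local neighborhood.
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))
    fitted = []
    for x0 in grid:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        # Tricube kernel: weight falls to zero at distance h.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        if sxx == 0:
            fitted.append(my)  # all weight on a single x value
            continue
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        fitted.append(my + (sxy / sxx) * (x0 - mx))
    return fitted


# Hypothetical values for in-person surveys, for illustration only.
years = [1990, 1995, 2000, 2005, 2010, 2015, 2020]
index_values = [180.0, 170.0, 160.0, 150.0, 140.0, 130.0, 120.0]
trend = loess_trend(years, index_values, grid=[2005, 2018])
```

Comparing the index from a mobile phone survey with the fitted trend value for the same year gives the kind of contrast the figure is designed to show.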
Our findings suggest that the recent shift to administering surveys remotely by mobile phone has led to noisier age data in several LMICs. This will likely affect the measurement of demographic processes and confound multivariate analyses of socioeconomic outcomes. It might also signal broader data quality issues in MPSs. As calls to expand the use of MPSs to collect socioeconomic data in LMICs are made (Gourlay et al. 2021), new strategies to improve the accuracy of age reports generated by MPSs are needed.
Supplemental Material
Supplemental material for this article is available online.
