Abstract
In this research note, we address the potential of using interviewer-observed paradata, typically collected during face-to-face-only interviews, in mixed-mode and innovative data collection methods that involve an interviewer at some stage (e.g., during the initial contact or during the interview). To this end, we first provide a systematic overview of the types and purposes of the interviewer-observed paradata most commonly collected in face-to-face interviews—contact form data, interviewer observations, and interviewer evaluations—using the methodology of evidence mapping. Based on selected studies, we illustrate the main purposes of interviewer-observed paradata we identified, including fieldwork management, propensity modeling, nonresponse bias analysis, substantive analysis, and survey data quality assessment. Building on this overview, we discuss the possible use of interviewer-observed paradata in mixed-mode and innovative data collection methods. We conclude with thoughts on new types of interviewer-observed paradata and the potential of combining paradata from different survey modes.
Keywords
Introduction
Face-to-face interviewing has long been considered the ‘gold standard’ among data collection methods in market and social research, mainly due to its traditionally higher response rates, better coverage of hard-to-reach target groups, and thus less biased and more representative survey data (Schober, 2018; Villar & Fitzgerald, 2017). Another advantage of face-to-face interviewing is the unique opportunity for the interviewer to collect additional data, so-called interviewer-observed paradata, about respondents and nonrespondents, their living environment, and the interview situation itself. These paradata allow researchers and practitioners to improve fieldwork processes and to ensure high survey data quality (Groves & Heeringa, 2006; Kirchner et al., 2017; Kreuter, 2013).
In recent years, and further reinforced by the COVID-19 pandemic, there have been increasing calls in market and social research to switch from face-to-face-only interviewing to mixed-mode designs (Luijkx et al., 2021; Wolf et al., 2021) or other innovative survey data collection methods (Conrad et al., 2022; Endres et al., 2022; Jeannis et al., 2013; Schober, 2018; West et al., 2022). While the idea of mixed-mode data collection and its benefits are not new (de Leeuw, 2005, 2018; Dillman, 2005; Scherpenzeel, 2017), they have become even more important in the post-pandemic era (Cleary et al., 2021; Kuenzi et al., 2022; Kantar Public, 2021). In addition, innovative methods that involve an interviewer in some way, such as knock-to-nudge contact strategies or remote video interviewing, have gained prominence (Cornick et al., 2022; West et al., 2022).
Mixed-mode and innovative data collection methods allow fieldwork processes to be rapidly adapted to changing conditions and enable more flexible responses to unforeseen events (Cornick et al., 2022; SHARE-ERIC, 2022). Moreover, they enable the collection of rich interviewer-observed paradata at every step in which interviewers are actively involved. Even though many survey researchers and practitioners are already familiar with the common interviewer-observed paradata from face-to-face-only interviews, little is known about the meaningful use of these paradata in mixed-mode and innovative data collection methods. Therefore, this research note provides a systematic overview of the most common types and purposes of interviewer-observed paradata in face-to-face-only interviews. Based on this, we discuss their potential uses in mixed-mode and innovative data collection methods and provide initial suggestions for academic research and practice.
Interviewer-Observed Paradata in Face-to-Face-Only Interviewing
We systematically searched for previous empirical studies dealing with interviewer-observed paradata in face-to-face interviews and compiled them using the evidence-mapping methodology (Saran & White, 2018; Snilstveit et al., 2013). We included 102 articles and coded the types and purposes of interviewer-observed paradata (details of the search, screening, and coding process are provided in the Supplementary Appendix).
Figure 1 shows in the rows the main types of interviewer-observed paradata in face-to-face studies, namely contact form data, interviewer observations, and interviewer evaluations, and their subtypes (see Table A4 in the Supplementary Appendix for a complete list of paradata types coded in our studies, including examples). The columns list the five primary purposes of interviewer-observed paradata that we identified based on our studies: fieldwork management, propensity modeling, nonresponse bias analysis, substantive analysis, and survey data quality assessment. The size of the circles corresponds to the frequency with which the paradata occurred as (in)dependent variables in the analyses of the studies. The paradata types most often used for specific purposes are highlighted in light gray and are briefly described below based on selected studies.
Figure 1. Evidence map on main types and purposes of interviewer-observed paradata in face-to-face interviewing.
Types of Interviewer-Observed Paradata
Purposes of Interviewer-Observed Paradata
Interviewer-Observed Paradata in Mixed-Mode and Innovative Data Collection Methods
First, we briefly describe three data collection methods with interviewer participation that have gained prominence in market and social research during the COVID-19 pandemic: CAPI-plus, computer-assisted video interviewing (CAVI), and knock-to-nudge (KtN) (Cornick et al., 2022). Second, we discuss the use of interviewer-observed paradata for these three methods. The suggested uses are illustrative and do not claim to be exhaustive.
Challenges in Contact and Cooperation
A major objective of mixed-mode and innovative data collection methods is to improve contact and cooperation in order to increase response rates and sample representativeness. For example, offering an alternative non-face-to-face mode in CAPI-plus can make the survey attractive to those concerned about face-to-face interaction or who want to avoid an interviewer in their home (Cornick et al., 2022). Similarly, face-to-face recruitment for a non-face-to-face survey through KtN can increase response rates. However, it also affects the distribution of respondent characteristics (e.g., respondents being younger, unmarried, living in larger households, and in the most deprived areas), presumably due to different likelihoods of respondents being at home and responding to the interviewer’s knock on the door (Kastberg & Siegler, 2022). In addition, KtN requires comprehensive call scheduling because the interview is postponed to a later contact. Concerning CAVI, not all respondents have access to an Internet-enabled device with a camera and microphone. Even if the technical requirements are met, not all respondents are ready for and comfortable with a video interview. Like KtN, CAVI involves comprehensive scheduling (Endres et al., 2022; Schober et al., 2020). Respondents’ varying ability and willingness to participate and the more complex call scheduling, particularly for CAVI and KtN, underscore the importance of tailored fieldwork management, propensity modeling, and nonresponse bias analysis.
In all three data collection methods presented, contact history information can be usefully applied to fieldwork management and propensity modeling to better understand the mechanisms of successful contact and cooperation and to develop effective call scheduling and recruitment strategies, ultimately increasing response rates and sample representativeness. For example, call record data help optimize contact timing and prioritize the cases in CAPI-plus and KtN that are most difficult to reach at home or most likely to refuse in face-to-face mode. When different modes are combined, call sequence outcomes can improve the recruitment strategy by tailoring the timing of mode switching (e.g., determining after how many contact attempts in CAPI mode it is advisable to switch to another mode) and the number and type of reminders (e.g., call reminders, postal reminders, or email follow-ups). We also encourage gathering interviewer observations on the sampled unit’s neighborhood and housing unit in CAPI-plus and KtN during (initial) face-to-face contact. As such observations have proven helpful for propensity modeling in face-to-face-only studies, they are promising for deriving tailored treatments before or early in the field phase of CAPI-plus and KtN (e.g., assigning cases to the appropriate mode). In addition, we recommend paying particular attention to doorstep concerns. Understanding respondents’ concerns and barriers is crucial for data collection methods that are new and unfamiliar to many respondents. KtN and CAVI may involve concerns other than those known from face-to-face-only interviews (e.g., unwillingness to provide a phone number during KtN, inadequate technical equipment, or discomfort with using video in CAVI). Only when the specific concerns are known can appropriate strategies be developed to encourage respondents to participate (e.g., sending experienced interviewers specially trained in refusal conversion, or conducting brief doorstep training on the use of video).
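As a minimal sketch of how call record data could inform contact scheduling, consider the following example. The case IDs, time slots, and outcomes are entirely hypothetical; a production system would, of course, model contact propensity with far richer covariates.

```python
from collections import defaultdict

# Hypothetical call records (case ID, time slot, whether contact was made).
# All names and data are illustrative assumptions, not taken from the article.
call_records = [
    ("A1", "weekday_evening", True),
    ("A1", "weekday_morning", False),
    ("B2", "weekday_morning", False),
    ("B2", "weekend_afternoon", True),
    ("C3", "weekday_evening", True),
    ("C3", "weekday_morning", False),
    ("D4", "weekend_afternoon", False),
]

def contact_rates(records):
    """Empirical contact rate per time slot, derived from call record data."""
    hits, tries = defaultdict(int), defaultdict(int)
    for _case, slot, contact in records:
        tries[slot] += 1
        hits[slot] += contact
    return {slot: hits[slot] / tries[slot] for slot in tries}

def best_slot(records):
    """Time slot with the highest observed contact rate, e.g., for
    scheduling the next round of contact attempts."""
    rates = contact_rates(records)
    return max(rates, key=rates.get)

print(best_slot(call_records))  # weekday_evening in this toy data
```

The same tabulation could be extended to decide, per case, after how many failed attempts in one mode a switch to another mode is advisable.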
In addition, contact history information and interviewer observations on all sampled cases, including nonrespondents, help assess the extent of nonresponse and the consequences of nonparticipation for sample composition and survey estimates. For example, interviewer observations of (non)respondents’ sociodemographic characteristics (e.g., age, ethnicity, language spoken) or household type and composition (e.g., single-person household, presence of children) may explain why some respondents are more likely to refuse in CAVI than others or to prefer one mode over another in CAPI-plus and KtN. These paradata can also provide insight into how switches in survey mode and increased fieldwork effort counteract nonresponse (bias). Particularly in mixed-mode data collection, the success of a measure (e.g., number of reminders, amount of incentives) may vary by survey mode, so measures should be tailored to the mode (e.g., different number of reminders or incentives depending on the mode in CAPI-plus or KtN).
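A simple form of such a nonresponse bias check compares the share of an interviewer-observed characteristic among respondents with its share in the full sample. The sketch below assumes, purely for illustration, that interviewers recorded the presence of children for all sampled cases; names and figures are hypothetical.

```python
# Hypothetical interviewer observations, recorded for all sampled cases,
# including nonrespondents; names and values are illustrative only.
sampled_cases = [
    {"children_present": True,  "responded": True},
    {"children_present": True,  "responded": False},
    {"children_present": False, "responded": True},
    {"children_present": False, "responded": True},
]

def share(cases, key):
    """Share of cases for which the observed characteristic is present."""
    return sum(c[key] for c in cases) / len(cases)

def nonresponse_gap(cases, key):
    """Respondent share minus full-sample share of an interviewer-observed
    characteristic; a nonzero gap signals potential nonresponse bias."""
    respondents = [c for c in cases if c["responded"]]
    return share(respondents, key) - share(cases, key)

# Households with children are underrepresented among respondents here.
print(round(nonresponse_gap(sampled_cases, "children_present"), 3))  # -0.167
```

More elaborate versions of this comparison underlie common representativeness indicators and weighting adjustments.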
Challenges in Data Quality
As with all data collection methods, a challenge with mixed-mode and innovative methods is ensuring the quality of the survey data. Mixing modes results in survey data being collected under very different conditions (e.g., interviewer presence or absence, verbal or visual presentation of question stimuli, differing question formats); thus, mode effects can affect data quality and survey estimates (Conrad et al., 2022; de Leeuw & Hox, 2015; Endres et al., 2022; Lugtig et al., 2011; West et al., 2022). Moreover, when relatively new data collection methods are used that are unfamiliar to both interviewers and respondents, such as CAVI, little is known about the problems that may occur during the interview, such as interrupted speech and frozen or distorted video (Conrad et al., 2022), and about the impact of the new interview situation and the problems encountered on response behavior and data quality. These technical and other issues make it even more essential to take a closer look at the conditions under which the survey data are collected and to evaluate their quality thoroughly.
One advantage of CAPI-plus (when CAPI mode is selected) and CAVI is that interviewers and respondents can usually see each other, and interviewers can thus perceive respondents’ attributes, facial expressions, and nonverbal cues. The visual interviewer-respondent interaction allows for an extensive collection of interviewer evaluations of respondent characteristics that can be used as proxy information for substantive analyses. Most importantly, we recommend the collection of detailed interviewer evaluations of the interview situation and respondent behavior to enable an informed survey data quality assessment. Especially in CAVI mode, new and unexpected interactions and problems may occur, which should be documented through comprehensive interviewer evaluations (e.g., screen sharing not working, technically related interruptions, acoustically related difficulty understanding questions, distractions from incoming emails and notifications on the respondent’s device) to identify low-quality data and explain differences in data quality between survey modes. In addition, interviewer evaluations can help identify groups of respondents for whom CAVI is particularly problematic (e.g., less technically savvy or elderly respondents) and for whom another mode is preferable. Due to the lack of physical proximity between interviewer and respondent, interviewers should be specifically trained to collect interviewer evaluations in CAVI mode so that they know exactly what to look for in the interview situation and how to interpret respondents’ (non)verbal behaviors appropriately.
Conclusions and Considerations for Future Research
The range of interviewer-observed paradata in face-to-face interviewing is diverse, as are their purposes, as we have shown through a systematic overview of the previous literature. Moreover, we found that the usefulness of interviewer-observed paradata is often highly dependent on the interview context. Using CAPI-plus, CAVI, and KtN as examples, we have discussed the applicability of interviewer-observed paradata, typically collected in face-to-face-only interviews, in mixed-mode and innovative data collection methods. We have shown that it is necessary to develop modified and new interviewer-observed paradata tailored to the specific needs of a data collection method to realize its full potential. Modified and new paradata require additional interviewer training and a thorough assessment of the quality and applicability of these paradata in the context of mixed-mode and innovative data collection methods, as the collection conditions may differ significantly from those of face-to-face-only interviews.
A worthwhile endeavor from our perspective is to combine interviewer-observed paradata with paradata from other survey modes. Mixed-mode and innovative methods that involve web-based data collection can profit from web paradata (e.g., response times, questionnaire navigation, and device information), which can be used to better understand respondents’ question-answer processing and to assess survey data quality (for a comprehensive overview of web paradata and their uses, see, for example, Callegaro, 2013; Kunz & Hadler, 2020; McClain et al., 2019). For example, like interviewer evaluations, response time data can indicate whether respondents have comprehension problems with individual questions or how much effort they put into answering them. These automatically collected web paradata can substitute for at least some interviewer evaluations, allowing for the more economical collection of paradata by saving the interviewer time needed to record interviewer-observed paradata and increasing standardization by eliminating interviewer variability in their collection. Alternatively, they can be collected in addition to interviewer evaluations, so that the two types of paradata can be compared to assess their quality and to decide which type will be most useful in future data collection.
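To sketch how automatically collected response times could stand in for an interviewer evaluation of response effort, consider the following minimal example; the item names and thresholds are assumptions chosen purely for illustration.

```python
# Hypothetical per-item response times in seconds from web paradata.
response_times = {"q1": 2.1, "q2": 0.4, "q3": 35.0, "q4": 3.2}

def flag_items(times, fast=1.0, slow=30.0):
    """Flag items answered implausibly fast (possible speeding) or very
    slowly (possible comprehension problems), mirroring what an interviewer
    evaluation might otherwise record. Thresholds are illustrative."""
    return {q: ("too_fast" if t < fast else "too_slow")
            for q, t in times.items() if t < fast or t > slow}

print(flag_items(response_times))  # {'q2': 'too_fast', 'q3': 'too_slow'}
```

Such automatically derived flags could then be compared against interviewer evaluations of the same interviews to assess which source of paradata is more informative.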
Survey researchers and practitioners have recognized in the wake of the COVID-19 pandemic that future survey data collection will likely include multiple modes and different approaches to best meet respondents’ needs. It is therefore necessary to further develop the practice of paradata collection and use and adapt it to the new data collection conditions, particularly mixed-mode settings. We would like to stimulate future research to provide evidence-based insights into how paradata from different survey modes can be usefully supplemented and combined to improve the efficiency of data collection and the quality of survey data.
Supplemental Material
Supplemental Material for Interviewer-Observed Paradata in Mixed-Mode and Innovative Data Collection by Tanja Kunz, Jessica Daikeler, and Daniela Ackermann-Piek in International Journal of Market Research
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The costs for the open access publication were funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) ‐ Project number 491156185.
References
