Abstract
Social desirability bias is a problem in surveys collecting data on norm violations and compliance. If confronted with sensitive questions, respondents systematically underreport norm violations and overreport norm compliance. This leads to biased survey estimates and poor data quality. To improve the measurement of norm violations and compliance, the item count technique (ICT) has been developed. The ICT anonymizes the question-and-answer process. In their experimental survey (n = 2,510), the authors use the ICT to study norm violations and compliance during the coronavirus disease 2019 (COVID-19) pandemic in Europe. More specifically, they estimate the prevalence of vaccination certificate falsification and self-isolation with COVID-19 symptoms. Estimates obtained using standard direct questioning (DQ; n = 1,006) are compared with ICT estimates (n = 1,504). As a result, the authors find no significant difference between estimates of vaccination certificate falsification (4.8 percent with DQ vs. 4.5 percent with the ICT). At the same time, they find a significant difference between estimates of self-isolation with COVID-19 symptoms (87.7 percent with DQ vs. 76.0 percent with the ICT). Conventional survey measures based on DQ thus likely overestimate the extent of norm compliance in a pandemic such as COVID-19.
At the beginning of 2020, the severe acute respiratory syndrome coronavirus-2 began to spread worldwide. By the end of 2021, the number of infections had risen dramatically. One of the first measures to fight the pandemic was the introduction of coronavirus disease 2019 (COVID-19) tests. In many countries, there was a rule or at least a recommendation to self-isolate if one tested positive or had COVID-19 symptoms. Other norms and measures were social distancing, washing one’s hands, and wearing a face mask. One big step in the fight against the pandemic was the rapid development of vaccines. To travel or participate in public life, proof of vaccination became a mandatory requirement in many countries. One example of a proof of vaccination was the introduction of the digital COVID-19 certificate of the European Union (EU) (European Commission 2022). The new rules and norms demanded citizens’ compliance. However, not all citizens followed the new norms (Diekmann 2020). For a survey study it is a challenge to determine empirically the extent of norm violations and compliance with regard to the protection measures during the COVID-19 pandemic.
The research literature contains many empirical studies investigating deviant or compliant behavior in connection with various COVID-19 protection measures. Examples of this research include vaccination hesitancy or refusal (see Al-Qerem and Jarab 2021; Alrabadi et al. 2023; Bendau et al. 2021; Biswas et al. 2021; Dror et al. 2020; Lazarus et al. 2023; Leigh et al. 2022; Shah and Coiado 2022; Troiano and Nardi 2021; Wolter et al. 2022), compliance with the social distancing norm (Becher et al. 2021; Diekmann 2020; Munzert and Selb 2020; Oosterhoff and Palmer 2020; van Rooij et al. 2020), refusal to wash hands (Becher et al. 2021; Oosterhoff and Palmer 2020), and compliance with the obligation to wear face masks (Betsch et al. 2020; Diekmann 2020).
Regarding the falsification of vaccination certificates, there are studies analyzing data from various international dark web marketplaces exploring and describing the buying and selling process, as well as the motives of the actors (Catalani et al. 2023; Childs 2023; Georgoulias et al. 2023). A study in Nigeria identified health care workers’ vaccination hesitancy, corruption, and inadequate salaries as key contributing factors to the circulation of fake COVID-19 vaccination cards (Ali et al. 2024). However, quantitative prevalence estimates for vaccination certificate falsification are rare (Helbing and Krumpal 2024).
Explaining and researching the influence of self-isolation on mental health was the focus of various empirical studies (Ellis, Dumas, and Forbes 2020; Nkire et al. 2021; Sepúlveda-Loyola et al. 2020; Zhao et al. 2020). Some studies investigated attitudes and determinants of norm-compliant behavior in connection with COVID-19 measures such as self-isolation or self-quarantine (Galasso et al. 2020; Kowalski et al. 2020; van Rooij et al. 2020). However, the focus of these studies was more on explaining norm-compliant behavior and less on estimating its prevalence. Comparatively few studies report prevalence estimates with regard to self-isolation during the COVID-19 pandemic. A survey study on self-isolation compliance in Israel concluded that 94 percent of respondents were willing to comply with self-quarantine regulations if their lost wages were compensated (Bodas and Peleg 2020). In a Mexican survey study, 48 percent of respondents reported staying at home when experiencing COVID-19 symptoms (González-Morales et al. 2021). In three survey experiments in Canada, socially desirable answers to sensitive items such as “visiting someone” or “being visited by someone” during COVID-19 lockdown were examined using different questioning techniques (Daoust et al. 2021). Using face-saving question wording, self-reported noncompliance with the COVID-19 lockdown measures increased by up to 11 percentage points compared with standard question wording. In the face-saving condition, 24 percent of respondents reported that they were visited by someone, and 21 percent reported that they visited someone during COVID-19 lockdown. Using standard question wording, in contrast, respondents substantially underreported norm violations (13 percent had been visited by someone and 10 percent visited someone; see Daoust et al. 2021).
All in all, the effectiveness of public health measures to fight a pandemic depends on the citizens’ compliance with those measures. During the COVID-19 pandemic, new norms of prosocial behavior emerged (Berger and Krumpal 2021; Korn et al. 2020; Tutić, Krumpal, and Haiser 2022), such as the norms to get vaccinated or to self-isolate with COVID-19 symptoms in order to contribute to the collective good of public health. However, if a substantial share of the population free rides and violates these norms, then the success of the public health measures will be at risk. As reviewed earlier, survey studies can in principle estimate the shares of norm violators and compliers in a population. Such information can serve as an indicator of the public acceptance of the new norms and public health regulations. However, social desirability bias is a problem in surveys collecting data on norm violations and compliance (Krumpal 2013; Tourangeau and Yan 2007; Yan 2021). If confronted with such sensitive questions, respondents systematically underreport norm violations and overreport norm-compliant behavior. This leads to biased survey estimates and poor data quality. The item count technique (ICT) was developed to anonymize the data collection process and to reduce social desirability bias in respondents’ self-reports of norm violations and compliance. In our experimental survey study, we use the ICT to estimate the prevalence of vaccination certificate falsification and self-isolation with COVID-19 symptoms in Europe. ICT estimates are compared with estimates obtained using standard direct questioning (DQ) to explore the extent of social desirability bias in conventional survey measures of norm violations and compliance during the COVID-19 pandemic. In the following, the terms norm violations and deviant behavior are used interchangeably and refer to behavior violating a norm (e.g., not self-isolating when experiencing COVID-19 symptoms). 
In contrast, the term norm compliance describes socially acceptable behavior adhering to a norm (e.g., self-isolating with COVID-19 symptoms).
Background
The Falsification of Vaccination Certificates
The norm-violating behavior investigated in our study is the falsification of vaccination certificates. A vaccination certificate is an official document proving that a person has been vaccinated against a specific infectious disease, in our case COVID-19. Depending on the country and its regulations, vaccination certificates can take various forms (e.g., as a printed vaccination card in paper form or digitally as a QR code on a mobile device). Many countries are using the International Certificate of Vaccination or Prophylaxis (Centers for Disease Control and Prevention 2024). In the EU, each member state is using the digital COVID-19 certificate (European Commission 2022). There are three countries in our study that are not member states of the EU, namely, Switzerland, Iceland, and the United Kingdom. These countries also use a digital COVID-19 certificate (Federal Department of Foreign Affairs 2022; Government of Iceland 2021; Government of the United Kingdom 2023). Thus, it is possible to falsify vaccination certificates in each country in our analysis. We define the falsification of vaccination certificates as all actions that lead to the possession of a vaccination certificate without actually being vaccinated. A popular method of obtaining fake vaccination certificates is to buy them online on the dark web (Childs 2023; Georgoulias et al. 2023). The severity of sanctions for falsifications of vaccination certificates varies by country. In Germany, for example, issuing of fake health certificates is punishable by up to five years’ imprisonment (German Criminal Code §§ 277 and 278), while using fake health certificates is punishable by up to one year in prison (German Criminal Code § 279). In Romania, a person faces up to five years in prison for falsifying vaccination certificates (Heghes 2021). In our study, we cannot provide a comprehensive legal analysis of the criminal liability of falsifying vaccination certificates in all European countries. 
In general, EU Regulation 2021/953 states that sufficient resources should be made available to prosecute illegal practices in connection with the issuance and usage of the digital COVID-19 certificate. Furthermore, the falsification of vaccination certificates violates social norms. On the one hand, faking vaccination certificates is a form of lying and therefore violates the norm of honesty. On the other hand, faking vaccination certificates undermines the establishment of herd immunity. It should be noted that our study makes no distinction between the actual falsification and the use of false vaccination certificates.
Self-Isolation with COVID-19 Symptoms
Suppawittaya, Yiemphat, and Yasri (2020) distinguished between three types of social restrictions in a pandemic: social distancing, (self-)quarantine, and self-isolation. Social distancing means avoiding personal contact with friends and family. Quarantine is mostly used in the context of travel and applies to the separation of people who have no symptoms but have been in contact with sick people. Self-isolation refers to the separation of sick individuals from others (Suppawittaya et al. 2020). In addition, self-isolation can also refer to the separation of people who are merely experiencing symptoms. Self-isolation while experiencing COVID-19 symptoms is the second behavior investigated in our study. In contrast to the falsification of vaccination certificates, which is a deviant behavior, self-isolation with COVID-19 symptoms is a norm-compliant behavior. During the COVID-19 pandemic, if someone tested positive for a COVID-19 infection, he or she was required to self-isolate from others. The aim was to contain the disease and prevent further infections. In mid-January 2022, all member states of the EU had regulations regarding the isolation of COVID-19 cases, and the prescribed length of the isolation phase varied depending on the specific country (Directorate-General for Health and Food Safety 2021). There was also an official rule to stay at home and self-isolate when experiencing COVID-19 symptoms (European Centre for Disease Prevention and Control 2020). The non-EU countries Switzerland, Iceland, and the United Kingdom had similar regulations regarding self-isolation with COVID-19 symptoms (European Observatory on Health Systems and Policies 2024; Federal Office of Public Health 2024; National Health Service 2023). To this day, there are official rules to self-isolate when experiencing COVID-19 symptoms. Compliance with these rules represents prosocial behavior. 
If someone with COVID-19 symptoms consciously decides not to self-isolate, he or she risks infecting other people and thus harming their health. This violates the social norm that one should not cause injury to other people through negligence.
The Problem of Social Desirability in Survey Studies
Data on the dark field of norm violations can be collected through surveys. One possibility to gather information is to ask respondents directly to self-report norm-compliant or deviant behavior. However, such direct questions asking about norm violations are often perceived as sensitive by the respondents (see Krumpal 2013; Krumpal and Näher 2012; Tourangeau and Yan 2007). Against this background, we assume that the questions in our study about whether one has falsified their vaccination certificate or about self-isolating when experiencing COVID-19 symptoms are sensitive questions. Answers to such sensitive questions are often distorted by social desirability bias. More specifically, systematic underreporting of deviant behavior (such as faking of vaccination certificates) and overreporting of norm-compliant behavior (such as self-isolation with COVID-19 symptoms) can be expected.
There are multiple ways to counteract social desirability bias in survey responses. Possible solutions discussed in the research literature are forgiving wording of the sensitive questions and sealed-envelope techniques for questionnaires (for an overview, see Krumpal 2013). The data collection strategy of forgiving wording intends to change the social valuation of the sensitive behavior, such that respondents feel encouraged to answer truthfully. Sealed-envelope techniques, in contrast, obscure the connection between the answer to the sensitive question and the identity of the respondent, preventing third parties (such as interviewers, bystanders, or state authorities) from linking specific answers with identifying information. When third parties are the primary source of worry, sealed-envelope techniques can be expected to reduce social desirability bias. An alternative approach to elicit more honest answers to sensitive questions is the anonymization of the question-and-answer process by using so-called dejeopardizing techniques (Lee 1993) such as the randomized response technique (RRT; Warner 1965) or the ICT (Raghavarao and Federer 1979). These special questioning techniques pursue the goal of eliciting truthful answers by increasing the level of anonymity via combining respondents’ answers with random statistical noise (Blair, Coppock, and Moor 2020). Compared with the RRT, the ICT has the advantage that the underlying concept of counting items is quite simple and the method is easy to implement. Only a moderate cognitive burden is imposed on the respondent, likely increasing the respondent’s compliance with the interview protocol. Because of the inconclusive state of research regarding the effects of forgiving wording on social desirability bias (see Näher and Krumpal 2012:1609–13; Tourangeau and Yan 2007:874–75; Yan 2021:121–23) and the high cognitive burden of the RRT (Holbrook and Krosnick 2010), we chose to use the ICT in our study. 
Previous research shows that compared with standard DQ, the ICT elicits more honest answers to sensitive questions asking about norm violations (Ehler, Wolter, and Junkermann 2021; Li and van den Noortgate 2022). In addition, there is evidence that the ICT can reduce systematic overreporting of socially desirable behaviors. For example, Wolter et al. (2022) showed that the proportion of people self-reporting vaccination against COVID-19 was significantly lower when using the ICT (75 percent) compared with interviews using DQ (85 percent).
To counteract the problem of misreporting to sensitive questions, we use the ICT in our survey to protect the respondents’ privacy and to reduce social desirability incentives. In comparison with direct self-reports, we expect that the ICT reduces underreporting and yields higher prevalence estimates when asking about the falsification of vaccination certificates (a “more-is-better” assumption with regard to norm-violating behavior). Furthermore, we expect that the ICT reduces overreporting and yields lower prevalence estimates when asking about self-isolation with COVID-19 symptoms (a “less-is-better” assumption with regard to norm-compliant behavior). Details on the ICT and its implementation are described in the next section.
Methods
Participants and Study Design
We collected data with an online survey in July 2024. Study participants from 27 European countries were recruited via the online panel provider Prolific (www.prolific.com). As subjects from the United Kingdom are overrepresented in the Prolific online panel, we capped the number of participants from the United Kingdom at 200. All interviews were conducted in English. In the Prolific online panel, participants are not directly invited to a study. Instead, a study appears in the feed for eligible participants until the targeted sample size is reached. To ensure high data quality, we included three attention check questions in our survey (see Appendix A1). Overall, 2,513 participants completed the interviews. Three participants were excluded because they were not from Europe. This leaves us with a final sample size of 2,510 participants. The participants’ distribution across countries can be found in Table A2 in the Appendix. Most participants reside in Poland (n = 367), followed by Portugal (n = 345) and Italy (n = 285). A descriptive analysis of the sample can be found in Table A3 in the Appendix. The subjects in our sample are on average middle aged (M = 32.5 years, SD = 10.6). About two thirds (66.4 percent) of the participants have academic degrees. Of all participants in the final sample, 59.0 percent identify as male and 39.6 percent as female; 1.5 percent of the participants describe themselves as diverse. We compare the ICT with standard DQ. At the beginning of the interviews, participants were randomly assigned to one of the two experimental groups: about 40 percent were assigned to the DQ control group (n = 1,006) and about 60 percent to the ICT group (n = 1,504). The ICT group was oversampled because the ICT is statistically less efficient than standard DQ, and a larger sample size is required to increase statistical power (Krumpal et al. 2018).
DQ
In the DQ group, the two key items measuring norm violations and norm-compliant behavior were stated as follows (dichotomous items; 1 = yes; 0 = no):
Falsification of a COVID-19 vaccination certificate: “Have you ever acquired a COVID-19 vaccination certificate without being vaccinated?”
Self-isolation while experiencing COVID-19 symptoms: “Have you practiced self-isolation with COVID-19 symptoms?”
The ICT
The basic principle of the ICT is that subjects respond to several items in a list at once and count their affirmative responses to the whole item list. This anonymizes the individual response to the sensitive item of interest (Raghavarao and Federer 1979). Compared with other dejeopardizing techniques, such as the RRT, the ICT is intuitively understandable and can easily be implemented in self-administered survey modes, such as online surveys (Quatember 2023). To further increase statistical power, we use the double-list design of the ICT in our study (see Droitcour et al. 2004; Krumpal et al. 2018). To implement the ICT double-list design, the ICT group is divided into two subgroups via randomization (subgroups A and B; all items are dichotomous). In the first question block, subgroup A receives a list with a set of nonsensitive items plus the sensitive item (long list 1 [LL1]); subgroup B receives a list with the same set of nonsensitive items but without the sensitive item (short list 1 [SL1]). Respondents in both subgroups are then requested to report the total number of items that apply to them (i.e., the number of “yes” answers), without responding to each item separately. In the second question block, subgroup A receives another list with a different set of nonsensitive items (SL2; SL1 ≠ SL2); subgroup B receives this second set of nonsensitive items plus the sensitive item (LL2). In short: SL2 in subgroup A corresponds to LL2 in subgroup B, apart from the sensitive item. Here again, respondents in both subgroups are asked to report the total number of “yes” answers, without answering each item separately. Using such a design, the individual answer to the sensitive item remains secret (unless all or none of the answers to the list are “yes”). 
Compared with the single list ICT, the effective sample size is doubled in the ICT double-list design, because respondents from both subgroups respond to a long list including the sensitive item, which increases the statistical power. Table 1 shows an application example of the ICT double-list design with regard to the measurement of the falsification of a COVID-19 vaccination certificate.
Example of the Item Count Technique Double-List Design.
Note: COVID-19 = coronavirus disease 2019; LL = long list; SL = short list.
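The prevalence of the sensitive behavior can be estimated on the basis of mean differences between subgroups A and B (see Droitcour et al. 2004; Kirchner et al. 2013; Krumpal et al. 2018). In the ICT double-list design, two separate prevalence estimates of the sensitive behavior are calculated first (one for each question block):
$$\hat{\pi}_1 = \bar{y}_{LL1}^{A} - \bar{y}_{SL1}^{B}$$
and
$$\hat{\pi}_2 = \bar{y}_{LL2}^{B} - \bar{y}_{SL2}^{A},$$
where $\bar{y}$ denotes the mean reported item count for the indicated list in the indicated subgroup. In a subsequent step, the two separate estimates are combined to obtain the double-list estimator:
$$\hat{\pi} = \frac{\hat{\pi}_1 + \hat{\pi}_2}{2}.$$
The sampling variance of the double-list estimator is
$$\mathrm{Var}(\hat{\pi}) = \frac{1}{4}\left[\mathrm{Var}(\hat{\pi}_1) + \mathrm{Var}(\hat{\pi}_2) + 2\,\mathrm{Cov}(\hat{\pi}_1, \hat{\pi}_2)\right].$$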
Although there is theoretically no limit on the length of the lists, they should be kept as short as possible because the cognitive load of summing up the single responses increases with the length of the lists and statistical efficiency decreases (Krumpal et al. 2018). Thus, the short list should contain no more than three to five items (Droitcour et al. 2004; Krumpal et al. 2015). Furthermore, to decrease the likelihood of ceiling or floor effects (Junkermann 2022; Kirchner et al. 2013; Kuha and Jackson 2014), the short lists should contain both low- and high-prevalence items. Because our study focuses on two key sensitive items, each respondent in the ICT group answered four lists in total. At the end of each list, respondents were asked to indicate the number of statements that apply to them, but not which statements. Assuming that the respondents trust this offer of answer anonymization, it can be hypothesized that the ICT will elicit more self-reported norm violations and fewer reports of norm-compliant behavior than standard DQ. The exact wording of the ICT instructions and all the lists used in our study to collect data with regard to the two key sensitive items are documented in the Appendix (see Tables A4 and A5).
To estimate the prevalence, the sampling variance, and the standard error on the basis of the ICT double-list estimator, the ictreg command from the R package list was used (Blair and Imai 2012; Imai 2011). Using the ictreg command, a linear regression model consisting only of the dependent variable (intercept-only model) was calculated. This procedure is mathematically identical to the difference-in-means estimator (see foregoing discussion). To calculate the differences between the experimental groups (including standard errors), the predict.ictreg command was used in the next step (Blair and Imai 2012; Imai 2011). Here the ICT estimates based on the linear model and DQ estimates based on a logistic model were passed to this command. The results are discussed in the next section.
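The difference-in-means logic of the double-list design can be illustrated with a short Python sketch. This is not the ictreg implementation used in the study; it is a minimal illustration under stated assumptions. The function name `double_list_estimate` and the toy counts are hypothetical. The sketch uses the fact that averaging the two block-specific estimates is equivalent to averaging within-respondent differences (long list minus short list) in each subgroup, which automatically accounts for the correlation between the two answers given by the same respondent.

```python
from statistics import mean, variance


def double_list_estimate(a_ll1, a_sl2, b_sl1, b_ll2):
    """Double-list ICT estimate from reported item counts (illustrative sketch).

    a_ll1[i], a_sl2[i]: counts of respondent i in subgroup A (long list 1, short list 2)
    b_sl1[j], b_ll2[j]: counts of respondent j in subgroup B (short list 1, long list 2)
    Returns (pi_1, pi_2, pi, var_pi).
    """
    # Block-specific difference-in-means estimates
    pi_1 = mean(a_ll1) - mean(b_sl1)  # long list 1 (A) minus short list 1 (B)
    pi_2 = mean(b_ll2) - mean(a_sl2)  # long list 2 (B) minus short list 2 (A)
    pi = (pi_1 + pi_2) / 2

    # Within-respondent differences: each respondent answers one long and one short list
    d_a = [ll - sl for ll, sl in zip(a_ll1, a_sl2)]
    d_b = [ll - sl for ll, sl in zip(b_ll2, b_sl1)]

    # pi equals (mean(d_a) + mean(d_b)) / 2; subgroups A and B are independent,
    # so the variance is a quarter of the sum of the two variances of the means
    var_pi = (variance(d_a) / len(d_a) + variance(d_b) / len(d_b)) / 4
    return pi_1, pi_2, pi, var_pi


if __name__ == "__main__":
    # Hypothetical toy counts, not data from the study
    pi_1, pi_2, pi, var_pi = double_list_estimate(
        a_ll1=[2, 3, 1, 2], a_sl2=[1, 2, 2, 1],
        b_sl1=[1, 2, 2, 3], b_ll2=[2, 2, 3, 3],
    )
    print(f"pi_1={pi_1:.3f}, pi_2={pi_2:.3f}, pi={pi:.3f}, SE={var_pi ** 0.5:.3f}")
```

With real data, the four lists would hold the reported counts of each ICT subgroup; the standard error is the square root of the returned variance.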
Results
The prevalence estimates of vaccination certificate falsification and self-isolation with COVID-19 symptoms are shown in Table 2. In the DQ group, 4.8 percent of the respondents admitted to vaccination certificate falsification. The ICT estimate is 4.5 percent. The difference between the estimates of vaccination certificate falsification is about 0.3 percentage points and is not statistically significant. We thus find no evidence of systematic underreporting of this norm-violating behavior in our study. With regard to the norm-compliant behavior, self-isolation with COVID-19 symptoms, the DQ estimate is 87.7 percent and the ICT estimate is 76.0 percent. This difference of about 11.7 percentage points between estimates of the two experimental groups is statistically significant (p < .05). We can therefore state that conventional survey measures of norm-compliant behavior are susceptible to overreporting bias and that estimates on the basis of DQ overestimate the extent of norm compliance in a pandemic such as COVID-19.
Prevalence Estimates of Vaccination Certificate Falsification and Self-Isolation with COVID-19 Symptoms by Experimental Group.
Note: Values in parentheses are standard errors. COVID-19 = coronavirus disease 2019; DQ = direct questioning; ICT = item count technique.
p < .05 (two-sided z test).
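The two-sided z test mentioned in the table note can be sketched in Python as follows. This is an illustration, not the predict.ictreg procedure used in the study. The function name `z_test_difference` is hypothetical, the DQ standard error is approximated with the binomial formula, and the ICT standard error of 0.035 is a placeholder chosen for illustration only; it is not a value reported in the study.

```python
from math import erfc, sqrt


def z_test_difference(est_a, se_a, est_b, se_b):
    """Two-sided z test for the difference between two independent estimates."""
    diff = est_a - est_b
    se_diff = sqrt(se_a ** 2 + se_b ** 2)  # independent groups: variances add
    z = diff / se_diff
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return diff, se_diff, z, p_value


if __name__ == "__main__":
    # DQ estimate of self-isolation with a binomial standard error (n = 1,006)
    dq_est, dq_n = 0.877, 1006
    dq_se = sqrt(dq_est * (1 - dq_est) / dq_n)

    # ICT estimate; the standard error is a HYPOTHETICAL placeholder
    ict_est, ict_se = 0.760, 0.035

    diff, se_diff, z, p = z_test_difference(dq_est, dq_se, ict_est, ict_se)
    print(f"difference={diff:.3f}, SE={se_diff:.3f}, z={z:.2f}, p={p:.4f}")
```

Because the ICT adds statistical noise, its standard error is typically several times larger than the DQ standard error, so the precision of the difference is driven mainly by the ICT group.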
Next, we conducted subgroup analyses of the difference between the ICT and DQ by the respondents’ gender, academic degree, age, attitudes toward COVID-19 restrictions in general, and attitudes toward the sensitive behavior specifically. Because of the limited statistical power of the ICT estimates, we restrict our subgroup analyses to bivariate comparisons. The results are shown in Tables 3 and 4.
Subgroup Analysis: Reported Vaccination Certificate Falsification by Experimental Group.
Note: Values in parentheses are standard errors. “Younger” refers to respondents younger than the average age in the study sample, and “older” refers to respondents older than the average age in the study sample (M = 32.5 years, SD = 10.6 years). COVID-19 = coronavirus disease 2019; DQ = direct questioning; ICT = item count technique.
p < .05 (two-sided z test).
Subgroup Analysis: Reported Self-Isolation with COVID-19 Symptoms by Experimental Group.
Note: Values in parentheses are standard errors. “Younger” refers to respondents younger than the average age in the study sample, and “older” refers to respondents older than the average age in the study sample (M = 32.5 years, SD = 10.6 years). COVID-19 = coronavirus disease 2019; DQ = direct questioning; ICT = item count technique.
p < .05 (two-sided z test).
With regard to the falsification of vaccination certificates, there are no statistically significant ICT-DQ differences in the considered subgroups (see Table 3). Descriptively, the largest ICT-DQ difference can be seen among respondents who report a positive attitude toward faking vaccination certificates, with a difference of 11.7 percentage points. Beyond that, we cannot find any meaningful and consistent response patterns across the specific subgroups. Note that some values in the first column (ICT) in Table 3 contain negative estimates (e.g., −2.2 percent for female respondents). Of course, a negative prevalence does not make sense, but the ICT estimator can be negative by construction because it is a difference between two sample means. In the case of a low true prevalence, negative estimates may occur by chance. For all negative values in the ICT group, the subgroup-specific ICT estimates are not significantly different from zero.
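Why negative estimates can occur by chance is easy to see in a small simulation. The following Python sketch is illustrative only: it uses a simplified single-list design rather than the double-list design of the study, and the function names, the subgroup size of 200, the true prevalence of 2 percent, and the probabilities of the nonsensitive items are all hypothetical assumptions.

```python
import random
from statistics import mean


def simulate_ict_estimate(rng, n_per_group, true_prevalence, item_probs):
    """One simulated single-list ICT estimate (long-list mean minus short-list mean)."""
    def short_count():
        # Number of nonsensitive items that apply to a simulated respondent
        return sum(rng.random() < p for p in item_probs)

    # Long-list group: nonsensitive count plus the sensitive item
    long_group = [short_count() + (rng.random() < true_prevalence)
                  for _ in range(n_per_group)]
    # Short-list group: nonsensitive items only
    short_group = [short_count() for _ in range(n_per_group)]
    return mean(long_group) - mean(short_group)


def share_negative(n_reps=300, n_per_group=200, true_prevalence=0.02,
                   item_probs=(0.2, 0.5, 0.5, 0.8), seed=1):
    """Share of simulated estimates that fall below zero."""
    rng = random.Random(seed)
    estimates = [simulate_ict_estimate(rng, n_per_group, true_prevalence, item_probs)
                 for _ in range(n_reps)]
    return sum(e < 0 for e in estimates) / n_reps


if __name__ == "__main__":
    print("share of negative estimates, low prevalence:", share_negative())
    print("share of negative estimates, high prevalence:",
          share_negative(true_prevalence=0.9))
```

Under these assumptions, a sizable share of the simulated estimates is negative when the true prevalence is close to zero, whereas negative estimates practically never occur when the true prevalence is high.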
With regard to self-isolation with COVID-19 symptoms, most of the ICT-DQ differences are statistically significant (p < .05) in the subgroups under consideration (see Table 4). Descriptively, the ICT estimates are consistently lower than the DQ estimates in all subgroup comparisons. Looking at the specific subgroups, we can see comparatively large differences for male respondents (13.8 percentage points), respondents with no academic degree (14.8 percentage points), younger respondents (16.0 percentage points), respondents with negative attitudes toward COVID-19 restrictions (16.2 percentage points), and respondents with negative attitudes toward self-isolation (17.2 percentage points). From a social desirability perspective, these results are plausible: Younger people in particular often underestimated their personal risk of a COVID-19 infection. At the same time, they believed that COVID-19 is dangerous for society (Franzen and Wöhner 2021). Faced with this dilemma, the pressure from social desirability norms was probably higher for the younger respondents, leading to considerable overreporting of norm-compliant behavior in this subgroup. Furthermore, those respondents with negative attitudes toward COVID-19 restrictions in general or toward self-isolation in particular were more likely to overreport their own norm-compliant behavior. Overall, we find a robust and consistent pattern of misreporting and upwardly biased estimates of self-isolation with COVID-19 symptoms in all subgroups if DQ is used to collect data.
Discussion
During the COVID-19 pandemic, new social norms emerged (Berger and Krumpal 2021; Korn et al. 2020; Tutić et al. 2022), such as the norm to self-isolate with COVID-19 symptoms and the norm to use proof of vaccination (e.g., as a printed vaccination card in paper form or digitally as a QR code on a mobile device) in order to travel or participate in public life. To fight the pandemic and contribute to the collective good of public health, citizens were expected to comply with these new norms and regulations. Survey studies can in principle monitor the extent of norm-compliant and deviant behavior in a society. Such information can be used to evaluate the effectiveness of public health measures and study the public acceptance of those measures. But it should come as no surprise that estimates of norm violations and compliance differ depending on the method of data collection (Krumpal 2013; Tourangeau and Yan 2007; Yan 2021). Confronted with direct questions, respondents systematically overreport norm-compliant behavior and underreport norm-violating behavior. In our study, estimating the prevalence of vaccination certificate falsification and self-isolation with COVID-19 symptoms in Europe, we used the ICT to anonymize the question-and-answer process and elicit more honest answers to the sensitive questions. ICT and DQ estimates were compared to study the extent of social desirability bias in our estimates of the norm-violating and norm-compliant behavior. As a result, we find no evidence of underreporting of norm-violating behavior. About 4 percent to 5 percent of respondents stated that they had falsified their vaccination certificate, with only a small and statistically insignificant difference between ICT and DQ estimates of falsification (4.5 percent with the ICT vs. 4.8 percent with DQ). 
At the same time, a substantial and statistically significant difference was found between ICT and DQ estimates with regard to the reports of self-isolation (76.0 percent with the ICT vs. 87.7 percent with DQ). Conventional survey measures based on DQ thus likely overestimate the extent of norm compliance in a pandemic such as COVID-19.
Comparing our results with those of former studies, our prevalence estimates of vaccination certificate falsification are slightly higher than estimates in a survey study by Helbing and Krumpal (2024). Focusing on different forms of norm violations during the COVID-19 pandemic in Germany, Helbing and Krumpal estimated the overall prevalence of vaccination certificate falsification at about 3 percent. In addition, they compared standard DQ with the ICT to evaluate social desirability bias in the respondents’ self-reports. As a result, slightly more respondents self-reported vaccination certificate falsification with the more anonymous ICT (3.5 percent) than in the interviews using DQ (2.6 percent). As in our study, this difference was not statistically significant. With regard to the prevalence estimation of self-isolation, González-Morales et al. (2021) reported a much lower estimate of norm-compliant behavior in Mexico, with 48 percent of the study participants willing to self-isolate when experiencing COVID-19 symptoms. It is important to note that this study has limited comparability with our research because of different target populations (Mexican society vs. Europe) and different survey periods. Collecting data using a face-saving questioning technique, Daoust et al. (2021) reported up to 76 percent of respondents who received no visits and 79 percent who did not visit someone during COVID-19 lockdown. These estimates are similar to the ICT estimates of self-isolation obtained in our study. Note, however, that this comparison is also limited because visiting someone during COVID-19 lockdown is not exactly the same as not self-isolating with COVID-19 symptoms.
Regarding the methodological discussion about researching sensitive topics, our study confirms the effectiveness of the ICT in reducing systematic overreporting of socially desirable, norm-compliant behavior. Our findings support the results of Wolter et al. (2022), who investigated social desirability bias in the respondents’ self-reported COVID-19 vaccination status. Comparing the ICT and DQ in an experimental survey, they found significantly lower ICT estimates of COVID-19 vaccination coverage compared with the conventional DQ method. This result was interpreted as evidence that conventional survey studies likely overestimate norm-compliant behavior in a pandemic because of misreporting by survey respondents and that the more anonymous ICT reduces such misreporting. Survey designers who aim to estimate the prevalence of socially desirable behaviors in a pandemic could therefore benefit from using the ICT in their research studies (Wolter et al. 2022). 5 With regard to the socially undesirable behavior investigated in our study, in contrast, the question arises as to why no significant difference between the ICT and DQ estimates of vaccination certificate falsification could be found. The low prevalence estimates in both experimental groups indicate that vaccination certificate falsification is rather rare in society. We suspect that the ICT and DQ estimates cannot be distinguished statistically because the ICT estimator lacks precision when estimating a small prevalence that is close to zero (Ahlquist 2018; Wolter and Laier 2014), not because the behavior is irrelevant from a social desirability perspective. 6 Thus, the ICT should be used only with caution when investigating behaviors expected to be rather rare in the target population.
Finally, we must note limitations of our research and outline future research perspectives. First, our study sample covers a broad range of respondents from different European societies. Nonetheless, the sample is not representative of the European general population, because the Prolific online panel from which the respondents were recruited cannot be considered representative of the general population in the sense of a complete official register. The ICT proved useful in reducing social desirability bias in our study sample. However, we must be careful not to prematurely generalize the absolute values of the prevalence estimates of norm compliance and deviance to other population groups. 7 In particular, a replication of our method experiment in a general population survey would be a sensible next step for evaluating the robustness of our results.
Second, a limitation of the ICT is that larger sample sizes are necessary to achieve the same level of statistical precision as DQ. The low statistical efficiency of the ICT estimator and the higher data collection costs of a larger sample can be justified only if the ICT reduces social desirability bias substantially. Thus, using the ICT makes sense only for sensitive questions in which the response bias is likely to be severe. Furthermore, the low statistical precision of the ICT is particularly problematic for items with a very low (or very high) prevalence (as our results for vaccination certificate falsification illustrate). In these cases, significant differences between ICT and DQ estimates of the sensitive behavior can hardly be detected (Ahlquist 2018; Wolter and Laier 2014).
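The precision cost described above can be illustrated with the figures reported for the self-isolation item. The following Python lines are a minimal sketch, not the estimation procedure used in this study. They assume that the reported ICT standard error of 3.1 is expressed in percentage points, that the DQ estimate has a simple binomial standard error under simple random sampling, and that the two experimental groups are independent. The function names are hypothetical.

```python
import math

def dq_standard_error(p, n):
    """Binomial standard error of a directly asked proportion.
    Assumes simple random sampling (an illustrative assumption)."""
    return math.sqrt(p * (1.0 - p) / n)

def z_difference(est_a, se_a, est_b, se_b):
    """z statistic for the difference between two independent estimates."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Figures reported in the article for the self-isolation item.
p_dq, n_dq = 0.877, 1006      # DQ estimate and group size
p_ict, n_ict = 0.760, 1504    # ICT estimate and group size
se_ict = 0.031                # reported ICT standard error (assumed to be 3.1 points)

se_dq = dq_standard_error(p_dq, n_dq)
z = z_difference(p_dq, se_dq, p_ict, se_ict)

# Hypothetical benchmark: the standard error a direct question would have
# at the ICT group's sample size and prevalence.
se_dq_same_n = dq_standard_error(p_ict, n_ict)
variance_ratio = (se_ict / se_dq_same_n) ** 2

print(f"DQ standard error:       {se_dq:.4f}")
print(f"z for DQ minus ICT:      {z:.2f}")
print(f"ICT/DQ variance ratio:   {variance_ratio:.1f}")
```

Under these assumptions, the variance ratio approximates how many times larger an ICT sample would have to be to match the precision of a direct question at the same prevalence. The same arithmetic suggests why a small ICT-DQ gap, as for a rare behavior, is hard to detect: the ICT standard error dominates the denominator of the z statistic.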
Another limitation is that the ICT may not be appropriate if the central research goal is to study psychological mechanisms or individual biases and link those biases to behaviors toward others (e.g., in research on racial, gender and other forms of discrimination). This is because the ICT does not address all sources of social desirability bias. In particular, because the ICT does not hide answers to sensitive questions from the respondents themselves, the ICT will not reduce social desirability bias resulting from self-deception or self-identity concerns. Moreover, although the ICT can provide more accurate population-level prevalence estimates, it has clear limitations for individual-level inferences, i.e., estimating differences in prevalence rates or differences in social desirability bias across subgroups defined by individual-level characteristics and linking those differences to behaviors toward others (Blair et al. 2020).
Conclusion
Our study shows that conventional survey measures based on DQ systematically overestimate the extent of norm compliance in a pandemic such as COVID-19 and that the use of dejeopardizing data collection techniques such as the ICT, despite their limitations, can be beneficial for survey data collection (Krumpal et al. 2015; Lee 1993). Follow-up survey studies are invited to take our research as an inspiration to further investigate norm-compliant and deviant behavior in possible future pandemics or other social crisis situations. In this context, the effectiveness of other data collection approaches and methods of response anonymization on social desirability bias in sensitive self-reports could be further explored and compared with conventional survey methods (Yan 2021).
Supplemental Material
Supplemental material, sj-docx-1-srd-10.1177_23780231251328202 for Social Desirability Bias in Measures of Norm Violations and Compliance during the COVID-19 Pandemic: Results of an Experimental Survey in Europe by Alexander Helbing and Ivar Krumpal in Socius
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the German Research Foundation via the Heisenberg program (GEPRIS project number 497852959) and by the Open Access Publishing Fund of Leipzig University.
Supplemental Material
Supplemental material for this article is available online.
3
In our survey, a total of 37 respondents stated that they were diverse when asked about their gender. We excluded these cases from the subgroup analysis because the number of cases is too small to allow a meaningful statistical analysis and conclusion with regard to this group. The exclusion is therefore purely methodological and is not intended to discriminate against anyone.
4
In additional analyses, we calculated ICT-DQ differences separately by health care system (countries with the Beveridge model vs. countries with the Bismarck model) and separately by the strictness of the pandemic response measures (countries with high strictness vs. countries with low strictness). All in all, the ICT-DQ differences across country groups are similar in size, and from a social desirability perspective, we find no evidence for a clear interaction pattern between countries’ characteristics and the social desirability norms in these countries. The results of these additional analyses can be found in the Appendix.
5
We recommend the use of the double-list design of the ICT. In contrast to Wolter et al. (2022), who used the single-list ICT, we used the double-list ICT, which leads to more precise prevalence estimates (Krumpal et al. 2018). Consider our item measuring self-isolation with COVID-19 symptoms (ICT prevalence estimate: 76 percent with a standard error of 3.1; n = 1,504) in contrast to the item in the study by Wolter et al. (p. 5) measuring vaccination coverage (ICT prevalence estimate: 75 percent with a standard error of 4.3; n = 2,115). Comparing the two results (both items are dichotomous), it is evident that despite the smaller sample size in our study, the double-list estimate has a considerably smaller standard error than the single-list estimate in the study by Wolter et al.
6
The question of whether someone falsifies a COVID-19 vaccination certificate has a clear normative dimension. As explained in the introduction section, there was an illegal market for fake vaccination certificates on the dark web (Childs 2023), and using a fake vaccination certificate in daily life violated legal and social norms.
7
However, because participants were randomly assigned to either the ICT or the DQ split, we can compare estimates between the experimental groups in our study.
