Abstract
Fraudulent participation is a growing challenge in digital health research, particularly in online studies where duplicate identities, automated responses, and coordinated sign-ups can distort recruitment, compromise validity, and divert resources. Safeguards intended to prevent fraud might also risk excluding legitimate participants, raising concerns about sample representativeness and study generalizability. Although a wide range of technical and behavioral strategies exists, guidance is lacking on how to organize these methods and report outcomes consistently across studies. To address this gap, we introduce the Configure, Assess, Triage, Corroborate, and Hone (CATCH) framework, a hybrid fraud detection–mitigation model with actionable recommendations for investigators. CATCH begins with pre-study
Fraudulent participation in online research
Remote online studies and decentralized clinical trials are growing increasingly popular in human subjects research. Fully online approaches help address difficulties in meeting proposed samples in large-scale studies, recruiting nationally representative participant pools, and identifying individuals with rare or stigmatized conditions.1 Social distancing restrictions due to the coronavirus disease 2019 (COVID-19) pandemic further propelled the growth of remote research procedures,2 and meta-analytic work confirms that online recruitment is significantly more efficient and less expensive than in-person approaches.3 Alongside these benefits, online methods have introduced new challenges: bots, duplicate participants, and fraudulent reporting have emerged as serious threats to research integrity,4 affecting not only recruitment but also downstream study procedures, where verifying participant identity or characteristics is more difficult when all interactions occur online.
Reviews of online studies using fraud detection suggest this phenomenon is common across a range of content areas,5 including qualitative and quantitative research paradigms. When undetected, impostors can distort findings and misdirect practice or policy.6 Even when impostors are detected, the costs in lost funds, staff time, and slowed recruitment are substantial.7,8 For example, one study of patient perspectives of their cancer care received 256 fraudulent study sign-ups (of 271 total; 94.5%) in just 7 h of recruitment.9 In another study, a team studying performance feedback for medical staff received 268 study incentive requests from just three email addresses.10 While financial incentives are often assumed to be a primary driver of fraudulent participation, studies show that deception can occur even without monetary rewards, suggesting additional motives such as boredom, curiosity, or ideological intent to disrupt research.11
Fraudulent participation can distort recruitment sampling, misrepresent demographic distributions, and obscure relationships between groups of participants and outcomes. For example, Sharma et al.12 in Australia reported a sample that was entirely male and 80% Aboriginal, despite qualitative research in their field typically recruiting predominantly female participants. Such misrepresentation can hinder the recruitment of a representative population sample, as the presence of fraudulent participants may make groups falsely appear to be over- or under-represented. It may also lead to the exclusion of legitimate participants who share traits with fraudulent participants.
Existing methods for fraud mitigation
An emergent literature has identified procedures for detecting and deterring fraud. Early reports often took the form of case studies describing encounters with suspected participants and the ad hoc strategies used to respond.13–15 These were followed by articles deriving recommendations for future work, though these recommendations were grounded mainly in small, qualitative studies.16–18 Across studies, researchers have experimented with a wide range of fraud-mitigation strategies.4
Fraud-mitigation strategies fall into two broad categories. Technical approaches include bot filtering using completely automated public Turing tests to tell computers and humans apart (CAPTCHA), internet protocol (IP) address monitoring to flag suspicious geographies, and checks for duplicate contact details or unusually fast survey completions.4,19,20 Behavioral approaches include open-ended screening questions, consistency checks across survey or interview responses, and requiring participants to briefly enable their camera for verification.4,21
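As an illustration, two of the systematic checks above, duplicate contact details and unusually fast completions, can be sketched in a few lines of Python. The field names (`email`, `duration_seconds`) and the 120-second threshold are assumptions for illustration, not values drawn from any cited study.

```python
from collections import Counter

# Minimal sketch: flag responses that reuse a contact email or that finish
# faster than a plausible reading time. Field names and threshold are
# illustrative assumptions.
def flag_suspicious(responses, min_seconds=120):
    """Return indices of responses flagged for duplicate email or speed."""
    emails = [r["email"].strip().lower() for r in responses]
    counts = Counter(emails)
    flagged = set()
    for i, (r, email) in enumerate(zip(responses, emails)):
        if counts[email] > 1:
            flagged.add(i)  # same contact detail entered more than once
        if r["duration_seconds"] < min_seconds:
            flagged.add(i)  # implausibly fast completion
    return sorted(flagged)

responses = [
    {"email": "a@example.com", "duration_seconds": 95},
    {"email": "A@example.com ", "duration_seconds": 600},
    {"email": "b@example.com", "duration_seconds": 480},
]
print(flag_suspicious(responses))  # → [0, 1]
```

In practice, such flags would typically feed a triage queue for manual corroboration rather than trigger automatic exclusion.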
Safeguards intended to prevent fraud may introduce new barriers for legitimate participants, particularly those with limited resources. Recruitment through interest groups, such as online communities or associations, may reduce fraudulent entries, but risks narrowing the participant pool to those with access, interest, and the capacity to engage with such groups.12 Adjusting participant compensation is another lever that has been used to attempt to safeguard research integrity. Although higher incentives have been linked to increased fraudulent responses, lowering them may deter genuine participants, making the sample less representative.5,23 Phone verification can filter participants but excludes those without consistent access, while video chat requirements may disadvantage individuals without cameras or private spaces.12 Government ID checks or flags on unusual login times may disproportionately exclude underserved participants.23 Some participants, especially in studies involving sensitive topics such as mental health, drug use, or reproductive health, may hesitate to provide accurate contact information due to privacy concerns rather than fraudulent intent.24 There is a persistent tension between safeguarding and inclusivity; if safeguards are too lenient, ineligible participants contaminate the data and compromise validity;6 if safeguards are too strict, legitimate participants risk being wrongly excluded, limiting representation and generalizability.25 No single mitigation method is sufficient on its own, highlighting the need to organize and layer different strategies in complementary ways.8
Earlier work has provided useful examples of techniques for fraud mitigation, but what remains absent is a structured way to organize these methods, define their reporting points, and ensure comparability across studies. No general guidance exists on how investigators can structure, document, or justify fraud-mitigation practices to maximize scalability while maintaining rigor. To advance cumulative knowledge, researchers should also report how many participants were screened out at each stage by each method, how many required manual corroboration, and how many were ultimately excluded or “redeemed.” Detailed documentation can help illuminate the true magnitude of fraud, clarify the practical capacity for handling inconclusive cases, and help establish benchmarks for estimating false exclusion rates.
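The stage-level reporting described above can be captured with a simple tally. The stage names and counts below are hypothetical placeholders, not data from any study.

```python
# Hypothetical per-stage documentation of fraud-mitigation outcomes.
# Stage names and counts are illustrative only.
stages = [
    {"stage": "automated screening", "entered": 500, "excluded": 120},
    {"stage": "manual corroboration", "entered": 380, "excluded": 40},
]
for s in stages:
    s["retained"] = s["entered"] - s["excluded"]
    print(f'{s["stage"]}: {s["entered"]} entered, '
          f'{s["excluded"]} excluded, {s["retained"]} retained')
```

Reporting these counts alongside the method applied at each stage is what makes false exclusion rates comparable across studies.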
The Configure, Assess, Triage, Corroborate, and Hone (CATCH) framework
Here we introduce the CATCH framework, a layered fraud detection–mitigation model with explicit, actionable recommendations designed to guide investigators conducting online studies. The hybrid CATCH model integrates systematic and person-led methods on an as-needed basis. Building on the existing literature and our team's experience conducting remote health studies, CATCH organizes prior recommendations into a unified, staged structure that clarifies when, how, and why each method should be used.
CATCH begins with a pre-study
Once live recruitment begins, systematic
At every stage, investigators are encouraged to document not only the actions taken (screened, flagged, excluded, and retained) but also the numbers associated with these outcomes. This dual emphasis on process and reporting highlights both how fraud can be mitigated and how comparability and transparency can be strengthened across studies. Underlying this approach is the familiar tradeoff between sensitivity and specificity: greater sensitivity to fraud increases the risk of excluding legitimate participants, while greater specificity reduces false exclusions but may allow more fraudulent participants through. Each study will need to determine how to strike this balance in line with its aims, resources, and population. The CATCH framework is illustrated in Figure 1.

The Configure, Assess, Triage, Corroborate, and Hone (CATCH) framework. The funnel begins with pre-study configuration, then proceeds through systematic risk assessment, candidate triage, and manual corroboration, with ongoing monitoring to hone strategies as new threats emerge. Investigator roles corresponding to each stage are shown on the right.
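The sensitivity–specificity tradeoff noted above can be made concrete with a small calculation. The counts below are hypothetical, assuming a validation subsample in which each participant's true status is known.

```python
# Illustrative metrics for a fraud screen; all counts are hypothetical.
def screen_metrics(tp, fp, tn, fn):
    """tp: fraud correctly flagged; fp: legitimate wrongly flagged;
    tn: legitimate passed; fn: fraud missed."""
    sensitivity = tp / (tp + fn)      # share of fraud caught
    specificity = tn / (tn + fp)      # share of legitimate retained
    false_exclusion = fp / (fp + tn)  # legitimate participants lost
    return sensitivity, specificity, false_exclusion

sens, spec, fer = screen_metrics(tp=45, fp=10, tn=190, fn=5)
print(sens, spec, fer)  # → 0.9 0.95 0.05
```

Tightening a screen raises sensitivity at the cost of specificity; each study must decide where on that curve its aims, resources, and population place it.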
Research teams are encouraged to select and apply the CATCH approaches that are most suitable for their circumstances, timeline, study population, resources, and budget. The framework is designed to serve as a flexible approach rather than a model that is excessively prescriptive or proscriptive. To support implementation, we include a supplemental checklist that investigators can adapt to their study. The checklist, available as Supplemental Table 1, provides a structured template aligned with each CATCH stage and outlines suggested decisions, documentation points, and reporting elements to promote transparency and consistency across studies.
CATCH and emerging technologies
The development of mitigation frameworks such as CATCH is critical as new technologies create opportunities for increasingly sophisticated forms of fraudulent participation. Emerging tools such as artificial intelligence (AI) make fraudulent participation more scalable and harder to detect.30 Large language models can be leveraged to generate human-like text that mimics genuine responses, enabling fraudulent actors to craft detailed answers that bypass quality control measures such as free-text justifications.31–35 Combined with text obfuscation platforms, AI-generated responses can evade automated detectors entirely, with false negative rates reaching 100% in evaluations of AI-text classifiers.36 Beyond text, AI-generated images, video, and audio can be used to create realistic participant profiles that pass identity verification checks, complicating detection efforts across modalities.36–39 These challenges highlight the need for frameworks such as CATCH to remain adaptable. Fraud mitigation cannot be static throughout a study or across studies; configuring, assessing risk, triaging, and corroborating cases should all undergo iterative refinement, with strategies honed through ongoing monitoring as new threats emerge.
Emerging technologies also create opportunities for more sophisticated fraud detection that can be integrated into CATCH once validated. Some techniques already appear in online survey practice, such as timezone alignment checks,40 edit-distance matching for duplicate emails,40 and contradiction checks across survey responses.41 In online health research, these tools can support geographic verification and help identify repeated or ineligible enrollment. Other approaches represent emerging possibilities that require further evaluation in health-research settings. Chatbots can support scalable prescreening, drawing on methods from phishing and financial fraud detection.42,43 For participants requiring ongoing monitoring, AI-driven approaches may help flag longitudinal signals of fraudulent participation, such as patterned response behavior or inconsistencies over time.6 Ultimately, researchers may even deliberately recruit fraudulent participants as intended study targets, seeking to learn more about their methods in an effort to safeguard data collection.
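As one example, the edit-distance matching for duplicate emails mentioned above can be sketched with a plain Levenshtein implementation. The two-edit threshold and the example addresses are assumptions for illustration.

```python
# Minimal Levenshtein edit distance (dynamic programming, two rows).
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def near_duplicates(emails, max_distance=2):
    """Pairs of emails within max_distance edits of each other."""
    return [(emails[i], emails[j])
            for i in range(len(emails))
            for j in range(i + 1, len(emails))
            if levenshtein(emails[i], emails[j]) <= max_distance]

print(near_duplicates(["jane.doe1@mail.com", "jane.doe2@mail.com",
                       "totally.other@mail.com"]))
# → [('jane.doe1@mail.com', 'jane.doe2@mail.com')]
```

Near-duplicate pairs would typically be routed to manual review rather than excluded automatically, since legitimate participants in the same household can hold similar addresses.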
Conclusion
Fraud in online research is a perpetually evolving challenge. As detection systems advance, fraudulent strategies will adapt in response. Adjustments to mitigation measures can introduce new risks of false exclusion, underscoring the need to continually reassess how safeguards are applied. The task for researchers, then, is not to aim for permanent solutions but to continually update detection and mitigation measures while reporting them transparently so that the field can learn, compare, and refine responses over time. The CATCH framework constitutes a practical starting point for this process. Through the synthesis of existing fraud-mitigation strategies into a unified, staged framework, CATCH provides investigators with practical guidance for structuring decisions, documenting actions, and balancing data integrity with inclusivity. By implementing its staged structure and documenting outcomes transparently, researchers can build a shared knowledge base that strengthens study integrity and enhances resilience against future forms of fraud.
The proposed framework aims to enhance methodological rigor and comparability across online studies by clarifying when and how different mitigation strategies can be applied. Future work should examine how the CATCH framework performs across different study designs, populations, and research contexts. Empirical applications may evaluate its impact on recruitment efficiency, false exclusion rates, and equity-related outcomes. Additional research may also explore ethical tradeoffs between fraud detection and participant privacy, particularly in digitally mediated and decentralized studies. Such efforts will be essential for refining the framework and informing best practices for online research integrity.
Supplemental Material
sj-docx-1-dhj-10.1177_20552076261418807 - Supplemental material for Practical guidance for mitigating fraud in online research: The Configure, Assess, Triage, Corroborate, and Hone (CATCH) framework
Supplemental material, sj-docx-1-dhj-10.1177_20552076261418807 for Practical guidance for mitigating fraud in online research: The Configure, Assess, Triage, Corroborate, and Hone (CATCH) framework by Maya Stemmer, Justin Tauscher, Benjamin Buck, Patrick Wedgeworth, Oliver John Bear Don’t Walk, Trevor Cohen and Dror Ben-Zeev in DIGITAL HEALTH
Footnotes
Acknowledgements
Not applicable.
Ethical considerations
Not applicable because this article does not contain any studies with human or animal subjects.
Consent to participate
Not applicable because this article does not contain any studies with human subjects.
Consent for publication
Not applicable because this article does not contain any studies with human subjects.
Author contributions
MS conceived and outlined the manuscript and integrated co-author contributions. MS, JT, BB, PW, and OJBDW reviewed the literature, contributed conceptual input, and drafted sections of the manuscript. DBZ and TC provided senior guidance on framing and organization. All authors reviewed, edited, and approved the final version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors are supported by a grant from the National Institute of Mental Health (U01MH135901). Dr Buck is supported by a Mentored Patient-Oriented Career Development Award from the National Institute of Mental Health (K23MH122504). The views expressed in this manuscript do not necessarily represent the views of the National Institute of Mental Health, nor did the sponsor play any role in the conception or drafting of this manuscript.
Declaration of conflicting interest
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Dr. Ben-Zeev has financial interests in Merlin and FOCUS technology. He has provided consultation services to K Health, Boehringer Ingelheim, Deep Valley Labs, Butler Hospital, and Otsuka Pharmaceuticals.
Data availability
Not applicable.
