Abstract
Artificial intelligence (AI) is transforming qualitative research by streamlining data management and analysis. However, its application raises methodological, ethical, and cultural considerations, especially in large-scale, multilingual studies. We outline the step-by-step integration of AI into our qualitative data analysis of two projects, QualiECAC4 and BUMPER, guided by Bengtsson’s four-stage content analysis framework. The first project involved 141 individual interviews conducted across nine EU Member States (MS), while the second involved 73 participants (eight individual interviews and twelve focus groups) across seven EU MS. In both projects, AI tools (ATLAS.ti for coding; DeepL Pro for translation) facilitated transcription, translation, initial coding, and theme identification, with all outputs subjected to systematic human review at each analytical phase. Integrating AI into our workflow accelerated data processing and highlighted consistent coding patterns across diverse multilingual datasets. At each phase of the content analysis framework, we pinpointed concrete benefits and tackled challenges, such as overlapping codes, nuanced interpretations, and cultural subtleties, through a structured human-in-the-loop process that combined open and intentional coding. While AI significantly enhances speed and depth, it still requires active human oversight to maintain methodological rigour and preserve contextual accuracy. Therefore, rather than reporting thematic findings, we draw on our team’s practical experiences to provide clear, actionable recommendations for integrating AI into qualitative research, suggesting specific updates to reporting standards (COREQ) to ensure transparency, accountability, and ethical practice.
Introduction and Context
The rapid advancement of artificial intelligence (AI) has profoundly reshaped how we live, work, and conduct research. While AI holds transformative potential for accelerating scientific discovery, its integration also introduces methodological, ethical, and practical challenges. To guide responsible AI adoption in research, the European Commission (EC) has issued comprehensive recommendations emphasising accountability, transparency, and iterative learning (European Commission, 2021).
Contextualising Cancer Prevention in the EU
The EC has also funded two projects under the Europe’s Beating Cancer Plan (EBCP), one to update the European Code Against Cancer (ECAC), and one to develop a European Union (EU) Mobile App for Cancer Prevention informed by the BUMPER project (European Commission, 2021). These projects aim to provide accessible, evidence-based guidance on cancer risk reduction, covering recommendations on lifestyle and environmental risk factors, and interventions proven effective, such as vaccination and screening. First launched in 1987 and last revised in 2014, the fourth edition of the European Code Against Cancer (ECAC4) outlines established evidence-based actions to lower cancer risk (Schüz et al., 2015). Simultaneously, the BUMPER project has engaged end-users to optimise app usability, trained health promoters in digital health literacy, and piloted the app across multiple Member States (MS) to enhance adoption and reinforce preventive actions.
This is particularly relevant, considering that across the EU, over 10% of the population has inadequate health literacy, which hinders engagement with preventive actions and exacerbates health disparities (European Commission, 2021). Additionally, poor health literacy correlates with lower uptake of health-promoting behaviours (Fernandez et al., 2016), reduced participation in screening programmes (Sørensen et al., 2015), and higher morbidity and mortality (Fan et al., 2021), while undermining risk perception and self-management skills crucial for chronic disease prevention (Ferrer & Klein, 2015; van der Gaag et al., 2022).
Overview of AI-Assisted Case Studies
Currently, AI is revolutionising qualitative research by enabling efficient management of large datasets, expediting data processing, and supporting automated theme detection. Tools like ATLAS.ti and NVivo now incorporate AI-assisted coding, quickly generating initial codes that researchers refine, thereby reducing the manual labour of traditional coding (Jiang et al., 2021; Marshall & Naff, 2024). Moreover, AI-driven analytics can propose thematic inferences, enhancing both the efficiency and depth of content analysis (Jiang et al., 2021). However, because qualitative inquiry is inherently interpretative, AI should augment rather than replace human judgment (Jiang et al., 2021; Marshall & Naff, 2024). Key considerations—data privacy protection, mitigation of algorithmic bias, and continuous human oversight—are essential to uphold the rigour and integrity of qualitative findings (Acheampong & Nyaaba, 2024; Christou, 2023). Current reporting frameworks, such as the COREQ checklist, predate AI’s widespread use and require updates to address transparent documentation of AI’s role (Tong et al., 2007).
Under the EBCP, two AI-assisted qualitative projects, QualiECAC4 and BUMPER, examined public engagement with cancer prevention guidance and digital tools. Although awareness of ECAC4 was low, 60% of readers indicated willingness to adopt prevention behaviours (Ritchie et al., 2021). Therefore, the ECAC needs to undergo further revision to improve clarity and uptake. To achieve this, the QualiECAC4 study has explored perceptions of the ECAC4 recommendations across 141 participants in nine EU MS, utilising AI tools to streamline transcription, multilingual synthesis, and thematic analysis.
The BUMPER project, in turn, has focused on the development and usability testing of an EU Mobile App for Cancer Prevention, designed to translate ECAC recommendations into an interactive, user-centred platform. By engaging diverse end-users to identify barriers and facilitators to digital adoption and training health promoters in digital health literacy, BUMPER has ensured that the app is scientifically robust and accessible across various populations. Piloted in multiple EU MS, BUMPER aimed to boost app usability and reinforce preventive behaviours.
Instead of detailing the case studies’ results, this paper examines the use of AI-assisted qualitative analysis in these two EBCP cancer prevention projects, highlighting AI’s strengths and limitations, offering practical guidance for its implementation, and assessing its alignment with established qualitative frameworks to enhance transparency and rigour.
Study Design and Integration of AI in Qualitative Analysis
This section outlines how AI was methodically embedded into the qualitative research of the QualiECAC4 and BUMPER projects, using Bengtsson’s (2016) content analysis framework as the guiding structure. Both projects explored cancer prevention literacy and behaviour among diverse European populations and utilised AI tools to augment analytical depth and efficiency while maintaining methodological rigour and transparency. Additionally, this section suggests how the use of AI tools could be incorporated into the COREQ, a 32-item checklist developed by Tong et al. (2007), to ensure comprehensive and transparent reporting of qualitative research.
Case Study Objectives and Methodological Approach
The QualiECAC4 project aimed to identify barriers and facilitators related to adopting cancer prevention behaviours and uptake of interventions recommended by the ECAC4 using in-depth interviews across nine EU MS (Feliu et al., 2024). This research is part of a broader initiative to inform the development of the fifth edition of the ECAC and help experts ensure the population’s needs are met. AI tools augmented traditional qualitative methods, enabling a complex, multi-country analysis that spanned diverse cultural and linguistic contexts. The QualiECAC4 study employed the COM-B Model of Behaviour (Michie, 2011) as a theoretical framework to develop an exploratory qualitative design, utilising semi-structured interviews with adults aged 18–65 across nine EU MS: Bulgaria, Croatia, France, Germany, Ireland, Poland, Portugal, Romania, and Spain. The primary aim was to investigate perceived barriers and facilitators influencing the adoption of cancer prevention actions and interventions recommended by ECAC4 (Feliu et al., 2024). A quota sampling strategy was employed to ensure heterogeneity and inclusion of populations with relevant characteristics related to our research question, as per the literature (i.e., sex, age, and level of education). A total of 141 interviews were conducted.
The BUMPER project employed qualitative research methods to refine the usability of the EU Mobile App for Cancer Prevention. This multi-country EU study examined the enablers and barriers experienced by end-users of the cancer prevention mobile app across varying levels of digital health literacy. In-depth interviews and focus group discussions were conducted in seven EU MS (Cyprus, Finland, Germany, Hungary, Portugal, Slovenia, and Spain), and data were analysed to generate evidence-based recommendations for improving the app’s usability while ensuring equitable access without deepening the digital divide. Participants were sampled to include diverse socioeconomic backgrounds, age groups, genders, and digital health literacy levels. Recruitment combined in-person outreach with online social media campaigns, tailored to each country’s context. Trained moderators facilitated focus group discussions (FGDs) and individual interviews using a semi-structured interview guide complemented by socio-demographic questionnaires. In total, 73 participants took part in 12 FGDs and eight individual interviews. This paper emphasises the implementation and experience of AI-driven qualitative analysis, rather than reporting the studies’ outcomes.
Transcription and Translation
For both projects, professional services conducted manual verbatim transcription to ensure initial accuracy. These transcripts were then translated into English using AI-assisted software (DeepL Pro) to expedite multilingual processing. To ensure rigorous quality control, local research teams fluent in the source languages conducted line-by-line checks of AI-generated translations against the original recordings, flagging any discrepancies for review and correction. Local reviewers independently annotated translation inconsistencies and integrated corrections directly into the transcripts. Bilingual researchers further reviewed the revised translations to ensure conceptual equivalence and cultural appropriateness, collaboratively resolving any remaining discrepancies.
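A quality-control check of this kind can be partly automated. The sketch below is illustrative only (the function name, the sentence-count heuristic, and the 20% tolerance are our assumptions, not part of the projects' actual tooling): it flags transcript passages whose AI translation looks structurally incomplete, so bilingual reviewers can prioritise them for line-by-line checking.

```python
import re

def flag_translation_gaps(source_text: str, translated_text: str,
                          tolerance: float = 0.2) -> list[str]:
    """Flag a passage whose AI translation may be incomplete, using a
    simple heuristic: sentence counts in source and translation that
    diverge by more than `tolerance` (hypothetical threshold)."""
    flags: list[str] = []
    src = [s for s in re.split(r"[.!?]+", source_text) if s.strip()]
    tgt = [s for s in re.split(r"[.!?]+", translated_text) if s.strip()]
    if not src:
        return flags
    ratio = len(tgt) / len(src)
    if abs(ratio - 1.0) > tolerance:
        flags.append(f"sentence count mismatch: {len(src)} source "
                     f"vs {len(tgt)} translated")
    return flags
```

Such heuristics only triage reviewer attention; they cannot judge conceptual equivalence, which is why the human line-by-line review described above remains the actual quality gate.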
Thematic Content Analysis (TCA) with AI-Assisted Tools
To streamline transcript analysis, we chose ATLAS.ti v9.1—an early AI adopter with a hybrid architecture that blends OpenAI GPT-powered generative coding and summaries, smart search, and sentiment analysis tools (ATLAS.ti, 2023). This automation prioritised human interpretation over manual coding, ensuring adherence to project timelines. AI-assisted initial coding generated over 5,000 codes in QualiECAC4 and more than 1,300 in BUMPER, which two trained reviewers refined using an inductive–deductive hybrid approach:
• Inductive (Open) Coding: AI detected recurring phrases and co-occurrence patterns as a preliminary code set.
• Deductive (Intentional) Coding: Predefined codes, informed by the ECAC4 recommendations (QualiECAC4) and digital health frameworks (BUMPER), ensured theoretical alignment.
Reviewers validated and refined AI-generated codes to maintain conceptual fidelity and methodological rigour, exemplifying a human-in-the-loop process that harnesses AI efficiency without sacrificing interpretive depth.
Results
Implementing the Content Analysis Framework with AI Integration
We adopted Bengtsson’s (2016) four-stage content analysis framework—Decontextualisation, Recontextualisation, Categorisation, and Compilation—to illustrate AI’s practical application at each phase through two case studies, rather than to evaluate AI critically. We document each AI integration step with human oversight, outlining protocols for transparent and reproducible reporting, as well as bias mitigation. The subsequent sections detail the specific tools, reporting measures, and precautions taken to ensure methodological integrity. This detailed documentation is crucial for guiding other researchers in adopting and adapting AI-driven qualitative methods with transparency, rigour, and reproducibility.
Stage 1: Decontextualisation
In the decontextualisation phase, all transcripts underwent a comprehensive review. We used ATLAS.ti v9.1’s AI-driven NLP capabilities to accelerate coding, enabling efficient detection of recurring terms and preliminary theme generation (ATLAS.ti, 2023). We applied both open (inductive) and intentional (deductive) coding approaches. In open coding, AI identified patterns and suggested codes, generating over 5,000 initial codes in QualiECAC4 and 1,300 in BUMPER that required consolidation. Intentional coding relied on a predefined codebook based on our research questions, ECAC4 recommendations, and digital health literacy literature to maintain analytical focus; however, this approach may have missed insights outside of those codes. To mitigate these issues, two reviewers independently reviewed AI-derived codes, collaborated with a third reviewer to reconcile discrepancies, and documented all coding decisions in an Excel audit trail to ensure transparency and facilitate further scrutiny.
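Consolidating thousands of AI-suggested codes can be assisted by a simple similarity pass before human review. The sketch below is a minimal illustration, not the projects' actual procedure (the function, the greedy matching strategy, and the 0.85 threshold are assumptions): it maps near-duplicate code labels to a canonical representative, leaving the merge decision itself to reviewers.

```python
from difflib import SequenceMatcher

def consolidate_codes(codes: list[str], threshold: float = 0.85) -> dict[str, str]:
    """Map each code label to a canonical representative via greedy
    string-similarity matching (hypothetical 0.85 threshold). The output
    mapping is a candidate list for reviewers, not a final merge."""
    canon: list[str] = []
    mapping: dict[str, str] = {}
    for code in codes:
        norm = code.strip().lower()
        match = None
        for c in canon:
            if SequenceMatcher(None, norm, c.lower()).ratio() >= threshold:
                match = c  # near-duplicate of an existing canonical label
                break
        if match is None:
            match = code.strip()
            canon.append(match)  # first occurrence becomes canonical
        mapping[code] = match
    return mapping
```

An audit trail like the Excel log described above can then record, for each mapping entry, whether reviewers accepted or overrode the suggested merge.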
Stage 2: Recontextualisation
At this stage, we re-evaluated the coded transcripts to confirm inclusion of all pertinent data and exclusion of irrelevant information, ensuring alignment with our research aims. In the BUMPER project, open and intentional coding were conducted concurrently, and emerging insights from both approaches were synthesised into the final categories. ATLAS.ti’s AI features streamlined this process by flagging overlooked segments and generating transcript summaries, which helped in organising the data and cross-verifying themes for consistency. Researchers then jointly examined the coded data to maintain consistency and coherence. Although AI efficiently highlighted duplicate codes, the research team manually reviewed these suggestions and made the final decisions on merging, ensuring that all consolidations accurately reflected the participants’ nuanced contexts and meanings.
Stage 3: Categorisation
At this stage, codes were synthesised into coherent categories and overarching themes. QualiECAC4’s theoretical framework, the COM-B Model of Behaviour, was used to identify and categorise the codes into the main factors believed or found to influence behavioural change outcomes. Simultaneously, the team utilised AI-generated code summaries to reveal overarching patterns and systematically structure the categories.
However, because open coding generated such a large volume of codes, we conducted a review and consolidation of duplicates to preserve clearly defined and meaningful categories. Consequently, integrating both open and intentional coding strategies effectively mitigated the individual limitations of each method. The combined use of open and intentional coding allowed for capturing both anticipated and emergent themes, providing a comprehensive analysis. Ultimately, our findings underscored that human expertise was crucial for ensuring that the derived themes authentically represented the data. Furthermore, our cross-national study design required close collaboration with local teams to validate that AI-generated themes were both culturally sensitive and contextually accurate.
Stage 4: Compilation
In the compilation phase, we interpreted the results and developed them into reportable findings. AI-powered tools produced visualisations, such as bar charts, tree maps, network diagrams, and word clouds, that enhanced the clarity of our data presentation. AI also generated summaries for integration into reports and publications. To strengthen credibility, the team implemented three key measures: rigorously verifying AI-generated visualisations and summaries for accuracy, transparently acknowledging AI’s analytical contributions in reporting, and engaging multiple researchers to review findings and interpretations for validation.
Qualitative Content Analysis Framework Stages and AI Integration With Credibility Enhancements
Reporting the Use of AI in Qualitative Research
Drawing on lessons learned from the QualiECAC4 and BUMPER projects, we propose tailored recommendations for incorporating AI into the COREQ checklist. Table 2 aligns each COREQ criterion with the specific phases, such as transcription, coding, theme development, and pattern detection, in which AI-powered qualitative analysis tools are applied. As AI becomes an integral component of qualitative research, updating established reporting standards, such as COREQ, to reflect this evolution is essential. With AI-driven analytical tools gaining traction in the field, immediate focus should be directed toward the following areas:
• Training and proficiency of investigators (Item 5): AI platforms, such as ATLAS.ti, require investigators to possess both technical fluency with the software and a solid foundation in qualitative research methods. Reporting the scope and depth of AI-specific training and methodological expertise helps improve the credibility and methodological integrity of AI-supported findings.
• Methodological orientation and AI integration (Item 9): Articulate how AI-driven features, such as automated thematic coding, are integrated with conventional methodologies (e.g., content analysis, grounded theory), and explain how these AI enhancements refine or reshape each analytic phase to improve clarity and methodological transparency.
• Transcription and coding (Items 19 and 24): Although AI-driven transcription and auto-coding tools accelerate data processing, they require ongoing human oversight. Researchers should detail the validation procedures, such as inter-coder reliability assessments, error correction workflows, and rules for resolving disagreements, used to confirm the precision and relevance of AI-generated transcripts and codes.
• Derivation of themes and coding tree (Items 25 and 26): AI-driven analytics can detect themes and patterns within extensive datasets; however, researchers should describe how these AI-generated themes were manually reviewed, refined, or merged. Such documentation ensures the resulting thematic structure integrates algorithmic insights with the researcher’s interpretive expertise.
COREQ Checklist With AI Integration Impact and Reporting Guidelines
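The inter-coder reliability assessments recommended under Items 19 and 24 can be reported with a standard agreement statistic. As one possible illustration (the function below is our own sketch, not part of either project's documented workflow), Cohen's kappa for two coders' segment-level code assignments can be computed directly:

```python
from collections import Counter

def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's kappa: chance-corrected agreement between two coders
    who each assigned one code per transcript segment."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed proportion of segments where the two coders agree
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement by chance, from each coder's label frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)
```

Reporting the kappa value alongside the disagreement-resolution rules would give readers a concrete measure of how reliably human reviewers converged when validating AI-generated codes.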
Incorporating these AI-focused components within the COREQ checklist enhances transparency while reinforcing the methodological rigour and accountability of qualitative research. Revising COREQ to address AI integration is essential for aligning reporting standards with the field’s evolving analytical capabilities.
Discussion
Adopting AI-driven analysis significantly expedited our qualitative analyses and enhanced their depth, in line with Anis and French (2023). Automated processing of raw data shifted the research team’s focus from routine transcription and coding tasks to in-depth interpretation. Although we did not conduct a side-by-side comparison with traditional manual methods, notable reductions in processing time were observed. This efficiency not only streamlined the workflow but also fostered a more iterative and introspective approach, permitting ongoing refinement and expansion of our coding scheme and thematic insights.
While AI integration delivered substantial benefits, it also introduced challenges, mainly controlling redundant codes and finding an equilibrium between automated and manual coding. Open coding produced an extensive array of codes, highlighting the necessity for systematic methods to synthesise and interpret them. Mitigating code redundancy was essential to preserve the distinctiveness and relevance of categories. Equally important was harmonising AI’s efficiency with meticulous human review, which involved thorough consolidation of overlapping codes to uphold a coherent classification framework.
Despite the extra time invested in human oversight and manual code review, which reduced the anticipated efficiency gains from AI-driven analysis, our approach still delivered a marked reduction in overall processing time compared to standard methods. Employing open and intentional coding in combination was pivotal for uncovering both expected and unexpected themes. Relying solely on intentional coding risked overlooking novel insights, while relying exclusively on open coding produced an excessive number of codes without a clear structure. By combining these approaches, we achieved a thorough analysis that merged predefined categories with emergent themes, thereby enriching and expanding our findings.
AI-Enhanced Qualitative Research on Cancer Prevention Literacy
Integrating AI tools into the QualiECAC4 and BUMPER projects accelerated data processing and enriched thematic analysis, deepening our insights into cancer prevention literacy across diverse European cohorts. Recent studies have highlighted AI’s transformative impact on healthcare, improving patient care pathways, streamlining clinical workflows, and advancing medical research (Adegboye, 2024; Shaban-Nejad et al., 2018). In particular, AI-driven platforms can generate personalised health literacy plans for home care (García-López et al., 2023) and power virtual health environments that deliver tailored information to users (Rubinelli et al., 2009).
In our work, AI-enabled rapid transcription and coding allowed us to investigate how varying levels of prevention literacy influence healthy behaviours. We found that participants with lower literacy often misunderstood preventive messages, an insight consistently flagged by AI across multiple interviews, and struggled to assess their own risk of cancer. AI-assisted coding and theme detection also revealed key concerns of misinformation, cultural variations in health beliefs, and disparities in digital health competence, factors that influence individuals’ engagement in prevention-focused behaviours (Schulz & Nakamoto, 2013). While AI tools can democratize access to health information, they may also propagate inaccuracies if unchecked, reinforcing the need for human oversight to validate AI outputs (Nutbeam, 2023).
Ethical Challenges and Considerations
Applying AI tools in qualitative studies necessitates careful attention to ethical considerations, including:
Data Privacy and Confidentiality
To protect participant anonymity, all datasets underwent de-identification before entering the analysis pipeline. We implemented end-to-end encryption for both stored and transmitted data, adhering to GDPR (Regulation EU, 2016/679, 2016) and other relevant local data protection laws. Given that many AI-driven qualitative platforms operate on cloud infrastructure, researchers must confirm that service providers comply with these regulations and enforce robust safeguards, such as data-at-rest and data-in-transit encryption, clearly defined retention schedules, strict access controls, and secure deletion protocols. Consent forms and ethics approvals should explicitly describe the use of AI tools and cloud storage to ensure participants understand where and how their information will be processed (Schmoll, 2024).
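One building block of such de-identification can be sketched in a few lines. This example is purely illustrative (the function, its name, and the pseudonym scheme are our assumptions): it replaces known participant names with stable pseudonyms before a transcript enters any AI pipeline, while real de-identification must also handle dates, places, and indirect identifiers.

```python
import re

def deidentify(text: str, name_map: dict[str, str]) -> str:
    """Replace each known participant name with its assigned pseudonym.
    Word boundaries prevent partial matches inside longer words."""
    for name, pseudonym in name_map.items():
        text = re.sub(rf"\b{re.escape(name)}\b", pseudonym, text)
    return text
```

Keeping the name-to-pseudonym map stored separately, under the access controls described above, preserves the ability to re-link data where consent and retention schedules permit.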
Bias Mitigation
AI models trained on specific datasets may unintentionally perpetuate biases embedded in their source data (Acheampong & Nyaaba, 2024; Christou, 2023). To counter this, we rigorously evaluated AI-generated codes and themes against the objectives of our study. Furthermore, independent reviewers manually assessed the AI outputs to verify cultural appropriateness and contextual accuracy.
Transparency, Accountability, and Active Human Engagement
We ensured transparency by explicitly documenting the use of AI tools in our methodology. Although AI enhanced our analytical workflow, it never replaced researchers’ interpretive contributions, particularly for culturally nuanced or complex insights (Christou, 2023). Throughout AI-assisted analysis, researchers remained actively involved, applying critical judgment to maintain contextual grounding and ethical rigour. This continuous human oversight was crucial for preserving the study’s integrity and validity.
Future Directions for AI in Qualitative Research
AI’s influence in qualitative research is set to expand, opening doors to more sophisticated analysis and interpretation tools. To harness these opportunities, researchers should receive in-depth training in both qualitative methodologies and AI technologies. Equally crucial is designing AI platforms that are intuitive and accessible to users with varying levels of technical expertise. This approach promotes wider uptake of AI platforms and innovation throughout the qualitative research community.
As AI becomes increasingly embedded in qualitative research, it is vital to sustain ongoing dialogue about ethical concerns, including data privacy, bias mitigation, and transparency. Future efforts should prioritise creating sophisticated AI solutions that not only excel at processing and interpreting rich qualitative data but also embed ethical safeguards to protect participants and preserve integrity throughout the research process.
Looking ahead, ongoing research must ensure the ethical and effective implementation of AI in public health interventions, balancing technological advances with professional AI literacy and robust oversight (Adegboye, 2024; Nutbeam, 2023; Wiljer & Hakim, 2019). By combining AI’s analytical power with the contextual expertise of human researchers, we can develop more personalised, culturally attuned interventions to improve cancer prevention literacy and ultimately enhance public health outcomes across Europe.
Conclusion
Our application of AI-driven tools in the QualiECAC4 and BUMPER projects demonstrated notable gains in processing speed and analytic depth when handling large, multilingual qualitative datasets. Automated translation and initial coding freed researchers to focus on interpretation, resulting in richer insights into health literacy, digital engagement, and cancer prevention behaviours across diverse European populations. However, AI’s outputs frequently required human refinement to resolve code redundancy and capture cultural and contextual subtleties. Maintaining a human-in-the-loop approach, combining open (inductive) and intentional (deductive) coding, ensured a balanced capture of both emergent themes and theory-informed categories. Rigorous oversight also safeguarded ethical standards, with clear protocols in place for data privacy (GDPR compliance), bias mitigation, and transparent reporting, all aligned with COREQ guidelines.
Looking forward, embedding AI in qualitative research demands sustained investment in researcher training and the development of user-friendly, interpretable platforms. Updating reporting frameworks to document AI’s role explicitly will foster methodological transparency and accountability. By integrating AI’s computational power with researchers’ interpretive expertise, future studies can deliver more personalised, culturally attuned interventions and advance cancer prevention literacy across populations. In sum, AI has the potential to advance qualitative public health research by improving scalability, accelerating analysis, and uncovering new insights. However, its value lies in augmenting—not replacing—the nuanced judgment and ethical responsibility of experienced researchers.
Footnotes
Authors Note
Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article, and they do not necessarily represent the decisions, policies or views of the International Agency for Research on Cancer /World Health Organization.
Ethical Approval
These studies were conducted in accordance with the Declaration of Helsinki. Ethical approval for the QualiECAC4 study was granted by the International Agency for Research on Cancer (IARC/WHO) Ethics Committee (IEC project 22–24). Local ethical approval for the study methods was obtained from the Ethics Commission Medical Chamber Bremen (No. 855, 01.03.2023) in Germany, from the Dublin City University Research Ethics Committee (DCUREC/2023/006) in Ireland, and from the Ethics Committee of the Center for Innovation in Medicine (EC-INOMED) (No. EC-INOMED: 2). Ethical approval for the BUMPER qualitative study was granted by the Ethics Commission at the University of Bremen (Application number 2023-10).
Authors’ Contributions
B.B. led the study, including conceiving the original idea with F.A. and A.F., and conducted the investigation by developing the research methodology, analysing the information, and performing literature searches. B.B. drafted the initial manuscript and incorporated feedback based on the co-authors’ suggestions. A.F. contributed to refining the study design, supported data analysis, and provided critical feedback throughout the manuscript drafting process. A.F. reviewed the manuscript thoroughly and suggested substantial edits to improve clarity and content. F.A. offered expertise and guidance on the research methodology and provided key insights that shaped the study’s analytical approach. F.A. also contributed to revising and editing the manuscript for intellectual content. T.B., H.Z., and C.E. provided overall supervision, ensuring the rigour of the research. They also offered guidance during critical stages of the study and reviewed and provided feedback on the draft of the manuscript. All authors read and approved the final version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The QualiECAC4 project has been funded by the Cancer Prevention Europe (CPE) consortium and was additionally supported by the IARC’s Environment and Lifestyle Epidemiology Branch budget. The project, Boosting the Usability of the EU Mobile App for Cancer Prevention (BUMPER), received funding from the EU4Health (European Union) programme under Grant Agreement No. 101079924 (November 2022 - November 2024). The qualitative study conducted was a part of the BUMPER project.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated and/or analysed during the current study are not publicly available but can be obtained from the corresponding author upon reasonable request.
