Abstract
Artificial intelligence (AI) technologies are rapidly expanding in qualitative health research and often promise improved efficiency or novel discoveries. However, this promise has yet to be realized, and serious ethical issues emerge across the use of AI videoconferencing technologies to conduct interviews, AI transcription services, and AI-augmented qualitative analysis tools. These ethical dilemmas are not always obvious and require careful consideration of the ramifications of integrating these technologies into the research process. These concerns are relevant to researchers at all levels of experience, from emerging scholars to more practiced researchers, but are particularly significant in training new scholars who are early adopters of AI technologies. To trace the ethical issues surrounding AI in the practice of qualitative health research, we map the specific values of autonomy, privacy, validity, and equity to highlight decision points and provide a framework for navigating ethical use of AI tools.
Artificial intelligence (AI) is increasingly featured in novel tools for qualitative health research and is also quietly being embedded in the existing technologies qualitative researchers already use. Augmented analysis programs like MaxQDA AI Assist (Loxton, 2024), AI transcription services such as Rev.ai (Meanwell et al., 2024), and Zoom meeting transcriptions (Herdiyanti, 2024) provide promising avenues for efficiency in research. Further, new services such as chatbots (e.g., ChatGPT) (Van Manen, 2023) enable "a large language model to assist with tasks such as coding, summarizing, data exploration, generating interview questions, refining writing, and exploring theoretical ideas" (OpenAI, 2025). However, some researchers recommend caution (Hitch, 2024), and acceptance of the use of these technologies among qualitative researchers is mixed. Regarding the specific processes of qualitative research, one study found that qualitative researchers were more accepting of the use of natural language processing (NLP) AI transcription services than of generative AI coding for analysis (Marshall & Naff, 2024). Patterns of acceptance also vary across institutions: researchers at R1 universities were found to be less accepting of the use of AI than researchers at other institutions (Marshall & Naff, 2024). Further, one study found that while early career scholars are more comfortable using AI for theory development, more experienced researchers were "skeptical" about the validity of research using these tools (Chatzichristos, 2025, p. 5). Despite this trepidation, "many researchers value AI for its ability to handle large datasets and identify patterns and themes" (Chatzichristos, 2025, p. 6), and the use of AI in qualitative health research has been called "inevitable" (Paulus, Lester, & Davis, 2025), "the way of the future," and "a virtual colleague" (Hitch, 2024, p. 604).
Given the novel nature of these technologies, even seasoned researchers are encountering new and unforeseen ethical challenges related to their use. Additionally, learners being trained in qualitative research practices may be eager to use these innovations but, without guidance, are likely unprepared to address the ethical dilemmas AI-assisted qualitative research presents. These technologies may be marketed as a "revolution" in research; however, allowing them to "simplify" the research process requires trade-offs that may be costly in other ways (Paulus & Marone, 2025, p. 398).
The first author is a qualitative health researcher, teacher, and bioethicist, and the second author is a software engineer who uses AI in their work developing enablement software. The following essay was inspired by the first author's conversations with graduate students in a Qualitative Research Methods class at the University of Missouri who were curious about using AI for their work. In course discussions, ethical questions emerged, such as: "Should I use AI transcription programs for my interviews?" "Do I have to disclose that I'm using AI in this teleconference?" and "Can I trust the output of an AI analysis program?" Many of these students were qualitative health researchers who ask participants questions about queer identity, neurodivergence, mental health, burnout, and birthing experiences. The practical questions raised by these students prompted the first author to seek the second author's insights into the specificities and technicalities of how AI presents ethical dilemmas in qualitative health research. In developing answers to these questions, it became clear that these considerations would be relevant to any researcher interested in learning to use AI for qualitative analysis, or even researchers who want to learn more about how AI may be integrated into research procedures in ways that are becoming less explicit and may be less obvious to users. Most extant research on the use of AI in qualitative research is descriptive in nature, tracing uptake of AI among various groups (Chatzichristos, 2025). Adding to this literature, the following essay outlines key theoretical ethical issues relevant to decision points in the use of AI in qualitative health research, thus linking ethical theory with research practice. Further, by connecting ethical principles to "ethics-in-action," this essay answers a call for applying abstract ethical principles to specific contexts (Knight et al., 2025), namely, qualitative research. Finally, examining ethical issues surrounding the current use of AI in qualitative research and reflecting on how to train future researchers in its use is particularly important because history has shown the harm that missteps in ethical reasoning can cause.
Notoriously unethical studies by researchers such as Milgram, Humphreys, and Zimbardo were framed as innovative and progressive at the time, yet they created material harm (Santi & Luna, 2024). These harms transpired despite the efforts of individuals who warned of the ethical issues present in the studies. In light of this history, the practice of gathering data from participants is sensitive. And while institutional review boards (IRBs) and research ethics committees (RECs) should provide oversight on the rapidly changing landscape of AI in research, these groups may not currently assess the full scope of potential harm resulting from the use of participant health-related data in AI technologies (Doerr & Meeder, 2022). Doerr and Meeder (2022) argue that some IRBs and RECs may limit their assessment of the ethics of AI in research based on how they interpret a statement in the "Common Rule," a U.S. federal policy regarding the protection of human research subjects. The "Common Rule" states that IRBs should not consider long-term effects of group harm—which are particularly relevant to the use of AI (Doerr & Meeder, 2022). Another pillar of research ethics, the Belmont Report, outlines specific principles in research ethics: respect for persons (autonomy), beneficence, and justice (Eto & Miller, 2025). However, these principles may not sufficiently account for emerging challenges related to the use of AI in human subjects research (Eto & Miller, 2025). As such, the ethical use of AI technologies rests on the researcher, who is responsible for the ethical integrity of their study. This relationship between researchers and participants requires particular care in the context of qualitative health research, wherein participants share highly personal and vulnerable information about their health. The following domains of ethical issues emerged inductively from conversations with students and address key ethical considerations around AI in qualitative health research involving potential harms in the ethical areas of autonomy, privacy, validity, and equity.
Ethical Considerations for the Use of AI in Qualitative Health Research
Autonomy
Participants' autonomous consent is a key value in human subjects research (Santi & Luna, 2024). A recent examination of AI ethics used the Declaration of Helsinki to draw a parallel between AI ethics in medical research and human subjects research (Baracho Dittrich & Reichenbach, 2025, p. 15). The authors emphasized the researcher's obligation to "protect the life, health, dignity, integrity, autonomy, privacy and confidentiality of personal information of research participants. The responsibility for the protection of research participants must always rest with physicians or other researchers and never with the research participants, even though they have given consent" (Baracho Dittrich & Reichenbach, 2025, p. 15). The authors proposed that "each potential participant must be adequately informed in plain language of the aims, methods, anticipated benefits and potential risks and burdens […] The potential participant must be informed of the right to refuse to participate in the research or to withdraw consent to participate at any time without reprisal" (Baracho Dittrich & Reichenbach, 2025, p. 17). However, it is unclear whether it is truly possible to gain participants' consent for their health-related data to be used in an AI program. The consequences of inputting data into AI tools are far-reaching and difficult to predict.
For example, the use of AI technologies to process health-related data presents ethical issues represented by a "black box" phenomenon, wherein the technologies and processes of AI are opaque and poorly understood by both practitioners and laypeople (Chau et al., 2025). The lack of transparency around the "black box" makes it difficult for researchers and participants to ascertain and discuss the potential risks of AI processing of data. Further, with regard to data security, there are several avenues through which the security of sensitive data may be compromised, exposing identifiable health information about a participant. For example, if a participant's identifying data are sent to a cloud-based service such as Dedoose, there are several junctures at which data that were purportedly secure may become compromised through privacy breaches (Gal, 2023; Herdiyanti, 2024; Hitch, 2024). Or, if participant data are used in a program that uses submissions for model training, then it is difficult to predict the various risks to their data. For example, privacy could be compromised through information entering the public domain (e.g., via ChatGPT) or being reconstituted in generative AI outputs. Because at present we cannot predict these potential breaches, some may argue that in these cases we cannot obtain informed consent for data use in certain AI technologies.
Further, in the new era of AI technologies, protecting participant autonomy begins with the design of the study. For example, technologies such as Zoom video conferencing currently use generative and NLP AI in the AI Companion and Transcription features. If interviews are to be conducted on Zoom, interviewers must ensure in advance that AI features are only used when participants explicitly consent to them. This informed consent preserves autonomy by ensuring participants understand and have control over how their private health information will be used (Weiner et al., 2025). AI Companion and AI Transcription can be disabled in Account Settings (Enabling or disabling the AI Companion Panel in Zoom Workplace, 2023).
Given these concerns, when initially exploring the use of AI, we recommend researchers use only data for which participants have explicitly consented to input into AI programs, even if their local IRBs do not explicitly require this practice. When obtaining consent, researchers should seek to articulate, to the best of their ability, the risks and benefits of data processing using AI. Samuel and Wassenaar (2025) propose guidelines for obtaining participant consent for data to be transcribed using NLP AI. If consent has not been obtained but a researcher is still interested in exploring AI research tools, we recommend instead developing skills in "prompt engineering," also known as prompt design, which allows researchers to explore the capabilities of AI without data from consenting participants (Chubb, 2023). Prompt design is essential to the quality of outputs in generative AI analysis (Zhang et al., 2025). Zhang et al. (2025) propose best practices for prompt design in qualitative analysis, arguing that good prompt design includes background, the goal of the task, a description of the analytical process, a definition of input and output data format, defined roles, output prioritization, and rules to make output transparent (a template following these elements is sketched at the end of this section). Finally, by using synthetic data generated by AI (Fuchs et al., 2025), researchers can gain skills in prompt design that will be foundational should they later use AI in qualitative analysis of real participant data. For instructors of qualitative methods, Wheldon and McKee (2025) provide detailed lesson plans for teaching the use of AI in qualitative research with sample data, including a brief discussion of ethical considerations. These lesson plans encourage students to experiment with the processes of AI-assisted qualitative analysis in groups, exploring the advantages and disadvantages of using these tools. These guided activities also allow learners to reflect on beneficence by weighing risks against benefits to participants.
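To make these elements of prompt design concrete, the following minimal sketch (in Python) assembles a prompt following the structure Zhang et al. (2025) describe. The study details, analytic instructions, and wording are hypothetical illustrations rather than a validated template, and the excerpt is assumed to be synthetic practice data, not real participant data.

```python
# Illustrative prompt template following the elements described by
# Zhang et al. (2025). All study details below are hypothetical.
PROMPT_TEMPLATE = """\
Role: You are a research assistant supporting reflexive thematic analysis.
Background: This practice exercise uses synthetic interview excerpts about
nurse burnout; no real participants are involved.
Goal: Propose candidate codes for the excerpt below.
Analytical process: Read the excerpt, suggest 3-5 descriptive codes, and
quote the exact words that support each code.
Input format: A plain-text interview excerpt.
Output format: A numbered list giving each code name, a one-sentence
definition, and the supporting quotation.
Prioritization: Prefer codes grounded in the speaker's own language over
abstract theoretical labels.
Transparency rule: If the excerpt does not support a code, say so rather
than inventing one.

Excerpt:
{excerpt}
"""

def build_prompt(excerpt: str) -> str:
    """Insert a (synthetic) excerpt into the structured template."""
    return PROMPT_TEMPLATE.format(excerpt=excerpt)
```

Practicing with a structured template like this on synthetic data lets a learner refine each element, and observe how the output shifts, before any real participant data are involved.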
Privacy
Protecting participants' private health information presents ethical issues in all forms of electronic data storage but becomes particularly salient when processing participant data using AI. For example, the use of NLP AI for transcription is alluring because it saves valuable time and energy, but there are several ethical issues to consider when using this technology. First, transcription is often used as a first layer of analysis and allows for the confidential practice of removing identifying information through which an individual could be triangulated (Tracy, 2024). If audio is automatically transcribed using certain NLP AI programs (particularly low-cost and free programs), these potential benefits of transcription are lost, and identifying details are automatically included in the transcript. Even if specific proper names are removed after NLP AI transcription, it may still be possible to triangulate a participant's identity from the details of their story, such as places, dates, and contextual specifics; these data can be compromised if used for model training and later reconstituted in generative AI outputs.
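One way to retain part of this lost de-identification layer is to machine-assist redaction before any transcript reaches a cloud or AI service. The minimal sketch below uses the open-source spaCy library's named-entity recognition to flag likely identifiers; the example sentence is invented, and automated recognition will miss context-specific identifiers, so a sketch like this supplements, rather than replaces, a careful human read-through.

```python
import spacy  # assumes: pip install spacy && python -m spacy download en_core_web_sm

nlp = spacy.load("en_core_web_sm")

def redact(text: str) -> str:
    """Replace likely identifiers (names, places, organizations, dates)
    with bracketed entity labels."""
    doc = nlp(text)
    redacted = text
    # Replace entities from last to first so character offsets stay valid.
    for ent in reversed(doc.ents):
        if ent.label_ in {"PERSON", "GPE", "LOC", "ORG", "DATE"}:
            redacted = redacted[:ent.start_char] + f"[{ent.label_}]" + redacted[ent.end_char:]
    return redacted

print(redact("I saw Dr. Jones at Mercy Hospital in Columbia last March."))
# Approximate output: "I saw Dr. [PERSON] at [ORG] in [GPE] last [DATE]."
# Exact entity spans depend on the model version.
```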
In the case of transcription, there are ways of using computing—but not AI—to transcribe more efficiently. For example, in the Microsoft Word desktop word processing program, a researcher can listen to interview audio and re-speak it aloud using the Word dictation feature, which transforms voice input to text output and currently avoids the use of AI. However, researchers should confirm whether a given dictation feature employs AI. As of this writing, the dictation feature in the desktop Word application does not use AI, but the web version allows the user to upload an audio file and automatically transcribe it using NLP AI (Microsoft adds AI-powered transcription to Word, n.d.). Notably, as of this writing, the use of AI is not disclosed in the web application of Word or its "Learn More" section; this is one example of hidden integration of AI in current technologies. Further, as discussed earlier, automatically generated Zoom transcription uses Otter.ai to transcribe interviews, which can potentially compromise data security, and it also stores data in the cloud unless specifically told not to, adding another juncture at which privacy may be breached. The same privacy issues apply to storage of qualitative health data in cloud-based AI analysis programs. Using "local," or non-networked, AI features can mitigate some of the risks associated with cloud-based storage (Wheldon & McKee, 2025), as sketched below.
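As one concrete illustration of a local workflow, the open-source Whisper model can transcribe audio entirely on a researcher's own machine, so the recording never leaves the computer (the file name below is hypothetical). Note that this is still NLP AI, so the consent considerations discussed above continue to apply; only the cloud-storage risk is reduced.

```python
import whisper  # assumes: pip install openai-whisper

# Local, non-networked NLP AI transcription: the model weights are
# downloaded once, after which the audio never leaves this machine.
model = whisper.load_model("base")
result = model.transcribe("interview-07.wav")  # hypothetical file name
print(result["text"])
```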
Navigating these features is part of becoming AI literate. It is the researcher's responsibility to seek to understand how participants' data will be used and to ensure that those data are protected. We argue that if it is too great a burden to investigate and understand these technologies, then one should continue to use traditional, non-AI technologies for qualitative analysis. We also anticipate that scholarly journals will increasingly require detailed disclosures of AI use. Researchers should be aware of these issues before an entire project is compromised by inadvertent or unethical AI use in transcription or cloud storage.
Validity
In addition to autonomy and privacy concerns, considerations regarding validity are significant when analyzing data using AI. Serious questions surround AI's capability to engage in complex interpretation, compromising the truthfulness or trustworthiness of generative AI outputs. Although current generative AI technologies seem intelligent and advanced, they lack the emotions that sensitize us to what is meaningful in our data (Shojaee et al., 2025). As of this writing, generative AI is not capable of being creative, generating new ideas, or making novel links between extant concepts in communication research. Research demonstrates that large language models (LLMs), the most common form of AI used in qualitative research, are still unable to assess complex meanings (Shojaee et al., 2025). For example, Dr. Usama Fayyad, the Inaugural Executive Director of the Institute for Experiential AI at Northeastern University, notes that general AI often lacks common-sense reasoning, such as: "if I put something heavy on something light, I am going to crush it" (Burns, 2025, p. 00:03:30). Dr. Fayyad also explains that large language models struggle to "decompose a problem," or reason through the sequential steps required to solve it, such as determining how to get from one place to another. AI outputs are "stochastic," or random, "parrots" that are highly sensitive to the prompts we give them (Burns, 2025, p. 00:19:20). These inherent weaknesses have been demonstrated to produce low-quality output, leading Friedman et al. (2024) to caution against the use of LLMs in qualitative research.
Further, LLMs are vulnerable to "hallucinations." LLMs work by predicting the next word in a sequence from the prompt a user provides. If the model is not given enough context, it may drift off subject or even fabricate facts, since it is only predicting the next word from probabilities learned from its training data (Zubiaga, 2024). These "unfaithful outputs" (Zubiaga, 2024, p. 3) may not accurately reflect themes in the data and instead may present false "hallucinations" that compromise the integrity and trustworthiness of research findings. Topics in qualitative health research appear particularly vulnerable to hallucinations. For example, a systematic review of the use of generative AI in research found that ChatGPT is especially prone to hallucinations when analyzing high-risk or nuanced content, or when critical analysis is needed (Adel & Alani, 2025). This weakness may explain why generative AI outputs in the fields of social science and public health were worse than those in other fields. Further, the accuracy of interpretation of existing research was found to be as low as 4.6%, and hallucination rates in systematic reviews ranged between 28% and 91% (Adel & Alani, 2025). These inaccuracies are particularly dangerous given the need for a sound evidence base to understand health experiences that profoundly affect people's lives.
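To make this mechanism concrete, the deliberately simplified sketch below samples a sentence from a toy table of next-word probabilities. Real LLMs use neural networks over vast vocabularies rather than hand-written tables, but the underlying point is the same: the output is fluent because each word is probable given the last, not because it is faithful to anything a participant actually said.

```python
import random

# Toy "language model": each word maps to possible next words with
# probabilities, standing in for patterns learned from training text.
bigram_probs = {
    "the":      [("patient", 0.6), ("study", 0.4)],
    "patient":  [("reported", 0.7), ("recovered", 0.3)],
    "reported": [("improvement", 0.5), ("anxiety", 0.5)],
}

def sample_next(word):
    """Sample the next word according to the toy probability table."""
    words, probs = zip(*bigram_probs[word])
    return random.choices(words, weights=probs)[0]

sentence = ["the"]
while sentence[-1] in bigram_probs:
    sentence.append(sample_next(sentence[-1]))
print(" ".join(sentence))
# e.g., "the patient reported anxiety" -- grammatical and plausible,
# but nothing guarantees it reflects what any participant said.
```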
To improve the validity of AI-assisted analysis, users should develop skills in "prompt engineering," that is, the practice of iteratively interacting with LLMs and refining their outputs (Chubb, 2023). In one description of the use of ChatGPT for qualitative analysis, the authors state, "Finally, we are not professional AI prompt designers and some of our instructions might seem clumsy at best. We note that this reality likely mirrors the use case of many qualitative researchers who lack familiarity with LLMs" (Wachinger et al., 2025, p. 964). If one seeks to use generative AI to analyze qualitative health data, skills in prompt design will likely be necessary to strengthen the trustworthiness of the results. It is possible that AI will improve in the area of validity in the future. Even then, however, not every generative AI feature will be capable of this level of sophistication, nor will such features be accessible to everyone. Further, these technologies carry the biases imprinted in their algorithms, biases that mirror so much of human consciousness and cognition, and yet AI lacks the reflective capacity to overcome, attenuate, or disclose those biases.
Finally, in the process of conducting qualitative research, technological reflexivity unearths the decision-making processes surrounding the use of technology in qualitative research design (Paulus, Pope, & Bower, 2025). Technological reflexivity requires a researcher to consider the consequences of various technological research choices, making explicit the reasoning and potential effects of these choices (Paulus, Pope, & Bower, 2025). Engaging in this practice likely strengthens the trustworthiness of findings and provides an opportunity for ethical reflection. Similarly, maintaining a robust audit trail explaining “what I did, how I did it, and why I did it” is a form of transparency in qualitative research (Tuval-Mashiach, 2017, p. 131). These reflections will be valuable when disclosing the use of AI in scholarly publications, a practice Resnik and Hosseini (2025) argue should be mandatory.
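As a practical illustration of such an audit trail, a researcher might log every AI-assisted step in a structured file. The sketch below is one hypothetical format of our own devising, echoing Tuval-Mashiach's (2017) three questions; the field names and file layout are assumptions, not a published standard.

```python
import json
from datetime import datetime, timezone

def log_ai_step(what, how, why, model, prompt, output,
                path="ai_audit_log.jsonl"):
    """Append one AI-assisted analysis step to a JSON-lines audit trail.
    Hypothetical format; adapt fields to your own study's needs."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "what_i_did": what,    # e.g., "first-cycle coding of transcript 07"
        "how_i_did_it": how,   # e.g., "prompted model with codebook v2"
        "why_i_did_it": why,   # the analytic rationale for this step
        "model": model,        # model name and version used
        "prompt": prompt,      # exact prompt text, for later disclosure
        "output": output,      # verbatim model output, before human review
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```

A log like this makes the eventual disclosure of AI use in a publication straightforward, because the exact prompts and outputs behind each analytic decision are preserved.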
Equity
One aspect of justice involves equity: ensuring fairness and correcting systemic inequities. These inequities often manifest through biases inherent in both individuals and systems. Qualitative health researchers—and specifically interpretive researchers—are often encouraged to engage in self-reflection. This practice involves stepping outside of oneself and assessing the extent to which the researcher's interpretations are shaped by their perceptions. In some cases, it is appropriate to name those perceptions explicitly (Tracy, 2024). In other cases, it is acceptable to allow biases to shape interpretations, or, at times, researchers are asked to bracket biases off and avoid allowing them to shape the analysis (Tracy, 2024). However, generative AI is neither capable of nor motivated to engage in the same practice. As such, it is currently impossible to know which biases to address in an AI interpretation. Researchers working from a positivist paradigm may use AI-augmented analysis to propose that their findings are more objective (though this argument may be less relevant for other paradigmatic approaches) (Hitch, 2024). However, generative AI technologies do not understand the reasoning behind the decisions they make, or even the distinctions between different concepts, because of the LLM format (Burns, 2025). As such, prejudicial biases have been perpetuated and even amplified in generative AI technologies along the lines of race, gender, class, and ability (Williams, 2024). This tendency is counterproductive to the goals of many qualitative health researchers—to identify and correct patterns of injustice and to improve material outcomes for individuals experiencing illness and disease. Thus, the use of these tools for analysis requires the reflection of a researcher with the motivation and ability to reflect on, attenuate, or disclose potential biases in the interpretations they present (Wachinger et al., 2025).
A related cognitive bias involves anchoring, or treating initial information as true and failing to integrate emerging new information, even when that information is contradictory (Barsky-Moore & Barsky, 2025). This bias may cause researchers to place too much confidence in AI analysis outputs, doubting their own interpretations. To avoid anchoring bias, researchers should view generative AI responses with skepticism and consult them only after completing an initial layer of their own analysis (Barsky-Moore & Barsky, 2025).
In addition to perpetuating these prejudicial biases, stratification of the quality and security of AI technologies will entrench a system of unequal resources available only to certain groups of researchers. The increasing sophistication of AI will likely become an important part of the quality of the research process, particularly in the areas of transcription and data organization. Currently, there are promising low-cost transcription services that appear to focus on protecting participant data. Researchers should look for Health Insurance Portability and Accountability Act (HIPAA)– or General Data Protection Regulation (GDPR)–compliant transcription services, such as Amazon Transcribe Medical, which costs a few cents per minute. This advancement will be beneficial to many researchers—and hopefully participants, if even tangentially. However, it is likely that as this technology progresses, the most sophisticated models and the most secure technologies will remain resource intensive. Just as the most secure transcription services cost more money today, future AI technologies will also be stratified along lines of complexity; there is no indication that AI will ultimately be a democratic force in qualitative research.
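For illustration, the minimal sketch below submits an interview recording to Amazon Transcribe Medical through the boto3 library. The bucket, job name, and region are placeholders, and researchers should verify their own institution's data agreements and AWS configuration before processing any real participant audio.

```python
import boto3

# Sketch: submit an interview recording to Amazon Transcribe Medical.
# All names below are placeholders; valid AWS credentials are assumed.
transcribe = boto3.client("transcribe", region_name="us-east-1")

transcribe.start_medical_transcription_job(
    MedicalTranscriptionJobName="interview-07",
    LanguageCode="en-US",
    MediaFormat="wav",
    Media={"MediaFileUri": "s3://my-study-bucket/interview-07.wav"},
    OutputBucketName="my-study-bucket",  # output stays in your own bucket
    Specialty="PRIMARYCARE",
    Type="CONVERSATION",                 # interviewer + participant audio
)
```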
Conclusion
AI technologies are expanding at a rapid pace, and the finite resources available for research may enhance the appeal of potential efficiency gains across the various processes of qualitative health research. The considerations presented here for the ethical use of AI in qualitative health research hold relevance for all levels of research experience but are particularly relevant in training emerging scholars, who are often eager, early adopters of these types of technologies. By mapping the use of AI technologies to the specific values of autonomy, privacy, validity, and equity, we hope to highlight areas where qualitative health researchers can approach this technology in a way that reflects their values and aspirations for conducting ethical research.
Acknowledgments
The first author thanks the graduate students in her Qualitative Methods course at the University of Missouri for raising these important questions and for providing the rich intellectual groundwork that informed the considerations addressed in this article.
Ethical Considerations
Our study did not require ethics board approval because it did not directly involve human or animal subjects.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
