Abstract
Large Language Models (LLMs) are revolutionizing how qualitative researchers can work with textual data. Rather than relying only on codebooks or manual line-by-line analysis, scholars can “converse” with their materials by asking targeted questions, probing for contextual insights, and refining theoretical connections. This dialogue-like process speeds up traditional tasks (transcription, coding, theme identification) while sparking broader possibilities for exploration. Researchers prompt the LLM to surface recurring patterns, detect subtle shifts in tone, or suggest new interpretive angles. These capabilities neither diminish the centrality of human expertise nor replace the analytical depth that defines qualitative research. Instead, they expand the researcher’s toolkit, allowing more time for theoretical reflection and rich meaning-making. The LLM becomes an active partner: it quickly identifies connections that might take days or weeks of manual work, yet the scholar remains responsible for selecting which prompts to use, verifying the outputs, and situating them within appropriate conceptual frameworks. Ethical considerations, such as data security and bias, demand careful oversight and transparent reporting, emphasizing the importance of maintaining scholarly integrity. Taken together, LLM-facilitated analysis offers a promising avenue to enhance qualitative research, leveraging both computational efficiency and the nuanced, reflexive insights at the heart of interpretive inquiry.
Qualitative research seeks to uncover nuanced meanings and contextual subtleties that elude simplistic categorization. However, the labor-intensive nature of traditional qualitative processes (manual coding, iterative reading, audio transcription) creates tension between analytical depth and practical constraints. Recent advances in generative artificial intelligence (AI), and Large Language Models (LLMs) in particular, offer methodological innovations that can fundamentally reshape how social scientists engage with qualitative data (see, e.g., Christou, 2023; Schroeder et al., 2024; van Dis et al., 2023).
LLMs allow researchers to explore qualitative data in a “conversational” way for the first time, rather than being confined to predetermined codebooks or line-by-line interpretation of otherwise static text. By posing dialogue-like queries and natural-language prompts, analysts can uncover insights, highlight patterns, and discover conceptual connections that might otherwise remain hidden. What makes LLMs uniquely suited for qualitative analysis is their sophisticated ability to process and generate human-like responses based on extensive training across diverse corpora. Unlike traditional computational tools that rely on predetermined rules or statistical pattern matching, LLMs demonstrate a form of language “understanding” that, while fundamentally different from human cognition, enables remarkably nuanced analysis (Karimova, 2024; Marcus & Davis, 2019) that can grasp semantic relationships, contextual meanings, and subtle linguistic patterns across multiple scales of analysis (Brown et al., 2020; Ziems et al., 2023). This technological foundation enables LLMs to recognize not just explicit content but implicit meanings, cultural references, and contextual nuances—capabilities that more closely approximate human interpretive practices (Kim et al., 2024).
LLMs thus extend far beyond basic automation, offering sophisticated capabilities that strengthen the researcher’s toolkit: they can summarize and analyze interview transcripts, reveal thematic patterns across multiple datasets, generate theoretical propositions from empirical data, and reconcile coding disagreements with near-human accuracy (Bail, 2023; Wadhwa et al., 2023). They excel at identifying narrative structures, tracking conceptual evolution across texts, and synthesizing findings across multiple data sources, and they can even suggest theoretical frameworks that might explain observed patterns (Collins, 2024). Their multilingual capabilities enable cross-cultural research without extensive translation resources, while their recall capacity can track analytical decisions, which enhances methodological accountability (Rathje et al., 2023). Recent advances have also enabled these systems to handle multimodal data, integrating text, audio, and visual elements in their analysis (Zhang et al., 2024).
Comparing the use of LLMs to traditional computational approaches reveals key distinctions. Traditional computational approaches, such as Latent Dirichlet Allocation (LDA) and other topic modeling techniques, have helped researchers identify patterns within large text corpora (DiMaggio et al., 2013). However, these methods, based on word frequency and co-occurrence, often miss the deeper contextual meanings central to qualitative inquiry. Similarly, while conventional Computer-Assisted Qualitative Data Analysis Software (CAQDAS) tools help organize and retrieve data, they offer limited analytical insight. LLMs, in contrast, can actively engage with content, suggesting interpretations and theoretical connections that move beyond simple coding assistance to a sophisticated analytical partnership.
When researchers “converse” with their data through LLMs, they can seamlessly move between empirical detail and conceptual abstraction. With carefully constructed prompts, researchers can encourage LLMs to generate provisional hypotheses, to compare and contrast thematic strands across datasets, or to suggest new lines of inquiry for further human-led analysis. As a result, the computational process becomes less about rote extraction and more about stimulating intellectual exploration and discovery.
This does not mean that LLMs replace the interpretive acumen of the researcher; rather, they supplement it in new and exciting ways. Still, the integration of LLMs into qualitative analysis demands a reconceptualization of the researcher’s role. Rather than delegating interpretive authority to the machine, successful implementation requires a hybrid approach where human insight and reflexivity guide and critically evaluate computational analysis (Li et al., 2024). The researcher remains central as the architect of inquiry, bringing theoretical sensitivity, methodological expertise, and ethical judgment to the analytical process. This human-AI partnership enables a sort of “three-way conversation” between researcher, data, and LLM, where each component contributes distinct capabilities to the analytical process. Just as a skilled interlocutor can push a researcher to consider new angles or challenge their assumptions, the LLM serves as an intelligent mediator, helping surface how participants’ own words and experiences speak to particular research questions while remaining grounded in the data itself. The researcher remains the sense-maker, testing the model’s suggestions against their broader disciplinary knowledge and the empirical realities under study.
Still, it is crucial to recognize the ethical and epistemological dimensions at play. LLM-driven analysis must be carried out transparently and responsibly. Researchers should clearly state when and how LLMs were used to analyze data and remain attentive to the potential biases and gaps encoded in these models. Efforts to ensure data privacy, to respect the rights and dignities of research participants, and to critically assess the model’s outputs remain essential. Moreover, the epistemic nature of the researcher-model interaction demands continual reflection on the boundary between computational suggestion and interpretive authority. How do LLMs’ capabilities complement or conflict with different theoretical or methodological traditions? What epistemic authority should we grant to AI-generated insights? The ability to intuitively converse with data can yield richer insights but also risks naturalizing the model’s outputs as self-evidently meaningful if not properly scrutinized.
Ultimately, “conversing” with data through LLMs represents an emergent methodological practice that resonates with the spirit of established qualitative traditions while leveraging advanced computational capabilities. Rather than settling for either laborious manual coding or reductive statistical topic models, researchers can begin to inhabit a hybrid space, engaging dynamically with their data at scale and depth. In doing so, qualitative inquiry can become both more efficient and more imaginative, encouraging new ways of understanding the social world.
In what follows, I present a framework that progresses from basic orientation to more creative exploration of qualitative data, demonstrating how LLMs can augment human interpretive expertise. Through an extended case study that analyses existing interview data, I illustrate how LLMs might be used at multiple levels of engagement.
What are LLMs—and Why are They Especially Suited for Qualitative Research?
Large Language Models (LLMs) are artificial intelligence systems designed primarily for understanding and generating human language. While various computational tools exist for analyzing text, LLMs are distinctive in both their architecture and their operation. They rely on deep neural networks that excel at handling sequences of text by leveraging so-called “attention” mechanisms (Devlin et al., 2019; Vaswani, 2017). Rather than just predicting the next word based on narrow statistical patterns, attention-based networks weigh contextual relationships across entire passages, allowing them to capture deeper semantic and syntactic dependencies that go beyond simple autocomplete functions. Through training on massive and diverse textual corpora (vast collections of books, scholarly works, media & literary articles, websites, human conversations, and other textual sources), LLMs acquire a grounded representation of language that supports contextually coherent and semantically rich responses (Brown et al., 2020; Chowdhery et al., 2023; OpenAI et al., 2024).
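For readers who want a concrete picture of how attention weighs contextual relationships, the following is a minimal sketch of scaled dot-product attention in plain Python. It is illustrative only: the two-dimensional token vectors are made-up numbers rather than learned embeddings, and the same vectors serve as queries, keys, and values, whereas real models apply separate learned projections across many layers.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token's query to every token's key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Each output is the weighted average of all value vectors.
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy vectors for three tokens (hypothetical values, chosen by hand).
tokens = ["publish", "or", "perish"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
outputs, weights = attention(vectors, vectors, vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how strongly one token draws on every other token in the passage. In this toy setup, "publish" gives more weight to "perish" than to "or" because their vectors point in similar directions.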
This linguistic grounding is significant because qualitative research often relies on precisely these kinds of data (words, phrases, narratives, dialogues, and texts) that LLMs are uniquely optimized to process. Traditional computational methods, such as Latent Dirichlet Allocation (LDA) and other topic modeling techniques, reduce language to statistical distributions of words across documents, identifying patterns in frequency and co-occurrence (Blei et al., 2003; DiMaggio et al., 2013). While such methods can illuminate structural regularities, their analytical granularity often stops at the word level. LLMs, by contrast, capture patterns across multiple layers of linguistic organization (Tenney, 2019). They can discern not just which words co-occur, but how phrases, sentences, and paragraphs interrelate, and how these relationships shift across contexts.
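The word-level limitation can be illustrated with a short sketch. The code below does not implement LDA; it only builds the kind of word-count and co-occurrence representation that such methods typically start from. The two example sentences are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def bag_of_words(text):
    """Word frequencies, ignoring order and punctuation-free for simplicity."""
    return Counter(text.lower().split())

def cooccurrences(text):
    """Count unordered pairs of distinct words appearing in the same sentence."""
    words = sorted(set(text.lower().split()))
    return Counter(combinations(words, 2))

# Two invented sentences with opposite meanings but identical vocabulary.
a = "the supervisor pressured the student"
b = "the student pressured the supervisor"

print(bag_of_words(a) == bag_of_words(b))    # True
print(cooccurrences(a) == cooccurrences(b))  # True
```

Both comparisons print True: at the level of frequency and co-occurrence, the two sentences are indistinguishable, even though who pressured whom is reversed. This is the kind of relational meaning that word-level representations discard.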
Through their deep learning architecture and extensive pre-training, LLMs develop what appears to be an intuitive grasp of language in much the same way humans learn to understand nuance and context—through massive and repetitive exposure to examples (Rogers et al., 2021; Wei et al., 2022). Just as a child learns that “yeah, right” often means the opposite of agreement, and “I’m fine” can mean very different things depending on tone and context, LLMs go through a similar semantic process, but at an enormous scale (Peters et al., 2018). This grants a unique ability to grasp complex linguistic features such as tone (detecting whether something is formal, casual, sarcastic, or sincere); cultural connotations (understanding idioms, specific phrases, and cultural touchpoints); contextual shifts (recognizing how the same words mean different things in different situations); and implicit meaning (understanding what’s suggested but not directly stated; to “read between the lines”)—among others.
This deep awareness of language enables modern “chat” models to engage in seemingly natural conversations that far surpass a basic processing of text. LLMs can take everyday questions or instructions (called prompts), written in normal language, understand what’s being asked, and respond in a natural, appropriate way (Brown et al., 2020; Ouyang et al., 2022). It is the difference between using a basic translation dictionary and talking with a fluent bilingual speaker. Indeed, earlier systems were more like the dictionary (matching words and following rigid rules), but chat models are far more like conversational partners. When you ask them to “explain this paragraph like I’m five” or “what’s the main argument here?”, they engage in back-and-forth exchanges that feel effortless and helpful, explaining difficult concepts, answering follow-up questions, and even asking for clarification when needed, all without growing tired or annoyed.
Because language is the primary “raw material” of so much qualitative inquiry—whether in interviews, focus group transcripts, ethnographic fieldnotes, online forums, or historical documents—LLMs are naturally attuned to the textures and complexities that researchers seek to interpret. Their linguistic sensitivity extends to subtle aspects of discourse, such as metaphorical expressions, code-switching, distinctive genre conventions, or culturally specific idioms. Whereas traditional computational techniques must often rely on predefined dictionaries, stop-word lists, or topic priors, LLMs are able to adapt dynamically without any domain-specific examples or explicit rules for the particular analysis task at hand (known as “zero-shot”), drawing instead on their broad training to interpret content in context (Bender et al., 2024; Bommasani et al., 2021; Shanahan et al., 2023).
Such adaptability opens new avenues for qualitative analysis. Rather than treating language as a static artifact to be statistically decomposed, LLMs respond interactively. When a researcher prompts an LLM to highlight contradictions in a set of interview excerpts, to relate certain thematic patterns to sociological theories, or to generate hypotheses about underlying cultural meanings, the model transforms the traditional relationship between researcher and data in a fundamental way: here, situating micro-level textual details within macro-level theoretical frameworks. Researchers can move seamlessly between examining specific textual evidence and exploring its theoretical significance, using each level of analysis to inform and enrich the other. In short, LLMs can help researchers quickly move from broad surveys of thematic patterns toward more nuanced inquiries that reflect the complexity of human communication.
Beyond an intuitive and expert grasp of human language, LLMs’ extensive pre-training also affords them an extensive knowledge base (Pan et al., 2023). Indeed, LLMs integrate encyclopedic factual knowledge and gain domain expertise from their training data—which amounts, essentially, to ingesting the equivalent of the Library of Congress several times over, plus a significant portion of the public internet (Pezeshkpour, 2023; Roberts et al., 2020). This vast exposure to human knowledge and creativity allows them to recognize and engage with the almost limitless references to historical events, theoretical constructs, or disciplinary paradigms that inform the meaning of texts. Indeed, an LLM like ChatGPT has “read” far more extensively than even the most dedicated scholar could in a lifetime. While it is true that early instances of publicly available LLMs faced issues of “hallucinations” (i.e., plausible-sounding but factually incorrect or nonsensical information), subsequent advancements in fine-tuning and reinforcement learning have already significantly reduced these occurrences by aligning model outputs more closely with verified information and user intent (Liu et al., 2025).
Thus, if an analyst is developing, for instance, a conflict theory framework to help explain interview data collected from workplace disputes in a tech company, the model has “read” not only Marx’s entire corpus along with both contemporary and modern interpretations & critiques, but also the state of the art in organizational theory, tech industry analyses, and labor relations research. The model can draw from these diverse sources to suggest relevant, informed connections and interpretive frameworks, inclusive of both classical theory and cutting-edge applications (see: Cornelissen et al., 2024; Hau, 2024).
Despite their remarkable abilities, LLMs available today do not possess actual understanding or consciousness in any meaningful sense—they are fundamentally computer systems, albeit extremely sophisticated ones (Marcus & Davis, 2019). While they can process language with impressive fluency and draw upon vast stores of information, they still lack true comprehension, self-awareness, or the ability to reason about the world in the way humans do (Mitchell, 2019). The role of human expertise, judgment, and critical interpretation therefore remains essential in guiding and validating LLM-assisted research.
From Basic Queries to Creative Exploration: Expanding the Researcher’s Toolkit
To demonstrate these methodological principles, this section provides a detailed walkthrough of how LLM-assisted methods could be applied. Drawing from a real-world research context, I analyze a dataset of ten semi-structured interviews chosen at random from a project uploaded to FigShare, a publicly available Open Access data repository, entitled “I’m Cited, Therefore I Am!: How Publishing Constructs the Scientific Identity” (Bowley, 2015).
The interviews include a mix of early- and late-career scholars across a range of academic disciplines, and were collected for a PhD dissertation project with the following abstract: This phrase “publish or perish” is one of the most enduring expressions in academia. With the rise of audit culture in higher education it has been argued that academic value is increasingly becoming “monetised” and as a result academic values are being transformed…This dissertation addresses how audit culture in higher education is transforming the identities of academic scientists and influencing how the scientific identity is dependent on peer reviewed publications.
Basic Orientation
At an elementary level, researchers can use LLMs to swiftly orient themselves within large and complex datasets. Instead of manually skimming hundreds of pages, they might prompt the model to summarize a set of interviews or identify the most frequently mentioned concepts. Such initial queries lay the groundwork for more substantive analysis. The LLM can quickly reveal, for example, that a particular theme—such as organizational trust or environmental risk—is central to the dataset, suggesting where the researcher should devote greater attention. While these outputs are simple and descriptive, they save valuable time and help researchers approach subsequent steps with greater clarity.
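For researchers who work with a model programmatically rather than through a chat interface, an orientation query of this kind might be assembled as in the sketch below. The function name, the prompt wording, and the transcript delimiters are all hypothetical choices for illustration; they are not the prompt used in the case study, and no particular model API is assumed.

```python
def build_orientation_prompt(project_title, transcripts):
    """Assemble one plain-text prompt from a title and {participant_id: text}."""
    parts = [
        # Hypothetical instruction wording; adapt to the research question.
        f"The following are interview transcripts from a project titled: {project_title}",
        "Summarize the major recurring themes across the interviews and "
        "note the concepts participants mention most often.",
    ]
    for pid, text in transcripts.items():
        # Label each transcript so the model can attribute quotes to a source.
        parts.append(f"--- Transcript {pid} ---\n{text.strip()}")
    return "\n\n".join(parts)

# Placeholder transcripts (invented text, not from the dataset).
example = build_orientation_prompt(
    "Example Project",
    {"P1": "First placeholder transcript.", "P2": "Second placeholder transcript."},
)
print(example)
```

Labeling each transcript with its participant identifier is a small design choice that makes later attribution checks easier, since the model's quoted excerpts can be traced back to a named source.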
The ten transcripts were uploaded to a publicly available LLM, followed by this initial prompt and subsequent response (the LLM was not made aware of the abstract, only the project title).
Note that the LLM intuitively adopts a numbered-list format with bullet points. Unless instructed otherwise, this output pattern is likely to persist in subsequent interactions that build on this initial prompt:
Major Recurring Themes:
1. Publication Pressure and Career Advancement
• Publications are widely viewed as “academic currency”
• Strong link between publishing and career progression
• Many participants noted feeling pressure to publish in high-impact journals
• Several mentioned the phrase “publish or perish”
2. Impact Factor Considerations
• Most participants expressed ambivalence about impact factors
• Many acknowledged playing the “impact factor game” for career reasons while personally disagreeing with its importance
• Several noted that high impact doesn’t necessarily correlate with quality or citations
• Multiple participants mentioned feeling institutional pressure around impact factors
3. Scientific Identity Formation
• First publication often marked a significant milestone
• Many described feeling like a “real scientist” after publishing
• Citations were seen as validation from the scientific community
• Several participants noted that their value as scientists shouldn’t be reduced to metrics
4. Motivations for Doing Science
• Curiosity and discovery were consistently mentioned as primary motivators
• Many described science as a form of exploration
• Several emphasized wanting to contribute to knowledge
• Teaching and mentoring were often mentioned as important aspects
5. Views on Open Access
• Generally supportive of the concept
• Concerns about costs and implementation
• Many viewed it as increasingly necessary but complicated by institutional constraints
• Several mentioned tensions between open access and career advancement
6. Evolution of Scientific Publishing
• Many noted changes in how science is communicated
• Several mentioned concerns about supporting information and data sharing
• Discussion of how metrics and evaluation have changed over time
• Recognition of changing expectations in academic careers
7. Institutional Pressures
• Many described navigating between personal values and institutional requirements
• Several mentioned REF (Research Excellence Framework) influencing behavior
• Discussion of how universities use metrics for evaluation
• Tension between teaching and research expectations
Key Observations:
• Career Stage Differences
  ○ Early career researchers expressed more anxiety about publishing pressures
  ○ Senior academics often had more nuanced views about impact factors
  ○ Mid-career academics frequently mentioned balancing multiple demands
• Disciplinary Variations
  ○ Different fields had varying publication expectations and norms
  ○ Some disciplines emphasized quantity while others focused on quality
  ○ Field-specific journal preferences were noted
• Common Concerns
  ○ Work-life balance
  ○ Administrative burden
  ○ Funding pressures
  ○ Teaching versus research time
The interviews reveal a complex relationship between academic publishing and scientific identity, with most participants expressing both pride in their publications while also questioning aspects of the current system. There’s a notable tension between what many view as “good science” versus what advances careers, suggesting a system that many participate in while simultaneously critiquing.
A particularly representative quote comes from Participant #1:
“When I started I didn’t even think about that and the people I worked for didn’t either. It was very much, oh, great, we got this! We must write a paper! Oh, which journal does that fit in? It didn't really matter the impact factor of the journal it was most appropriate who would that appeal to, the readership was more important.”
This illustrates how many participants perceived changes in academic culture over time, with increasing emphasis on metrics and impact factors rather than just sharing scientific findings with appropriate audiences.
Thematic Coding
Once familiar with the general contours of the data, researchers can move toward thematic coding tasks. LLMs can respond to prompts that ask for emergent themes, clusters of related excerpts, or inductive categorization without the researcher predetermining codes. Rather than starting from scratch, a scholar might say, “Group these passages based on the issues they discuss,” allowing the model to suggest an initial coding framework. This provides a useful scaffolding on which the researcher can build a more nuanced, theory-driven coding scheme. The advantage here is efficiency and inspiration: the LLM can surface overlooked patterns, spurring the researcher to refine and enrich the initial categories.
First, I’ll create a codebook based on the emergent themes
Code Analysis Table.
Note. Created by Claude Sonnet 3.5. https://claude.site/artifacts/1aaddc51-77ee-4e4e-af50-877547e4d10b
CODEBOOK:
A. SCIENTIFIC IDENTITY & MOTIVATION (SI)
SI1: Initial attraction to science
SI2: Self-perception as scientist
SI3: Professional development
SI4: Career path decisions
B. PUBLICATION CULTURE (PC)
PC1: Publication pressure
PC2: Journal selection criteria
PC3: Impact factor considerations
PC4: Publication strategies
PC5: Publishing anxiety
C. INSTITUTIONAL DYNAMICS (ID)
ID1: University expectations
ID2: Department culture
ID3: Assessment frameworks
ID4: Career progression requirements
D. SHIFTING LANDSCAPES (SL)
SL1: Changes in publishing
SL2: Evolution of metrics
SL3: Open access considerations
SL4: Data sharing practices
E. PROFESSIONAL TENSIONS (PT)
PT1: Research versus teaching
PT2: Quality versus quantity
PT3: Personal values versus institutional demands
PT4: Work-life balance
Illustrative Quotes for Major Themes:
1. Scientific Identity & Motivation: “Just because it’s interesting. It was really interesting.” (P8, SI1)
2. Publication Culture: “Papers are academic currency” (P7, PC1)
3. Institutional Dynamics: “I think nowadays people are a lot more clued up on to a lot of things for example how important it is getting funding, getting your publications in the highest impact journals” (P1, ID1)
4. Shifting Landscapes: “I think it has changed the way that people think about how they should approach publishing their results” (P2, SL1)
5. Professional Tensions: “You’re just so attuned to what the situation is rather than trying to say whether it should be changed” (P9, PT3)
This coding framework reveals several key insights:
1. Universal Experiences: All participants shared common experiences around publication pressure and impact factor considerations, regardless of career stage or discipline.
2. Career Stage Variations: Early career researchers showed more anxiety about publishing (PC5), while senior academics expressed more concern about institutional dynamics (ID1-3).
3. Disciplinary Differences: Different fields showed varying emphasis on certain themes, particularly around publication strategies (PC4) and data sharing practices (SL4).
4. Temporal Changes: Many participants noted significant shifts in academic publishing culture over time, particularly around metrics and evaluation.
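Because the model tags excerpts with participant and code labels such as "(P8, SI1)", a researcher can mechanically check that every tag refers to a code that actually exists in the codebook. The following is a minimal sketch under that assumption: the codebook dictionary is transcribed from the model's output above, while the regular expression, the helper names, and the invalid tag "(P4, PC9)" in the sample are hypothetical and introduced only for illustration.

```python
import re

# Codebook transcribed from the model's output: prefix -> sub-code labels.
CODEBOOK = {
    "SI": ["Initial attraction to science", "Self-perception as scientist",
           "Professional development", "Career path decisions"],
    "PC": ["Publication pressure", "Journal selection criteria",
           "Impact factor considerations", "Publication strategies",
           "Publishing anxiety"],
    "ID": ["University expectations", "Department culture",
           "Assessment frameworks", "Career progression requirements"],
    "SL": ["Changes in publishing", "Evolution of metrics",
           "Open access considerations", "Data sharing practices"],
    "PT": ["Research versus teaching", "Quality versus quantity",
           "Personal values versus institutional demands", "Work-life balance"],
}

# Matches tags of the form "(P7, PC1)": participant id, code prefix, code number.
TAG = re.compile(r"\((P\d+),\s*([A-Z]{2})(\d+)\)")

def parse_tags(text):
    """Extract (participant, prefix, number) triples from tagged text."""
    return [(pid, prefix, int(num)) for pid, prefix, num in TAG.findall(text)]

def label_for(prefix, num):
    """Return the codebook label for a code, or None if the code is undefined."""
    labels = CODEBOOK.get(prefix, [])
    return labels[num - 1] if 1 <= num <= len(labels) else None

# The last tag is deliberately invalid (hypothetical) to show the check failing.
sample = "(P7, PC1) (P9, PT3) (P4, PC9)"
for pid, prefix, num in parse_tags(sample):
    print(pid, f"{prefix}{num}", label_for(prefix, num) or "NOT IN CODEBOOK")
```

A check like this catches only structural slips, such as a code number that was never defined. Whether a given excerpt genuinely belongs under its assigned code remains an interpretive judgment for the researcher.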
Comparative Analysis
To dig deeper, the researcher can leverage the LLM’s contextual sensitivity to compare textual subsets. Perhaps some interviews are from experienced professionals, others from novices, or perhaps documents were created before and after a policy change. With prompts like “Compare how early-career and senior participants discuss professional challenges,” the LLM can highlight differences in tone, emphasis, or conceptual framing. This comparative capability encourages a more dynamic, relational approach to data analysis, helping researchers discern shifts in discourse over time, variations across demographic groups, or contextual contrasts between data subsets.
Note that in the response below, the LLM makes two small errors, misattributing (actual) quote excerpts to the wrong participant interview (marked in the transcript below). While less common than in the past, such errors likely stem from LLMs’ tendency to rely on patterns and similarities across texts, which can sometimes cause them to misidentify the source of a statement, especially when participants discuss similar themes or use overlapping language. Such mistakes underscore the importance of maintaining rigorous human oversight, cross-referencing the LLM’s outputs with original data, and ensuring that attributions are accurate to preserve the integrity of the analysis (Pangakis et al., 2023).
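Cross-referencing of this kind can be partly automated. The sketch below checks whether a quoted excerpt appears in the transcript of the participant it is attributed to and, if not, reports which transcripts do contain it. The function names are hypothetical, the transcript strings are short stand-ins rather than the actual interview files, and "P3" is an arbitrary placeholder for a wrongly claimed source.

```python
import re

def normalize(text):
    """Lowercase, unify curly quotes, and collapse whitespace for matching."""
    text = text.replace("\u2019", "'").replace("\u2018", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()

def check_attribution(quote, claimed_id, transcripts):
    """Return (is_correct, ids_of_transcripts_containing_the_quote)."""
    # Split on ellipses so elided quotes are matched fragment by fragment.
    fragments = [normalize(f) for f in re.split(r"\.\.\.|\u2026", quote) if f.strip()]
    found_in = []
    for pid, text in transcripts.items():
        haystack = normalize(text)
        if all(frag in haystack for frag in fragments):
            found_in.append(pid)
    return claimed_id in found_in, found_in

# Stand-in transcripts (abbreviated placeholder text, not the real files).
transcripts = {
    "P3": "Placeholder text for a different interview.",
    "P10": "You could say 1/3 teaching, 1/3 research, 1/3 administration "
           "and, well, it's part of the job, really.",
}
quote = ("You could say 1/3 teaching, 1/3 research, 1/3 administration\u2026 "
         "it\u2019s part of the job, really")

print(check_attribution(quote, "P3", transcripts))   # (False, ['P10'])
print(check_attribution(quote, "P10", transcripts))  # (True, ['P10'])
```

Exact substring matching is deliberately strict: it will flag paraphrased or lightly edited quotes as unmatched, which is arguably the safer failure mode, since those cases then receive a manual check against the original transcript.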
Key Contrasts:
1. Job Security & Career Progression
ECRs (PhD Students/Research Fellows):
• Express acute anxiety about job security
• Focus on immediate publication needs
• Concerned with building publication record
Example from P7 (PhD Student): “I picked this PhD in particular because I knew it had an industrial side that I could use if I really didn’t like it. I didn’t want to pick something that was purely academic that would limit me entirely”
Senior Academics (Professors/Readers):
• More focused on systemic issues
• Concern for mentoring next generation
• Broader perspective on career paths
Example from P1 (Professor): “I have to be honest with them and say, ‘Look, you really need a few more first author papers to really go that next step.’ I think that can be a bit of pressure for people like that.”
2. Relationship with Impact Factors
ECRs:
• Feel direct pressure to publish in high-impact journals
• More likely to see impact factors as necessary evil
• Strategic about journal selection
Example from P8 (PhD Student): “For me I try not to let it, but obviously as a PhD student making the decision of the journal on your own, and I think for most supervisors it factors into a lot more”
Senior Academics:
• More critical of impact factor culture
• Can afford to be selective about journals
• Focus on appropriate audience over prestige
Example from P5 (Research Fellow): “The article I got in the highest impact journal I’ve got in I thought the journal was pretty poor quality and had a large number of poor articles”
3. Work-Life Balance & Administration
ECRs:
• Struggle with immediate workload
• Focus on research outputs
• Less administrative burden but more anxiety about it
Example from P7 (PhD Student): “The times I get really stressed are generally in the run up to conferences or summer schools where I just want to get loads of stuff done”
Senior Academics:
• Managing multiple responsibilities
• Balancing research, teaching, and administration
• More accepting of administrative load
Example (misattributed by the LLM; the speaker is actually P10, Reader): “You could say 1/3 teaching, 1/3 research, 1/3 administration… it’s part of the job, really”
4. Views on Open Access
ECRs:
• More enthusiastic about open science
• Concerned about costs
• See it as future of publishing
Example from P8 (PhD Student): “For me the aspect of open science that are more important than open access are open data and open code”
Senior Academics:
• More pragmatic about implementation
• Concerned about systemic implications
• Focus on practical challenges
Example from P10 (Reader): “For example, the gold Open Access option is quite an expensive one… it’s one way of extracting money from the public sector”
5. Future of Academia
ECRs:
• Uncertain about long-term career prospects
• More likely to consider alternative careers
• Focus on adaptability
Example from P7 (PhD Student): “I think I’d give it a go. If I really like it at the end of my time [PhD] when I’m writing up I might just keep it as one option and keep industry as another”
Senior Academics:
• Concerned about systemic changes
• Focus on maintaining standards
• More invested in institutional change
Example (misattributed by the LLM; the speaker is actually P10, Reader): “I think the pendulum is swinging back… people are taking a step back and having a look at what it all means”
Overall Patterns:
1. Temporal Perspective
• ECRs tend to focus on immediate challenges and career progression
• Senior academics take longer-term view and consider systemic issues
2. Agency and Control
• ECRs feel more constrained by system requirements
• Senior academics express more ability to choose their approach
3. Professional Identity
• ECRs still developing scientific identity
• Senior academics more confident in professional standing
4. Institutional Responsibility
• ECRs focused on personal navigation of system
• Senior academics feel responsibility to improve system
These differences reflect not just career stage, but also the changing nature of academic science and publishing over time. Senior academics often reference how the system has changed since their early careers, while ECRs are navigating a more metric-driven, competitive environment from the start.
Relating Data to Theory
A next level of sophistication involves asking the LLM to connect identified patterns to theoretical frameworks or established scholarly concepts. Researchers might prompt the model to interpret certain themes through the lens of cultural capital, social network theory, or framing theory. By asking, “Which patterns align with the concept of ‘moral panic’ in sociology?” the researcher spurs the LLM to offer a close and critical theory-informed reading. While the model’s suggestions do not replace the researcher’s expertise, they can serve as intellectual catalysts, nudging the scholar to consider theoretical perspectives they may not have initially thought of. This step encourages a dialogue between data-driven themes and conceptual constructs, ultimately broadening the analytical repertoire.
Key Elements of Moral Panic in Academic Publishing Context:
1. Concern About Metrics and “Gaming”
• Many participants express anxiety about impact factors distorting science • Clear pattern of concern about “publish or perish” culture • Worry about authenticity of science being compromised
Example from P2 (Professor): “Now people are judged more and more on where they publish than what they publish… there’s a risk that they may be distorting the system”
2. Perceived Threat to Scientific Values
• Frequent mentions of compromising scientific quality for metrics • Concern about “proper” science versus what gets published • Anxiety about losing scientific integrity
Example from P3 (Professor): “I was absolutely shocked that to get promotion at any kind of level that you need to publish in, what I would say, these vanity journals”
3. Consensus About “Crisis” The interviews show elements of moral panic around:
(a) Metrics-Based Evaluation
• Universal agreement that metrics are problematic • Shared concern about impact on scientific quality • Collective anxiety about future of science
(b) Publication Pressure
• Common narrative about pressure increasing • Shared worry about effect on young scientists • Agreement about systemic problems
4. Disproportionality Some responses suggest possibly exaggerated fears:
• Catastrophic language about future of science • Nostalgia for “simpler” past publishing practices • Potentially overestimated negative impacts
Example from P1 (Professor): “When I started I didn’t even think about that… It was very much, oh, great, we got this! We must write a paper!”
5. Volatility The interviews show rapid shifts in concern about:
• Open Access requirements • Data sharing expectations • Metrics and evaluation systems
However, some aspects diverge from classic moral panic:
1. Sustained Rather Than Temporary
• Concerns are ongoing rather than short-lived • Issues appear systemic rather than episodic • Problems seen as structural rather than temporary
2. Expert Rather Than Public Concern
• Concerns come from within scientific community • Based on direct experience rather than media portrayal • More nuanced understanding of issues
3. Rational Basis Many concerns appear grounded in concrete changes:
• Documented pressure on early career researchers • Real changes in funding structures • Actual shifts in evaluation metrics
The interviews suggest something more complex than simple moral panic - perhaps closer to what sociologists call “institutional anxiety” or “professional concern.” The worries expressed seem to reflect real structural changes in academic publishing, though sometimes expressed through moral panic frameworks.
Key Distinctions:
1. Traditional Moral Panic:
• Often media-driven • Short-lived • Based on exaggerated threats • Public focused
2. Academic Publishing Concerns:
• Experience-based • Sustained over time • Grounded in structural changes • Professional community focused
The pattern might better be described as “professional moral anxiety” - a sustained concern about changes to professional values and practices, expressed through some elements of moral panic rhetoric but grounded in concrete institutional changes.
This suggests the need for a more nuanced theoretical framework than traditional moral panic theory to understand how academic communities process and respond to systemic changes in their professional practice.
A significant quote from P2 captures this complexity: “There are healthy aspects as well… it’s good to come to a high performing university, because all your colleagues are all of international rank and that inspires you… but it is good to exposed to colleagues who really are pushing the envelope because that helps to inspire you”
This shows how the community simultaneously recognizes both benefits and drawbacks of current systems, suggesting a more complex dynamic than simple moral panic.
Interrogating Internal Dynamics
As the analysis deepens, LLMs can be prompted to identify contradictions, tensions, or evolving narratives within the corpus. For instance, a researcher might say, “Point out any contradictions in how Participant A discusses institutional trust across these three interviews.” Here, the model’s capacity for textual coherence allows it to locate subtle shifts in a single speaker’s stance or to highlight internal inconsistencies in official documents. By foregrounding these internal dynamics, the researcher gains richer interpretive opportunities, moving beyond surface-level themes to the complexities that often characterize qualitative data.
Note that at this point the prompting strategy also becomes more specific: it cautions the LLM to analyze each transcript in isolation. This is because LLMs, when allowed to process multiple documents simultaneously, may inadvertently conflate distinct contexts or attribute statements from one interview to another. Such errors can arise from their tendency to generalize patterns or infer connections across documents, which can lead the model to misread the researcher’s intention. By isolating each transcript, the researcher mitigates the risk of the model introducing errors, ensuring that its analysis respects the unique situational context of each piece of data and avoids creating artificial consistencies or conflicts.
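One way to enforce this isolation outside a chat interface is to build a separate, self-contained prompt for each transcript and send each one in a fresh conversation. The following is a minimal sketch under that assumption; the instruction wording, the function name `build_isolated_prompts`, and the participant labels are illustrative and are not taken from the study.

```python
from typing import Dict, List

# Hypothetical instruction text; adapt the wording to the study at hand.
ISOLATION_INSTRUCTION = (
    "Analyze ONLY the transcript below. Do not draw on any other interview. "
    "Point out contradictions in how this participant discusses the topic, "
    "quoting the exact passages involved."
)


def build_isolated_prompts(transcripts: Dict[str, str]) -> List[dict]:
    """Return one self-contained prompt per transcript.

    Each prompt is meant to be sent in a fresh conversation, so that
    statements from one interview cannot be attributed to another.
    """
    prompts = []
    for participant_id, text in transcripts.items():
        prompts.append({
            "participant": participant_id,
            "prompt": (
                f"{ISOLATION_INSTRUCTION}\n\n"
                f"[Transcript {participant_id}]\n{text}"
            ),
        })
    return prompts
```

Because every prompt carries exactly one transcript, any quotation the model returns can be checked against a single source document.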
Participant #1 (Chemistry Professor):
Contradicts their stance on impact factors. Initially says: “I probably prefer that than the blue skies, because I can actually see that potentially what we do here could be useful in the future” when discussing applied versus basic research publications. But later advocates for publishing based on readership fit rather than impact, suggesting a tension between stated values and practice.
Participant #3 (Environmental Science Professor):
Shows contradiction regarding pressure to publish in high-impact journals. Initially states “I thought it was absolutely abhorrent. I vowed that I would never try to publish in those journals to get promotion.” But then acknowledges publishing in them anyway after getting promoted, showing a disconnect between stated principles and actions.
(is actually P9, Lecturer)
Displays contradiction about impact factors. States “I wouldn’t really say that I feel pressure, but I do know that I can benefit from those publications” and then later describes specific pressures and requirements around high-impact publications for career advancement.
Participant #7 (PhD Student):
Shows inconsistency about publication pressure. Initially describes being “laid back” about publishing but then in the follow-up email admits “The laid back PhD attitude is really not very normal for me at all” and describes significant stress about publishing.
Participant #8 (Molecular Biology PhD Student):
Contradicts their stance on impact factors’ importance. States “For me I try not to let it” regarding impact factors influencing publication choices, but then details how impact factors significantly influence their submission strategies.
Participant #9 (Physics Lecturer):
Shows contradiction regarding administrative work. Initially says administrative load was expected and manageable: “No” when asked if surprised by admin load. But later complains about it taking away from research time: “Too much emailing really. I want to be a big more hands on in the lab.”
Participant #10 (Chemistry Reader):
Contradicts position on metrics. Initially criticizes metrics-based evaluation but then defends their use, saying they give “some information” and are necessary for assessment.
This analysis reveals several patterns in these contradictions:
1. Value-Practice Gap
• Participants often express idealistic views about how science should work while describing more pragmatic behaviors in practice
2. Career Stage Influence
• Early career researchers tend to show more contradictions between stated values and practical choices due to career pressures
3. Institutional Pressure Effects
• Many contradictions emerge around institutional requirements versus personal values, particularly regarding publication venues and metrics
4. Temporal Shifts
• Some contradictions appear when participants reflect on changes in their views over time or during the interview itself
A key observation is that these contradictions often reflect real tensions in academic life rather than simple inconsistency. They highlight the complex navigation between personal values, professional requirements, and institutional pressures.
Most contradictions cluster around:
• Impact factor considerations
• Publishing pressure
• Career advancement strategies
• Administrative load
• Open access participation
These patterns suggest that contradictions often emerge where individual values conflict with institutional requirements or career necessities.
Another possibility is to check for unique cases or outliers among the data. By surfacing outliers, the LLM highlights cases that may offer unique insights or challenge prevailing interpretations, enabling the researcher to explore exceptions that could deepen theoretical understanding or uncover overlooked complexities.
Participant #3 (Environmental Science Professor):
• Takes a uniquely strong stance against high-impact publishing
• Only participant to directly challenge the impact factor system as “abhorrent”
• Shows more emotional response to institutional pressures than others
• Example quote: “I was absolutely shocked that to get promotion at any kind of level that you need to publish in […] these vanity journals.”
Participant #5 (Physics Research Fellow):
• Only participant to consistently minimize publication pressure
• Claims indifference to institutional metrics
• Shows unusual detachment from career advancement concerns
• Example quote: “I don’t really see what getting promoted in academia gives you.”
These outliers are significant because they represent opposing responses to institutional pressures - active resistance versus detached acceptance - compared to the more pragmatic navigation described by other participants.
Scenario Testing & Hypotheticals
When researchers reach a point where they seek inspiration and intellectual playfulness, they can ask the LLM to imagine hypothetical scenarios or counterfactual situations. Questions like “How might these discussions differ if participants were in a different cultural context?” invite the model to generate plausible but speculative narratives. These hypothetical exercises can stimulate creativity, uncover implicit assumptions, and broaden the researcher’s conceptual horizons. By taking the inquiry beyond what is present in the data to what could be (or could have been), the researcher can set up novel comparisons, refine their analytical questions, challenge their assumptions, and open new avenues for future studies.
For this prompt, adding “Keep it to a short summary” also informs the LLM that the response should be less comprehensive here compared to prior responses.
1. Political Oversight versus Market Pressures
• Instead of impact factors and citation metrics, scientists would likely discuss political approval and ideological alignment
• Career advancement would depend more on party membership and conformity to state scientific priorities than publication metrics
2. Publishing Dynamics
• Less emphasis on international journals, more focus on Soviet journals and internal distribution of findings
• Publication decisions influenced by state security concerns rather than market forces
• Less pressure about where to publish, but more pressure about what to publish
3. Scientific Identity
• Scientists would likely discuss their role in advancing Soviet science and technology rather than individual career development
• Professional identity more closely tied to state institutions rather than international scientific community
• Different tensions between scientific truth and political requirements versus today’s tensions between scientific quality and metrics
4. International Competition
• Discussion of publishing would be framed more explicitly in terms of competition with Western science
• More emphasis on collective scientific achievement rather than individual recognition
• Different type of pressure - proving Soviet scientific superiority rather than achieving high impact factors
The key shift would be from today’s market-driven anxieties to more politically-driven concerns, though both systems create their own forms of pressure on scientific identity and practice.
Creative Synthesis & Prompting Further Inquiry
At an advanced level, researchers may treat the LLM as a conversational partner that can propose new research questions, adopt alternative theoretical lenses, or synthesize patterns into novel analytic frames. A prompt might read, “Adopt the lens of feminist social theory and reinterpret these data—what stands out now?” By guiding the model to “speak” in a particular analytic voice, the researcher can discover angles, arguments, or inquiries that might otherwise remain dormant. This capability moves the relationship from tool-assisted coding to genuine intellectual co-exploration, sparking unexpected insights and promoting methodological innovation.
1. Gendered Career Interruptions Notably, only one participant (P1, female) explicitly mentions maternity leave impacting publication record: “The time I did find it hard was during my two maternity leaves. You can actually see, if you look at my Web of Science profile, you can almost map the two times I was off, because it is hard to sustain that.”
2. Power Structures and Gatekeeping
• The language around “high impact” journals reveals masculine-coded power dynamics
• Hierarchical structures favor traditionally masculine traits like competition and individual achievement
• Senior positions (where publication requirements can be challenged) are predominantly described from male perspectives
3. Care Work in Academia
• Teaching and mentoring (traditionally feminized labor) are often presented as secondary to research/publishing
• Several participants describe concerns about student welfare and mentoring, but these are framed as competing with “real” academic work
4. Work-Life Integration The interviews reveal gendered assumptions about:
• The “ideal academic” who can dedicate unlimited time to research
• The ability to separate personal/family life from academic productivity
• What constitutes legitimate reasons for publication gaps
5. Language and Discourse The interviews use notably masculine-coded metaphors:
• Combat/competition metaphors (“in the field,” “pushing the envelope”)
• Hierarchical positioning (“top journals,” “high impact”)
• Individual achievement over collaborative success
6. Structural Inequalities The current system appears to perpetuate existing power structures through:
• Emphasis on uninterrupted publication records
• Networking requirements
• Resources needed for high-impact publications
7. Emotional Labor There’s unacknowledged emotional work in:
• Managing student anxiety about publishing
• Navigating departmental politics
• Maintaining research group harmony
8. Alternative Value Systems The interviews show tension between:
• Metric-based evaluation (masculine-coded)
• Holistic contribution assessment (feminine-coded)
• Teaching/mentoring value (feminized labor)
• Research output (masculinized achievement)
9. Resistance and Conformity Participants describe various responses to dominant structures:
• Some openly challenge the system
• Others strategically conform while expressing private doubts
• Many acknowledge participating in practices they critique
10. Intersectionality Notable absences in discussions of:
• How publication pressures differently affect minorities
• Institutional barriers beyond pure merit
• Resource disparities affecting publication access
This feminist reinterpretation reveals how academic publishing systems may perpetuate gender inequalities through:
• Valuing traditionally masculine traits
• Failing to account for care responsibilities
• Maintaining hierarchical power structures
• Privileging uninterrupted career trajectories
The analysis suggests the need to examine:
1. How publication metrics might disadvantage certain groups
2. Ways academic value systems perpetuate existing inequalities
3. Alternative evaluation frameworks that recognize diverse contributions
4. The role of care work and emotional labor in academia
This lens reveals how seemingly neutral academic practices may actually reproduce gendered power relations and suggests the need for structural changes to create more equitable academic environments.
Analysts can also query the LLM to mine for subtleties, nuance, and underrepresented perspectives that may be contained in the data, but which are overshadowed by more dominant narratives, themes, or tropes.
1. Language of Loss/Nostalgia
Several senior academics use temporal markers to describe a “purer” past:
• “When I started I didn’t even think about that...”
• “It was very much, oh, great, we got this!”
2. Embodied Anxiety
Physical manifestations of pressure appear in language:
• References to exhaustion
• Descriptions of stress
• Mentions of lab equipment breaking as metaphor for systemic pressure
3. Spatial/Geographic Hierarchies
Subtle references to institutional status:
• Global North/South divides
• Russell Group versus other universities
• Lab access and resources
4. Generational Protection
Senior academics express protective attitudes toward junior colleagues:
• “I can’t impose my philosophy and outlook on them because that’s obliging them to take risks that I myself didn’t take”
5. Technical/Emotional Labor Division
Participants unconsciously separate:
• Data work (valued)
• Emotional support (undervalued)
• Administrative tasks (resented)
6. Collective versus Individual Success
Tension between:
• Personal advancement
• Group achievement
• Departmental metrics
These nuances suggest deeper structural and psychological dynamics than initially apparent.
Reflexive Engagement
Finally, researchers can leverage LLMs to foster reflexivity about their own role and biases. Prompts might include, “Suggest ways to challenge my initial coding scheme to ensure a more balanced analysis,” encouraging the model to highlight potential blind spots or propose corrective strategies. The LLM’s ability to reflect on analytical process—albeit without true self-awareness—can still prompt scholars to become more methodologically rigorous. By making it easier to consider alternative interpretations or acknowledge potential oversights, this level of engagement helps maintain the integrity and trustworthiness central to qualitative research.
Limitations of Original Framework:
Power dynamics invisible (institutional/hierarchical relationships)
Gender/diversity aspects absent
Care work/emotional labor uncaptured
Assumes neutral institutional context
Focuses on individual over structural factors
Suggested Additional Code Categories:
F. POWER RELATIONS (PR)
PR1: Institutional hierarchy
PR2: Gatekeeping practices
PR3: Resource access
PR4: Decision-making authority
G. CARE & EMOTIONAL WORK (CE)
CE1: Student mentoring
CE2: Colleague support
CE3: Group maintenance
CE4: Work-life navigation
H. STRUCTURAL BARRIERS (SB)
SB1: Career interruptions
SB2: Resource inequalities
SB3: Institutional barriers
SB4: Systemic biases
I. RESISTANCE STRATEGIES (RS)
RS1: Open challenges
RS2: Strategic compliance
RS3: Alternative practices
RS4: Collective action
Implementation Changes:
1. Code co-occurrence analysis needed
2. Attention to power in language
3. Note absent voices/perspectives
4. Consider institutional context
5. Track emotional/care work references
This revision enables analysis of structural power dynamics previously invisible in the original coding scheme.
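The “code co-occurrence analysis” suggested in the model output above can be approximated with a few lines of code once segments have been coded. This is a minimal sketch, assuming each coded segment is represented as a list of code labels; the function name `code_cooccurrence` and the sample segments are hypothetical and do not come from the interview data.

```python
from collections import Counter
from itertools import combinations


def code_cooccurrence(coded_segments):
    """Count how often each pair of codes is applied to the same segment."""
    counts = Counter()
    for codes in coded_segments:
        # set() removes duplicate labels; sorted() gives each unordered pair
        # a single canonical form, so it is counted once per segment.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


# Hypothetical coded segments using labels from the suggested scheme.
segments = [
    ["PR1", "SB1"],
    ["PR1", "SB1", "CE1"],
    ["CE1", "RS2"],
]
pairs = code_cooccurrence(segments)
```

Pairs with high counts point to codes that tend to be applied together, which the researcher can then inspect in the underlying passages.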
Some More Unconventional Uses of LLMs in Qualitative Inquiry
As researchers gain confidence and skill in conversing with their data through LLMs, the horizon of possibilities expands well beyond thematic coding or theoretical exploration. Venturing into more creative and transdisciplinary territory, researchers can treat LLMs as collaborative partners that generate new modalities of analysis, visualization, and even synthesis across media and methods. By prompting the model to produce instructions, code, or conceptual outlines, researchers can orchestrate entire workflows that turn static texts into dynamic, multifaceted research artifacts.
For instance, by prompting the LLM to produce storyboards, prompts, or descriptive outlines, researchers can obtain outputs to feed into image generators or other multimodal AIs. “Describe a symbolic image that captures the tension between institutional trust and individual autonomy” might yield a concept that, when given to an image-generation model, produces evocative visuals that accompany the analytic narrative (Figures 1 and 2).
Figure 1. Example visualization. Note: Produced by Claude 3.5 Sonnet. https://claude.site/artifacts/65ab2f81-2103-4bd0-90e9-7b6cb3ddb8a6.
Figure 2. Text-to-Image Generation. Note: Created by DALL-E 3 using ChatGPT 4o. https://chatgpt.com/share/6783cc6f-0a64-8006-af13-fd034ab9ef89.

LLMs can serve as a readily accessible methodological consultant and on-the-fly software coder. Asking, “write Python code to scrape social media feeds from Instagram” yields snippets that researchers can adapt and then execute for their data collection needs. Beyond code generation, LLMs can further explain methodological choices, suggest best practices for data handling, and help troubleshoot common technical issues (Siiman et al., 2023). This capability extends to statistical analysis, where LLMs can recommend appropriate analytical approaches and generate corresponding code implementations—or even take in and analyze code using an internal Python environment.
Beyond expanding the methodological toolkit in novel ways, LLMs can facilitate theoretical triangulation through structured dialogic exercises (Christou, 2023). The LLM can stage debates among theoretical perspectives, offering dialogic texts that animate the data. By prompting, for instance, “Craft a conversation where a Bourdieusian cultural theorist, a symbolic interactionist, and a critical race scholar debate these findings,” the researcher gains a meta-theoretical lens, stimulating intellectual play and potential interdisciplinary insight.
SYMBOLIC INTERACTIONIST: leans forward But you’re missing how scientists actively construct meaning through their interactions! Look at how they negotiate their identities through publication milestones. That first paper, first citation—these are symbolic interactions that validate their role as “real scientists.”
CRITICAL RACE SCHOLAR: shakes head You’re both overlooking the structural power dynamics. This whole system of “objective” metrics reinforces existing hierarchies. Notice how access to resources, networks, and prestigious institutions—all shaped by historical inequities—determines who can participate in this “game” you’re describing.
CULTURAL THEORIST: Precisely my point about field dynamics! The “rules of the game” appear neutral but...
CRITICAL RACE SCHOLAR: interrupts There’s nothing neutral about it. Look at who’s missing from these narratives. Where are the voices from minority-serving institutions? Global South scholars? The “universal” scientific experience described here is actually quite particular.
SYMBOLIC INTERACTIONIST: But shouldn’t we focus on how individual scientists navigate...
CRITICAL RACE SCHOLAR: That individualistic focus is part of the problem. It obscures systemic barriers and institutional racism embedded in publishing practices.
CULTURAL THEORIST: Perhaps we can synthesize these perspectives? The field produces both individual strategies and structural inequities through the same mechanisms of capital distribution.
SYMBOLIC INTERACTIONIST: And those mechanisms are maintained through daily interactions and meaning-making...
CRITICAL RACE SCHOLAR: Which systematically privilege certain voices and experiences while marginalizing others. Until we address that, analyzing individual strategies or field dynamics only tells part of the story.
ALL: uncomfortable silence
CULTURAL THEORIST: Well, at least we can agree these interviews reveal the complexity of academic publishing as a social phenomenon?
Murmurs of Reluctant Agreement
With careful prompting, LLMs can even produce “synthetic respondents” who mimic the style, concerns, and discursive patterns present in the original corpus. AI-generated synthetic data is already being taken seriously in quantitative domains (see: Raghunathan, 2021). In healthcare studies, for instance, synthetic patient records help researchers develop and validate analytical methods while maintaining confidentiality and adhering to regulations like HIPAA (Giuffrè & Shung, 2023). Financial institutions today use synthetic transaction data to test pricing models for assets in illiquid markets where real transaction data is sparse, and banks use it with fraud detection algorithms to explore suspicious patterns without compromising individual privacy (Selvaraj et al., 2022).
This quantitative precedent suggests promising applications for qualitative research. Just as synthetic quantitative data preserves statistical properties while obscuring individual identities, synthetic qualitative responses can maintain thematic patterns and discursive styles within small data sets, while protecting participant anonymity. Researchers might generate synthetic interview responses to expand their analytical scope by creating theoretically-informed variations of existing responses that help identify pattern boundaries and test theoretical assumptions; to explore how emerging themes could manifest across different contexts or demographics; or to test the robustness of their coding frameworks. For example, “Generate a persona who speaks in a manner similar to Participant C, but who addresses an emerging policy issue not covered in the original interviews.” This can help researchers explore potential responses in evolving social contexts or test how concepts might be articulated if new circumstances arose.
However, this practice demands careful ethical and methodological caution. Synthetic respondents blur the boundary between empirical evidence and AI-generated content. They must be transparently labeled and never misrepresented as genuine participants. Ethically, researchers must ensure that these synthetic voices do not distort real perspectives, mislead audiences, or violate participants’ confidentiality and dignity. Instead, they serve as analytical tools that complement traditional data collection methods, much as synthetic quantitative data supplements rather than supplants actual observations. When used responsibly, they can serve as a thought experiment—an analytic sandbox that prompts researchers to contemplate alternative scenarios and refine their interpretive frameworks.
I’m Dr. Morgan Chen, Reader in Paleontology at [University]. Like many of my colleagues, I've been watching these AI developments with interest, though perhaps not with the breathless enthusiasm some seem to have. Been in academia long enough to see various “revolutionary” tools come and go.
The irony isn’t lost on me that we’re using an AI to discuss AI in science. slight laugh But that’s rather the point, isn’t it? These tools are already here, whether we’re entirely comfortable with them or not.
I’m particularly interested in how AI intersects with questions of reproducibility and data analysis in my field. When you’re dealing with fossil data that took decades to collect, you want to be very careful about how automated systems interact with it.
Happy to share my thoughts on specific aspects that interest you.
Ethics, Bias, and Standards of Practice
Integrating Large Language Models (LLMs) into qualitative research frameworks raises fundamental ethical and methodological challenges. While these tools can enhance analysis, researchers must remember that LLMs do not possess true understanding or intentionality (Bender et al., 2024; Bommasani et al., 2021). Regardless of how sophisticated or insightful their outputs are, LLMs nevertheless produce contextually plausible text based on patterns gleaned from their training data, not from any genuine comprehension of meaning or social reality. In this context, human researchers bear full responsibility for interpretation, ensuring that the outputs align with ethical standards and the goals of scholarly inquiry.
Inherent Bias in Training Data
One concern is that LLMs produce outputs grounded in vast quantities of text that tend to reflect dominant viewpoints. If the training data predominantly represents a certain cultural stance, other interpretations or marginalized voices might be overlooked (Liu, 2024). These biases could manifest in subtle ways: highlighting certain types of viewpoints while sidelining others, or describing certain groups’ concerns as peripheral or “irrational.” In qualitative research that seeks to uplift underrepresented communities, this epistemic imbalance could replicate colonial or hegemonic narratives. Critical AI scholars and decolonial theorists have warned that uncritical AI use can reinforce imperial epistemologies, making it harder for non-dominant knowledge systems to emerge (Gebru, 2019). Indeed, high-profile controversies have shown that algorithmic systems, even without AI, can perpetuate racialized, gendered, and classed biases (Benjamin, 2019; Noble, 2018). This has clear implications for qualitative inquiry: researchers must remain vigilant about biases in the model’s responses.
These potential biases are serious, but they are neither inevitable nor impossible to mitigate. To be sure, LLM outputs reflect learned data patterns rather than deliberate malice (see: Guo et al., 2024; Rozado, 2023). As a result, researchers should first and foremost explicitly prompt models to consider alternative perspectives and marginalized viewpoints when analyzing such data, and iterate as needed (Nazer et al., 2023). For instance, when analyzing interview data about urban development, researchers might prompt: “Examine how different socioeconomic groups discuss neighborhood change, paying particular attention to perspectives that might be underrepresented in mainstream discourse,” or “Identify patterns in how residents from different ethnic backgrounds describe their experiences with community planning processes, ensuring equal attention to all community voices.”
Second, validation processes should include diverse stakeholders who can identify potential blind spots or problematic assumptions in the model’s interpretations. Researchers can compare LLM outputs with manual coding, community consultations, or alternative data sources. Third, researchers should document instances where outputs appear biased and report them to LLM creators. Already, OpenAI, the developer of ChatGPT, updates and fine-tunes models with debiasing techniques based, in part, on user feedback (OpenAI et al., 2024).
Data Privacy
Data privacy is another potential concern that might arise if researchers upload identifiable transcripts into external LLM interfaces. A breach or unauthorized use could expose sensitive participant information. High-profile tech scandals and data leaks (as covered by major media outlets) illustrate that even well-meaning platforms can fail at data protection. In a qualitative context, this might mean that participants’ trust is betrayed, undermining ethical commitments and potentially causing harm to vulnerable individuals.
Researchers already have robust tools to handle data responsibly. Techniques like anonymization, pseudonymization, and strict adherence to IRB protocols are standard in qualitative research and remain effective in the LLM era. Researchers can also download and deploy standalone LLMs in secure, on-premise environments, ensuring data never leaves their institutional control.
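As an illustration of pseudonymization before upload, direct identifiers can be replaced with placeholders using a researcher-supplied lookup table. This is a minimal sketch, not a complete de-identification procedure: the function name `pseudonymize`, the names, and the placeholders are all hypothetical, and indirect identifiers (job titles, dates, rare events) still require manual review.

```python
import re


def pseudonymize(text, name_map):
    """Replace each known identifier with its pseudonym.

    Matching is whole-word and case-insensitive. Longer identifiers are
    replaced first, so "Jane Smith" is handled before "Jane".
    """
    for real in sorted(name_map, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(real) + r"\b", flags=re.IGNORECASE)
        replacement = name_map[real]
        # A function replacement avoids backslash interpretation in pseudonyms.
        text = pattern.sub(lambda _match: replacement, text)
    return text


# Hypothetical identifiers and placeholders, for illustration only.
name_map = {
    "Jane Smith": "[P1]",
    "Jane": "[P1]",
    "Oxbridge University": "[University]",
}
sample = "Jane Smith joined Oxbridge University in 2010. Jane later moved labs."
cleaned = pseudonymize(sample, name_map)
```

Keeping the lookup table in a separate, access-controlled file allows the researcher to re-identify passages later without exposing names to the LLM platform.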
Authenticity, Transparency, and Intellectual Honesty
Because LLMs can produce text that closely resembles human speech, there is a risk that researchers might conflate AI-generated content with genuine participant perspectives. Deliberate misuse (e.g., presenting synthetic respondents as real data) or neglecting to clarify the origin of certain narratives could damage scholarly credibility. The rise of “deepfakes” and AI-generated misinformation (Chesney & Citron, 2019) has heightened awareness of how convincingly AI can sometimes mimic authenticity. In qualitative inquiry, this could result in misleading conclusions and a breakdown of trust between researchers, participants, and audiences.
Transparent documentation, clearly indicating which insights came from participants and which stemmed from AI interpretation, helps maintain intellectual honesty. Beyond maintaining basic logs of chat interactions, researchers should implement comprehensive documentation protocols that detail their entire analytical process. This includes archiving prompt sequences, recording decision rationales, and maintaining version control for evolving analyses. These materials should be deposited in publicly available repositories with persistent URLs or DOIs, ensuring long-term accessibility and reproducibility. Peer review, editorial oversight, and community scrutiny further reduce the likelihood of deceptive practices.
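The documentation protocol described above could be kept as an append-only log, one JSON record per LLM interaction. The sketch below is one possible layout, not an established standard: the field names, the `log_interaction` function, and the sample prompt and response are all hypothetical.

```python
import hashlib
import io
import json
from datetime import datetime, timezone


def log_interaction(stream, prompt, response, rationale, model="unspecified"):
    """Append one prompt/response pair, with the researcher's rationale, as a JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "response": response,
        # Why the prompt was issued and how the output was used or rejected.
        "rationale": rationale,
        # Fingerprint that lets readers check the archived response was not altered.
        "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
    }
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


# An in-memory buffer stands in for a log file opened in append mode.
audit_log = io.StringIO()
entry = log_interaction(
    audit_log,
    prompt="Identify recurring references to social hierarchy in this excerpt.",
    response="(model output would be pasted here)",
    rationale="First pass at candidate themes; to be checked against manual codes.",
)
```

Because each line is a self-contained record, the file can be placed under version control and deposited alongside the analysis.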
Practical Limitations
In addition to these ethical concerns, there are technical and practical limitations to keep in mind. LLM platforms typically have a maximum context length measured in “tokens” (units of text, roughly words or word fragments, that the model can process and retain at one time), meaning they can handle only so much text in a single session. If researchers upload texts that exceed this limit, the model may lose context or ignore parts of the data. Likewise, if the chat history grows too long, the model’s short-term “memory” within a conversation may drop earlier content. Moreover, many online chat-based LLMs impose periodic usage limits that restrict users to a certain number of interactions before requiring a paid subscription upgrade or imposing waiting periods (even for subscribers). Although token capacities continue to increase, researchers can work within current limits by breaking large data sets into smaller, thematically cohesive chunks, so that the model remains focused and can handle the text without discarding contextual cues.
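The chunking strategy can be sketched as follows. This version groups whole paragraphs until a token budget is reached, using paragraph boundaries as a rough stand-in for thematic cohesion and a whitespace word count as a crude stand-in for a real tokenizer. Both are simplifying assumptions, and the function name and default budget are hypothetical.

```python
def chunk_paragraphs(text, max_tokens=1000, count_tokens=lambda s: len(s.split())):
    """Group blank-line-separated paragraphs into chunks within a token budget."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, current_size = [], [], 0
    for paragraph in paragraphs:
        size = count_tokens(paragraph)
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and current_size + size > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_size = [], 0
        # A single paragraph larger than the budget still becomes its own chunk.
        current.append(paragraph)
        current_size += size
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Tiny demonstration with a budget of five "tokens".
sample = "a b c\n\nd e\n\nf g h i"
sample_chunks = chunk_paragraphs(sample, max_tokens=5)
```

In practice, a researcher would likely replace the paragraph split with boundaries that follow interview questions or topics, and pass a `count_tokens` function that matches the model being used.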
Prompt design should likewise be adjusted so that each query targets a specific output (e.g., “Analyze recurring references to social hierarchy in these three paragraphs” vs. “Analyze these paragraphs”). Vague or overly broad prompts often yield generic responses that lack analytical depth. Instead, researchers should craft specific, focused queries that guide the LLM toward the particular aspects of the data they wish to explore (Shah, 2024).
Effective prompts typically include context about the research objectives, specific analytical tasks, and clear parameters for the desired output format. This specificity serves multiple purposes: it helps ensure the LLM’s response aligns with research objectives, makes the analytical process more transparent and reproducible, and enables more systematic documentation of the researcher’s decision-making process. Moreover, detailed prompts can create clearer audit trails for other researchers who might wish to understand or replicate the analysis.
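The three elements named above (research context, analytical task, and output format) can be assembled with a small template so that every prompt in a project has the same documented structure. The sketch below is illustrative only. The section labels, the `build_prompt` function, and the sample wording are hypothetical.

```python
def build_prompt(research_context, task, output_format, excerpt):
    """Assemble a structured prompt; refuse to build one with a missing element."""
    sections = [
        ("Research context", research_context),
        ("Task", task),
        ("Output format", output_format),
        ("Data excerpt", excerpt),
    ]
    for label, value in sections:
        if not value.strip():
            raise ValueError(f"Missing prompt element: {label}")
    # Each element gets its own labeled block, separated by blank lines.
    return "\n\n".join(f"{label}:\n{value.strip()}" for label, value in sections)


example_prompt = build_prompt(
    research_context="Interview study of how residents experience neighborhood change.",
    task="Analyze recurring references to social hierarchy in these three paragraphs.",
    output_format="A bulleted list of themes, each with one supporting quotation.",
    excerpt="(three paragraphs of de-identified transcript would go here)",
)
```

Because the template rejects empty fields, an overly broad query such as “Analyze these paragraphs” cannot be sent without the researcher also stating the context and the expected form of the answer.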
Conclusions: Bringing the Human Back in
Qualitative research often operates on the premise that social reality is actively shaped by how people interpret, negotiate, and assign meaning to their lived experiences. Interpretivist scholars emphasize the researcher’s role in co-creating these understandings through direct engagement, reflexivity, and attention to participants’ viewpoints. Constructivist traditions similarly foreground the ways individuals and groups develop shared understandings of the world, seeing reality as a product of social processes and interactions (Burns et al., 2022).
Introducing LLMs, systems devoid of personal or cultural experiences, into these traditions raises questions about how interpretation occurs and who (or what) is doing the interpreting. LLMs cannot truly experience the world; they recognize patterns based on large volumes of text without genuine self-awareness. From an interpretivist viewpoint, this offloading of sensemaking may risk overshadowing the researcher’s nuanced reading of social context. Even if these models can identify sophisticated patterns and generate nuanced analyses, they fundamentally lack the embodied understanding that can only come with genuine cultural immersion or participatory engagement.
In a constructivist sense, the researcher’s active role in co-constructing meaning might likewise be sidelined if LLMs are perceived as “neutral experts.” This perception could inadvertently privilege machine-generated interpretations over the careful, context-sensitive meaning-making that emerges through human interaction and reflection. Yet, constructivist scholars might see an opportunity: each LLM output can be treated as an interpretive “draft” that the researcher refines through a human lens, drawing on theoretical frameworks, field knowledge, or participants’ own insights. In doing so, LLMs become a catalyst for deeper investigation rather than a one-click solution. The danger lies not in using LLMs as analytical tools, but in potentially overvaluing their outputs or mistaking their pattern recognition capabilities for genuine understanding.
When used thoughtfully, LLMs can expand our perspectives rather than replace them. They can quickly assign thematic codes, recognize patterns, or contrast different viewpoints on a topic, leaving the researcher to decide how these insights align with, or challenge, existing interpretive frameworks. As things stand, only human analysts can connect textual cues to experiential knowledge, that is, knowledge that extends beyond language to the broader social world. As a result, LLMs can amplify human capabilities, especially at early stages such as organizing large datasets or surfacing potential themes, while the deeper interpretive and cultural insights often remain the domain of the human qualitative researcher.
Thus, the researcher stays in charge. Rather than a self-contained “analysis machine,” the LLM functions as a prompt-driven assistant that catalyzes new lines of thought through a dynamic back-and-forth “conversation” between researchers and their qualitative data. Researchers apply their own theoretical lens when deciding which prompts to use, how to incorporate the model’s suggestions, and whether those suggestions hold up under critical examination. Through constant reflection on what is gleaned from the LLM, and on where the researcher’s own expertise and skill are needed, it becomes possible to harness AI insights without undermining the foundational values of rich, human-centered qualitative inquiry.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Israel Science Foundation (ISF) Grant #1836/22.
