Abstract
Many behavioral science-based interventions, such as nudges and so-called psychologically wise interventions, seek to improve people’s lives by using words to shift them toward helpful perspectives and behaviors. Large language models (LLMs) such as OpenAI’s GPT models or Google’s Gemini have the potential to revolutionize these interventions by making them more personalized, scalable, and cost-effective. This article describes how three groups of people—designers of interventions, intermediaries who interact with the intended beneficiaries, and the intended beneficiaries themselves—might use LLMs and identifies potential benefits and risks of those uses for the intended beneficiaries of interventions. We hypothesize that the potential benefits and risks are lowest when designers interact with LLMs, higher when intermediaries interact, and highest when the intended beneficiaries interact directly. We provide suggestions for mitigating the risks so that policymakers can safely deliver on the promise of LLMs.
Behavioral science interventions that aim to improve people’s lives often rely on verbal communications that encourage a given course of action—such as certain nudges, which steer a decision gently without reducing someone’s options,1 or psychologically wise interventions, which use brief, precisely targeted strategies to shift how people view themselves or their situations.2,3 Nudges might take the form of text message reminders to vote or complete other important tasks.4 Wise interventions might involve reading and writing exercises that help a person be less stymied by uncertainties at a critical transition. In one experiment, for instance, minority college students who worried they would not fit in and do well at college were told that concerns over belonging are normal for anyone at the start of college and usually go away over time. After considering this idea in a one-hour exercise, those students went on to have better grades at the end of three years than comparable students who did not do the exercise.5,6 (For further discussion of where and for whom nudges and wise interventions can be most effective, see References 7 through 9.)
Artificial intelligence (AI) tools known as large language models (LLMs), such as OpenAI’s GPT models or Google’s Gemini, hold tremendous promise for tackling societal problems in ways that once seemed futuristic.10–14 LLMs, which can process vast volumes of information, can revolutionize the delivery of behavioral science interventions because they can generate text that is tailored to the needs of the recipients and do so at scale. For instance, LLMs can be designed to deliver motivational, supportive messages tailored to an individual student in response to concerns revealed by the student15—something that could be extraordinarily time-consuming and costly to provide manually for large numbers of recipients. In other words, LLMs can potentially deliver behavioral science-based interventions that are better adapted to the needs of individual recipients, more scalable, and less costly than is now the case.
Standing in the way of this vision, however, are the possible risks of having LLMs intentionally influence people’s thoughts and behaviors. When public officials and others use LLMs to interact with people, they relinquish some control over the information, advice, and ideas provided in the interactions. This loss of control can increase risk for the affected populations, especially if the algorithms that produce the outputs harbor unrecognized errors or biases.16 It is paramount that researchers and policymakers understand and develop ways to mitigate these risks if they are to fully realize the potential benefits of using LLMs to power behavioral science-based interventions to improve societal well-being.17,18
In this article, we describe three ways LLMs could be used in behavioral science interventions: (a) LLMs could help intervention designers to construct interventions for intended beneficiaries (such as students or employees), (b) LLMs could intervene directly with intended beneficiaries by interacting with them in ways designed to support or influence them, or (c) LLMs could intervene with intermediaries (such as teachers or managers) by supporting or influencing how they engage with the intended downstream beneficiaries of that intervention (such as students or employees). We argue that both the promise and the risk of using LLMs increase as the LLMs come in closer contact with the intended beneficiaries, with use by designers entailing the lowest potential benefit and risk and direct interaction with intended beneficiaries entailing the most (see Figure 1). For each of the three uses, we draw on real examples as illustrations and highlight potential benefits and primary risks that stand in the way of reaping those benefits. We then provide suggestions for ways that policymakers and others in charge of LLM use can help to mitigate the risks.

Figure 1. Conceptual model of how stakeholders in behavioral science interventions may interact with LLMs & the associated levels of benefit and risk to the intended beneficiaries of interventions
Benefits & Risks of Incorporating LLMs Into Interventions
When LLMs Interface With Intervention Designers
Behavioral scientists who design interventions might leverage LLMs in various ways. Here, we focus on two of the most likely.
First, designers might have LLMs generate initial drafts of intervention materials. Behavioral science interventions often include carefully crafted arguments aimed at shifting beliefs and behaviors.19,20 Having digested a huge amount of literature (including published behavioral science research and past examples of such arguments), LLMs could rapidly draft persuasive arguments or other intervention components—say, memorable metaphors that illustrate core concepts or writing exercises meant to reinforce desirable mindsets.21,22 One study found that LLM-generated messages encouraging individuals to be vaccinated during the COVID-19 pandemic were perceived to be stronger and more effective than messages developed by the U.S. Centers for Disease Control and Prevention.23 The draft components produced by LLMs could then be piloted and refined in small iterative A/B tests (comparing two versions of a message)24 before being tested at scale. Similarly, AI tools might be asked to generate messages or exercises that could be microtargeted to the populations most likely to be persuaded by those particular approaches.16
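To make the A/B-testing step concrete, the following minimal Python sketch compares two LLM-drafted reminder messages using a standard two-proportion z-test. The sign-up counts, sample sizes, and variable names are invented for illustration and are not drawn from the studies cited above.

```python
# Minimal sketch of an A/B test comparing two LLM-drafted messages.
# All counts below are illustrative placeholders, not real pilot data.
from statsmodels.stats.proportion import proportions_ztest

signups = [112, 134]       # hypothetical sign-ups for message A and message B
recipients = [1000, 1000]  # hypothetical number of recipients per message

# Two-proportion z-test: does the sign-up rate differ between the two drafts?
z_stat, p_value = proportions_ztest(count=signups, nobs=recipients)
print(f"Message A rate: {signups[0] / recipients[0]:.3f}")
print(f"Message B rate: {signups[1] / recipients[1]:.3f}")
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

In practice, a designer would run such a comparison only after the LLM drafts had been reviewed by humans and would treat a small pilot like this as a screening step before a larger, preregistered trial.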
Second, LLMs could be used to predict how human participants will respond to potential interventions.13,25–28 In a recent study, investigators had LLMs create virtual research participants that were intended to simulate the human participants in 476 experiments that used nationally representative samples; then the researchers compared the responses of the “silicon samples” to those of the humans in the original experiments. The researchers found a high correlation (0.85) between the simulated and the actual experimental effects.26 An intervention designer might therefore use LLM-generated synthetic participants for fast and inexpensive initial A/B tests of proposed interventions and use the results to inform the design of a more refined pilot test involving humans. Indeed, if LLMs prove sensitive to various participant demographics (such as age, nationality, or gender) and can accurately reproduce heterogeneous responses to interventions, the use of these synthetic research participants25 could be a viable way to anticipate and account for heterogeneity in the way different subgroups would respond to a given intervention message.7–9 This practice, if effective, could further accelerate the design process and help intervention designers make data-driven decisions about intervention materials at reduced cost.
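The sketch below illustrates, under clearly hypothetical assumptions, how a designer might check whether silicon samples track human respondents: a persona-style prompt is constructed for each synthetic participant, and simulated treatment effects are correlated with human ones. The prompt template and all effect sizes are invented; no specific LLM or dataset is assumed.

```python
# Sketch: compare LLM-simulated ("silicon") treatment effects with human ones.
# The persona prompt and every number below are invented for illustration.
import numpy as np
from scipy.stats import pearsonr

def persona_prompt(age: int, gender: str, country: str, item: str) -> str:
    """Build a prompt asking an LLM to answer a survey item as a synthetic participant."""
    return (
        f"You are a {age}-year-old {gender} respondent from {country}. "
        f"Answer the following survey item as that person would:\n{item}"
    )

print(persona_prompt(34, "female", "Kenya",
                     "How likely are you to schedule a vaccination appointment?"))

# Hypothetical treatment effects (e.g., standardized mean differences)
# estimated from human experiments and from matched silicon samples.
human_effects = np.array([0.21, 0.05, 0.33, -0.02, 0.18, 0.40])
silicon_effects = np.array([0.17, 0.09, 0.28, 0.01, 0.22, 0.35])

r, p = pearsonr(human_effects, silicon_effects)
print(f"Correlation between human and silicon effects: r = {r:.2f} (p = {p:.3f})")
```

A designer would, of course, treat a high correlation in such a check as supporting evidence, not as a substitute for testing with human participants.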
As for risks posed to the intended beneficiaries of interventions designed with the help of LLMs, one is that LLMs have, at times, been found to generate information that could mislead or confuse recipients of an intervention, such as outputs that are not aligned with reality (colloquially known as hallucinations)29 or that fail to apply common sense.30
Another risk is that LLMs may generate content harmful to intended beneficiaries by perpetuating biases. Before LLMs are put into service, they are trained on a massive corpus of text generated by humans, often from the internet, discerning meaning via statistical learning. They are then trained further on materials relevant to outputs they are meant to produce and are given specific instructions, or prompts, on how to produce those outputs (such as, in simplified terms, “Draft a text message that will increase the rate at which mothers in community X set up vaccination appointments with pediatricians.”). Unfortunately, LLMs tend to perpetuate cultural biases embedded in the training data, such as favoring the values of the most-represented group31–33 rather than capturing human psychological diversity.12
Similarly, messages microtargeted at subgroups less represented in the training data could also reflect inaccurate stereotypes about these minoritized subgroups. Cultural biases, of course, are not unique to LLMs. However, when bias is encoded in and replicated by AI tools that are widely used, especially foundational LLMs that form the basis of many AI tools, it spreads to many contexts—a condition known as algorithmic monoculture34—and its negative effects can become increasingly widespread. Without sufficient caution on the part of the designer, LLM-generated messages could thus disadvantage the very subpopulations they are designed to help.
As Table 1 indicates, the possibility that LLMs will deliver flawed or biased information does not arise only when intervention designers use LLMs; the same risks can occur when intermediaries or intended beneficiaries of interventions interact with LLMs. But a different problem is specific to the use of LLMs in intervention design: Using LLMs to test interventions on synthetic research participants can result in the delivery of useless or harmful interventions if the responses of the silicon samples do not, in fact, resemble the responses of the humans they were intended to mimic.
Table 1. Potential benefits & risks of large language model use, by stakeholder
When LLMs Interface With an Intermediary
A major way that LLMs can intervene with intermediaries is by giving them feedback meant to improve the well-being of the people with whom the intermediaries interact. This feedback might relate to interactions that have already occurred with an intended beneficiary of an intervention35 or, in digital environments such as chatrooms and emails, might be given in real time.36
For instance, an LLM may provide teachers, workplace managers, or police officers with feedback on their language to facilitate their efforts to support more inclusive or empathic communication. Preliminary research has demonstrated success with this approach. One randomized controlled study used LLMs to provide real-time feedback to peer support providers who were interacting with support seekers on a text-based help platform36 with the goal of promoting empathy, a feature of conversations that is linked to positive treatment outcomes in psychotherapy.37 Compared to conversations that were not assisted by this AI tool, conversations assisted by LLM-based feedback were rated as more empathic by an independent group of support seekers. Related work has found similar effects in education, with AI feedback improving educators’ teaching practices as well as students’ academic performance.35,38 In one case, the software indicated how frequently an educator asked open-ended questions, built on a student’s contributions, and gave students time to talk in a one-on-one online discussion.
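As a rough illustration of how real-time, empathy-focused feedback for an intermediary might be generated, the Python sketch below sends a peer supporter’s draft reply to an LLM through the OpenAI client and asks for a suggested rewording. The model name, system prompt, and draft text are assumptions made for illustration; they do not describe the design of the systems evaluated in the studies cited above.

```python
# Sketch: ask an LLM for empathy-focused feedback on an intermediary's draft reply.
# The model, prompt, and draft are illustrative assumptions, not the cited systems.
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

draft_reply = "Just try not to think about it so much. It will pass."

feedback = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": (
                "You coach peer supporters. Given a draft reply to a support seeker, "
                "point out missed opportunities for empathy and suggest a brief, "
                "more empathic rewording. Address the supporter, not the seeker."
            ),
        },
        {"role": "user", "content": f"Draft reply: {draft_reply}"},
    ],
)
print(feedback.choices[0].message.content)
```

In a deployed tool, such feedback would be shown to the intermediary as a suggestion; the intermediary would decide whether and how to use it, preserving the human buffer discussed below.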
When providing feedback to intermediaries, LLMs can flexibly tailor their messages to each individual rather than merely offering general advice. Another potential benefit of having LLMs interact with intermediaries is that these people can buffer intended beneficiaries from aspects of LLM-generated suggestions they believe could be unhelpful or harmful.
It is worth emphasizing that, as the examples above show, it is not only the ultimate beneficiaries who can profit from LLM feedback to the intermediaries; the intermediaries, too, tend to benefit—for example, peer supporters became more confident in providing empathy,36 and tutors used improved teaching strategies when given LLM assistance.38 Although the prospect of LLMs potentially replacing certain intermediaries raises concerns, the findings we have described show that LLMs can augment the capabilities of, rather than replace, human intermediaries while at the same time enhancing the intermediaries’ relationships with the beneficiaries of the interventions (such as by helping teachers to be better educators and role models).
Beyond the risks posed to the intended beneficiaries of an intervention when any stakeholder interacts with an LLM, the primary risk of directing LLM feedback to intermediaries for behavioral science interventions is that the approach requires the individual intermediary to be able to (a) implement helpful LLM-generated advice effectively and (b) identify and dismiss harmful advice. To the extent that individuals fail to do so, they risk causing harm to the beneficiaries with whom they engage.
When LLMs Interface With the Intended Beneficiary
If successfully trained to interact directly with the intended beneficiaries of a behavioral science intervention, LLMs could provide the psychological support recipients need precisely when they need it.11,39–42 For instance, researchers examined whether an LLM could help address prospective college students’ concerns about attending college. After an LLM was trained on supportive language crafted by behavioral scientists, its messages were judged to be more supportive and understanding than human-written messages on an existing college advising platform.15 Thus, LLM-based tools may help to shift beneficiaries toward more adaptive ways of thinking by giving highly tailored advice as they confront distinct challenges. In the future, one could imagine equipping LLMs with a wide repertoire of scientifically proven intervention strategies that could be tailored to an intended beneficiary’s situation and delivered as and when the beneficiary required it.
Recent evidence suggests that LLMs could also help to address pressing social issues through argumentative dialogues. One study found that personalized, evidence-based conversations with an LLM reduced conspiracy theorists’ conspiracy beliefs by approximately 20% for at least two months, even among participants whose beliefs were deeply entrenched.43 By responding precisely to the lines of thought and argumentation that undergird individuals’ beliefs, the personalized messages generated by LLMs have the potential to shift entrenched beliefs in ways that would be difficult for more static interventions (such as a standard e-learning module or a persuasive essay) to accomplish.
With this potential for great impact come high levels of risk. Similar to LLM interactions with an intermediary, LLM-based tools could present harmful advice or ideas. In addition, in the absence of an intermediary, the messages would be delivered without being filtered by a human buffer who could mitigate potential damage caused by the promotion of behaviors and beliefs that would lead to negative consequences. What is more, unless intervention designers build in transparency, individuals may not know they are interacting with an LLM. Yet, there may be a tradeoff between effectiveness and transparency. For example, when recipients were told that the LLM-generated COVID-19 messages described earlier were produced by an LLM, they found the messages less persuasive than did people who were not given that information.23 Likewise, empathic messages labeled as coming from an LLM were deemed to be less empathic than were messages labeled as written by a human.44,45
Mitigating the Risks of LLM-Powered Behavioral Science Interventions
Next, we address ways that LLM developers and intervention designers can mitigate risks that arise regardless of who is interacting with an LLM and then address ways they can mitigate risks related to interactions with specific stakeholders. After that, we offer specific recommendations for policymakers. For summaries, see the sidebars Risk-Mitigation Advice for Intervention Designers and Advice for Policymakers.
Mitigating Cross-Cutting Risks of Using LLMs in Designing or Delivering Interventions
Recall that one major cross-cutting risk of LLMs is the potential to generate inaccurate or nonsensical information or even content harmful to intended beneficiaries. Ensuring that LLM-powered behavioral science tools provide the intended type of information and advice will require intervention designers to rigorously fine-tune them, such as by specifically exposing the LLMs to the type of language that would be expected to help the intended beneficiaries and iteratively testing the tools’ efficacy across a wide variety of situations.46,47 LLMs could also be constructed in ways that help to address gaps in users’ understanding of AI’s limitations, such as by providing explanations of the processes that yielded the LLM’s output.48
How might the other major cross-cutting risk—the potential for LLM-generated messages to be infused with cultural biases33,49—be mitigated? LLM developers and intervention designers can apply existing debiasing techniques,33,49 although these techniques have been unable to fully prevent bias.50 Bias can also be limited by rigorously pilot testing intervention materials before they are implemented at scale.24 For example, researchers might evaluate bias in their materials in an iterative process involving running focus groups with individuals from targeted subpopulations, refining the materials accordingly, testing them in small-scale pilot experiments, and then repeating the process as needed. Together, the use of debiasing techniques and pretesting may help to ensure that intervention materials avoid inadvertently privileging dominant social groups.
Mitigating Risks Related to Stakeholder-Specific Uses
Advice related to the use of LLMs by intervention designers. To avoid investing in ineffective or even harmful intervention materials, intervention designers who use LLM-generated synthetic participants to assess their ideas will need a way to predict how well these silicon samples approximate the responses of human recipients. One solution could be to verify the silicon findings in small samples of human volunteers before testing an intervention at larger scale.
Advice related to the use of LLMs by intermediaries. To address the risk that intermediaries may fail to implement LLM-generated advice effectively, intervention designers should, of course, fine-tune the tools as much as possible. But they or others may also have to provide training on the capacities and limitations of these technologies, including teaching intermediaries how to identify potentially harmful language or, in cases where intermediaries have the opportunity for dialogue with the LLM, how to write prompts that will yield the best possible output.51
Advice related to the use of LLMs by intended beneficiaries of interventions. When no intermediary will be present to buffer the harms that LLMs might cause, developers of LLM-powered tools that interact directly with intended beneficiaries should ideally take a “human-in-the-loop” approach,52 which escalates challenging or uncertain circumstances to trained human agents. For instance, in the case of mental health or other specialized behavioral science chatbots, it will likely be necessary to flag conversations with people at high risk of harm and move these conversations, at least temporarily, to a human operator.
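An escalation policy of this kind might be structured along the lines of the sketch below. The risk score, the threshold, and the helper functions are hypothetical placeholders; a real deployment would substitute its own clinically validated risk assessment and staffing workflow.

```python
# Sketch of a human-in-the-loop escalation policy for a support chatbot.
# The risk score, threshold, and helper functions are hypothetical placeholders,
# not part of any specific product or of the studies cited in this article.
from dataclasses import dataclass

ESCALATION_THRESHOLD = 0.7  # illustrative cutoff, not a validated value

@dataclass
class Turn:
    user_message: str
    risk_score: float  # 0 (no concern) to 1 (acute risk), from an upstream classifier

def notify_human_operator(turn: Turn) -> None:
    """Stand-in for paging a trained human operator with the flagged conversation."""
    print(f"[escalation] risk={turn.risk_score:.2f}: {turn.user_message!r}")

def generate_llm_reply(message: str) -> str:
    """Stand-in for a call to the underlying LLM."""
    return f"(LLM reply to: {message})"

def route(turn: Turn) -> str:
    """Decide whether the LLM continues or a trained human takes over."""
    if turn.risk_score >= ESCALATION_THRESHOLD:
        notify_human_operator(turn)
        return ("I'd like to connect you with a person on our team who can "
                "help with this directly. Please hold on for a moment.")
    return generate_llm_reply(turn.user_message)

print(route(Turn("I don't see the point of anything anymore.", risk_score=0.85)))
print(route(Turn("I'm nervous about my first week of classes.", risk_score=0.20)))
```

Note that handing off in this way also speaks to the transparency tradeoff discussed next: the user is told plainly that a person is stepping in.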
A human-in-the-loop approach may also help to resolve the apparent tension between being transparent that individuals are interacting with an LLM and the decreased effectiveness that can result when they know they are doing so.23,44,45 For example, individuals conversing with an LLM may be transparently told that they are engaging with an AI-powered tool that is designed to help them with their specific situation but that if they want to talk to a human at any time, they can do so. A challenge for implementing this approach effectively is that the humans who step in may over-rely on an LLM’s judgments and fail to make better decisions.53 Therefore, humans serving in this role will require rigorous training on the particular task the bot is intended to do (such as providing mental health services or psychologically supportive college advising) and how to correct LLM-generated output that is not consistent with the goals of the task.
Recommendations for Public Policy
Behavioral science’s influence on the interventions used in public policy is growing.54 How can policymakers ensure that these interventions deliver the benefits of LLMs to society without exposing the recipients to significant risk? On the basis of our analysis of the uses of LLMs by different stakeholders, we offer five suggestions.
First, until an intervention is known to be safe for intended beneficiaries, favor the use of LLMs by people who are distal from those beneficiaries, such as intervention designers who use the LLMs to inform or test intervention ideas.
Second, to establish the safety of interventions that deploy LLMs directly to intended beneficiaries or intermediaries, insist on human trials of efficacy, preferably randomized controlled trials that can evaluate causality.
Third, given the ever-evolving nature of LLMs, demand ongoing evaluation of programs. Once an intervention is being applied in a population, trained researchers should sample human–LLM interactions for quality control.
Fourth, ensure that the intended beneficiaries of LLM-based interventions have ways of opting out and providing feedback if things go wrong.
Finally, develop policies that ascribe responsibility for LLM-generated errors; society will almost certainly require this action. These policies would clarify the circumstances under which a major error would be attributable to a tool’s developer, provider, or user.18 To minimize risk to intended beneficiaries, policymakers must create unambiguous rules about which stakeholder is accountable for mistakes under various circumstances, and above all, must protect the interests of vulnerable individuals.
Conclusion
LLMs hold tremendous promise for accelerating the science of behavioral interventions, including making interventions more responsive to recipients’ individual needs, increasing scalability, and reducing cost, but they also pose risks. We have described three interaction paradigms in which LLMs might be incorporated into behavioral science interventions: interacting with intervention designers, with intermediaries who interact with intended beneficiaries, or with the intended beneficiaries of an intervention directly. And, for each paradigm, we have discussed the potential benefits and risks to the intended beneficiaries of interventions. Of course, behavioral science interventions may well incorporate LLMs in multiple ways. For instance, an intervention designer may use an LLM not only to suggest content for interventions but also to generate content used to train a new LLM that will interface with an intermediary or an intended beneficiary.
It may soon be possible to sufficiently mitigate the hazards associated with some of the least risky LLM uses described here, such as the creation of static text messages by intervention designers. Adequately tackling the greater risks that emerge when LLMs interact with intermediaries and intended beneficiaries will likely take more time. We are optimistic that the risks described in this article will ultimately be addressed, enabling researchers to deliver on the promise of LLMs to revolutionize behavioral science interventions.
Risk-Mitigation Advice for Intervention Designers
Select debiased LLMs or use existing tools to debias LLMs. Review outputs for bias.
Rigorously fine-tune LLMs for the task at hand.
Iteratively run pilot tests and then refine outputs and test proposed interventions across a wide variety of situations.
Before testing an intervention at scale, verify that any pretests done using synthetic research participants approximate the findings from humans.
Train intermediaries on the limitations of the LLMs that give them advice and teach them how to identify harmful language.
For interventions that will be delivered directly to intended beneficiaries by an LLM, select an LLM that incorporates a human-in-the-loop strategy, enabling a human to step in when necessary.
Advice for Policymakers
Here, we suggest ways to protect the intended beneficiaries of interventions when LLMs are involved in the design or delivery of the interventions.
Favor the use of LLMs by intervention designers or intermediaries unless an LLM-based intervention has proven safe for the intended beneficiaries.
Insist on rigorous human trials of the efficacy of an intervention before an LLM is allowed to deliver an intervention directly to an intended beneficiary.
Require the ongoing postlaunch evaluation of programs that use LLMs to deliver interventions.
Provide the intended beneficiaries of direct LLM-based interventions with ways of opting out and providing feedback if things go wrong.
Develop unambiguous policies that, above all, protect the interests of vulnerable individuals and clarify who is responsible for errors attributable to LLMs.
Footnotes
Author Note
C. A. Hecht and D. C. Ong contributed equally to this manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
