Sage Journals: Discover world-class research

Abstract

This brief review examines the potential to use decision science to objectively characterize depression. We provide a brief overview of the existing literature examining different domains of decision-making in depression. Because this overview highlights the specific role of reinforcement learning as an important decision process affected in the disorder, we then introduce reinforcement learning modeling and explain how this approach has identified specific reinforcement learning deficits in depression. We conclude with ideas for future research at the intersection of decision science and depression, emphasizing the potential for decision science to help uncover underlying mechanisms and targets for the treatment of depression.

Keywords

computational modeling computational psychiatry decision-making major depressive disorder reinforcement learning

Depression is the most debilitating disease in the world according to the World Health Organization (Kessler et al., 2005). Hallmarks of depression include pessimistic and inaccurate beliefs about the self, the world, and the future, and these beliefs can influence decision processes in individuals who suffer from depression (Roepke & Seligman, 2016). Indeed, difficulty making everyday decisions, such as what to wear or what to eat, is a diagnostic criterion for major depressive disorder (MDD). In addition, MDD is associated with impaired functioning in specific brain regions—the ventromedial prefrontal cortex and ventral striatum—known to play a critical role in incentivized decision-making (Bartra et al., 2013; Kumar et al., 2018). Using the theoretical and analytical tools of decision science to investigate depression could provide a more nuanced understanding of the cognitive mechanisms affected by the disorder, shed light on the nature of heterogeneity within depression, and help bridge gaps between current research in clinical psychopathology and neuroscience. Further, identifying specific decision impairments as the targets of treatment could make more personalized help possible, as opposed to the more generic standard protocol that is currently implemented.

Decision-Making in Depression Across Domains

Though several studies have investigated decision-making in depression, a major limitation is that these studies usually include only one decision-making task. For instance, several studies have examined decision-making performance on the Iowa gambling task, which a meta-analysis has demonstrated is impaired across a number of mental health disorders, including depressive disorders (Mukherjee & Kable, 2014). Although findings of this type remain informative, we would ideally like to characterize depression across multiple decision-making domains. Such a characterization would identify whether differences in decision-making in depression are specific to particular domains as opposed to widespread. Characterizing multiple decision domains across different disorders would further identify what pattern of decision-making differences is unique to depression.

As a first step toward this end, we recently conducted a study investigating a wide range of different decision processes in the same group of depressed individuals (Mukherjee, Lee, et al., 2020). We used a battery of nine tasks (yielding 10 measures; see Table 1) that spanned five different domains of choice behavior and had previously been studied only in separation. Although this is by no means an exhaustive list of the available decision-making paradigms, these specific tasks have been related to neural functioning and psychopathology in the literature.

Table 1.

Different Domains of Decision-Making, Decision Tasks, Performance Measures, and Task Structures Examined by Mukherjee, Lee, et al. (2020)

Decision domain and decision task	Performance measure	Task structure
Intertemporal choice
Delay discounting	Now choices (%)	Participant chooses between a smaller monetary reward available immediately and a larger one received after a delay. For example, “Would you prefer $10 now or $15 in 7 days?”
Willingness to wait	High-persistence wait (in seconds)	Participant starts a trial by placing a cursor in a box marked, “Wait for 30¢.” At any point before the 30¢ is awarded, they can stop waiting by shifting to a box marked “Take 2¢” and proceed to a new trial. The optimal strategy for this block was to always wait for the 30¢.
	Low-persistence wait (in seconds)	The procedure was the same as above, but in this block, the optimal strategy was to wait only for the 30¢ reward for 5 s, and after 5 s, take the 2¢ if the 30¢ reward had not arrived.
Decisions under uncertainty
Risk tolerance	Safe choices (%)	Participant chooses between a smaller guaranteed amount of money or a 50% chance of winning a larger amount. For example, “Would you prefer a 50% chance of winning $35 or a sure amount of $5?”
Ambiguity tolerance	50-50 choices (%)	Participant chooses between a gamble where the chances of winning are known to be 50% (risky) and a gamble where the chances of winning are not known (ambiguous).
Social decision-making
Ultimatum game, proposer	Propose for self ($)	Participant proposes how to split $10 with an anonymous responder, knowing the responder can accept or reject the proposal.
Ultimatum game, responder	Minimum accept for self ($)	Participant states whether they would accept or reject each possible (whole dollar) proposed split of $10.
Reinforcement learning
Reward learning	Rich choices (%)	Participants learn from experience to choose the fractal with the higher likelihood of reward feedback (win 25¢).
Punishment learning	Rich choices (%)	Participants learn from experience to avoid choosing the fractal with the higher likelihood of punishment feedback (lose 25¢).
Future expectations
Likelihood judgment	Win probability (%)	“What do you think are the chances that you will win additional money in this study (in addition to the $30 for participation) on a scale of 0–100%?”

Our decision-making battery included measures of intertemporal choice, decisions under uncertainty, social decision-making, reinforcement learning, and future expectations. In intertemporal choice, individuals choose between smaller rewards that they can receive sooner and larger rewards that they will have to wait for. In one task, the delays range from weeks to months that the individual must wait after experiment (Kable & Glimcher, 2007). In another, the delays occur during the experiment (and the monetary rewards collected are correspondingly smaller), and decision-makers face the problem of optimizing persistence given the delay statistics in order to make the most money during the experiment (McGuire & Kable, 2012). In decisions under uncertainty, individuals choose between smaller rewards that are more certain and larger rewards that are more uncertain (risk- and ambiguity-tolerance tasks). Decision scientists distinguish between two different forms of uncertainty: risk and ambiguity. Under risk, the likelihood of different outcomes is known; under ambiguity, this information is unknown (Levy et al., 2010). One aspect of interpersonal interactions and social decision-making is assessed by the ultimatum game. In the ultimatum game, one player (the proposer) makes an offer to another (the responder) regarding how to split a fixed amount of money between them. The responder can either accept the offer, in which case the money will be divided on the basis of the proposer’s decision, or reject the proposal, in which case neither the proposer nor the responder will receive any of the money. Even though responders should accept any nonzero offers if they care only about maximizing their payout, they generally reject highly unequal splits, and knowing this, proposers generally offer splits that are close to equal. Reinforcement learning refers to the process by which artificial agents, humans, or other animals learn how to acquire reward in an unsupervised environment, that is, how they maximize reward and minimize loss by modifying their behavior as a consequence of experience (Rutledge et al., 2017). In our battery, participants perform two tasks in which they learn from either rewards or punishments. Participants select one of two images that are associated with rewards (or punishments) with higher or lower probability, and these probabilistic associations reversed periodically throughout the task. Finally, because negative expectations are a key mechanism in the maintenance and recurrence of depression (Roepke & Seligman, 2016), we assess future expectations by asking participants to estimate their likelihood of receiving additional money on the basis of their performance on the other decision-making tasks.

We found that depressed participants (n = 64) differed from healthy control participants (n = 64) most in the reinforcement learning (reward and punishment learning tasks), future expectations (likelihood judgment task), and intertemporal choice (high-persistence condition of the willingness to wait task) domains (Fig. 1). Depressed individuals made fewer optimal decisions in both the reward and punishment learning tasks, expressed more pessimistic predictions about winning money on the basis of their task performance, and were less willing to wait when persistence was optimal. The two groups did not differ significantly in performance on any of the other measures (see Fig. 1). A factor analysis of the 10 performance measures identified four factors: a learning factor, which most strongly weighted the reward and punishment learning tasks; an expectations factor, which most strongly weighted the likelihood judgment (future expectations) task; a persistence factor, which most strongly weighted the willingness-to-wait task; and an uncertainty factor, which most strongly weighted the risk and ambiguity tolerance tasks. Of these four factors, the learning and expectations factors each accounted for unique variance in predicting depressed status (Mukherjee, Lee, et al., 2020). Beyond this, however, none of the factors were uniquely correlated with specific subsets of depression symptoms.

Fig. 1.

Effect size of performance difference between participants with major depressive disorder (MDD) and healthy control participants, separately for each decision task (data are from Mukherjee, Lee, et al., 2020). *p < .05, **p < .01, ***p < .001.

The differences we observed echo previous research on decision-making in depression using single tasks. As discussed in more detail in the next section, impaired reinforcement learning is strongly supported in the literature (Halahakoon et al., 2020; Pike & Robinson, 2022) and can often be detected with simple measures of performance (such as we used in this study) without the use of computational modeling (e.g., Kumar et al., 2018). Pessimistic expectations are consistent with the cognitive theory of depression, which posits that depressed individuals have inaccurate, negative biases about themselves and the future, and individuals with depression make significantly more pessimistic predictions than healthy control individuals in other tasks (Strunk & Adler, 2009). The finding that depressed individuals quit significantly earlier than healthy control individuals in the high-persistence condition of the willingness-to-wait task was novel but might be related to the reductions in motivation observed in depressed individuals in effort tasks (Berwian et al., 2020; Hershenberg et al., 2016). Past studies have generally failed to find differences between depressed and healthy individuals in the ultimatum game (Zhang et al., 2019) or in risk tolerance tasks (Chung et al., 2017). Past findings regarding delay discounting tasks are mixed, with some evidence that delay discounting choices are affected by anhedonia (Lempert & Pizzagalli, 2010) or suicidality (Tsypes et al., 2022), if not depression per se.

Modeling Reinforcement Learning in Depression

Given that depressed individuals showed the largest differences from healthy individuals in reinforcement learning, in a follow-up study, we used computational models (Huys et al., 2016) to better understand the mechanism underlying these differences. Fitting computational models to the data allows us to translate choice into subject-specific parameter estimates meant to reflect high-level behavioral tendencies (Sutton & Barto, 1998).

Perhaps the most commonly used model of reinforcement learning—and the one we used here—is Q-learning, which assumes that we learn the value of choice options through a weighted average of their reward history. When an unexpected outcome is encountered (e.g., if a previously rewarded choice suddenly leads to an unfavorable outcome), people can vary in terms of how much they allow this new information to affect their expectation of the value of the choice option. To capture this source of variability, a learning rate parameter α determines the extent to which prediction error—or the difference between expected (Q) and actual (R) outcome on any given trial—leads to value updating. High estimates of α reflect greater sensitivity to the most recent choice outcome, whereas lower values indicate more gradual value updating. The standard equation to describe this process is written as follows: $Q_{n, a}^{t + 1} = Q_{n, a}^{t} + α_{n} (R_{n}^{t} - Q_{n, a}^{t}),$ where n refers to subject, t to trial, and a to choice.

Q-learning models, in their simplest formulation, also include an additional parameter: inverse temperature, or β, whose magnitude determines the impact of value on choice. Indeed, the likelihood of choosing either of the choice options is weighted by a subject-specific β such that higher inverse temperatures reflect a greater tendency toward consistently choosing the option that is estimated to have a higher Q value. Smaller βs are indicative of more stochastic, or perhaps exploratory, behavior, and people who exhibit such choice patterns can be said to be less sensitive to value in their decision-making (for equations and further information, see van Geen & Gerraty, 2021, and Mukherjee, Filipowicz, et al., 2020).

In our follow-up study (Mukherjee, Filipowicz, et al., 2020), we found that fitting Q-learning models to our data can help elucidate the differences in performance between depressed individuals and healthy control individuals (Fig. 2). Specifically, we found that depressed individuals showed lower learning rates (α) and, to a lesser extent, lower value sensitivity (β), in both the reward and punishment learning tasks. Learning rates also outperformed simple performance metrics, such as number of optimal choices made, in predicting depression status, showing that computational modeling provides additional diagnostic power. Importantly, learning rates and value sensitivity were reduced in both the reward and punishment conditions, which is consistent with the claim that depression is characterized by a hyposensitivity to positive outcomes, but goes against the belief that depression is characterized by a hypersensitivity to negative outcomes (Eshel & Roiser, 2010).

Fig. 2.

Differences between participants with major depressive disorder (MDD) and control participants. Individuals with MDD (red) made fewer rich choices than control participants (blue) in both the reward and punishment conditions. (a) Mean proportion of rich choices in MDD and healthy control participants across all trials in the reward and punishment conditions. (b) Mean proportion of rich responses as a function of trial since the last reversal. (c) and (d) Identical statistics reported in (a) and (b), respectively, computed from simulations of subject behavior from reinforcement learning model fits (100 independent simulations per subject using their fit parameter estimates). Error bars in (a) and (c) and shaded regions in (b) and (d) correspond to ±1 SEM. Asterisks indicate significant between-group differences (*p < .05, **p < .01). Gray areas in (b) indicate trials after reversal on which signed-rank tests were significant (p < .05). Figure reprinted with permission from Mukherjee, Filipowicz, et al. (2020).

Our finding that depressed individuals exhibited lower learning rates is consistent with results of some previous studies. For instance, Chase et al. (2010) found that participants with MDD had lower learning rates for both punishments and reward in the initial training phase of a reward-learning paradigm. They also showed a negative correlation between learning rate and anhedonia, one of the primary symptoms of depression. Similarly, Dombrovski et al. (2010) found decreased learning rates in a probabilistic reversal learning task among suicide attempters. However, not all studies have found reduced learning rates in depression (Moutoussis et al., 2018). Combining evidence across studies, a recent meta-analysis concluded that there was evidence for lower reward learning rates in individuals with mood and anxiety disorders (Pike & Robinson, 2022).

Other work has also shown that depressed individuals exhibit decreased value sensitivity. For instance, one study found that patients with depression are more exploratory when they decide which of two differentially rewarding options to choose (Blanco et al., 2013). Using computational modeling, one study (Rupprechter et al., 2018), found both reduced memory for and sensitivity to past values in in their paradigm. In an adapted Q-learning model, Huys et al. (2013) further demonstrated that both MDD and anhedonia are correlated with lower reward sensitivity, meaning that patients’ choices are less deterministically guided by value. In this meta-analysis, however, the researchers did not find any association between depressive symptoms and learning rate. This suggests that more evidence is needed in order to parse the separate contributions of prediction-error updating and reward sensitivity to learning, as well as how they may be differentially affected by depression.

One critical reason there may be inconsistency across studies is that, as a field, we have not paid sufficient attention to the reliability and generalizability of parameter estimates derived from computational models. Indeed, a recent analysis (Eckstein et al., 2022) suggests that although estimates of reward sensitivity across different tasks are fairly reliable within participants, learning rates do not show evidence of generalization. As a consequence, two recommendations emerge. First, it is crucial to fit computational models using techniques that have been shown to provide the greatest out-of-sample reliability (e.g., using group-level priors, as suggested by van Geen & Gerraty, 2021). Second, it is necessary to consider the context of the experimental task when assessing how parameters may differ from optimality. For example, depending on the experimental paradigm, having a higher learning rate is not always better. If characteristics of different experiments make faster learning more optimal in one context and slower learning more optimal in another, a high-performing participant may demonstrate learning rates that differ greatly from one context to the next. In a similar vein, Vandendriessche et al. (2023) have recently shown that individuals with depression learn less accurately when the overall expected value of an environment is negative but not when it is positive. In cases such as this one, considering characteristics of the specific experimental context is necessary if we hope to identify biases in behavior that correlate with depressive symptoms. Moving forward, such considerations will be crucial if we hope to achieve a coherent understanding of how depression affects reinforcement learning.

An association between depression and parameters of reinforcement learning is broadly consistent with neurobiological ideas that link depression with alterations in dopamine and/or serotonin systems (Lan & Browning, 2022). Given how closely prediction errors track with dopamine release in the brain, reductions in the behavioral effect of prediction errors are in line with the hypothesis that depression is associated with reduced dopamine functioning (Delgado, 2000; Gradin et al., 2011). These findings could help explain the efficacy of antidepressants that increase dopamine availability. Serotonin, on the other hand, has been linked to reward and punishment signaling (Lan & Browning, 2022), and sustained intake of serotonin reuptake inhibitors leads to changes in reinforcement learning parameters (Michely et al., 2022). Beyond suggesting potential neurobiological mechanisms of depression, these ideas can also help reconcile some of the inconsistencies in the literature noted above. For example, though not all studies have found reduced learning rates in depression, some have nonetheless been able to leverage parameter estimates from reinforcement learning models to identify a widespread hyposensitivity to prediction error across the brain’s reward system (Gradin et al., 2011).

These links between reinforcement learning processes and dopamine and serotonin systems highlight the need to investigate potential medication-related confounds in studies of reinforcement learning in depression. For example, the Mukherjee, Filipowicz, et al. (2020) study systematically examined the effect of different forms of treatment in the MDD group. Individuals treated with medication only tended to have lower learning rates than those not on medication in the reward condition (medication only: M = 0.54, SD = 0.17, no medication: M = 0.69, SD = 0.17; signed-rank test, p = .041) but not in the punishment condition (p = .419). No other parameters in either condition differed between individuals being treated with medication only and no medication, between individuals undergoing therapy only and no therapy, or between individuals under any treatment (therapy and/or medication) and no treatment (no medication or therapy; all signed-rank ps > .05). These analyses show that the association between MDD and reinforcement learning parameters in Mukherjee, Filipowicz, et al. (2020) are unlikely to be confounded by MDD treatment, though prospective investigation of the effects of medication treatment on reinforcement learning parameters in depressed individuals is needed (Lan & Browning, 2022).

Future Directions

The findings above regarding reinforcement learning and depression are promising, but several gaps in our understanding remain. We see three particularly important directions that merit further investigation. First, future research should use more advanced reinforcement learning models to characterize how depression impacts reward and punishment learning. Simple reinforcement learning models such as the ones we have discussed so far posit that learning occurs exclusively through direct feedback and are often referred to as model-free reinforcement learning. On the other hand, model-based reinforcement learning assumes that we accumulate knowledge about the general structure of the environment and use this knowledge flexibly to infer value when planning and selecting actions. The relative contribution of model-free and model-based learning is often thought of as a marker of cognitive complexity, and some evidence suggests that depression may bias people toward more model-free learning strategies (Blanco et al., 2013). However, another study, using a large-scale online assessment of psychiatric symptoms, found that obsessive compulsive disorder symptomology is more predictive of a decrease in model-based behavior than depression is (Gillan et al., 2016). Other researchers have proposed that depression may be associated specifically with avoidance biases in model-based planning (Lally et al., 2017). Beyond the model-based versus model-free distinction, another idea is that mood, including depressed mood, is a representation of recent trends in the environment, referred to as momentum, which can be used to improve learning in more sophisticated reinforcement learning algorithms (Bennett et al., 2022). Thus, an important question for computational psychiatry going forward is how people’s mental models of the world may be affected by depression and how this could impact how depressed people learn.

Second, further research should better parse the relationship between reinforcement learning deficits and specific symptoms. Some of the work above suggests that decreased learning rate or decreased feedback sensitivity is associated with specific symptom dimensions of depression, such as high levels of anhedonia (Huys et al., 2013). Other work suggests that reinforcement learning processes are affected by symptoms of anxiety, which are highly comorbid with depression (Gillan et al., 2016). Thus, another important question is the sensitivity and specificity of reinforcement learning deficits across symptoms. Addressing this question might shed light on the heterogeneity of depression—two people may be diagnosed with MDD and not have a single overlapping clinical symptom—and reinforcement learning may prove useful to characterize subtypes of depression.

A third area for future investigation is targeting reinforcement learning processes for treatment. If depression symptoms are associated with features of reinforcement learning, then treatment-related symptom changes should be associated with changes in learning rate and reward sensitivity. In a recent mixed cross-sectional cohort study, participants with MDD completed a probabilistic learning task during functional MRI before and after cognitive behavioral therapy (CBT). Computational models applied to the data uncovered associations between parameter estimates and depression symptoms, but importantly, symptom improvement following CBT was associated with the normalization of learning parameters (Brown et al., 2021). This is exciting because the association of reinforcement learning parameters to symptoms of depression suggests novel therapeutic interventions could be developed that target and improve reinforcement learning processes to treat depression.

These areas for future research will help to determine the extent to which decision science advances our characterization of depression in two related yet distinct senses. It is possible that decision science could ultimately contribute to better measures of depression, or aspects of depression, that avoid key biases in self-reports, even if the theoretical implications of these measures for understanding depression remain unclear. It also possible that decision science could become an extremely useful theoretical framework for understanding depression, even if the psychometric properties of the most theoretically insightful paradigms are modest. Only future research can fully resolve whether one or both of these twin potentials of decision science—for better measurement and for better theoretical understanding of depression—is fulfilled.

Footnotes

Transparency

Action Editor: Teresa A. Treat

Editor: Robert L. Goldstone

ORCID iDs

Dahlia Mukherjee

Camilla van Geen

References

Bartra

McGuire

J. T.

Kable

J. W.

(2013). The valuation system: A coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. NeuroImage, 76, 412–427. https://doi.org/10.1016/j.neuroimage.2013.02.063

Bennett

Davidson

Niv

(2022). A model of mood as integrated advantage. Psychological Review, 129(3), 513–541.

Berwian

I. M.

Wenzel

J. G.

Collins

A. G.

Seifritz

Stephan

K. E.

Walter

Huys

Q. J.

(2020). Computational mechanisms of effort and reward decisions in patients with depression and their association with relapse after antidepressant discontinuation. JAMA Psychiatry, 77(5), 513–522.

Blanco

N. J.

Otto

A. R.

Maddox

W. T.

Beevers

C. G.

Love

B. C.

(2013). The influence of depression symptoms on exploratory decision-making. Cognition, 129(3), 563–568.

Brown

V. M.

Zhu

Solway

Wang

J. M.

McCurry

K. L.

King-Casas

Chiu

P. H.

(2021). Reinforcement learning disruptions in individuals with depression and sensitivity to symptom change following cognitive behavioral therapy. JAMA Psychiatry, 78(10), 1113–1122.

Chase

H. W.

Frank

M. J.

Michael

Bullmore

E. T.

Sahakian

B. J.

Robbins

T. W.

(2010). Approach and avoidance learning in patients with major depression and healthy controls: Relation to anhedonia. Psychological Medicine, 40(3), 433–440.

Chung

Kadlec

Aimone

J. A.

McCurry

King-Casas

Chiu

P. H.

(2017). Valuation in major depression is intact and stable in a non-learning environment. Scientific Reports, 7(1), Article 44374. https://doi.org/10.1038/srep44374

Delgado

P. L.

(2000). Depression: The case for a monoamine deficiency. Journal of Clinical Psychiatry, 61(6), 7–11.

Dombrovski

A. Y.

Clark

Siegle

G. J.

Butters

M. A.

Ichikawa

Sahakian

B. J.

Szanto

(2010). Reward/punishment reversal learning in older suicide attempters. American Journal of Psychiatry, 167(6), 699–707.

10.

Eckstein

M. K.

Master

S. L.

Xia

Dahl

R. E.

Wilbrecht

Collins

A. G.

(2022). The interpretation of computational model parameters depends on the context. eLife, 11, Article e75474. https://doi.org/10.7554/eLife.75474

11.

Eshel

Roiser

J. P.

(2010). Reward and punishment processing in depression. Biological Psychiatry, 68(2), 118–124.

12.

Gillan

C. M.

Kosinski

Whelan

Phelps

E. A.

Daw

N. D.

(2016). Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife, 5, Article e11305. https://doi.org/10.7554%2FeLife.11305

13.

Gradin

V. B.

Kumar

Waiter

Ahearn

Stickle

Milders

Reid

Hall

Steele

J. D.

(2011). Expected value and prediction error abnormalities in depression and schizophrenia. Brain, 134(6), 1751–1764.

14.

Halahakoon

D. C.

Kieslich

O’Driscoll

Nair

Lewis

Roiser

J. P.

(2020). Reward-processing behavior in depressed participants relative to healthy volunteers: A systematic review and meta-analysis. JAMA Psychiatry, 77(12), 1286–1295. https://doi.org/10.1001/jamapsychiatry.2020.2139

15.

Hershenberg

Satterthwaite

T. D.

Daldal

Katchmar

Moore

T. M.

Kable

J. W.

Wolf

D. H.

(2016). Diminished effort on a progressive ratio task in both unipolar and bipolar depression. Journal of Affective Disorders, 196, 97–100. https://doi.org/10.1016/j.jad.2016.02.003

16.

Huys

Q. J. M.

Maia

T. V.

Frank

M. J.

(2016). Computational psychiatry as a bridge from neuroscience to clinical applications. Nature Neuroscience, 19(3), 404–413.

17.

Huys

Q. J. M.

Pizzagalli

D. A.

Bogdan

Dayan

(2013). Mapping anhedonia onto reinforcement learning: A behavioural meta-analysis. Biology of Mood & Anxiety Disorders, 3(1), Article 12. https://doi.org/10.1186/2045-5380-3-12

18.

Kable

J. W.

Glimcher

P. W.

(2007). The neural correlates of subjective value during intertemporal choice. Nature Neuroscience, 10(12), 1625–1633. https://doi.org/10.1038/nn2007

19.

Kessler

R. C.

Chiu

W. T.

Demler

Walters

E. E.

(2005). Prevalence, severity, and comorbidity of 12-month DSM-IV disorders in the National Comorbidity Survey Replication. Archives of General Psychiatry, 62(6), 617–627.

20.

Kumar

Goer

Murray

Dillon

D. G.

Beltzer

M. L.

Cohen

A. L.

Brooks

N. H.

Pizzagalli

D. A.

(2018). Impaired reward prediction error encoding and striatal-midbrain connectivity in depression. Neuropsychopharmacology, 43(7), 1581–1588. https://doi.org/10.1038/s41386-018-0032-x

21.

Lally

Huys

Q. J. M.

Eshel

Faulkner

Dayan

Roiser

J. P.

(2017). The neural basis of aversive Pavlovian guidance during planning. Journal of Neuroscience, 37(42), 10215–10229.

22.

Lan

D. C. L.

Browning

(2022). What can reinforcement learning models of dopamine and serotonin tell us about the action of antidepressants? Computational Psychiatry, 6(1), 166–188.

23.

Lempert

K. M.

Pizzagalli

D. A.

(2010). Delay discounting and future-directed thinking in anhedonic individuals. Journal of Behavior Therapy and Experimental Psychiatry, 41(3), 258–264.

24.

Levy

Snell

Nelson

A. J.

Rustichini

Glimcher

P. W.

(2010). Neural representation of subjective value under risk and ambiguity. Journal of Neurophysiology, 103(2), 1036–1047.

25.

McGuire

J. T.

Kable

J. W.

(2012). Decision makers calibrate behavioral persistence on the basis of time-interval experience. Cognition, 124(2), 216–226.

26.

Michely

Eldar

Erdman

Martin

I. M.

Dolan

R. J.

(2022). Serotonin modulates asymmetric learning from reward and punishment in healthy human volunteers. Communications Biology, 5(1), Article 812. https://doi.org/10.1038/s42003-022-03690-5

27.

Moutoussis

Rutledge

R. B.

Prabhu

Hrynkiewicz

Lam

Ousdal

O.-T.

Guitart-Masip

Fonagy

Dolan

R. J.

(2018). Neural activity and fundamental learning, motivated by monetary loss and reward, are intact in mild to moderate major depressive disorder. PLOS ONE, 13(8), Article e0201451. https://doi.org/10.1371/journal.pone.0201451

28.

Mukherjee

Filipowicz

A. L.

Satterthwaite

T. D.

Kable

J. W.

(2020). Reward and punishment reversal-learning in major depressive disorder. Journal of Abnormal Psychology, 129(8), 810–823.

29.

Mukherjee

Kable

J. W.

(2014). Value-based decision making in mental illness: A meta-analysis. Clinical Psychological Science, 2(6), 767–782. https://doi.org/10.1177/2167702614531580

30.

Mukherjee

Lee

Kazinka

R. D.

Satterthwaite

Kable

J. W.

(2020). Multiple facets of value-based decision making in major depressive disorder. Scientific Reports, 10(1), Article 3415. https://doi.org/10.1038/s41598-020-60230-z

31.

Pike

A. C.

Robinson

O. J.

(2022). Reinforcement learning in patients with mood and anxiety disorders vs control individuals: A systematic review and meta-analysis. JAMA Psychiatry, 79(4), 313–322. https://doi.org/10.1001/jamapsychiatry.2022.0051

32.

Roepke

A. M.

Seligman

M. E. P.

(2016). Depression and prospection. British Journal of Clinical Psychology, 55(1), 23–48. https://doi.org/10.1111/bjc.12087

33.

Rupprechter

Stankevicius

Huys

Q. J. M.

Steele

J. D.

Seriès

(2018). Major depression impairs the use of reward values for decision-making. Scientific Reports, 8(1), Article 13798. https://doi.org/10.1038/s41598-018-31730-w

34.

Rutledge

R. B.

Moutoussis

Smittenaar

Zeidman

Taylor

Hrynkiewicz

Lam

Skandali

Siegel

J. Z.

Ousdal

O. T.

Prabhu

Dayan

Fonagy

Dolan

R. J.

(2017). Association of neural and emotional impacts of reward prediction errors with major depression. JAMA Psychiatry, 74(8), 790–797.

35.

Strunk

D. R.

Adler

A. D.

(2009). Cognitive biases in three prediction tasks: A test of the cognitive model of depression. Behaviour Research and Therapy, 47(1), 34–40.

36.

Sutton

R. S.

Barto

A. G.

(1998). Reinforcement learning: An introduction (Vol. 1). MIT Press.

37.

Tsypes

Szanto

Bridge

J. A.

Brown

V. M.

Keilp

J. G.

Dombrovski

A. Y.

(2022). Delay discounting in suicidal behavior: Myopic preference or inconsistent valuation? Journal of Psychopathology and Clinical Science, 131(1), 34–44. https://doi.org/10.1037/abn0000717

38.

van Geen

Gerraty

R. T.

(2021). Hierarchical Bayesian models of reinforcement learning: Introduction and comparison to alternative methods. Journal of Mathematical Psychology, 105, 102602.

39.

Vandendriessche

Demmou

Bavard

Yadak

Lemogne

Mauras

Palminteri

(2023). Contextual influence of reinforcement learning performance of depression: Evidence for a negativity bias? Psychological Medicine, 53(10), 4696–4706.

40.

Zhang

Szanto

Clark

Dombrovski

A. Y.

(2019). Behavioral empathy failures and suicidal behavior. Behaviour Research and Therapy, 120, Article 103329. https://doi.org/10.1016/j.brat.2018.10.019

Leveraging Decision Science to Characterize Depression

Abstract

Keywords

Decision-Making in Depression Across Domains

Modeling Reinforcement Learning in Depression

Future Directions

Recommended Reading

Footnotes

Transparency

ORCID iDs

References