Abstract

The use of machine learning is increasing in clinical psychology, yet it is unclear whether these approaches enhance the prediction of clinical outcomes. Several studies show that machine-learning algorithms outperform traditional linear models. However, many of the studies reporting such an advantage pair the same algorithm, random forests, with the same internal-validation method, the optimism-corrected bootstrap. Through both a simulation and an empirical example, we demonstrate that pairing nonlinear, flexible machine-learning approaches such as random forests with the optimism-corrected bootstrap provides highly inflated prediction estimates. We find no advantage for properly validated machine-learning models over linear models.
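The inflation described above can be illustrated with a minimal sketch. This is not the authors' simulation: it assumes pure-noise predictors, scikit-learn's `RandomForestClassifier`, AUC as the performance metric, and hypothetical helper names (`make_noise_data`, `optimism_corrected_auc`, `cross_validated_auc`) chosen here for illustration. Because the outcome is unrelated to the predictors, an honest estimate should sit near 0.5. Each bootstrap sample shares roughly 63% of its rows with the original data, so a model that memorizes its training rows is "tested" largely on rows it has already seen, and the estimated optimism comes out too small.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict


def make_noise_data(n=200, p=20, seed=0):
    """Pure-noise data (an assumption of this sketch): X says nothing about y."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    y = rng.integers(0, 2, size=n)
    return X, y


def fit_forest(X, y, seed):
    """Fit a small random forest; hyperparameters are illustrative only."""
    return RandomForestClassifier(n_estimators=50, random_state=seed).fit(X, y)


def auc(model, X, y):
    """AUC of the model's predicted probabilities for class 1."""
    return roc_auc_score(y, model.predict_proba(X)[:, 1])


def optimism_corrected_auc(X, y, n_boot=20, seed=0):
    """Apparent AUC minus the average bootstrap estimate of optimism."""
    rng = np.random.default_rng(seed)
    # Apparent performance: fit and evaluate on the same full data set.
    apparent = auc(fit_forest(X, y, seed), X, y)
    optimism = []
    for b in range(n_boot):
        # Resample rows with replacement and refit on the bootstrap sample.
        idx = rng.integers(0, len(y), size=len(y))
        model = fit_forest(X[idx], y[idx], seed + b + 1)
        # Optimism: performance on the bootstrap sample minus performance
        # on the original data, which overlaps heavily with that sample.
        optimism.append(auc(model, X[idx], y[idx]) - auc(model, X, y))
    return apparent - float(np.mean(optimism))


def cross_validated_auc(X, y, seed=0):
    """5-fold cross-validated AUC: every row is scored by a model that never saw it."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    return roc_auc_score(y, proba)


if __name__ == "__main__":
    X, y = make_noise_data()
    print(f"optimism-corrected AUC: {optimism_corrected_auc(X, y):.2f}")
    print(f"5-fold CV AUC:          {cross_validated_auc(X, y):.2f}")
```

Under these assumptions, the optimism-corrected estimate should land well above chance while the cross-validated estimate stays close to 0.5. Exact values depend on the seed, the sample size, and the forest settings.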

Keywords

machine learning, data mining, prediction, clinical psychology, suicide

Evidence of Inflated Prediction Performance: A Commentary on Machine Learning and Suicide Research
