Abstract
Reliability of clinical diagnoses is often low. Many algorithms could improve diagnostic accuracy, and statistical learning is becoming popular. Using pediatric bipolar disorder as a clinically challenging example, we evaluated a series of increasingly complex models, ranging from simple screening to supervised LASSO (least absolute shrinkage and selection operator) regression, in a large (N = 550) academic clinic sample. We then externally validated the models in a community clinic (N = 511) using the same candidate predictors and semistructured interview diagnoses, which provided high methodological consistency; the two clinics nonetheless differed substantially in demography and referral patterns. Models performed well according to internal validation metrics, but complex models degraded rapidly when externally validated. Naive Bayesian and logistic models concentrating on predictors identified in prior meta-analyses tied or bettered LASSO models when externally validated. Implementing these methods would improve clinical diagnostic performance. Statistical learning research should continue to invest in high-quality indicators and diagnoses to supervise model training.
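As a rough illustration of the naive Bayesian approach mentioned above, the sketch below updates a clinic base rate with diagnostic likelihood ratios, treating the predictors as conditionally independent. The function name `posterior_probability` and all numeric inputs are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
from math import prod


def posterior_probability(base_rate, likelihood_ratios):
    """Naive Bayes update on the odds scale.

    Converts the base rate to prior odds, multiplies by each diagnostic
    likelihood ratio (assumes the predictors are conditionally
    independent), and converts the result back to a probability.
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    prior_odds = base_rate / (1.0 - base_rate)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs for illustration only; not values from the study.
example = posterior_probability(base_rate=0.10, likelihood_ratios=[5.0, 2.0])
print(round(example, 3))
```

Because the calculation needs only a base rate and a short list of likelihood ratios, it can be done by hand or with a nomogram, which is one reason a simple model of this kind is easy to carry from one clinic to another.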
