Abstract
Background
Most common forms of dementia, including Alzheimer's disease (AD), are associated with alterations in spoken language.
Objective
This study explores the potential of a speech-based machine learning (ML) approach for estimating cognitive impairment from speech audio recordings.
Methods
We developed an automatic ML pipeline that ingests multimodal inputs of audio and transcribed text, mapping speech and language to domain-specific biomarkers optimized for explainability and predictive power. The resulting features are fed through a multi-stage pipeline to identify efficient classification configurations.
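The abstract does not specify the biomarkers or models used; the following is a minimal, hypothetical sketch of the featurization stage, mapping one multimodal sample (audio plus transcript) to a few named, interpretable features. All feature definitions here are illustrative assumptions, not the paper's actual feature set.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class SpeechSample:
    """A hypothetical multimodal sample: an amplitude envelope plus its transcript."""
    audio: List[float]
    transcript: str

def acoustic_features(audio: List[float], silence_thresh: float = 0.1) -> Dict[str, float]:
    """Illustrative acoustic biomarkers: pause ratio and mean energy."""
    silent = sum(1 for a in audio if abs(a) < silence_thresh)
    return {
        "pause_ratio": silent / len(audio),
        "mean_energy": sum(abs(a) for a in audio) / len(audio),
    }

def linguistic_features(transcript: str) -> Dict[str, float]:
    """Illustrative linguistic biomarkers: lexical diversity and mean word length."""
    words = transcript.lower().split()
    return {
        "type_token_ratio": len(set(words)) / len(words),
        "mean_word_len": sum(len(w) for w in words) / len(words),
    }

def featurize(sample: SpeechSample) -> Dict[str, float]:
    """Map one multimodal sample to named, interpretable features."""
    return {**acoustic_features(sample.audio),
            **linguistic_features(sample.transcript)}

sample = SpeechSample(audio=[0.0, 0.5, 0.02, 0.4, 0.0, 0.3],
                      transcript="the cat sat on the mat")
feats = featurize(sample)
```

A downstream classifier (the abstract does not name one) would consume `feats`; keeping each feature named and interpretable is what supports the explainability goal stated above.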
Results
We evaluated the system on large real-world datasets, achieving weighted average F1 scores above 90% and 70% for the two-class (AD versus normal controls) and three-class (AD versus mild cognitive impairment [MCI] versus normal controls) classification tasks, respectively. Model performance remained stable across different population characteristics.
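The reported metric, the weighted average F1 score, averages per-class F1 scores weighted by each class's support (its share of true samples). A self-contained sketch of the computation (the class labels in the example are illustrative):

```python
from collections import Counter
from typing import List

def weighted_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    total = 0.0
    for c in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        # Weight each class's F1 by its share of the true labels.
        total += (support[c] / len(y_true)) * f1
    return total

score = weighted_f1(["AD", "AD", "NC", "NC"], ["AD", "NC", "NC", "NC"])
```

Unlike the unweighted (macro) average, this weighting keeps the score representative when the diagnostic classes are imbalanced, as is typical in clinical cohorts.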
Conclusions
The study introduces a robust, non-invasive method for gauging the cognitive status of AD and MCI patients from speech samples, with the potential to generalize effectively to other diseases and disorders that impair language.
