Abstract
Background
Speech abnormalities are recognized as early indicators of Alzheimer's disease (AD) and mild cognitive impairment (MCI).
Objective
To determine whether deep-learning models trained on mel-spectrograms of brief speech tasks can (i) discriminate individuals with MCI and AD from cognitively normal controls (NC) and (ii) estimate cognitive status with clinically useful accuracy.
Methods
Speech from 594 participants (185 NC, 231 MCI, 178 AD) was recorded through a mobile application comprising 11 cognitive-linguistic tasks. Audio was converted into mel-spectrogram images and processed with a VGG16-based deep-learning model using transfer learning and fine-tuning of block 5. Task-specific feature vectors were extracted, concatenated, and used to train a deep neural network. The dataset was split into training, validation, and test sets (3:1:1), and five-fold cross-validation was performed.
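The first preprocessing step described above, converting raw audio into a mel-spectrogram image, can be sketched in plain NumPy. This is an illustrative reimplementation, not the authors' pipeline; the frame size, hop length, and number of mel bands below are assumed values chosen for the example (toolkits such as librosa provide equivalent, production-ready routines).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(y, sr=16000, n_fft=512, hop=256, n_mels=64):
    # Frame the signal, apply a Hann window, take the magnitude STFT,
    # project onto mel bands, and convert to a dB-scaled image
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return 10.0 * np.log10(np.maximum(mel, 1e-10)).T  # (n_mels, n_frames)

# Example: 1 s of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
S = mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr=sr)
print(S.shape)  # (64, 61)
```

The resulting 2-D array can be rendered as an image and fed to a pretrained image backbone such as VGG16, which is the transfer-learning route the Methods describe.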
Results
The model achieved an overall accuracy of 72.4% in discriminating NC from the combined impaired group (MCI and AD), with sensitivity of 72.5%, specificity of 72.2%, a balanced accuracy of 72.4%, and an AUC of 0.997. In binary classifications, the model achieved 82.9% accuracy (balanced accuracy 82.9%, AUC 0.992) for NC versus AD, 70.7% accuracy (balanced accuracy 70.3%, AUC 0.956) for NC versus MCI, and 77.5% accuracy (balanced accuracy 78.9%, AUC 0.889) for MCI versus AD. Serial subtraction, storytelling, and picture description contributed most to classification performance, indicating their effectiveness in capturing cognitive deficits.
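The metrics reported above follow standard definitions. As a reference for how sensitivity, specificity, and balanced accuracy relate in a binary screen like NC versus the impaired group, here is a minimal sketch on toy labels (the label convention, 1 = impaired and 0 = NC, is an assumption for the example, not taken from the paper):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and balanced accuracy (1 = impaired, 0 = NC)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sens = tp / (tp + fn)  # recall on the impaired group
    spec = tn / (tn + fp)  # recall on the NC group
    return sens, spec, (sens + spec) / 2.0

# Toy check: 8 impaired and 8 NC cases, two errors on each side
y_true = [1] * 8 + [0] * 8
y_pred = [1] * 6 + [0] * 2 + [0] * 6 + [1] * 2
sens, spec, bal = binary_metrics(y_true, y_pred)
print(sens, spec, bal)  # 0.75 0.75 0.75
```

Balanced accuracy is the mean of sensitivity and specificity, which is why it can differ from plain accuracy whenever the two classes are not equally sized, as in the MCI-versus-AD comparison above.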
Conclusions
Mel-spectrogram-based deep-learning analysis of speech shows promise as a rapid, non-invasive, and language-independent screening tool for early cognitive impairment, with potential advantages over traditional assessments such as the Mini-Mental State Examination.
