Abstract
Background
Speech abnormalities are recognized as early indicators of Alzheimer's disease (AD) and mild cognitive impairment (MCI).
Objective
To determine whether deep-learning models trained on mel-spectrograms of brief speech tasks can (i) discriminate individuals with MCI and AD from cognitively normal controls (NC) and (ii) estimate cognitive status with clinically useful accuracy.
Methods
Speech from 594 participants (185 NC, 231 MCI, 178 AD) was recorded through a mobile application comprising 11 cognitive-linguistic tasks. Audio was converted into mel-spectrogram images and processed with a VGG16-based deep-learning model using transfer learning and fine-tuning of block 5. Task-specific feature vectors were extracted, concatenated, and used to train a deep neural network. The dataset was split into training, validation, and test sets (3:1:1), and five-fold cross-validation was performed.
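The first preprocessing step described above, converting raw audio into a mel-spectrogram image, can be sketched in plain NumPy. This is an illustrative reimplementation, not the authors' pipeline; the frame size, hop length, and number of mel bands below are assumed values chosen for the example (toolkits such as librosa provide equivalent, production-ready routines).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(y, sr=16000, n_fft=512, hop=256, n_mels=64):
    # Frame the signal, apply a Hann window, take the magnitude STFT,
    # project onto mel bands, and convert to a dB-scaled image
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return 10.0 * np.log10(np.maximum(mel, 1e-10)).T  # (n_mels, n_frames)

# Example: 1 s of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
S = mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr=sr)
print(S.shape)  # (64, 61)
```

The resulting 2-D array can be rendered as an image and fed to a pretrained image backbone such as VGG16, which is the transfer-learning route the Methods describe.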
Results
The model achieved an overall accuracy of 72.4% in discriminating NC from the combined impaired group (MCI and AD), with sensitivity of 72.5%, specificity of 72.2%, a balanced accuracy of 72.4%, and an AUC of 0.997. In binary classifications, the model achieved 82.9% accuracy (balanced accuracy 82.9%, AUC 0.992) for NC versus AD, 70.7% accuracy (balanced accuracy 70.3%, AUC 0.956) for NC versus MCI, and 77.5% accuracy (balanced accuracy 78.9%, AUC 0.889) for MCI versus AD. Serial subtraction, storytelling, and picture description contributed most to classification performance, indicating their effectiveness in capturing cognitive deficits.
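The metrics reported above follow standard definitions. As a reference for how sensitivity, specificity, and balanced accuracy relate in a binary screen like NC versus the impaired group, here is a minimal sketch on toy labels (the label convention, 1 = impaired and 0 = NC, is an assumption for the example, not taken from the paper):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and balanced accuracy (1 = impaired, 0 = NC)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sens = tp / (tp + fn)  # recall on the impaired group
    spec = tn / (tn + fp)  # recall on the NC group
    return sens, spec, (sens + spec) / 2.0

# Toy check: 8 impaired and 8 NC cases, two errors on each side
y_true = [1] * 8 + [0] * 8
y_pred = [1] * 6 + [0] * 2 + [0] * 6 + [1] * 2
sens, spec, bal = binary_metrics(y_true, y_pred)
print(sens, spec, bal)  # 0.75 0.75 0.75
```

Balanced accuracy is the mean of sensitivity and specificity, which is why it can differ from plain accuracy whenever the two classes are not equally sized, as in the MCI-versus-AD comparison above.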
Conclusions
Mel-spectrogram-based deep-learning analysis of speech shows promise as a rapid, non-invasive, and language-independent screening tool for early cognitive impairment, with potential advantages over traditional assessments such as the Mini-Mental State Examination.
