Sage Journals: Discover world-class research

Abstract

Objectives

This study aims to fulfill the current research gap pertaining to the lack of a comprehensive systematic mapping study (SMS) regarding the effectiveness of artificial intelligence (AI) machine learning algorithms in the diagnosis of speech and language disorders (SLDs), and the extent to which such AI algorithms can automate this diagnostic process.

Methods

An SMS has been implemented following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines; 19,774 research papers were screened resulting in 70 studies meeting purposely designed inclusion and exclusion criteria.

Results

The findings revealed multiple research gaps including substantial divide in the application of AI machine learning algorithms for diagnosing SLDs, where 91.43% versus 8.57% of the studies relating to SLDs, respectively. This is further exacerbated by the absence AI machine learning algorithms for diagnosing prevalent language disorders, such as developmental language disorders in children. Furthermore, most AI machine learning algorithms for diagnosing SLDs are focused on binary classification of these disorders, for example, healthy and pathological voices, but not providing detailed diagnostics, such as the impaired aspects and contextual SLDs severity. Finally, AI machine learning algorithms have predominantly focused on partially automating the SLD assessment phase of the diagnostic process (76%) compared to those that have extended the automation to partially include the diagnosis determination phase (24%).

Conclusion

The effectiveness of AI machine learning algorithms in automating SLD diagnosis cannot be claimed without larger population datasets, highlighting a research gap for developing AI models to automate the four phases of the SLD diagnostic process and link them to treatment protocols in clinical settings.

Keywords

speech disorder language disorder AI algorithms systematic mapping study PRISMA SLD SLD diagnosis SLD severity SLD treatment

Introduction

Speech and language disorders (SLDs) affect significantly a variety of populations impacting their communication abilities, and resulting in negative long-term difficulties in academic performance, social well-being status, and mental health status; thus, impacting their quality of life.¹ According to the National Institute on Deafness and Other Communication Disorders, developmental language disorder (also known as specific language impairment) has a prevalence of 7% among children, while speech sound disorders (articulation disorders or phonological disorders) have a prevalence of 8–9% in young children.²

An informative assessment process of SLDs that results in highly accurate diagnosis should adhere to five key principles: validity, reliability, tailored to the individual patient, use variety of assessment modalities, and thoroughness. Usually, clinicians use a combination of methods to assess these types of disorders including norm-referenced tests which are standardized tests that allow the clinician to compare the patient's performance to a larger group that is the normative group. Criterion-referenced tests inform clinicians to reveal what a patient can do and cannot do based on a predefined criterion, whereas authentic assessment tests (though similar to criterion-referenced tests) depend on the assessment of the contextual assessment of these disorders. Although these methods of assessment are valuable for the diagnosis of SLDs, they have some limitations such as lack of individualization in the norm-referenced and criterion-referenced tests; however, contextual testing of these disorders is not representative of real-life situations as it only evaluates isolated skills without taking into account other contributing factors. Furthermore, authentic assessment can lack objectivity, reliability, efficiency, and validity, and therefore may require extensive planning and experience.³

Given these limitations, there was a growing need to develop and innovate diagnostic approaches that consider these limitations and improve the diagnostic process of SLDs in terms of efficiency, reliability, and validity. One of the responses to this need is the utilization of artificial intelligence (AI) techniques, which have been integrated in the diagnosis of SLDs for almost two decades. Artificial intelligence techniques have been increasingly utilized in various healthcare domains, including medical imaging, diagnostic services, virtual patient care, medical search and drug discovery, and patients’ rehabilitation.⁴ In the field of speech and language patient journey, AI has been increasingly utilized, and in particular in the form of incorporating automatic speech recognition systems in augmentative and alternative communication devices, and in the evaluation of disorders such as voice and speech issues in head and neck cancer patients.^5,6

However, several questions remain unanswered regarding the use of AI in diagnosing SLDs, including which disorders have been assessed and diagnosed using AI, which AI algorithms and techniques have been employed in the diagnosis process and diagnosis, and the performance and effectiveness of these algorithms and techniques.

Additionally, the current literature lacks a comprehensive systematic mapping study (SMS) regarding the effectiveness of AI machine learning algorithms in the diagnosis of SLDs, and the extent to which such AI algorithms can automate this diagnostic process. The literature includes one publication namely Brahmi et al.,⁷ among the 70 publications of the SMS conducted in this research as can be seen in “Discussion” section. Although this study identified speech disorders that were investigated in light of diagnosing and treating processes (combined), the Brahmi et al. included diseases that are not classified as a speech disorder, such as the Parkinson's disease,⁸ facial palsy,⁹ dysphagia,¹⁰ and aphasia.¹¹ Additionally, in the Brahmi et al. study, AI machine learning algorithms were not identified in relation to their role in the diagnosis, treatment, or treatment and diagnosis stages, but rather in general. Also, their study does not reflect on the extent of automating the diagnosis process of speech disorders. Finally, the Brahmi et al. study does not include systematic review of the role of AI machine language algorithms in the diagnosis of language disorders.

Therefore, this SMS aims to address these questions and thus to fulfill the research gap pertaining to the utilization of AI in the diagnosis of SLDs through a formal methodological review. This study reports on the different types of SLDs that have been assessed and diagnosed using AI and machine learning algorithms, and hence informing the effectiveness of these algorithms in the diagnosis of SLDs. Finally, this study evaluates the extent to which AI and machine learning algorithms have automated the task of diagnosing SLDs.

To achieve the present study, aim and objectives, a systematic mapping literature review has been conducted regarding the use and effectiveness of AI in the diagnosis of SLDs in accordance with the guidelines outlined in the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement for systematic reviews.¹²

Background

Speech disorder or speech impairment is a condition where a person encounters difficulty in producing sounds and that it affects communication with others.¹³ Speech disorders include the following:

Speech sound disorders: These are disorders that interfere with the ability to produce sounds clearly. The assessment of these disorders relies on guiding the client to speak a list of words, and the speech-language pathologist examines the client's oral production, and determines whether there were any errors in the production of different speech sounds. The decision is highly dependent on the therapist's experience, making the diagnostic process highly subjective.¹⁴ Speech sound disorders include the following categories¹³:

Articulation disorder: This relates to the difficulty of producing speech sounds because of the inaccurate speed, pressure, placement, timing, or the flow of movement of tongue, lips or throat.^15,16 However, the diagnosis of articulation disorder is very difficult and relies on the expertise of the speech-language pathologist.¹⁶

Phonological disorder: Rule-based errors that affect multiple speech sounds within the same class.¹⁷

Neuromotor speech disorders/motor speech disorders: Disorders characterized with the difficulties to move the muscles responsible for speech production because of muscle weakness or reduced coordination.¹⁸ These disorders could include the following:

Apraxia of speech/verbal apraxia: A disorder characterized with the difficulties in motor planning for speech production with the absence of muscle weakness and may be considered as developmental, it is known as Childhood Apraxia of Speech or Acquired.^3,18

Dysarthria: A disorder characterized with the difficulties of speech production due to the presence of muscle weakness. Dysarthria has multiple types according to the site of lesion including flaccid, spastic, unilateral upper motor neuron, hypokinetic, hyperkinetic, ataxic, and mixed dysarthria.^3,18 However, the examination of this disorder is subjective and is based on the therapist's perceptual analysis capabilities and experience. Thus, the diagnostic results can be inconsistent and unquantifiable.¹⁹

Fluency disorders: It is the interruption in the flow of speech that may include atypical disfluencies, atypical rhythm of speech, or atypical rate of speech.²⁰ Fluency disorders include the following:

Stuttering/ stammering: A disorder that causes interruption in the flow of speech because of certain types of disfluencies which may include repetitions, prolongations, or blocks.²⁰ Stuttering could be developmental, acquired neurogenic, or acquired psychogenic.²¹ Traditional assessment of stuttering depends on counting the speech disfluencies manually, such method of assessment is arbitrary, incoherent, lengthy, and error-prone.²²

Cluttering: A disorder that is characterized by abnormal rapid speech rate, excessive disfluencies, decreased awareness of the fluency problem, and other characteristics that result in decreased speech intelligibility and clarity.²⁰

Voice disorders: Disorders characterized by a change in voice quality, pitch, and loudness that are not aligned with the individual's age, gender, cultural background, or geographical location.²³ These disorders include the following types:

Organic voice disorder: Voice disorders that result from physical or structural changes in the vocal tract.²⁴

Neurogenic voice disorder: Voice disorders that result from problems within the central or peripheral nervous system innervation to the larynx.²⁴

Functional voice disorder: Voice disorders that resulted due to the improper or insufficient use of the vocal tract with the absence of any structural of neurological abnormalities in the vocal tract. These disorders include muscle tension dysphonia and psychogenic voice disorder.^24,25

However, voice disorders require automated diagnostic methods, as the current diagnostic procedures lack standard methods and equipment.²⁶

Language disorders are disorders manifested by the impaired comprehension, and/ or use of the spoken language, written, or any other language models.²⁷ Language disorders include the following:

Spoken language disorders: Disorders characterized by continuous difficulties in the acquiring and using of speaking and listening skills in any of the following language aspects: phonology, morphology, semantics, syntax, and pragmatics.²⁸ These disorders include the following types:

Developmental language disorders: Also known as Specific Language Impairment, it is when the spoken language disorder is a primary disability without the presence of a known cause or medical condition, and that persists at school age and beyond.²⁸

Language disorders associated with [condition]: A spoken language disorder that is categorized as a secondary disorder due to the presence of another condition or diagnosis such as Autism.²⁸

Acquired neurogenic language disorder/aphasia: A disorder that has resulted from a brain damage that is associated with language comprehension or production and that has resulted in the loss of language function. Aphasia has several types according to the damaged area in the brain, including the following: Broca's Aphasia, Transcortical Motor Aphasia, Isolation Aphasia, Global Aphasia, Wernicke's Aphasia, Conduction Aphasia, Transcortical Sensory Aphasia, and Anomic Aphasia.³ However, accurately diagnosing aphasia is difficult and is “error prone medical task” due to linguistic ambiguity and uncertainty, diversity, and dissension in the description of aphasic symptoms, and the large number of test items.²⁹

Overview of diagnosing SLDs

The general diagnostic process of SLDs goes through four primary phases as represented in Figure 1:

Patient history intake:

The first phase of the diagnostic process includes collecting comprehensive background information about the patient through multiple means, such as written case history forms, interviews with the patient or their caregiver, or gathering information from other professionals that are involved in the patient's care journey. Additionally, this phase helps the professionals to identify potential contributing factors to the disorders, such as neurological factors, medical conditions, age and gender, and psychological factors.³

Assessment of disorders:

This phase usually starts with a screening procedure to detect the presence or absence of speech or language disorder. After screening, multiple formal and informal diagnostic tests that are specialized to the suspected disorder will be performed by the professionals. Additionally, this phase includes analyzing the tests results to determine the features of the disorder, severity of the disorder, and the stimulability enabling professionals to fully understand the patient's condition.³

Diagnosis determination:

After analyzing the assessment results, professionals will interpret the findings to decide the most relevant diagnosis.³

Providing information:

After determining the diagnosis, professionals convey the findings to the patient or to their caregiver through different ways such as through information-giving session, or writing a detailed assessment report to document the whole diagnostic process, with writing recommendations for the patient's care and management.³

Figure 1.

General overview on the diagnostic process of speech and language disorders.

Methods

This study employs a SMS to explore the effectiveness of AI in the diagnosis of SLDs. As the primary objective of this paper is to address evidence pertaining to specific research topic rather than addressing detailed research questions, systematic mapping methodology has been applied to this study rather systematic literature reviews (SLRs).

Systematic mapping studies are a broader form of SLRs which utilize a comprehensive collection of primary studies to meticulously pinpoint any existing gaps within specific research areas, while also identifying potential trends for further investigation.³⁰

This SMS has been applied with adhering to the guidelines outlined by Petersen et al.³¹ and that encompasses five steps as depicted in Figure 2.

Figure 2.

The systematic mapping research process applied to artificial intelligence (AI) in the diagnosis of speech and language disorders.

Defining the research questions

This SMS presents a detailed investigation into the utilization and effectiveness of AI methods (and algorithms) in the diagnosis of SLDs. Additionally, the study explores the extent to which the reported AI machine learning algorithms have automated the process of diagnosing SLDs, also the effectiveness of such automation. Accordingly, the following questions have been formulated to guide the application of the adapted SMS in context of the above aim of this research study:

RQ1: What are the different types of SLDs that have been assessed and diagnosed using AI machine learning algorithms?

RQ2: What are the AI and machine learning algorithms that have been utilized in the diagnosis of SLDs?

RQ3: To what extent have the AI and machine learning algorithms automated the task of diagnosing SLDs?

RQ4: How effective are the AI machine learning algorithms (reported in answering RQ2) in the diagnosis of SLDs?

Defining search and keywording criteria

A literature search was conducted to gather all studies relevant to the role of AI in the diagnosis of SLDs. Since the topic of this paper covers the three fields of “speech disorders,” “language disorders,” and “artificial intelligence,” the derived keywords have been semantically articulated as shown in Table 1. The Boolean operators “AND” and “OR” were used in forming the search strings, for example: “Speech Disorder” AND (“Artificial Intelligence” OR “Machine Learning” OR “Deep Learning” OR “Neural Networks” OR “Convolutional Neural Networks” OR “Recurrent Neural Networks” OR “Clustering Algorithms” OR “Support Vector Machine” OR “Automatic Speech Recognition System” OR “Natural Language Processing” OR Automatic OR Automated).

Table 1.

Keywords used in the application of AI in speech and language disorders diagnosis in systematic mapping study.

Categories	Search keywords
Speech disorders	Speech Disorder – Speech Impairment– Speech Disability – Speech Sound Disorder – Articulation Disorder – Phonological Disorder – Apraxia of Speech – Developmental Apraxia of Speech – Childhood Apraxia of Speech – Acquired Apraxia of Speech – Dysarthria – Flaccid Dysarthria – Spastic Dysarthria – Mixed Dysarthria – Ataxic Dysarthria – Hypokinetic Dysarthria – Hyperkinetic Dysarthria – Unilateral Upper Motor Neuron Dysarthria – Neuromotor Speech Disorder – Motor Speech Disorder – Acquired Neurogenic Speech Disorder – Speech Affected by Cleft Lip and Palate – Speech Affected by Deaf and Hard of Hearing – Fluency Disorder – Stuttering – Stammering – Developmental Stuttering – Psychogenic Stuttering – Neurogenic Stuttering – Cluttering – Voice Disorder – Organic Voice Disorder – Neurogenic Voice Disorder – Functional Voice Disorder – Muscle Tension Dysphonia – Psychogenic Voice Disorder
Language disorders	Language Disorder – Language Impairment– Language Disability – Spoken Language Disorder – Developmental Language Disorder – Specific Language Impairment – Language Disorder Associated with Autism – Language Disorder Associated with Intellectual Disability – Language Disorder Associated with Down Syndrome – Language Disorder Associated with Sensory Impairment – Acquired Neurogenic Language Disorder – Aphasia – Broca's Aphasia – Transcortical Motor Aphasia – Isolation Aphasia – Global Aphasia – Wernicke's Aphasia – Conduction Aphasia – Transcortical Sensory Aphasia – Anomic Aphasia
Artificial intelligence	Machine Learning – Deep Learning – Neural Networks – Convolutional Neural Networks – Recurrent Neural Networks – Clustering Algorithms – Support Vector Machines – Automatic Speech Recognition Systems – Natural Language Processing – Automatic – Automated – Artificial Intelligence – AI

Identifying the research literature repositories and associated research papers

The electronic search space for this SMS was determined by identifying the most prominent repositories (or databases) that encompass the diverse interests in the field of applying AI in the diagnosis of SLDs. The literature search spaced included the following Scopus and CrossRef indexed research repositories:

PubMed https://pubmed.ncbi.nlm.nih.gov/

Mendeleyhttps://www.mendeley.com/

IEEE https://www.ieee.or

ACM https://www.acm.org/

ScienceDirect https://www.sciencedirect.com/

Furthermore, only journal papers that are indexed in Scopus and identified as being relevant to the domain of AI in the diagnosis of SLDs were included. These journals are published in the English language, published in the period 2000–2024, as it appears that in the year 2000, the application of AI in SLDs assessment was first reported in the literature.³² The following journals were included in the systematic mapping literature search:

Journal of Communication Disorders

Journal of Biomedical Informatics

International Journal of Speech-Language Pathology

International Journal of Medical Informatics

Computers in Biology and Medicine

Diagnostics

Journal of Intelligent Systems

Neurocomputing

Computer Speech and Language

IEEE Access

International Journal of Speech Technology

Speech Communication

The Journal of the Acoustical Society of America

International Journal of Advanced Computer Science and Applications

Digital Signal Processing: A Review Journal

Journal of voice

Journal of Signal and Information Processing

International Journal of Electrical and Computer Engineering

Engineering Applications of Artificial Intelligence

IEEE Transactions on Audio, Speech and Language Processing

Psychology research and behavior management

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Journal of Speech, Language, and Hearing Research (JSLHR)

Screening research papers

The selection process of the included publications in the SMS is with adherence to the PRISMA protocol depicted in Figure 3. After identifying the possible papers from the selected databases, the papers were exported to the Rayyan Software, which is a free web application designed to help researchers with the screening process for literature review projects.³³ Duplicated papers (through the Rayyan Software) were automatically removed before starting the screening process. After resolving the duplicates, papers were screened manually based on their title, abstract, and keywords. The screening proceeded based on the exclusion and inclusion criteria articulated below. Finally, the obtained papers were screened in accordance to the included journals in this SMS. A final set of 70 papers obtained that were highly relevant from different databases and journals.

Figure 3.

Protocol-based paper selection process.

Inclusion and exclusion criteria

Inclusion criteria

Inclusion criterion 1:

Articles published in journals, conferences, or symposiums

Inclusion criterion 2:

Articles that focus on incorporating AI and machine learning algorithms and approaches in SLDs assessment and diagnosis since the early 2000s, when natural language processing and speech signal processing started to aid cognitive linguistic assessments.³⁴

Exclusion criteria

Exclusion criterion 1:

Publications that focus only on the application of AI and machine learning algorithms in SLDs therapy and rehabilitation.

Exclusion criterion 2:

Publications that focus on diseases not related to SLDs.

Exclusion criterion 3:

Publications with private access

Exclusion criterion 4:

Publications not published in the English language.

Data coding and synthesis

In this SMS, data coding and synthesis was performed by multiple reviewers utilizing predesigned data coding templates that have outlined specific variables to be extracted from the 70 included publications. This approach has resulted in translating the extracted outcomes into the presented Tables A1–A7 in Appendix A. Reviewers coded data independently and have collaborated in post reviewing the outcomes. In order to ensure consistency, reviewers regularly cross-checked their work, and any discrepancies were discussed and resolved through consensus meetings.

Analysis of results

Overview of studies

A total of 50,696 papers were imported after utilizing the searching keywords in the preselected repositories. After removing the duplicates and applying the inclusion and exclusion criteria that have been outlined previously in the Screening Research Papers Section, 50,626 papers have been excluded, and has resulted in including 70 papers in this systematic mapping review study. Additionally, Table 2 demonstrates the number of publications with their corresponding year of publication and their percentages, which has shown that the publications in the scope of the role of AI in diagnosing SLDs had a significant increase since 2018, with a notable rise in 2020 and 2022, when the number of papers in the targeted scope has nine papers. Figure 4 represents a reflection of the increase and decrease in the number of publications related to the targeted subject over the years.

Figure 4.

Number of publications in the role of artificial intelligence (AI) in diagnosing speech and language disorders per year.

Table 2.

Publication tendency of the role of AI in diagnosing speech and language disorders, sorted by year of publication.

Year of publication	References	Number of publications	Percentage (%)
2001	³⁵	1	1.42
2003	³⁶	1	1.42
2006	¹⁶	1	1.42
2007	^37,38	2	2.85
2008	^29,39	2	2.85
2009	^40–43	4	5.71
2010	^44,45	2	2.85
2011	^46–48	3	4.28
2012	⁴⁹	1	1.42
2013	^50–52	3	4.28
2014	⁵³	1	1.42
2016	^54–56	3	4.28
2017	^19,26,57	3	4.28
2018	^58–60	3	4.28
2019	^61–64	4	5.71
2020	^{14,22,65–71}	9	12.85
2021	^72–77	6	8.57
2022	^78–86	9	12.85
2023	^87–93	7	10.00
2024	^7,94–97	5	7.14

Furthermore, Figure 5 demonstrates the classification of the publications in accordance to their sources. The publications were classified into three types: journals, conferences, symposiums. Journals accounted for 50.00% of the total publication types, representing the most common type of publications. While conference publications accounted for 47.14%, whereas symposiums publications were the least common source of publications and accounted for 2.85% of the total publication types.

Figure 5.

Types of publications included in the systematic mapping study.

Answering the research questions

RQ1: What are the different types of SLDs that have been assessed and diagnosed using AI and machine learning algorithms?

Table 3 represents the distribution of SLDs that have been identified across the 70 publications included in this systematic mapping review study, in which the AI and machine learning algorithms have been utilized in diagnosing and assessing these disorders. Voice disorders emerge as a predominant area in the targeted subject of this study compromising 50.00% of the included publications, with a total of 35 publications. Stuttering follows voice disorders, represented by 10 publications, accounting for 15.71% of the included publications. Dysarthria accounts for eight publications, corresponding to 11.42% of the included publications. Aphasia and speech sound disorder each contribute equally to the included publications, with six publications accounting for 8.57% of the total included studies. Articulation disorder and speech disorders in children with cleft lip and palate each represent two publications, corresponding to 2.85% of the total publications. Lastly, multiple speech disorders are addressed in an only one single publication, representing 1.42% of the total publications.

Table 3.

Number of publications per type of speech or language disorder.

Type of speech or language disorder	Number of publications	References	Percentage (%)
Voice disorder	35	^{35,36,37,40,42,44–48,49,53–56,26,58,59,65,69,70,74,76,77,81,83,85,87,89–91,92,95,97}	50.00
Articulation disorder	2	^16,41	2.85
Aphasia	6	^{38,29,50,55,57,71}	8.57
Dysarthria	8	^{39,19,61,66,73,82,88,96}	11.42
Speech disorders in children with cleft lip and palate	2	^43,64	2.85
Stuttering	11	^{51,52,22,67,68,72,78,79,86,93,94}	15.71
Speech sound disorder	6	^{60,62,63,14,75,80}	8.57
Review on multiple disorders (aphasia, dysarthria, speech sound disorder, stuttering, voice disorder)	1	⁷	1.42

It is evident that the AI and machine learning algorithms have been predominantly utilized in the assessment and diagnosis of speech disorders, particularly voice disorders, compared to language disorders. This can be attributed to the acoustical, quantifiable, measurable changes caused by the voice disorder, such as changes in the intensity, pitch, jitter, and shimmer acoustic measures, and that can be analyzed using AI machine learning algorithms facilitating the automated diagnosis of these disorders.⁹⁸ Conversely, certain SLDs, such as apraxia of speech, spoken language disorders, cluttering, and phonological disorder, are absent from the 70 included publications. The absence of these disorders, in addition to the decreased focus on utilizing AI machine learning algorithms in the diagnosis of other disorders present within the included publications may attributed to the complexity of these disorders and possibly the current limitations of AI implementation, as these disorders involve complex interactions between cognitive, motor, and linguistic processes; and thus making the duty of identifying the diagnostic markers of these disorders more challenging for automating the process of diagnosing these disorders.⁹⁹ Additionally, the symptoms of these disorders are highly variable and are individually based, making it more challenging to develop a generalizable AI model to diagnose these disorders.⁹⁹

RQ2: What are the AI and machine learning algorithms that have been utilized in the diagnosis of SLDs?

AI machine learning algorithms are classified based on the learning approach into two main categories: primary approaches and hybrid approaches. Primary approaches include supervised learning that is used when the data are in the form of input variables and the output target value such as support vector machine, decision trees, and random forest (RF). This learning algorithm learns how to map the function from the input to the output. Additionally, primary approaches include unsupervised learning that is used when the data are present only in the form of an input without the availability of their corresponding output value such as fuzzy role-based approaches, and k-means clustering. This learning algorithm learns the pattern in the data to know about its characteristics. Furthermore, primary approaches include reinforcement learning that is used when the aim is to make a series of decisions that will lead to a final reward such as Q-learning and proximal policy optimization.

While hybrid approaches include three main learning approaches: (1) Semisupervised learning that combines supervised and unsupervised learning, (2) Self-training learning that is a form of unsupervised learning where the training data are automatically labeled, and (3) Self-taught learning that is used when there are labeled and unlabeled data, but the unlabeled data doesn’t belong to the same classes as the labeled data.^100,101

Tables A1–A7 in Appendix A provide a brief articulation of the AI machine learning algorithms that have been reported to have been implemented within the 70 SMS output publications, in relation to the diagnosis of SLDs.

The essential AI machine learning algorithms that have been utilized in the diagnosis of SLDs are briefly discussed and reflected on below:

Voice disorders:

Support vector machines: This algorithm has been used in distinguishing between normal (healthy) voice and pathological voice.^44,45,47

Neural networks: Multilayer perceptron (MLP) neural networks have been widely being used in the recognition process of pathological voices and classifying speech as normal or pathological voice.⁴⁰ Additionally, feedforward convolutional networks have been utilized in classifying specific types of voice disorders, such as hyperfunctional dysphonia, phono-trauma, vocal fold palsy/paralysis, and laryngeal neoplasm.⁸⁹

Hidden Markov Models: These models have been used in the recognition process of voice pathologies, and they are often used in combination with support regression or deep neural networks.^42,55

Aphasia:

Fuzzy Rule-Based Approaches: Models of the fuzzy rule-based approaches have been applied in aphasia diagnosis for distinguishing different types of aphasia, such as hierarchical fuzzy rule-based systems and adaptive neuro-fuzzy inference systems.^38,29

Convolutional neural networks: Convolutional neural networks (CNNs) have been used in assessing speech impairments in aphasic patients.⁷¹

Gaussian mixture models and deep neural networks: These models have been utilized to recognize aphasic speech for clinical assessment purposes.⁵⁵

Dysarthria:

Recurrent neural networks: Long short-term memory (LSTM) networks, which are type of the recurrent neural networks, have been widely applied to detect dysarthric speech to distinguish between healthy and dysarthric speech samples.⁶¹

Hybrid models: The combination of CNNs and the LSTM to detect dysarthric speech.⁸²

Stuttering:

Deep learning models: Bidirectional LSTM (bi-LSTM) and deep residual networks (ResNet) were used to detect and classify stuttering events, such as repetitions and prolongations.^22,67 Additionally, CNNs have been used to identify different types of stuttering events.^86,93

Hybrid models: Pipelined deep learners, which combine convolutional auto-encoders with dual classifiers, have been used in the detection and the classification of stuttering events.⁹⁴

Speech sound disorders:

Support vector machine, RF, and gradient boosting: These algorithms have been used to detect correct and incorrect pronunciations in children's speech to diagnose speech sound disorders.⁶⁰

Siamese networks: These networks have been utilized in evaluating pronunciation correctness to diagnose speech sound disorders.⁶³

Articulation disorder:

Support vector machine and MLP neural networks: These algorithms have been utilized for classifying speech sound errors, especially those related to the substitution process to diagnose the articulation disorder.^16,41

Speech disorders in children with cleft lip and palate:

Support vector machine: This algorithm has been utilized to detect phonetic errors in children with cleft lip and palate.⁴³

RQ3: To what extent have the AI and machine learning algorithms automated the task of the diagnosis of SLDs?

The investigation into the role of AI machine learning algorithms in diagnosing SLDs revealed that the full automation in this process, where human involvement is entirely or fully automated, has not been achieved yet. According to the review of the 70 included publications in this SMS study and outlined in Tables A1–A7 in Appendix A, most of the applied AI machine learning algorithms have primarily been automating the screening procedure which is part of the second phase (assessment) of the diagnostic process of SLDs, to discriminate between healthy and disordered individuals. Figure 6 represents the distribution of publications in relation to the automation phase of the diagnostic process of SLDs. It has been revealed that around 76% of the publications included in this SMS, equivalent to 53 publications, have been partially automating the second phase (assessment) of the diagnostic process. This partial automation was achieved through the automation of the screening procedure, the assessment procedure itself, or the analysis of the responses provided by the patients. Although the full automation of the second phase would facilitate the diagnostic process expediting the assessment procedures and saving time in the analysis of the patient responses.

Figure 6.

Distribution of publications in relation to the automation phase of the diagnostic process of speech and language disorders.

Also, around 24% of the included publications, which is equivalent to 17 publications, have been partially automating the second phase (assessment), alongside the partial automation of the third phase (diagnosis determination). The partial automation of the third phase has been achieved in certain disorders such as voice disorders, where AI machine learning algorithms automatically classify certain vocal pathologies into subtypes such as vocal fold cysts, unilateral vocal fold paralysis, and vocal fold polyps.²⁶ Similarly, in the case of aphasia, AI machine learning algorithms have been reported to categorize primary progressive aphasia into two subtypes, namely: semantic dementia (SD) and progressive nonfluent aphasia (PNFA).⁵⁰ Although the full automation of the third phase would reduce human bias and variability in decision-making, thus ensuring consistency and objectivity in the diagnostic process.

Furthermore, this SMS has revealed that none of the 70 included publications have included the automation of the first phase (history of the patient) and the fourth phase (providing information) of the diagnostic process of SLDs. Although the automation of the first phase would contribute significantly to the diagnostic process by saving time effectively through the elimination of manual data entry and processing, and reducing potential errors in the documentation of this phase. Additionally, fully automating the fourth phase would optimize time management by automatically generating patient assessment reports and treatment plans, thus facilitating the immediate documentation and communication of the diagnostic results to the patient and their caregivers.

RQ4: How effective are AI and machine learning algorithms in the diagnosis of SLDs?

Tables A1–A7 in Appendix A report on the effectiveness of AI machine learning algorithms reported in the 70 publications in this SMS for the diagnosis of SLDs. The effectiveness was assessed using common AI algorithms’ performance metrics:

Accuracy: The ability of the AI machine learning algorithm to identify both healthy and disordered patients among all cases, which is calculated by dividing the sum of true positives and true negatives over the total number of cases.¹⁰²

Sensitivity: The ability of the AI machine learning algorithm to identify the disordered patients correctly, which is calculated by dividing the true positives over the sum of true positives and false negatives.¹⁰²

Specificity: The ability of the AI machine learning algorithm to identify the healthy cases correctly, which is calculated by dividing the true negatives over the sum of true negatives and false positives.¹⁰²

However, as reported in Tables A1–A7 in Appendix A, many of the 70 included publications haven’t reported the performance of the AI machine learning algorithm using the previously mentioned metrics. Otherwise, they have used other metrics such as area under the curve, correlations, and equal error rate.

The following reflects the top three effective AI machine learning algorithm per type of speech or language disorder that have been assessed using the performance metrics:

Voice disorders:

Gaussian mixture models: Achieved an accuracy, sensitivity, and specificity percentages of 100% in study³⁵ to identify pathological voices from health voices using the Massachusetts Eye and Ear Infirmary (MEEI) Database which has been developed by the MEEI Voice and Speech Lab, and that contains more than 1400 recorded voice samples.²⁶

Support vector machine: Achieved an accuracy, sensitivity, and specificity percentages of 99.96% in²⁶ study to identify pathological voices affected by vocal fold cysts, unilateral vocal fold paralysis, and vocal fold polyps from healthy voices using the MEEI Database.

Multilayer perceptron neural networks: To identify pathological voices from healthy voices, this AI machine learning algorithm has achieved in⁴⁰ study an accuracy percentage of 96.3%, sensitivity percentage of 99%, and specificity percentage of 82% when using running speech. Additionally, it has achieved an accuracy percentage of 93.8%, sensitivity percentage of 97%, and specificity percentage of 78% when sustained vowels.

Aphasia:

Adaptive neuro-fuzzy inference system: Achieved an accuracy percentage of 94.6% in the diagnosis of anomic, global, Broca's, and Wernicke's aphasia in²⁹ study, using 265 Aachen Aphasia Test (AAT) profiles of aphasic patients. However, sensitivity and specificity percentages weren’t reported.

Hierarchical fuzzy rule-based: Achieved an accuracy percentage of 91.30% using spontaneous speech data, and 93.61% using comprehensive language test data in³⁸ study, using 265 AAT profiles of aphasic patients. However, sensitivity and specificity percentages weren’t reported.

Random forest: Achieved an accuracy percentage of 84.85% to identify the presence or the absence of primary progressive aphasia, and 75.59% to classify two types of the primary progressive aphasia (SD and PNFA) in⁵⁰ study. However, sensitivity and specificity percentages weren’t reported.

Dysarthria:

Hybrid CNNs and LSTM: This model has achieved an accuracy percentage of 99.59% in⁸² study to detect the presence of dysarthric speech using the TORGO Dataset which contains audio recordings from patients with cerebral palsy (CP) and amyotrophic lateral sclerosis.⁸² However, sensitivity and specificity percentages weren’t reported.

Support vector machine with wac2vec 2.0 model: Achieved an accuracy percentage of 93.95% in study to detect the presence of dysarthric speech, with a sensitivity of 93%, and a specificity of 95% using the UA-Speech Database that contains audio recordings from dysarthric patients with CP and healthy cases.⁸⁸

Convolutional neural network with wav2vec2-LARGE-2 feature: Achieved an accuracy percentage of 95.79% in⁹⁶ study, with a sensitivity of 94%, and a specificity of 98% to detect dysarthric patients from healthy cases using the UA-Speech Database.

Stuttering:

Deep learning Bi-LSTM: Achieved in²² study an accuracy percentage of 98.67% to detect disfluent speech (prolongation), with a sensitivity of 97.5%, and a specificity of 98.12%. While detecting the disfluent speech (syllable repetition) has achieved an accuracy percentage of 97.5%, and a sensitivity of 92.5%, and a specificity of 98.12%. Furthermore, an accuracy percentage of 97.19% was achieved to detect disfluent speech (word repetition), with a sensitivity of 87.5%, and a specificity of 99.37%. Additionally, achieved an accuracy percentage of 97.67% to detect disfluent speech (phrase repetition), with a sensitivity of 97.5%, and a specificity of 96.87%. Lastly, achieved an accuracy percentage of 97.33% to detect fluent speech, with a sensitivity of 90%, and a specificity 98.75%. University College London Archive of Stuttered Speech Dataset (UCLASS) was used in this study.

Support vector machine: Achieved an accuracy percentage of 96.67% ± 0.37% in⁵¹ study to classify speech disfluencies (repetitions and prolongations) using the UCLASS. However, sensitivity and specificity percentages weren’t reported.

Deep residual network (ResNet) and Bi-LSTM: Achieved in study⁶⁷ an accuracy percentage of 84.10% to classify disfluent speech (sound repetition), 96.60% to classify disfluent speech (word repetition), 95.54% to classify disfluent speech (phrase repetition), 97.14% to classify disfluent speech (revision), 81.40% to classify disfluent speech (interjections), and 94.08% to classify disfluent speech (prolongations) with an average classification accuracy of 91.15%. However, sensitivity and specificity percentages weren’t reported. Additionally, UCLASS was used in this study.

Speech disorders in children with cleft lip and palate:

Automatic detection based on time-domain waveform difference analysis: This model in⁶⁴ study has achieved an accuracy percentage of 88.9%, with a sensitivity of 70.89%, and a specificity of 91.86% to automatically detect consonant omissions in cleft palate children.

Hybrid approach: A combination of machine learning algorithms reported in Appendix A5, the study⁴³ has been utilized to develop a semiautomatic and fully automatic evaluation for five common speech disorders (hypernasality in vowels, nasalized consonants, pharyngealization, laryngeal placement, weakened pressure consonants) in children with cleft lip and palate. The study has reported for the semiautomatic evaluation an accuracy percentage, ranging from 94.2% to 99.6% at the frame level to detect these common speech disorders, with a sensitivity that ranged from 56.8% to 71.1%. While at the phoneme level, the accuracy percentage ranged from 95.6% to 99.6%, with sensitivity from 62.9% to 76.9%. Additionally, at the word level, the accuracy percentage ranged from 82.5% to 98.2%, with sensitivity from 60.6% to 75.8%. Regarding the fully automatic evaluation, the accuracy percentage at the frame level ranges from 94.2% to 99.1%, with sensitivity from 52.6% to 71.1%. Moreover, at the phoneme level, the accuracy percentage ranged from 88.5% to 99.6%, with sensitivity from 56.9% to 76.9%. Lastly, at the word level, the accuracy percentage ranged from 68.6% to 98.2%, with sensitivity from 50.7% to 75.8%. (The study didn’t report the performance using the specificity measure).

Articulation disorder:

Support vector machine: Achieved an overall accuracy percentage of 88.89% in¹⁶ study to detect the presence of articulation disorder for those with substitution errors only. However, sensitivity and specificity percentages weren’t reported.

Multilayer perceptron neural network: Achieved an overall accuracy percentage of 79.86% in⁴¹ study to detect the presence of articulation disorder for those with substitution errors only. However, sensitivity and specificity percentages weren’t reported.

Speech sound disorders:

Siamese recurrent network with bidirectional gated recurrent units: Achieved an accuracy percentage of 94.1% in study⁶³ to detect the presence of speech sound disorder in children by classifying the correctness of the pronunciation, with a sensitivity of 91% using the CUChild127 speech corpus, which contains speech data from 1500 Cantonese-speaking children aged 3–6 years. However, the specificity percentage wasn’t reported.

Siamese recurrent network with gated recurrent units: Achieved an accuracy percentage of 93.3% in study⁶³ to detect the presence of speech sound disorder in children by classifying the correctness of the pronunciation, with a sensitivity of 90% using the CUChild127 speech corpus, which contains speech data from 1500 Cantonese-speaking children aged 3–6 years. However, the specificity percentage wasn’t reported.

Blending situation awareness with machine learning: Achieved an accuracy percentage of 92.5% in study⁶⁰ in classifying children's speech as correct or incorrect in concordance with the therapist evaluation to detect speech sound disorder. However, sensitivity and specificity percentages weren’t reported.

Discussion

The findings of this SMS regarding the utilization of AI machine learning algorithms in diagnosing SLDs have identified multiple research gaps in the current research landscape. Out of the 70 reviewed publications, 64 publications were focusing on speech disorders, and only 6 publications were addressing language disorders. Accordingly, this informs that there is substantial divide in the application of AI machine learning algorithms in diagnosing SLDs, where 91.43% of the reported research relates to speech disorders, compared to 8.57% in relation to language disorders. Such a finding is further exacerbated by the absence of applying AI machine learning algorithms in diagnosing prevalent language disorders, such as developmental language disorders that affect many children,² and this may be related to the complexity of the signals that is being analyzed by the AI machine learning algorithms as diagnosing language disorders relies on assessing multiple aspects such as pragmatics, phonology, semantics, syntax, and morphology with taking into consideration the different developmental milestones of the assessed patient. While diagnosing speech disorders relies more on analyzing the acoustic features of the signals.

Also, this study revealed that most of the utilized AI machine learning algorithms in diagnosing SLDs are being focused on the binary classification of SLDs, such as classifying the voice sample into healthy voice and pathological voice,^35,44,47 where such classification is often possible through traditional methods like auditory perception by trained professionals. A more useful application of AI in diagnosing SLDs would involve AI models capable of providing detailed diagnostics, such as the impaired aspects and the contextual severity of the disorder itself.

Furthermore, this study revealed that the generalizability of findings in relation to efficiency of AI machine learning algorithms in diagnosing SLDs have not been objectively achieved, as it is subjectively limited by a number of factors. Many AI models have been trained and tested using datasets that are specific to certain languages, dialects, age groups, and/or disorder subtypes, which limits the applicability of the AI models to larger broader populations.^{16,41,42,55,56,76,95} For instance, in one study,⁴⁹ the AI model was trained on six subtypes of voice disorders (vocal fold cysts, laryngopharyngeal reflux disease, spasmodic dysphonia, sulcus vocalis, vocal fold nodules, and vocal fold polyps) to detect the presence of voice disorder, which limits the underlying AI model in detecting other voice disorders. Additionally, this SMS has included a wide range of SLDs that differ significantly by their diagnostic characteristics and their evaluation details. Thus, it may limit the applicability of our findings to all SLDs. Future research work would be recommended to be focused on targeted SLDs. Alongside, a recent research paper investigated the role of an AI model (ChatGPT) in supporting speech-language therapists. Although the study showed high accuracy in tasks such as writing assessment reports, and in clinical decisions, the study identified important risks such as linguistic limitations, and the absence of real-world clinical validation.¹⁰³ Thus, these results are in line with our observation, in the diagnosis of SLDs, differences in language, dialect, and culture affect the generalizability of AI models in the diagnostic process.

Moreover, our current review did not include a classical and detailed quality assessment to complement SMS methodology in SLD context, we have incorporated elements of quality evaluation indirectly. This is evidenced in Tables A1–A7, where we provide information on sample size within the renamed “Population: Sample Size Diversity” column and assessed the transparency of key performance metrics—such as accuracy, sensitivity, and specificity—under the renamed columns titled “Performance Analysis: Reported Metrics” and “Performance Analysis: Missed Metrics” in the context of the research questions. These aspects were rigorously reviewed and assessed by the two independent reviewers, with outcomes’ consensus reached and documented in the Tables A1–A7. This approach was intended to ensure a level of methodological scrutiny and transparency without critically affecting the limited number of growing research articles in the field of AI-assisted SLD diagnosis and underlying research synthesis process, while recognizing the diverse and emerging nature of research in AI for SLDs.

From a broader perspective and as future research directive, adopting a more formal quality assessment framework, such as a tailored version of tools like CASP¹⁰⁴ is anticipated to reinforcing the domain-specific criteria, undertaken in this research. This would allow researchers and practitioners to distinguish the varying methodological rigor and potential biases, toward more critical understanding and reflective implications for employing AI tools in SLD most effectively.

Additionally, this review may benefit significantly from adding a comparative analysis to the descriptive analysis of the 70 included publications in this work to gain deeper and strengthened insights into our conclusions. In addition, this review has included only English language publications which may have induced a linguistic bias in our findings. Thus, expanding the inclusion criteria to include publications from diverse languages would enhance the generalizability and the applicability of results in future work. Furthermore, while this review has included an internal protocol to guide our methodological approach, the absence of a preregistration in a public external registry such as PROSPERO¹⁰⁵ and OSF¹⁰⁶ remains a limitation. However, the transparency and the replicability of this SMS are not considered compromised. Finally, although the data coding in this work was performed primarily by multiple reviewers using coding templates with routine cross-checks, incorporating inter-rater reliability statistical measures would further improve reproducibility and transparency.

Finally, while detailing all algorithms referenced across the 70 publications presents significant challenges—particularly because our systematic review encompasses a broad range of SLDs rather than focusing on a specific condition—we believe it is more impactful to emphasize the most effective AI and machine learning techniques tailored to each specific SLD. This approach will not only enhance the clarity and relevance of our discussion but also provide a clearer pathway for future research. Moving forward, we plan to incorporate these insights into our subsequent studies, designing diagnostic tools that encompass the most promising AI algorithms aligned with targeted SLDs, thereby advancing precision and effectiveness in clinical applications with reference to highly effective AI machine learning algorithms.

Conclusion

This study has addressed the research gap in the provision of a comprehensive systematic study of the literature in relation to the effectiveness and extent of automation of AI machine learning algorithms, the diagnostic process of SLDs with a detailed emphasis on the role of these algorithms in diagnosis of specific respective disorders. This has resulted in 70 publications that have met the inclusion and exclusion criteria established in the research design of the SMS process conducted in this research. Furthermore, this study contributes to the SLDs field of knowledge a comprehensive articulation of the AI machine learning algorithms that have been utilized in the diagnosis of SLDs that have been reported since AI was utilized in this field of specialization since 2000, along with respective type of AI algorithms versus specific SLD, the datasets used in the testing and training of those AI algorithms, their effectiveness using AI algorithms metrics such as accuracy, specificity, and sensitivity. Finally, this SMS study reported on the extent of automation the speech and language process of disorders diagnosis.

As this study has confirmed the existence of a substantial divide in the application of AI machine learning algorithms in diagnosing SLDs, of 91.43% versus 8.57% of research output in SLDs, respectively. Accordingly, several questions that are ought to be addressed. First, what are the current shortcomings in AI machine learning algorithms that limit attempts to implement more research into adopting AI machine learning algorithms for the automatic diagnosis of language disorders and especially developmental language disorders that affect many children? Second, more research needs to be carried out to inform whether funding of AI-based research for language disorders is one key substantial contributor to this substantial divide between SLDs diagnosis using AI machine language algorithms.

Furthermore, this study confirmed that the focus in the application of AI machine learning algorithms has been primarily on the partial automation of the second phase (assessment) of the diagnostic process of SLDs, accounting for around 76% of the included publications. Additionally, around 24% of the included publications have extended the automation to include the partial automation of the third phase (diagnosis determination). Hence, this directs the finding of a further research gap for the need of developing AI models that are capable of automating the whole diagnostic process of SLDs encompassing all four phases, with post-linkages to treatment protocols.

Finally, the findings of this study have highlighted the top-performing AI machine learning algorithms in relation to their roles in the diagnostic process of SLDs. However, a significant gap remains in reporting the performance of these AI machine learning algorithms using the common performance metrics, particularly sensitivity, specificity, and accuracy. Therefore, this SMS study and its associated processes have highlighted a broader challenge in the field and more specifically: the lack of standardized and structured quality assessment protocols tailored to AI research applied in clinical contexts, like diagnosis of SLDs. Therefore, the lack of such standardized protocols in SLD context hinders the formal qualitative evaluation of research studies’ robustness, reproducibility, and comparability necessary for evidence-based practice. This underscores the urgent need for developing AI-based SLD study standardized quality assessment protocols that explicitly incorporate measures for data collection quality, metrics for formal AI algorithms transparency reporting, ethical adherence measures, performance transparency reporting standards, etc.

As a corollary from this SMS, we are working on the development of formal structured quality assurance protocols for AI algorithms and tools development and deployment in SLD journey context, aimed at paving the way for facilitating rapid and standardized regulatory approval processes, and hence, the betterment of stakeholders’ confidence in support of more effective clinical implementation.

Supplemental Material

sj-docx-1-dhj-10.1177_20552076251379769 - Supplemental material for The role of AI in the diagnosis of speech and language disorders: A systematic mapping study

Supplemental material, sj-docx-1-dhj-10.1177_20552076251379769 for The role of AI in the diagnosis of speech and language disorders: A systematic mapping study by Dina Tbaishat, Rand Al-Shafei and Mohammed Odeh in DIGITAL HEALTH

Footnotes

Acknowledgements

The authors are grateful to Zayed University, Dubai, UAE, for funding and supporting this research, and to Global Academy for Digital Health for research support.

ORCID iD

Dina Tbaishat

Contributorship

DT and MO contributed to conceptualization and supervision. All authors contributed to the methodology, investigation, and validation of the study. RS and MO were responsible for the software, while RS also handled data curation, formal analysis, and writing the original draft. DT oversaw project administration and secured funding. Both DT and MO contributed to reviewing and editing the manuscript.

Funding

The authors express their gratitude to Zayed University for supporting this study through the Research Incentive Fund (RIF Grant No. 23010), with DT as the principal investigator.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Supplemental material

Supplemental material for this article is available online.

References

Langbecker

Snoswell

Smith

, et al. Long-term effects of childhood speech and language disorders: a scoping review. S Afr J Child Educ 2020; 10. Epub ahead of print 25 September 2020. DOI: 10.4102/sajce.v10i1.801.

NIDCD . Quick statistics about voice, speech, language, https://www.nidcd.nih.gov/health/statistics/quick-statistics-voice-speech-language (accessed 23 October 2024).

Shipley

McAfee

. Assessment in speech-language pathology: a resource manual. San Diego: Plural Publishing, 2019.

Al Kuwaiti

Nazer

Al-Reedy

, et al. A review of the role of artificial intelligence in healthcare. J Pers Med 2023; 13: 951.

Maier

Haderlein

Stelzle

, et al. Automatic speech recognition systems for the evaluation of voice and speech disorders in head and neck cancer. EURASIP J Audio Speech Music Process 2010; 2010: 1–7.

Murero

Vita

Mennitto

. Artificial Intelligence for Severe Speech Impairment: Innovative approaches to AAC and Communication. Naples, 2020.

Brahmi

Mahyoob

Al-Sarem

, et al. Exploring the role of machine learning in diagnosing and treating speech disorders: a systematic literature review. Psychol Res Behav Manag 2024; 17: 2205–2232.

National Institute on Aging . Parkinson’s disease: causes, symptoms, and treatments, https://www.nia.nih.gov/health/parkinsons-disease/parkinsons-disease-causes-symptoms-and-treatments (accessed 24 October 2024).

Bell’s palsy - Symptoms and causes – Mayo Clinic, https://www.mayoclinic.org/diseases-conditions/bells-palsy/symptoms-causes/syc-20370028 (accessed 24 October 2024).

10.

Dysphagia - Symptoms and causes - Mayo Clinic, https://www.mayoclinic.org/diseases-conditions/dysphagia/symptoms-causes/syc-20372028 (accessed 24 October 2024).

11.

Aphasia: causes, symptoms & treatment, https://my.clevelandclinic.org/health/diseases/5502-aphasia (accessed 21 August 2025).

12.

Page

McKenzie

Bossuyt

, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Br Med J 2021; 372: n71.

13.

Speech impairment: types, signs & causes, https://my.clevelandclinic.org/health/diseases/speech-impairment (accessed 23 October 2024).

14.

S-I

Lee

. Automatic detection of phonological errors in child speech using siamese recurrent autoencoder. In: Interspeech 2020 , 2020, pp.4476–4480: ISCA.

15.

NIDCD . Articulation disorder, https://www.nidcd.nih.gov/glossary/articulation-disorder (accessed 23 October 2024).

16.

Georgoulas

Georgopoulos

Stylios

. Speech sound classification and detection of articulation disorders with support vector machines and wavelets. In: 2006 international conference of the IEEE engineering in medicine and biology society , 2006, pp.2199–2202: IEEE.

17.

Speech sound disorders: articulation and phonology, https://www.asha.org/practice-portal/clinical-topics/articulation-and-phonology/ (accessed 23 October 2024).

18.

Stony Brook Medicine . Motor speech disorders, https://www.stonybrookmedicine.edu/patientcare/speechhearing/speechlanguage/motordisorders (accessed 23 October 2024).

19.

Ijitona

Soraghan

Lowit

, et al. Automatic detection of speech disorder in dysarthria using extended speech feature extraction and neural networks classification. In: IET 3rd international conference on intelligent signal processing (ISP 2017) , 2017, pp.15 (6)–15 (6): Institution of Engineering and Technology.

20.

Fluency disorders, https://www.asha.org/practice-portal/clinical-topics/fluency-disorders/ (accessed 23 October 2024).

21.

Johnson

. Book review. J Fluency Disord 2016; 47: 75–77.

22.

Gupta

Shukla

, et al. Deep learning bidirectional LSTM based detection of prolongation and repetition in stuttered speech using weighted MFCC. Int J Adv Comput Sci Appl 2020; 11. Epub ahead of print 2020. DOI: 10.14569/IJACSA.2020.0110941.

23.

Voice disorders, https://www.asha.org/practice-portal/clinical-topics/voice-disorders/ (accessed 23 October 2024).

24.

Functional voice disorders – StatPearls – NCBI Bookshelf, https://www.ncbi.nlm.nih.gov/books/NBK563182/ (accessed 23 October 2024).

25.

Boone

Mcfarlane

Zraick

. The voice and voice therapy. 9th ed. Boston, MA: Pearson Education, 2013.

26.

Al-Nasheri

Muhammad

Alsulaiman

, et al. Voice pathology detection and classification using auto-correlation and entropy features in different frequency regions. IEEE Access 2018; 6: 6961–6974.

27.

Definitions of communication disorders and variations, https://www.asha.org/policy/rp1993-00208/ (accessed 23 October 2024).

28.

Spoken language disorders, https://www.asha.org/practice-portal/clinical-topics/spoken-language-disorders/ (accessed 23 October 2024).

29.

Fazeli

Naghibolhosseini

Bahrami

. An adaptive neuro-fuzzy inference system for diagnosis of aphasia. In: 2008 2nd international conference on bioinformatics and biomedical engineering , 2008, pp.535–538: IEEE.

30.

Wolny

Mazak

Carpella

, et al. Thirteen years of SysML: a systematic mapping study. Softw Syst Model 2020; 19: 111–169.

31.

Petersen

Feldt

Mujtaba

, et al. Systematic mapping studies in software engineering. 2008. Epub ahead of print June 2008. DOI: 10.14236/ewic/EASE2008.8.

32.

Jothi

Sivaraju

Yawalkar

. AI based speech language therapy using speech quality parameters for aphasia person: a comprehensive review. In: 2020 4th international conference on electronics, communication and aerospace technology (ICECA) , 2020, pp.1263–1271: IEEE.

33.

Rayyan – Intelligent systematic review – Rayyan, https://www.rayyan.ai/ (accessed 23 October 2024).

34.

Voleti

Liss

Berisha

. A review of automated speech and language features for assessment of cognitive and thought disorders. IEEE J Sel Top Sign Proces 2020; 14: 282–298.

35.

Godino-Llorente

Aguilera-Navarro

Gomez-Vilda

. Automatic detection of voice impairments due to vocal misuse by means of Gaussian mixture models. In: 2001 conference proceedings of the 23rd annual international conference of the IEEE engineering in medicine and biology society , pp.1723–1726: IEEE.

36.

Nayak

Bhat

. Identification of voice disorders using speech samples. 2003, pp.951–953.

37.

Fonseca

Guido

Scalassara

, et al. Wavelet time-frequency analysis and least squares support vector machines for the identification of voice disorders. Comput Biol Med 2007; 37: 571–578.

38.

Akbarzadeh-T

M-R

Moshtagh-Khorasani

. A hierarchical fuzzy rule-based approach to aphasia diagnosis. J Biomed Inform 2007; 40: 465–475.

39.

Carmichael

Wan

Green

. Combining neural network and rule-based systems for dysarthria diagnosis. In: Interspeech 2008 , 2008, pp.2226–2229: ISCA.

40.

Godino-Llorente

Fraile

Sáenz-Lechón

, et al. Automatic detection of voice impairments from text-dependent running speech. Biomed Signal Process Control 2009; 4: 176–182.

41.

Georgoulas

Georgopoulos

Stylios

, et al. Detection of articulation disorders using Empirical Mode Decomposition and neural networks. In: 2009 international joint conference on neural networks , 2009, pp.959–964: IEEE.

42.

Maier

Haderlein

Eysholdt

, et al. PEAKS – a system for the automatic evaluation of voice and speech disorders. Speech Commun 2009; 51: 425–437.

43.

Maier

Hönig

Bocklet

, et al. Automatic detection of articulation disorders in children with cleft lip and palate. J Acoust Soc Am 2009; 126: 2589–2602.

44.

Vaziri

Almasganj

Behroozmand

. Pathological assessment of patients’ speech signals using nonlinear dynamical analysis. Comput Biol Med 2010; 40: 54–63.

45.

Khalil Arjmandi

Pooyan

Mohammadnejad

, et al. Voice disorders identification based on different feature reduction methodologies and support vector machine. In: 2010 18th Iranian conference on electrical engineering , 2010, pp.45–49: IEEE.

46.

Markaki

Stylianou

. Voice pathology detection and discrimination based on modulation spectral features. IEEE Trans Audio Speech Lang Process 2011; 19: 1938–1948.

47.

Erfanian Saeedi

Almasganj

Torabinejad

. Support vector wavelet adaptation for pathological voice assessment. Comput Biol Med 2011; 41: 822–828.

48.

Arjmandi

Pooyan

Mikaili

, et al. Identification of voice disorders using long-time features and support vector machine with different feature reduction methods. J Voice 2011; 25: e275–e289.

49.

Muhammad

Mesallam

Malki

, et al. Multidirectional regression (MDR)-based features for automatic voice disorder detection. J Voice 2012; 26: 817.e19–817.e27.

50.

Fraser

Rudzicz

Rochon

. Using text and acoustic features to diagnose progressive aphasia and its subtypes. In: Interspeech 2013 , 2013, pp.2177–2181: ISCA.

51.

Hariharan

Fook

Sindhu

, et al. Objective evaluation of speech dysfluencies using wavelet packet transform with sample entropy. Digit Signal Process 2013; 23: 952–959.

52.

Świetlicka

Kuniszyk-Jóźkowiak

Smołka

. Hierarchical ANN system for stuttering identification. Comput Speech Lang 2013; 27: 228–242.

53.

Sellam

Jagadeesan

. Classification of normal and pathological voice using SVM and RBFNN. J Signal Inf Process 2014; 05: 1–7.

54.

Souissi

Cherif

. Artificial neural networks and support vector machine for voice disorders identification. Int J Adv Comput Sci Appl 2016; 7. Epub ahead of print 2016. DOI: 10.14569/IJACSA.2016.070546.

55.

Lee

Liu

Huang

P-W

, et al. Automatic speech recognition for acoustical analysis and assessment of cantonese pathological voice and speech. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP) , 2016, pp.6475–6479: IEEE.

56.

Ali

Elamvazuthi

Alsulaiman

, et al. Automatic voice pathology detection with running speech by using estimation of auditory spectrum and cepstral coefficients based on the all-pole model. J Voice 2016; 30: 757.e7–757.e19.

57.

Kohlschein

Schmitt

Schuller

, et al. A machine learning based system for the automatic evaluation of aphasia speech. In: 2017 IEEE 19th international conference on e-health networking, applications and services (Healthcom) , 2017, pp.1–6: IEEE.

58.

Lamia

Noureddine

. A new e-health tool for early identification of voice and neurological pathologies by speech processing. Int J Adv Comput Sci Appl 2018; 9. Epub ahead of print 2018. DOI: 10.14569/IJACSA.2018.090865.

59.

Pham

Lin

Zhang

. Diagnosing voice disorder with machine learning. In: 2018 IEEE international conference on big data (big data) , 2018, pp.5263–5266: IEEE.

60.

Franciscatto

Trois

Lima

JCD

, et al. Blending situation awareness with machine learning to identify children’s speech disorders. In: 2018 IEEE 12th international conference on application of information and communication technologies (AICT) , 2018, pp.1–6: IEEE.

61.

Mayle

Mou

Bunescu

, et al. Diagnosing dysarthria with long short-term memory networks. In: Interspeech 2019 , 2019, pp.4514–4518: ISCA.

62.

Franciscatto

Lima

JCD

Trois

, et al. Applying situation-awareness for recommending phonological processes in the children’s speech. In: Proceedings of the 34th ACM/SIGAPP symposium on applied computing , 2019, pp.739–746. New York, NY, USA: ACM.

63.

Wang

Qin

Peng

, et al. Child speech disorder detection with Siamese recurrent network using speech attribute features. In: Interspeech 2019 , 2019, pp.3885–3889: ISCA.

64.

Wang

Zhang

, et al. Automatic detection of consonant omission in cleft palate speech. Int J Speech Technol 2019; 22: 59–65.

65.

Barche

Gurugubelli

Vuppala

. Towards automatic assessment of voice disorders: a clinical approach. In: Interspeech 2020 , 2020, pp.2537–2541: ISCA.

66.

Hernandez

Yeo

Kim

, et al. Dysarthria detection and severity assessment using rhythm-based metrics. In: Interspeech 2020 , 2020, pp.2897–2901: ISCA.

67.

Kourkounakis

Hajavi

Etemad

. Detecting multiple speech disfluencies using a deep residual network with bidirectional long short-term memory. In: ICASSP 2020 – 2020 IEEE international conference on acoustics, speech and signal processing (ICASSP) , 2020, pp.6089–6093: IEEE.

68.

Alharbi

Hasan

Simons

AJH

, et al. Sequence labeling to detect stuttering events in read speech. Comput Speech Lang 2020; 62: 101052.

69.

AL-Dhief

Latiff

NMA

Malik

NNNA

, et al. Voice pathology detection using machine learning technique. In: 2020 IEEE 5th international symposium on telecommunication technologies (ISTT) , 2020, pp.99–104: IEEE.

70.

Milani

MGM

Ramashini

Krishani

. A real-time application to detect human voice disorders. In: 2020 international conference on decision aid sciences and application (DASA) , 2020, pp.979–984: IEEE.

71.

Jothi

Mamatha

. A systematic review of machine learning based automatic speech assessment system to evaluate speech impairment. In: 2020 3rd international conference on intelligent sustainable systems (ICISS) , 2020, pp.175–185: IEEE.

72.

Sheikh

Sahidullah

Hirsch

, et al. Stutternet: stuttering detection using time delay neural network. In: 2021 29th European signal processing conference (EUSIPCO) , 2021, pp.426–430: IEEE.

73.

Janbakhshi

Kodrasi

Bourlard

. Automatic dysarthric speech detection exploiting pairwise distance-based convolutional neural networks. In: ICASSP 2021 – 2021 IEEE international conference on acoustics, speech and signal processing (ICASSP) , 2021, pp.7328–7332: IEEE.

74.

Hadjaidji

Korba

MCA

Khelil

. Spasmodic dysphonia detection using machine learning classifiers. In: 2021 international conference on recent advances in mathematics and informatics (ICRAMI) , 2021, pp.1–5: IEEE.

75.

S-I

CW-Y

, et al. Detection of consonant errors in disordered speech based on consonant-vowel segment embedding. In: Interspeech 2021 , 2021, pp.2931–2935: ISCA.

76.

Al-Dhief

Baki

Latiff

NMA

, et al. Voice pathology detection and classification by adopting online sequential extreme learning machine. IEEE Access 2021; 9: 77293–77306.

77.

Sztahó

Gábor

Gábriel

. Deep learning solution for pathological voice detection using LSTM-based autoencoder hybrid with multi-task learning. In: Proceedings of the 14th international joint conference on biomedical engineering systems and technologies , 2021, pp.135–141: SCITEPRESS – Science and Technology Publications.

78.

Al-Banna

A-K

Edirisinghe

Fang

. Stuttering detection using atrous convolutional neural networks. In: 2022 13th international conference on information and communication systems (ICICS) , 2022, pp.252–256: IEEE.

79.

Barrett

Howell

. Systematic review of machine learning approaches for detecting developmental stuttering. IEEE/ACM Trans Audio Speech Lang Process 2022; 30: 1160–1172.

80.

S-I

CW-Y

Wang

, et al. Automatic detection of speech sound disorder in child speech using posterior-based speaker representations. In: Interspeech 2022 , 2022, pp.2853–2857: ISCA.

81.

Darouiche

El Moubtahij

Ben Yakhlef

, et al. An automatic voice disorder detection system based on extreme gradient boosting classifier. In: 2022 2nd international conference on innovative research in applied science, engineering and technology (IRASET) , 2022, pp.1–5: IEEE.

82.

Vora

Padalia

Mehta

, et al. Hybrid CNN-LSTM network to detect dysarthria using Mel-frequency cepstral coefficients. In: 2022 5th international conference on advances in science and technology (ICAST) , 2022, pp.615–621: IEEE.

83.

Yagnavajjula

Alku

Rao

, et al. Detection of neurogenic voice disorders using the fisher vector representation of cepstral features. J Voice 2022. Epub ahead of print November 2022. DOI: 10.1016/j.jvoice.2022.10.016.

84.

Altayeb

Al-Ghraibah

. Classification of three pathological voices based on specific features groups using support vector machine. Int J Electr Comput Eng 2022; 12: 946.

85.

Shrivas

Deshpande

Gidaye

, et al. Employing energy and statistical features for automatic diagnosis of voice disorders. Diagnostics 2022; 12: 2758.

86.

Sheikh

Sahidullah

Hirsch

, et al. Machine learning for stuttering identification: review, challenges and future directions. Neurocomputing 2022; 514: 385–402.

87.

Mohamed

Salama

Darwish

. Classification of voice pathology using SVM classifier. In: 2023 international telecommunications conference (ITC-Egypt) , 2023, pp.419–422: IEEE.

88.

Javanmardi

Tirronen

Kodali

, et al. Wav2vec-based detection and severity level classification of dysarthria from speech. In: ICASSP 2023 – 2023 IEEE international conference on acoustics, speech and signal processing (ICASSP) , 2023, pp.1–5: IEEE.

89.

Miliaresi

Pikrakis

. A modular deep learning architecture for voice pathology classification. IEEE Access 2023; 11: 80465–80478.

90.

Xie

Cai

, et al. A voice disease detection method based on MFCCs and shallow CNN. J Voice 2023. Epub ahead of print October 2023. DOI: 10.1016/j.jvoice.2023.09.024.

91.

Ribas

Pastor

Miguel

, et al. Automatic voice disorder detection using self-supervised representations. IEEE Access 2023; 11: 14915–14927.

92.

Tirronen

Javanmardi

Kodali

, et al. Utilizing Wav2Vec in database-independent voice disorder detection. In: ICASSP 2023 – 2023 IEEE international conference on acoustics, speech and signal processing (ICASSP) , 2023, pp.1–5: IEEE.

93.

Alnashwan

Alhakbani

Al-Nafjan

, et al. Computational intelligence-based stuttering detection: a systematic review. Diagnostics 2023; 13: 3537.

94.

Pravin

Elias

Sivaraman

, et al. Disfluency assessment using deep super learners. IEEE Access 2024; 12: 24079–24089.

95.

Pham

Holmes

Zou

, et al. Diagnosis of pathological speech with streamlined features for long short-term memory learning. Comput Biol Med 2024; 170: 107976.

96.

Javanmardi

Kadiri

Alku

. Pre-trained models for detection and severity level classification of dysarthria from speech. Speech Commun 2024; 158: 103047.

97.

Ur Rehman

Shafique

Azhar

Q-U-A

, et al. Voice disorder detection using machine learning algorithms: an application in speech and language pathology. Eng Appl Artif Intell 2024; 133: 108047.

98.

Ilapakurti

Kedari

Vuppalapati

, et al. Artificial intelligent (AI) clinical edge for voice disorder detection. In: 2019 IEEE fifth international conference on big data computing service and applications (BigDataService) , 2019, pp.340–345: IEEE.

99.

Gupta

Chaspari

Kim

, et al. Pathological speech processing: state-of-the-art, current challenges, and future directions. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP) , 2016, pp.6470–6474: IEEE.

100.

Sah

. Machine learning: a review of learning types. 2020. Epub ahead of print 11 July 2020. DOI: 10.20944/preprints202007.0230.v1.

101.

Woodman

Mangoni

. A comprehensive review of machine learning algorithms and their application in geriatric medicine: present and future. Aging Clin Exp Res 2023; 35: 2363–2397.

102.

Baratloo

Hosseini

Negida

, et al. Part 1: simple definition and calculation of accuracy, sensitivity and specificity, https://www.jemerg.com (2015).

103.

Birol

Çiftci

Yılmaz

, et al.

Is there any room for ChatGPT AI bot in speech-language pathology?

Eur Arch Oto-Rhino-Laryngol 2025; 282: 3267–3280.

104.

CASP - critical appraisal skills programme, https://casp-uk.net/ (accessed 17 July 2025).

105.

PROSPERO, https://www.crd.york.ac.uk/prospero/ (accessed 17 July 2025).

106.

OSF, https://osf.io/ (accessed 21 August 2025).

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.

0.00 MB

0.86 MB