Abstract
Keywords
Introduction
About 90% of patients with idiopathic Parkinson disease (PD) develop various speech impairments during the course of their disease,1 summarized under the term
The ultimate goal of the treatment, however, is not increased vocal intensity alone but improved overall intelligibility. Thus, improved articulation is at least as important. For the objective measurement of improved articulation, the so-called formant triangle can be used with the triangle’s area representing the range of articulatory movements.4,19-21 Despite an ongoing discussion about the strengths and weaknesses of this method,4,22 this area has been shown to be reduced in patients with PD, consistent with general hypokinesia in this disorder. 23 LSVT, however, has been observed to increase the triangle’s dimensions to normal values. 19 In addition, LSVT also increases pitch, but the potential benefit of this is still a matter of discussion. 9 Ultimately, LSVT is effective in improving intelligibility, which can be quantified in an unbiased fashion using the National Technical Institute for the Deaf (NTID) score.24,25
Although the clinical benefit of LSVT is well known, the underlying neuronal patterns of hypokinetic dysarthria and especially the brain areas associated with LSVT-induced improvements in speech are not well understood. Indeed, a better understanding of the underlying mechanisms involved in these processes may in turn inform speech therapists and neurologists on how to further optimize treatment while also providing basic insight into the pathophysiology of speech production in PD.
Several imaging studies in individuals with PD with hypokinetic dysarthria found cortical hyperactivation in speech-related areas mostly during overt speech production.13,26-28 These regions included the primary motor cortex,13,27 supplementary motor area13,26 (SMA), and inferior lateral premotor cortex13,26 as well as prefrontal cortices.13,26 Taken together, overactivation was generally interpreted as compensatory neuronal recruitment, which was likely related to an attempt to overcome hypokinetic dysarthria in an overt speech task.13,26-28 Hence, in an overt speech design, it is hardly possible to distinguish between pathophysiological and compensatory networks. A recent functional magnetic resonance imaging (fMRI) study addressed this issue by investigating individuals with PD who did not suffer from speech impairments at the time of investigation and did not develop their initial speech impairments until 2 years after the reported trial. 29 The authors observed pathologically reduced striato-prefrontal preparatory effective connectivity within subcortical and cortical compensatory networks. Furthermore, they observed a diminished monitoring of external auditory feedback and a reduced coupling between the ventral and dorsal striatum. However, because these patients did not suffer from hypokinetic dysarthria at the time of investigation, this study design still does not allow a certain distinction between an initially failing compensatory system and newly manifesting symptoms. Consequently, to elucidate the pathophysiological patterns associated with hypokinetic dysarthria in PD and unveil brain areas that contribute to effective voice training, it is important to limit compensatory demands. 
The present study was designed to address exactly this issue by aiming to minimize confounding effects on brain activation, such as (1) speech-related movement artifacts, (2) divergent auditory feedback (ie, loud vs soft voice), and (3) compensatory brain activation to overcome hypokinetic dysarthria. We thus opted for a covert—as opposed to overt—speech paradigm. Covert tasks are characterized by only mentally performing an operation, such as quietly reading in your head, whereas the overt equivalent would be to read out loud. Emulating the general principle of LSVT to increase intended speech intensity (“think loud to speak loud”), we aimed to investigate different modes of intended speech intensity. Therefore, we applied a covert speech task that entailed covert reading with normal intensity (similar to a conversation in a relatively quiet room) and covert reading with high intensity (similar to shouting on a windy beach). We applied this paradigm during fMRI in healthy controls (HCs) as well as individuals with PD with hypokinetic dysarthria before voice therapy and after 4 weeks of LSVT training. We additionally characterized quality of articulation and intelligibility using the formant triangle and an adapted version of the NTID (aNTID) score.
We hypothesized that by investigating covert speech as opposed to overt speech, compensatory brain activation would be minimized in dysarthric patients, and therefore, we may observe reduced speech-associated activation of the midline fronto-striatal motor network (ie, rostral SMA) before training. We further hypothesized that LSVT would at least partially normalize this hypoactivation. In addition, we expected training-induced compensatory increase of activation in areas associated with increased speech intensity.
Methods
Patients and Controls
We included 11 right-handed (7 male, 4 female) patients with PD diagnosed according to the Brain Bank Criteria,30 who presented with PD-related dysarthria, and 11 right-handed (6 male, 5 female) age-matched HCs. Two highly experienced specialists in movement disorders (KEZ) and PD speech therapy (AN) confirmed relevant hypokinetic dysarthria in all patients. We originally recruited 12 patients but excluded one from the study because he was unable to follow the instructions at the very first appointment. All patients continued their regular medication for the duration of the study. All patients performed voice treatment according to the LSVT program31 with speech therapists trained in this method. None of the HCs had a history of speech problems or neurological, vascular, or psychiatric diseases. We assessed handedness with the 10-item version of the Edinburgh Handedness Inventory,32 and cognitive performance was measured using the Montréal Cognitive Assessment (MoCA).33 Epidemiological and clinical data of the whole sample are summarized in Table 1. Written informed consent was obtained from all participants, and all procedures were approved by the Ethics Committee of the Medical Faculty of the University of Kiel and were in full accordance with the current version of the Declaration of Helsinki.
Clinical Data of All Participants.
Abbreviations: F, female; M, male; MoCA, Montréal Cognitive Assessment score points; UPDRS, Unified Parkinson Disease Rating Scale points.
1: Engineer; 2: merchant; 3: financial staff; 4: quality management; 5: cook; 6: paralegal; 7: teacher; 8: therapist; 9: musician.
Experimental Design
Standardized Overt Reading Session
Standardized reading was performed to analyze intelligibility and acoustic features of speech in patients before and after LSVT. Patients were examined under their regular medication. In a quiet room, all participants were recorded (headset microphone, AKG C444, and a MicroTrack 24/96 recorder, M-AUDIO) reading a short text of 180 syllables in their regular reading voice. At each time point, each participant received a different text from a local newspaper column written by the same author in the same style for consistency in intelligibility assessment, as described below. Each recording was calibrated with a reference tone. All participants silently read the text once before recording to minimize reading difficulties.
Covert Reading Paradigm: fMRI
During fMRI, participants performed an event-related covert speech production task. Short German sentences were presented in white letters on a gray background (see Figure 1). The sentences contained 4 to 6 syllables and were designed to be emotionally neutral and phonetically similar. Participants were instructed to read them covertly with 1 of 2 different intensities: NORMAL, with normal intensity, as if in a regular conversation in a silent room; HIGH, with high intensity, as if shouting to a distant person on a windy beach. Five sentences of the same condition (NORMAL or HIGH) were grouped into 1 block as follows (presentation time). Every block started with the presentation of an image informing the participant about the upcoming condition (2781 ± 172 ms; NORMAL: conversation scene; HIGH: beach scene). Additionally, an intensity symbol (symbol of an audio speaker with waves) matching the current condition appeared below every sentence shown (3470 ± 313 ms). Following each sentence, a fixation cross was presented (4495 ± 3341 ms) until the start of the next trial. Additionally, every block contained 1 single reaction time trial (300 ms), which required a button press for a nonsensical row of letters to monitor attention of the participant. The entire experiment was divided into 4 sessions of 4 blocks each with a pseudo-random distribution of the NORMAL and HIGH conditions, yielding a total of 80 sentences and 16 reaction time trials per experiment. Behavior was monitored during the whole experiment to rule out overt reading. In rare cases of overt reading, the session was restarted after reminding the participant to read covertly. The software PsychoPy (http://www.psychopy.org) was used for presentation of the stimuli, synchronization, and acquisition of the behavioral data. 
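The block structure described above can be sketched in a few lines of Python. This is only an illustration of the counts given in the text (4 sessions, 4 blocks per session, 5 sentences and 1 reaction time trial per block), not the authors' PsychoPy script. The names `build_schedule`, `CONDITIONS`, and the seed are hypothetical, and two details are assumptions that the text does not state: that conditions are balanced within each session, and that the reaction time trial sits at a random position within its block.

```python
import random

CONDITIONS = ("NORMAL", "HIGH")
N_SESSIONS = 4
BLOCKS_PER_SESSION = 4
SENTENCES_PER_BLOCK = 5


def build_schedule(seed=0):
    """Return a list of sessions; each session is a list of block dicts."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(N_SESSIONS):
        # Assumption: 2 NORMAL and 2 HIGH blocks per session, shuffled.
        conditions = list(CONDITIONS) * (BLOCKS_PER_SESSION // len(CONDITIONS))
        rng.shuffle(conditions)
        session = []
        for condition in conditions:
            trials = ["sentence"] * SENTENCES_PER_BLOCK
            # Assumption: the single reaction time trial is placed at random.
            trials.insert(rng.randint(0, SENTENCES_PER_BLOCK), "reaction_time")
            session.append({"condition": condition, "trials": trials})
        schedule.append(session)
    return schedule


schedule = build_schedule()
n_sentences = sum(b["trials"].count("sentence") for s in schedule for b in s)
n_rt = sum(b["trials"].count("reaction_time") for s in schedule for b in s)
print(n_sentences, n_rt)  # 80 16
```

The totals reproduce the 80 sentences and 16 reaction time trials reported for one experiment.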
Prior to the experiment, participants performed a training session in the MRI scanner that included blocks of 3 short sentences with overt speech to ensure that participants understood the paradigm. The whole experimental procedure was performed once by HC and twice by patients with PD (8 ± 4 days before the start of LSVT and 7 ± 3 days after the end of the 4-week LSVT).

Experimental paradigm for covert speech. NORMAL: Regular conversation, silent room; HIGH: Shouting to distant person, windy beach. Each block started with the condition instruction image followed by 5 sentences and a fixation cross. Additionally, each block included 1 reaction time trial.
MRI Data Acquisition
MRI scanning was performed at the Neurocenter at Kiel University Hospital with a 3T whole body MRI scanner (Achieva; Philips, Best, the Netherlands) with a 32-channel head coil. A visual system (NordicNeuroLab AS; Bergen, Norway) was used for stimulus presentation and an adapted computer mouse as response button for reaction time trials. A microphone (MR Confon GmbH, Magdeburg, Germany) was used to record the training session before every experiment and for detecting overt speech during scanning.
For fMRI, the first 5 scans of each session were discarded because of nonequilibrium of magnetization, followed by 205 echo planar images with 35 ascending transversal slices (field of view [FOV] = 210 × 210 × 115.2 mm3; voxel size = 3.3 × 3.3 × 3 mm3; slice thickness = 3 mm; gap = 0.3 mm; repetition time [TR] = 2500 ms; echo time [TE] = 33 ms; flip angle = 90°).
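As a quick consistency check on the acquisition parameters, the following sketch recomputes the slab thickness in the slice direction and the duration of one session. It assumes that the gap applies only between adjacent slices and that the 5 discarded scans were acquired with the same TR as the 205 retained volumes; the variable names are illustrative.

```python
N_SLICES = 35
SLICE_THICKNESS_MM = 3.0
GAP_MM = 0.3
TR_S = 2.5
N_DISCARDED = 5
N_KEPT = 205

# 35 slices plus the 34 gaps between them.
slab_mm = N_SLICES * SLICE_THICKNESS_MM + (N_SLICES - 1) * GAP_MM

# Assumption: discarded and retained volumes share the same TR.
session_s = (N_DISCARDED + N_KEPT) * TR_S

print(round(slab_mm, 1), session_s)  # 115.2 525.0
```

Under these assumptions the slab thickness matches the reported 115.2 mm extent of the field of view, and one session lasts 525 s (8.75 min).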
For spatial normalization and exclusion of gross structural abnormalities, a 3D T1-weighted image was acquired (scan duration = 335 s; 160 sagittal slices, thickness = 1 mm; FOV = 240 × 240 × 160 mm3; voxel size = 1 × 1 × 1 mm3; TR = 8.3 ms; TE = 3.9 ms; flip angle = 8°).
Statistical Analysis
Analysis of Standardized Overt Reading Sessions
Acoustic analysis of the recordings has already been reported in detail elsewhere.34,35 In short, patients and HCs were asked to sustain vowels and to read short sentences and a phonetically balanced text. For the vowel experiment, the sustained vowel duration, the average intensity (A-weighted), the intensity decay, as well as shimmer (cycle-to-cycle variations of amplitude) and jitter (cycle-to-cycle variations of fundamental frequency, which is the technical equivalent of pitch) were extracted and analyzed. For the short sentences, A-weighted average intensity and mean fundamental frequency were computed. Likewise, for the phonetically balanced texts, intensity, fundamental frequency, and so-called formant triangles were extracted. For the triangles, the first and second formants of /i:/, /a:/, and /u:/ were plotted (see Figure 2). The three vowels form the corners of a triangle whose area describes the range of articulation (ie, the larger the triangle area, the better the articulation). In the present study, the formant triangle value corresponds to the ratio of the patient’s triangle area to the mean triangle area of all HC participants.
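The formant triangle value can be illustrated with a minimal sketch. It assumes that each vowel is represented by one (F1, F2) point in Hz and that the area is computed with the standard shoelace formula; the original acoustic pipeline is described elsewhere and may differ in detail. The function names and all numeric values below are made up for illustration.

```python
def triangle_area(p_i, p_a, p_u):
    """Area of the triangle spanned by three (F1, F2) points (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p_i, p_a, p_u
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


def relative_triangle(vowels, control_areas):
    """Ratio of one speaker's triangle area to the mean area of the controls."""
    mean_control = sum(control_areas) / len(control_areas)
    return triangle_area(*vowels) / mean_control


# Hypothetical formant values in Hz: (F1, F2) for /i:/, /a:/, /u:/.
patient = ((300.0, 2300.0), (750.0, 1300.0), (320.0, 800.0))
# Hypothetical control areas in Hz^2.
controls = [400000.0, 350000.0, 450000.0]

ratio = relative_triangle(patient, controls)
print(ratio)
```

A ratio below 1 indicates a triangle smaller than the control mean, which is the direction reported for patients before treatment.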

Formant triangle: For the triangle, the first and second formant of /i:/, /a:/, and /u:/ were plotted. The surface area describes the range of articulation.
These acoustic metrics were complemented with perceptual judgments of intelligibility: 3 different raters without any phonetic education rated every single recording with the aNTID score.25 Using the aNTID score, the rater can evaluate intelligibility with 6 levels, ranging from perfectly intelligible (6 points) to completely unintelligible (1 point). The statistical analysis of the scores was performed with Welch’s
Analysis of Imaging Data
Preprocessing and statistical analysis of the MRI image data was done with SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/) executed in MATLAB Version 8.1 (R2013a) (http://www.mathworks.com). fMRI image volumes were realigned to the first volume of each session and coregistered to the structural T1 image. The SPM8 segmentation procedure was applied on the T1 image to carry out spatial normalization to standard MNI coordinate space. After applying the normalization parameters to all fMRI images, they were smoothed with an isotropic 8-mm (full width at half maximum) Gaussian filter.
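For readers less familiar with smoothing kernels, the 8-mm full width at half maximum (FWHM) can be converted to the standard deviation of the Gaussian with the usual relation sigma = FWHM / (2 * sqrt(2 * ln 2)). The sketch below only illustrates this conversion; it is not part of the SPM8 pipeline, and the function name is hypothetical.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert the FWHM of a Gaussian to its standard deviation (same units)."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(8.0)
print(round(sigma_mm, 3))  # 3.397

# Sanity check: at half the FWHM (4 mm) the kernel has dropped to half its peak.
half_height = math.exp(-(4.0 ** 2) / (2.0 * sigma_mm ** 2))
```

An 8-mm FWHM kernel therefore corresponds to a standard deviation of roughly 3.4 mm.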
First-level analysis was performed using a general linear model with 2 regressors of interest (HIGH, NORMAL) and 8 regressors of no interest (button press, condition image, and 6 realignment parameters). We then computed contrast images for the conditions HIGH, NORMAL, and the between conditions effect HIGH > NORMAL on the single participant level. On the second level, we computed the main effects of condition per group (HC, patients before and after treatment) by means of a 1-sample
To test for general effects of LSVT, we used a paired
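The first-level contrasts described above can be written as weight vectors over the model's regressors. The sketch below assumes a column order of HIGH, NORMAL, button press, condition image, and 6 realignment parameters, and it omits the session constants that a real design matrix would also contain. The regressor names, the helper functions, and the beta values are hypothetical and serve only to show how a contrast such as HIGH > NORMAL is formed.

```python
REGRESSORS = ["HIGH", "NORMAL", "button_press", "condition_image"] + [
    f"realign_{i}" for i in range(1, 7)
]


def contrast(weights):
    """Build a contrast vector with zeros for all regressors not named."""
    return [weights.get(name, 0.0) for name in REGRESSORS]


CONTRASTS = {
    "HIGH": contrast({"HIGH": 1.0}),
    "NORMAL": contrast({"NORMAL": 1.0}),
    "HIGH > NORMAL": contrast({"HIGH": 1.0, "NORMAL": -1.0}),
}


def contrast_value(betas, weights):
    """Weighted sum of parameter estimates for one voxel."""
    return sum(b * w for b, w in zip(betas, weights))


# Made-up parameter estimates for a single voxel.
betas = [1.5, 1.0, 0.2, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(contrast_value(betas, CONTRASTS["HIGH > NORMAL"]))  # 0.5
```

The nuisance regressors receive zero weight in every contrast, so they absorb variance without contributing to the tested effect.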
Results
Epidemiological and Clinical Data
The age of patients ranged from 54 to 74 years (M = 63.54; SD = 7.22); the age of HCs ranged from 62 to 75 years (M = 67.72; SD = 3.58). Between-group analyses revealed no significant differences [
Reading Performance
Compared with HCs, patients before LSVT showed significantly lower scores in the formant triangle [
Functional Imaging Results
Main Effects of Conditions
As the main effect of the NORMAL condition, both groups exhibited increased blood oxygen level–dependent (BOLD) signal in a widespread network of frontal, parietal, temporal, and occipital areas, for example, the left-sided frontal operculum (Brodmann’s area [BA] 48), left SMA (BA 06), left-sided temporal regions, and occipital areas (BA 19). As the main effect of the HIGH condition, we observed increased BOLD signal in largely the same areas independently in both groups (see the figure and tables in supplementary material).
Differential Effects Between Conditions
For the contrast HIGH > NORMAL, the pooled analysis, including HCs and patients before LSVT, revealed increased activation in the bilateral SMA (BA 06), bilateral superior temporal areas (BA 22, 38, 48), bilateral rolandic operculum (BA 48), left Broca’s area (BA 45), left medial cingulum (BA 23), left insula (BA 48), and right precentral (BA 06) and right medial temporal areas (BA 21; see Figure 3A, Table 2).

(A) Neural correlates of HIGH > NORMAL: overall analysis with healthy controls and patients; (B) treatment-induced changes of cortical activation patterns during NORMAL in patients.
Neural Correlates Associated With Covert Speech Paradigm. a
Abbreviation: SMA, supplementary motor area.
Clusters applied an uncorrected threshold of
Right (R), left (L).
Between-Group Effects
During NORMAL, patients before LSVT showed significantly reduced activation in comparison to HCs in the bilateral SMA (see Table 2). There were no significant differences between the groups during HIGH and for the contrast HIGH > NORMAL.
Effects of LSVT
After LSVT, the above-mentioned hypoactivity in patients during the NORMAL condition was no longer observable. Furthermore, after LSVT, patients showed significantly increased activity for the HIGH > NORMAL contrast as compared with HCs in the left medial temporal area (MNI coordinate for peak voxel: −54, −26, −4; cluster size = 131 voxels; BA = 21;
Discussion
This is the first fMRI study investigating the neural correlates of high-effort LSVT in patients with PD using a covert speech production paradigm. The correlation of functional data with behavioral measures provides novel insight into PD-specific speech impairments and therapy-related changes on a neural level.
First, we found hypoactivation of the rostral SMA as part of an impaired mesial fronto-striatal dopaminergic network associated with covert speech production in dysarthric patients. Second, we found LSVT to effectively alleviate dysarthria in our study population, which was accompanied by partial normalization of SMA hypoactivation. Third, we identified cortical patterns underlying increased intensity of covert speech production. Finally, we showed that successful LSVT leads to an increased activation in exactly these areas associated with increased intensity of covert speech production (ie, mainly right superior temporal area and right supramarginal gyrus). Our data, therefore, suggest that LSVT leads to a partial transfer of the cortical activation patterns for high-intensity speech into regular conversational speech.
Main Effect of the Paradigm and Neural Correlates of High Intensity
When analyzing the activation patterns across conditions and groups, we observed activation in areas known for their role in speech production (see Figure 3). Hence, we assumed an appropriate performance in our sample as well as validity of the covert reading paradigm. The HIGH > NORMAL contrast revealed increased BOLD signal in a secondary motor network, including the left Broca’s area (BA 45), left insula (BA 48), bilateral SMA, and left medial cingulum. These regions are well known for their role in speech production. Whereas Broca’s area is commonly described as a main speech production area,36 the left insula has been shown to be important for articulation.37-39 The SMA is involved in, among other functions, the initiation of vocalization,40 control of speech motor output,41 and articulation as well as vocalization.42 As the medial cingulum is possibly connected to caudal premotor and motor areas,43 it might additionally be part of this secondary motor network, which modulates precentral motor neurons. Increased temporal activity was also a prominent finding in this contrast; temporal activations are frequently observed during speech production paradigms.36,39,42,44 In line with the intended design of this paradigm, we interpret this cortical network as the neural correlate of high intensity (speech), which is the main modulatory tool of LSVT.
Correlates of Hypokinetic Dysarthria in Individuals With PD
Before treatment, patients showed reduced intelligibility consistent with dysarthria, which has frequently been shown to negatively affect daily living. 6 In addition, they showed lower scores in the formant triangle as compared with HCs, indicating altered articulation. 20 Our analysis did not reveal significant differences regarding average pitch between HCs and patients before LSVT, even though abnormal control of fundamental frequency (ie, pitch) has been reported in the literature. 45 During fMRI, the NORMAL condition was expected to demonstrate main differences in cortical speech functions between patients with PD and HCs because features of hypokinetic dysarthria are most obvious in normal (as opposed to loud) speech. We did not expect large differences between groups during HIGH because patients with PD are usually able to produce cued high-intensity speech, which is used during LSVT.
As mentioned before, several studies found mainly increased activity in speech-related areas during speech production tasks in patients with PD. These findings were mainly explained as a compensatory activity.13,26,28 Contrary to these results, we observed reduced secondary motor activations in bilateral rostral SMA. Our results seem to be contradictory, but it is necessary to consider substantial differences between paradigms. Most of the studies used PET, whereas this study is based on fMRI. Different results of similar semantic paradigms in these imaging methods have been described. 46 More important, increased activation in the SMA has been observed during overt compared with covert speech. 42 Thus, differences in imaging methods and experimental tasks used are possible explanations for deviating cortical patterns. Additionally, it should be mentioned that in individuals with PD, influence of attention on neural activation patterns during motor tasks 47 as well as the influence of different tasks on symptom severity of hypokinetic dysarthria have been reported. 48 However, throughout the literature of movement disorders, hypoactivity of motor networks in different syndromes and motor tasks has been described.28,49-51 Therefore, even if different aspects of motor impairment were investigated and other paradigms were used, we consider our finding quite consistent with longstanding findings of medial frontal cortical dysfunction in PD.
Finally, considering differences from other studies regarding imaging technique, paradigm, and sample, we interpret the observed secondary motor hypoactivity as a correlate of fronto-striatal dysfunction. This supports the hypothesis that striato-prefrontal dysfunction is one of the main reasons for hypophonia and hypokinetic dysarthria in PD and extends previously described patterns of dysfunctional networks.29
LSVT-Induced Changes
Intelligibility improved significantly in patients as a result of LSVT, as it has previously been reported by others. 24 Although LSVT is primarily designed to improve intensity of speech, it is important to highlight that, ultimately, the patient should be more intelligible as a product of various improved aspects of speech. As a likely underlying cause of improved intelligibility, patients also successfully improved their articulation after LSVT, as shown by the increased formant triangle. A relationship between improvements in formant triangle and a gain in loudness and vowel duration has been reported previously. 19 These findings underline the great importance of effort-based changes in speech production for improved intelligibility. Importantly, the formant triangle seems to represent a sensitive measure for PD-associated speech impairments and the quantification of therapy success. Even though pitch is not a direct target in LSVT, an increase in pitch has previously been observed by others, which might be an indirect consequence of enhanced respiration for phonation.9,31 In our study, however, no significant changes in pitch were observed.
As expected, the NORMAL condition not only yielded the largest differences between controls and dysarthric patients (see above), but it also showed the greatest therapy-induced changes. We observed increased activation in the right superior temporal area (BA 22) and right supramarginal gyrus, which correlated with the beneficial effects of LSVT. This generally confirms the previously reported LSVT-induced activation shift toward the right hemisphere. 14 Interestingly, Liotti et al 13 observed decreasing activity during overt speech after LSVT. They explained this as a therapy-induced normalization, arising from reduced compensatory needs. At first sight, our findings seem to be contradictory. However, as mentioned throughout the article, we adopted a paradigm specifically designed to reduce the need for compensation (along with other substantial methodological differences). Importantly, the areas showing a positive correlation between beneficial treatment effects on intelligibility of patients reading in their regular voice and activation during NORMAL were all part of the specific high-intensity network (HIGH > NORMAL) described above. Therefore, it is conceivable that LSVT leads to a partial transfer of the specific cortical activation patterns for high-intensity speech into regular conversational speech.
Additionally, after treatment, patients showed increased BOLD signal in the left medial temporal area for the HIGH > NORMAL contrast as compared with HC. This difference was not found in the comparison between HC and patients before treatment. An association between the left temporal lobe and intelligibility has been reported previously. 52 However, the therapy-related main effect in our data remains the activation increase in the contralateral hemisphere as described above.
A possible explanation for the observed right lateralization may be the design of LSVT itself. Increased intensity exerts a strong influence on prosody. For such suprasegmental features, a dominance of the right hemisphere has been well described.37,53,54 Thus, an interpretation of the therapy-induced rise of activity as a correlate of altered prosody during high-intensity speech seems quite plausible.
The increased BOLD signal in the right-sided supramarginal area might also be explained by altered prosody because this area is known to be important for paralinguistic features,55 especially rhythm.56 Given its role in prosody and speech processing, a treatment-induced increase of activity seems to fit well with artificially trained prosody. A complementary explanation for increased right temporal lobe activation may be increased self-monitoring and feedback regulation.44
Limitations
A potential drawback of a covert speech paradigm is that the participant’s performance (and compliance) cannot be directly monitored. However, using our paradigm, we observed increased task-related activity in classical areas for speech production 57 during covert speech and are, thus, confident that participants were engaged in the task. Reaction trials, overt speech practicing, and acoustic monitoring while scanning provided further evidence for sufficient task engagement.
Patients and HCs differed regarding their levels of education, which theoretically could have affected the reading performance and subsequently our results. However, we consider this as quite unlikely because the sentences used during fMRI were very short, simple, and clear. In addition, overt reading was practiced before recordings were conducted in order to eliminate possible reading difficulties. Nevertheless, studying a bigger sample would be crucial because our small sample size allows only conservative conclusions to be made. In addition, the study design was not controlled by a waiting list crossover design or similar. This was mainly for practical reasons because recruitment to and organization of such a study would have been extremely problematic. Thus, confirmatory studies are certainly needed.
Conclusion
Our data point toward a primary dysfunction of mesial fronto-striatal neuronal networks contributing to hypokinetic dysarthria in individuals with PD. Increased intensity of covert speech seems to primarily rely on a right temporo-parietal network engaged in prosodic elements of speech. Effective LSVT may be characterized by an increased recruitment of exactly these high-intensity cortical speech networks for regular conversational speech. By contributing to a better understanding of the underlying neural mechanisms involved in hypokinetic dysarthria in patients with PD and the neural changes induced by effective LSVT, our study may help further optimize individual speech therapy.
Supplemental Material
Supplementary_material – Supplemental material for Neural Correlates of Hypokinetic Dysarthria and Mechanisms of Effective Voice Treatment in Parkinson Disease
Supplemental material, Supplementary_material for Neural Correlates of Hypokinetic Dysarthria and Mechanisms of Effective Voice Treatment in Parkinson Disease by Alexander Baumann, Adelheid Nebel, Oliver Granert, Kathrin Giehl, Stephan Wolff, Wiebke Schmidt, Christin Baasch, Gerhard Schmidt, Karsten Witt, Günther Deuschl, Gesa Hartwigsen, Kirsten E. Zeuner and Thilo van Eimeren in Neurorehabilitation and Neural Repair
Footnotes
Acknowledgements
We are indebted to the patients and healthy volunteers for participating in this study.
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: AB: none; AN: none; OG: referee honoraria from UCB Pharma GmbH (2016); KG: none; SW: none; WS: none; CB: supported by a research grant from the Deutsche Forschungsgemeinschaft (DFG); GS: supported by a research grant from the DFG; KW: Government employee and receives funding for his research from the German Research Council and the German Ministry of Education and Health via his institution; received royalties from Elsevier; GD: none; GH: none; KEZ: received grants from Ipsen, Allergan; advisory board activity at Merz; TvE: received research support from the Deutsche Forschungsgemeinschaft (DFG), the Leibniz Association and EU Joint Programme – Neurodegenerative Disease Research (JPND). He received consultancies and honoraria from Eli Lilly, Shire, and the CHDI Foundation.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study has been supported by grants to TvE (DFG, EI 892 3-1).
References
