Abstract
Background. Little is known about the efficacy of language production treatment in subacute severe nonfluent aphasia. Although Melodic Intonation Therapy (MIT) is a language production treatment for this disorder, until now MIT effect studies have focused on chronic aphasia. Purpose. This study examines whether language production treatment with MIT is effective in subacute severe nonfluent aphasia. Methods. A multicenter, randomized controlled trial was conducted with a waiting-list control design: patients were randomly allocated to the experimental group (MIT) or the control group (control intervention followed by delayed MIT). In both groups, therapy started at 2 to 3 months poststroke and was given intensively (5 h/wk) for 6 weeks. In a second therapy period, the control group received 6 weeks of intensive MIT. The experimental group resumed their regular treatment. Assessment was done at baseline (T1), after the first intervention period (T2), and after the second intervention period (T3). Efficacy was evaluated at T2. The impact of delaying MIT on therapy outcome was also examined. Results. A total of 27 participants were included: n = 16 in the experimental group and n = 11 in the control group. A significant effect in favor of MIT on language repetition was observed for trained items, with mixed results for untrained items. Verbal communication improved significantly after MIT but not after the control intervention. Finally, delaying MIT was related to less improvement in the repetition of trained material. Conclusions. In these patients with subacute severe nonfluent aphasia, language production treatment with MIT was effective. Earlier treatment may lead to greater improvement.
Introduction
Aphasia is a common consequence of stroke; the incidence in a stroke population ranges from 21% to 38%.1-3 The recovery pattern from poststroke aphasia shows much variability: some patients recover quickly and experience no (or only mild) language problems within a few weeks poststroke, whereas others remain severely nonfluent, that is, they remain completely or almost unable to produce language.3-6 In the past decades, the evidence for the effectiveness of aphasia treatment has increased.7-11 However, most studies focused on treatment effects in chronic populations. Only a few examined aphasia treatment in the subacute phase poststroke, when treatment interacts with spontaneous recovery processes.8,10,12-15
This study addresses language production treatment for patients with severe nonfluent aphasia persisting until 2 to 3 months poststroke. These patients generally receive a combination of exercises aiming at language production, language comprehension, and nonverbal communication strategies. Although a focus on language production alone is thought to be too frustrating, to our knowledge this claim has not yet been examined.
Melodic Intonation Therapy (MIT) 16 is a language production treatment for severe nonfluent aphasia. It is based on the observation that these patients are often able to sing words they cannot produce during speech. The treatment involves repetitive singing of short sentences, while hand tapping the rhythm. Originally, it was claimed that melody activates language-capable regions in the right hemisphere.16-19 However, recent evidence highlights the critical role of rhythm and formulaic language in MIT. 20 The contribution of the right hemisphere is still unclear: whereas some report increased right hemisphere activation related to MIT success,21,22 others suggest that MIT-induced language recovery is related to reactivation of left perilesional regions.23,24 Many studies have shown the beneficial effects of MIT on language production.19,21,22,25,26 However, most are single-case or case series studies in chronic patients.27,28 A recent pilot study investigated the effect of a modified form of MIT in subacute patients with mild to severe aphasia and reported positive effects immediately after one short therapy session. 29 Thus, overall, the level of evidence for MIT is low and little is known about its effect in early phases poststroke.
To examine the efficacy of MIT as a therapy to improve language production, this method was contrasted with a control therapy not aiming at language production but using linguistic tasks often trained in severe nonfluent aphasia, such as written language production, language comprehension, and nonverbal communication strategies.
The aim of the present study was 3-fold. First, the efficacy of MIT as a language production therapy for severe nonfluent aphasia was evaluated in the subacute phase. Second, we investigated whether the timing of MIT within the subacute phase affects therapy outcome; it is suggested that early aphasia intervention yields greater improvement but, until now, the evidence for early treatment is not well established.10,15 Third, we examined potential determinants influencing MIT outcome.
Methods
Design
A waiting-list randomized controlled design was used (Figure 1). Between baseline (T1) and the assessment after the first intervention period (T2), participants in the experimental group received intensive MIT (6 weeks; 5 h/wk); no other language therapy was allowed in this period. Likewise, participants in the control group received only the intensive control treatment (6 weeks; 5 h/wk), thereby allowing comparison between MIT and the control therapy at T2. After T2, patients allocated to the control group received delayed MIT following the same protocol (6 weeks; 5 h/wk), which allowed the effect of the timing of MIT to be examined. Patients in the experimental group resumed their regular therapy after T2.

Figure 1. Flow diagram.
Patients were randomly allocated to either MIT or the control group. For this, a computer-generated random allocation sequence was used and the results placed in consecutively numbered sealed envelopes. The study was approved by the Medical Ethics Committee of Erasmus University Medical Center. Written informed consent was obtained from all participants or close relatives.
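As an illustration of the allocation step, a computer-generated sequence could be produced as sketched below. The text does not state whether simple or blocked randomization was used; the sketch assumes simple (unrestricted) randomization, and the function name, seed, and group labels are hypothetical.

```python
import random


def allocation_sequence(n_participants, seed):
    """Return a reproducible list of group labels, one per participant.

    Assumption: simple (unrestricted) randomization with equal probability
    for each arm; the trial's actual procedure is not described in detail.
    """
    rng = random.Random(seed)  # dedicated generator keeps the sequence reproducible
    return [rng.choice(["MIT", "control"]) for _ in range(n_participants)]


# Hypothetical usage: one label per consecutively numbered sealed envelope.
sequence = allocation_sequence(27, seed=2009)
envelopes = {number: arm for number, arm in enumerate(sequence, start=1)}
```

Simple randomization can produce unequal arms in small samples. Blocked randomization would be the usual alternative when balanced group sizes are required.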
For obvious reasons, participants and speech-language therapists (SLTs) could not be blinded to treatment condition. The researchers administering and scoring the assessments at each test moment were blinded to group allocation. In a few cases, blinding could not be maintained because the patients spontaneously informed the researcher about their therapy allocation.
Participants
Between 2009 and 2011, patients were recruited from 15 aphasia treatment services in hospitals, rehabilitation centers, and nursing homes in the Netherlands. Inclusion criteria were: aphasia after left hemisphere stroke, time poststroke 2 to 3 months, premorbidly right-handed, age 18 to 80 years, native language Dutch, and MIT candidate. MIT candidacy was based on the MIT literature19,30 and defined as follows: nonfluent aphasia (<50 words/min), articulation deficits (Aachen Aphasia Test31 [AAT], subscore spontaneous language ≤3), repetition severely affected (AAT subtest repetition ≤100), and moderate to good auditory language comprehension (AAT subtest auditory comprehension ≥33; functional comprehension ≥5). Exclusion criteria were: prior stroke resulting in aphasia, bilateral lesion, intensive MIT prior to the start of the study, severe hearing deficit, and psychiatric history relevant to language communication.
Interventions
All interventions were given by the patients’ SLT, experienced in language rehabilitation in aphasia. MIT was applied following the American manual.30,32 All SLTs were trained to deliver MIT according to the therapy protocol. The patient and the SLT sang short utterances together, while hand tapping the rhythm. Gradually, the support from the SLT decreased and singing was replaced by speaking. The study protocol listed a set of utterances of increasing complexity to be trained. The first utterances were selected because of their frequent use in daily-life communication (eg, “coffee please”). Later in the program, the utterances became longer, more complex, and less frequent in daily life. In addition, the SLT and the patient composed a set of self-chosen utterances that were functionally relevant to the patient, such as utterances related to hobbies. A minimum of 50% of the therapy time had to be spent on the utterances provided in the protocol.
The control intervention did not emphasize spoken output but focused on other linguistic modalities usually trained in severe nonfluent aphasia (writing, language comprehension, nonverbal communication strategies). Spoken output was not discouraged but the therapists did not provide feedback regarding patients’ verbal production and offered no structural training of language production.
To ensure therapy intensity, homework assignments were provided for both the MIT and the control group. We developed an iPod application containing short videos of a mouth singing the target utterances; patients could sing along with the video or repeat the utterance afterwards. Homework assignments for the control group included paper-and-pencil tasks such as written sentence completion, word–picture matching, and word categorizing tasks. The minimum amount of face-to-face therapy time was 3 h/wk. Therapists recorded therapy time per session, and patients or a close relative recorded homework time per session.
Regular practice (T2-T3 for the early MIT group, Figure 1) depended on the needs and capabilities of the individual patient. Most patients received a combination of language production therapy (eg, word-finding therapy), semantic therapy, and nonverbal communication strategies. Treatment intensity was not recorded.
Assessment
Prior to inclusion, the AAT was administered to establish inclusion eligibility. Assessments were performed at baseline (T1, within 2 weeks from the AAT), after the first treatment phase of 6 weeks (T2), and 6 weeks later (T3) (Figure 1). These assessments included the following tests: the Sabadel story retelling task measuring information content in connected speech33; the Amsterdam Nijmegen Everyday Language Test (ANELT) measuring verbal communication in daily life34; the AAT subtests repetition and naming; the MIT repetition task, a repetition task designed for the present study including 11 utterances trained during MIT and 11 matched untrained utterances; and the nonverbal Semantic Association Task (SAT) measuring semantic disorders.35 All language production tests were audio recorded.
Outcome measures were the Sabadel, the ANELT, the AAT subtests repetition and naming, and the MIT repetition task. The MIT repetition task allowed comparing the effects for trained and untrained material, that is, the direct effect of the MIT training on spoken repetition of the trained utterances and the generalization to untrained material. Generalization to the repetition of untrained material was also examined by the AAT subtest repetition. The AAT naming task was used to investigate further generalization to word finding and word production. The ANELT and the Sabadel were used to examine generalization to functional language use, respectively, in everyday communicative situations and in story retelling. In the ANELT, the researcher presents an everyday communicative situation, for example, a doctor’s visit. The patient’s task is to produce an adequate, verbal reaction. The Sabadel evaluates the production of connected speech. The patient’s task is to retell a story immediately after it has been presented by the researcher, supported by pictures.
Statistical Analysis
The power analysis was based on a small Sabadel pilot study. 36 From this, we calculated that a sample size of 15 patients per group was needed to provide 80% power (α = .05, β = .20) to detect a mean difference in improvement of 11.5 (standard deviation [SD] = 12.42) content information units (CIUs) on the Sabadel with an expected effect size of .90 (Cohen’s d). CIUs are words that are adequate and comprehensible in relation to the target story and are often used to assess communicative efficiency in aphasia. 37 We aimed to recruit 20 patients per intervention group, thereby taking into account that not all patients would complete the intervention.
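The sample size reasoning can be illustrated with the standard normal-approximation formula for comparing two means, n = 2(z_alpha + z_power)^2 / d^2 per group. This is a sketch under stated assumptions, not necessarily the calculation the authors performed: the text does not say whether the test was one-sided or two-sided, so both are shown, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(mean_difference, sd, alpha, power, two_sided):
    """Normal-approximation sample size per group for comparing two means."""
    d = mean_difference / sd  # standardized effect size (Cohen's d)
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Inputs taken from the text: difference 11.5 CIUs, SD 12.42, alpha .05, power .80.
n_one_sided = n_per_group(11.5, 12.42, alpha=0.05, power=0.80, two_sided=False)
n_two_sided = n_per_group(11.5, 12.42, alpha=0.05, power=0.80, two_sided=True)
print(n_one_sided, n_two_sided)  # prints: 15 19
```

With these inputs the standardized difference is 11.5/12.42, about .93, close to the reported effect size of .90. The one-sided approximation yields 15 per group and the two-sided one 19; an exact calculation based on the t distribution would give slightly larger numbers.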
Analyses were performed on an intention-to-treat basis. Independent t tests for continuous data and χ2 tests for categorical data were used to test group differences at baseline.
To evaluate the efficacy of MIT at T2, univariable linear regression analyses, adjusted for baseline, were used for all outcome measures. Furthermore, the proportion of participants in each group showing a clinically relevant improvement on the ANELT (>7) between T1 and T2 was compared by means of a χ2 test.
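The baseline-adjusted group comparison can be sketched as an ordinary least-squares model of the T2 score on the T1 score plus a group indicator, where the coefficient of the indicator is the adjusted treatment effect. The data below are synthetic, constructed so that the group effect is exactly 3 points; they are not trial data, and all function names are hypothetical. The original analysis was run in a statistics package, not with this code.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def baseline_adjusted_effect(baseline, post, group):
    """Fit post = b0 + b1 * baseline + b2 * group and return (b0, b1, b2)."""
    rows = [[1.0, float(b), float(g)] for b, g in zip(baseline, group)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, post)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))


# Synthetic example (NOT trial data): group 1 = MIT, group 0 = control.
baseline = [10, 20, 30, 40, 15, 25, 35, 45]
group = [1, 1, 1, 1, 0, 0, 0, 0]
post = [2 + b + 3 * g for b, g in zip(baseline, group)]

intercept, beta_baseline, beta_group = baseline_adjusted_effect(baseline, post, group)
```

Adjusting for the baseline score in this way compares groups at equal starting levels, which matters in small trials where chance imbalances at baseline are likely.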
To examine the impact of timing, we used a linear mixed model analysis with repeated measurements, analyzing possible between-groups differences over the total intervention time (T1, T2, T3), taking into account correlations within subjects.
For analysis of the determinants, the data on all patients pre- and post-MIT were taken into account (T1-T2 early MIT, T2-T3 delayed MIT). The potential determinants were age, gender, severity of the aphasia (score on the AAT Token Test), treatment intensity, time poststroke at the start of MIT, and the patients’ linguistic profile at the start of MIT (pre-MIT scores on AAT language repetition, AAT auditory comprehension, and the nonverbal SAT). Each was examined for all outcome measures by means of univariable linear regression analyses. Furthermore, patients were dichotomized on each outcome measure into responders (improvement >10 on MIT repetition, >14 on AAT repetition, >16 on AAT naming, >7 on the ANELT, >0 on the Sabadel) and nonresponders. Group differences were examined using χ2 tests and independent t tests. All analyses were performed using SPSS version 18.0.
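The responder classification amounts to a per-measure threshold on the improvement score. A minimal sketch follows, using the thresholds listed above; the dictionary keys and function name are hypothetical, and the sketch assumes improvement is computed as the post-MIT score minus the pre-MIT score.

```python
# Improvement thresholds (strictly greater than) as listed in the text.
RESPONDER_THRESHOLDS = {
    "MIT repetition": 10,
    "AAT repetition": 14,
    "AAT naming": 16,
    "ANELT": 7,
    "Sabadel": 0,
}


def is_responder(measure, pre_score, post_score):
    """Return True if the improvement exceeds the threshold for this measure."""
    improvement = post_score - pre_score  # assumed definition of improvement
    return improvement > RESPONDER_THRESHOLDS[measure]


# Hypothetical scores: an ANELT gain of 8 counts as a response, a gain of 7 does not.
print(is_responder("ANELT", 20, 28), is_responder("ANELT", 20, 27))  # prints: True False
```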
Results
Participants
In each center, all aphasia patients were screened by the SLT to establish whether they fit the clinical picture for MIT. Information on the proportion of screened patients eligible for inclusion in the study was only available from the main study center (64% eligible for inclusion); here, the reasons for nonparticipation were failure to meet the inclusion criteria (66.7%), early discharge (22.2%), and refusal to participate (11.1%).
A total of 27 patients were included in the study: 16 were allocated to the experimental group and 11 to the control group. Four patients withdrew from MIT after 1 or 2 weeks because they felt uncomfortable with the therapy or were disappointed by their progress. Thus, the required number of 15 patients per group was not achieved. Figure 1 presents the CONSORT diagram of patient flow.
Table 1 presents patient characteristics. Except for gender (χ2 = 4.03, p = .045), at baseline there were no significant differences between the 2 groups.
Table 1. Baseline Characteristics of the 2 Study Groups.
Abbreviations: SD, standard deviation; LH, left hemisphere; AAT, Aachen Aphasia Test; ANELT, Amsterdam Nijmegen Everyday Language Test; CIU, content information units; MIT, Melodic Intonation Therapy.
Level of education: 1 = lowest (primary school), 8 = highest (university).
Handedness before stroke (Edinburgh Handedness Inventory and/or medical information).
Efficacy of MIT
There was no significant difference in treatment intensity between the 2 groups (MIT: mean = 6.52 h/wk [SD = 3.55]; control: mean = 5.67 h/wk [SD = 1.41]; t = −.71, p = .49).
Table 2 presents the difference scores of both groups between T1 and T2. At T2, the MIT group showed significant improvement on all tasks, except for the Sabadel task. The control group showed significant improvement only on the repetition of untrained MIT items.
Table 2. Changes Over the Intervention Period T1 to T2 on All Outcome Measures and Group Comparisons Adjusted for Baseline.a
Abbreviations: MIT, Melodic Intonation Therapy; SD, standard deviation; ANELT, Amsterdam Nijmegen Everyday Language Test; AAT, Aachen Aphasia Test.
a Positive mean scores represent an increase in outcome score over the intervention period. A positive β indicates more effect in the experimental group than in the control group. A negative β indicates more effect in the control group than in the experimental group. Significant values are represented in bold.
The linear regression analysis revealed a significant difference in improvement at T2 between the 2 groups for the MIT repetition test (trained items) and on the AAT subtest repetition. Furthermore, a trend was observed for one functional task: the ANELT (Table 2). Because of the gender difference between the 2 groups, we controlled for this potentially confounding variable; in addition we controlled for aphasia severity. In both cases, this did not alter the results.
The mean improvement of 6.6 points on the ANELT in the MIT group approaches the threshold for clinically relevant improvement at the individual level (>7 points), as specified for the ANELT.34 In the experimental group, 35.7% of the participants showed an improvement >7 on the ANELT, considerably more than in the control group (9.1%). However, this difference did not reach significance (χ2 = 2.39, p = .12, Fisher’s exact test p = .18).
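The comparison of proportions can be checked with a short calculation. The cell counts are not given in the text; the sketch assumes 5 of 14 participants improved in the experimental group (35.7%) and 1 of 11 in the control group (9.1%), counts inferred from the reported percentages. Function names are hypothetical.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square without continuity correction, with its p value (1 df)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))  # upper tail of the chi-square distribution with 1 df
    return chi2, p


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value: sum of tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= observed * (1 + 1e-9))


# Assumed counts (inferred from percentages): improved / not improved per group.
chi2, p_chi2 = chi_square_2x2(5, 9, 1, 10)
p_fisher = fisher_exact_two_sided(5, 9, 1, 10)
print(round(chi2, 2), round(p_chi2, 2), round(p_fisher, 2))  # prints: 2.39 0.12 0.18
```

Under these assumed counts the calculation reproduces the reported values. With expected cell counts this small, the exact test is generally the more appropriate of the two.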
Differences Over Time
The linear mixed model analysis showed a main effect of time on all outcome measures: Sabadel: F = 5.49, p = .011; ANELT: F = 7.82, p = .003; AAT naming: F = 11.37, p = .001; AAT repetition: F = 16.33, p < .001; MIT repetition trained items: F = 26.62, p < .001; MIT repetition untrained items: F = 17.19, p < .001. This effect of time was present on the repetition tasks for both groups, but only for the experimental group on the more functional tasks: Sabadel (experimental group F = 5.30, p = .02; control group F = 1.46, p = .28), ANELT (experimental group F = 8.81, p = .004; control group F = 1.21, p = .34) and naming (experimental group F = 19.92, p < .001; control group F = 1.77, p = .27). Thus, whereas both groups improved over time on language repetition, only the experimental group showed significant improvement over time on the functional tasks. The analysis revealed an interaction between time and intervention group for repetition of trained items (F = 8.89, p = .001). Over time, the experimental group improved more on the repetition of trained items than the control group.
Although the analysis shows no interaction between time and intervention group for any of the other variables, visual inspection of the improvement patterns (Figure 2) suggests that the results favor the experimental group on all measures.

Figure 2. Improvement over time: mean score ± 1 SD.
The differences do not reach significance because the study is underpowered.
Determinants for Therapy Outcome
Of all potential determinants, only treatment intensity and time poststroke at the start of MIT had an impact on one or more outcome variables. Treatment intensity predicted outcome on the repetition of trained items in the MIT task (β = .04, p = .02). Time poststroke at the start of MIT predicted outcome on untrained items in the MIT task (β = −.68, p = .01), on AAT repetition (β = −1.54, p = .02), and on the ANELT (β = −.46, p = .04). The earlier MIT was started, the greater the improvement on these outcome measures. No significant differences were found between the groups of responders and nonresponders.
Discussion
This study shows that training language production with MIT has a beneficial effect on language production in severe nonfluent aphasia in the subacute phase poststroke. The experimental group, receiving early MIT, showed significant improvement on all outcome measures except for the Sabadel. Furthermore, their improvement in language repetition was significantly greater than that in the control group, receiving control therapy of the same intensity from the same time poststroke. This effect was present for trained (MIT test) and untrained material (AAT subtest repetition), indicating a generalization to untrained material. Finally, the considerable difference between the MIT and the control group in improvement on verbal communication (as measured by the ANELT) provides support for generalization of these capabilities to verbal communication in daily life.
The study was underpowered to detect significant effects on most of the outcome measures, which is a clear limitation. However, all observed differences, although not significant, are in favor of the MIT group. This suggests that significant differences might be found in a larger sample. A larger study verifying our observations is therefore worthwhile.
The role of treatment intensity is worth considering. Several studies have shown a relation between treatment intensity and treatment effect: higher intensity yields larger treatment effects.7,8,10,12 In this study, we chose a high treatment intensity that is clinically feasible in the subacute stage poststroke, both in terms of patient burden and in the context of a rehabilitation program that entails other therapies (eg, physiotherapy) as well. As such, the results of the study are relevant for clinical practice. However, it is possible that with higher treatment intensity larger treatment effects and generalization to daily life communication would have been observed.
These results provide support for the efficacy of language production training with MIT in subacute severe nonfluent aphasia. Contrary to the belief of many clinicians that a focus on language production is too frustrating for patients with subacute severe nonfluent aphasia, this study shows that intensive language production training is possible and effective in this population. However, our study does not indicate that MIT is the best way to achieve improved language production in this group. A direct comparison between different language production interventions is needed to resolve this issue. Similarly, it is possible that an adaptation of the MIT technique might yield better results. Several therapeutic elements of MIT may be responsible for its effects on language production: melody, rhythm, hand tapping, or reduction of speed in singing versus speaking.20,21,38,39 A recent study comparing the effect of melody and rhythm on language production in nonfluent aphasia showed that melody had no additional effect over rhythm. 20 In the present study, it was impossible to unravel the impact of the MIT components, since we used the original MIT technique. Similarly, our study allows no conclusions regarding the role of formulaic language. 20 MIT involved both formulaic (eg, “How are you?”) and nonformulaic utterances (eg, “The ministers are talking nonsense”).
Our 2 tasks for functional language (the Sabadel and ANELT) showed different results. In contrast to the improvement of everyday life communication seen with the ANELT, the Sabadel failed to show improvement in either of the intervention groups. However, it is possible that this task is not suitable for measuring verbal communication in people with severe nonfluent aphasia; storytelling is known to be extremely difficult for severely aphasic patients. 40 In this study, many participants (48%) were unable to produce more than one adequate word on the Sabadel story retelling, both before and after the intervention period. In contrast, these same patients were able to produce adequate words and utterances on the ANELT.
The observed contrast between trained and untrained material in the MIT repetition task is clinically relevant. From the start of MIT it was emphasized that personally relevant utterances should be trained. The results of this study underline the importance of carefully selecting the target utterances. One of the limitations of this study is that we did not examine patients’ use of the trained utterances in their daily life communication. This would require partner questionnaires, which are not very reliable. Anecdotal reports from partners suggest that patients did benefit from the improvement on trained utterances in daily life; this is in line with the results of Stahl et al.20
Remarkably, the present study also showed that a small difference in the timing of MIT had considerable consequences for its effect. A delay of merely 6 weeks was related to less improvement. Although the difference between early and delayed MIT was only significant for the repetition of trained items, the overall larger improvement in the early MIT group (Figure 2) suggests that timing does affect therapy outcome. The earlier application of MIT may have had more interaction with the processes of spontaneous recovery, which mainly occur during the first 3 months after stroke.3,4,6,41 This raises the intriguing possibility that the timing of an intervention influences its neuromodulatory effects during recovery. The timing of aphasia treatment remains an important but unresolved question. A meta-analysis showed that aphasia treatment in the first 3 months after stroke yields larger effect sizes than treatment in later phases.10 In contrast, a recent study reported no additional beneficial effect of aphasia treatment in the first 4 months after stroke.15 However, this latter study examined aphasia treatment in an unselected group of aphasic patients, with heterogeneous and low-intensity treatment paradigms. To our knowledge, besides the present study, no other study has evaluated the effect of delaying a specific treatment; nevertheless, clinically, this is a highly relevant issue.
All participants fit the reported criteria for MIT candidacy, that is, nonfluent aphasia after left hemisphere lesion, language repetition severely disordered, and relatively good auditory comprehension.19,30 Nevertheless, large individual differences with respect to MIT success were observed. To implement MIT more effectively in clinical practice, we examined potential determinants influencing therapy outcome. However, we were unable to detect any determinants in patients’ profiles before MIT. Earlier studies reported age and initial aphasia severity to be important predictors for language recovery; however, these results are inconsistent and the relation between these factors and the type of intervention is not clear.3,5,11,41 Neurological variables, such as size and location of the lesion, may play an important role.3,5,41,42 A limitation of the present study is that lesion characteristics were not documented in detail. Although information on the lesion was collected from the participants’ medical records (scans and/or scan reports), these records often lacked information on the exact size and location of the lesion. Because routine scans were made shortly after the stroke, at the moment of hospitalization, many of the scans showed no structural damage, and no reliable information on the lesion was available at the start of MIT.
Another limitation is the lack of follow-up measurements. Stahl et al 20 suggested that the effects of MIT are stable during a 3-month period after treatment completion.
In conclusion, the present study shows that intensive oral language production training is possible and effective in subacute severe nonfluent aphasia.
Footnotes
Acknowledgements
We thank all therapists and evaluators in the participating centers for their participation and commitment.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Van der Meulen and Van de Sandt-Koenderman are preparing a Dutch version of the MIT treatment program, to be published by Bohn Stafleu van Loghum, the Netherlands. The publisher has had no influence on the data collection, methods, interpretation of the data, and final conclusions.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Stichting Rotterdams Kinderrevalidatie Fonds Adriaanstichting (Grant No. 2007/0168 JKF/07.08.31 KFA).
