Abstract
Empirical studies comparing newer text-to-speech (TTS) synthesis systems with older ones are lacking. This study used the Modified Rhyme Test (MRT) to compare the intelligibility of two speech synthesizers: DECtalk ‘Perfect Paul,’ one of the most popular ‘older’ synthesizers, and AT&T's Natural Voices ‘Mike,’ a ‘newer’ synthesizer. Each system was evaluated at three speech-to-noise (S/N) ratios (−5 dB, −8 dB, and −11 dB) in a within-subjects design. Aircraft engine noise at 85 dB(A), produced by a Cessna 172R flight simulator, served as background noise. Normal-hearing non-pilots served as subjects. Results indicated differences in intelligibility between the two speech synthesizers at each speech-to-noise ratio, with the AT&T product showing significantly better intelligibility than the DECtalk product. Potential applications of this research include guidance for integrating automated voice technologies in the cockpit and in similar systems that present elevated levels of background noise during normal communications and auditory display operations.
