To separate the information in speech from speaker-specific attributes, we have used a modified cepstral analysis followed by a peak-tracking process in the cepstral domain. The resulting tracks show a high degree of similarity between different male speakers saying the same words and provide a basis for speaker-independent word recognition. The method does not involve pitch frequency determination.
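As a rough illustration of the two stages named in the abstract, the sketch below computes a cepstrum per frame and then links cepstral peaks across frames. It rests on several assumptions: the abstract does not describe the modification to the cepstral analysis, so the ordinary real cepstrum is used in its place; the nearest-neighbour linking rule, the `max_jump` threshold, and all function names are hypothetical and are not taken from the article.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def real_cepstrum(frame, floor=1e-12):
    """Ordinary real cepstrum: inverse DFT of the log magnitude spectrum.

    Assumption: this stands in for the article's unspecified modified analysis.
    """
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [v.real for v in dft(log_mag, inverse=True)]


def pick_peaks(values, lo, hi, count):
    """Indices of the `count` largest local maxima within values[lo:hi]."""
    peaks = [i for i in range(max(lo, 1), min(hi, len(values) - 1))
             if values[i - 1] < values[i] >= values[i + 1]]
    peaks.sort(key=lambda i: values[i], reverse=True)
    return sorted(peaks[:count])


def track_peaks(frames_peaks, max_jump=2):
    """Greedily link peaks in consecutive frames by nearest quefrency index.

    Hypothetical rule: a track continues only if a peak in the next frame
    lies within `max_jump` indices; unmatched peaks start new tracks.
    """
    tracks = []
    for f, peaks in enumerate(frames_peaks):
        free = list(peaks)
        for track in tracks:
            last_f, last_q = track[-1]
            if last_f != f - 1 or not free:
                continue  # track already ended, or nothing left to match
            nearest = min(free, key=lambda q: abs(q - last_q))
            if abs(nearest - last_q) <= max_jump:
                track.append((f, nearest))
                free.remove(nearest)
        tracks.extend([(f, q)] for q in free)
    return tracks


# Toy frame: an impulse plus a half-amplitude echo 8 samples later.
# The echo appears as a cepstral peak at quefrency index 8.
frame = [0.0] * 64
frame[0] = 1.0
frame[8] = 0.5
cepstrum = real_cepstrum(frame)
print(pick_peaks(cepstrum, 1, 32, 1))

# Peak indices from three consecutive frames linked into two tracks.
print(track_peaks([[3, 10], [4, 10], [5, 11]]))
```

In practice each track would be a sequence of (frame, quefrency) pairs that could be compared between speakers; how the article performs that comparison is not stated in the abstract.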