Assessing the Performance of Three Methods for Separating Non-Spontaneous and Spontaneous Speech Through Simulation

Abstract

The ability to distinguish spontaneous from non-spontaneous speech is useful in several settings, such as handling forensic evidence, sorting voice-mail responses from voice-mail menus, and automatically segmenting spontaneous responses from prepared questions. The last of these arises when building a database of spontaneous speech from recordings of a speaker responding spontaneously to prepared prompts. This paper outlines and compares three methods for automatically classifying spontaneous and non-spontaneous speech, and presents experimental results on the performance of all three methods, evaluated on high-quality simulated data. All three methods are based on an analysis of the probability distributions of prosodic features extracted from speech signals.

Keywords

Spontaneous speech; non-spontaneous speech; prosodic features; probability distribution; Hellinger's method; Gaussianity
