Abstract
Automated speech technology (AST), for both speech recognition and speech generation, seems to be gaining acceptance as a valuable enhancement of the interface between computerized systems and their users. Many laboratory studies of AST performance have reported speech recognition accuracies in excess of 90% with high reliability across users, an indication that AST is nearly ready for full service in field applications. However, several recent field evaluation studies of experimental prototype training systems incorporating AST have found recognition accuracies in the 40-60% range, an indication that AST may not yet be ready for implementation in the next generation of high-tech training systems. While the most obvious conclusion to draw from these studies would be that AST works well in the laboratory but not in a complex integrated system in the field, a critical examination of these evaluation studies seems to indicate otherwise.
This paper will first review and critically examine several studies of automated speech recognition systems, both in the field and in the laboratory, in an attempt to understand the recognition performance differences outlined above. It will then propose that AST appears, in fact, to be a viable technology, and that the poor speech recognition performance observed in some integrated training systems is due more to the design and implementation of the entire system than to the recognition capabilities of the AST itself. This paper will discuss how guidelines for future system design and evaluation can be derived from the analysis of existing systems, and will present a first-approximation set of guidelines based upon the author's own evaluations of several prototype systems. Finally, the paper will propose that consideration of a few critical features of the total system implementation, which may be relatively minor from a hardware point of view, would probably result in a significant improvement in speech recognition performance and user friendliness, in addition to enhancing the performance of the system as a whole.
