Abstract
Recognizing handwritten equations is a challenging problem, and even more so when they are written in a classroom environment. However, because the video of the handwritten text and the accompanying audio refer to the same content, combining video-based and audio-based recognition has the potential to significantly improve recognition accuracy. In this paper, we focus on improving character recognition accuracy for handwritten mathematical content in videos by combining video-based and audio-based recognizers, and we propose an end-to-end recognition system. The system includes components for video preprocessing, selecting the characters that may benefit from audio-video combination, establishing a correspondence between the handwritten and the spoken content, and finally combining the recognition results from the audio-based and video-based recognizers. The current implementation makes use of a modified open-source text recognizer and a commercially available phonetic word spotter. For evaluation, we use videos recorded in a classroom-like environment, and our experiments demonstrate that our techniques can achieve significant improvements in character recognition accuracy.
