Abstract
As robots enter the human environment and come into contact with inexperienced users, they need to be able to interact with users in a multimodal fashion; keyboard and mouse are no longer acceptable as the only input modalities. In this paper we introduce a novel approach for programming robots interactively through a multimodal interface. The key characteristic of this approach is that the user can provide feedback interactively at any time, during both the programming and the execution phase. The framework takes a three-step approach to the problem: multimodal recognition, intention interpretation, and prioritized task execution. The multimodal recognition module translates hand gestures and spontaneous speech into a structured symbolic data stream without abstracting away the user's intent. The intention interpretation module selects the appropriate primitives to generate a task based on the user's input, the system's current state, and robot sensor data. Finally, the prioritized task execution module selects and executes skill primitives based on the system's current state, sensor inputs, and prior tasks. The framework is demonstrated by interactively controlling and programming a vacuum-cleaning robot. The demonstrations exemplify the interactive-programming and plan-recognition aspects of the research.
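To make the three-step data flow concrete, the following Python sketch wires the pipeline together as the abstract describes it: recognized speech and gestures become symbolic events, an interpreter turns events plus state and sensor data into a task, and a prioritized executor runs the task's skill primitives. This is a minimal illustration under the assumption of a simple event-driven loop; every class, method, and symbol name here is hypothetical and not taken from the paper.

from dataclasses import dataclass, field
from typing import Dict, List

# NOTE: all names below are hypothetical illustrations, not identifiers
# from the paper's implementation.

@dataclass
class SymbolicEvent:
    """Structured symbol produced by multimodal recognition."""
    modality: str   # e.g. "speech" or "gesture"
    symbol: str     # e.g. "clean", "stop", "point_at(region_3)"

@dataclass
class Task:
    name: str
    priority: int                                  # lower value = more urgent
    primitives: List[str] = field(default_factory=list)

class MultimodalRecognizer:
    """Step 1: translate raw speech/gesture input into symbolic events,
    preserving the user's intent rather than abstracting it away."""
    def recognize(self, raw_input: Dict[str, str]) -> List[SymbolicEvent]:
        events = []
        if "speech" in raw_input:
            events.append(SymbolicEvent("speech", raw_input["speech"]))
        if "gesture" in raw_input:
            events.append(SymbolicEvent("gesture", raw_input["gesture"]))
        return events

class IntentionInterpreter:
    """Step 2: map symbolic events, system state, and sensor data to a task."""
    def interpret(self, events: List[SymbolicEvent], state: dict,
                  sensors: dict) -> Task:
        if any(e.symbol == "stop" for e in events):
            return Task("halt", priority=0, primitives=["stop_motors"])
        return Task("clean_region", priority=1,
                    primitives=["drive_to_region", "run_vacuum"])

class PrioritizedExecutor:
    """Step 3: queue tasks and execute their skill primitives by priority,
    so user feedback can preempt lower-priority work at any time."""
    def __init__(self):
        self.queue: List[Task] = []

    def submit(self, task: Task) -> None:
        self.queue.append(task)
        self.queue.sort(key=lambda t: t.priority)

    def step(self) -> None:
        if self.queue:
            task = self.queue.pop(0)
            for p in task.primitives:
                print(f"executing primitive: {p}")

# Wiring the pipeline: user input can arrive during programming or execution.
recognizer = MultimodalRecognizer()
interpreter = IntentionInterpreter()
executor = PrioritizedExecutor()

events = recognizer.recognize({"speech": "clean",
                               "gesture": "point_at(region_3)"})
executor.submit(interpreter.interpret(events, state={}, sensors={}))
executor.step()

The prioritized queue is the piece that makes interactive feedback possible in this sketch: a "stop" utterance produces a priority-0 task that is executed ahead of any cleaning task already queued.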
