Abstract
As robots enter the human environment and come into contact with inexperienced users, they need to be able to interact with users in a multimodal fashion; keyboard and mouse are no longer acceptable as the only input modalities. In this paper we introduce a novel approach for programming robots interactively through a multimodal interface. The key characteristic of this approach is that the user can provide feedback interactively at any time, during both the programming and the execution phase. The framework takes a three-step approach to the problem: multimodal recognition, intention interpretation, and prioritized task execution. The multimodal recognition module translates hand gestures and spontaneous speech into a structured symbolic data stream without abstracting away the user's intent. The intention interpretation module selects the appropriate primitives to generate a task based on the user's input, the system's current state, and robot sensor data. Finally, the prioritized task execution module selects and executes skill primitives based on the system's current state, sensor inputs, and prior tasks. The framework is demonstrated by interactively controlling and programming a vacuum-cleaning robot. The demonstrations exemplify the interactive-programming and plan-recognition aspects of the research.
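To make the three-step data flow concrete, the following Python sketch wires the pipeline together as the abstract describes it: recognized speech and gestures become symbolic events, an interpreter turns events plus state and sensor data into a task, and a prioritized executor runs the task's skill primitives. This is a minimal illustration under the assumption of a simple event-driven loop; every class, method, and symbol name here is hypothetical and not taken from the paper.

from dataclasses import dataclass, field
from typing import Dict, List

# NOTE: all names below are hypothetical illustrations, not identifiers
# from the paper's implementation.

@dataclass
class SymbolicEvent:
    """Structured symbol produced by multimodal recognition."""
    modality: str   # e.g. "speech" or "gesture"
    symbol: str     # e.g. "clean", "stop", "point_at(region_3)"

@dataclass
class Task:
    name: str
    priority: int                                  # lower value = more urgent
    primitives: List[str] = field(default_factory=list)

class MultimodalRecognizer:
    """Step 1: translate raw speech/gesture input into symbolic events,
    preserving the user's intent rather than abstracting it away."""
    def recognize(self, raw_input: Dict[str, str]) -> List[SymbolicEvent]:
        events = []
        if "speech" in raw_input:
            events.append(SymbolicEvent("speech", raw_input["speech"]))
        if "gesture" in raw_input:
            events.append(SymbolicEvent("gesture", raw_input["gesture"]))
        return events

class IntentionInterpreter:
    """Step 2: map symbolic events, system state, and sensor data to a task."""
    def interpret(self, events: List[SymbolicEvent], state: dict,
                  sensors: dict) -> Task:
        if any(e.symbol == "stop" for e in events):
            return Task("halt", priority=0, primitives=["stop_motors"])
        return Task("clean_region", priority=1,
                    primitives=["drive_to_region", "run_vacuum"])

class PrioritizedExecutor:
    """Step 3: queue tasks and execute their skill primitives by priority,
    so user feedback can preempt lower-priority work at any time."""
    def __init__(self):
        self.queue: List[Task] = []

    def submit(self, task: Task) -> None:
        self.queue.append(task)
        self.queue.sort(key=lambda t: t.priority)

    def step(self) -> None:
        if self.queue:
            task = self.queue.pop(0)
            for p in task.primitives:
                print(f"executing primitive: {p}")

# Wiring the pipeline: user input can arrive during programming or execution.
recognizer = MultimodalRecognizer()
interpreter = IntentionInterpreter()
executor = PrioritizedExecutor()

events = recognizer.recognize({"speech": "clean",
                               "gesture": "point_at(region_3)"})
executor.submit(interpreter.interpret(events, state={}, sensors={}))
executor.step()

The prioritized queue is the piece that makes interactive feedback possible in this sketch: a "stop" utterance produces a priority-0 task that is executed ahead of any cleaning task already queued.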
