Markerless Kinect-Based Hand Tracking for Robot Teleoperation

Abstract

This paper presents a real-time remote robot teleoperation method using markerless Kinect-based hand tracking. The tracking algorithm estimates the 3D positions of the index finger and thumb by processing depth images from the Kinect. The hand pose is then used as a model to specify the pose of the remote robot's end-effector in real time. Unlike gesture-based approaches, which send a limited set of motion commands, this method provides a way to send a whole task to a remote robot. The method has been tested in pick-and-place tasks.

Keywords

robot manipulator markerless Kinect

1. Introduction

If a task is too complex for an autonomous robot to complete, then human intelligence is required to make a decision and control the robot, especially when it is in unstructured dynamic environments. Furthermore, when the robot is in a dangerous environment, robot teleoperation may be necessary. Some human-robot interfaces (Yussof et al. [1]; Mitsantisuk et al. [2]) like joysticks, dials and robot replicas, have been commonly used, but these contacting mechanical devices require unnatural hand and arm motions to complete a teleoperation task.

A more natural way to communicate complex motions to a remote robot is to track the hand-arm motion that the operator uses to complete the required task, using contacting electromagnetic tracking sensors, inertial sensors and gloves instrumented with angle sensors (Hirche et al. [3]; Villaverde et al. [4]; Wang et al. [5]). However, these contacting devices may hinder natural human-limb motions.

Because vision-based techniques are non-contact and hinder hand-arm motions less, they have also been used. Vision-based methods often use physical markers placed on the anatomical body part (Kofman et al. [6]; Lathuilière and Hervé [7]; GuangLong Du et al. [8]). Many applications (Peer et al. [9]; Borghese and Rigiroli [10]; Kofman et al. [6]) use marker-based human motion tracking. However, because body markers may hinder motion in highly dexterous tasks and may become occluded, marker-based tracking is not always practical. Thus, a markerless approach seems better for many applications.

Compared with marker-based image tracking, markerless tracking is not only less invasive but also eliminates the problems of marker occlusion and identification (Verma [11]). Thus, markerless tracking may be a better approach for remote robot teleoperation. However, existing markerless human-limb tracking techniques have limitations that make them difficult to use in robot teleoperation applications. Many existing markerless tracking techniques capture images and compute the motion later as a post-process (Goncalves et al. [12]; Kakadiaris et al. [13]; Ueda et al. [14]; Rosales and Sclaroff [15]). For remote robot teleoperation with continuous robot motion, the markerless tracking has to run in real time. To allow the human operator to perform hand-arm motions for a task naturally and without interruption, the position and orientation of the hand and arm should be provided immediately. Many techniques can only provide 2D image information of the human motion (Koara et al. [16]; MacCormick and Isard [17]), and these tracking methods cannot be extended to give accurate 3D joint-position data. An end-effector of a remote robot requires the 3D position and orientation of the operator's limb-joint centres with respect to a fixed reference system, and identifying human body parts in different orientations has always been a significant challenge (Kakadiaris et al. [13]; Goncalves et al. [12]; Triesch and Malsburg [18]).

For robot teleoperation, there is limited research on markerless human tracking. Most techniques have used a human-robot interface based on hand-gesture recognition to control robot motion (Fong et al. [19]; Hu et al. [20]; Moy [21]). Coquin et al. and Ionescu et al. [22] developed markerless hand-gesture recognition methods that can be used for mobile robot control, where a few commands such as “go,” “stop,” “left” and “right” are enough. However, for object manipulation in 3D space, natural control and flexible robot motion cannot be achieved using gestures only. An operator who uses gestures has to think in terms of the limited, separate commands that the human-robot interface can understand, such as move up, down or forward. A better way of human-robot interaction would permit the operator to focus on the complex global task, as a human naturally does when grasping and manipulating objects in 3D space, instead of thinking about what type of hand motions are required. This calls for a method that allows the operator to complete the task using natural hand-arm motions while providing the robot with real-time information about the hand-arm motion, such as the anatomical position and orientation of the hand and arm (Kofman et al. [23]). However, to initialize the method of Kofman et al. [23], the human operator must assume a simple posture with an unclothed arm in front of a dark background, with the hand placed higher than the shoulder. A precise result cannot be obtained against a complex background. In addition, the operator would find it hard to work in cold weather because the arm is unclothed. That method is also sensitive to lighting, i.e., it is difficult to use when the scene is too bright or too dark.

This paper presents a method of remote robot teleoperation using markerless Kinect-based 3D hand tracking of the human operator (Figure 1). Markerless Kinect-based hand tracking is used to acquire the 3D anatomical position and orientation of the hand, and the data are sent to the robot manipulator through a human-robot interface so that the robot end-effector copies the operator's hand motion in real time. This natural way of communicating with the robot allows the operator to focus on the task instead of thinking in terms of the limited, separate commands that a gesture-based human-robot interface can understand. The non-invasive Kinect-based tracking avoids the hindrance to natural motion caused by physical sensors, cables and other contacting interfaces, and it avoids the marker occlusion and identification problems of marker-based approaches.

Figure 1.

Non-invasive robot teleoperation system based on the Kinect

2. Human hand tracking and positioning system

Human hand tracking and positioning is carried out by continuously processing RGB images and depth images of an operator who is performing the hand motion to complete a robot manipulation task. The RGB images and depth images are captured by the Kinect, which is fixed in front of the operator.

The Kinect has three autofocus cameras: two infrared cameras optimized for depth detection and one standard visual-spectrum camera used for visual recognition.

2.1 Kinect coordinate system

In Figure 2, an operator stands in front of the Kinect and controls a robot. We define the Kinect coordinate system as shown in Figure 2: axis X points upward, axis Y points rightward and axis Z is perpendicular to the image plane, along the depth direction. The Kinect can capture the depth of any object in its workspace. Figure 2 shows the index-finger tip (I), the thumb tip (T), the part of the hand between the thumb and the index finger (B) and the upper arm (U). The distances from the Kinect to I, T, B and U all differ: I and T are closest to the Kinect and the upper arm U is furthest. The 3D position of B is used to control the position of the robot end-effector, and I, T and B together are used to control its orientation.

Figure 2.

Depth of objects. K: Kinect; I: index-finger tip; T: thumb tip; B: the part of the hand between the thumb and the index finger; U: upper arm.

2.2 Image capture and segmentation of hand

In order to catch the hand motion used for controlling the robot manipulator, we need to separate the hand from the depth image. The arm is segmented from the body by thresholding the raw depth image.

A depth image D(i,j), shown in Figure 3a, records the depth of every pixel of the RGB image shown in Figure 3b. Assume that the distance between the human operator and the Kinect is no more than T (in metres) and that there is no other object between the operator and the Kinect. For all i and j in the depth image D(i,j), the body region C_b(i,j) is then segmented as:

C_{b} (i, j) = \{ d (i, j) \mid d (i, j) < T ;\ d (i, j) \in D ;\ i = 1, 2, \ldots, n ;\ j = 1, 2, \ldots, m \}

(1)

Figure 3.

Segmentation of hand and determination of thumb and index-finger tip positions

Where d(i,j) is the pixel of depth image D; n is the width of D and m is the height of D.

When the human operator holds out the hand to control the robot manipulator, the arm is closer to the Kinect than the rest of the body. We therefore first compute the mean depth M over the whole body region, including the arm:

M = \frac{\sum_{i = 1}^{n} \sum_{j = 1}^{m} C_{b} (i, j)}{m n}

(2)

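Because the arm is closer to the Kinect than the body, the arm region A(i,j) is then taken as the body pixels whose depth is below the mean:

A = \{ d (i, j) \mid d (i, j) \in C_{b} \ \text{and} \ d (i, j) < M \}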

(3)

The arm region A is shown in Figure 3c.
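As an illustration only, the segmentation of Eqs. (1)-(3) can be sketched in Python with NumPy. The function name `segment_arm`, the parameter `max_range` (standing for the threshold T), the treatment of a zero depth as a missing reading, and averaging over body pixels only rather than over all m·n pixels are assumptions of this sketch and are not specified in the text. The arm is taken as the body pixels closer than the mean, following the statement that the arm is closer to the Kinect than the body.

```python
import numpy as np

def segment_arm(depth, max_range):
    """Return boolean masks (body, arm) for a depth image given in metres."""
    depth = np.asarray(depth, dtype=float)
    # Eq. (1): keep pixels closer than the range threshold T.
    # Assumption: a depth of 0 marks a missing reading and is discarded.
    body = (depth > 0) & (depth < max_range)
    if not body.any():
        return body, body.copy()
    # Eq. (2): mean depth of the body region.
    # Assumption: the mean is taken over body pixels only.
    mean_depth = depth[body].mean()
    # Eq. (3): the arm is the part of the body closer than the mean depth.
    arm = body & (depth < mean_depth)
    return body, arm
```

With a fixed threshold this works only under the paper's stated condition that nothing stands between the operator and the sensor; any closer object would be labelled as arm.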

2.3 Determination of thumb and index-finger tip positions

The positions of thumb tip and index-finger tip are determined by an image that contains the arm. The arm region A_3d(x,y,z) can be reconstructed from A(i,j) as shown in Figure 3d.

For all 2D points (i,j) in the A(i,j), the 3D points can be calculated by:

A_{3 d} (x, y, z) = [i, j, d (i, j)]

(4)

Then project the 3D points A_3d(x,y,z) onto the YOZ plane, as shown in Figure 3e:

A_{Y O Z} (y, z) = A_{3 d} (0, y, z) = [j, d (i, j)]

(5)

Define the minimum-projection function f:

f (y) = \min_{z = 1, 2, ... m} (A_{YOZ} (y, z))

(6)

Determine the one local maximum (at y=y1) and the two local minima (at y=y2 and y=y3) of the minimum-projection function f. Then the 3D points I, T and B can be reconstructed by:

I (x, y, z) = \left\{ \begin{matrix} x = \sum_{x' = 1}^{n} A_{3d} (x', y_{2}, f (y_{2})) \\ y = y_{2} \\ z = f (y_{2}) \end{matrix} \right.

(7)

T (x, y, z) = \left\{ \begin{matrix} x = \sum_{x' = 1}^{n} A_{3d} (x', y_{3}, f (y_{3})) \\ y = y_{3} \\ z = f (y_{3}) \end{matrix} \right.

(8)

B (x, y, z) = \left\{ \begin{matrix} x = \sum_{x' = 1}^{n} A_{3d} (x', y_{1}, f (y_{1})) \\ y = y_{1} \\ z = f (y_{1}) \end{matrix} \right.

(9)
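The search for the two minima and the maximum of f can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the array is indexed as `depth[i, j]` with i mapped to x and j to y as in Eq. (4); the helper names are hypothetical; the two smallest local minima are taken as the fingertips without deciding which is the thumb; and x is taken as the row where the minimum depth occurs, which is one possible reading of Eqs. (7)-(9). No noise filtering is applied.

```python
import numpy as np

def min_projection(depth, arm_mask):
    """Eq. (6): smallest arm depth in each column y (inf where no arm pixel)."""
    return np.where(arm_mask, depth, np.inf).min(axis=0)

def local_minima(f):
    """Indices of local minima of f; a flat run of equal values counts once."""
    keep = np.concatenate(([True], f[1:] != f[:-1]))
    idx = np.flatnonzero(keep)
    g = np.concatenate(([np.inf], f[idx], [np.inf]))
    is_min = (g[1:-1] < g[:-2]) & (g[1:-1] < g[2:])
    return idx[is_min]

def locate_tips(depth, arm_mask):
    """Return (tip_a, tip_b, base) as (x, y, z) tuples, or None if not found."""
    depth = np.asarray(depth, dtype=float)
    arm_mask = np.asarray(arm_mask, dtype=bool)
    f = min_projection(depth, arm_mask)
    minima = local_minima(f)
    if len(minima) < 2:
        return None
    # Assumption: the two closest local minima are the two fingertips.
    y_a, y_b = sorted(minima[np.argsort(f[minima])[:2]])
    between = np.arange(y_a + 1, y_b)
    finite = between[np.isfinite(f[between])]
    if finite.size == 0:
        return None
    # The local maximum between the fingertips is taken as point B.
    y_base = finite[np.argmax(f[finite])]

    def point(y):
        # Assumption: x is the row at which the column's minimum depth occurs.
        column = np.where(arm_mask[:, y], depth[:, y], np.inf)
        return (int(np.argmin(column)), int(y), float(f[y]))

    return point(y_a), point(y_b), point(y_base)
```

In practice the profile f would probably need smoothing before the extrema are searched, since single-pixel depth noise creates spurious minima.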

3. Position model

To avoid large-scale motion when the operator performs manipulation, we confine the operator's working space to a relatively small volume. The working space of the remote robot, however, should not be limited, so a mapping from a relatively small space to an unconfined large space is necessary. A direct mapping from a small space to a larger one loses precision. To avoid this problem, we adopt a differential positioning method.

Similar to mouse and keyboard input, the position of the hand is calculated incrementally. From Section 2, the 3D positions of B, T and I are calculated in the world coordinate system, as shown in Figure 3. The initial position and orientation of the robot end-effector at the starting point are stored as the robot reference position and orientation, respectively. The position of the robot tool-control point on the end-effector is controlled by position B of the human operator.

Define the 3D positions of I, T and B in the current frame as I′(x,y,z), T′(x,y,z) and B′(x,y,z), respectively. Define the length of the line segment joining the index-finger tip (I) and thumb tip (T) on the operator's hand as L (as shown in Figure 5):

L = \| T' (x, y, z) - I' (x, y, z) \|

(10)

Figure 4.

Hand pose

Figure 5.

Positioning model

The 3D position of B in the last frame is B″(x,y,z). The end-effector reference position in the last frame is P″(x,y,z) and the new end-effector reference position is updated:

P' = \left\{ \begin{matrix} P'' + L \cdot δ, & L \geq u \\ P'', & L < u \end{matrix} \right.

(11)

Where u is a threshold that determines whether the robot keeps moving or pauses. When L=0, the operator has stopped controlling the robot, as shown in Figure 4.

Because σ is an adjustable parameter, theoretically the space manipulated by the operator is an infinite space and we can obtain coarse-control and fine-control through adjustment to the value of σ.
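The incremental update of Eqs. (10) and (11) can be sketched as below. The text does not define δ, so this sketch simply treats it as a caller-supplied 3D increment vector; that choice, the function names and the argument names are assumptions for illustration only.

```python
import numpy as np

def finger_spread(thumb_tip, index_tip):
    """Eq. (10): distance L between the thumb tip and the index-finger tip."""
    diff = np.asarray(thumb_tip, dtype=float) - np.asarray(index_tip, dtype=float)
    return float(np.linalg.norm(diff))

def update_reference(p_prev, spread, delta, threshold):
    """Eq. (11): advance the reference position by L * delta, or hold it.

    p_prev    -- reference position P'' from the previous frame
    spread    -- finger spread L from Eq. (10)
    delta     -- increment vector (assumption: supplied by the caller)
    threshold -- pause threshold u
    """
    p_prev = np.asarray(p_prev, dtype=float)
    if spread >= threshold:
        return p_prev + spread * np.asarray(delta, dtype=float)
    # Below the threshold the robot pauses and the reference is unchanged.
    return p_prev.copy()
```

Closing the thumb and index finger below the threshold therefore acts like lifting a mouse: the hand can be repositioned without moving the robot.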

4. Orientation Model

As shown in Figure 6, the orientation of the end-effector follows the orientation formed by the thumb tip, the index-finger tip and the part of the hand between the thumb and the index finger.

Figure 6.

Orientation model

The orientation of the end-effector is calculated using the 3D positions of I, T and B. In the mapping of the operator's hand to the robot-tool coordinate system, the line from B to the midpoint M of the line segment that joins the index-finger tip (I) and thumb tip (T) on the operator's hand is mapped to the robot-tool axis X′ (Figure 4), and the X′Y′ plane is defined by B, I and T.

This means that once we obtain the transformation matrix from the console coordinate system to the coordinate system of the operator's hand, we also obtain the transformation matrix from the base coordinate system to the end-effector. The derivation of the orientation matrix is given below.

Assume that the origin of the operator's hand coordinate system coincides with that of the console coordinate system and that the transformation is a 3×3 matrix M. If point A in the operator's hand coordinate system maps to point A′ in the console coordinate system, we have:

A' = M A

(12)

In hand tracking and positioning, the unit vectors [x₁, x₂, x₃], [y₁, y₂, y₃] and [z₁, z₂, z₃] in directions X, Y and Z can be measured by the Kinect, yielding:

\left[ \begin{matrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{matrix} \right] \left[ \begin{matrix} 1 \\ 0 \\ 0 \end{matrix} \right] = \left[ \begin{matrix} x_{1} \\ x_{2} \\ x_{3} \end{matrix} \right]

(13)

\left[ \begin{matrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{matrix} \right] \left[ \begin{matrix} 0 \\ 1 \\ 0 \end{matrix} \right] = \left[ \begin{matrix} y_{1} \\ y_{2} \\ y_{3} \end{matrix} \right]

(14)

\left[ \begin{matrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{matrix} \right] \left[ \begin{matrix} 0 \\ 0 \\ 1 \end{matrix} \right] = \left[ \begin{matrix} z_{1} \\ z_{2} \\ z_{3} \end{matrix} \right]

(15)

Through (13), (14), (15), we can get:

\left[ \begin{matrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{matrix} \right] = \left[ \begin{matrix} x_{1} & y_{1} & z_{1} \\ x_{2} & y_{2} & z_{2} \\ x_{3} & y_{3} & z_{3} \end{matrix} \right]

(16)

As stated before, the transformation matrix from the console coordinate system to the operator's hand coordinate system is identical to the one from the base coordinate system to the end-effector coordinate system, and the translation between the end-effector and the base coordinate system is already given by the position model, so the transformation matrix of orientation is:

M = [\begin{matrix} x_{1} & y_{1} & z_{1} & p_{1} \\ x_{2} & y_{2} & z_{2} & p_{2} \\ x_{3} & y_{3} & z_{3} & p_{3} \\ 0 & 0 & 0 & 1 \end{matrix}]

(17)

Note that [p₁, p₂, p₃] is the translation vector from the base coordinate system to the end-effector.
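One way to assemble the matrix of Eq. (17) from the three tracked points is sketched below, as an illustration under stated assumptions. The X axis runs from B to the midpoint of I and T and the XY plane contains B, I and T, as described above; the sign of the Z axis (taken from the cross product of I−B and T−B) and the function name `hand_frame` are assumptions of this sketch. The three points must not be collinear.

```python
import numpy as np

def hand_frame(b, i_tip, t_tip, position):
    """Build a 4x4 homogeneous matrix whose columns are the hand axes X, Y, Z
    and whose last column is the end-effector position [p1, p2, p3]."""
    b, i_tip, t_tip = (np.asarray(v, dtype=float) for v in (b, i_tip, t_tip))
    # X axis: from B towards the midpoint of the segment joining I and T.
    x_axis = 0.5 * (i_tip + t_tip) - b
    x_axis /= np.linalg.norm(x_axis)
    # Z axis: normal of the plane through B, I and T.
    # Assumption: the sign is fixed by the order (I - B) x (T - B).
    normal = np.cross(i_tip - b, t_tip - b)
    z_axis = normal / np.linalg.norm(normal)
    # Y axis completes a right-handed frame and lies in the B-I-T plane.
    y_axis = np.cross(z_axis, x_axis)
    m = np.eye(4)
    m[:3, 0] = x_axis
    m[:3, 1] = y_axis
    m[:3, 2] = z_axis
    m[:3, 3] = np.asarray(position, dtype=float)
    return m
```

Because X lies in the plane and Z is normal to it, the three columns are orthonormal by construction, so the upper-left 3×3 block is a valid rotation matrix.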

5. Virtual Robot Manipulation System

We use a six degree-of-freedom industrial robot to perform this experiment, as shown in Figure 7. The task is to grab the target object which is in the robot's working space and then place the object at the destination.

Figure 7.

Six-axis robot-manipulator used at the remote robot site

There are two working modes for the robot. In the first, the angle of every joint is calculated by inverse kinematics from the position of the end-effector. After the joints have reached all the requested angles, the end-effector of the virtual robot reaches the destination. This mode is suitable when there is no obstacle in the work space of the virtual robot. The second mode is suitable when an obstacle appears in the virtual robot's working space. In this mode the virtual robot has to move along a safe path that ensures it will not collide with the obstacle.

In the Denavit-Hartenberg (DH) representation, A_i is the homogeneous coordinate transformation matrix from coordinate frame i−1 to frame i:

A_{i} = \left[ \begin{matrix} \cos θ_{i} & - \sin θ_{i} \cos α_{i} & \sin θ_{i} \sin α_{i} & l_{i} \cos θ_{i} \\ \sin θ_{i} & \cos θ_{i} \cos α_{i} & - \cos θ_{i} \sin α_{i} & l_{i} \sin θ_{i} \\ 0 & \sin α_{i} & \cos α_{i} & r_{i} \\ 0 & 0 & 0 & 1 \end{matrix} \right]

(18)

For a robot with six joints, the homogeneous coordinate transformation matrix from the base coordinate system to the end-effector's coordinate system is defined as:

T_{6} = A_{1} A_{2} \cdots A_{6} = \left[ \begin{matrix} n_{6}^{0} & s_{6}^{0} & a_{6}^{0} & p_{6}^{0} \\ 0 & 0 & 0 & 1 \end{matrix} \right]

(19)

Where $n_{6}^{0}$ is the roll vector of the end-effector, $s_{6}^{0}$ is the pitch vector, $a_{6}^{0}$ is the yaw vector and $p_{6}^{0}$ is the position vector.
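The link transform of Eq. (18) and the chained product of Eq. (19) can be written directly with NumPy. This is a minimal sketch: the function names and the tuple layout of the parameter table are assumptions, and no parameters of the actual six-axis robot are implied.

```python
import numpy as np

def dh_matrix(theta, alpha, link_length, link_offset):
    """Eq. (18): homogeneous transform A_i for one joint.

    theta, alpha -- joint angle and link twist in radians
    link_length  -- l_i in Eq. (18)
    link_offset  -- r_i in Eq. (18)
    """
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, link_length * ct],
        [st,  ct * ca, -ct * sa, link_length * st],
        [0.0,      sa,       ca, link_offset],
        [0.0,     0.0,      0.0, 1.0],
    ])

def forward_kinematics(dh_params):
    """Eq. (19): multiply the A_i in order to get the base-to-tool transform.

    dh_params is an iterable of (theta, alpha, link_length, link_offset).
    """
    transform = np.eye(4)
    for theta, alpha, link_length, link_offset in dh_params:
        transform = transform @ dh_matrix(theta, alpha, link_length, link_offset)
    return transform
```

For example, two planar links of unit length with the first joint at 90 degrees, `forward_kinematics([(np.pi / 2, 0, 1, 0), (0, 0, 1, 0)])`, place the tool at approximately (0, 2, 0).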

Using (17), (19), we have:

T_{6} = M

(20)

Through (20) we can obtain the angles of the six joints: (θ₁, θ₂, …, θ₆).

6. Experiments

We evaluated the algorithm on our robot platform. For testing, we built an experimental teleoperation environment. At the local site, we built a set of emulation environments for the technical robot and a set of video-based virtual reality systems. The remote site is the real robot in its working environment. To reflect realistic teleoperation conditions, we limited the bandwidth to 30 kB/s; the delay time was approximately 3 seconds.

To evaluate the Kinect-based teleoperation algorithm described in this paper, we use C++ to develop a Kinect-based human-robot interface system (Figure 8) and this system is used for the teleoperation of a six-axis technical robot. This experimental system includes three modules:

The human hand tracking and positioning system acquires the hand images and then calculates the 3D positions of T (the thumb tip), I (the index-finger tip) and B (the part of the hand between the thumb and the index finger).

The virtual robot manipulation system drives the virtual robot based on the joint angles calculated through inverse kinematics. If the commands are safe, they are transmitted to the remote site to control the real robot.

The remote site transmits video to the local site, and the video fusion system displays the virtual environment and the real environment together. The edges of the virtual robot are overlaid on the video frame transmitted from the remote site.

In the experiment, the operator placed his hand in the workspace to control the virtual robot. The orientation of the virtual robot's end-effector coincided with the human hand. The position of the virtual robot's end-effector was adjusted by moving the human hand through different faces of the direction space, as shown in Figure 3.

As shown in Figure 3, the way the operator controls the robot is natural and intuitive. Because the method is incremental, similar to keyboard control, the operator is not required to make large-scale movements to control the robot.

7. Result

After reconstruction and control of the robot by inverse kinematics, the precision of manipulation decreases because of the coordinate-system transformations and the solving of the equation set.

Figure 9 shows the position and orientation of the robot's end-effector and the operator's hand during teleoperation experiments. The dashed line represents the end-effector's path. The solid line with green squares represents the path of the operator's hand. The virtual robot was manipulated to grab a ball placed on a square. The data from this experiment show that the position errors ranged from −13 to +13 mm and the orientation errors ranged from −2 to +2 degrees. Figure 9(c,d,e) shows the X, Y and Z displacements of the end-effector and the hand, while their rotations are shown in Figure 9(f,g,h).

Figure 8.

Non-invasive vision-based teleoperation system

Figure 9.

Analysis of the experiment

8. Discussion

In the remote unstructured environment of the robot teleoperation, we assume that all the remote robot site components, including the robotic arm, robot controller, cameras on the end-effector and some other cameras, can be installed on a mobile platform and enter those unstructured environments. The method presented here was demonstrated on grabbing objects, picking up objects and positioning accurately while grabbing objects in the fine-adjustment control mode. One advantage of this system is that it includes the operator in the decision control loop. It allows a robot to grab, move and place an object without any prior knowledge such as the starting location or even the destination location. Similar tasks require decision making when selecting objects and targets from among multiple objects, such as packing, or cleaning up objects that may include dangerous items. It is expected that this system can be used to achieve more complex poses when the joints of the robot are limited. The hole task shows how to determine the position of an extruded body and a target hole randomly. Assembly and disassembly may include more limited hole tasks. We may need an appropriate grab hook, bigger hole and groove unless this system includes force feedback.

Compared with the automatic capture (Kofman et al. [6]), this algorithm uses manual positioning. Considering hand tremor, this algorithm includes a coarse adjustment and fine adjustment function. When guiding the robot, we can use the coarse adjustment to move the robot close to the target quickly. When grabbing the target, we can use the fine adjustment to position the robot accurately. That can ensure the safety and the efficiency of the teleoperation, and solve the problem of inaccuracy caused by manual operation.

This paper contributes to guided teleoperation systems based on non-contact measurement. By using Kinect-based tracking, robot teleoperation allows the operator to control the robot in a more natural way. Generally speaking, the operator can accomplish the task using the same hand motion that would naturally be used for it, and the Kinect-based tracking is non-contact. Thus, compared with the commonly used contacting electromagnetic devices, sensor-based devices and data gloves, non-contact devices cause less hindrance to natural human-limb motion. The method proposed here allows the operator to focus on the task instead of thinking of how to decompose it into simple commands that the voice recognition teleoperation system can understand. This method is more natural and intuitive than the operation in Kofman et al. [23]. The system can be used immediately without any initialization, and this non-contacting control system can be used outdoors. Because this algorithm uses infrared distance measurement to get arm information, it is not affected by lighting and does not need to extract the 3D coordinates by accurate image processing. That allows the system to be used in more severe environments, such as when it is too bright or too dark. In addition, the algorithm of [23] needs a bare hand in order to recognize the colour of skin; otherwise, it cannot extract the hand data. In contrast, this algorithm does not require a bare hand, and the operator can wear gloves when using the system in a cold outdoor working environment. That enlarges the field of application of the system.

9. Conclusion

A method of human-robot interaction using markerless Kinect-based tracking of the human hand for robot-manipulator teleoperation has been presented. By tracking the thumb tip, the index-finger tip and the part of the hand between the thumb and the index finger in real time, the 3D position and orientation of the hand are computed, and the robot manipulator can be controlled by hand to perform pick-and-place tasks. In future work, multiple Kinects will be used together to complete more complex tasks.

References

1. Yussof, Capi, Nasu, Yamano, Ohka. A CORBA-Based Control Architecture for Real-Time Teleoperation Tasks in a Developmental Humanoid Robot. International Journal of Advanced Robotic Systems, 8(2):29–48, 2011.

2. Mitsantisuk, Katsura, Ohishi. Force Control of Human-Robot Interaction Using Twin Direct-Drive Motor System Based on Modal Space Design. IEEE Transactions on Industrial Electronics, 57(4):1338–1392, 2010.

3. Hirche, Buss. Human-Oriented Control for Haptic Teleoperation. Proceedings of the IEEE, 100(3):623–647, 2012.

4. Villaverde, Raimundez, Barreiro. Passive Internet-based Crane Teleoperation with Haptic Aids. International Journal of Control Automation and Systems, 10(1):78–87, 2012.

5. Wang, Giannopoulos, Slater, Peer, Buss. Handshake: Realistic Human-Robot Interaction in Haptic Enhanced Virtual Reality. Presence-Teleoperators and Virtual Environments, 20(4):371–392, 2011.

6. Kofman, Jonathan, Xianghai, Timothy Luu, and Siddharth Verma. 2005. Teleoperation of a robot manipulator using a vision-based human-robot interface. IEEE Transactions on Industrial Electronics 52(5):1206–1219.

7. Lathuilière, Fabienne, and Jean-Yves Hervé. 2000. Visual hand posture tracking in a gripper guiding application. Proc. Int. Conf. Robotics and Automation (ICRA) 1688–1694.

8. Guanglong, Zhang Ping, Yang Liying, Yanbin. Robot teleoperation using a vision-based manipulation method. Audio Language and Image Processing (ICALIP), 2010 International Conference. 2010, 945–949.

9. Peer, Pongrac, Buss. Influence of Varied Human Movement Control on Task Performance and Feeling of Telepresence. Presence-Teleoperators and Virtual Environments, 19(5):463–481, 2010.

10. Borghese, N. Alberto, and Paolo Rigiroli. 2002. Tracking densely moving markers. IEEE First International Symposium on 3D Data Processing and Transmission, Padova Giugno, 682–685.

11. Verma, Siddharth. 2004. Vision-based markerless 3D human-arm tracking. M.A.Sc. Thesis, Department of Mechanical Engineering, University of Ottawa, Ottawa, Canada.

12. Goncalves, Luis, Enrico DiBernardo, Enrico Ursella, and Pietro Perona. 1995. Monocular tracking of the human arm in 3D. Proceedings of IEEE International Conference on Computer Vision, ICCV95, 764–770.

13. Kakadiaris, Ioannis A., Dimitri Metaxas, and Ruzena Bajcsy. 1994a. Active part-decomposition, shape and motion estimation of articulated objects: A Physics-based approach. Proceeding of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 980–984.

14. Ueda, Etsuko, Yoshio Matsumoto, Masakazu Imai, and Tsukasa Ogasawara. 2001. Hand pose estimation for vision based human interface. 10th IEEE International Workshop on Robot and Human Communication (ROMAN2001) 473–478.

15. Rosales, Romer, and Stan Sclaroff. 2000. Inferring body pose without tracking body parts. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition 2:721–727.

16. Koara, Kengo, Atsushi Nishikawa, and Fumio Miyazaki. 2001. Contour based hierarchical part decomposition method for human body motion analysis from video sequence. In Human Friendly Mechatronics. Ed. by Arai, Arai, and Takano, Elsevier Science.

17. MacCormick, John, and Michael Isard. 2000. Partitioned sampling, articulated objects, and interface-quality hand tracking. Proceeding of European Conference on Computer Vision 2:3–19.

18. Triesch, Jochen, and Christoph von der Malsburg. 2002. A system for person independent hand posture recognition against complex backgrounds. IEEE Transactions on Pattern Analysis and Machine Intelligence 23(12):1449–1453.

19. Fong, Terrence, Francois Conti, Sebastien Grange, and Charles Baur. 2000. Novel interfaces for remote driving: Gesture, haptic and PDA. SPIE Telemanipulator & Telepresence Technologies VII 4195:300–311.

20. Chao, Max Qing Meng, Peter Xiao Ping Liu, and Xiang Wang. 2003. Visual gesture recognition for human-machine interface of robot teleoperation. IEEE/RSJ International Conference on Intelligent Robots and Systems, USA, 1560–1565.

21. Moy, Milyn C. 1999. Gesture-based interaction with a Pet Robot. Proceedings of 6th National Conference on Artificial Intelligence and 11th Conference on Innovative Applications of Artificial Intelligence, USA, 628–633.

22. Ionescu, Bogdan, Didier Coquin, Patrick Lambert, and Vasile Buzuloiu. 2005. Dynamic and gesture recognition using the skeleton of the hand. Journal on Applied Signal Processing 13:2101–2109.

23. Kofman, Jonathan, Siddharth Verma, and Xianghai. Robot-Manipulator Teleoperation by Markerless Vision-Based Hand-Arm Tracking. International Journal of Optomechatronics, 1:331–357, 2007.