Sage Journals: Discover world-class research

Abstract

Nuts and bolts are common components in assembly lines. Their position and pose estimation is a vital step for automatic assembling. Although many approaches using a monocular camera have been proposed, few works consider a monocular camera’s active movements for improving estimation accuracy. This article presents an active movement strategy for a monocular eye-in-hand camera for high position and pose estimation accuracy of a spatial circle. Extensive experiments are conducted to validate the effectiveness of the proposed method for position and pose estimation of circles printed on paper, real circular flat washers, and nuts.

Keywords

Position estimation, pose estimation, monocular eye-in-hand system, active movement, spatial circle

Introduction

Anthropomorphic dual arm robots that can execute complex bimanual operations to replace or work with human workers in unstructured environments have been attracting more and more attention.¹ It is anticipated that robots will perform human-like tasks in both domestic and industrial settings. Accurately localizing objects is of great importance for efficiently accomplishing a grasping task, which is one of the common applications of robots.² One example is a robotic automotive tire assembly scenario involving various sizes of nuts and bolts. An arm with a gripper is assigned the task of picking up the right nuts or bolts whose pose and position may be arbitrary in the workspace, screwing bolts into the holes of hubs, and screwing nuts to the bolts.

The position and pose of an object in Cartesian space can be estimated directly from images. Features extracted from images are used to estimate the relative position and pose of an object with respect to the camera image frame, usually with the help of a geometric three-dimensional (3-D) model of the object.³ Using the correspondence between the geometric features of objects and their projections in the image plane, such as circular features,⁴ triplets of image lines,⁵ and five points,⁶ some closed-form analytical solutions of 3-D position and pose estimation have been presented. To deal with changes in illumination, rotation, scale, and viewpoint, several invariant feature descriptors, such as scale invariant feature transform (SIFT),⁷ speeded up robust features,⁸ and Affine-SIFT,⁹ are used to find the correspondence between an input image and a data set of images whose features, along with their 3-D coordinates, are stored off-line or built in advance.

Considering the requirements of efficiency and robustness of pose estimation, recursive methods that rely on temporal filtering, such as the extended Kalman filter (EKF), have been proposed.¹⁰ These methods need an object’s geometric model and assume zero-mean Gaussian noise. Although an object’s geometric model can be reconstructed simultaneously with pose estimation,¹¹ the reconstruction leads to high computational complexity. Statistics of the measurement and dynamic noise need to be known in advance and must remain constant. Poor measurement and dynamic models or poor noise estimates degrade the estimation performance. An iterative adaptive EKF has been proposed for dealing with these varying statistics.¹² Several filter parameters need to be tuned to make the filter work. The sampling rate of the filter must also be high enough to guarantee accuracy; otherwise, the system performance is degraded.

Binocular vision systems are used to estimate 3-D position and pose without requiring 3-D models stored or built in advance.¹³ Multicamera sensor fusion techniques are employed for more accurate and more robust pose estimation.¹⁴ An improved hybrid filter is proposed to estimate the object pose by fusing data from multiple cameras. Red Green Blue Depth (RGB-D) cameras have also been used for object pose estimation. By combining multimodal cues, such as two-dimensional local SIFT features, 3-D local features, and 3-D semi-global features, which are extracted in parallel and independently based on color, shape, and size cues, an object’s 6-degree-of-freedom (DOF) pose hypothesis is generated.¹⁵ A sensor-fusing system, which consists of a camera, a line laser, and an inertial measurement unit, is used to estimate the spatial circle and trunk parameters.¹⁶ Measurements from a monocular vision system are fused with inertial/magnetic measurements for pose estimation.¹⁷ However, due to the limited workspace of dual arm robots, the monocular eye-in-hand system is preferred, especially for small nuts and bolts. This system rigidly attaches a monocular camera next to the end effector.

The circular shape is the most common geometric feature that has been addressed for 3-D position and 3-D pose estimation because many manufactured objects have circular holes or circular surface contours. There exist two possible pose solutions for a single circle when it is reconstructed from a single perspective image.¹⁸ This duality problem can be solved with external geometric constraints, such as vertices of a convex hull,¹⁹ known radius and detected center,²⁰ two concentric circles,²¹ two coplanar circles,²² circles with the same rotation center axis,²³ and circles with different rotation center axes.²⁴ The six-dimensional (6-D) position and pose of a nut can be obtained from one image taken by the monocular eye-in-hand system given a priori knowledge of the nut’s geometry. However, there are many kinds of nuts and bolts in automotive assembly lines. Moreover, they lack obvious color and texture. In this situation, it is impossible to estimate the posture of nuts with only one image. Without a priori knowledge of a spatial circle, its 6-D position and pose can be estimated with two images from two different perspectives. For autonomous underwater vehicle (AUV) docking in a circular-shaped docking station, the fact that the eccentricity of an ellipse monotonically decreases with decreasing view angle has been exploited for pose estimation of the circular docking station.²⁵ The eccentricity of the ellipse is computed from progressively captured images, and the AUV moves in the direction such that the eccentricity → 0 and hence the ellipse tends to a circle. With these images, the correct orientation solution can be acquired. That research considered neither the effect of the perspectives on the estimation errors of 6-D position and pose nor the design of the monocular eye-in-hand system’s moving trajectory to improve the estimation accuracy.

This article focuses on active 6-D position-pose estimation with a monocular eye-in-hand system for nuts and bolts without a priori knowledge of their size. The monocular vision system is moved to obtain two images from two different perspectives. The two images are used for 6-DOF posture estimation of nuts. Then, the influence of the two perspectives on position-pose estimation performance is analyzed, and the movement trajectory of the monocular eye-in-hand system is designed for more accurate position-pose estimation.

This article is organized as follows. The section entitled, “6-DOF position and pose estimation of a spatial circle with monocular eye-in-hand system” describes the process of 6-D position-pose estimation with two images taken by a monocular eye-in-hand system from two different perspectives. The section, “Influence of monocular eye-in-hand system’s perspectives on 6-D position-pose estimation performance of a spatial circle” analyzes the influence of the two perspectives on position-pose estimation performance. Based on this analysis, the section, “Active movement strategy for 6-D position-pose estimation of a spatial circle” presents the active movement strategy of a monocular eye-in-hand for improving the estimation accuracy. The verification of the proposed method is provided in the section “Experiments and discussions”. The last section summarizes this article.

6-DOF position and pose estimation of spatial circle with monocular eye-in-hand system

As shown in Figure 1, a robotic arm is mounted with a camera and an end-effector. Without requiring any prior knowledge of a nut, its 6-DOF position and pose can be estimated with two images taken by the monocular eye-in-hand system from two views.

Figure 1.

Coordinates and nut’s position and pose.

Hand-eye calibration

The position/pose relationship between the robot’s end effector coordinate frame ( $O_{r} - x_{r} y_{r} z_{r}$ ) and the world coordinate frame ( $O_{w} - x_{w} y_{w} z_{w}$ ) is represented by the rotation matrix $^{r} R_{w}$ and the translation vector $^{r} t_{w}$ , which can be computed from the known control commands that move the end effector. The position/pose relationship between the camera coordinate frame ( $O_{c} - x_{c} y_{c} z_{c}$ ) and the end effector coordinate frame ( $O_{r} - x_{r} y_{r} z_{r}$ ) is represented by the rotation matrix $^{c} R_{r}$ and the translation vector $^{c} t_{r}$ , which are fixed once the camera is mounted next to the end effector. $^{c} R_{r}$ and $^{c} t_{r}$ can be computed by using the hand-eye calibration algorithm.²⁶

The projection matrix of the camera, which performs the projection between the world coordinate space ℜ³ and the image plane ℜ², is a 3 × 4 matrix P of rank 3

$$P = A \begin{pmatrix} I_{3} & O_{3} \end{pmatrix} \begin{pmatrix} ^{c} R_{r} & ^{c} t_{r} \\ O_{3}^{T} & 1 \end{pmatrix} \begin{pmatrix} ^{r} R_{w} & ^{r} t_{w} \\ O_{3}^{T} & 1 \end{pmatrix}, \tag{1}$$

where A is the intrinsic parameter matrix of the camera, I₃ is the 3 × 3 identity matrix, and O₃ is the 3 × 1 zero vector.

The projection matrix P projects a point X of the world coordinate space ℜ³ to a point u of the image plane ℜ²

$$λ u = P X, \tag{2}$$

where $X = (x, y, z, 1)^{T}$ represents a point in the world coordinate space ℜ³ in homogeneous coordinates, $u = (u, v, 1)^{T}$ is the projection of X in the image plane ℜ², and λ is a nonzero scalar.
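As a minimal sketch of equations (1) and (2), the projection matrix can be composed from the intrinsic matrix and the two rigid transforms and then applied to a homogeneous world point. The function names and all numeric values below (focal length, principal point, poses) are hypothetical and chosen only for illustration; they are not the calibration used in the article.

```python
import numpy as np

def homogeneous(R, t):
    """Stack a 3x3 rotation R and a 3-vector t into a 4x4 rigid transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def projection_matrix(A, cRr, ctr, rRw, rtw):
    """Compose P = A [I3 | O3] [cRr ctr; 0 1] [rRw rtw; 0 1] as in equation (1)."""
    I3_O3 = np.hstack([np.eye(3), np.zeros((3, 1))])
    return A @ I3_O3 @ homogeneous(cRr, ctr) @ homogeneous(rRw, rtw)

def project(P, X_world):
    """Return pixel coordinates (u, v) of a 3-D world point using lambda * u = P X."""
    X = np.append(np.asarray(X_world, dtype=float), 1.0)
    lam_u = P @ X
    return lam_u[:2] / lam_u[2]  # divide out the scalar lambda

# Hypothetical intrinsics: 800 px focal length, principal point (320, 240).
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical poses: camera aligned with the end effector, world origin 0.5 m ahead.
P = projection_matrix(A, np.eye(3), np.zeros(3), np.eye(3), np.array([0.0, 0.0, 0.5]))
uv = project(P, [0.01, 0.0, 0.0])
print(uv)
```

With these assumed values the point 10 mm to the side of the world origin lands 16 px to the right of the principal point, which is the expected pinhole behaviour.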

Ellipse detection

Assume that one image has been taken by the calibrated monocular eye-in-hand system. A fast ellipse detection method is used to detect the ellipse in the image.²⁷ For the detected ellipse φ, the center coordinates $O_{e} (u_{c}, v_{c})$ , the semi-major axis length a, the semi-minor axis length b, the eccentricity e, and the orientation angle θ can be estimated. The detected ellipse φ in the image plane is represented with the conic equation $u^{T} C u = 0$ . The 3 × 3 matrix C is

$$C = \begin{bmatrix} A & B / 2 & D / 2 \\ B / 2 & C & E / 2 \\ D / 2 & E / 2 & F \end{bmatrix}, \tag{3}$$

where

$$\begin{aligned}
A &= a^{2} \sin^{2} θ + b^{2} \cos^{2} θ, \\
B &= 2 (b^{2} - a^{2}) \sin θ \cos θ, \\
C &= a^{2} \cos^{2} θ + b^{2} \sin^{2} θ, \\
D &= - (2 A u_{c} + B v_{c}), \\
E &= - (B u_{c} + 2 C v_{c}), \\
F &= A u_{c}^{2} + B u_{c} v_{c} + C v_{c}^{2} - a^{2} b^{2}.
\end{aligned}$$
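The coefficient formulas above can be checked numerically. The sketch below builds the conic matrix from the five ellipse parameters and verifies that points sampled on the ellipse satisfy $u^{T} C u = 0$. The helper names and the sample ellipse are assumptions for illustration; the point parametrization assumes θ is the angle of the major axis from the u axis, which is the convention the formulas imply.

```python
import numpy as np

def ellipse_to_conic(uc, vc, a, b, theta):
    """Build the 3x3 conic matrix of equation (3) from ellipse parameters."""
    s, c = np.sin(theta), np.cos(theta)
    A = a**2 * s**2 + b**2 * c**2
    B = 2.0 * (b**2 - a**2) * s * c
    C = a**2 * c**2 + b**2 * s**2
    D = -(2.0 * A * uc + B * vc)
    E = -(B * uc + 2.0 * C * vc)
    F = A * uc**2 + B * uc * vc + C * vc**2 - a**2 * b**2
    return np.array([[A, B / 2, D / 2],
                     [B / 2, C, E / 2],
                     [D / 2, E / 2, F]])

def ellipse_points(uc, vc, a, b, theta, n=12):
    """Sample n homogeneous image points (u, v, 1) on the ellipse."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    u = uc + a * np.cos(t) * np.cos(theta) - b * np.sin(t) * np.sin(theta)
    v = vc + a * np.cos(t) * np.sin(theta) + b * np.sin(t) * np.cos(theta)
    return np.stack([u, v, np.ones_like(t)], axis=1)

# Hypothetical detected ellipse (pixels and radians).
params = dict(uc=320.0, vc=240.0, a=120.0, b=80.0, theta=0.4)
conic = ellipse_to_conic(**params)
pts = ellipse_points(**params)

# Residual u^T C u for every sampled point; it should vanish up to rounding.
residuals = np.einsum("ij,jk,ik->i", pts, conic, pts)
print(np.abs(residuals).max())
```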

Matching ellipses in two images

Given the projection matrix P of a camera, the equation of the cone that joins the projected conic $φ : u^{T} C u = 0$ in the image plane to the projection center of the camera is $X^{T} Q X = 0$ , with $Q = P^{T} C P$ . The two cones that join the two ellipses $φ_{1} : u^{T} C_{1} u = 0$ and $φ_{2} : u'^{T} C_{2} u' = 0$ in the two corresponding image planes to the camera’s optical center at the two positions are represented with

$$\begin{array}{l} X^{T} Q_{1} X = 0, \\ X^{T} Q_{2} X = 0, \end{array} \tag{4}$$

where $Q_{1} = P_{1}^{T} C_{1} P_{1}$ and $Q_{2} = P_{2}^{T} C_{2} P_{2}$ . P₁ and P₂ are the 3 × 4 projection matrices of the camera at the two positions.

Quan proved that reconstructing a spatial circle from two perspectives is equivalent to finding a value of λ such that the λ-matrix $C (λ) = Q_{1} + λ Q_{2}$ has rank 2.²⁸ A rank-2 λ-matrix $C (λ)$ must satisfy

$$l_{3}^{2} - 4 l_{2} l_{4} = 0, \tag{5}$$

where $l_{2}, l_{3}, l_{4}$ are the corresponding polynomial coefficients of the determinant $| C (λ) |$ . Equation (5) is used for matching the two ellipses in the two images. In practice, a small threshold is applied instead of an exact zero test to tolerate image processing noise.
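The matching test of equation (5) can be illustrated on synthetic data. The sketch below assumes that $| C (λ) |$ is written as $l_{1} λ^{4} + l_{2} λ^{3} + l_{3} λ^{2} + l_{4} λ + l_{5}$ (the article does not spell out the ordering; the test is symmetric in $l_{2}$ and $l_{4}$), recovers the coefficients by interpolating the determinant at five values of λ, and reports a normalized residual. The cameras, circles, helper names, and the normalization are all assumptions made for this example.

```python
import numpy as np

def rot_x(w):
    c, s = np.cos(w), np.sin(w)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(w):
    c, s = np.cos(w), np.sin(w)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def camera(R, d):
    """Normalized camera [R | t] whose optical axis passes through the world origin at distance d."""
    return np.hstack([R, np.array([[0.0], [0.0], [d]])])

def image_conic(P, cx, cy, r):
    """Image conic of the circle (x-cx)^2 + (y-cy)^2 = r^2 lying in the world plane z = 0."""
    C0 = np.array([[1.0, 0.0, -cx],
                   [0.0, 1.0, -cy],
                   [-cx, -cy, cx**2 + cy**2 - r**2]])
    H_inv = np.linalg.inv(P[:, [0, 1, 3]])  # inverse plane-to-image homography
    return H_inv.T @ C0 @ H_inv

def cone(P, C):
    """Back-projected cone Q = P^T C P, scaled to unit Frobenius norm for conditioning."""
    Q = P.T @ C @ P
    return Q / np.linalg.norm(Q)

def match_residual(Q1, Q2):
    """Normalized |l3^2 - 4 l2 l4| for det(Q1 + lam * Q2); near 0 for a matching pair."""
    lams = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
    dets = np.array([np.linalg.det(Q1 + lam * Q2) for lam in lams])
    l1, l2, l3, l4, l5 = np.polyfit(lams, dets, 4)  # highest power first
    return abs(l3**2 - 4.0 * l2 * l4) / (l3**2 + 4.0 * abs(l2 * l4))

P1 = camera(np.eye(3), 5.0)
P2 = camera(rot_y(0.3) @ rot_x(0.7), 5.0)
Q1 = cone(P1, image_conic(P1, 0.0, 0.0, 1.0))
Q2_same = cone(P2, image_conic(P2, 0.0, 0.0, 1.0))    # same spatial circle
Q2_other = cone(P2, image_conic(P2, 0.5, 0.3, 1.5))   # a different circle

res_same = match_residual(Q1, Q2_same)
res_other = match_residual(Q1, Q2_other)
print(res_same, res_other)
```

Because both cones have rank 3, $l_{1}$ and $l_{5}$ vanish and the determinant factors as λ times a quadratic; a rank-2 member of the pencil corresponds to a double root of that quadratic, which is what the discriminant test detects.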

Calculating the spatial circle’s center and normal vector

The spatial circle’s center cannot be reconstructed directly from the geometric centers of the two ellipses by stereo vision methods. This is because, under perspective projection, the projection of the spatial circle’s center in the image differs from the geometric center of the projected ellipse.²⁹

By joining the two ellipses φ₁ and φ₂ in the two corresponding image planes to the camera’s optical centers at the two positions, respectively, the two elliptic cones π₁ and π₂ are built. In each of the elliptic cones, two circles that have the same projection profile can be obtained. The normal vector and the center projection of each circle can be calculated. By matching the circles in the two cones, the circle that is parallel to the spatial circle can be determined in each of the cones. The projection of the spatial circle’s center can be computed with the centers of the two parallel circles.³⁰ The computational procedure is as follows.

Step 1: Obtain the ellipse τ₁ in π whose major axis is parallel to the spatial circle

As shown in Figure 2, an elliptic cone π is built with the optic center O_c and the projection ellipse φ in one image. The center of φ is O_e . Initialize $f_{x} \leftarrow O_{c} O_{e}$ and $O_{c} P \leftarrow O_{c} O_{e} / | O_{c} O_{e} |$ . Build the plane ξ that passes through the point P and whose normal vector is $f_{x}$ . ξ intersects π in the ellipse τ₁ whose center is c_e . If $O_{c} c_{e}$ is not perpendicular to τ₁, reassign $f_{x} \leftarrow O_{c} c_{e}$ and $O_{c} P \leftarrow O_{c} c_{e} / | O_{c} c_{e} |$ , rebuild the plane ξ through P with normal vector $f_{x}$ , and intersect it with π again to obtain an updated τ₁ and c_e . After several iterations, $O_{c} c_{e}$ is perpendicular to τ₁. Set $| O_{c} c_{e} | = d$ . The major axis of τ₁ is parallel to the spatial circle. The semi-major and semi-minor axis lengths of τ₁ are a_e and b_e , respectively. Its normal vector is $n_{e} = O_{c} c_{e}$ .

Figure 2.

Position relationship between the elliptic cone and the image plane.

Step 2: Obtain two rotation circles C ₁ and C ₂

As the ellipse τ ₁ rotates ±γ around its major axis, two circles, C ₁ and C ₂, are obtained by intersecting with the elliptic cone’s surface. The rotation angle γ is as follows:

$$γ = \arccos \sqrt{\frac{b_{e}^{2} (a_{e}^{2} + d^{2})}{a_{e}^{2} (b_{e}^{2} + d^{2})}} . \tag{6}$$

Step 3: Compute the rotation circles’ normal vectors and their projection centers

The normal vectors of the two circles are

$$n_{1} = (0, \; - \sin γ, \; \cos γ)^{T} \quad \text{and} \quad n_{2} = (0, \; - \sin (- γ), \; \cos (- γ))^{T} . \tag{7}$$

The two circles’ centers are $c_{1}^{R}$ and $c_{2}^{R}$ . Their coordinates in the image planes are $u_{c 1}$ and $u_{c 2}$ . The normal vectors of the two circles in the world coordinate are $^{w} n_{1}$ and $^{w} n_{2}$ .
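Steps 2 and 3 reduce to a few lines once a_e , b_e , and d are known. The sketch below evaluates equation (6) and the two candidate normals of equation (7). It assumes a local frame with the x axis along the major axis of τ₁ and the z axis along $O_{c} c_{e}$ , which is what the form of the normals suggests; the function names and sample values are hypothetical.

```python
import numpy as np

def rotation_angle(a_e, b_e, d):
    """Rotation angle gamma of equation (6) for the normal section tau_1 at distance d."""
    ratio = (b_e**2 * (a_e**2 + d**2)) / (a_e**2 * (b_e**2 + d**2))
    # Clip guards against rounding slightly above 1 when a_e == b_e.
    return np.arccos(np.sqrt(np.clip(ratio, 0.0, 1.0)))

def candidate_normals(gamma):
    """The two candidate circle normals n1, n2 of equation (7) in the local cone frame."""
    n1 = np.array([0.0, -np.sin(gamma), np.cos(gamma)])
    n2 = np.array([0.0, -np.sin(-gamma), np.cos(-gamma)])
    return n1, n2

# Hypothetical normal section: semi-axes 0.2 and 0.1 at unit distance from the optic center.
a_e, b_e, d = 0.2, 0.1, 1.0
gamma = rotation_angle(a_e, b_e, d)
n1, n2 = candidate_normals(gamma)
print(np.degrees(gamma), n1, n2)
```

The two normals are mirror images about the plane containing the major axis and the cone axis, which is the two-fold ambiguity that the second view resolves in Step 4.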

Step 4: Obtain the two parallel circles

Similarly, $u'_{c 1}$ and $u'_{c 2}$ are the candidate projections of the spatial circle’s center obtained with the projection ellipse φ₂ in the second image. Their corresponding normal vectors are $^{w} n'_{1}$ and $^{w} n'_{2}$ . Since there is only one spatial circle, one of $^{w} n_{1}$ and $^{w} n_{2}$ is equal to one of $^{w} n'_{1}$ and $^{w} n'_{2}$ , and this common value is n , the normal vector of the spatial circle. To account for errors, two normal vectors are treated as identical if their difference is smaller than a threshold. The two corresponding circles are parallel to the spatial circle.

Step 5: Compute the spatial circle’s normal vector n and its center coordinate X _c

The spatial circle’s normal vector n is equal to $^{w} n_{1}$ (or $^{w} n_{2}$ ), and $^{w} {n^{'}}_{1}$ (or $^{w} {n^{'}}_{2}$ ), the normal vectors of the parallel circles. The coordinate values $X_{c} = (x_{c}, y_{c}, z_{c})^{T}$ of the spatial circle’s center are computed with $u_{c 1}$ (or $u_{c 2}$ ) and ${u^{'}}_{c 1}$ (or ${u^{'}}_{c 2}$ ), the two parallel circles’ centers through linear triangulation stereo vision method.³¹
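For reference, a common form of linear triangulation (the homogeneous DLT formulation) is sketched below; it is assumed here to correspond to the method cited as reference 31, which the text does not spell out. The cameras, the test point, and the helper names are hypothetical.

```python
import numpy as np

def triangulate(P1, u1, P2, u2):
    """Linear (DLT) triangulation of one 3-D point from two pixel observations."""
    A = np.vstack([
        u1[0] * P1[2] - P1[0],
        u1[1] * P1[2] - P1[1],
        u2[0] * P2[2] - P2[0],
        u2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Pinhole projection of a 3-D point to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical intrinsics and two camera poses.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
c, s = np.cos(0.4), np.sin(0.4)
R2 = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R2, np.array([[-0.3], [0.0], [0.1]])])

X_true = np.array([0.05, -0.02, 1.0])   # stands in for the circle center X_c
X_est = triangulate(P1, project(P1, X_true), P2, project(P2, X_true))
print(X_est)
```

In the procedure above, the inputs would be the matched center projections (for example $u_{c 1}$ and $u'_{c 1}$ ) rather than the geometric centers of the ellipses.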

Calculating the spatial circle’s radius

After computing the spatial circle’s center $X_{c} = (x_{c}, y_{c}, z_{c})^{T}$ , its normal vector n , and the two elliptic cones, the representation of the spatial circle in $(O_{w} - x_{w} y_{w} z_{w})$ can be obtained as

$$\left\{ \begin{array}{l} X^{T} Q X = 0, \\ (X - X_{c})^{T} n = 0, \end{array} \right. \tag{8}$$

where $Q = Q_{1}$ or $Q_{2}$ .

As proved in the first section of Appendix 1, the projections of a spatial circle in 3-D Cartesian coordinates ( $O - x y z$ ) onto the three coordinate planes ( $O - x y$ ), ( $O - x z$ ), and ( $O - y z$ ) are three ellipses. The semi-major axis lengths of the three ellipses, a₁, a₂, and a₃, are all equal to the spatial circle’s radius r. a₁, a₂, and a₃ can be computed from equation (8). Thus, the radius of the spatial circle is computed as their mean,

$$r = \frac{a_{1} + a_{2} + a_{3}}{3} . \tag{9}$$
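The property behind equation (9) can be checked on a noise-free circle. The sketch below does not intersect the cone with the plane as equation (8) does; instead it starts from a known normal and radius (both hypothetical) and shows that the orthogonal projection onto each coordinate plane is an ellipse whose semi-major axis equals r, so the mean of the three recovers r. With noisy estimates the three values would differ slightly, which is presumably why they are averaged.

```python
import numpy as np

def plane_basis(n):
    """Two orthonormal vectors spanning the plane with normal n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(n, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    return e1, e2

def projected_semi_major_axes(n, r):
    """Semi-major axes (a1, a2, a3) of the circle's projections onto the xy, xz, yz planes."""
    e1, e2 = plane_basis(n)
    B = np.column_stack([e1, e2])          # circle points are X_c + r * B @ (cos t, sin t)
    axes = []
    for dropped in (2, 1, 0):              # drop z, y, x to project onto xy, xz, yz
        B2 = np.delete(B, dropped, axis=0)
        # The projected ellipse is the image of a circle of radius r under B2.
        axes.append(r * np.linalg.svd(B2, compute_uv=False)[0])
    return np.array(axes)

n = np.array([0.3, -0.5, 0.8])   # hypothetical circle normal
r = 40.5                         # radius of the printed circle in mm
a1, a2, a3 = projected_semi_major_axes(n, r)
r_est = (a1 + a2 + a3) / 3.0
print(a1, a2, a3, r_est)
```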

Influence of monocular eye-in-hand system’s perspectives on 6-D position-pose estimation performance of a spatial circle

The estimation performance for the spatial circle’s center coordinates X_c , its normal vector n , and its radius r depends on which two views the monocular eye-in-hand system uses to take the two images, as shown in the section entitled “Experiments and discussions”. For a spatial circle, the camera’s perspective is reflected by the position of the projection ellipse in the image and by the ratio of the semi-minor axis length to the semi-major axis length.

Influence factor for computation error of the spatial circle’s normal vector

The spatial circle’s normal vector is $n = (0, - \sin γ, \cos γ)^{T}$ , where $γ = \arccos \sqrt{\frac{b_{e}^{2} (a_{e}^{2} + d^{2})}{a_{e}^{2} (b_{e}^{2} + d^{2})}}$ . Without loss of generality, d is set to be a constant.

Set $x = \frac{b_{e}}{a_{e}}$ ; then equation (6) becomes

$$γ (x) = \arccos \sqrt{x^{2} \frac{a_{e}^{2} + d^{2}}{a_{e}^{2} x^{2} + d^{2}}} . \tag{10}$$

A measurement error $Δ x$ in x results in the computation error of γ,

$$e_{γ} = γ (x) - γ (x + Δ x) . \tag{11}$$

As shown in the second section of Appendix 1, $| e_{γ} |$ increases monotonically with x , the ratio $\frac{b_{e}}{a_{e}}$ of the semi-minor axis length to the semi-major axis length. Moreover, the projection ellipse φ lies in the normal section of the elliptic cone π and is parallel to the ellipse τ₁. Thus, $\frac{b_{e}}{a_{e}} = \frac{b}{a}$ .

Since the normal vector of the spatial circle is $n = (0, - \sin (\pm γ), \cos (\pm γ))^{T}$ , the computation error of n is proportional to $| e_{γ} |$ . Therefore, $\frac{b}{a}$ should be as small as possible to decrease the computation error of the normal vector n .
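The monotonic growth of $| e_{γ} |$ can be illustrated numerically. The sketch below evaluates the error for a fixed measurement error Δx over a range of axis ratios. The values of a_e , d , and Δx are assumptions; d is taken well above a_e , as is the case when the normal section is built at unit distance from the optic center and the ellipse is small in normalized coordinates.

```python
import numpy as np

def gamma_of_ratio(x, a_e, d):
    """Rotation angle gamma as a function of the axis ratio x = b_e / a_e."""
    return np.arccos(np.sqrt(x**2 * (a_e**2 + d**2) / (a_e**2 * x**2 + d**2)))

a_e, d = 0.1, 1.0          # hypothetical semi-major axis and cone-section distance
dx = 0.01                  # hypothetical measurement error of the ratio
x = np.linspace(0.30, 0.98, 35)

# |e_gamma| = |gamma(x) - gamma(x + dx)| for each ratio.
err = np.abs(gamma_of_ratio(x, a_e, d) - gamma_of_ratio(x + dx, a_e, d))

for xi, ei in zip(x[::8], err[::8]):
    print(f"b/a = {xi:.2f}  |e_gamma| = {np.degrees(ei):.2f} deg")
```

The error grows steeply as the ratio approaches 1, because the arccosine is evaluated near 1 where its slope is unbounded; this is the reason a strongly foreshortened ellipse is preferred for the normal vector.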

Influence factor for computation error of the spatial circle’s radius

Due to image processing errors, the projection of 3-D point X with true projection u may be mistakenly treated as $u^{'}$ in the image plane. Thus, X could be mistakenly reconstructed to $X^{'}$ , as shown in Figure 3.

Figure 3.

Influence of α (the angle between the optic axis and spatial circle’s plane) on computation error of radius.

Assume that the distance $O_{c} X = d_{x}$ and that the angle between the optic axis $O_{c} u$ and the spatial circle’s plane is $α \in (0, \frac{π}{2})$ . The angle between $O_{c} u$ and $O_{c} u^{'}$ is β, with $β < α$ .

The computation error is

$$e_{d} = d_{x} \left( \frac{1}{\sin α \tan (α - β)} - \frac{1}{\sin α \tan α} \right) . \tag{12}$$

As shown in the third section of Appendix 1, $| e_{d} |$ decreases monotonically as α increases. Since $\frac{b}{a}$ increases with α, its value reflects the value of α. Therefore, $\frac{b}{a}$ should be close to 1, which occurs when the optic axis of the camera is perpendicular to the spatial circle’s plane and the projection of the spatial circle is located at the center of the image.
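The error expression for $e_{d}$ can be evaluated directly to see the trend. The sketch below uses hypothetical values of $d_{x}$ and β, with β kept smaller than every α on the grid so that $\tan (α - β)$ stays positive, and compares the expression with the equivalent closed form $d_{x} \sin β / (\sin^{2} α \sin (α - β))$ obtained from the identity $\cot (α - β) - \cot α = \sin β / (\sin (α - β) \sin α)$ .

```python
import numpy as np

def radius_error(alpha, beta, d_x):
    """Error e_d as a function of the view angle alpha and the angular image error beta."""
    return d_x * (1.0 / (np.sin(alpha) * np.tan(alpha - beta))
                  - 1.0 / (np.sin(alpha) * np.tan(alpha)))

def radius_error_closed_form(alpha, beta, d_x):
    """Equivalent form: d_x * sin(beta) / (sin(alpha)^2 * sin(alpha - beta))."""
    return d_x * np.sin(beta) / (np.sin(alpha) ** 2 * np.sin(alpha - beta))

d_x = 0.5                              # hypothetical camera-to-point distance (m)
beta = 0.01                            # hypothetical angular error (rad)
alpha = np.linspace(0.2, 1.5, 40)      # view angles in (beta, pi/2)

e_d = radius_error(alpha, beta, d_x)
for a, e in zip(alpha[::13], e_d[::13]):
    print(f"alpha = {np.degrees(a):5.1f} deg  e_d = {1000 * e:7.2f} mm")
```

Both factors in the denominator of the closed form grow with α on this interval, which makes the monotonic decrease of the error explicit.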

Influence factor for computation error of the spatial circle’s center coordinates

The spatial circle’s center is computed by using the linear triangulation stereo vision method.³¹ As shown in Figure 4, the shaded uncertainty region depends on the angle of the two rays that connect the spatial circle’s center with the camera’s optical center at two positions respectively.

Figure 4.

Uncertainty of reconstruction with the linear triangulation stereo vision method.³¹

The angle between the two rays should be large enough to decrease the computation error of the spatial circle’s center coordinates.

Active movement strategy for 6-D position-pose estimation of a spatial circle

From the analysis in the above section, the following three conditions should be met for better 6-D position and pose estimation of a spatial circle with two images through the linear triangulation stereo vision method.

The three conditions are:

1. The projection ellipses should be located at the centers of the images.

2. $\frac{b}{a}$ in one image should be as small as possible to decrease the normal vector’s computation error.

3. $\frac{b}{a}$ in the other image should be close to 1 to decrease the radius’s computation error.

If conditions 2 and 3 are satisfied, then the angle between the two rays that connect the spatial circle’s center with the camera’s optical center at the two different positions is large enough to decrease the computation error of the spatial circle’s center coordinates.

In order to satisfy the above three conditions, we design the following active movement strategy of a monocular eye-in-hand system.

First, assume that one ellipse is detected with the monocular eye-in-hand system, as shown in Figure 5. Control the monocular eye-in-hand system to move until the projection ellipse is in the center of the image and the major axis of the projection ellipse is parallel to the u axis of the image plane.

Figure 5.

Active movement strategy.

Step A: Move eye-in-hand system to Location 1

Compute $\frac{b_{0}}{a_{0}}$ , the ratio of the semi-minor axis length to the semi-major axis length of the projection ellipse.

If $\frac{b_{0}}{a_{0}} \geq 0.95$ , the current position of the camera is labelled Location 1, and the detected ellipse in the image is labelled φ₁. Control the robotic arm so that the camera rotates clockwise or counterclockwise around the x_c axis and translates along the negative or positive direction of the y_c axis. The robotic arm stops once $\frac{b^{'}}{a^{'}} < 0.95$ with the ellipse still at the center of the image. Label this position of the camera Temp 1 and the ellipse $φ_{t 1}$ . Then, control the robotic arm so that the camera goes back to Location 1. Go to Step B.

If $\frac{b_{0}}{a_{0}} < 0.95$ , the current position of the camera is labelled Temp 1, and the detected ellipse is labelled $φ_{t 1}$ . Control the robotic arm so that the camera rotates clockwise or counterclockwise around the x_c axis and translates along the negative or positive direction of the y_c axis. The robotic arm stops once $\frac{b^{'}}{a^{'}} > 0.95$ with the ellipse still at the center of the image. Label this position of the camera Location 1 and the ellipse φ₁. Go to Step B.

Step B: Move eye-in-hand system to Location 2 and Location 3

Compute the spatial circle’s center $X'_{c}$ with the two detected ellipses $φ_{t 1}$ and φ₁ in the two images taken at the two positions Temp 1 and Location 1. Compute d₁, the distance from the optical center of the camera at Location 1 to $X'_{c}$ . Rotate ( $O_{c} - x_{c} y_{c} z_{c}$ ) counterclockwise by $ω_{1}$ $(\frac{π}{6} < ω_{1} < \frac{π}{3})$ around the $x_{c}$ axis. Then, translate ( $O_{c} - x_{c} y_{c} z_{c}$ ) by $2 \cdot d_{1} \cdot \cos (\frac{π - ω_{1}}{2})$ along a direction in the ( $O_{c} - y_{c} z_{c}$ ) plane. The angle between the translation direction and $O_{c} X'_{c}$ is $\frac{π - ω_{1}}{2}$ , where O_c is the optic center of the camera at Location 1. Label the new position of the camera Location 2. By construction, d₂, the distance from the optical center of the camera at Location 2 to $X'_{c}$ , satisfies $d_{2} = d_{1}$ . Take an image at Location 2. Then, control the robotic arm so that the camera goes back to Location 1.

From Location 1, rotate ( $O_{c} - x_{c} y_{c} z_{c}$ ) clockwise by $ω_{2}$ $(\frac{π}{6} < ω_{2} < \frac{π}{3})$ around the x_c axis. Then, translate ( $O_{c} - x_{c} y_{c} z_{c}$ ) by $2 \cdot d_{1} \cdot \cos (\frac{π - ω_{2}}{2})$ along a direction in the ( $O_{c} - y_{c} z_{c}$ ) plane. The angle between the translation direction and $O_{c} X'_{c}$ is $\frac{π - ω_{2}}{2}$ . Label the new position of the camera Location 3. Similarly, d₃, the distance from the optical center of the camera at Location 3 to $X'_{c}$ , satisfies $d_{3} = d_{1}$ .

After the above movements, the optic centers of the camera at Location 1, Location 2, and Location 3 and the spatial circle’s center lie in a single plane. The angle between the ray that connects the optic center of the camera at Location 2 with the spatial circle’s center and the ray that connects the optic center of the camera at Location 3 with the spatial circle’s center is $ω_{1} + ω_{2}$ , with $\frac{π}{3} < ω_{1} + ω_{2} < \frac{π}{2}$ .
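The geometry of Step B can be verified with a small planar computation. The sketch below works in the ( $y_{c}$ , $z_{c}$ ) plane of the camera at Location 1, places the optic center at the origin, and assumes the estimated center $X'_{c}$ lies on the optical axis at distance d₁ (the ellipse is centered in the image). The function name, the sign convention for the two rotation directions, and the numeric values are assumptions for illustration.

```python
import numpy as np

def next_camera_position(d1, omega, side=+1):
    """Optic center after the Step B translation, in (y_c, z_c) coordinates of Location 1.

    side = +1 and side = -1 stand for the two opposite rotation directions.
    """
    theta = (np.pi - omega) / 2.0            # angle between translation and O_c X'_c
    length = 2.0 * d1 * np.cos(theta)        # translation distance from the text
    return length * np.array([side * np.sin(theta), np.cos(theta)])

def angle_between(v, w):
    """Unsigned angle between two plane vectors."""
    cosang = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

d1 = 0.40                                    # hypothetical distance to X'_c (m)
omega1, omega2 = np.radians(40.0), np.radians(45.0)

target = np.array([0.0, d1])                 # X'_c on the optical axis of Location 1
loc1 = np.zeros(2)
loc2 = next_camera_position(d1, omega1, side=+1)
loc3 = next_camera_position(d1, omega2, side=-1)

d2 = np.linalg.norm(loc2 - target)
d3 = np.linalg.norm(loc3 - target)
view_angle = angle_between(loc2 - target, loc3 - target)
print(d2, d3, np.degrees(view_angle))
```

The translation length $2 d_{1} \cos (\frac{π - ω}{2})$ is the chord of a circle of radius d₁ subtending the angle ω at $X'_{c}$ , which is why the camera keeps its distance to the circle while the viewing direction changes by ω.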

With the above active movement strategy, the three conditions can be met.

1. The projection ellipses are always in the center of the images taken at Location 1, Location 2, and Location 3.

2. $\frac{b}{a}$ of the projection ellipse in the image taken from Location 2 or Location 3 is small for decreasing the normal vector’s computation error.

3. $\frac{b}{a}$ of the projection ellipse in the image taken from Location 1 is near to 1 for decreasing the radius’s computation error.

Moreover, the angle between the ray that connects the optic center of the camera at Location 2 with the spatial circle’s center and the ray that connects the optic center of the camera at Location 3 with the spatial circle’s center is large enough for decreasing the spatial circle’s center coordinates computation error.

Step C: Compute the spatial circle’s center coordinates, normal vector, and radius

With the images taken at Location 1, Location 2, and Location 3, the spatial circle’s center coordinates X _c , normal vector n , and radius r can be computed.

1. Spatial circle’s center coordinates X _c

The spatial circle’s center coordinates X _c can be computed with the two projection ellipses in the images taken at Location 2 and Location 3. $\frac{π}{3} < ω_{1} + ω_{2} < \frac{π}{2}$ guarantees a smaller computation error of the spatial circle’s center coordinates.

2. Spatial circle’s normal vector n

The projection ellipse with the smaller ratio of semi-minor axis to semi-major axis in the images taken at Location 2 or Location 3 is chosen for decreasing normal vector computation error.

3. Spatial circle’s radius r.

Because the ratio of the semi-minor axis length to the semi-major axis length of the projection ellipse in the image taken at Location 1 is close to 1, this projection ellipse is used for computing the spatial circle’s radius r.

Experiments and discussions

Extensive experiments were conducted to validate the proposed method. A Logitech C910 camera was attached to the end of a Universal Robots UR5 arm. Three cases were considered: one circle printed on A4 paper, one circular flat washer, and one nut. To avoid computation error caused by coordinate transformation, a calibration board was used as the reference coordinate frame and placed beside the printed circle, the circular flat washer, and the nut, as shown in Figure 6. The radius of the printed circle is 40.5 mm. The radii of the outer and inner circles of the circular flat washer are 40 mm and 16.5 mm, respectively. The radius of the inner circle of the nut is 16 mm. The calibration board is a 6 × 9 checkerboard with each tile having a side length of 15 mm. The software was Visual Studio 2010 and Matlab 2010b, running on a computer with an Intel(R) Core(TM) i5-4590 CPU at 3.30 GHz and 4.00 GB of RAM.

Figure 6.

Experiment scenes.

Experiments of influence of monocular eye-in-hand system’s perspective on estimation performance of the printed circle’s normal vector, center coordinates, and radius

The printed circle’s normal vector n , center coordinates X _c , and radius r are computed with two different images taken at two different perspectives to validate the analysis of the influence of monocular eye-in-hand system’s perspective on a spatial circle’s 6-D position-pose estimation performance.

As shown in Figure 7, the camera moves around the printed circle. Initially, the angle between the optical axis of the camera at P₀ and the plane of the printed circle is almost 90°. Then, the angle gradually decreases until $\frac{b}{a}$ of the projection ellipse is about 0.5. Eleven images are taken at 11 positions. The two images taken at P₀ and P₁₀ are shown in Figure 8. Here, the checkerboard is used as the reference coordinate frame.

Figure 7.

Position of the monocular eye-in-hand system and printed circle.

Figure 8.

Images of printed circle. (a) P ₀ and (b) P ₁₀.

First, the projection ellipses of the images taken at P₁, $\dots$ , P₁₀ are each matched with the projection ellipse of the image taken at P₀. After matching, the printed circle’s center coordinates X_c are computed with the two projection ellipses in the images taken at P₀ and $P_{i}, i = 1, \dots, 10$ . Its normal vector n is computed with the projection ellipse in the image taken at $P_{i}, i = 1, \dots, 10$ . With the corresponding center coordinates X_c and normal vector n , the radius r is computed twice: once with the projection ellipse in the image taken at P₀, and once with the projection ellipse in the image taken at $P_{i}, i = 1, \dots, 10$ . Twenty experiments were carried out.

The average computation errors of the printed circle’s center coordinates X _c , normal vector n , and radius r are shown in Table 1 and Figure 9.

Table 1.

Computation errors of the printed circle’s center coordinates X _c , radius r, and normal vector n .

Line	Quantity
1	Image $P_{i}, i = 1, \dots, 10$	P₁	P₂	P₃	P₄	P₅	P₆	P₇	P₈	P₉	P₁₀
1	$b / a$	0.96	0.95	0.92	0.88	0.8	0.73	0.68	0.62	0.54	0.45
2	Images P_i and P₀ for X_c	P₁	P₂	P₃	P₄	P₅	P₆	P₇	P₈	P₉	P₁₀
2	Error of X_c (mm)	6.08	5.34	3.06	3.09	1.77	1.79	1.55	1.87	1.52	1.31
3	Image P_i for n	P₁	P₂	P₃	P₄	P₅	P₆	P₇	P₈	P₉	P₁₀
3	Error of n (°)	7.41	7.28	8.95	4.33	1.98	1.42	0.73	1.30	0.61	0.59
4	Image P₀ for r	P₀	P₀	P₀	P₀	P₀	P₀	P₀	P₀	P₀	P₀
4	Error of r (mm)	0.91	1.20	1.01	0.88	0.69	0.68	0.65	0.67	0.64	0.59
5	Image P_i for r	P₁	P₂	P₃	P₄	P₅	P₆	P₇	P₈	P₉	P₁₀
5	Error of r (mm)	1.34	1.83	2.88	3.2	5.72	7.02	9.56	11.44	18.42	29.79

Values in boldface are the best.

Figure 9.

Influence of $\frac{b}{a}$ on computation errors of (a) X _c , (b) n , (c) r with image at P ₀, and (d) r with image at P_i .

Line 2 of Table 1 and Figure 9(a) show that the computation error of the printed circle’s center coordinates X_c is largest when it is computed with the two projection ellipses in the images taken at P₀ and P₁. The reason is that the angle between the optical axis of the camera at P₀ and the optical axis of the camera at P₁ is so small that the reconstruction uncertainty of the circle’s center is large. As the angle increases, the error of the center coordinates X_c decreases. With the two projection ellipses in the images taken at P₀ and P₁₀, the computation error of X_c is the smallest, as shown in boldface.

Line 3 of Table 1 and Figure 9(b) show that the computation error of the normal vector n generally decreases as $\frac{b}{a}$ decreases. The computation error of the printed circle’s normal vector n is smallest with the projection ellipse in the image taken at P₁₀.

With the computed normal vector n and center coordinates X _c of the printed circle, its radius r can be computed with the projection ellipse in the image taken at P ₀ or at P_i, $i = 1, \dots,10$. The computation error of the printed circle’s radius r is shown in Lines 4 and 5 of Table 1 and in Figure 9(c) and (d). Line 4 of Table 1 and Figure 9(c) show the error of r computed with the normal vector n of Line 3, the center coordinates X _c of Line 2, and the projection ellipse in the image taken at P ₀. Line 5 of Table 1 and Figure 9(d) show the error of r computed with the same normal vector n and center coordinates X _c, but with the projection ellipse in the image taken at P_i, $i = 1, \dots,10$. The computed radius r has the highest accuracy with the projection ellipse in the image taken at P ₀, where $\frac{b}{a}$, the ratio of the semi-minor axis length to the semi-major axis length, is close to 1. The accuracy of r is also influenced by the accuracy of the center coordinates X _c and the normal vector n: as shown in Figure 9(c), the more accurate the estimation of X _c and n, the more accurate the estimation of r when all radii are computed with the projection ellipse in the image taken at P ₀. However, the computation error of r increases sharply as the ratio $\frac{b}{a}$ of the projection ellipse in the image decreases, as shown in Line 5 of Table 1 and Figure 9(d).

From the above experimental results, we can see that the printed circle’s center coordinates X _c have the highest accuracy when computed with the projection ellipses in the images taken at P ₀ and P ₁₀, and that the printed circle’s normal vector n has the highest accuracy when computed with the projection ellipse in the image taken at P ₁₀. Based on these most accurate estimates of X _c and n, the printed circle’s radius r has the highest accuracy when computed with the projection ellipse in the image taken at P ₀.
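These selections can be checked directly against Table 1. The short Python sketch below transcribes the four error rows of the table and reports, for each row, the camera position with the smallest error. The dictionary keys and the helper `best_position` are illustrative names only.

```python
POSITIONS = [f"P{i}" for i in range(1, 11)]

# Error rows transcribed from Table 1 (Lines 2 to 5), one value per image P1..P10.
TABLE_1 = {
    "X_c error (mm), ellipses at P0 and P_i": [6.08, 5.34, 3.06, 3.09, 1.77, 1.79, 1.55, 1.87, 1.52, 1.31],
    "n error (deg), ellipse at P_i": [7.41, 7.28, 8.95, 4.33, 1.98, 1.42, 0.73, 1.30, 0.61, 0.59],
    "r error (mm), ellipse at P0": [0.91, 1.20, 1.01, 0.88, 0.69, 0.68, 0.65, 0.67, 0.64, 0.59],
    "r error (mm), ellipse at P_i": [1.34, 1.83, 2.88, 3.20, 5.72, 7.02, 9.56, 11.44, 18.42, 29.79],
}


def best_position(errors):
    """Return (position label, error) for the smallest error in a row."""
    idx = min(range(len(errors)), key=errors.__getitem__)
    return POSITIONS[idx], errors[idx]


for name, errors in TABLE_1.items():
    position, error = best_position(errors)
    print(f"{name}: smallest at {position} ({error})")
```

The first three rows reach their minimum in the P ₁₀ column, whereas the radius computed with the ellipse at P_i is best at P ₁ and degrades steadily toward P ₁₀.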

The experimental results validate the analysis of the influence of the monocular eye-in-hand system’s perspective on a spatial circle’s 6-D position-pose estimation performance. In one image, $\frac{b}{a}$, the ratio of the semi-minor axis length to the semi-major axis length of the projection ellipse, should be as small as possible to decrease the computation error of the spatial circle’s normal vector n. In the other image, $\frac{b}{a}$ should be close to 1 to decrease the computation error of the radius r. Finally, the angle between the two rays that connect the spatial circle’s center with the optical center of the camera at the two positions should be large enough to decrease the computation error of the spatial circle’s center coordinates X _c.

Comparison experiments for 6-D position-pose estimation of the spatial circle with different movement strategies

To validate the effectiveness of the active movement strategy, comparison experiments were conducted for 6-D position-pose estimation of one circular flat washer and one nut. Images are taken by the monocular eye-in-hand system with the proposed active movement strategy, the movement strategy in the study by Ghosh et al.,²⁵ and the random movement strategy, respectively.

With the proposed active movement strategy, three images are taken at Location 1, Location 2, and Location 3 in each experiment, as shown in Figure 10. The circular flat washer’s normal vector n, center coordinates X _c, and radius r are computed with Step C in the section entitled “Active movement strategy for 6-D position-pose estimation of a spatial circle”. With the random movement strategy, two images are taken from two randomly chosen perspectives in each experiment, and n, X _c, and r are computed with the two projection ellipses in these two images.

Figure 10.

Images of the circular flat washer with active movement. The detected ellipses are drawn in green. (a) Location 1, (b) Location 2, and (c) Location 3.

Circular flat washer

The circular flat washer has one inner circle and one outer circle, as shown in Figure 6(b). Twenty experiments were conducted with each of the proposed active movement strategy, the random movement strategy, and the movement strategy in the study by Ghosh et al.²⁵ Three images taken at Location 1, Location 2, and Location 3 are shown in Figure 10. The average estimation error of the pose and position of the circular flat washer is shown in Table 2.

Table 2.

Estimation error of 6-D position and pose of the circular flat washer.

Estimation error	Active	Random	Move²⁵
n (°)	0.49	1.64	3.94
X _c (mm)	0.77	0.92	5.51
r_i (mm)	0.29	5.85	1.41
r_o (mm)	0.07	4.69	3.33
$\delta_{r_{i}}$ (%)	1.76	35.45	8.55
$\delta_{r_{o}}$ (%)	0.18	11.73	8.33

r_i: radius of inner circle; r_o: radius of outer circle; $\delta_{r_{i}}$: relative error of r_i; $\delta_{r_{o}}$: relative error of r_o. Values in boldface are the best.

For every quantity in Table 2, the estimation error with the active movement strategy is smaller than that with the random movement strategy and the movement strategy in the study by Ghosh et al.²⁵ Furthermore, with the active movement strategy, the estimation error of the inner circle’s radius (0.29 mm) is larger than that of the outer circle (0.07 mm). The reason is that the edge shadow has an influence on the inner circle’s projection ellipse in the image.

Nut

The nut is shown in Figure 6(c). Twenty experiments were conducted with each of the proposed active movement strategy, the random movement strategy, and the movement strategy in the study by Ghosh et al.²⁵ Three images taken at Location 1, Location 2, and Location 3 are shown in Figure 11. The average estimation error of the pose and position of the nut is shown in Table 3. The estimation error with the active movement strategy is smaller than that with the random movement strategy and the movement strategy in the study by Ghosh et al.²⁵

Figure 11.

Images of the nut with active movement. The detected ellipses are drawn in green. (a) Location 1, (b) Location 2, and (c) Location 3.

Table 3.

Estimation error of 6-D position and pose of the nut.

Estimation error	Active	Random	Move²⁵
n (°)	0.71	4.49	1.82
X _c (mm)	0.83	1.94	24.7
r (mm)	0.03	6.53	1.06
$\delta_{r}$ (%)	0.19	40.81	6.63

$\delta_{r}$: relative error of r. Values in boldface are the best.

The movement of the monocular eye-in-hand system is planned in order to obtain images from special perspectives, and with these images the 6-D position-pose estimation of the nut can be improved. The cost is the runtime for the active movements, as shown in Table 4: the time for active movement accounts for a large proportion of the total runtime. This is the disadvantage of the proposed method.

Table 4.

Runtime for 6-D position-pose estimation of the nut.

Time	Active	Random	Move²⁵
$T_{m} (s)$	15.222	6.415	8.499
$T_{f} (s)$	0.033	0.033	0.033
$T_{c} (s)$	0.280	0.242	1.299
$T_{t} (s)$	15.568	6.723	10.061

$T_{m}$: runtime for movement; $T_{f}$: runtime for matching ellipses; $T_{c}$: runtime for computing n, X _c, and r; $T_{t}$: total runtime for one experiment.
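The share of the total runtime spent on movement can be read off Table 4 with simple arithmetic. The Python sketch below transcribes the table and computes $T_{m} / T_{t}$ for each strategy. The dictionary layout, the key `Move25` (standing for the strategy of reference 25), and the helper `movement_share` are illustrative names only.

```python
# Runtimes in seconds, transcribed from Table 4.
RUNTIME_S = {
    "Active": {"T_m": 15.222, "T_f": 0.033, "T_c": 0.280, "T_t": 15.568},
    "Random": {"T_m": 6.415, "T_f": 0.033, "T_c": 0.242, "T_t": 6.723},
    "Move25": {"T_m": 8.499, "T_f": 0.033, "T_c": 1.299, "T_t": 10.061},
}


def movement_share(times):
    """Fraction of the total runtime T_t spent on movement T_m."""
    return times["T_m"] / times["T_t"]


for strategy, times in RUNTIME_S.items():
    print(f"{strategy}: movement takes {100 * movement_share(times):.1f}% of the total runtime")

extra = RUNTIME_S["Active"]["T_m"] - RUNTIME_S["Random"]["T_m"]
print(f"Extra movement time of Active over Random: {extra:.3f} s")
```

Movement accounts for roughly 98% of the total runtime with the active strategy, 95% with the random strategy, and 84% with the strategy of reference 25, so the additional cost of the active strategy is almost entirely the extra 8.8 s of robot motion compared with the random strategy.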

Conclusions

A method is proposed to plan the movement of a monocular eye-in-hand system for improving the position and pose estimation accuracy of a spatial circle. The influence of the monocular eye-in-hand system’s perspective on the position and pose estimation accuracy of a spatial circle is analyzed. Based on this analysis, an active movement strategy of the monocular eye-in-hand system is proposed for more accurate position and pose estimation of a spatial circle. Extensive experiments have been conducted to demonstrate the effectiveness of the proposed method for position and pose estimation of one printed circle, one circular flat washer, and one nut. The experimental results validate that the 6-D position and pose estimation error with the active movement strategy is smaller than that with the random movement strategy and another movement strategy.²⁵ It should also be pointed out that the improvement in estimation performance comes at the expense of the time for movement.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study has been funded by the National High-tech Research and Development Program under Grant Nos. 2015AA042307 and 2015AA042201 and the National Natural Science Foundation of China under Grant Nos. U1706228 and U1613223.

Appendix 1

References

1. Smith, Karayiannidis, Nalpantidis. Dual arm manipulation-A survey. Rob Autonom Syst 2012; 60(10): 1340–1353.

2. Jiao, Cao. Embedded vision-based autonomous move-to-grasp approach for a mobile manipulator. Int J Adv Rob Syst 2012; 9(257): 1–8.

3. Chen, Birk, Kelley. Estimating workpiece pose using the feature points method. IEEE Trans Autom Control 1980; 25(6): 1027–1041.

4. Safaee-Rad, Tchoukanov, Smith. Three-dimensional location estimation of circular features for machine vision. IEEE Trans Rob Autom 1992; 8(5): 624–640.

5. Dhome, Richetin, Lapreste. Determination of the attitude of 3D objects from a single perspective view. IEEE Trans Pattern Anal Mach Intell 1989; 11(12): 1265–1278.

6. Nister. An efficient solution to the five-point relative pose problem. IEEE Trans Pattern Anal Mach Intell 2004; 26(6): 756–770.

7. Gordon, Lowe. What and where: 3D object recognition with accurate pose. Toward Categ Level Obj Recognit 2006; 4170: 67–82.

8. Collet, Martinez, Srinivasa. The MOPED framework: object recognition and pose estimation for manipulation. Int J Rob Res 2011; 30(10): 1284–1306.

9. Cheng, Jiang. A robust and efficient algorithm for tool recognition and localization for space station robot. Int J Adv Rob Syst 2014; 11: 193, 1–15.

10. Wilson, Hulls CW, Bell. Relative end-effector control using Cartesian position based visual servoing. IEEE Trans Rob Autom 1996; 12(5): 684–696.

11. Deng, Wilson WJ, Janabi-Sharifi. Decoupled EKF for simultaneous target model and relative pose estimation using feature points. In: Proceedings of IEEE conference on control applications, Toronto, Canada, 28–31 August 2005, pp.749–754. IEEE.

12. Janabi-Sharifi, Marey. A Kalman-filter-based method for pose estimation in visual servoing. IEEE Trans Rob 2010; 26(5): 939–947.

13. Rusu, Bradski, Thibaux. Fast 3D recognition and pose using the viewpoint feature histogram. In: Proceedings of IEEE/RSJ conference on intelligent robots and systems, Taipei, Taiwan, 18–22 October 2010, pp.2155–2162. IEEE.

14. Assa, Janabi-Sharifi. A robust vision-based sensor fusion approach for real-time pose estimation. IEEE Trans Cybern 2014; 44(2): 217–227.

15. Aldoma, Tombari, Prankl. Multimodal cue integration through hypotheses verification for RGB-D object recognition and 6 DOF pose estimation. In: Proceedings of IEEE conference on robotics and automation, Karlsruhe, Germany, 6–10 May 2013, pp.2104–2111. IEEE.

16. Dang, Suh. A sensor-fusing system for spatial circle and trunk parameter estimation. J Circu Sys Comput 2016; 25(10): 1650117.

17. Ligorio, Sabatini. Extended Kalman filter-based methods for pose estimation using visual, inertial and magnetic sensors: comparative analysis and performance evaluation. Sensors 2013; 13(2): 1919–1941.

18. Chen, Huang. A vision-based method for the circle pose determination with a direct geometric interpretation. IEEE Trans Rob Autom 1999; 15(6): 1135–1140.

19. Shauri RLA, Nonami. Calculation of 6-DOF pose of arbitrary inclined nuts for a grasping task by dual-arm robot. J Rob Mechatron 2012; 24(1): 191–204.

20. Wang, Chen. Direct solution for pose estimation of single circle with detected center. Electron Lett 2016; 52(21): 1751–1753.

21. Leng. A new solution of ambiguity in pose estimation of circle feature using a concentric circle constraint. In: Proceedings of international conference on information science and technology, Dalian, China, 6–8 May 2016, pp.470–475. IEEE.

22. Huang, Sun, Zhu. Vision pose estimation from planar dual circles in a single image. Optik Int J Light Electron Opt 2016; 127(10): 4275–4280.

23. Xia. 3D location estimation of underwater circular features by monocular vision. Optik Int J Light Electron Opt 2013; 124(23): 6444–6449.

24. Huang, Sun, Zeng. General fusion frame of circles and points in vision pose estimation. Optik Int J Light Electron Opt 2018; 154: 47–57.

25. Ghosh, Ray, Vadali SRK. Reliable pose estimation of underwater dock using single camera: a scene invariant approach. Machin Vision Appl 2016; 27(2): 221–236.

26. Tsai, Lenz. A new technique for fully autonomous and efficient 3D robotics hand/eye calibration. IEEE Trans Rob Autom 1989; 5(3): 345–358.

27. Fornaciari, Prati. Very fast ellipse detection for embedded vision applications. In: Proceedings of international conference on distributed smart cameras, Hong Kong, China, 30 October–2 November 2012, pp.1–6. IEEE.

28. Quan. Conic reconstruction and correspondence from two views. IEEE Trans Pattern Anal Mach Intell 1996; 18(2): 151–160.

29. Zhang, Wei. A position-distortion model of ellipse centre for perspective projection. Meas Sci Technol 2003; 14(8): 1420–1426.

30. Liu, Sun. Research on calculation method for the projection of circular target center in photogrammetry. Chin J Sci Instrum 2011; 32(10): 2235–2241.

31. Hartley, Zisserman. Multiple view geometry in computer vision. New York: Cambridge University Press, 2003, pp. 310–321.

Active 6-D position-pose estimation of a spatial circle using monocular eye-in-hand system
