Abstract
Feature matching is one of the most important steps in location technology based on zooming images. Building on the scale-invariant feature transform (SIFT) matching algorithm, this article proposes and compares several improved false matches elimination algorithms. First, the features of zooming images and the ranging models are introduced in detail within the theoretical framework of SIFT feature detection and matching, and the key role of feature matching and false matches elimination in zooming-image ranging is discussed. Second, false matches are eliminated with higher accuracy by an approach based on the geometry constraint of zooming images. Third, false matches are removed by an elimination algorithm based on the properties of SIFT features. Finally, an iterative false matches elimination algorithm based on the distance from the epipole to the epipolar line is proposed; this algorithm can also calibrate the shrink-amplify center of zooming images in real time. Experimental results demonstrate that the three proposed algorithms are stable, that false matches of feature points can be eliminated effectively by combining the three methods, and that the remaining matching points can be applied to robot visual servoing.
Introduction
Depth estimation from images is a key step toward image understanding; it is a fundamental problem in computer vision research and has important applications in robotics, scene understanding, and three-dimensional (3D) reconstruction.1,2 Zooming images provide a depth cue for monocular vision and are widely applied in visual monitoring, visual tracking, and robot environment sensing and map building.3–5 According to the literature, using a zoom lens for depth estimation was first proposed by Ma and Olsen;6 in theory, zooming images can provide depth information. Lavest and colleagues7–10 proposed precise depth estimation that models the zoom lens as a thick lens, based on a careful study of its optical properties, but their experiments were conducted in structured scenes. Asada and colleagues11–13 proposed a three-parameter model of zoom, focus, and aperture based on the actual structure of the zoom lens; the experimental results show that the model is suitable only for high-accuracy, low-distortion lenses. Fayman et al.14 proposed active zoom tracking, which applies depth estimation from zooming images to visual tracking and widened the applications of zooming-image depth estimation. Gao et al.15 investigated mobile robot visual servoing for object tracking and obstacle avoidance based on zooming images: robust feature matching based on the scale-invariant feature transform (SIFT) was realized through the geometry constraint of zooming images, 3D reconstruction of real scenes from zooming images was established, and robot experiments validated the practicability of the related algorithms. However, these studies mostly focus on the reconstruction model of zooming images using a few special points; they lack automatic sparse and dense matching and, in particular, an investigation of false matches elimination algorithms.
Sparse matching focuses on automatically matching feature points of images taken at two different focal lengths, whereas dense matching focuses on pixel-by-pixel matching of the two images. These two kinds of matching are the basis of sparse reconstruction and dense reconstruction, respectively. For sparse matching, one crucial aspect of the image registration technique is how to choose the image features.
Classical operators for this purpose include the Harris corner detection operator,16 the SUSAN corner detection operator,17 and the SIFT detection operator.18 The SIFT operator is one of the most popular and effective because it is insensitive to illumination, rotation, and scaling. Because two images of the same scene taken at different focal lengths are related by rotation, translation, and scaling, the SIFT operator is well suited to detecting and matching their features. However, some false matches remain after the initial SIFT matching, and they greatly affect the accuracy of subsequent 3D reconstruction or 3D ranging. For binocular vision systems, false matches have been removed with the random sample consensus (RANSAC) method,19,20 with parallax filtering in 3D space,21,22 and with association rules.23 However, these methods are limited to binocular stereo vision systems, which are quite different from the zooming ranging system. Therefore, this article investigates approaches to eliminate false matches in zooming images. First, feature detection and matching algorithms based on the SIFT operator are studied. Second, three improved false matches elimination algorithms are proposed based on different constraints. Finally, the experimental results are discussed to demonstrate the performance of the proposed methods.
The rest of the article is organized as follows: The theory of zooming image depth estimation is described in section “Depth estimation principle of zooming images.” The theory of the SIFT matching algorithm for zooming images is described in section “Feature detection and matching based on SIFT operator.” Three improved false matches elimination algorithms are proposed in section “False matches elimination.” The experimental results and analysis of the different false matches elimination algorithms in the same environment are given in section “Experiment results and analysis.” Finally, the article is concluded with remarks in section “Conclusion.”
Depth estimation principle of zooming images
Depth estimate principle of the pinhole model
In the pinhole model, zoom is equivalent to the camera optical center’s movement along the axis (as shown in Figure 1 of
where

Pinhole model for two distinct focal lengths.
Depth estimate principle of the thick-lens model
The thick-lens model is considered as an ideal model of the zoom lens.7
As shown in Figure 2(a), plane

Thick-lens model for zoom lenses.
Compared with the pinhole model, the thick-lens model is a more accurate model for zoom depth estimation, because the actual change in focal length during zooming is not equal to the change in the object distance.7
In the thick-lens model, the translation of the principal plane
Obviously, whichever model is used, once the camera model has been calibrated, obtaining the matching points in the zoom images is a crucial step for zoom depth estimation.
Feature detection and matching based on SIFT operator
SIFT feature detection
The SIFT algorithm was first proposed by D.G. Lowe in 1999 and optimized in Lindeberg.18 A further improvement, which applies principal components analysis (PCA) in place of the gradient histogram, is described by Ke and Sukthankar.24 The method copes with image distortion caused by factors such as scaling, rotation, partial occlusion, and viewpoint change, which makes it very suitable for research on image sequence processing. Experiments and a performance comparison of 10 of the most representative feature descriptors (such as invariant moments, cross-correlation, and SIFT) are discussed in Mikolajczyk and Schmid.25 The results show that the SIFT descriptor still yields accurate feature points with stable performance under changes in light intensity, image scaling, rotation, and affine transformation.
Scale-space extrema detection
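Scale-space extrema detection needs to examine all points across all scales. The Gaussian kernel has been shown to be the only linear kernel able to generate a scale-space representation of an image. The scale-space of an image is defined as a function
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$
with
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} e^{-(x^{2} + y^{2})/(2\sigma^{2})}$$
where $I(x, y)$ is the input image, $*$ denotes convolution in $x$ and $y$, $(x, y)$ are the pixel coordinates, and $\sigma$ is the scale factor.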
To detect stable feature points in the scale-space effectively, the original image is convolved with a set of consecutive Gaussian kernels, which generates a set of scale-space images and thus a multi-scale representation of the image. This is equivalent to adding a new scale coordinate to the image data.
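In general, the smaller the scale factor $\sigma$, the less the image is smoothed and the finer the detail that is retained; a larger $\sigma$ corresponds to the coarser, overall structure of the image.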
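The construction of a set of scale-space images by repeated Gaussian convolution can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in the experiments; the function names, the kernel radius of three standard deviations, the border replication, and the example scale values are all assumptions made for the sketch.

```python
import math

def gaussian_kernel_1d(sigma, radius=None):
    """Sampled 1D Gaussian, normalized to sum to 1 (radius defaults to ceil(3*sigma))."""
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve every row with a symmetric kernel, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2D Gaussian smoothing: rows first, then columns."""
    kernel = gaussian_kernel_1d(sigma)
    rows_done = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(rows_done), kernel))

def scale_space(image, sigmas):
    """One smoothed copy of the image per scale factor."""
    return [gaussian_blur(image, s) for s in sigmas]

def difference_of_gaussians(levels):
    """Pixel-wise difference between adjacent scale-space levels."""
    return [[[b - a for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(lvl_a, lvl_b)]
            for lvl_a, lvl_b in zip(levels, levels[1:])]
```

For example, `scale_space(image, [1.0, 1.6, 2.56])` returns three progressively smoother images, and `difference_of_gaussians` of that list returns the two difference images in which extreme points would be sought.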
Extreme point edge response
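The extrema of the Gaussian difference operator have a large principal curvature across an edge but a small principal curvature along it, so it is necessary to examine the extreme points more carefully and remove unstable edge responses. Generally, the principal curvatures can be obtained from a 2×2 Hessian matrix
$$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix} \quad (4)$$
It is known from formula (4) that matrix $H$ is built from the second-order partial derivatives of the Gaussian difference image $D$, estimated from differences of neighboring sample points.
The principal curvatures of $D$ are proportional to the eigenvalues of $H$, so that
$$\mathrm{Tr}(H) = D_{xx} + D_{yy} = \alpha + \beta, \qquad \mathrm{Det}(H) = D_{xx} D_{yy} - D_{xy}^{2} = \alpha\beta$$
where $\alpha$ is the larger eigenvalue of $H$ and $\beta$ is the smaller one.
If $\alpha = r\beta$, then
$$\frac{\mathrm{Tr}(H)^{2}}{\mathrm{Det}(H)} = \frac{(\alpha + \beta)^{2}}{\alpha\beta} = \frac{(r + 1)^{2}}{r}$$
where $r$ is the ratio of the larger eigenvalue to the smaller one.
Therefore, detecting whether the principal curvature ratio of $D$ is below a threshold $r$ only requires checking whether
$$\frac{\mathrm{Tr}(H)^{2}}{\mathrm{Det}(H)} < \frac{(r + 1)^{2}}{r}$$
holds. Generally, $r = 10$, the value suggested in Lowe's formulation of SIFT.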
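The edge-response check of this subsection is commonly implemented, in Lowe's formulation of SIFT, by comparing the ratio of the squared trace to the determinant of the Hessian with a bound derived from a curvature ratio r. The sketch below assumes that formulation; the function name is hypothetical, and the default r = 10 is the value suggested by Lowe rather than a value confirmed for the experiments in this article.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Return True if an extreme point is NOT an unstable edge response.

    dxx, dyy, dxy are the second derivatives of the difference-of-Gaussian
    image at the extreme point. r is the allowed ratio between the larger
    and the smaller principal curvature (assumed default: 10).
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0.0:
        # Curvatures of opposite sign (or a degenerate Hessian): reject.
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r
```

A point with equal curvatures, such as `passes_edge_test(1.0, 1.0, 0.0)`, is kept, whereas a strongly elongated response such as `passes_edge_test(20.0, 1.0, 0.0)` is rejected.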
Determining the direction parameter of key points
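Each key point in the SIFT algorithm needs to be assigned a principal direction, which ensures rotation invariance. The gradient magnitude and orientation are computed for each pixel in the region around the key point by the following equations
$$m(x, y) = \sqrt{\left(L(x + 1, y) - L(x - 1, y)\right)^{2} + \left(L(x, y + 1) - L(x, y - 1)\right)^{2}}$$
$$\theta(x, y) = \arctan\frac{L(x, y + 1) - L(x, y - 1)}{L(x + 1, y) - L(x - 1, y)}$$
where $L$ is the scale-space image at the scale of the key point.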
So far, the key points of image have been checked out. Each key point contains three parameters: position, scale, and direction.
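The direction assignment can be illustrated with a short sketch. It assumes the usual central-difference gradient on the smoothed image and a magnitude-weighted orientation histogram; the function names, the window radius, and the 36-bin histogram (the bin count used in Lowe's formulation) are assumptions of this example, and the Gaussian weighting of the window is omitted for brevity.

```python
import math

def gradient_at(L, x, y):
    """Gradient magnitude and orientation (radians) at pixel (x, y).

    L is a smoothed image stored as a list of rows, indexed L[y][x].
    Central differences are used, so (x, y) must not lie on the border.
    """
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    return math.hypot(dx, dy), math.atan2(dy, dx)

def dominant_orientation(L, cx, cy, radius=2, bins=36):
    """Principal direction (degrees, bin center) around key point (cx, cy).

    Each pixel in the (2*radius+1)^2 window votes for its orientation bin
    with a weight equal to its gradient magnitude.
    """
    hist = [0.0] * bins
    two_pi = 2.0 * math.pi
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            magnitude, theta = gradient_at(L, x, y)
            index = int((theta % two_pi) / two_pi * bins) % bins
            hist[index] += magnitude
    best = max(range(bins), key=hist.__getitem__)
    return (best + 0.5) * 360.0 / bins
```

On a horizontal intensity ramp every pixel votes for the bin that starts at 0°, so the returned principal direction is the center of that bin.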
Characteristic vector descriptor creation
Each key point is regarded as the center of a neighborhood window; Figure 3 illustrates the construction with an 8×8 window. In the standard SIFT descriptor, a 16×16 window is divided into 4×4 subregions, each contributing an 8-bin gradient direction histogram, so that each key point forms a 128D characteristic vector. As shown in Figure 3(a), the central position is the position of the current key point, and each division represents a pixel. The direction of each arrow represents the gradient direction of the pixel, and its length represents the gradient modulus. The circle in the graph represents the Gaussian weighted range. The characteristic vector descriptor is generated by accumulating gradient direction histograms, as shown in Figure 3(b).

Characteristic vectors are generated by image gradients: (a) image gradients and (b) key point characteristic vector.
Feature matching based on minimum distance
Once the SIFT characteristic vectors have been generated, the Euclidean distance between the characteristic vectors of key points in the two images is used as the similarity criterion for feature matching. For each key point in the first image, a traversal search finds the two key points in the second image with the smallest Euclidean distances. If the closest distance divided by the second-closest distance is less than a certain threshold, the key point and its nearest neighbor are accepted as a pair of matching points. Lowering the threshold reduces the number of SIFT matching points but makes the matching more stable (in this article, the threshold is 0.6). However, at the bottom of the overlapping area of the two images, most feature points of the benchmark image cannot find matching points. In addition, sensor motion, scene occlusion, and similar structures may generate false matches, so it is necessary to investigate effective false matches elimination algorithms.
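The minimum-distance matching rule can be sketched as follows. The example performs the traversal search and the ratio check with the threshold of 0.6 stated above; the function names and the representation of descriptors as plain lists of numbers are assumptions of this illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two characteristic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_test_matches(desc1, desc2, threshold=0.6):
    """Match descriptors of image 1 against image 2 by traversal search.

    A key point i of the first image is matched to its nearest neighbor j
    in the second image only if (closest distance / second-closest
    distance) is below the threshold. Returns a list of (i, j) index pairs.
    """
    matches = []
    if len(desc2) < 2:
        return matches  # the ratio needs two candidates
    for i, d in enumerate(desc1):
        ranked = sorted((euclidean(d, e), j) for j, e in enumerate(desc2))
        (best, j), (second, _) = ranked[0], ranked[1]
        if second > 0.0 and best / second < threshold:
            matches.append((i, j))
    return matches
```

A key point whose two nearest candidates are almost equally distant is left unmatched, which is the case the ratio check is designed to reject.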
False matches elimination
False matches elimination based on zooming image geometry constraint
Some false matches are inevitable when the SIFT algorithm is used for image matching. Whether an application or improvement of the SIFT algorithm succeeds depends largely on the proportion of correct matches, so determining which matches are correct is the essential problem.
Research on the SIFT algorithm generally uses synthetic images as samples, because the correspondence between the original image and the noise-added image is known in advance, which makes it easy to collect statistics on the matching results. For images sampled at different focal lengths, however, the correspondence between matching points varies with the image content. It is therefore necessary to analyze the characteristics of such images and find appropriate standards for evaluating the matching results.
Figure 4 shows the ideal matching points and practical matching points in the zooming images. P1 and P2 are a pair of ideal matching points; P1′ and P2′ are a pair of practical matching points.

The ideal matching points and practical matching points in zooming images.
Depth estimation from zooming images rests on a basic assumption: in the ideal state, the two points of a matching pair have the same radial slope (matching points P1 and P2 in Figure 4). Having the same radial slope is therefore a necessary condition for a correct match, and this condition can be used to remove false matches from the matching results.
In practice, because of imaging distortion, the radial slopes are not exactly the same even for correct matches (matching points P1′ and P2′ in Figure 4). A reasonable tolerance is therefore needed to retain the ideal matching points while eliminating the false matches.
Depth estimation from zooming images usually captures images at two fixed focal lengths, so the zoom center of the captured images can be determined in advance by calibration at these two focal lengths. On this basis, the following experiment is designed to measure the level of the radial angle of matching points. In the structured scene shown in Figure 5, the matching points are found from corner points.

Structured scene of detection experiments of matching point’s radial angle.
Assuming that there are
The matrix equation can be simplified as
Apparently, zoom center
After zoom center is identified, the radial angle can be calculated by two line equations. Assume that a pair of matched point coordinates respectively is
The matching point radial angle set is
In summary, false matches elimination based on zooming image geometry constraint consists of the following steps:
Step 1: Make use of traditional SIFT method to obtain the zooming image matching points set
Step 2: According to the matching points set
Step 3: Calculate the matching point radial angle set
Step 4: Obtain a set of ideal matching points
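The steps above can be sketched in code under stated assumptions. The example assumes that the zoom center is the least-squares intersection of the lines through each pair of matched points, that the radial angle of a pair is the angle between the two rays from the zoom center to the matched points, and that a pair is kept when this angle does not exceed a chosen upper limit. All function names and the limit value are hypothetical; the article's own equations for these quantities are not reproduced here.

```python
import math

def line_through(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y = c and unit normal."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("matched points coincide; no line is defined")
    a, b = a / norm, b / norm
    return a, b, a * p[0] + b * p[1]

def fit_zoom_center(matches):
    """Least-squares point closest to all lines joining matched points."""
    lines = [line_through(p, q) for p, q in matches]
    saa = sum(a * a for a, _, _ in lines)
    sab = sum(a * b for a, b, _ in lines)
    sbb = sum(b * b for _, b, _ in lines)
    sac = sum(a * c for a, _, c in lines)
    sbc = sum(b * c for _, b, c in lines)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("lines are (nearly) parallel; center is undefined")
    return (sac * sbb - sab * sbc) / det, (saa * sbc - sab * sac) / det

def radial_angle_deg(center, p, q):
    """Angle in degrees between the rays center->p and center->q."""
    ax, ay = p[0] - center[0], p[1] - center[1]
    bx, by = q[0] - center[0], q[1] - center[1]
    return abs(math.degrees(math.atan2(ax * by - ay * bx, ax * bx + ay * by)))

def filter_by_radial_angle(matches, center, max_angle_deg):
    """Keep the pairs whose radial angle is within the allowed limit."""
    return [(p, q) for p, q in matches
            if radial_angle_deg(center, p, q) <= max_angle_deg]
```

In use, the center would be fitted once from reliable pairs (for example, corner points of a structured calibration scene) and then passed to `filter_by_radial_angle` together with the chosen angle limit.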
False matches elimination based on SIFT feature property
The false matches elimination algorithm introduced above shows that more ideal match points can be obtained from the geometry constraint of zooming images. However, it cannot completely guarantee the correctness of the matching results, because the geometry constraint is only a necessary condition for a correct match. The experimental results also show that more obvious false matches appear as the allowed geometry constraint error increases. In this section, ideal match points are obtained by directly analyzing the SIFT feature attributes, on the basis of the geometry constraint of zooming images.
A SIFT key point generally carries its scale, main direction, and coordinate values, of which the scale and the main direction are the more important. Once match points have been obtained, the influence of these SIFT feature properties on the match points can be examined under the geometry constraint of the zooming images.
In the SIFT algorithm, key points of any scale and direction can become a pair of match points, because matching relies mainly on local gradient features; the scale and the main direction of the key points do not affect the matching result directly. However, zooming images involve no rotation or translation, so the main directions of correct match points should be consistent. The scale reflects the degree of blur of the original image, and only descriptors with similar blur levels are likely to be matched by the SIFT algorithm. For the same scene in zooming images, one image is usually relatively sharp and the other relatively blurred, and the difference in blur is relatively fixed. The smoothing scales needed to bring the two images to a similar degree of blur are therefore unequal, and their ratio is relatively fixed. This ratio is the scale ratio of the match points; with the focal lengths of 18 mm and 55 mm used for the zooming images, the ratio of focal lengths is close to 3.
In zooming images, the main directions of a pair of match points are the same, and the scale ratio of the match points is close to a constant, which is generally the ratio of the imaging focal lengths of the zooming images. Comparing the scale and the direction during key point matching can therefore improve both the accuracy and the efficiency of matching.
A simple method to eliminate the abnormal match points in scale and direction is to use the probability and statistics features of the all match points’ scales and the main directions. Assuming
The standard deviations are
Assume that the scale ratio of the match point and the ratio of the main direction respectively follow a normal distribution, the abnormal match points in scale and main direction can be eliminated by a confidence interval. Set the confidence intervals of the scale ratio and the main direction ratio of the match point are
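The statistical screening described in this section can be sketched as follows. The example assumes that each confidence interval is the mean plus or minus k standard deviations of the corresponding ratio, with k a hypothetical standard deviation factor; the population standard deviation and the tuple layout of a match are also assumptions of this illustration.

```python
import statistics

def within_interval(values, k):
    """Flag the values lying inside mean +/- k * standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [abs(v - mu) <= k * sd for v in values]

def filter_by_sift_properties(matches, k=2.0):
    """Remove match points with abnormal scale ratio or direction ratio.

    Each match is a tuple (scale1, direction1, scale2, direction2) holding
    the SIFT scale and main direction of the matched key points in the
    first and second image. direction1 and scale1 must be non-zero.
    """
    scale_ratios = [s2 / s1 for s1, _, s2, _ in matches]
    direction_ratios = [d2 / d1 for _, d1, _, d2 in matches]
    scale_ok = within_interval(scale_ratios, k)
    direction_ok = within_interval(direction_ratios, k)
    return [m for m, s, d in zip(matches, scale_ok, direction_ok) if s and d]
```

With a focal length ratio close to 3, correct matches cluster around a scale ratio of 3 and a direction ratio of 1, so a pair whose ratios lie far from these clusters falls outside both intervals and is dropped.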
Iterative false matches elimination based on distance from epipole to epipolar line
In fact, a zooming image pair is a special kind of translation image pair. Under ideal conditions, the lines connecting matching points intersect at a common epipole, as shown in Figure 6. An epipole can therefore be fitted from the epipolar lines of the matching points, and the distance from the epipole to each epipolar line can then be used to eliminate abnormal epipolar lines and, with them, the false matches.

Epipole of zooming image.
The least-squares method is used to fit the epipole. To increase the accuracy of the epipole, the SIFT feature attributes can be used to remove most false matches before the epipole is fitted. According to equation (22), the distance from the epipole to every straight line can be calculated
The distribution of the distance from the epipole to the epipolar lines is shown in Figure 7(a), and the perpendiculars from the epipole to the epipolar lines are shown in Figure 7(b); the length of each perpendicular is the Euclidean distance from the epipole to the corresponding epipolar line. The figure shows that a few epipolar lines that are far from the epipole should be eliminated. Suppose the

Statistic results of distance from epipole to epipolar line: (a) distribution from epipole to epipolar line and (b) position of epipole.
Standard deviation is
Presume that distance from epipole to epipolar line obeys normal distribution. Abnormal matching points can be eliminated by setting a confidence interval. The confidence interval of polar line distance of matching point is
Suppose
where
In summary, the steps of false matches elimination algorithm based on distance from epipole to epipolar line are as follows:
Step 1: Make use of traditional SIFT algorithm to obtain the zoom image matching points set
Step 2: According to the matching points set
Step 3: Calculate the distances from epipole to epipolar line, and calculate the confidence interval by equations (23) and (24), set the standard deviation factor
Step 4: Calculate
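The iterative procedure can be sketched as follows, under assumptions that go beyond what is stated above: the epipole is fitted by least squares to the lines joining matched points, a line is rejected when its distance to the epipole exceeds the mean plus k standard deviations, and the iteration stops when no further match is removed. The function names, the factor k, the pixel tolerance `tol` (which prevents numerical noise from being trimmed), and the iteration cap are all hypothetical.

```python
import math
import statistics

def line_through(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y = c and unit normal."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("matched points coincide; no line is defined")
    a, b = a / norm, b / norm
    return a, b, a * p[0] + b * p[1]

def fit_epipole(lines):
    """Least-squares point closest to all epipolar lines."""
    saa = sum(a * a for a, _, _ in lines)
    sab = sum(a * b for a, b, _ in lines)
    sbb = sum(b * b for _, b, _ in lines)
    sac = sum(a * c for a, _, c in lines)
    sbc = sum(b * c for _, b, c in lines)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("lines are (nearly) parallel; epipole is undefined")
    return (sac * sbb - sab * sbc) / det, (saa * sbc - sab * sac) / det

def eliminate_by_epipole_distance(matches, k=2.0, tol=0.5, max_iter=20):
    """Iteratively refit the epipole and drop matches with distant lines.

    Returns (epipole, remaining_matches). The epipole is the one fitted in
    the last iteration that was run.
    """
    current = list(matches)
    epipole = None
    for _ in range(max_iter):
        if len(current) < 2:
            break  # at least two lines are needed to fit a point
        lines = [line_through(p, q) for p, q in current]
        epipole = fit_epipole(lines)
        dists = [abs(a * epipole[0] + b * epipole[1] - c) for a, b, c in lines]
        limit = max(statistics.mean(dists) + k * statistics.pstdev(dists), tol)
        kept = [m for m, d in zip(current, dists) if d <= limit]
        if len(kept) == len(current):
            break  # converged: nothing was removed
        current = kept
    return epipole, current
```

Because the epipole is refitted after every removal, the final estimate is no longer biased by the rejected matches, which is also why the same loop yields an estimate of the zoom center.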
Experiment results and analysis
Results of initial SIFT matching
The experimental results show that some obvious false matches remain after the RANSAC algorithm is applied (Table 1). It is therefore necessary to eliminate false matches further to obtain ideal matching points.
The statistics of matching results.
Results of three proposed false matches elimination methods
False matches elimination based on zooming image geometry constraint
Before matching actual images, the level and distribution of the matching point radial angle can be measured in a simple structured scene to provide a reference for removing false matches. Suppose the matching point radial angle distribution interval to be
The experimental results show that the radial angles of matching points in the actual zoom images differ, but most of them are small (about 69.6% of the angles are less than 1° and about 93.1% are less than 2°). Matching points with larger radial angles are concentrated near the zoom center (only 6.9% of the angles are greater than 2°). According to these results, a suitable error level is chosen for screening the ideal matching points from the SIFT matching results.
The SIFT matching results shown in Figure 8 are processed further with the false matches elimination algorithm based on the geometry constraint. First, the number of match points whose radial angle is below a given value is counted, as shown in Figures 9 and 10. The angle ranges over 0°–180°, which covers all the match points. The figures show that the number of remaining match points increases gradually as the angle level increases.

Matching results of traditional method: (a) a pair of original various focus images, (b) initial SIFT matching results, and (c) matching results after RANSAC algorithm.

Results of radial angle of the matching point detection experiments: (a) results of the corner detection, (b) results of the angle of the matching points, (c) distribution of the angle of the matching points, and (d) distribution of large angle (angle is greater than 2°).

Angle distribution of the match point.
As Figure 11(a)–(c) shows, the number of remaining match points increases as the error level increases. With the improved SIFT method, false matches of feature points are eliminated effectively, accurate matching of the feature points is realized, and ideal match points for depth estimation from zooming images are obtained.

The effect of false matches elimination: (a) the upper limit of the angle: 7°, (b) the upper limit of the angle: 15°, and (c) the upper limit of the angle: 30°.
Figure 11 shows the effect of false matches elimination. For the ‘Flower’ example in the figure, there are no obvious false matches when the angle limit is small, but obvious false matches appear when the angle limit is large. Depth estimation from zooming images cannot tolerate match points with a large angle, so ideal match points should be selected with a small angle limit. The analysis of ‘Flower with checkerboard’ is the same as that of ‘Flower’; the data can be seen in Figure 11(a)–(c).
The experimental results show that the algorithm based on the geometry constraint eliminates false matches, realizes accurate matching of the feature points, and yields ideal matching points for depth estimation from zooming images. Like the RANSAC algorithm, the algorithm proposed in this article can adjust the number of matches through its threshold.
False matches elimination based on SIFT feature property
For the ‘Flower’ example, the scale distribution and the main direction distribution of the ideal match points are analyzed in Figures 12 and 13. Figure 12(a) shows the scale curves of the match points in the left and right images, and Figure 12(b) shows the scale ratio of corresponding match points. Figure 13(a) shows the main direction curves of the match points in the left and right images, and Figure 13(b) shows the main direction ratio of corresponding match points.

Scale distribution of ideal match point: (a) curve of scale of match point and (b) curve of scale ratio of match point.

Main direction distribution of ideal match point: (a) curve of the main direction of match point and (b) curve of ratio of the main direction of match point.
As the scale distribution in Figure 12 shows, the scale ratio of the match points approaches the constant 3, and only the first match point has an abnormal scale ratio. Similarly, Figure 13 shows that the direction ratio of corresponding match points is close to the constant 1, and again only the first match point has an abnormal ratio. Inspecting the match points on the original images confirms that the match point with abnormal scale and direction is indeed a false match, as shown in Figures 14 and 15.

False matches of abnormal scale and direction.

The result of false matches elimination based on SIFT feature property.
The initial matches used in the experiment are shown in Figure 8(b); there are 208 match points, among which many false matches exist. The result of the false matches elimination algorithm based on the properties of the SIFT features is shown in Figure 15: 121 match points remain, accounting for 91.67% of all match points, and only one false match is left. This result is superior to that of the algorithm based on the geometry constraint of zooming images and can be used for the subsequent 3D reconstruction of the surface. The algorithms based on the geometry constraint of zooming images can be improved further, because false match points whose scale ratio and main direction ratio satisfy the conditions still exist. The analysis of ‘Flower with checkerboard’ is the same as that of ‘Flower’; the data can be seen in Figures 12–15.

Statistic results after false matches elimination: (a) distance from epipole to epipolar line and (b) position of the epipole.
Iterative false matches elimination based on distance from epipole to epipolar line
A new epipole is fitted from the latest matching points, and the distances from this epipole to the epipolar lines are recalculated. Figure 16 shows the statistical results: the distances from the epipole to the epipolar lines are confined to a smaller range.
The accuracy of the matching results can also be seen from the radial angles of the matching points. The statistical results are shown in Figure 17: the maximum angle is 7.6°, the mean angle is 0.4°, and most radial angles of the matching points are confined to a small range.

Radial angle among the matching points.
The analysis of ‘Flower with checkerboard’ is the same as that of ‘Flower’, and the data can be seen in Figures 16–18. The results of false matches elimination based on the distance from the epipole to the epipolar line are shown in Figure 18, and no false matches are left.

Results of false matches elimination.
Conclusion
To address the false matches produced by the SIFT matching algorithm, the features of zooming images and the ranging model are studied in detail on the basis of the theory of SIFT feature detection and matching. Three false matches elimination approaches are investigated. First, part of the false matches is eliminated effectively based on the geometry constraint of zooming images, with matches filtered by an error level of the geometry constraint chosen from the experimental results; however, the percentage of ideal matching points needs to be increased further. Second, a false matches elimination algorithm based on the properties of the SIFT features further removes false matches by setting confidence intervals on the scale ratio and the main direction ratio of match points, so that more ideal match points are obtained. Third, an iterative false matches elimination algorithm based on the distance from the epipole to the epipolar line is proposed. Experiments on real images show that the three proposed algorithms are stable and practical, and that false matches of feature points can be eliminated effectively by combining the three methods. Applied after the RANSAC algorithm, the three proposed algorithms can eliminate almost all the false matches, and the remaining matching points can be applied to robot visual servoing, achieving more desirable 3D reconstruction results than those in Gao et al.15
Future work can focus on improving the speed of the SIFT matching algorithm and on finding more constraints to eliminate the remaining false matches. The elimination algorithms developed here can also be applied to feature matching of image pairs captured by a single camera moving along a linear guide rail.
Footnotes
Handling Editor: Fei Chen
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is partly supported by the National Natural Science Foundation of China (grant no. 51175494), the State Key Laboratory of Robotics Foundation (grant no. 2016008), Program for Liaoning Excellent Talents in University (grant no. LJQ2014021), the Natural Science Foundation of Liaoning Province (grant no. 201602652), and Shenyang Ligong University Computer Application Key Discipline Foundation (grant no. 4771004kfx09).
