Abstract
Establishing robust correspondences between two images is important for computer version tasks. However, in the real scene incorrect correspondences are inevitable no matter what kind of correspondence matching algorithms are adopted due to some complex factors, such as illumination, occlusion, and so on. To reduce the number of incorrect correspondences, an algorithm with the same object and same position constraints (SOSPC), is proposed to remove wrong correspondences from the given putative correspondences in this paper. The algorithm is based on the fact that in the given image pairs correct correspondences locate at the same position on the same objects. To select the correspondences on the same objects, an object matching method based on the correspondences selected by GMS is proposed. To select the correspondences on the correct positions, an iterative fundamental matrix estimation method based on clustering is presented. The experimental results have validated the effectiveness of the same object and the same position constraints, and the method achieves the state-of-art performance on five datasets.
Keywords
Introduction
Finding feature correspondences is a fundamental task in the field of computer vision, which has a wide range of applications such as structure-from-motion, 1 multi-view stereo, 2 image retrieval, 3 simultaneous localization and mapping, 4 face verification, 5 and et al. To acquire accurate correspondences, a popular two-step strategy is used. In the first step, the initial correspondence set is computed by correspondence matching algorithms (e.g. SIFT, 6 ORB, 7 A-KAZE, 8 etc). In the second step, more robust correspondences are determined out from the initial correspondence set by correspondence selection algorithms. By the second step, the ratio of correct correspondences can be raised, which offers more opportunities for the success of the subsequent operations in most cases.
Most correspondence matching algorithms establish correspondence relationship between two images by extracting keypoints and local image descriptors.6,7,9,10 Each keypoint has a descriptor to describe its texture, illumination and other features, and keypoints with most similar descriptors are matched. However, many correspondence matching algorithms usually have such problems: (1) due to complexity of real scenes, they are difficult to be adaptable to different photograph conditions in images simultaneously, such as changing viewpoints, repeated structures, different illumination, occlusion, similar textures and so on; (2) many matching algorithms only consider local descriptor, and some global structure information is ignored, which results in many correspondences having similar local structure but belonging to different objects.
Because of the above reasons, in the first step it is inevitable that correspondence matching results will suffer from a lot of incorrect matches as shown in Figure 1(a) and (b), no matter what kind of correspondence matching algorithm is applied. Thus in the second step it is necessary to remove the incorrect matches by the correspondence selection algorithm. Correspondence selection algorithms may select robust correspondences based on different principles. For example, some methods (e.g. RANSAC 11 and USAC 12 select robust correspondences by estimating a transformation model from one image to another image, and some other methods (e.g. GMS 13 and VFC 14 select correspondences based on the consistency of local geometric structures. However, these methods still have their own drawbacks: in the real world some transformation relationships between image pairs are complicated, for the methods that estimate the transformation model, it is hard to pick all correct correspondences out only by one kind of transformation model, for example, in Figure 1(c) correct correspondences on some objects are not selected using RANSAC method. For the methods based on local geometric structures, they often select the correspondences that match right regions, but sometimes they may not match the position of correspondences accurately in pixel level. Therefore, correspondence selection algorithm still deserves further study.

In this paper we focus on the second step and propose a method named SOSPC to select robust correspondences from the initial correspondence set. Based on the fact that correct correspondences should be on the same object and at the same position of the same object, we introduce the two constraints into the selection mechanism of correct correspondences in our method. For the same object constraint, the same objects in different images can be matched well by our designed method based on GMS, 13 then correspondences that match objects wrongly will be removed. For the same position constraint, the fundamental matrix that corresponds to each object is calculated by an iterative method, and according to the obtained fundamental matrix these correspondences on wrong positions are removed.
In summary, the contributions of this paper are as follows:
The constraint that correct correspondences are on the same objects is introduced into our algorithm. To select the correspondences on the same objects, an object matching method using the correspondences selected by GMS 13 is proposed.
The constraint that correct correspondences locate at the same position on the same objects are integrated into the correspondence selection algorithm. An iterative fundamental matrix estimation method based on clustering is presented to improve the precision of the extracted fundamental matrix.
Our method has achieved the state-of-art performance comparing with other correspondence selection algorithms on several challenging datasets.
Related works
Correspondence matching algorithms
The most popular strategy to establish the relationships among keypoints in different images are divided into three stages: extracting keypoints from images, computing descriptors of keypoints, and matching the keypoints in image pairs based on the descriptors. The most famous correspondence matching approaches are SIFT 6 and SURF. 9 But, because both of them use Gaussian scale space, some details and noises are smoothed, which affects the localization accuracy and distinctiveness. To solve this problem, KAZE method 15 is proposed, which uses nonlinear diffusion filtering to increase the distinctiveness. For decreasing the computational cost and taking the benefits of nonlinear diffusion filtering, some people design A-KAZE 8 method based on the Fast Explicit Diffusion framework. In some scenes, the computational resources are limited, ORB 7 and BRISK 16 are proposed, which both combine modifications of FAST corner detector 17 and binary descriptors based on BRIEF. 18
Correspondence selection algorithms
In the last decades, many correspondence selection algorithms have been presented. Generally speaking, the correspondence selection methods can be divided into parametric methods and non-parametric methods.
Parametric methods select robust correspondences mainly by estimating global transformation model. ACC, 19 as a kind of parametric method, uses Hessian affine detector, which is invariant to affine transformations, to estimate the local homography matrix as constraints. The most well-known correspondence selection algorithm is RANSAC, 11 which randomly samples some correspondences and estimates a plane homography matrix or a fundamental matrix based on the maximum number of inliers. According to RANSAC, some variants are developed such as MLEESAC, 20 PROSAC, 21 LO-RANSAC, 22 and USAC, 12 which also select robust correspondences by estimating a global homography or essential matrix. But such methods have some drawbacks: (1) the accuracy of estimated global transformation will be affected especially when many incorrect correspondences occur; (2) they often rely on a predefined parametric model, which is not suitable for the non-rigid image transformation.
For non-parametric methods, their principles may be different. A popular design strategy is based on the consistency of local geometric structures or appearance (feature similarity). Considering that the correct feature correspondences on the same object have coherent transformations, CoSeg-HV 23 integrates image co-segmentation into feature matching and applies the Hough voting and its inverted variant are applied to establish correspondences. LPM 24 removes the false matches in the initial correspondence set according to spatial neighborhood structures. The authors formulate the spatial structure constraint into a mathematical model, and derive a closed-form solution with linearithmic time. Based on the solution the true matches can be selected. VFC 14 assumes that noise around correct matches and wrong matches has different distribution. Then it estimates the probability of the correct matches by the maximum likelihood estimation. In Lee et al., 25 propose a robust correspondence selection method based on local neighborhood. In this method the feature of each correspondence consists of a local graph combined by features of neighboring correspondences, and by comparing the similarity of the local graph the algorithm can select robust correspondences. GMS 13 incorporates smoothness constraints into matching, and it proves that the number of correspondences around true matches and false matches has different distributions, which establishes a link between correspondence numbers and match quality. Thus the correct matches can be determined by the correspondence numbers around a correspondence. Because such methods distinguish true or false correspondences mainly by the local structure and impose no position constraint in the pixel level, sometimes the accuracy of such methods may be lower.
Method
Problem definition
Given two images
The same object constraint
The same object constraint means that the correct correspondences should locate on the same objects in image pairs, so different objects should be distinguished and matched in image pairs firstly, then correspondences that correspond to different objects should be removed.
To distinguish different objects in an image, the image is divided into several colorful regions by segmentation algorithm,
26
as shown in Figure 2. The region set that consists of these regions in

(a) The divided regions in
The same object constraint can be defined that: when
Based on the definition, the relationship of the same object in the two images should be established firstly. An intuitive method is to use existing initial correspondences to match the objects in image pairs. But, in fact, matching objects correctly may be difficult by the initial correspondences, because many initial correspondences are not reliable enough. Using initial correspondences to match regions will cause many incorrect object matching, which will result in the reduction of the obtained correct correspondences.
In this section, we propose an object matching method that utilizes the correspondences selected by GMS method.
13
GMS uses supported correspondences of a correspondence to determine whether the correspondence is robust. Let

The distributions of correspondence number. 13
Although by GMS algorithm some robust correspondences can be obtained, due to its limitation, many selected correspondences usually correspond to the same regions between images rather than the accurate position in pixel level. Fortunately this characteristics is suitable for object matching.
Another problem that needs to be solved is that due to the limitation of image segmentation algorithm, a region with a certain color may not be a real object, which causes many correct correspondences to be lost. For example, in Figure 2, due to the result of image segmentation, the real object that consists of
For a region
For correspondence
The true correspondence
The same position on the same object constraint
To describe the position constraint in pixel level, the fundamental matrix tool is used in our algorithm. From Faugeras,
27
for a correspondence
where
Many accurate algorithms, for example, RANSAC and its variants, use a fundamental matrix to describe the position constraint in pixel level. However, if the relative positions of different objects are changed in image pairs, such methods only select the correspondences of one object. Because when the relative position of different objects are changed, according to Theorem 1 it can be known that the these objects correspond to different fundamental matrix. Under the guidance of Theorem 1 and the goal to describe the position constraint in pixel level, the designed algorithm needs to obtain the fundamental matrix of each object.

The diagram of relative position of objects in image pairs: (a) the diagram when the relative position of objects are unchanged in image pairs, (b) the diagram when the relative position of objects are changed in image pairs, (c) assuming that the coordinate system of
Referring to the coordinate system of
where
Referring to the coordinate of
When the relative position between
The condition is equivalent to that
From equations (10) and (11), it can be seen that
(b) When the relative position between
The condition is equivalent to that
From equations (12) and (13), it can be seen that
Epipolar distance
As shown in Figure 5, according to the properties of fundamental matrix, it can be known that the epipolar lines in
where

The illustration of epipolar distance.
In equations (16) and (17),
An iterative fundamental matrix estimation method based on clustering (IFMEM)
To find the fundamental matrix of each object, the correspondence in
However, these seed fundamental matrix cannot be directly used to select correspondences at the same position on the same object, because under the limitation of accuracy of image segmentation and object matching, some correct correspondences may be removed while some wrong correspondences still remain. This disadvantage makes not all seed fundamental matrix so reliable, which causes some incorrect correspondences be selected and many correct correspondences be ignored.
To improve the precision and increase the number of selected correct correspondences, IFMEM is proposed. If the epipolar distance of the correspondence
After
While
The workflow of the whole algorithm is shown as Figure 6, and it can be also written as Algorithm 1. The number of initial feature matches is denoted as

The workflow of SOSPC with the same object and same position constraints.
Experiments
In this section, the performance evaluation and analysis of the proposed method are reported. The open source library OPENCV is employed to detect the initial correspondences by ORB algorithm.
Datasets
Five datasets are employed in our experiments:
Performacne evaluation
In the following, we refer to paper13,24 to evaluate our algorithms on five datasets by the precision, recall, and F-measure. They are defined as follows:
where
Verification experiments
In this subsection, the first experiment is used to prove Theorem 1. From Theorem 1, it can be known that each object corresponds to a fundamental matrix. Figure 7(a) is the initial correspondences obtained by ORB. The correct correspondences on

The experiments to prove that each object corresponds to a fundamental matrix: (a) the initial correspondences, (b) the correspondences selected by
To justify the two proposed constrains’ effects, we have also conducted two comparison experiments. Our contrast tests are both tested on the subset of the five datasets. The first comparison experiment is used to prove the effect of the same object constraint, and the results are shown in Table 1. From this table, it can be seen that the precision of GMS is higher than the initial correspondences, which means that the GMS methods can select more robust correspondences from the initial correspondences. Then the initial correspondences are used to match the same objects, and based on the object matching result the precision and F-measure of the selected correspondences are improved comparing with the initial correspondences, which illustrates that the same object constraint has a positive effect on the correspondence selection. The robust correspondences selected by GMS are also used to match the same objects in image pairs, and by this way it acquires the best precision and F-measure. This experiment shows that using robust correspondences for object matching can improve the performance of the same object constraint.
The experiment about the same object constraint.
The second comparison experiment is used to prove the effect of the same position on the same objects constraint, the related result is shown in Table 2. At first the correspondences that selected by the same object constraint using GMS is obtained, then the same position constraint is applied. The same position constraint is tested by two methods: one method is to utilize the seed fundamental matrix
The experiment about the same position constraint.
Contrast experiments of different methods
Our method is also compared with some competitive correspondence selection methods such as RANSAC, GMS, VFC on the five datasets. Table 3 shows the performance of these feature correspondences selection methods. Because different thresholds of an algorithm have different influences on precision and recall, the statistical results of precision and recall in Table 3 are obtained according to the thresholds corresponding to the highest F-measure for each algorithm. Due to similar texture in these datasets, there are a lot of incorrect initial correspondences, for example, the correspondence result shown in Figure 8(a). From Table 3, it can be seen that our algorithm achieves the best precision and F-measure in the first four datasets, which proves that our algorithm has better performance in these datasets. In the
Evaluation results of the five correspondence selection algorithm on the five datasets.

(a) The initial matches obtained by ORB, (b) the matches selected by RANSAC, (c) the matches selected by GMS, (d) the matches selected by VFC, and (e) the matches selected by our method SOSPC.

(a)The matches selected by RANSAC, (b) the matches selected by GMS, (c) the matches selected by VFC, and (d) the matches selected by our method SOSPC.
Evaluation results of GMS and ours on TUM-RGB dataset.
Furthermore, we analyze the performance of different correspondence selection algorithms, and explain the result based on a pair of images taken out from dataset
Evaluation results on an image pair.
The Precision-Recall (PR) curves are also drawn in Figure 10(a)–(e). These algorithms adopts different thresholds on

The performance of the evaluated algorithms on five datasets: (a) Graham, (b) Gerrard, (c) Person, (d) South, and (e) MultiObjects.
Conclusion
In this paper, we have proposed the correspondence selection method SOSPC to select the correct correspondences from the initial correspondences. In our approach, the same object constraint and the same position on the same object constraint are integrated into our correspondence selection algorithm. To match the same objects in the image pairs, we propose an object matching method by the correspondences selected by GMS. To improve the precision of the fundamental matrix, we propose the IFMEM method to calculate the fundamental matrix iteratively. Finally we have conducted several experiments to verify the effectiveness of the proposed constraints, and prove that our approach can achieve the best performance comparing with other correspondence selection algorithms in the given datasets.
Footnotes
Acknowledgements
We would like to thank the editors and the anonymous reviewers for their valuable comments and suggestions, which led to a substantial improvement of this paper.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the National Key Research and This work is supported by the National Natural Science Foundation of China under Grants 61973302, National Key Research and Development Program of China under Grant 2018YFB1306303, and in part by the National Natural Science Foundation of China under Grant 61773374 and 61702323, and in part by the Major Basic Research Projects of Natural Science Foundation of Shandong Province under Grant ZR2019ZD07.
