Abstract
To address the problems of local stereo matching methods, such as the difficulty of selecting a matching window, fuzzy disparity edges and low accuracy in weak texture regions, this paper proposes an efficient stereo matching algorithm that improves matching accuracy in these regions. First, we segment the stereo images and calculate an adaptive support window according to the area of each segmented region. Second, the matching cost is computed by fusing color and gradient features, from which the initial disparity is obtained. Finally, the final disparity is obtained through a series of post-processing steps, including consistency checking, mismatch correction and disparity refinement. Test results on the Middlebury Stereo Datasets show that the proposed algorithm achieves high matching precision and handles weak texture and slope surface regions especially well.
Introduction
Stereo matching is one of the most active research areas in computer vision, and it is widely used in three-dimensional surface modeling, 3D object recognition, vision navigation, aviation and space exploration, industrial monitoring, image-based rendering and so on. 1 The stereo matching problem is formulated as computing a disparity map from a given pair of stereo images, and the quality of three-dimensional reconstruction depends directly on the matching results. Researchers worldwide have studied stereo matching extensively in recent years. Scharstein and Szeliski 2 summarized and evaluated some representative stereo matching algorithms and divided them into local and global algorithms. In global stereo matching algorithms, disparity is estimated by minimizing a global energy function composed of a data term and a smoothness term. Graph cut algorithms, 3 belief propagation4,5 and dynamic programming 6 are global algorithms that perform well. Global algorithms handle occluded and weak texture regions well; however, complex parameter setting and the tedious optimization of the global energy function are time-consuming, which makes them unsuitable for real-time platforms. In contrast, local methods can recover disparity in complex texture regions quickly and efficiently, but it is difficult to select the size and shape of the support window adaptively; in particular, mismatches may appear in weak texture regions and at disparity discontinuities, which lowers the matching accuracy of the final result.
In local methods, the emphasis is on selecting the matching support window. Window-based matching algorithms often enlarge the support window to improve discrimination in weak texture regions. However, this induces over-smoothed disparity and the “foreground hypertrophy” effect in discontinuous areas. To address this problem, many representative local matching algorithms have emerged, which consider not only the gray level of the image but also its color. Zhang et al. 7 divide these local methods into two categories. The first category selects the best window from a set of predefined windows as the support window,8,9 or chooses the size and shape of the support window adaptively, point by point.10–12 Changing the window size adaptively when computing the matching cost improves matching accuracy. Such algorithms improve disparity precision in regions that are difficult to match and suppress foreground hypertrophy, but because the window shape is fixed and inflexible, the matching error rate remains high and disparity edges are not distinct. The second category weights the pixels inside a support window of fixed size and shape. Yoon and Kweon 13 proposed a locally adaptive support-weight approach with high matching precision, in which the support weights adapt according to the color and geometric relationship between each pixel in the window and the pixel to be matched; however, its computational complexity and operation cost are high, so it loses the efficiency expected of a local algorithm. Tombari et al. 14 improved the weighting function using color segmentation, which improved matching precision but further increased the operation cost.
Gu et al. 15 obtain a high-precision disparity map by combining adaptive weights, a non-parametric approach and disparity refinement, but the combination incurs a higher operation cost. Adaptive weighting methods that yield high-precision disparity maps are proposed in the literature,16,17 in which each pixel in the matching window is assigned a weight adaptively based on local differences, but these methods increase the matching window size and operation cost. In Yu and Rong, 18 the pixels in the matching window are weighted in multiple modes, but the weight calculation is rather complex. The color-based segmentation algorithm proposed in Ni et al. 19 performs very well in weak texture regions, but it depends on the stability of the segmentation algorithm and the accuracy of the initial disparity, so it is not suitable for very large weak texture regions.
To address these problems of local methods, namely the difficulty of selecting a support window and the trade-off between precision and speed, this paper proposes a new stereo matching algorithm that uses color segmentation in the CIELAB color space to build adaptive windows and fuses multiple features, combining the advantages of color segmentation and local windows. Experimental results show that the proposed algorithm achieves high matching precision in regions with disparity discontinuities and weak texture, and performs especially well in large weak texture regions, on slope planes and at disparity edges.
Basic assumption and process
As concluded above, there are some problems in local stereo matching. 20 First, it is difficult to select the matching support window: a small window induces mismatches in weak texture regions, whereas a large window lowers matching efficiency. Second, a non-weighted window improves efficiency and is robust to noise, but it causes over-smoothing at the edges of large windows and is sensitive to occluded regions, so it easily gives rise to mismatches.
We therefore assume that the disparity of pixels in the same segmented region is approximately constant or varies smoothly. On this premise, we propose a local stereo matching algorithm using an adaptive support window and multi-feature fusion based on segmentation, ASWMFF for short. First, the stereo images are segmented by the Mean-shift method in the CIELAB color space, the area of each segmented region is counted, and the size of each matching window is calculated from that area. The key point is that only the pixels inside the window that belong to the same segmented region as the pixel to be matched are used to compute the matching cost. The matching cost is calculated by a linearly weighted fusion of multiple features, namely color and gradient. The initial disparity maps of the left and right images are then obtained by using the sum of absolute differences (SAD) as the similarity measure. Finally, the refined disparity map is obtained by optimizing the initial disparity map through a series of steps, including mismatch checking, classification and disparity refinement: reliable and unreliable disparities are identified by a horizontal disparity consistency check and the winner-take-all (WTA) method, the unreliable disparities are classified as mismatch or occlusion, and each unreliable disparity is optimized using the reliable disparities in its neighborhood.
Adaptive window selection based on color segmentation
Color image segmentation based on CIELAB space
Segmentation in the CIELAB color space agrees better with the human visual system, because CIELAB is a perceptually uniform color space: luminance and color are independent of each other, and equal Euclidean distances between colors correspond to approximately equal perceived color differences regardless of direction and location in the space. The proposed method first converts the RGB image into the CIELAB color space, then segments the converted image by Mean-shift, and finally calculates the size of the matching window according to the segmentation. The segmentation result is shown in Figure 1.
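The color conversion step can be illustrated with a minimal sketch. It assumes 8-bit sRGB input and the D65 reference white, neither of which is stated in the paper, and the function name `srgb_to_lab` is hypothetical. The Mean-shift segmentation itself is not reproduced here.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (assumed D65 white point)."""
    def linearize(c):
        # Undo the sRGB transfer curve to get linear light in [0, 1].
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white (assumption).
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Piecewise cube-root function from the CIELAB definition.
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116.0 * fy - 16.0
    a = 500.0 * (fx - fy)
    b_out = 200.0 * (fy - fz)
    return lightness, a, b_out


white = srgb_to_lab(255, 255, 255)  # lightness near 100, a and b near 0
red = srgb_to_lab(255, 0, 0)        # positive a and b
```

Because lightness is separated from the two chroma channels, a segmentation that clusters on the converted values is less sensitive to pure brightness changes than one run directly on RGB.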
Left image and segmentation result. (a) Left image and (b) result of segmentation.
Adaptive selection of matching window
Using the color information, we can segment the image into several regions. If a segmented region is too large, the disparity inside it may change greatly where there is a slope plane or curved surface, even though only pixels of similar color are grouped into the same region. Over-segmentation is often used to avoid this situation, but it places high demands on parameter setting. Under the premise that the disparity of pixels in the same segmented region is approximately constant or smooth, this paper proposes an adaptive window selection method based on segmentation that overcomes the limitation of overly large segments: even when a segment is very large, the ratio calculation method still yields a proper matching window.
The first step in calculating the adaptive window is to obtain the area of every segmented region, i.e. the total number of pixels in each region, and then determine the size of the initial rectangular window centered at the pixel to be matched using equation (1). Next, we select the pixels in the initial rectangular window that belong to the same segmented region as the pixel to be matched to form the adaptive matching support window; the matching window may therefore be a region of any shape and size rather than a fixed region. The detailed process is as follows.
Adaptive initial rectangular window selection.
Selection of the adaptive support matching window.
Adaptive support matching window sketch map. (a) Initial rectangular windows and (b) adaptive support matching window.
i is segment label for image to be matched,
The initial matching window is selected according to the segmented region without considering the segmentation boundaries. The pixels inside the window may therefore span more than one segmented region and may lie in discontinuous or occluded regions. If all of them take part in calculating the matching cost, the accuracy of the match is degraded, which leads to fuzzy, over-smoothed disparity boundaries and mismatches. To obtain a disparity map with clear boundaries, many researchers describe the relationship between the pixels inside the rectangular window and the pixel to be matched using weights; although this achieves high matching accuracy, the weight calculation is complex and inefficient. To improve efficiency while preserving accuracy, we propose a simple adaptive matching support window selection method that re-selects pixels from the initial rectangular window of the pixel to be matched. We keep only the pixels that belong to the same segmented region as the pixel to be matched as the actual matching region. This forms a new adaptive, deformable matching region whose shape changes with the pixel to be matched and the segmentation result, and thus provides a reliable support window for the subsequent matching cost calculation. Five examples of adaptive variable matching windows are compared in Figure 2. The white rectangular region in Figure 2(a) is the initial rectangular window, and the blue region in Figure 2(b) is the final adaptive matching support window.
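The two steps described above, counting segment areas and keeping only same-segment pixels inside the initial rectangle, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the window half-size `half` is passed in directly because the sizing rule of equation (1) is not reproduced here.

```python
from collections import Counter


def segment_areas(labels):
    """Count the pixels of every segment label in a 2D label map."""
    return Counter(label for row in labels for label in row)


def adaptive_support(labels, y, x, half):
    """Return the (row, col) pixels of the rectangular window centred at
    (y, x) that share the centre pixel's segment label.

    The window spans 2 * half + 1 pixels per side and is clipped at the
    image border. `half` is a hypothetical input standing in for the size
    that the paper derives from the segment area.
    """
    height, width = len(labels), len(labels[0])
    centre_label = labels[y][x]
    return [
        (r, c)
        for r in range(max(0, y - half), min(height, y + half + 1))
        for c in range(max(0, x - half), min(width, x + half + 1))
        if labels[r][c] == centre_label
    ]


# Toy label map with three segments (0, 1 and 2).
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 1],
]
areas = segment_areas(labels)
support = adaptive_support(labels, 1, 1, 1)
```

In this toy case the 3 x 3 rectangle around pixel (1, 1) covers all three segments, but only the five pixels labeled 0 survive, so the support region takes an irregular shape that follows the segment boundary.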

Matching cost computation of multi-feature fusion and disparity optimization
Matching cost of multi-feature fusion
Matching cost based on segmentation of CIELAB color space
CIELAB is a perceptually uniform color space in which luminance and chrominance information are independent, which agrees with the visual characteristics of the human eye; therefore, the luminance and chroma information of the three LAB channels is used to calculate the matching cost in this paper.
Selecting an appropriate similarity measure is very important for the disparity accuracy of stereo matching, and SAD is a simple and efficient similarity measure commonly used in local matching. This paper also calculates the stereo matching cost using SAD. The chroma segmentation matching cost is calculated by equation (3). IL and IR denote the reference image and the target image. Suppose p is an arbitrary pixel in IL and q is the pixel in IR, i.e.
Combined matching cost with introducing gradient magnitude
The combined matching cost based on chrominance segmentation rests on the premise that the disparity of pixels in the same segmented region is approximately constant or smooth. In practice, a sudden change in color often coincides with an edge of the depth map, i.e. a region of discontinuous disparity, so chrominance segmentation can improve matching accuracy in weak texture regions and at disparity discontinuities. However, relying on chrominance segmentation alone forces identical or smooth disparity within each segmented region, which causes blurred edges and other mismatches, e.g. when adjacent regions have similar color but different depth.
The proposed method, which uses a scaled-down adaptive matching window, avoids mismatching and depth edge blur as far as possible, but mismatching cannot be avoided completely. Image gradient features can describe edges and slope surface regions more accurately. 21 Therefore, to further improve the matching accuracy in disparity discontinuous regions, we introduce image gradient information into the matching objective function and combine it with the three Lab channel components to form the weighted-combination matching cost shown in equation (8), thus further improving the matching accuracy at edges and in weak texture regions.
The gradient magnitude characteristic of any point (x, y) in the image is calculated as follows
p and q are assumed to be corresponding matching points and the disparity between them is d. The match cost can be calculated as follows
We calculate the matching cost by using the linear weighted method of the two features, and in this paper, an improved C model combined with
The initial disparity maps of the reference left image and target right image can be obtained by the method described above.
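A linearly weighted fusion of a color SAD term and a gradient SAD term, followed by winner-take-all selection, might look like the sketch below. It is an illustration under stated assumptions rather than the paper's equations: the weight `alpha`, the averaging over valid support pixels, the precomputed gradient-magnitude maps and all function names are hypothetical, and a left pixel at column c is assumed to match the right pixel at column c - d.

```python
import math


def fused_cost(left_lab, right_lab, left_grad, right_grad, support, d, alpha=0.5):
    """Weighted sum of color SAD and gradient SAD over the support pixels.

    left_lab / right_lab: 2D lists of (L, a, b) tuples.
    left_grad / right_grad: 2D lists of gradient magnitudes.
    support: (row, col) pixels of the adaptive window in the left image.
    alpha: hypothetical fusion weight in [0, 1].
    Costs are averaged over the pixels whose match stays inside the image
    (an assumption made here to keep border windows comparable).
    """
    width = len(right_lab[0])
    color_sad, grad_sad, valid = 0.0, 0.0, 0
    for r, c in support:
        cr = c - d  # column of the candidate match in the right image
        if 0 <= cr < width:
            color_sad += sum(
                abs(u - v) for u, v in zip(left_lab[r][c], right_lab[r][cr])
            )
            grad_sad += abs(left_grad[r][c] - right_grad[r][cr])
            valid += 1
    if valid == 0:
        return math.inf
    return (alpha * color_sad + (1.0 - alpha) * grad_sad) / valid


def wta_disparity(left_lab, right_lab, left_grad, right_grad, support, max_d,
                  alpha=0.5):
    """Winner-take-all: return the disparity in [0, max_d] with lowest cost."""
    costs = [
        fused_cost(left_lab, right_lab, left_grad, right_grad, support, d, alpha)
        for d in range(max_d + 1)
    ]
    return min(range(max_d + 1), key=costs.__getitem__)


# One-row toy pair: the left row is the right row shifted by two pixels.
right_lab = [[(10, 0, 0), (20, 0, 0), (30, 0, 0), (40, 0, 0), (50, 0, 0), (60, 0, 0)]]
left_lab = [[(0, 0, 0), (0, 0, 0), (10, 0, 0), (20, 0, 0), (30, 0, 0), (40, 0, 0)]]
flat = [[0.0] * 6]
support = [(0, 2), (0, 3), (0, 4)]
best = wta_disparity(left_lab, right_lab, flat, flat, support, max_d=3)
```

Passing the support list explicitly keeps the cost function independent of how the window was chosen, so the same code serves a fixed rectangle or a segment-shaped region.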
Disparity optimization
A dense, high-accuracy disparity map is obtained after a series of optimizations of the initial disparity maps. First, we check the initial left and right disparity maps using a disparity consistency check and the winner-take-all method and classify the disparities into reliable and unreliable sets. We then classify the unreliable disparities 20 into three kinds: mismatch, occlusion, and both mismatch and occlusion. Each kind of unreliable disparity is optimized according to the method in Haizhong et al. 20
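A common form of the left-right consistency check is sketched below as an illustration. The tolerance `tol` and the function name are assumptions, and the subsequent classification and filling of unreliable pixels, which follows the cited method, is not reproduced.

```python
def consistency_check(disp_left, disp_right, tol=1):
    """Flag each left-image pixel as reliable (True) or unreliable (False).

    A pixel at column c with disparity d is reliable when its match at
    column c - d in the right disparity map carries a disparity within
    `tol` of d. `tol` is a hypothetical tolerance; matches that fall
    outside the image are treated as unreliable.
    """
    height, width = len(disp_left), len(disp_left[0])
    reliable = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            d = disp_left[r][c]
            cr = c - d
            if 0 <= cr < width and abs(d - disp_right[r][cr]) <= tol:
                reliable[r][c] = True
    return reliable


disp_left = [[0, 1, 1, 3]]
disp_right = [[1, 1, 1, 1]]
strict = consistency_check(disp_left, disp_right, tol=0)
loose = consistency_check(disp_left, disp_right, tol=1)
```

With `tol=0` the first and last pixels fail the check, while `tol=1` accepts the first pixel as well; the last pixel disagrees by two levels and stays unreliable in both cases.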
Experiments and results
The proposed algorithm is implemented in MATLAB. To verify its effectiveness, we test and evaluate the ASWMFF algorithm on the Middlebury dataset, which is released on the Middlebury website (http://vision.middlebury.edu/stereo/) and is widely recognized by the academic community. During the experiment, we set
Weighted data fusion is a simple and effective method, and the weighting coefficient determines the effect of the fusion. Naturally, the matching result is affected by different values of
The disparity maps obtained by the proposed algorithm are shown in Figure 3. As can be seen from the figure, the proposed algorithm is simple and effective, and it obtains precise, dense disparity in most of the difficult cases of stereo matching, such as weak texture, occlusion and discontinuous regions, even in large weak texture regions and on slope surfaces.
Experimental results on Middlebury dataset. Left-to-right: Tsukuba, Venus, Teddy, Cones, respectively. Top-to-bottom: color images, ground truth, results of our method, and error maps with threshold equal to 1.0.
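The error maps use a threshold of 1.0 disparity level. A bad-pixel percentage with that threshold can be sketched as follows; the function name and the use of `None` to mark pixels without ground truth are assumptions made for this illustration.

```python
def bad_pixel_rate(disp, truth, threshold=1.0):
    """Percentage of pixels whose absolute disparity error exceeds threshold.

    Pixels whose ground truth is None are skipped (an assumed convention
    for unknown or masked-out regions).
    """
    bad = total = 0
    for row_d, row_t in zip(disp, truth):
        for d, t in zip(row_d, row_t):
            if t is None:
                continue
            total += 1
            if abs(d - t) > threshold:
                bad += 1
    return 100.0 * bad / total if total else 0.0


disp = [[5, 7, 9, 3]]
truth = [[5, 5, None, 4]]
rate = bad_pixel_rate(disp, truth)  # one bad pixel out of three evaluated
```

An error of exactly one level is not counted as bad, because the comparison is strict.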
The disparity maps of the Teddy image obtained by four different methods are shown in Figure 4. The algorithm proposed in the literature 2 is a classic graph cut matching algorithm. Comparing Figure 4(a) and (d), we can see that our method overcomes the fuzzy mismatches at disparity edges, significantly improves matching accuracy and obtains dense disparity in the blue oval regions, at cabinet edges and other disparity discontinuities, and on the side of the cabinet and other large slope weak texture regions. The algorithm proposed in literature 14 is one of the best local algorithms, and the method proposed in the literature 22 is one of the latest outstanding local algorithms. As seen in Figure 4(b), (c) and (d), our algorithm has a certain advantage in the blue oval regions, at the cabinets and in other disparity discontinuity regions.

Quantitative evaluation of the proposed method compared with other methods on Middlebury.
In Table 1, CoopRegion, SegmentSupport, Cottager + occ, AdaptDispCalib, AdapWeight, SegAggr, AdaptLocalSeg and others are recent outstanding local matching algorithms, and AdaptingBP and OverSegmBP are among the most outstanding global matching algorithms. According to the quantitative mismatch data returned by the website (Table 1), the proposed method is slightly worse than the top-performing algorithms such as CoopRegion and AdaptingBP, but much better than GC + occ, TreeDP, SemiGlob and other global algorithms. Among local algorithms such as AdapWeight and SSD + MF, its matching accuracy is superior to most, and it is on par with current mainstream adaptive window algorithms such as SegmentSupport and AdaptLocalSeg. In addition, the simple and effective adaptive window selection method proposed in this paper can improve computational efficiency and applicability.
Conclusions
We proposed a stereo matching algorithm based on adaptive window selection using color segmentation and multi-feature fusion. First, the stereo images are segmented by the Mean-shift method in the CIELAB color space, under the assumption that pixel disparity is constant or smooth within each segmented region. The initial matching window is determined from the area of each segmented region, and the final matching region is obtained by keeping the pixels in that window that belong to the same segmented region as the pixel to be matched. The matching cost is then calculated by the multi-feature fusion method, which overcomes fuzzy disparity boundaries and enhances matching accuracy in weak texture regions. Finally, we obtain a high-precision dense disparity map by reliability classification and disparity optimization. The experimental results show that our algorithm is simple and effective. The feedback data from the Middlebury website show that the proposed algorithm outperforms most of the current outstanding matching algorithms, which illustrates its effectiveness in weak texture and occluded regions as well as in disparity discontinuous regions. However, among the three quantitative evaluation indexes, the mismatch ratio for disc is higher than the other two. Future work will focus on further improving the matching accuracy in disparity discontinuous regions, studying dynamic weight setting, and obtaining better dense disparity.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research is supported by the grant from the Principal Research Foundation of Xi’an Technological University (grant no. XAGDXJJ14024).
