Abstract
In this paper, we propose a new method to compensate for lens distortions in image stitching. Lens distortions that arise from the nonlinearity of a lens are the main cause for mismatches in stitching images. We estimate the distortion factors for each image using the Division Model and linearize the projected relationships between matching distorted feature points. Because our method works at the RANSAC stage, the estimated distortion factors are further refined during the bundle adjustment phase and thus accurate distortion factors are obtained. Applications based on estimated lens distortion factors show that our method is more efficient and that the stitched results are more accurate than other previous methods.
1. Introduction
Recently, panorama generation from partially overlapped images has become popular in many applications, such as cameras and surveillance systems. The general process for panorama generation [1–4] uses feature matching, which has the advantage of fast computation. This method automatically generates a stitched result from partially overlapped images through a series of steps, as in Fig. 1. However, if warping effects due to lens distortion are overlooked, the stitched result might contain misalignments, ghost effects, twisted scenes and other such defects. When the lens distortions of images are small, the general process [3, 4] for image stitching controls for them using the focal length of images with bundle adjustment [5]. However, with larger distortions it is hard to generate accurate stitched results simply by controlling for focal length. With the great increase in multi-camera applications nowadays, robust stitching of images captured by different cameras is needed in various applications such as robot surveillance.

Fig. 1. Image stitching process based on the feature matching method
Correction of distortion effects has been studied in the context of camera calibration [6]. Tsai [7] recovered distortion factors after assuming that the 3D points of the corresponding points are known. Weng et al. [8] found the 3D points from calibrated objects and Zhang [9] used oriented planar patterns for calibration.
More recently, polynomial rather than calibration models have been proposed. To find the lens distortion factors, Fitzgibbon [10] proposed a method that formulates and solves geometric computer vision problems with radial distortion. The distortion factors of the homography are estimated by solving an eigenvalue problem. However, there is an ambiguity in his method regarding which eigenvalue should be applied. Josephson and Byröd [11] solved the polynomial equations including the distortion factor with the Gröbner basis method. Byröd et al. [12] have applied this method to image stitching applications. Using the Division Model [10], they calculated the homography, including the distortion factor and focal length, using only three matching points [13]. When applied in image stitching applications, their method assumes that all images have the same distortion factor and focal length. Therefore, if each input image has a different distortion factor and focal length due to its capturing device or environmental factors, it is difficult to achieve accurate results using their method.
To deal with different distortion factors among images, Bukhari and Dailey [14] estimated the radial distortion of a single image based on the plumb-line approach. Ju and Kang [15] proposed a simple method to estimate distortion factors for each image using the ratio of lengths between matched lines. They concentrated on the properties of line variations with respect to lens distortion. However, although they are able to estimate different distortion factors for each image, their method obtains convincing results only if sufficient matched parallel lines exist in the matched images.
In this paper, we propose a new image stitching method that addresses lens distortions in input images. We estimate each lens distortion factor and homography at successive iterations of the RANSAC [16] phase. In contrast to previous methods, we perform two steps. In the first step, the lens distortion factors for a matched pair are estimated by linearizing the projection between matching distorted point pairs. In the second step, the homography, which represents the relationship between the image pair, is estimated based on the lens distortion factors obtained in the first step. Because the estimated distortion factors are already sufficiently precise, they can be refined further in the bundle adjustment [5] phase.
The rest of this paper is organized as follows. In Section 2 we describe our proposed lens distortion estimation method. In Section 3 we show that our method can efficiently estimate accurate distortion factors and constructs stitched images better than previous methods.
2. Estimation of Lens Distortion
2.1 Lens Distortion Model
To explain image variations caused by lens distortion, the Polynomial Model [17] and the Division Model [10] are generally used. The Polynomial Model is defined as
By contrast, the Division Model is defined by a somewhat different equation compared to (1).
However, this model can control a large distortion with a lower order than the Polynomial model. In this paper, we used this model with

Fig. 2. Controlling lens distortion with the Division Model
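For reference, the two models are commonly written in the literature in the forms below. The notation is assumed for illustration ($\mathbf{x}_d$ and $\mathbf{x}_u$ are the distorted and undistorted points measured from the distortion centre, $r_d = \lVert \mathbf{x}_d \rVert$, and $\lambda_k$ are the distortion coefficients) and may differ from the symbols and numbering used in this paper.

```latex
% Polynomial Model (standard form, assumed notation)
\mathbf{x}_u = \mathbf{x}_d \left( 1 + \lambda_1 r_d^{2} + \lambda_2 r_d^{4} + \cdots \right)

% Division Model (standard form, assumed notation)
\mathbf{x}_u = \frac{\mathbf{x}_d}{1 + \lambda_1 r_d^{2} + \lambda_2 r_d^{4} + \cdots}

% Single-parameter Division Model, with one distortion factor \lambda per image
\mathbf{x}_u = \frac{\mathbf{x}_d}{1 + \lambda r_d^{2}}
```

With a single factor, $\lambda = 0$ leaves the point unchanged, and $\lambda < 0$ moves points outward when undistorting, which is the correction typically needed for barrel distortion.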
2.2 Point Projection with Lens Distortion
When the ith image and jth image are matched, the relationship between matching distorted feature points in two images can be represented as
where
To begin estimating lens distortions, we assume that only one image is distorted and the other is not. When the ith image is distorted and the jth image is not in Equation (4), the relationship is represented as
In general, it is difficult to calculate the homography
where
When there are N matching feature points between the images, Equation (7) can be solved using the least squares method, which is expressed as
However, because of matching errors, the fourth column of
The homography
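One common way to obtain a linear relationship of this kind under the single-parameter Division Model is to lift the distorted point to four coordinates. The sketch below is a hedged illustration only: the column notation $\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3$ for the homography $\mathbf{H}$, the lifted matrix $\tilde{\mathbf{H}}$, and the least-squares recovery of $\lambda$ in the last line are assumptions and are not claimed to reproduce Equations (5)–(9) exactly.

```latex
% Image i is distorted with factor \lambda, image j is not.
% Dividing (x_i, y_i) by (1 + \lambda r_i^2) only rescales the homogeneous point:
\mathbf{x}_j \sim \mathbf{H}
\begin{pmatrix} x_i \\ y_i \\ 1 + \lambda r_i^{2} \end{pmatrix}
=
\underbrace{\begin{pmatrix} \mathbf{h}_1 & \mathbf{h}_2 & \mathbf{h}_3 & \lambda \mathbf{h}_3 \end{pmatrix}}_{\tilde{\mathbf{H}} \in \mathbb{R}^{3 \times 4}}
\begin{pmatrix} x_i \\ y_i \\ 1 \\ r_i^{2} \end{pmatrix},
\qquad r_i^{2} = x_i^{2} + y_i^{2}

% Each match gives equations that are linear in the entries of \tilde{H}:
\mathbf{x}_j \times \left( \tilde{\mathbf{H}} \, (x_i, y_i, 1, r_i^{2})^{T} \right) = \mathbf{0}

% With noisy matches the estimated fourth column \mathbf{h}_4 is only
% approximately \lambda \mathbf{h}_3, so a least-squares fit gives:
\lambda \approx \frac{\mathbf{h}_3^{T} \mathbf{h}_4}{\mathbf{h}_3^{T} \mathbf{h}_3}
```

The point of the lifting is that the unknown $\lambda$ no longer multiplies unknown homography entries inside the measurement equations, so a standard linear least-squares solver can be used.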
2.3 Estimation of Distortion Factors for Image pairs
When one image is distorted and the other is not, the distortion factor can be calculated using Equation (9). For general image stitching procedures, it can be assumed that input images are added to the stitching system one by one [4]. Therefore, except at the first step, only one distortion factor is estimated at each step, because the other factor was already estimated in the previous step. To estimate the distortion factors of both images at the first step, we consider the relationship of the estimated distortion factors between the two images.
Our method is based on general stitching procedures, in which the orientation of the camera relative to the horizon rarely changes [4]. Thus, input images overlap according to translations along the X direction. Let us consider the overlapped undistorted feature point P between matched images A and B, as shown in Fig. 3. If the distortion factor of image A, λA, is less than 0,

Fig. 3. Point distortion in an overlapped image pair
We estimate two distortion factors, λA and λB, in three steps. In the first step, we estimate λA by setting λB = 0. In the second step, we estimate λB by setting λA = 0. In the third step, we adjust each of the distortion factors using the estimated result for the other factor obtained in the previous steps. To adjust each estimated distortion factor, we use the scale variation of the absolute value of the estimated distortion factor according to the second distortion factor.
When the true λB is larger than 0,
where
The scale variation for the absolute value of the estimated distortion factor λB changes in the same way as the estimation procedure for λA.
We adjust the scale of the absolute value for each distortion value as
where sign(λA) is the sign of λA and γ is the scale factor to control the effect of λB. In our experiments, γ = 1.0.
3. Experiment Results
To implement the proposed stitching method with distorted images, we built on the OpenCV 2.3.1 library. Our implementation followed a common method of stitching images [4], as shown in Fig. 1. The only differences in our method are in the RANSAC, bundle adjustment and blending phases, and they are caused by the distortion correction. In the RANSAC phase, we randomly chose four matched points extracted with SURF [19]. Using the matched points, we estimated the distortion factors, homographies and focal lengths for each image using Equations (9) and (11). In the bundle adjustment phase, we adjusted five parameters (the x, y and z angles of Rodrigues' rotation formula [18, 20], the focal length and the distortion factor) for each image. In the blending phase, we implemented inverse mapping in place of distortion mapping.
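The five per-image parameters adjusted in the bundle adjustment phase can be pictured with the minimal sketch below. It assumes that the three angles form an axis-angle rotation vector and converts it to a rotation matrix with Rodrigues' formula; the `ImageParams` container and the function name `rodrigues` are hypothetical and are not taken from the authors' code or from OpenCV.

```python
import math
from dataclasses import dataclass


@dataclass
class ImageParams:
    """Hypothetical container for the five parameters refined per image."""
    rx: float     # rotation vector, x component (radians)
    ry: float     # rotation vector, y component (radians)
    rz: float     # rotation vector, z component (radians)
    focal: float  # focal length in pixels
    lam: float    # Division Model distortion factor


def rodrigues(rx, ry, rz):
    """Return the 3x3 rotation matrix of an axis-angle vector (Rodrigues' formula)."""
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        # No rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta  # unit rotation axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]
```

A rotation vector keeps the rotation at three free parameters, so together with the focal length and the distortion factor each image contributes exactly five unknowns to the optimization.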
To evaluate the performance of our proposed method, we compared it with previous methods in two ways. First, we evaluated the errors of the estimated distortion factors according to the variations of distortion factors in the input images. We generated two distorted images with pre-defined distortion factors for a base reference. We compared our method with the Gröbner basis method [12] for the accuracy of the estimation. Second, we evaluated the stitching results with distorted images by comparing them with the results of AutoStitch [4] and Image Composite Editor (ICE) [21].
3.1 Accuracy of the Estimated Distortion Factors
For the evaluation of the estimated distortion factors, we generated two matched and distorted images with pre-defined distortion factors. The images used for this were taken in a distant view mode and have sufficient texture variation for feature detection, as shown in Fig. 4. Because they show distant views, the images have little distortion when they are stitched. We distorted the images with pre-defined distortion factors that ranged from −0.5 to 0.5 using Equation (3). The distorted images have a 1280×960 image size.

Fig. 4. Image pairs for the accuracy evaluation of the estimated distortion factors
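Generating synthetic test data of this kind requires applying the Division Model in both directions. The sketch below is a minimal illustration for single points under the single-parameter model, assuming coordinates that are normalized and centred on the distortion centre; the function names and the normalization are assumptions of this sketch, not details confirmed by the paper.

```python
import math


def undistort_point(xd, yd, lam):
    """Division Model: map a distorted point to its undistorted position."""
    scale = 1.0 + lam * (xd * xd + yd * yd)
    return xd / scale, yd / scale


def distort_point(xu, yu, lam):
    """Inverse mapping: find the distorted point whose undistortion is (xu, yu).

    Solves r_u = r_d / (1 + lam * r_d**2) for r_d and keeps the root that
    tends to r_u as lam tends to zero.
    """
    disc = 1.0 - 4.0 * lam * (xu * xu + yu * yu)
    if disc < 0.0:
        raise ValueError("no distorted point exists for this radius and lam")
    # Numerically stable form of (1 - sqrt(disc)) / (2 * lam * r_u) / r_u.
    scale = 2.0 / (1.0 + math.sqrt(disc))
    return xu * scale, yu * scale


# Usage: distort a point with a known factor, then recover it.
xd, yd = distort_point(0.3, -0.2, -0.4)
xu, yu = undistort_point(xd, yd, -0.4)
```

The closed-form inverse avoids iterative root finding, which matters when every pixel of a 1280×960 image has to be mapped.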
We consider two cases of the relationships of the two distortion factors. In the first case, the two images are distorted with the same distortion factor (Fig. 4(b) and (c)). In the second case, the two distortion factors are different (Fig. 4(d)). For the comparison, we evaluated our method using distorted images with the same distortion factor (the first case) since the Gröbner basis method [12] assumes that all images are distorted with the same distortion factor. Note that our method can estimate two distortion factors even if each distortion factor has different values. We will also show the evaluation for the accuracy of the estimated distortion factors in the second case.
Fig. 5 shows the comparison of the estimated distortion factor between our method and the Gröbner basis method for image pairs distorted with smoothly varying distortion factors. Fig. 4(b) and (c) are input examples for the 0.3 and −0.3 distortion factors we tested. For this evaluation, we estimated the distortion factors with both methods at the RANSAC stage, since both of the tested methods work at this stage. We ran 2000 RANSAC iterations and chose the distortion factor that had the highest number of inlier matches. We repeated this test ten times for each input distorted image pair and then averaged the results.
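The selection rule described above (many random hypotheses, keep the one with the most inliers) can be sketched as follows. This is a deliberately simplified toy: it estimates one distortion factor from pairs of distorted and undistorted radii rather than from matched feature points between two images, so it is not the estimator of this paper. The function names, the tolerance and the fixed seed are assumptions made for illustration.

```python
import random


def lam_from_pair(rd, ru):
    """Solve r_u = r_d / (1 + lam * r_d**2) for lam from one nonzero radius pair."""
    return (rd / ru - 1.0) / (rd * rd)


def count_inliers(pairs, lam, tol):
    """Count pairs whose undistorted radius is predicted within tol."""
    count = 0
    for rd, ru in pairs:
        denom = 1.0 + lam * rd * rd
        if abs(denom) < 1e-12:
            continue  # the model is undefined at this radius for this lam
        if abs(rd / denom - ru) < tol:
            count += 1
    return count


def ransac_lambda(pairs, iterations=2000, tol=1e-3, seed=0):
    """Return (best_lam, best_inlier_count) over random minimal samples."""
    rng = random.Random(seed)
    best_lam, best_count = 0.0, -1
    for _ in range(iterations):
        rd, ru = rng.choice(pairs)        # minimal sample: one pair
        lam = lam_from_pair(rd, ru)       # hypothesis from the sample
        count = count_inliers(pairs, lam, tol)
        if count > best_count:            # keep the hypothesis with most inliers
            best_lam, best_count = lam, count
    return best_lam, best_count
```

Hypotheses generated from mismatched pairs explain few other pairs, so they lose the vote to a hypothesis generated from a correct pair.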

Fig. 5. Comparison of the estimated distortion factor
In our experiments, the Gröbner basis method could not estimate the real distortion factors well, since it matched features using a long focal length rather than variations of the distortion factor. Therefore, although its estimated distortion factor is close to zero, it still has a high proportion of inliers, as shown in Fig. 6. On the other hand, our method could estimate almost exact distortion factors, as shown in Fig. 5. Although the two estimated distortion factors have different values, they are very close to the real ones. As shown in Fig. 6, our method has similar proportions of inliers regardless of the distortion factor, while the Gröbner basis method has low proportions for distortion factors smaller than zero.

Fig. 6. Comparison of the proportion of inliers
Fig. 7 compares the stitched results of the Gröbner basis method and our proposed method when the true distortion factor has a value of −0.4. Fig. 7(a) shows the result of the Gröbner basis method. It obtained good matching but preserved some image distortions. In comparison, our method undistorted each image accurately and stitched the images together with no distortion, as shown in Fig. 7(b).

Fig. 7. Comparison of stitched results
Since our method can estimate two different distortion factors, we also evaluated the estimated results when the two images are distorted with different distortion factors, as in Fig. 4(d). We evaluated our method with two distorted images whose distortion factors were varied smoothly and independently from −0.5 to 0.5. Fig. 8 shows the errors in the two estimated distortion factors. Fig. 8(a) is the error of the estimated distortion factor of the first input image and Fig. 8(b) is the error for the second input image. In this evaluation, we repeated the estimation ten times for each setting and averaged the results. Our method estimated distortion factors that are close to the true ones, so that the error is smaller than 0.5 in almost all cases. These results lead to accurate stitched results with arbitrarily distorted input images, as we will show in the next subsection.

Fig. 8. Errors of the estimated distortion factors of the proposed method
3.2 Evaluation of Stitched Results
To evaluate the stitched results of the proposed method, we compared our results with those of AutoStitch [4] and ICE [21]. AutoStitch is one of the most widespread general methods for image stitching. It can control small distortions by adjusting the focal lengths of the images. By comparison, ICE allows the combined image to be shifted to achieve perspective control. Thus, it is possible to reduce wide-angle lens distortion by using tiled exposures obtained at focal length settings that do not show the distortion.
The images we used in this evaluation are taken from several different indoor and outdoor areas such as a laboratory, a hallway, an urban landscape, a forest, etc. We captured images using a DSLR camera, a phone camera and a webcam. The distorted images have a 1280×960 size.
We considered three cases for this evaluation, according to the input distorted images. In the first case, we distorted all input images such that distortion factors smaller than zero are needed to undistort them. In the second case, we distorted all input images such that distortion factors larger than zero are needed. In these two cases, to allow for the focal length adjustment in AutoStitch, we set all distortion factors to the same value. In the final case, the input distorted images have different distortion factors, ranging from −0.5 to 0.5.
Fig. 9(a) shows the input distorted images for the first case. The input images were captured in a laboratory using a webcam. To undistort the input images, a distortion factor of −0.25 is necessary. Fig. 9(b) illustrates the result obtained with AutoStitch. Although AutoStitch projected the input images using long focal lengths, it could not generate accurate results. There are ghost effects, and the last input image is dropped altogether because its distortion prevented it from being matched with any other image. Fig. 9(c) shows the result of ICE. ICE compensated for distortion effects, but some misalignments appear in the result due to the incorrect estimation of distortions. By comparison, Fig. 9(d) shows the result of the proposed method. Since our method undistorted each input image using its estimated distortion factor, it could generate an accurate stitched result with neither ghost effects nor misalignments.

Fig. 9. Examples of stitched results (laboratory)
For the second case, we set all distortion factors so that an undistortion value of 0.4 is needed. Fig. 10(a) shows input distorted images taken in a restaurant. In this case, the input images were captured using a DSLR camera. Fig. 10(b) shows the result with AutoStitch, with distortions such as bending rails remaining in the final result. The result of ICE in Fig. 10(c) shows effects similar to those of AutoStitch. In Fig. 10(d), there are no distortions in the result generated by our proposed method.

Fig. 10. Examples of stitched results (restaurant)
Fig. 11 compares results for the final case, where the input images are distorted by arbitrary distortion factors. The input images were captured in a hallway using a camera phone. As in the previous cases, our method generated accurate stitched results, while the results obtained with AutoStitch and ICE contain numerous ghost effects and misalignments, and some images are dropped.

Fig. 11. Examples of stitched results (hallway)
To validate our proposed method with real data, we also evaluated it on real distorted images. Fig. 13 compares stitching results for two real distorted images taken by a GoPro-Hero camera [22] with vertical offsets. In the case of AutoStitch, the estimated parameters for the stitched result were too far from the ground truth, so a wrong result was obtained, as shown in Fig. 13(b). Although ICE stitched the input images well, the result still had distortion effects (Fig. 13(c)). Fig. 13(d) shows our result, where all of the distortions are corrected. Fig. 14 and Fig. 15 show some other examples for ICE and our proposed method. The input images were taken by a Sony HDR-AS15 camera, which has a wide-angle lens, so they have large barrel distortions. The images have a 1280×720 size. The input images in Fig. 14(a) were captured with a lens angle of 120° and the images in Fig. 15(a) were captured with 170°. While the results of ICE have some dislocated regions (centre window in Fig. 14(b)) and distortion effects (pillar in Fig. 14(b) and door in Fig. 15(b)), the proposed method could estimate accurate distortion factors for the input images even though they were captured in different environments. It gave an accurate stitched result with no distortions.

Fig. 12. Comparison of computing time

Fig. 13. Stitched results from real distorted images taken by a GoPro-Hero camera

Fig. 14. Stitched result of the proposed method using the real images taken by a 120 degree angle lens (outdoor)

Fig. 15. Stitched result of the proposed method using the real images taken by a 170 degree angle lens (indoor)
3.3 Computational Time
When stitching distorted images, the proposed method needs additional time in three stages: in the RANSAC stage for the estimation of the distortion factors, in the bundle adjustment stage for the adjustment of the estimated distortion factors, and in the blending stage for the backward mapping of the distorted images.
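Backward mapping visits every pixel of the output (undistorted) image and looks up where it came from in the distorted input, which avoids holes in the output. The sketch below illustrates the idea for a grayscale image stored as a list of rows, using the single-parameter Division Model and nearest-neighbour sampling. The normalization by the half-diagonal, the choice of the image centre as distortion centre and the function name are assumptions of this sketch rather than details taken from the paper.

```python
import math


def undistort_image(src, lam):
    """Backward-map a distorted grayscale image to an undistorted one.

    src is a list of equally long rows; pixels with no valid source stay 0.
    """
    h, w = len(src), len(src[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0   # assumed distortion centre
    norm = math.hypot(cx, cy) or 1.0        # assumed scale: corners at radius 1
    out = [[0] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            # Normalized coordinates of the output (undistorted) pixel.
            xu, yu = (u - cx) / norm, (v - cy) / norm
            disc = 1.0 - 4.0 * lam * (xu * xu + yu * yu)
            if disc < 0.0:
                continue  # no distorted source pixel maps to this location
            # Closed-form inverse of the Division Model.
            scale = 2.0 / (1.0 + math.sqrt(disc))
            sx = int(round(xu * scale * norm + cx))
            sy = int(round(yu * scale * norm + cy))
            if 0 <= sx < w and 0 <= sy < h:
                out[v][u] = src[sy][sx]     # nearest-neighbour sampling
    return out
```

The per-pixel cost is constant, so the extra blending time grows with the number of pixels in each image.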
For a comparison of computational time, we used the original, undistorted images in Fig. 11(a), with a 1280×960 size. Note that in this particular case our method estimated distortion factors close to zero. We measured the computational time while increasing the number of input images, repeating each test ten times and averaging the results. Fig. 12 shows the computational times of our method and of the general method [4] offered by the OpenCV library. We tested using a 3.3 GHz quad-core processor with 4 GB of memory. As shown in Fig. 12, our method needed additional time compared to the general method due to the estimation of distortion factors. Since the number of image matchings is (N − 1) + (N − 2) + · · · + 1 and our method adds time to each matching, the additional time increased as the number of input images increased. However, for a single image matching, our method needed only about 380 ms of additional time.
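The matching count above has a simple closed form, shown below with a rough worked example. The example assumes that every image pair is matched and that the additional cost stays near 380 ms per matching; it is an illustration, not a measured value.

```latex
(N-1) + (N-2) + \cdots + 1 \;=\; \sum_{k=1}^{N-1} k \;=\; \frac{N(N-1)}{2}

% Example under the stated assumptions: N = 5 input images
\frac{5 \cdot 4}{2} = 10 \ \text{matchings}, \qquad
10 \times 380\,\text{ms} \approx 3.8\,\text{s of additional time}
```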
4. Conclusion
In this paper, we present a new method to estimate accurate distortion factors for image stitching. With the Division Model, our method estimated accurate distortion factors by considering the relationships between matched feature points in distorted images. The advantage of the proposed method is that it can estimate accurate distortion factors even if the input images are distorted with arbitrary and varying distortion factors with respect to one another. However, the disadvantage is that it needs additional computing time. Even though distortion values are released for high performance cameras, lens distortions of varying severity occur in captured images, interfering with the generation of accurate stitching results. Since our method can deal with various images that include lens distortions, it will be useful in obtaining an improved performance in image stitching. However, since our method estimates distortion factors and homography independently stage by stage, our future work will focus on how to combine these two estimations, thereby resulting in faster performance.
5. Acknowledgments
This work was supported by the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Science and Technology (2010-0024641) and was supported by the Research Fund, 2012 of The Catholic University of Korea.
