Abstract

Aligning natural images captured in unconstrained environments is a challenging task. Although existing feature-based methods handle complex deformation well, they may fail in low-textured regions where features are insufficient, while pixel-based approaches may fail when colors change. In this paper, a parametric chamfer alignment method based on a mesh warping model is proposed. The warped positions of the mesh vertices are treated as parameters and estimated by optimizing an objective function that measures the chamfer distance between edges and the smoothness of the warp. To account for the sharpness of pixels, edges are detected through K-means clustering and weights are attached to the different levels of edge points. In addition, after the warping model is initialized by feature-based alignment, a growing technique for registering the vertices is presented. Experiments show that the proposed method outperforms several state-of-the-art methods on real data and can be applied to stitching images of ceramic sanitary ware.

Keywords

Chamfer alignment, mesh-based registration, edge detection, image stitching, ceramic sanitary ware

Image alignment is a fundamental computer vision problem: finding a transformation that aligns two images captured from different views or at different times. It has been widely applied in visual measurement, such as pose detection,¹ precise positioning,^2,3 and dimensional measurement.⁴ Alignment of real-world images remains a challenging task due to factors such as non-rigid deformation, illumination change, noise, and low texture.

Feature-based alignment approaches rely on feature correspondences to estimate a transformation model. SIFT⁵ is a classic feature-point descriptor. Global parametric models, such as homography, are appropriate when images are captured from a fixed viewpoint or the scene is planar. To handle more complex deformation, locally varying models^6–8 and mesh-based content-preserving warps^9–13 have been proposed in recent years. However, these methods are sensitive to the quality of the matched points: matching errors or insufficient features in low-textured regions may cause misalignment. Although line segment matching¹¹ can remedy this limitation to a certain extent, some weak features are still omitted.

To address this problem, a mesh-based photometric alignment method¹⁴ was proposed. It optimizes the mesh deformation model by minimizing the pixel intensity difference and achieves better alignment quality than feature-based methods, especially for low-textured images. However, it is not robust to color variation, which may be caused by lighting conditions, camera exposure, viewing angle, or different modalities in unconstrained environments.

In this paper, we present an edge-based image alignment method, since edges are salient structures and are stable under illumination changes. The fundamental technique of edge registration is chamfer matching.¹⁵ The geometric transformation parameters are optimized by minimizing an objective function built on the distance transform image. A global parametric model and a mesh warping model are adopted for simple and complex deformation, respectively. In addition, edge points are clustered into levels according to their sharpness, and weights are assigned to them in the cost function. The main contributions of this paper are as follows:

(1) A mesh-based chamfer alignment method, which is robust to images with low texture and color variation, is proposed;

(2) A cluster-based edge detection approach, which weights the contribution of different edge points in the objective function, is introduced;

(3) An optimization scheme for mesh warping that is combined with feature-based alignment is presented.

The rest of the paper is organized as follows. Section “Related works” reviews the related work. Parametric chamfer alignment solved by iterative optimization is revisited in section “Parametric chamfer alignment.” In section “Mesh-based chamfer alignment,” the proposed mesh-based chamfer alignment method is described in detail. Section “The experimental verification” gives the experimental results and comparisons with other approaches. The last section concludes the paper.

Related works

Image alignment approaches are mainly categorized into feature-based and pixel-based algorithms. A thorough review of early works can be found in Zitová and Flusser.¹⁶ Feature-based methods estimate the transformation model by means of feature correspondences. Point features, such as SIFT,⁵ and global models are the most common. To align images of multi-plane scenes, locally varying models were introduced. Gao et al.⁶ proposed a dual-homography model for scenes containing a distant plane and a ground plane. Zaragoza et al.⁸ presented an as-projective-as-possible (APAP) warp, which computes homography matrices for all local cells. Mesh-based warping models can also deal with complex transformation. Liu et al.⁹ proposed the content-preserving warp (CPW), which computes the warps of grid vertices by solving a linear least-squares problem. However, the alignment accuracy of these methods depends on the quality and quantity of the corresponding feature points. Mismatched pairs or insufficient features in low-textured regions may cause misalignment. To address mismatches, Zhang et al.¹² introduced an outlier rejection approach that fits local homographies and computes the residual errors. Li et al.¹⁷ proposed a robust elastic warping (REW) method, which adopts the thin plate spline model and refines features iteratively based on a probabilistic model. Moreover, Guo et al.¹⁸ introduced a grid-based feature tracker to make points cover the image densely and uniformly. To compensate for the shortage of keypoint features in low-texture scenes, Li et al.¹¹ proposed a dual-feature warping (DFW) model, which matches line segments and extends the mesh-based model by incorporating a line alignment term. Some recent methods^12–14,19 also adopted a line-preserving constraint to achieve better results. However, this term relies on the line matching result and cannot ensure the alignment of curved lines.

Pixel-based methods find the motion parameters of images by measuring the photometric error of overlapping pixels. Full search, also called template matching, is a straightforward technique. The Sum of Squared Differences (SSD) and the Normalized Cross-Correlation (NCC) are the most widely used measures. Korman et al.²⁰ proposed a fast matching algorithm for affine transformation. Hel-Or et al.²¹ introduced a matching scheme by tone mapping, which is robust to noise and photometric variance. These methods can obtain the global optimum, but are not suitable for complex deformation, since the computational cost increases exponentially with the number of parameters. Alternatively, with the photometric error regarded as the objective function, parameter optimization methods such as Gauss-Newton and Levenberg-Marquardt²² can be used to find a local minimum. The classic approach is the Lucas-Kanade optical flow algorithm.²³ Tian and Narasimhan²⁴ proposed a data-driven iterative algorithm that converges to the global optimum for non-rigid distortions. Lin et al.¹⁴ introduced a mesh-based photometric alignment method, which shows higher accuracy than feature-based approaches, especially for low-textured images. However, it assumes color consistency between images, and thus suffers from illumination changes and noise. To improve the robustness to color variation, Chen et al.¹⁹ proposed a local color mapping model for each mesh. However, it involves more parameters and makes the optimization more difficult.

Moreover, in recent years, some transformation parameter estimation approaches based on convolutional neural networks were proposed to handle low texture and illumination change. DeTone et al.²⁵ and Nguyen et al.²⁶ presented supervised and unsupervised learning methods to estimate global homography models, respectively. Ye et al.²⁷ introduced a deep meshflow model, but suffered from high training cost.

Chamfer matching is a classic edge-based alignment method, first proposed by Barrow et al.¹⁵ It is insensitive to small deformation and background disturbance, and thus has been used for object detection and recognition.^28–30 In this paper, we propose a mesh-based chamfer alignment method to align images captured in unconstrained environments with complex transformation, low texture, and color variation.

Parametric chamfer alignment

Chamfer matching¹⁵ is a classical technique for aligning edge maps. Let $P = \{p_{i}\}_{i = 1}^{N}$ and $Q = \{q_{j}\}_{j = 1}^{M}$ be the sets of edge points of the reference and target images, respectively. The chamfer distance between $P$ and $Q$ is defined as the average distance between the mapped location of each point $p_{i} \in P$ and its nearest edge point in $Q$, formulated as

$$E (P, Q; s) = \frac{1}{N} \sum_{i = 1}^{N} \min_{q_{j} \in Q} \| \Phi (p_{i}; s) - q_{j} \|, \tag{1}$$

where $\Phi : \mathbb{R}^{2} \to \mathbb{R}^{2}$ is the warping function parameterized by $s$. Here $s$ is a $K$-dimensional vector, and $K$ is determined by the type of geometric transformation. A parametric chamfer alignment problem is to find the $s$ that minimizes the matching cost.

The chamfer distance can be computed efficiently through a distance transform image, which assigns to each pixel the distance to its nearest edge point in $Q$, formulated as $\mathrm{DT}_{Q} (x) = \min_{q_{j} \in Q} \| x - q_{j} \|$. The distance transform can be computed in linear time,³¹ and (1) is equivalent to $E (P, Q; s) = \frac{1}{N} \sum_{i = 1}^{N} \mathrm{DT}_{Q} (\Phi (p_{i}; s))$.
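The lookup form of (1) can be illustrated with a minimal sketch. It assumes a pure translation $s = (t_x, t_y)$, nearest-pixel lookup, and a brute-force distance transform, which is quadratic rather than the linear-time algorithm cited above. The function names and the toy edge sets are hypothetical and are not taken from the paper.

```python
import math

def distance_transform(edge_points, width, height):
    """Brute-force DT_Q: distance from every pixel to its nearest edge point in Q."""
    return [[min(math.hypot(x - qx, y - qy) for qx, qy in edge_points)
             for x in range(width)]
            for y in range(height)]

def chamfer_cost(ref_points, dt, shift):
    """Cost (1) for a pure translation s = (tx, ty), using nearest-pixel lookup."""
    tx, ty = shift
    height, width = len(dt), len(dt[0])
    total = 0.0
    for px, py in ref_points:
        # Clamp the warped point to the image so the lookup stays valid.
        x = min(max(int(round(px + tx)), 0), width - 1)
        y = min(max(int(round(py + ty)), 0), height - 1)
        total += dt[y][x]
    return total / len(ref_points)

# Toy data: Q is a vertical edge at x = 5; P is the same edge located at x = 3.
Q = [(5, y) for y in range(8)]
P = [(3, y) for y in range(8)]
dt_q = distance_transform(Q, width=10, height=8)
```

In this toy case `chamfer_cost(P, dt_q, (0, 0))` is 2.0 and `chamfer_cost(P, dt_q, (2, 0))` is 0.0, so the translation that aligns the two edges is the one that minimizes the cost. The distance transform is computed once and reused for every candidate $s$.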

Chamfer matching can tolerate small deformations, but it is still sensitive to outliers in $P$. To improve robustness, the distance transform image is penalized through a normal (Gaussian) function in Zhang et al.,²⁹ denoted as $\mathrm{NDT}_{Q} (x) = 1 - \exp (- \frac{\mathrm{DT}_{Q} (x)^{2}}{2 \sigma^{2}})$, where $\sigma$ is a parameter. When $\mathrm{DT}_{Q} (x)$ is large, $\mathrm{NDT}_{Q} (x)$ saturates at 1, which bounds the contribution of any single outlier. Thus, an alternative cost function is

$$E (P, Q; s) = \frac{1}{N} \sum_{i = 1}^{N} \mathrm{NDT}_{Q} (\Phi (p_{i}; s))^{2}. \tag{2}$$
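The saturation behavior behind (2) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the input is assumed to be a list of distance transform values already sampled at the warped edge points.

```python
import math

def robust_distance(dt_value, sigma=1.0):
    """NDT_Q(x) = 1 - exp(-DT_Q(x)^2 / (2 sigma^2)); close to 1 for large distances."""
    return 1.0 - math.exp(-(dt_value ** 2) / (2.0 * sigma ** 2))

def robust_cost(dt_values, sigma=1.0):
    """Cost (2): mean of the squared robust distances at the warped edge points."""
    return sum(robust_distance(d, sigma) ** 2 for d in dt_values) / len(dt_values)
```

With one perfectly aligned point and one outlier at distance 100, `robust_cost([0.0, 100.0])` is 0.5, whereas the plain average distance in (1) would be 50. An outlier can therefore add at most $1/N$ to the cost, no matter how far away it lies.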

This nonlinear least-squares problem can be solved via the Gauss-Newton method. Given an initial vector $s_{0}$, $s$ is optimized iteratively. In each iteration, an increment vector $\Delta s$, which is added to the current $s$, is obtained by solving the linear system of equations

$$J^{T} J \Delta s = - J^{T} r, \tag{3}$$

where $r \in \mathbb{R}^{N \times 1}$ is the residual vector, i.e., $r = (\mathrm{NDT}_{Q} (\Phi (p_{1}; s)), \dots, \mathrm{NDT}_{Q} (\Phi (p_{N}; s)))^{T}$, and $J \in \mathbb{R}^{N \times K}$ is the Jacobian matrix with respect to $s$, whose $i$-th row $J_{i}$ is computed as

$$J_{i} = \frac{\partial \mathrm{NDT}_{Q} (\Phi (p_{i}; s))}{\partial s} = \nabla_{x} \mathrm{NDT}_{Q} (\Phi (p_{i}; s))^{T} \frac{\partial \Phi (p_{i}; s)}{\partial s}, \tag{4}$$

where $\nabla_{x} \mathrm{NDT}_{Q} (\Phi (p_{i}; s))$ is the gradient (a vector of length 2) of the penalized distance transform image at $\Phi (p_{i}; s)$, and $\frac{\partial \Phi (p_{i}; s)}{\partial s} \in \mathbb{R}^{2 \times K}$ is the Jacobian of the 2D transformation with respect to $s$. $\Phi$ is assumed to be continuously differentiable so that the required partial derivatives exist.
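As an illustration of (3) and (4), consider the translation model $\Phi (p; s) = p + s$ with $K = 2$. Its Jacobian $\partial \Phi / \partial s$ is the identity, so each row $J_{i}$ is simply the image gradient of $\mathrm{NDT}_{Q}$ at the warped point. The sketch below solves the resulting $2 \times 2$ normal equations in closed form. The function name and the hand-made inputs are hypothetical, and the gradients are assumed to have been sampled from the image beforehand.

```python
def gauss_newton_step(jacobian_rows, residuals):
    """Solve J^T J ds = -J^T r, i.e. (3), for a 2-parameter model such as translation."""
    a = b = d = 0.0          # entries of the symmetric matrix J^T J = [[a, b], [b, d]]
    g0 = g1 = 0.0            # entries of the vector J^T r
    for (j0, j1), r in zip(jacobian_rows, residuals):
        a += j0 * j0
        b += j0 * j1
        d += j1 * j1
        g0 += j0 * r
        g1 += j1 * r
    det = a * d - b * b
    if abs(det) < 1e-12:
        # Happens, for example, when all edge gradients point in the same direction.
        raise ValueError("J^T J is singular; the edges do not constrain both parameters")
    # Closed-form inverse of the 2x2 matrix applied to -J^T r.
    return (-d * g0 + b * g1) / det, (b * g0 - a * g1) / det
```

The singular case is worth noting: if every sampled gradient is horizontal, as for a single straight vertical edge, the translation along the edge is unobservable and the step cannot be computed without additional constraints.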

Translation, Euclidean, affine, and projective transformations are typical global models. For complex deformation, parametric chamfer alignment is extended to mesh warping, which is introduced in the following section.

Mesh-based chamfer alignment

Clustered edge detection

Classical edge detection methods, such as the Canny detector,³² usually first compute the gradient magnitude and then decide edges using thresholds. Low thresholds detect plenty of edges, but respond to noise and irrelevant features in the image. Conversely, high thresholds may miss weak edges. In addition, the image intensity affects the choice of thresholds to some extent. Rather than selecting appropriate thresholds, we partition all image pixels into levels according to their gradient magnitudes, where a higher level signifies sharper edges. The unsupervised $K$-means³³ algorithm is adopted to cluster the data. The number of clusters is set to 4 in our method, and a weight is assigned to each level to represent the sharpness of its edge points; the weights are set to $\{1.0, 0.8, 0.5, 0\}$ from high to low. Consequently, pixels in the lowest level are not treated as edges. Figure 1(c) shows the edge map with weight values, detected from the reference image (a). The weights of the edge points are used for calculating the alignment error in the next subsection.
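The clustering step can be sketched as a one-dimensional $K$-means on gradient magnitudes. This is only an illustration: the evenly spaced initialization, the fixed iteration cap, and the function names are assumptions rather than the paper's settings, and the gradient magnitudes are assumed to be given.

```python
def kmeans_1d(values, k=4, iterations=50):
    """Lloyd's algorithm on scalars; centers start evenly spaced over the data range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the previous center when a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers

LEVEL_WEIGHTS = [1.0, 0.8, 0.5, 0.0]  # from the sharpest to the weakest level, as in the text

def edge_weights(gradient_magnitudes):
    """Give each pixel the weight of the gradient-magnitude cluster it falls in."""
    centers = kmeans_1d(gradient_magnitudes, k=4)
    order = sorted(range(4), key=lambda c: centers[c], reverse=True)  # sharpest first
    weight_of_cluster = {c: LEVEL_WEIGHTS[rank] for rank, c in enumerate(order)}
    return [weight_of_cluster[min(range(4), key=lambda c: abs(g - centers[c]))]
            for g in gradient_magnitudes]
```

For the magnitudes `[0, 1, 2, 50, 51, 100, 101, 200, 201]` this returns `[0.0, 0.0, 0.0, 0.5, 0.5, 0.8, 0.8, 1.0, 1.0]`. The boundaries between levels adapt to the data, so no absolute threshold has to be chosen.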

Figure 1.

Edge detection and distance transform on images of the temple⁶ database: (a) reference image, (b) target image, (c) weighted edge maps of (a), and (d) inverse-color distance transform image of (b).

The distance transform image is also constructed from the clustered edge points. First, for each level of edge points, the distance transform $\mathrm{DT}_{Q_{l}} (x) = \min_{q \in Q_{l}} \| x - q \|$ is computed, where $Q_{l}$ denotes the edge points in level $l$. Then, the penalized distance transform of $Q$ is calculated by taking the minimum of the weighted normal functions at each location, formulated as

$$\mathrm{NDT}_{Q} (x) = \min_{l} \left[ 1 - w_{l} \exp \left( - \frac{\mathrm{DT}_{Q_{l}} (x)^{2}}{2 \sigma^{2}} \right) \right], \tag{5}$$

where $w_{l}$ is the weight of level $l$ and $\sigma$ is a parameter. Figure 1(d) shows the inverse-color image of the distance transform for the edge points in target image (b).
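A per-pixel evaluation of (5) can be sketched as follows, assuming the per-level distances $\mathrm{DT}_{Q_{l}} (x)$ at the pixel are already available. The function name is hypothetical.

```python
import math

def weighted_ndt(level_distances, level_weights, sigma=1.0):
    """Value of (5): min over levels l of 1 - w_l * exp(-DT_{Q_l}(x)^2 / (2 sigma^2))."""
    return min(1.0 - w * math.exp(-(d ** 2) / (2.0 * sigma ** 2))
               for d, w in zip(level_distances, level_weights))
```

A pixel lying exactly on a level with weight 1.0 gets the value 0, while a pixel lying only on a level with weight 0.5 gets 0.5. Under this formulation a weak target edge never fully cancels the cost, so sharp edges dominate the alignment.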

Mesh-based alignment

A mesh warping model is adopted for aligning images with complex deformation. Let the reference image $I_{0}$ be divided into a uniform grid mesh. The original locations and the warped positions of the vertices are denoted as $v_{k} = (x_{v_{k}}, y_{v_{k}})$ and $\hat{v}_{k} = (x_{\hat{v}_{k}}, y_{\hat{v}_{k}})$, respectively, where $k = 1, 2, \dots, K$ is the vertex index. The coordinates of all $\hat{v}_{k}$ are arranged into a $2K$-dimensional vector

$$V = \begin{pmatrix} x_{\hat{v}_{1}} & y_{\hat{v}_{1}} & x_{\hat{v}_{2}} & y_{\hat{v}_{2}} & \dots & x_{\hat{v}_{K}} & y_{\hat{v}_{K}} \end{pmatrix}^{T}, \tag{6}$$

which is the unknown to be solved for and guides the warping of the target image.

Alignment term. The alignment term is defined by the chamfer distance of edges. Under the mesh warping model, the warped position of each edge point in $I_{0}$ is a bilinear interpolation of its four enclosing vertices, as shown in Figure 2, formulated as

$$\Phi (p_{i}; V) = \sum_{l = 1}^{4} \alpha_{i, l} \hat{v}_{i, l}. \tag{7}$$

Figure 2.

Point mapping interpolation.

The interpolation weights $\alpha_{i, 1}$, $\alpha_{i, 2}$, $\alpha_{i, 3}$, and $\alpha_{i, 4}$ of $p_{i}$ are computed as

$$\begin{aligned} \alpha_{i, 1} &= (x_{v_{i, 4}} - x_{p_{i}}) (y_{v_{i, 4}} - y_{p_{i}}) / (w \times h), \\ \alpha_{i, 2} &= (x_{v_{i, 3}} - x_{p_{i}}) (y_{p_{i}} - y_{v_{i, 3}}) / (w \times h), \\ \alpha_{i, 3} &= (x_{p_{i}} - x_{v_{i, 2}}) (y_{v_{i, 2}} - y_{p_{i}}) / (w \times h), \\ \alpha_{i, 4} &= (x_{p_{i}} - x_{v_{i, 1}}) (y_{p_{i}} - y_{v_{i, 1}}) / (w \times h), \end{aligned} \tag{8}$$

where $w \times h$ is the size of a mesh patch. Similar to (2), the alignment term is defined as

$$E_{a} (V) = \sum_{i = 1}^{N} w_{i} \mathrm{NDT}_{Q} (\Phi (p_{i}; V)), \tag{9}$$

where $w_{i}$ is the weight of edge point $p_{i}$.
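The interpolation in (7) and (8) can be sketched as below. Since Figure 2 is not reproduced here, the vertex ordering is an assumption: $v_{i,1}$ is taken as the top-left corner, $v_{i,2}$ as top-right, $v_{i,3}$ as bottom-left, and $v_{i,4}$ as bottom-right, with $y$ increasing downward. The function names are hypothetical.

```python
def bilinear_weights(p, top_left, w, h):
    """Weights (8) for the cell whose top-left vertex v_{i,1} is `top_left`."""
    (xp, yp), (x0, y0) = p, top_left
    x1, y1 = x0 + w, y0 + h                      # bottom-right vertex v_{i,4}
    area = float(w * h)
    return [
        (x1 - xp) * (y1 - yp) / area,            # alpha_1, weight of v_1 (top-left)
        (xp - x0) * (y1 - yp) / area,            # alpha_2, weight of v_2 (top-right)
        (x1 - xp) * (yp - y0) / area,            # alpha_3, weight of v_3 (bottom-left)
        (xp - x0) * (yp - y0) / area,            # alpha_4, weight of v_4 (bottom-right)
    ]

def warp_point(alphas, warped_vertices):
    """Mapping (7): the alpha-weighted sum of the four warped vertices."""
    x = sum(a * vx for a, (vx, _) in zip(alphas, warped_vertices))
    y = sum(a * vy for a, (_, vy) in zip(alphas, warped_vertices))
    return x, y
```

The weights depend only on the original position of $p_{i}$ inside its cell, so they are computed once (step 2 of Algorithm 1). Because they sum to 1, leaving the vertices unwarped reproduces $p_{i}$ itself, and translating all four vertices translates $p_{i}$ by the same amount.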

Smoothness term. To maintain the local warping similarity of neighboring vertices and to propagate the transformation to regions with insufficient edge points, the regularization in Zhang et al.¹² is adopted. The warped position of each vertex is constrained to be close to the average of its neighbors. Thus, the smoothness term is defined as

$$E_{s} (V) = \sum_{k = 1}^{K} \left\| \hat{v}_{k} - \frac{1}{| N_{\hat{v}_{k}} |} \sum_{v \in N_{\hat{v}_{k}}} v \right\|^{2}, \tag{10}$$

where $N_{\hat{v}_{k}}$ is the neighboring set of vertex $\hat{v}_{k}$, which is 4-connected for an internal vertex and 2-connected for a boundary vertex.
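A minimal sketch of (10) on a rectangular grid of warped vertices follows. Two choices here are assumptions, not statements from the paper: the two neighbors of a boundary vertex are taken along the border, and corner vertices are skipped because they have no pair of opposite neighbors. The function name is hypothetical.

```python
def smoothness_energy(grid):
    """Energy (10) on a rows x cols grid of warped vertices given as (x, y) tuples."""
    rows, cols = len(grid), len(grid[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            neighbors = []
            if 0 < c < cols - 1:                 # has left and right neighbors
                neighbors += [grid[r][c - 1], grid[r][c + 1]]
            if 0 < r < rows - 1:                 # has upper and lower neighbors
                neighbors += [grid[r - 1][c], grid[r + 1][c]]
            if not neighbors:
                continue                         # corners are skipped in this sketch
            mean_x = sum(x for x, _ in neighbors) / len(neighbors)
            mean_y = sum(y for _, y in neighbors) / len(neighbors)
            x, y = grid[r][c]
            energy += (x - mean_x) ** 2 + (y - mean_y) ** 2
    return energy
```

With these choices an unwarped regular grid has zero energy, and on a 3 x 3 grid with 40-pixel spacing, moving only the center vertex by 4 pixels yields an energy of 16.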

Optimization. The alignment and smoothness terms are combined into the objective function, formulated as

$$E (V) = c_{1} E_{a} (V) + c_{2} E_{s} (V), \tag{11}$$

where $c_{1}$ and $c_{2}$ are the weights of the two terms. The optimization is performed by gradient descent. The gradient of $E (V)$ with respect to $V$ is calculated as

$$\nabla_{V} E (V)^{T} = c_{1} \sum_{i = 1}^{N} w_{i} \nabla_{x} \mathrm{NDT}_{Q} (\Phi (p_{i}; V))^{T} \frac{\partial \Phi (p_{i}; V)}{\partial V} + 2 c_{2} \sum_{k = 1}^{K} e_{k}^{T} \frac{\partial e_{k}}{\partial V}, \tag{12}$$

where $\nabla_{x} \mathrm{NDT}_{Q} (\Phi (p_{i}; V))$ is the image gradient vector, $\frac{\partial \Phi (p_{i}; V)}{\partial V} \in \mathbb{R}^{2 \times 2 K}$ is a sparse matrix calculated from (7), $e_{k} = \hat{v}_{k} - \frac{1}{| N_{\hat{v}_{k}} |} \sum_{v \in N_{\hat{v}_{k}}} v$, and $\frac{\partial e_{k}}{\partial V} \in \mathbb{R}^{2 \times 2 K}$ is a sparse matrix. Given an initial $V_{0}$, the update rule is

$$V_{n + 1} = V_{n} + \max (- 1, \min (1, - \nabla_{V} E (V_{n}))), \tag{13}$$

where the max and min are applied element-wise, so that each increment is constrained to the interval $[- 1, 1]$ to avoid a large change of the mesh vertices in one iteration. Moreover, $c_{1}$ and $c_{2}$ determine the step length of each iteration, which affects convergence to the minimum. To balance convergence speed and alignment accuracy, $c_{1}$ is halved whenever the alignment error oscillates between consecutive iterations more than it changes over the recent iterations, as tested in step 8 of Algorithm 1. An outline of the optimization is given in Algorithm 1.

Algorithm 1. Gradient descent optimization for chamfer alignment using mesh deformation
Input: Edge point set $P = \{p_{i}\}_{i = 1}^{N}$ and the corresponding weights $W = \{w_{i}\}_{i = 1}^{N}$; the distance transform image $\mathrm{NDT}_{Q}$
Output: Warped position vector of vertices $V^{*}$
1: Initialize parameters $w$, $h$, $c_{1}$, $c_{2}$, $n_{m}$, and $\epsilon$;
2: Compute $\alpha_{i, 1}$, $\alpha_{i, 2}$, $\alpha_{i, 3}$, and $\alpha_{i, 4}$ for each $p_{i}$;
3: Initialize $V_{0}$ and set $n = 0$;
4: repeat
5:   Compute $E_{a} (V_{n})$ by (9);
6:   Compute $\nabla_{V} E (V_{n})$ by (12);
7:   Compute $V_{n + 1}$ by (13);
8:   if $| E_{a} (V_{n}) + E_{a} (V_{n - 1}) - E_{a} (V_{n - 2}) - E_{a} (V_{n - 3}) | < | E_{a} (V_{n}) - E_{a} (V_{n - 1}) |$ then
9:     $c_{1} \leftarrow c_{1} / 2$
10:  end if
11:  $n \leftarrow n + 1$
12: until $| E_{a} (V_{n}) - E_{a} (V_{n - 1}) | < \epsilon$ or $n > n_{m}$
13: The optimal warped position vector estimate is $V^{*} = V_{n}$.
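The control flow of Algorithm 1 can be sketched generically, with the energy and gradient supplied as callbacks. This is an illustrative reading, not the authors' C++ implementation: the function names and callback signatures are hypothetical, and the stopping test compares the two most recently computed energies, which lags step 12 by one iteration.

```python
def clamp(value, low=-1.0, high=1.0):
    """Element-wise limiter used in update (13)."""
    return max(low, min(high, value))

def minimize(alignment_energy, gradient, v0, c1=0.3, c2=0.5, max_iters=100, eps=1.0):
    """Clamped gradient descent following the steps of Algorithm 1.

    alignment_energy(v) returns E_a(v); gradient(v, c1, c2) returns the gradient of
    E(v) = c1 * E_a(v) + c2 * E_s(v) as a list with the same length as v.
    """
    v = list(v0)
    history = []                                            # E_a(V_0), E_a(V_1), ...
    for _ in range(max_iters):
        history.append(alignment_energy(v))                 # step 5
        grad = gradient(v, c1, c2)                          # step 6
        v = [vi + clamp(-gi) for vi, gi in zip(v, grad)]    # step 7, update (13)
        if len(history) >= 4:
            e0, e1, e2, e3 = history[-1], history[-2], history[-3], history[-4]
            if abs(e0 + e1 - e2 - e3) < abs(e0 - e1):       # step 8: oscillation test
                c1 /= 2.0                                   # step 9
        if len(history) >= 2 and abs(history[-1] - history[-2]) < eps:
            break                                           # step 12
    return v
```

As a toy usage, minimizing $E_{a} (v) = (v - 3)^{2}$ from $v = 0$ with `minimize(lambda v: (v[0] - 3.0) ** 2, lambda v, c1, c2: [c1 * 2.0 * (v[0] - 3.0)], [0.0], eps=1e-6)` first takes clamped unit steps and then converges close to 3. The default arguments mirror the parameter values reported in the experiments.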


Combination with feature points

Since parametric chamfer alignment converges to a local optimum, the initial warping model determines the accuracy. Because feature detection and matching are robust in textured regions, the warped positions of the vertices in our alignment process are initialized by a feature-based method and refined by chamfer matching. SIFT⁵ is used to establish feature matches, and RANSAC³⁴ is then adopted to remove outliers. Next, the initial model is obtained by solving an optimization problem whose alignment term is defined by the correspondence error and whose smoothness term is given in (10).

Since the initial estimation of vertices surrounded by more feature points is likely to achieve higher accuracy, a growing technique is proposed for chamfer registration of vertices. The major steps are as follows:

Step 1: Initialization. A vertex is marked if its four nearest mesh cells contain at least 4 feature points. The vertices inside the convex hull of the marked ones form the initial optimization variables of the alignment term $E_{a} (V)$; that is, the partial derivatives of $E_{a} (V)$ with respect to all other vertices are set to 0.

Step 2: Growing. The set of optimized vertices is expanded by dilation with a $3 \times 3$ rectangular structuring element, and the current vertices are then aligned.

Step 3: Termination. If no additional vertices can be added, stop the algorithm. Otherwise, go to Step 2.

Algorithm 1 is used for optimization in each step. The illustration is shown in Figure 3. This technique ensures that the accurate alignment propagates from textured area to low-textured area.
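The growing schedule in Steps 1 to 3 can be sketched on a boolean grid with one entry per mesh vertex. The sketch assumes the initial mask already contains the convex hull of the marked vertices (the hull computation is omitted), and it only produces the successive active sets; running Algorithm 1 on each set is left to the caller. The function names are hypothetical.

```python
def dilate(mask):
    """Binary dilation with a 3x3 rectangular element on a grid of booleans."""
    rows, cols = len(mask), len(mask[0])
    return [[any(mask[rr][cc]
                 for rr in range(max(r - 1, 0), min(r + 2, rows))
                 for cc in range(max(c - 1, 0), min(c + 2, cols)))
             for c in range(cols)]
            for r in range(rows)]

def growing_schedule(initial_mask):
    """Yield the successive sets of active vertices used by the growing technique."""
    active = [row[:] for row in initial_mask]
    while True:
        yield active                 # the caller would run Algorithm 1 on these vertices
        grown = dilate(active)
        if grown == active:          # Step 3: no additional vertices can be added
            return
        active = grown
```

Starting from a single active vertex at the center of a 5 x 5 vertex grid, the schedule produces three stages with 1, 9, and 25 active vertices, after which no further growth is possible.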

Figure 3.

Growing procedure for the alignment of mesh vertices. Feature points are marked in blue. Vertices with at least 4 surrounding feature points are marked in yellow, and their convex hull is outlined in red. Green points denote the vertices currently being optimized.

The experimental verification

Parameter setting

All the parameters of the proposed mesh-based chamfer alignment (MChA) method are fixed in the experiments. The parameter $\sigma$ of the normal function in the distance transform is set to 1. The size of each mesh cell is $40 \times 40$. The weights of the alignment and smoothness terms are initialized as $c_{1} = 0.3$ and $c_{2} = 0.5$. In the optimization process, the maximum number of iterations $n_{m}$ and the desired accuracy $\epsilon$ are set to 100 and 1, respectively.

Quantitative evaluation

MChA is evaluated on real data, including image pairs and video sequences, as shown in Figure 4. The image pairs are collected from publicly available databases^6,8,17,35,36 and are challenging because of multiple planes, repetitive patterns, and low-textured ground. The videos are collected from public datasets^14,37 and YouTube (http://www.youtube.com/) and are categorized into regular videos, low-textured videos, and videos with color changes.

Figure 4.

Dataset for quantitative evaluation. (a) Image pairs; (b) Videos (01-06 are regular videos, 07-09 are low-textured videos, 10-12 are color-change videos).

MChA is compared with three state-of-the-art image alignment methods: APAP,⁸ CPW,⁹ and REW.¹⁷ The alignment accuracy of two images or frames is evaluated using the measure in Li et al.¹¹ The RMSE of one minus the normalized cross-correlation (NCC) over a $5 \times 5$ neighborhood of each pixel in the overlapping region $\Omega$ is calculated as

$$\mathrm{RMSE} (I_{0}, I_{1}^{'}) = \sqrt{\frac{1}{| \Omega |} \sum_{x \in \Omega} (1 - \mathrm{NCC} (f_{0} (x), f_{1} (x)))^{2}}, \tag{14}$$

where $I_{0}$ and $I_{1}^{'}$ are the reference image and the aligned target image, respectively, $| \Omega |$ denotes the number of pixels in $\Omega$, and $f_{0} (x)$ and $f_{1} (x)$ denote the vectors of pixel values of the $5 \times 5$ patches around $x$ in the two images.
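The measure in (14) can be sketched as follows. The sketch assumes the zero-mean form of NCC and treats a constant patch as uncorrelated; neither detail is specified in the text. Patches are assumed to be already extracted as flat lists of pixel values, and the function names are hypothetical.

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equally sized patches."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0   # assumption: a constant patch is treated as uncorrelated
    return sum(x * y for x, y in zip(da, db)) / denom

def rmse_ncc(patches_ref, patches_aligned):
    """Measure (14): root mean square of (1 - NCC) over all pixels of the overlap."""
    errors = [(1.0 - ncc(f0, f1)) ** 2 for f0, f1 in zip(patches_ref, patches_aligned)]
    return math.sqrt(sum(errors) / len(errors))
```

Under this form, a patch compared with a brightened copy of itself (a constant offset or gain) gives an NCC of 1 and contributes no error, while a contrast-reversed patch gives an NCC of -1. The score therefore ranges from 0 for perfect structural agreement to 2 in the worst case, which makes it suitable for comparing alignments under illumination change.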

For each video, all frame pairs that are 5 frames apart are used, and the average RMSE is computed. For APAP and CPW, the mesh or cell size is set to the same value as for MChA. The RMSE results of the different methods on image pairs and videos are shown in Tables 1 and 2, respectively. MChA achieves the lowest RMSE on every image pair and outperforms REW and CPW on most videos. Since the view differences between image pairs are larger than those between video frames, this suggests that MChA is superior to methods based on feature-point correspondences under large deformation. Moreover, MChA performs better in most cases with low texture and color variation. However, REW is more robust than MChA on scenes with complicated texture (04-06), where excessively dense edges can have a negative effect on alignment.

Table 1.

RMSE scores on image pairs.

Image-pair no.	APAP	CPW	REW	MChA
01	0.638	0.472	0.488	0.453*
02	0.631	0.461	0.688	0.420*
03	0.758	0.608	0.606	0.574*
04	0.794	0.751	0.752	0.722*
05	0.780	0.742	0.688	0.647*
06	0.865	0.809	0.840	0.692*
07	0.799	0.678	0.699	0.645*
08	0.844	0.854	0.789	0.775*
Average	0.764	0.672	0.694	0.616*

The best (lowest) score in each row is marked with an asterisk.

Table 2.

RMSE scores on videos.

Video no.	APAP	CPW	REW	MChA
01	0.568	0.489	0.483	0.481*
02	0.596	0.525	0.519	0.510*
03	0.814	0.774	0.769	0.749*
04	0.464	0.363	0.347*	0.348
05	0.445	0.355	0.332*	0.357
06	0.618	0.565	0.564*	0.569
07	0.730	0.707	0.688*	0.700
08	0.928	0.909	0.936	0.887*
09	0.589	0.526	0.505	0.498*
10	0.426	0.326	0.317	0.314*
11	0.629	0.535	0.506	0.491*
12	0.711	0.607	0.595	0.544*
Average	0.627	0.557	0.547	0.537*

The best (lowest) score in each row is marked with an asterisk.

The implementations of APAP, CPW, and MChA are programmed in C++ and run on an Intel i7 7700HQ CPU with 8 GB of RAM. The average runtime of MChA for an image pair at $640 \times 360$ resolution is 2.5 s, whereas that of APAP or CPW is 0.6 s.

Qualitative evaluation

To evaluate the alignment quality of MChA on low-textured images, we compare it with three methods: APAP,⁸ REW,¹⁷ and DFW.¹¹ APAP and REW rely on feature point correspondences, while DFW is based on dual features: points and line segments. Figure 5 shows the image stitching results on 4 databases.¹¹ The results of APAP and REW contain some misalignments, especially in regions with lines. DFW improves the quality, but still leaves some fine errors. MChA resolves the misalignments in these cases.

Figure 5.

Results on low-textured image stitching: (a) APAP, (b) REW, (c) DFW, and (d) MChA.

The image pairs for evaluating the effect of MChA on scenes with color variations are collected from Lin et al.¹⁴ and the website, as shown in Figure 6(a) and (b). For comparison, the feature-based alignment method used for the initial estimation of mesh warping in MChA (see section “Combination with feature points”), named MFA, is evaluated. In addition, a pixel-based approach similar to the method in Lin et al.,¹⁴ named MPhA, is also used for comparison. MPhA replaces the alignment term of MChA with the photometric error of pixels in the overlapping region $\Omega$, formulated as

$$E_{a} (V) = \sum_{x \in \Omega} (I_{0} (x) - I_{1} (\Phi (x; V)))^{2}, \tag{15}$$

where $\Phi (x; V)$ denotes the mesh warping function applied to pixel $x$. The remainder of MPhA is the same as MChA. Figure 6(c) to (e) show the comparison results. MFA misaligns some edges, and MPhA produces larger errors when color or illumination changes occur. MChA remains accurate under color variation.

Figure 6.

Comparison on color-variation image stitching. (a) reference images, (b) target images, (c) results of MFA, (d) results of MPhA, and (e) results of MChA.

In addition, we evaluate the proposed method on images of ceramic sanitary ware, which are challenging because of complex curved surfaces, low texture, and light spots. Figure 7 presents three examples. The feature-point-based methods, APAP and MFA, suffer from some misalignments of edges. MChA shows better results and can be applied to synthesize multi-view images of sanitary ware for further defect inspection.

Figure 7.

Results of aligning ceramic sanitary ware images. Top to bottom: reference images, target images, results of APAP, MFA, and MChA.

Conclusion

A mesh-based parametric chamfer alignment method is proposed in this paper. Edges of different sharpness are detected and divided into levels through the $K$-means clustering algorithm, which avoids selecting a threshold to separate edges from the rest of the image. Weights are then attached to the edge points in the objective function to emphasize the contribution of high-gradient pixels. With the chamfer alignment and smoothness terms, the mesh warping model is optimized by gradient descent. In addition, the optimization proceeds by a growing technique after the model is initialized from feature-point correspondences. Experiments on real-world data demonstrate that the proposed approach is robust to low texture and color variation, and achieves better alignment accuracy than several other state-of-the-art methods. However, our method performs worse on scenes with complicated texture, as excessively dense edges have a negative effect on alignment. This problem will be addressed in future research. Moreover, the chamfer alignment method can be applied to stitching multi-view images of ceramic sanitary ware.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by National Key R&D Program of China (No.2018YFB1308400).

ORCID iDs

Zhihao Zhang

Xianqiang Yang

References

1. Gao G-Q, Zhang. Pose detection of parallel robot based on improved ransac algorithm. Meas Control 2019; 52: 855–868.

2. Mahapatra, Thareja, Kaur, et al. A machine vision system for tool positioning and its verification. Meas Control 2015; 48: 249–260.

3. Tsai, Hsieh YC. Machine vision-based positioning and inspection using expectation–maximization technique. IEEE Trans Instrum Meas 2017; 66: 2858–2868.

4. Wang, Cui. An image-based system for measuring workpieces. Meas Control 2014; 47: 283–287.

5. Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004; 60: 91–110.

6. Gao, Kim, Brown. Constructing image panoramas using dual-homography warping. In: 2015 IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp.49–56. New York, NY: IEEE.

7. Lin W-Y, Liu, Matsushita, et al. Smoothly varying affine stitching. In: IEEE conference on computer vision and pattern recognition, 2011, Colorado Springs, CO, 20–25 June 2011, pp.345–352. New York, NY: IEEE.

8. Zaragoza, Chin T-J, Brown, et al. As-projective-as-possible image stitching with moving DLT. In: IEEE conference on computer vision and pattern recognition (CVPR’13), June 2013, pp.2339–2346. New York, NY: IEEE.

9. Liu, Gleicher, Jin, et al. Content-preserving warps for 3d video stabilization. ACM Trans Graph 2009; 28: 44.

10. Zhang, Liu. Parallax-tolerant image stitching. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), 2014, pp.3262–3269.

11. Yuan, Sun, et al. Dual-feature warping-based motion model estimation. In: Proceedings of the IEEE international conference on computer vision (ICCV), 2015, pp.4283–4291. New York, NY: IEEE.

12. Zhang, Chen, et al. Multi-viewpoint panorama construction with wide-baseline images. IEEE Trans Image Process 2016; 25: 3099–3111.

13. Xiang T-Z, Xia G-S, Bai, et al. Image stitching by line-guided local warping with global similarity constraint. Pattern Recognit 2018; 83: 481–497.

14. Lin, Jiang, Liu, et al. Direct photometric alignment by mesh deformation. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp.2405–2413. New York, NY: IEEE.

15. Barrow, Tenenbaum, Bolles, et al. Parametric correspondence and chamfer matching: Two new techniques for image matching. In: IJCAI’77: Proceedings of the 5th international joint conference on Artificial intelligence, August 1977, pp.659–663.

16. Zitová, Flusser. Image registration methods: a survey. Image Vis Comput 2003; 21: 977–1000.

17. Wang, Lai, et al. Parallax-tolerant image stitching based on robust elastic warping. IEEE Trans Multimedia 2018; 20: 1672–1687.

18. Guo, Liu, et al. Joint video stitching and stabilization from moving cameras. IEEE Trans Image Process 2016; 25: 5491–5503.

19. Chen, Yao, et al. Generalized content-preserving warp: direct photometric alignment beyond color consistency. IEEE Access 2018; 6: 69835–69849.

20. Korman, Reichman, Tsur, et al. Fast-match: Fast affine template matching. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2013, pp.2331–2338. New York, NY: IEEE.

21. Hel-Or, David. Matching by tone mapping: Photometric invariant template matching. IEEE Trans Pattern Anal Mach Intell 2014; 36: 317–330.

22. Nocedal, Wright. Numerical optimization. 2nd ed. New York, NY: Springer, 2006.

23. Lucas, Kanade. An iterative image registration technique with an application to stereo vision. In: Proceedings of the 7th international joint conference on artificial intelligence (IJCAI ’81), 1981, pp.674–679.

24. Tian, Narasimhan SG. Globally optimal estimation of nonrigid image distortion. Int J Comput Vis 2012; 98: 279–302.

25. DeTone, Malisiewicz, Rabinovich. Deep image homography estimation. arXiv preprint arXiv:1606.03798, 2016.

26. Nguyen, Chen, Shivakumar, et al. Unsupervised deep homography: A fast and robust homography estimation model. IEEE Robot Autom Lett 2018; 3: 2346–2353.

27. Wang, Liu, et al. Deepmeshflow: Content adaptive mesh deformation for robust image registration. arXiv preprint arXiv:1912.05131, 2019.

28. Liu M-Y, Tuzel, Veeraraghavan, et al. Fast directional chamfer matching. In: 2010 IEEE Computer Society conference on computer vision and pattern recognition, 2010, pp.1696–1703. New York, NY: IEEE.

29. Zhang, Shang, Chan AB. A robust likelihood function for 3d human pose tracking. IEEE Trans Image Process 2014; 23: 5374–5389.

30. Wei, Yang. Local part chamfer matching for shape-based object detection. Pattern Recognit 2017; 65: 82–96.

31. Felzenszwalb, Huttenlocher DP. Distance transforms of sampled functions. Theory Comput 2012; 8: 415–428.

32. Canny. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 1986; 8: 679–698.

33. Lloyd. Least squares quantization in PCM. IEEE Trans Inf Theory 1982; 28: 129–137.

34. Fischler, Bolles RC. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 1981; 24: 381–395.

35. Chang C-H, Sato, Chuang Y-Y. Shape-preserving half-projective warps for image stitching. In: 2014 IEEE conference on computer vision and pattern recognition, pp.3254–3261. New York, NY: IEEE.

36. Lin C-C, Pankanti, Natesan Ramamurthy, et al. Adaptive as-natural-as-possible image stitching. In: 2015 IEEE conference on computer vision and pattern recognition, 2015, pp.1155–1163. New York, NY: IEEE.

37. Liu, Yuan, Tan, et al. Bundled camera paths for video stabilization. ACM Trans Graph 2013; 32: 78.

Parametric chamfer alignment based on mesh deformation
