A coarse-to-fine scheme for groupwise registration of multisensor images

Abstract

Ensemble registration is concerned with a group of images that need to be registered simultaneously. It is challenging but important for many image analysis tasks such as vehicle detection and medical image fusion. To solve this problem effectively, a novel coarse-to-fine scheme for groupwise image registration is proposed. First, in the coarse registration step, unregistered images are divided into a reference image set and a float image set. The images of the two sets are registered based on segmented region matching. The coarse registration results are used as an initial solution for the next step. Then, in the fine registration step, a Gaussian mixture model with a local template is used to model the joint intensity of the coarse-registered images. Meanwhile, a minimum message length criterion-based method is employed to determine the unknown number of mixing components. Based on this mixture model, a maximum likelihood framework is used to register a group of images. To evaluate its performance, the proposed approach is compared with several representative groupwise registration approaches on different image data sets. The experimental results show that the proposed approach has improved performance compared to conventional approaches.

Keywords

Image registration, multi-image, coarse-to-fine, image segmentation, mixture model

Introduction

Image registration aims to geometrically align two or more images of the same scene taken at different times, from different perspectives, and/or by different imaging devices.¹ It is a crucial step in many image analysis tasks, including image fusion, change detection, and image super resolution.² In many situations, scene information is acquired from different sources. For example, vehicle images in a road traffic system are acquired with different sensors or illumination directions. In medical imaging, the underlying patient anatomy can be imaged by different acquisition techniques such as magnetic resonance (MR), computed tomography, positron emission tomography, and ultrasound (US). It is difficult to register such images because the image intensities cannot be compared directly. This kind of registration problem is referred to as multisensor registration.

A large number of methods have been proposed in the literature to deal with the image registration problem. These methods broadly fall into two categories: feature-based and intensity-based methods. Feature-based methods extract salient features from two images and then find a geometrical transformation to match the two sets of features.^3–5 The preferred features include corners, line intersections, edge lines, contours, and closed-boundary regions. These features are manually or automatically detected and represented by control points.¹ The advantage of feature-based methods is that they are fast and robust to noise and large geometric distortions. However, if the distorted images come from different modalities, the same scene can differ greatly in intensity and morphology across these images. In this case, feature-based methods alone cannot produce accurate registration results.

To address this deficiency, several researchers developed a two-step registration scheme.^6,7 First, a coarse result is produced by a feature-based method. This result is used as an initial solution for the next step. Then, an optimal result is produced by an intensity-based method. Intensity-based methods define similarity measures directly on the joint intensity distribution of two images and treat the registration problem as an optimization process that minimizes or maximizes the similarity measure. In methods of this kind, the correlation coefficient⁸ and mutual information⁹ are commonly used similarity measures. Although these measures can be employed to fine-tune the parameters of the coarse registration step, most of them are designed only for pairwise registration problems.

Groupwise registration methods aim to register more than two images. Compared with pairwise registration methods, these registration methods use joint information of the entire group of images to estimate the correspondences between each pair of images. The groupwise registration approach proposed by Woods et al.¹⁰ constructed a global cost function by adding sums of squared intensity differences between each image pair and then minimized this cost function to estimate the transformation of each image. Lorenzen et al.¹¹ proposed a domain-specific approach to simultaneously align a group of brain MR images. This approach used a human brain atlas to classify tissue and then registered the tissue classification images. However, these approaches only focus on registering monotone images.

Some approaches are designed to measure the dispersion in a joint intensity scatter plot (JISP) for multisensor image registration. Neemuchwala et al.¹² used the length of a minimum-length spanning tree to quantify the dispersion in the JISP. An iterative scheme proposed by Guimond et al.¹³ was applied to correct the intensity differences between images. Leventon and Grimson¹⁴ presented a maximum likelihood (ML) registration approach that specified the correct JISP from previously registered images. Recently, Orchard and Jonchery¹⁵ presented a clustering approach for multisensor groupwise registration. Their approach modeled the distribution of points in the space of joint intensity based on a Gaussian mixture model (GMM) and then estimated the model parameters using an expectation maximization (EM) algorithm. Špiclin et al.¹⁶ proposed a treecode registration approach for registering a group of multisensor images. Their approach estimated the joint density function through an efficient hierarchical subdivision of the joint intensity space. Zhu et al.¹⁷ used an infinite GMM (IGMM), which has the capability of determining a proper number of mixing components, to model the joint intensity distribution of the unregistered images and designed a variational Bayesian method to estimate motion parameters. While effective for groupwise registration, the mixture model used in this approach assumes that each intensity vector is independent of its neighbors and does not take into account spatial dependencies. This drawback reduces its registration accuracy.

To alleviate this limitation, many approaches have been proposed to incorporate spatial information into the image. Some approaches impose spatial constraints on the image pixel labels. In these approaches, the Markov random field (MRF)¹⁸ is a commonly used model. Recently, the hidden MRF (HMRF) model has been proposed.^19,20 It is a stochastic process, generated by an MRF, whose state sequence cannot be observed directly but can be observed indirectly through a field of observations. However, the HMRF model is computationally demanding. Instead of imposing the MRF-based constraint on the pixel labels, some other approaches directly impose spatial constraints on contextual mixing proportions and take into account the spatial correlation of pixels.^21–23 However, in these approaches, the prior distribution is different for each pixel and depends on the neighbors of the pixel. This limitation causes them to lose global cluster information. In addition, this kind of approach cannot determine the number of mixing components. Thus, it is not very suitable for groupwise registration.

Based on aforementioned considerations, a novel coarse-to-fine scheme for groupwise image registration is proposed in this article. It comprises a coarse registration process and a fine registration process. Major contributions of this article include the following:

A simple and valid method is introduced to match segmented regions of reference and float image sets.

An effective registration approach based on region feature matching is presented to eliminate the initial displacements of unregistered images. It is used as an initial solution for the fine registration processing.

A modified GMM with a local template is proposed to model the joint intensity of the coarsely registered images. The weights of the template are computed based on both the spatial distance and the intensity difference between the central intensity vector and its neighbors.

An ML approach with a minimum message length (MML) criterion is employed to infer the fine-tuning registration parameters.

The proposed approach employs the advantages of both feature-based and intensity-based methods. The performance of the proposed approach is evaluated on different multisensor image data sets and the results show that the proposed approach has improved performance compared to other groupwise registration approaches.

The rest of this article is organized as follows. A brief introduction of the proposed approach is presented in section “Methodology.” The experiment settings that include evaluation method and experimental data are introduced in section “Experiment.” The experimental results are reported and analyzed in section “Results.” Conclusions are drawn in the last section.

Methodology

Overview of the approach

The main steps of the proposed approach are shown in Figure 1. It starts with an initialization process, and the initialized images are registered by the coarse registration method. The coarse registration results serve as a good initial solution and are then refined by the fine registration method, which yields the aligned images.

Figure 1.

Main steps of the proposed method for coarse-to-fine groupwise registration.

Coarse registration

Region feature extraction

For feature-based image registration methods, feature extraction is critical for the success of feature matching and image registration. Our approach segments images and takes the segmented regions as the features used for image registration. In the past decades, fuzzy segmentation methods have been widely used for data clustering and image segmentation.^24,25 One of the most popular methods is the well-known fuzzy c-means (FCM) algorithm. As a soft segmentation method, FCM is able to retain more original image information than hard segmentation methods. In this article, we use the FCM algorithm with spatial constraints to partition a given image into different regions.²⁶ Each segmented region, which is indicated as reg, has three parameters: λ, ec, and fc. λ is the angle between the x-axis and the major axis of the ellipse that has the same second moments as reg. ec and fc are the x- and y-coordinates of the center of reg.

Region feature matching

A simple approach is used to match segmented regions. This approach is based on the idea that if two regions are similar in morphology, the area of their nonoverlapping region is small when they are aligned. Given two segmented regions reg₁ and reg₂, our approach first calculates the transform parameters between them. The method for calculating the transform parameters is introduced in the next section. The transformation function that is applied to reg₂ is represented by g and the transformed region is indicated as g(reg₂). Then, we calculate the difference between reg₁ and g(reg₂) as follows

AD(reg_{1}, reg_{2}) = \sum_{a = 1}^{s_{reg}} | reg_{1}(a) - g(reg_{2})(a) |

where reg₁(a) refers to the a th pixel location of reg₁. It is calculated as follows

reg_{1}(a) = \begin{cases} 1 & \text{if } a \leq size(reg_{1}) \\ 0 & \text{if } a > size(reg_{1}) \end{cases}

where size(reg₁) denotes the number of pixels in reg₁. s_reg is computed as

s_{reg} = \max(size(reg_{1}), size(g(reg_{2})))

A small AD(reg₁, reg₂) value indicates that reg₁ and reg₂ are similar to each other.
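The nonoverlap idea behind AD can be illustrated with a short Python sketch. It is only one possible reading of the measure: it assumes that each region is stored as a set of (row, col) pixel coordinates and that g(reg₂) has already been computed, and it counts the pixels covered by exactly one of the two regions. The function name `area_difference` and the set representation are illustrative choices, not taken from the article.

```python
def area_difference(reg1, reg2_aligned):
    """Count the pixels covered by exactly one of the two regions.

    Both arguments are iterables of (row, col) pixel coordinates; the second
    region is assumed to be already transformed by g (hypothetical input).
    """
    return len(set(reg1) ^ set(reg2_aligned))


# Toy example: a 2 x 2 square and the same square shifted one pixel to the right.
reg1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
reg2 = {(0, 1), (1, 1), (0, 2), (1, 2)}

ad_same = area_difference(reg1, reg1)    # identical regions do not differ
ad_shift = area_difference(reg1, reg2)   # two pixels stick out on each side
```

Under this reading, a perfectly aligned pair of identical regions gives a difference of zero, and the value grows with the nonoverlapping area.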

Image registration based on region matching

Assume that D source images will be registered. Let $I_{1} = \{I_{1}(1), I_{1}(2), \dots, I_{1}(D)\}$ represent the image set. Algorithm 1 is used to register these images based on their region features. This algorithm divides the source images into two sets: a reference image set A1 and a float image set A2. Then, the images of A2 are registered to the images of A1 one at a time. The details of the region matching process are outlined in algorithm 2. After region matching, algorithm 2 outputs the matched pair set Ω. Ω contains M matched pairs $\{(reg_{Ω(1, 1)}, reg_{Ω(1, 2)}), \dots, (reg_{Ω(M, 1)}, reg_{Ω(M, 2)})\}$ that are used to calculate the coarse registration parameters for I1(i) (i = 2, …, D). The geometrical deformation between reference and float images is assumed to be a rigid transformation in our approach. In two-dimensional space, a rigid transformation from point (e,f) to (e′,f′) is defined as

\begin{bmatrix} e' \\ f' \end{bmatrix} = \begin{bmatrix} \cos(\mathit{ro}) & -\sin(\mathit{ro}) \\ \sin(\mathit{ro}) & \cos(\mathit{ro}) \end{bmatrix} \begin{bmatrix} e \\ f \end{bmatrix} + \begin{bmatrix} \mathit{te} \\ \mathit{tf} \end{bmatrix}

where ro is rotation angle; te and tf are translations along x- and y-axes, respectively. These coarse registration parameters are computed as follows

\mathit{ro} = \frac{\sum_{o = 1}^{M} \arctan \frac{λ_{Ω(o, 1)} - λ_{Ω(o, 2)}}{1 - λ_{Ω(o, 1)} λ_{Ω(o, 2)}}}{M}

\mathit{te} = \frac{\sum_{o = 1}^{M} (\mathit{ec}_{Ω(o, 1)} - \mathit{ec}_{Ω(o, 2)})}{M}

\mathit{tf} = \frac{\sum_{o = 1}^{M} (\mathit{fc}_{Ω(o, 1)} - \mathit{fc}_{Ω(o, 2)})}{M}

After coarse registration, the rotation and translation displacements of the images are roughly eliminated. These registration results are used as an initial solution for the subsequent fine registration processing.
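As an illustration of the averaging formulas above, the following Python sketch computes ro, te, and tf from a list of matched region pairs and applies the resulting rigid transformation to a point. It implements the formulas literally as printed. The tuple layout (λ, ec, fc) and the function names `coarse_parameters` and `rigid_transform` are assumptions made for this example only.

```python
import math


def coarse_parameters(pairs):
    """Average rotation and translation over matched region pairs.

    pairs: list of ((lam1, ec1, fc1), (lam2, ec2, fc2)), where the first tuple
    describes the reference region and the second the float region
    (hypothetical data layout).
    """
    if not pairs:
        raise ValueError("at least one matched pair is required")
    m = len(pairs)
    # The arctan term follows the printed formula; it is undefined when the
    # product lam1 * lam2 equals 1.
    ro = sum(math.atan((a[0] - b[0]) / (1.0 - a[0] * b[0])) for a, b in pairs) / m
    te = sum(a[1] - b[1] for a, b in pairs) / m
    tf = sum(a[2] - b[2] for a, b in pairs) / m
    return ro, te, tf


def rigid_transform(point, ro, te, tf):
    """Map (e, f) to (e', f') with rotation ro and translation (te, tf)."""
    e, f = point
    return (math.cos(ro) * e - math.sin(ro) * f + te,
            math.sin(ro) * e + math.cos(ro) * f + tf)


pairs = [((0.5, 12.0, 20.0), (0.0, 10.0, 17.0)),
         ((0.5, 32.0, 41.0), (0.0, 28.0, 40.0))]
ro, te, tf = coarse_parameters(pairs)
moved = rigid_transform((1.0, 0.0), math.pi / 2, 1.0, 2.0)
```

Averaging over all M pairs means that a single poorly matched region shifts the estimate only by a fraction 1/M of its error, which is adequate for a coarse initial solution.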

Algorithm 1.

Image registration based on segmented region.

Input:

initial reference image set A1 = {I1(1)}

initial float image set A2 = {I1(2), …, I1(D)}

1: while A2 is not empty

2: select one image I1(i) from A2 (i = 2, …, D)

3: match regions of I1(i) with regions of the images in A1 and get matched pair set Ω (see Algorithm 2)

4: if Ω is empty

5: continue

6: else

7: compute coarse registration parameter θ1(i) = {ro, te, tf} based on Ω

8: move I1(i) from A2 into A1

9: end if

10: end while

Output:

all coarse registration parameters θ1 = (θ1(1), …, θ1(D))
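The control flow of algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: region matching and parameter estimation are passed in as callbacks (`match_regions` and `estimate_parameters` are hypothetical names), the first image is given identity parameters, and the loop stops when a full pass over the float set produces no new match, a safeguard that the listing above does not include.

```python
def coarse_register(images, match_regions, estimate_parameters):
    """Grow the reference set by repeatedly registering float images to it.

    images[0] is the initial reference image. Returns the estimated
    parameters per image index and the indices that could not be matched.
    """
    reference = [0]                          # indices of images in A1
    floating = list(range(1, len(images)))   # indices of images in A2
    theta = {0: (0.0, 0.0, 0.0)}             # assumption: identity for I1(1)
    progress = True
    while floating and progress:
        progress = False
        for i in list(floating):
            omega = match_regions([images[r] for r in reference], images[i])
            if not omega:
                continue                     # try this image again later
            theta[i] = estimate_parameters(omega)
            floating.remove(i)
            reference.append(i)
            progress = True
    return theta, floating


# Toy usage: "images" are integers and two images match if they differ by at most 1.
def toy_match(ref_images, img):
    return [(r, img) for r in ref_images if abs(r - img) <= 1]


def toy_estimate(omega):
    r, img = omega[0]
    return (0.0, float(r - img), 0.0)


theta, unmatched = coarse_register([0, 1, 2, 9], toy_match, toy_estimate)
```

In the toy run, image 2 can only be matched after image 1 has joined the reference set, which shows why the reference set is allowed to grow during the loop.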

Algorithm 2.

Segmented region match.

Input:

reference region set B1, obtained by segmenting the images of A1 (len(B1) denotes the number of regions in B1)

float region set B2, obtained by segmenting I1(i)

initial threshold value ε

1: while B2 is not empty

2: extract one region reg_{B2(j)} from B2 (j = 1, …, len(B2))

3: dv ← min_{1 ≤ k ≤ len(B1)} AD(reg_{B1(k)}, reg_{B2(j)}), and let k be the index that attains the minimum

4: if dv > ε

5: remove reg_{B2(j)} from B2

6: continue

7: else

8: move match pair (reg_{B1(k)}, reg_{B2(j)}) into Ω

9: remove reg_{B2(j)} from B2

10: continue

11: end if

12: end while

Output:

matched pairs set Ω
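A compact Python sketch of the matching loop in algorithm 2 is given below. It assumes that the difference measure is supplied as a function `ad(reg_ref, reg_float)` that already includes the alignment step; the function name `match_regions` and the toy measure are illustrative and not part of the article.

```python
def match_regions(ref_regions, float_regions, ad, epsilon):
    """Pair each float region with its most similar reference region.

    A pair is kept only if its difference value does not exceed epsilon;
    otherwise the float region is discarded, as in steps 4 to 9.
    """
    omega = []
    for reg_f in float_regions:
        if not ref_regions:
            break
        best = min(ref_regions, key=lambda reg_r: ad(reg_r, reg_f))
        if ad(best, reg_f) <= epsilon:
            omega.append((best, reg_f))
    return omega


# Toy difference measure: number of pixels covered by exactly one region
# (the alignment step is omitted for brevity).
def toy_ad(a, b):
    return len(a ^ b)


ref = [frozenset({(0, 0), (0, 1)}), frozenset({(5, 5), (5, 6), (6, 5)})]
flt = [frozenset({(0, 0), (0, 1), (1, 1)}), frozenset({(9, 9)})]
omega = match_regions(ref, flt, toy_ad, epsilon=1)
```

The threshold ε trades recall for reliability: a small value keeps only near-identical regions, whereas a large value admits more pairs at the risk of false matches.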

Fine registration

Problem formulation

An image group consists of D images. Thus, each pixel location x is associated with D intensity values and is represented by a joint intensity vector I_x.¹⁵ Ensemble registration assigns to these images a set of motion parameters that describe the transforms applied to them. In this article, we focus on the problem of rigid registration. If each image has L motion parameters, the total number of motion parameters is LD. Let θ be the set of motion parameters, and let the joint intensity vector of the transformed images be denoted by $I_{x}^{θ}$ . Given model parameters ρ, the log likelihood function of all the variables can be written as

\begin{array}{l} log L (ρ) = log \prod_{x = 1}^{X} p (I_{x}^{θ} | ρ) \\ = \sum_{x = 1}^{X} log p (I_{x}^{θ} | ρ) \end{array}

where X is the number of pixels. We use a GMM with a local template to model the joint intensity distribution.²³ For each pixel location x, the likelihood of joint intensity vector $I_{x}^{θ}$ is

p (I_{x}^{θ} | ρ) = \sum_{k = 1}^{K} π_{k} \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} p (I_{m}^{θ} | μ_{k}, Σ_{k})

where K is the number of mixture components, π_k represents the prior probability of $I_{x}^{θ}$ belonging to class k (subject to the constraint $\sum_{k = 1}^{K} π_{k} = 1$ ), and N_x is the neighborhood of the x th vector (an h × h window around x). w_m and R_x are the weighting parameter and the normalizing factor, respectively. w_m is computed based on the form of a Gaussian function²⁷

w_{m} = \frac{1}{{(2 π η^{2})}^{1 / 2}} exp(- \frac{d_{m x}^{2}}{2 η^{2}})

where d_mx is calculated as

d_{m x} = \frac{sd_{m x} + id_{m x}}{2}

where sd_mx and id_mx are the spatial distance and intensity difference between intensity vector $I_{x}^{θ}$ and its neighbor $I_{m}^{θ}$ , respectively. These two values are calculated using the Euclidean distance. η is the standard deviation of the Gaussian function. From the definition of the local template, we can see that w_m decreases as the spatial distance between $I_{x}^{θ}$ and $I_{m}^{θ}$ increases. Similarly, a large intensity difference between $I_{x}^{θ}$ and $I_{m}^{θ}$ also reduces the value of w_m. That is, the value of w_m is controlled by both sd_mx and id_mx. R_x is the normalizing factor, which is defined as

R_{x} = \sum_{m \in N_{x}} w_{m}
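The local template can be made concrete with a small Python sketch that evaluates the normalized weights w_m / R_x for one pixel. It follows the definitions of w_m, d_mx, and R_x above; the data layout (a list of position and intensity-vector pairs) and the function name `template_weights` are assumptions made for illustration.

```python
import math  # math.dist requires Python 3.8 or later


def template_weights(center_pos, center_vec, neighbors, eta=1.0):
    """Return the normalized template weights w_m / R_x for one pixel.

    neighbors: list of (position, intensity_vector) pairs in the h x h
    window around the center, including the center itself (assumed layout).
    """
    weights = []
    for pos, vec in neighbors:
        sd = math.dist(center_pos, pos)       # spatial distance sd_mx
        idiff = math.dist(center_vec, vec)    # intensity difference id_mx
        d = (sd + idiff) / 2.0
        w = math.exp(-d * d / (2.0 * eta ** 2)) / math.sqrt(2.0 * math.pi * eta ** 2)
        weights.append(w)
    r_x = sum(weights)                        # normalizing factor R_x
    return [w / r_x for w in weights]


neighbors = [((1, 1), (0.5, 0.5)),   # the center pixel itself
             ((1, 2), (0.5, 0.5)),   # adjacent pixel, same intensities
             ((1, 2), (0.9, 0.1))]   # adjacent pixel, different intensities
w = template_weights((1, 1), (0.5, 0.5), neighbors)
```

In this toy window, the center receives the largest weight, and of the two equally distant neighbors the one with the more similar intensity vector receives the larger weight.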

The function $p (I_{m}^{θ} | μ_{k}, Σ_{k})$ denotes the Gaussian distribution

p (I_{m}^{θ} | μ_{k}, Σ_{k}) = \frac{exp (- \frac{1}{2} {(I_{m}^{θ} - μ_{k})}^{T} {(Σ_{k})}^{- 1} (I_{m}^{θ} - μ_{k}))}{\sqrt{{(2 π)}^{D} | Σ_{k} |}}

where μ_k and Σ_k are the mean and covariance matrix of the k th component distribution, respectively.

In the above GMM, the number of components K, which affects how well the mixture model fits the data, needs to be known in advance. To overcome this difficulty, we adopt the approach presented in the study by Figueiredo and Jain,²⁸ which is based on the MML criterion, and obtain the following cost function

- log L (ρ, θ) + \frac{W}{2} \sum_{k = 1}^{K} log (\frac{X π_{k}}{12}) + \frac{K}{2} log \frac{X}{12} + \frac{K (W + 1)}{2}

where W is the number of parameters specifying each component. It is $D + D (D + 1) / 2$ in our approach. Now, the objective of our registration approach is to minimize this cost function with respect to ρ ( $ρ = \{μ_{k}, Σ_{k}, π_{k}\}$ ) and θ.
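To make the penalty terms of the cost function tangible, the following Python sketch evaluates it for a given negative log likelihood. It assumes that the negative log likelihood has been computed elsewhere, and it skips components with π_k = 0 because their logarithm is undefined, which mirrors the restriction to nonzero components introduced later in the article. The function name `mml_cost` is illustrative.

```python
import math


def mml_cost(neg_log_likelihood, pis, num_pixels, num_images):
    """MML-penalized cost for a mixture with mixing proportions pis.

    num_pixels is X and num_images is D; each component has
    W = D + D (D + 1) / 2 free parameters (mean plus symmetric covariance).
    """
    d = num_images
    w = d + d * (d + 1) // 2
    active = [p for p in pis if p > 0]        # assumption: drop empty components
    k = len(active)
    penalty = (w / 2.0) * sum(math.log(num_pixels * p / 12.0) for p in active)
    penalty += (k / 2.0) * math.log(num_pixels / 12.0) + k * (w + 1) / 2.0
    return neg_log_likelihood + penalty


# Two equally weighted components, X = 1200 pixels, D = 2 images (W = 5).
cost = mml_cost(0.0, [0.5, 0.5], 1200, 2)
```

Because every additional component adds a positive penalty when X π_k > 12, a component is worth keeping only if it improves the likelihood by more than its description cost.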

Parameter learning

The EM update procedure is used to estimate the parameters of the proposed mixture model. In the E-step, the expectation of the complete data log likelihood Q is calculated. Let z_xk be a hidden random variable that indicates whether $I_{x}^{θ}$ belongs to cluster k. Q can be written as

\begin{array}{l} Q = - \sum_{x = 1}^{X} \sum_{k = 1}^{K} p (z_{x k} | I_{x}^{θ}, ρ) (log p (z_{x k} | ρ) + log \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} p (I_{m}^{θ} | μ_{k}, Σ_{k})) \\ + \frac{W}{2} \sum_{k = 1}^{K} log (\frac{X π_{k}}{12}) + \frac{K}{2} log \frac{X}{12} + \frac{K (W + 1)}{2} \\ = - \sum_{x = 1}^{X} \sum_{k = 1}^{K} p (z_{x k} | I_{x}^{θ}, ρ) (log π_{k} + V) \\ + \frac{W}{2} \sum_{k = 1}^{K} log (\frac{X π_{k}}{12}) + \frac{K}{2} log \frac{X}{12} + \frac{K (W + 1)}{2} \end{array}

From equation (15), we can see that the quantity $V = log \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} p (I_{m}^{θ} | μ_{k}, Σ_{k})$ cannot be directly calculated. Because w_m/R_x satisfies w_m/R_x ≥ 0 and $\sum_{m \in N_{x}} w_{m} / R_{x} = 1$ , we can apply Jensen’s inequality²⁹ and the new log likelihood function is calculated as

\begin{array}{l} Q = - \sum_{x = 1}^{X} \sum_{k = 1}^{K} p (z_{x k} | I_{x}^{θ}, ρ) (log π_{k} + \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} log p (I_{m}^{θ} | μ_{k}, Σ_{k})) \\ + \frac{W}{2} \sum_{k = 1}^{K} log (\frac{X π_{k}}{12}) + \frac{K}{2} log \frac{X}{12} + \frac{K (W + 1)}{2} \end{array}

The posterior probability of hidden variable z_xk is computed as follows

\begin{array}{l} p (z_{x k} | I_{x}^{θ}, ρ)^{(i + 1)} = γ_{x k}^{(i+ 1)} \\ = \frac{π_{k}^{(i)} \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} p (I_{m}^{θ} | μ_{k}^{(i)}, Σ_{k}^{(i)})}{\sum_{j = 1}^{K} π_{j}^{(i)} \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} p (I_{m}^{θ} | μ_{j}^{(i)}, Σ_{j}^{(i)})} \end{array}

where i denotes the i th iteration. Given the result (equation (17)), the expectation of the complete data log likelihood Q is calculated as

\begin{array}{l} Q = - \sum_{x = 1}^{X} \sum_{k = 1}^{K} γ_{x k} (log π_{k} + \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} log p (I_{m}^{θ} | μ_{k}, Σ_{k})) \\ + \frac{W}{2} \sum_{k = 1}^{K} log (\frac{X π_{k}}{12}) + \frac{K}{2} log \frac{X}{12} + \frac{K (W + 1)}{2} \end{array}
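The posterior computation of the E-step reduces to a per-pixel normalization, which the following Python sketch illustrates. It assumes that the template-smoothed likelihoods, that is, the sums over the neighborhood of (w_m / R_x) p(I_m | μ_k, Σ_k), have already been evaluated for every pixel and component; the function name `responsibilities` and the nested-list layout are illustrative.

```python
def responsibilities(pis, smoothed_likelihoods):
    """Posterior probabilities gamma[x][k] of component k for pixel x.

    smoothed_likelihoods[x][k] holds the template-weighted likelihood of
    component k at pixel x (assumed to be precomputed).
    """
    gammas = []
    for row in smoothed_likelihoods:
        joint = [pi * s for pi, s in zip(pis, row)]
        total = sum(joint)   # assumed positive; an all-zero row would need special handling
        gammas.append([j / total for j in joint])
    return gammas


gammas = responsibilities([0.5, 0.5], [[0.2, 0.6], [0.9, 0.1]])
```

Each row of the result sums to one, so the values can be read as soft cluster memberships of the corresponding pixel.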

In the M-step, the revised parameters are determined by minimizing function (18). First of all, the parameter π_k is updated as

π_{k}^{(i+ 1)} = \frac{max (0, \sum_{x = 1}^{X} γ_{x k}^{(i)} - \frac{W}{2})}{\sum_{j = 1}^{K} max (0, \sum_{x = 1}^{X} γ_{x j}^{(i)} - \frac{W}{2})}
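The effect of this update, which removes components with too little support, can be seen in a short Python sketch. The nested-list layout of the posteriors and the function name `update_mixing_proportions` are assumptions for this example.

```python
def update_mixing_proportions(gammas, w):
    """Update pi_k and set weakly supported components to zero.

    gammas[x][k] are the posteriors from the E-step and w is the number of
    parameters per component (W in the article).
    """
    k = len(gammas[0])
    support = [max(0.0, sum(row[j] for row in gammas) - w / 2.0) for j in range(k)]
    total = sum(support)
    if total == 0.0:
        raise ValueError("all components were annihilated")
    return [s / total for s in support]


# Ten pixels, three components, W = 2 (as for a single image, D = 1).
gammas = [[0.9, 0.1, 0.0]] * 6 + [[0.1, 0.7, 0.2]] * 4
pis = update_mixing_proportions(gammas, w=2)
```

In the toy data, the third component collects a total posterior mass of only 0.8, which is below W / 2 = 1, so its mixing proportion becomes zero and the remaining two components are renormalized.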

The objective function in equation (19) does not make sense if we allow any of the π_k values to be zero, because log π_k is then undefined. To overcome this problem, only the components with nonzero π_k contribute to the log likelihood. Thus, the complete data log likelihood Q is

\begin{array}{l} Q = - \sum_{x = 1}^{X} \sum_{k : π_{k} > 0}^{} γ_{x k}^{} (log π_{k} + \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} log p (I_{m}^{θ} | μ_{k}, Σ_{k})) \\ + \frac{W}{2} \sum_{k : π_{k} > 0}^{} log (\frac{X π_{k}}{12}) + \frac{K_{z}}{2} log \frac{X}{12} + \frac{K_{z} (W + 1)}{2} \end{array}

where K_z denotes the number of non-zero probability components. For the k th mixing component, of which π_k is greater than 0, we update its parameters μ_k and Σ_k as follows

μ_{k}^{(i+ 1)} = \frac{\sum_{x = 1}^{X} γ_{x k}^{(i)} \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} I_{m}^{θ}}{\sum_{x = 1}^{X} γ_{x k}^{(i)}}

Σ_{k}^{(i+ 1)} = \frac{\sum_{x = 1}^{X} γ_{x k}^{(i)} \sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} (I_{m}^{θ} - μ_{k}^{(i)}) {(I_{m}^{θ} - μ_{k}^{(i)})}^{T}}{\sum_{x = 1}^{X} γ_{x k}^{(i)}}
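The two update formulas can be sketched in plain Python for a single component. The sketch assumes that the normalized template weights w_m / R_x are stored together with the neighboring intensity vectors, and, following the formula above, it uses the mean of the previous iteration when accumulating the covariance. The function name `update_component` and the data layout are illustrative.

```python
def update_component(gamma_k, neighborhoods, mu_old):
    """Update the mean and covariance of one mixture component.

    gamma_k[x] is the posterior of the component at pixel x, and
    neighborhoods[x] is a list of (normalized_weight, intensity_vector)
    pairs for the window around x (assumed layout).
    """
    d = len(mu_old)
    norm = sum(gamma_k)
    mu_new = [0.0] * d
    sigma_new = [[0.0] * d for _ in range(d)]
    for g, hood in zip(gamma_k, neighborhoods):
        for w, vec in hood:
            diff = [v - m for v, m in zip(vec, mu_old)]
            for a in range(d):
                mu_new[a] += g * w * vec[a]
                for b in range(d):
                    sigma_new[a][b] += g * w * diff[a] * diff[b]
    mu_new = [v / norm for v in mu_new]
    sigma_new = [[v / norm for v in row] for row in sigma_new]
    return mu_new, sigma_new


# Two pixels, each with a single neighbor of weight 1 (D = 2).
hoods = [[(1.0, [1.0, 2.0])], [(1.0, [3.0, 4.0])]]
mu, sigma = update_component([1.0, 1.0], hoods, mu_old=[2.0, 3.0])
```

Because each intensity vector enters through its weighted neighborhood, isolated outliers are averaged with their surroundings before they influence the component statistics.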

To update the motion parameters θ, we minimize Q with respect to θ by setting its gradient to zero

\frac{\partial Q}{\partial θ} = 0

and have

- \sum_{x = 1}^{X} \sum_{k : π_{k} > 0} γ_{x k} (\sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} \frac{\partial I_{m}^{θ}}{\partial θ} {(Σ_{k})}^{- 1} (I_{m}^{θ} - μ_{k})) = 0

In order to find motion parameters θ that satisfy the above equation, we introduce a small motion increment $\tilde{θ}$ and replace $I_{m}^{θ}$ with a linear approximation $I_{m}^{θ + \tilde{θ}} = I_{m}^{θ} + \partial I_{m}^{θ} / \partial θ \tilde{θ}$ . Thus, equation (24) can be written as

\begin{array}{l} \sum_{x = 1}^{X} (\sum_{k : π_{k} > 0}^{} γ_{x k}^{} (\sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} \frac{\partial I_{m}^{θ}}{\partial θ} {(Σ_{k})}^{- 1} {\frac{\partial I_{m}^{θ}}{\partial θ}}^{T})) \tilde{θ} \\ = - \sum_{x = 1}^{X} (\sum_{k : π_{k} > 0}^{} γ_{x k}^{} (\sum_{m \in N_{x}} \frac{w_{m}}{R_{x}} \frac{\partial I_{m}^{θ}}{\partial θ} {(Σ_{k})}^{- 1} (I_{m}^{θ} - μ_{k}))) \end{array}

Equation (25) can also be expressed by the simple form as given below

H \tilde{θ} = b

The optimal motion increment is obtained by solving these equations for $\tilde{θ}$ . Then, the increment is used to adjust the current estimate for θ.
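Once H and b have been assembled, the increment follows from a small linear solve. The Python sketch below uses Gaussian elimination with partial pivoting as a stand-in for whatever solver is actually used; the matrix entries, the starting parameters, and the function name `solve_linear_system` are made up for illustration.

```python
def solve_linear_system(h, b):
    """Solve H x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(h, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("H is singular or ill-conditioned")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


# Hypothetical 3 x 3 system for one image with parameters (ro, te, tf).
theta = [0.10, 2.0, -1.0]
h = [[4.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 2.0]]
b = [1.0, 2.0, 4.0]
increment = solve_linear_system(h, b)
theta = [t + dt for t, dt in zip(theta, increment)]
```

Since the system has only LD unknowns (12 for four images with three parameters each), a direct solve of this kind is inexpensive compared with the image resampling in each iteration.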

Algorithm 3.

Groupwise registration using GMM with local template.

Input:

initial source images I₀

initial motion parameters θ ← θ1

initial mixture model parameters ρ

initial number of mixing components K

1: for each scale do

2: I_scaled ← scale ensemble I₀

3: I ← apply motion θ to ensemble I_scaled

4: while not converged do

5: for M iterations do

6: ρ and $\tilde{θ}$ ← EM step

7: θ ← θ + $\tilde{θ}$

8: I ← apply motion θ to ensemble I_scaled

9: end for

10: end while

11: end for

Output:

registered ensemble at full scale

optimal motion parameter θ

The pseudocode of the proposed approach is outlined in algorithm 3. A multiresolution framework is employed to decompose each source image into several levels. Each group of images is registered first at the lowest-resolution level and then successively at higher-resolution levels. Motion parameters obtained at one level are used as an initial guess for the next level. In general, the scales of the multiresolution framework are 20%, 50%, and 100%. In each test, the number of motion parameters of each image is 3 (ro, te, and tf). Thus, when four images are registered simultaneously (D = 4), the total number of motion parameters LD is 12. The initial number of mixing components K is 10. The size of the local template is 5 × 5 and the standard deviation η is set to 1.
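How parameters might be carried from one resolution level to the next can be sketched as follows. The article does not state the transfer rule, so this example assumes that translations are measured in pixels and therefore scale with the resolution ratio, whereas the rotation angle is unchanged. The per-level refinement is passed in as a callback; `refine`, `rescale_parameters`, and `coarse_to_fine` are hypothetical names.

```python
SCALES = (0.2, 0.5, 1.0)   # the 20%, 50%, and 100% levels


def rescale_parameters(theta, old_scale, new_scale):
    """Convert per-image (ro, te, tf) between resolution levels.

    Assumption: translations are in pixels and scale with resolution,
    while the rotation angle does not change.
    """
    ratio = new_scale / old_scale
    return [(ro, te * ratio, tf * ratio) for ro, te, tf in theta]


def coarse_to_fine(theta_full, refine):
    """Run a refinement callback at each level, coarsest first."""
    theta = rescale_parameters(theta_full, 1.0, SCALES[0])
    for idx, scale in enumerate(SCALES):
        theta = refine(theta, scale)
        if idx + 1 < len(SCALES):
            theta = rescale_parameters(theta, scale, SCALES[idx + 1])
    return theta


# With a refinement step that changes nothing, the parameters must survive
# the round trip through all levels.
start = [(0.1, 10.0, -5.0)]
result = coarse_to_fine(start, lambda theta, scale: theta)
```

Starting at the coarsest level keeps the early iterations cheap and widens the capture range, because a displacement of 25 pixels at full resolution is only 5 pixels at the 20% level.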

Experiment

In this section, we perform experiments to evaluate the performance of the proposed method. Five other registration methods are selected for comparison. Two of them are feature-based registration methods: the scale-invariant feature transform-based registration method (SIFTR)⁷ and the segmented region-based registration method (SRR). The other three are intensity-based groupwise registration methods: the GMM-based groupwise registration method (GR),¹⁵ the IGMM-based Dirichlet process registration method (DPR),¹⁷ and the spatial intensity constraint groupwise registration method (SICGR), which registers with spatial constraints. Our method is named SRR-SICGR. It combines the SRR method and the SICGR method.

A criterion similar to the one in the work of Orchard and Mann¹⁵ is used to compare the performances of the registration methods. It compares the estimated transformation to the gold standard transformation by computing the average pixel displacement (APD). A small APD indicates a good registration and vice versa. We consider a registration successful if its average registration error is less than three pixels.
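The article does not spell out the APD formula, so the following Python sketch shows one plausible reading: every pixel of the evaluation region is mapped by both the estimated and the gold standard rigid transformation, and the Euclidean distances between the two mapped positions are averaged. The function names and the rectangular evaluation region are assumptions; only the three-pixel success threshold comes from the text.

```python
import math  # math.dist requires Python 3.8 or later


def rigid(point, theta):
    """Apply a rigid transformation theta = (ro, te, tf) to a point (e, f)."""
    ro, te, tf = theta
    e, f = point
    return (math.cos(ro) * e - math.sin(ro) * f + te,
            math.sin(ro) * e + math.cos(ro) * f + tf)


def average_pixel_displacement(theta_est, theta_gold, width, height):
    """Mean distance between the two mapped positions over a width x height grid."""
    total = 0.0
    for f in range(height):
        for e in range(width):
            total += math.dist(rigid((e, f), theta_est), rigid((e, f), theta_gold))
    return total / (width * height)


SUCCESS_THRESHOLD = 3.0   # pixels, as stated in the text

apd_same = average_pixel_displacement((0.2, 1.0, 2.0), (0.2, 1.0, 2.0), 8, 8)
apd_shift = average_pixel_displacement((0.0, 3.0, 4.0), (0.0, 0.0, 0.0), 8, 8)
is_success = apd_shift < SUCCESS_THRESHOLD
```

A pure translation error gives the same displacement at every pixel, whereas a rotation error grows with the distance from the rotation center, so the APD depends on the size of the evaluation region.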

Four publicly available image data sets are used to evaluate the performances of the different registration methods. The source images of each data set are initialized by applying known displacements. Then, the registration methods are run on the resulting trial image data sets. Details about the image data sets are outlined as follows.

Vehicle image

A vehicle image, which is shown in Figure 2, is used to test the different registration methods. The region of interest (ROI), which is outlined in Figure 2, is used for performing registration. Four displacement ranges are used to generate the initial random rigid displacements of the image. They are [−10, 10], [−15, 15], [−20, 20], and [−25, 25] (pixels or degrees), respectively. In each displacement range, 50 trial groups are generated and each trial group consists of four images. Among the four images of a trial group, one is the reference image (denoted as image (a)) and the other three unregistered images are generated by applying random displacements to the reference image (denoted as images (b) to (d)).

Figure 2.

Vehicle image. The ROI is outlined in image. ROI: region of interest.

Variable illumination

Five face images obtained from the Extended Yale Face Database B³⁰ are used to demonstrate the performances of the different methods. These images, which are shown in Figure 3, are illuminated from different angles ranging from far left to far right. The ROI, which is shown in Figure 3(a), is used for performing registration. Four displacement ranges, referred to as transform parameter ranges (TPRs), are used for sampling translation and rotation parameters. They are [−10, 10], [−15, 15], [−20, 20], and [−25, 25] (pixels or degrees), respectively. In each displacement range, 50 trial groups are generated.

Figure 3.

Face images. The ROI is outlined in image (a). ROI: region of interest.

Disjoint content

An image group of the multisensor phantom is shown in Figure 4. The information in these images is complementary. The ROI, which is used for registration, is shown in Figure 4(a). Four displacement ranges ([−10, 10], [−15, 15], [−20, 20], and [−25, 25]) are used for sampling rigid displacements. Fifty trial groups are generated using random translation and rotation in each displacement range.

Figure 4.

Images from phantom data set. The ROI is outlined in image (a). ROI: region of interest.

Medical images

We also test these methods on a group of medical images, which is shown in Figure 5. This group consists of three MR images (T1 weighted, T2 weighted, and Proton-Density weighted). The ROI, which is shown in (a) of Figure 5, is used for performing registration. Four displacement ranges are used for sampling rigid registration parameters. They are [−10, 10], [−15, 15], [−20, 20], and [−25, 25] (pixels or degrees), respectively. In each displacement range, 50 trial groups are generated.

Figure 5.

Multisensor medical images. The ROI is outlined in image (a). ROI: region of interest.

Results

Vehicle image

The results for the rigid-body registration of the vehicle image are shown in Tables 1 and 2. Table 1 shows pair registration success rates for the data set and Table 2 shows mean registration errors for image pairs. TPR refers to the displacement range. From the results, we can see that when TPR is within [−15, 15], only the SRR method fails to register all image pairs successfully. The GR, the SICGR, and the SRR-SICGR methods have lower registration errors than the other methods. When TPR is [−20, 20], the SIFTR and the DPR methods successfully register all image pairs. Of these two, the DPR method has the lower mean error. As the displacement range is expanded, the success rates of the intensity-based methods decrease quickly. In contrast, the success rates of the feature-based methods show no obvious change. At TPR = [−25, 25], the SIFTR method has the highest registration accuracy, and the accuracy of the SRR-SICGR method is slightly lower than that of the SIFTR method.

Table 1.

Pair registration success rates of different methods for vehicle images within different displacement ranges.

Method	SIFTR	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	1	0.63	1	1	1	1
TPR = [−15, 15]	1	0.67	1	1	1	1
TPR = [−20, 20]	1	0.7	0.5	1	0.93	0.97
TPR = [−25, 25]	1	0.7	0.27	0.8	0.77	0.97

Note: Italics indicate highest pair success rates in each row.

Table 2.

Mean registration errors of different methods for vehicle images within different displacement ranges.

Method	Image pair	SIFTR	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	(a) and (b)	0.29	2.01	0.01	0.02	0.01	0.01
	(a) and (c)	0.24	3.28	0.01	0.02	0.01	0.01
	(a) and (d)	0.29	3.03	0.01	0.02	0.01	0.01
	Mean	0.27	2.77	0.01	0.02	0.01	0.01
TPR = [−15, 15]	(a) and (b)	0.37	1.52	0.01	0.02	0.01	0.01
	(a) and (c)	0.54	2.19	0.01	0.02	0.01	0.01
	(a) and (d)	0.42	3.26	0.02	0.02	0.01	0.01
	Mean	0.44	2.32	0.01	0.02	0.01	0.01
TPR = [−20, 20]	(a) and (b)	0.58	5.37	3.18	0.02	2.12	0.01
	(a) and (c)	0.37	2.45	4.66	0.04	0.01	0.91
	(a) and (d)	0.33	1.45	0.71	0.02	0.51	0.01
	Mean	0.43	3.09	2.85	0.03	0.88	0.31
TPR = [−25, 25]	(a) and (b)	0.46	3.89	8.38	3.02	1.36	0.03
	(a) and (c)	0.42	3.09	9.61	5.24	5.93	1.82
	(a) and (d)	0.35	4.61	2.69	0.02	7.54	0.61
	Mean	0.41	3.86	6.89	2.76	4.94	0.82

Note: Italics indicate lowest mean registration errors in each row.

All methods are implemented in MATLAB R2012a, running on a Windows machine with a Core (TM) 2 Duo 1.83 GHz CPU. Table 3 shows the average times of different methods to register a group of images. As seen, the SIFTR method spends less time than other methods, while the SRR-SICGR method requires the most time.

Table 3.

Average computation times of different methods for vehicle images.

Method	SIFTR	SRR	GR	DPR	SICGR	SRR-SICGR
Computation time	27.96s	37.36s	158.71s	235.42s	228.21s	261.21s

Note: Italics indicate the minimum computation time.

Variable illumination

Tables 4 and 5 show the registration results for the variable illumination experiment. As seen, the feature-based methods have lower success rates than the intensity-based methods when TPR is [−10, 10]. The SIFTR method has the lowest success rate and the highest mean error. This shows that it has difficulty registering images whose intensities are uncorrelated. The SICGR method has a lower mean error than the other methods. When TPR is expanded to [−15, 15], the GR, the DPR, and the SRR-SICGR methods have the highest success rates. Among them, the SRR-SICGR method has the lowest mean error. As the TPR is expanded further, the success rates and registration accuracies of the intensity-based methods decrease somewhat. The SRR-SICGR method has the highest success rate and the lowest mean error because it employs a coarse-to-fine registration scheme.

Table 4.

Pair registration success rates of different methods for face images within different displacement ranges.

Method	SIFTR	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	0.35	0.63	1	1	1	1
TPR = [−15, 15]	0.28	0.55	1	1	0.95	1
TPR = [−20, 20]	0.2	0.43	0.85	0.9	0.88	0.98
TPR = [−25, 25]	0.35	0.48	0.65	0.73	0.65	0.97

Note: Italics indicate highest pair success rates in each row.

Table 5.

Mean registration errors of different methods for face images within different displacement ranges.

Method		SIFTR	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	(a) and (b)	21.8	2.56	0.32	0.58	0.46	0.67
	(a) to (c)	6.07	2.31	1.8	1.48	1.4	1.27
	(a) to (d)	58.69	4.88	1.54	1.45	1.22	1.31
	(a) to (e)	22.36	4.24	1.85	1.32	1.13	1.14
	Mean	27.23	3.5	1.38	1.21	1.05	1.1
TPR = [−15, 15]	(a) and (b)	28.01	2.69	0.3	0.48	1.74	0.56
	(a) to (c)	8.89	2.74	1.79	1.44	1.77	1.31
	(a) to (d)	58.69	5.04	1.83	1.67	1.6	1.31
	(a) to (e)	22.36	4.78	1.57	1.63	8.2	1.23
	Mean	29.49	3.81	1.37	1.3	3.33	1.1
TPR = [−20, 20]	(a) and (b)	28.88	6.32	0.64	0.98	6.17	0.57
	(a) to (c)	4.49	6.73	7.27	4.39	3.71	1.53
	(a) to (d)	73.46	5.57	2.71	16.91	4.91	2.81
	(a) to (e)	21.89	8.01	9.13	1.18	12.52	1.12
	Mean	32.18	6.66	4.94	5.87	6.83	1.51
TPR = [−25, 25]	(a) and (b)	13.22	7.81	24.39	4.9	21.54	0.53
	(a) to (c)	3.66	3.18	5.73	10.36	5.7	1.66
	(a) to (d)	69.77	5.95	20.08	23.23	22.41	1.43
	(a) to (e)	42.74	3.44	6.91	7.97	17.92	2.94
	Mean	32.35	5.1	14.27	11.62	16.89	1.64

Note: Italics indicate lowest mean registration errors in each row.
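The Mean rows of Table 5 are consistent with the arithmetic mean of the per-pair errors listed above them. The following minimal Python sketch recomputes two of these entries from the SRR-SICGR column. The function names are ours, and `success_rate` uses a hypothetical error threshold, because the success criterion behind Table 4 is defined elsewhere in the article.

```python
def mean_error(pair_errors):
    """Arithmetic mean of per-pair registration errors, rounded to two decimals."""
    return round(sum(pair_errors) / len(pair_errors), 2)


def success_rate(trial_errors, threshold):
    """Fraction of trials with error below `threshold` (hypothetical criterion)."""
    return sum(e < threshold for e in trial_errors) / len(trial_errors)


# SRR-SICGR column of Table 5, pairs (a)-(b), (a)-(c), (a)-(d), (a)-(e).
srr_sicgr_tpr10 = [0.67, 1.27, 1.31, 1.14]  # TPR = [-10, 10]
srr_sicgr_tpr25 = [0.53, 1.66, 1.43, 2.94]  # TPR = [-25, 25]

print(mean_error(srr_sicgr_tpr10))
print(mean_error(srr_sicgr_tpr25))
```

The two printed values match the Mean entries 1.1 and 1.64 reported for SRR-SICGR in Table 5.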

To evaluate the registration performance of the proposed method under noisy conditions, we compare it with the GR and DPR methods. We set TPR = [−10, 10]. Each image of the phantom data set is corrupted with Gaussian (G) noise and salt-and-pepper (SP) noise. The variance of the G noise and the density of the SP noise vary from 0 to 0.05. Figure 6 plots the registration results under different noise levels. The results show that the registration errors of all three methods grow as the noise increases. At the same noise level, our method performs best among the three.

Figure 6.

Registration results of three methods for face images within different noise levels.
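The noise corruption described above can be sketched in Python with NumPy as follows. This is only an illustration under stated assumptions: the article does not say how the noise was generated, intensities are assumed to be scaled to [0, 1], the number of noise levels (six) is arbitrary, and the constant test image is a placeholder rather than data from the experiments.

```python
import numpy as np


def add_gaussian_noise(image, variance, rng):
    """Add zero-mean Gaussian noise of the given variance, then clip to [0, 1]."""
    noisy = image + rng.normal(0.0, np.sqrt(variance), size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def add_salt_and_pepper(image, density, rng):
    """Set about a `density` fraction of pixels to 0 (pepper) or 1 (salt)."""
    noisy = image.copy()
    u = rng.random(image.shape)
    noisy[u < density / 2] = 0.0                       # pepper
    noisy[(u >= density / 2) & (u < density)] = 1.0    # salt
    return noisy


rng = np.random.default_rng(0)
image = np.full((64, 64), 0.5)        # placeholder image (assumption)
levels = np.linspace(0.0, 0.05, 6)    # noise levels spanning the stated range

noisy_g = [add_gaussian_noise(image, v, rng) for v in levels]
noisy_sp = [add_salt_and_pepper(image, d, rng) for d in levels]
```

At level 0 both functions return the image unchanged, so the first entry of each list serves as the noise-free baseline.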

Disjoint content

Tables 6 and 7 show the registration results for the disjoint content experiment. According to the results, the SIFTR method fails to register all image pairs. When TPR is [−10, 10], the intensity-based methods successfully register all image pairs; among them, the DPR method has the lowest mean registration error. As the displacement range is expanded, the SRR and SRR-SICGR methods have higher success rates than the other methods. In addition, the SRR-SICGR method has the highest registration accuracy of all the methods.

Table 6.

Pair registration success rates of different methods for biological images within different displacement ranges.

Method	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	0.97	1	1	1	1
TPR = [−15, 15]	0.97	0.53	0.53	0.9	1
TPR = [−20, 20]	0.97	0.57	0.5	0.23	0.99
TPR = [−25, 25]	0.93	0.03	0.03	0.13	1

Note: Italics indicate highest pair success rates in each row.

Table 7.

Mean registration errors of different methods for biological images within different displacement ranges.

Method		SIFTR	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	(a) and (b)	8.07	0.75	0.65	0.21	0.33	0.79
	(a) to (c)	9.11	3.84	1.81	0.95	1.21	1.16
	(a) to (d)	8.69	1.51	0.28	0.14	0.12	0.11
	Mean	8.62	2.03	0.91	0.43	0.55	0.69
TPR = [−15, 15]	(a) and (b)	13.24	0.96	9.76	8.73	3.63	0.71
	(a) to (c)	11.39	4.01	18.17	20.43	4.5	1.16
	(a) to (d)	13.08	1.46	7.01	4.97	3.67	0.1
	Mean	12.57	2.14	11.65	11.38	3.93	0.66
TPR = [−20, 20]	(a) and (b)	18.79	0.79	14.68	18.65	28.26	0.72
	(a) to (c)	18.26	4.22	19.18	13.52	20.54	2.37
	(a) to (d)	18.48	2.71	3.24	14.74	31.38	0.12
	Mean	18.51	2.57	12.37	15.64	26.73	1.07
TPR = [−25, 25]	(a) and (b)	24.91	0.77	27.65	29.66	31.72	0.74
	(a) to (c)	20.48	4.79	25.43	27.85	25.84	1.4
	(a) to (d)	22.63	1.87	31.33	33.77	20.1	0.15
	Mean	22.67	2.48	28.14	30.43	25.89	0.76

Note: Italics indicate lowest mean registration errors in each row.

Figures 7 and 8 show the clustering results of the GR, DPR, and SRR-SICGR methods. There are five different clusters in the JISP of the phantom data set: four small ellipses and a large embedding circle. When TPR = [−10, 10], all three methods converge to the correct solution. In addition, the DPR and SRR-SICGR methods successfully estimate the number of components. When TPR = [−25, 25], the GR and DPR methods cannot obtain correct clustering results because of the large geometric distortions. In their clustering results, some components are forced to stretch to model multiple clusters. Thus, these two methods perform no better than the SRR-SICGR method.

Figure 7.

The clustering results of three methods when TPR = [−10, 10]. (a) The clustering result of the GR method. (b) The clustering result of the DPR method. (c) The clustering result of the SRR-SICGR method.

Figure 8.

The clustering results of three methods when TPR = [−25, 25]. (a) The clustering result of the GR method. (b) The clustering result of the DPR method. (c) The clustering result of the SRR-SICGR method.
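To illustrate why misregistration produces extra or stretched clusters, the sketch below builds joint intensity vectors from a toy two-image group. It assumes that JISP denotes the joint intensity scatter plot, in which each point is the vector of intensities observed at one pixel location across all images. The function name and the toy images are ours and are not taken from the experiments.

```python
import numpy as np


def joint_intensity_vectors(images):
    """Stack co-located pixel intensities of N same-sized images into an (H*W, N) array."""
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        raise ValueError("all images must share one shape")
    return np.stack([img.ravel() for img in images], axis=1)


# Toy scene: a square on a background, seen with different contrast by two "sensors".
a = np.zeros((8, 8))
a[2:6, 2:6] = 1.0
b = np.full((8, 8), 0.2)
b[2:6, 2:6] = 0.7

vectors = joint_intensity_vectors([a, b])
clusters = np.unique(vectors, axis=0)            # distinct joint intensities when aligned

b_shifted = np.roll(b, 2, axis=1)                # simulate a 2-pixel misregistration
clusters_shifted = np.unique(joint_intensity_vectors([a, b_shifted]), axis=0)

print(vectors.shape, len(clusters), len(clusters_shifted))
```

When the toy images are aligned, only two distinct joint intensities occur (background and square). After the shift, four occur, because square pixels in one image now pair with background pixels in the other.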

Medical images

The registration results for medical images are shown in Tables 8 and 9. When TPR is [−10, 10], the success rate of the SIFTR method is lower than that of the other methods, and the GR and SRR-SICGR methods have the lowest mean errors. When TPR changes from [−15, 15] to [−20, 20], the SICGR method has the highest success rate and the lowest mean error. As the TPR is expanded further, the SRR-SICGR method outperforms the other methods in both registration success rate and registration accuracy.

Table 8.

Pair registration success rates of different methods for medical images within different displacement ranges.

Method	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	0.95	1	1	1	1
TPR = [−15, 15]	0.95	0.9	0.85	1	1
TPR = [−20, 20]	0.8	0.95	0.85	0.99	0.98
TPR = [−25, 25]	0.75	0.9	0.9	0.8	0.95

Note: Italics indicate highest pair success rates in each row.

Table 9.

Mean registration errors of different methods for medical images within different displacement ranges.

Method		SIFTR	SRR	GR	DPR	SICGR	SRR-SICGR
TPR = [−10, 10]	(a) and (b)	68.63	1.07	0.38	0.65	0.34	0.36
	(a) to (c)	57.26	1.44	0.39	0.89	0.46	0.41
	Mean	62.95	1.26	0.39	0.76	0.4	0.39
TPR = [−15, 15]	(a) and (b)	73.3	0.69	1.16	4.11	0.33	0.4
	(a) to (c)	58.19	1.55	0.97	1.17	0.47	0.41
	Mean	63.93	1.12	1.07	2.64	0.4	0.41
TPR = [−20, 20]	(a) and (b)	75.53	0.56	0.34	1.72	0.37	0.61
	(a) to (c)	52.32	9.24	2.95	2.92	0.56	0.91
	Mean	63.93	4.9	1.64	2.32	0.47	0.76
TPR = [−25, 25]	(a) and (b)	83.4	9.04	0.4	0.64	10.46	1.36
	(a) to (c)	86.2	6.27	7.67	4.69	5.95	1.31
	Mean	84.8	7.66	4.04	2.66	8.21	1.34

Note: Italics indicate lowest mean registration errors in each row.

Conclusion

In this article, a novel coarse-to-fine approach is proposed for groupwise image registration. In the coarse registration step, our approach divides the unregistered images into a reference image set and a float image set. The images of the two sets are registered based on segmented region matching. The coarse registration results are used as an initial solution for the next step. In the fine registration step, our approach uses a GMM with a local template to model the joint intensity of the coarse-registered images. The weights of the template are calculated from both the spatial distance and the intensity difference between the central intensity vector and its adjacent vectors. Based on this GMM, an ML framework is constructed, and an MML criterion-based method is employed to determine the unknown number of mixing components. The registration problem is solved by estimating the relevant parameters. Unlike previous groupwise registration approaches, the proposed approach combines the advantages of feature-based and intensity-based methods. Different multisensor image data sets are used to evaluate the performance of the proposed approach, and the registration results support its competitiveness and robustness.
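As an illustration of template weights that depend on both spatial distance and intensity difference, the sketch below uses a bilateral-style Gaussian weighting. This is an assumed form chosen for illustration only: the weight formula actually used by the proposed approach is given in the method section and may differ. The function name and the parameters `sigma_s` and `sigma_r` are hypothetical.

```python
import numpy as np


def template_weights(patch, sigma_s=1.0, sigma_r=0.1):
    """Bilateral-style weights for a (k, k, N) patch of N-dimensional intensity vectors.

    Each weight decays with the spatial distance to the patch centre and with the
    squared intensity difference from the central vector (assumed form).
    """
    k = patch.shape[0]
    c = k // 2
    yy, xx = np.mgrid[0:k, 0:k]
    spatial_sq = (yy - c) ** 2 + (xx - c) ** 2
    intensity_sq = np.sum((patch - patch[c, c]) ** 2, axis=-1)
    w = np.exp(-spatial_sq / (2 * sigma_s ** 2) - intensity_sq / (2 * sigma_r ** 2))
    return w / w.sum()  # normalise so the weights sum to one


# 3x3 neighbourhood from a group of two images; one neighbour differs from the centre.
patch = np.zeros((3, 3, 2))
patch[0, 1] = 1.0
weights = template_weights(patch)
print(weights.round(3))
```

With this weighting, a neighbour whose intensity vector differs strongly from the central one contributes almost nothing, while similar neighbours are weighted by proximity alone.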

Footnotes

Author’s Notes

Yinghao Li is also affiliated with the College of Software and Applied Science and Technology, Zhengzhou University, Zhengzhou, China.

Acknowledgements

This research was made possible by the following organizations, which provided the data: the Retrospective Image Registration Evaluation project, Landsat, and the Yale Face Database.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is jointly supported by Scientific and Technological Research Program of Chongqing Municipal Education Commission—China (grant no. KJ130516, YJG133005), Open Fund of State Key Laboratory of Remote Sensing Science—China (grant no. OFSLRSS201419), the Research Funds of Chongqing Science and Technology Commission—China (grant no. cstc2013jcyjA40042), and the National Natural Science Foundation of China—China (grant nos 61301033, 61403349, and 61309013).

References

1. Zitova, Flusser. Image registration methods: a survey. Image Vision Comput 2003; 21(11): 977–1000. DOI: 10.1016/S0262-8856(03)00137-9.

2. Zhu, Liu, Yuan. Superresolution reconstruction of video sequence using a coarse-to-fine registration and optimal interpolation strategy. Int J Adv Robot Syst 2013; 10: 245. DOI: 10.5772/56451.

3. Yang, Cohen. Image registration and object recognition using affine invariants and convex hulls. IEEE Trans Image Process 1999; 8(7): 934–946. DOI: 10.1109/83.772236.

4. Kwak, Kang. Oriented edge-based feature descriptor for multi-sensor image alignment and enhancement. Int J Adv Robot Syst 2013; 10: 343. DOI: 10.5772/56788.

5. Fouad, Dansereau, Whitehead. Image registration under illumination variations using region-based confidence weighted m-estimators. IEEE Trans Image Process 2012; 21(3): 1046–1060. DOI: 10.1109/TIP.2011.2167344.

6. Chan JCW, Canters. Fully automatic subpixel image registration of multiangle CHRIS/Proba data. IEEE Trans Geosci Remote Sens 2010; 48(7): 2829–2839. DOI: 10.1109/TGRS.2010.2042813.

7. Gong, Zhao, Jiao. A novel coarse-to-fine scheme for automatic image registration based on SIFT and mutual information. IEEE Trans Geosci Remote Sens 2014; 52(7): 4328–4338. DOI: 10.1109/TGRS.2013.2281391.

8. Kim, Fessler. Intensity-based image registration using robust correlation coefficients. IEEE Trans Med Imag 2004; 23(11): 1430–1444. DOI: 10.1109/TMI.2004.835313.

9. Viola, Wells. Alignment by maximization of mutual information. Int J Comput Vis 1997; 24(2): 137–154. DOI: 10.1023/A:1007958904918.

10. Woods, Grafton, Holmes. Automated image registration: I. General methods and intrasubject, intramodality validation. J Comput Assist Tomo 1998; 22(1): 139–152. DOI: 10.1097/00004728-199801000-00027.

11. Lorenzen, Prastawa, Davis. Multi-modal image set registration and atlas formation. Med Image Anal 2006; 10(3): 440–451. DOI: 10.1016/j.media.2005.03.002.

12. Neemuchwala, Hero, Carson. Image matching using alpha-entropy measures and entropic graphs. Signal Process 2005; 85(2): 277–296. DOI: 10.1016/j.sigpro.2004.10.002.

13. Guimond, Roche, Ayache. Three-dimensional multimodal brain warping using the demons algorithm and adaptive intensity corrections. IEEE Trans Med Imag 2001; 20(1): 58–69. DOI: 10.1109/42.906425.

14. Leventon, Grimson WEL. Multi-modal volume registration using joint intensity distributions. In: 1st International Conference on Medical Image Computing and Computer-Assisted Intervention (eds WM Wells, Colchester, Delp), Cambridge, MA, 11–13 October 1998, pp. 1057–1066. Berlin: Springer-Verlag.

15. Orchard, Mann. Registering a multisensor ensemble of images. IEEE Trans Image Process 2010; 19(5): 1236–1247. DOI: 10.1109/TIP.2009.2039371.

16. Špiclin, Likar, Pernus. Groupwise registration of multimodal images by an efficient joint entropy minimization scheme. IEEE Trans Image Process 2012; 21(5): 2546–2558. DOI: 10.1109/TIP.2012.2186145.

17. Zhu. Ensemble registration of multisensor images by a variational Bayesian approach. IEEE Sens J 2014; 14(8): 2698–2705. DOI: 10.1109/JSEN.2014.2315838.

18. Wang, Komodakis, Paragios. Markov random field modeling, inference & learning in computer vision & image understanding: a survey. Comput Vis Image Underst 2013; 117(11): 1610–1627. DOI: 10.1016/j.cviu.2013.07.004.

19. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. IEEE Proc 1989; 77(2): 257–286. DOI: 10.1109/5.18626.

20. Zhang, Brady, Smith. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans Med Imag 2001; 20(1): 45–57. DOI: 10.1109/42.906424.

21. Nguyen, Wu QMJ, Ahuja. An extension of the standard mixture model for image segmentation. IEEE Trans Neural Netw 2010; 21(8): 1326–1338. DOI: 10.1109/TNN.2010.2054109.

22. Nguyen, Wu QMJ. Gaussian-mixture-model-based spatial neighborhood relationships for pixel labeling problem. IEEE Trans Syst Man Cybern B Cybern 2012; 42(1): 193–202. DOI: 10.1109/TSMCB.2011.2161284.

23. Zhang, Wu QMJ, Nguyen. Incorporating mean template into finite mixture model for image segmentation. IEEE Trans Neural Netw Learn Syst 2013; 24(2): 328–335. DOI: 10.1109/TNNLS.2012.2228227.

24. Pal, Keller. A possibilistic fuzzy c-means clustering algorithm. IEEE Trans Fuzz Syst 2005; 13(4): 517–530. DOI: 10.1109/TFUZZ.2004.840099.

25. Pham, Prince. Adaptive fuzzy segmentation of magnetic resonance images. IEEE Trans Med Imag 1999; 18(9): 737–752. DOI: 10.1109/42.802752.

26. Ahmed, Yamany, Mohamed. A modified fuzzy C-means algorithm for bias field estimation and segmentation of MRI data. IEEE Trans Med Imag 2002; 21(3): 193–199. DOI: 10.1109/42.996338.

27. Hansson-Sandsten, Brynolfsson. The scaled reassigned spectrogram with perfect localization for estimation of Gaussian functions. IEEE Signal Process Lett 2015; 22(1): 100–104. DOI: 10.1109/LSP.2014.2350030.

28. Figueiredo MAT, Jain. Unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell 2002; 24(3): 381–396. DOI: 10.1109/34.990138.

29. Rudin. Real and complex analysis. 3rd ed. New York: McGraw-Hill, 1987, p. 416.

30. Georghiades, Belhumeur, Kriegman. From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Trans Pattern Anal Mach Intell 2001; 23(6): 643–660. DOI: 10.1109/34.927464.