Sage Journals: Discover world-class research

Abstract

To assess the performance of multisensor image registration algorithms used in multirobot information fusion, we propose a model based on structural similarity, named the vision registration assessment model. First, this article introduces a new image concept, the superimposed image, for testing subjective and objective assessment methods. We therefore assess the superimposed image rather than the registered image, which differs from previous image registration assessment methods that usually use the reference and sensed images. Then, we calculate eight assessment indicators from different aspects for the superimposed images. After that, the vision registration assessment model fuses the eight indicators using canonical correlation analysis, so that the quality of an image registration result is evaluated from different aspects. Finally, three kinds of images (optical images, infrared images, and SAR images) are used to test the vision registration assessment model. After evaluating three state-of-the-art image registration methods, experiments indicate that the proposed structural similarity-motivated model achieves almost the same evaluation results as human observers, with a consistency rate of 98.3%, which shows that the vision registration assessment model is efficient and robust for evaluating multisensor image registration algorithms. Moreover, unlike a human observer, the vision registration assessment model is independent of emotional factors and the outside environment.

Keywords

Multisensor image registration, image registration assessment, human vision system, structural similarity, superimposed image

Introduction

A team of connected robots, each integrating different kinds of sensors, can perform a better job than any individual robot is capable of.^1,2 To improve the abilities of multirobot systems, information fusion is a necessary task to suppress the noise in a multiagent environment.³ Finding a way to most effectively utilize the information captured from these multiple sensors, possibly of different modalities, is of considerable interest. Image fusion provides one versatile solution, where multiple aligned images acquired by different sensors are merged into a composite image. The properly registered image is more informative than any of the individual input images and can thus better interpret the scene.^4,5 Consequently, accurate multisensor registration is a key procedure that transforms information provided by different sensors with multiple spatial and spectral resolutions into the same coordinate model for image fusion.^{6–8} To date, surveys on image registration methods can be found in the literature.^{9–13} Additionally, Pomerleau and Colas reviewed point cloud registration algorithms for mobile robotics.¹⁴ Moreover, multisensor image registration is also important for simultaneous localization and mapping (SLAM), which is the core of robot self-localization.^{15–18}

To achieve multisensor image registration for different types of images, including visible light images, infrared images, and synthetic aperture radar (SAR) images, quality evaluation methods for multisource sensor image registration must be able to handle image diversity.¹⁹ In addition, because image scenes are diverse and complex, the evaluation method needs wide applicability so that it can handle a variety of complex situations. This is one of the open problems in the field of image registration quality evaluation, namely universal applicability,²⁰ and it places very high demands on the performance of image registration quality evaluation methods.

In the study of image registration, scholars mainly focus on the registration algorithm itself. The evaluation of image registration results has not attracted much attention. Therefore, it is necessary to study assessment models for image registration algorithms. A study of existing image registration quality evaluation algorithms shows that the commonly used evaluation methods apply to only one or a few specific image registration scenarios.^{21–23}

At present, researchers have done a lot of work on image registration quality evaluation and have proposed evaluation criteria for different fields. Bouchiha and Besbes used recall, based on the number of correctly matched interest points and the number of actually existing matches, to compare the remote sensing image registration performance of four different features (Scale-invariant feature transform [SIFT], Gradient location-orientation histogram [GLOH], Speeded Up Robust Features [SURF], and Open-Speeded Up Robust Features [O-SURF]).²⁴ Thor and his colleagues proposed a quantitative measure that included a contour similarity measure, Dice’s similarity coefficient (DSC), for deformable medical image registration quality evaluation.²⁵ Bharatha and his coauthors used the feature matching rate and DSC to evaluate three-dimensional finite element-based deformable magnetic resonance image registration.²⁶ Liu and his colleagues used kernel sparse coding for object recognition.²⁷ However, those evaluation methods focused on specific areas and were less robust under the complex conditions of multisensor image registration.

Inspired by the human vision system (HVS) and the existing image registration quality evaluation criteria,^28,29 we built a quality assessment model of image registration with wide applicability and human visual characteristics. Experimental results show that the proposed evaluation model is robust to local distortion. Additionally, our model can be used to evaluate registration algorithms in a variety of application scenarios.

This article is organized as follows. In section “Related work,” we present related work on subjective and objective evaluation methods. In section “Image registration assessment model based on SSIM,” we describe the proposed algorithm and its improvement by fusing eight indicators. In section “Experimental results and analysis,” we present numerical simulations that compare the evaluation results of our proposed model with those of the HVS. Three state-of-the-art image registration algorithms are used to register the multisensor images. Finally, conclusions and suggestions for further work are summarized in section “Conclusion and future work.”

Related work

According to our previous research, image registration quality evaluation is generally divided into two types: (1) subjective evaluation methods, in which human observers view the images and rate the quality of the registered image according to the options provided; and (2) objective evaluation methods, which rely on a mathematical model to compute quantities such as the mean value, variance, or gradient of an image.

Subjective assessment

The subjective image registration assessment methods based on HVS are fast and simple.³⁰ However, those methods are one-sided and poorly reproducible, because the quality of the image registration result is mainly determined by the observer.³¹ Moreover, when the observer is affected by psychological changes and the observation environment, the evaluation results vary and become less accurate.³² Therefore, objective evaluation methods have been developed, and a large number of objective evaluation models have been proposed.

Objective assessment

The objective assessment algorithms of image registration can be divided into two categories: (1) direct assessment methods and (2) indirect assessment methods. Christensen and Crum proposed a method based on inverse consistency error (ICE) for assessing the quality of image registration.^33,34 However, ICE fails to evaluate image registration methods when the image background is flat. In image registration, it is generally assumed that the cross-correlation mapping between the reference image and the image to be registered is one-to-one. In other words, any point in the sensed image s(x, y) maps to exactly one point in the reference image r(x, y). However, in practical applications, the mapping T_sr obtained from the registration of s(x, y) to r(x, y) and the mapping T_rs obtained from the registration of r(x, y) to s(x, y) are generally not inverses of each other.³⁵ Hardcastle and his colleagues proposed a new method to evaluate the pros and cons of the transitive property and performed a statistical analysis of the image registration results.³⁶ When two transformations are merged into one, the transitive property is very important for minimizing non-uniformity errors. Ideally, transitivity means that the transformations among three different images t₁, t₂, and t₃ correctly map any point from t₁ to t₂, then from t₂ to t₃, and finally from t₃ back to t₁, without changing the geometric position of the point. Here t₁, t₂, and t₃ are the reference image, the sensed image, and the registered image, respectively. The deviation of this composed mapping from the identity mapping is defined as the transitivity error (TE) of the transformations in the work of Hardcastle et al.³⁷ Christensen and Johnson used TE to evaluate their non-rigid image registration.³⁸ However, new errors are introduced when an image registration method is evaluated using TE.
Moreover, other researchers have used evaluation methods based on the root mean square error (RMSE) to measure the discrepancy between the registered image and the reference image.^39,40 Rui and his colleagues proposed a robust shape-matching method for multisensor imagery and used RMSE and standard deviation to assess it.⁴¹ Kumar and his coauthors also used RMSE to evaluate their proposed methods.⁴² The smaller the RMSE, the better the performance of the image registration algorithm. Additionally, information entropy is a quantitative measure of the information transmitted by the image, which is useful for evaluating image registration methods.⁴³ Tsai and his colleagues used the difference of entropy information (DEn) to assess their proposed algorithm.⁴⁴ However, this method is limited in evaluating medical images. Therefore, Melbourne and his coauthors used the image structural (ST) contrast to evaluate image registration results.⁴⁵ However, ST is useful only for nonlinear image registration algorithms.
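The inverse consistency and transitivity ideas described above can be illustrated with a small numerical sketch. It assumes, purely for illustration, that each registration is a 2D affine map stored as a 3 × 3 homogeneous matrix; the function names are hypothetical and are not taken from the cited works, which also cover non-rigid transformations.

```python
import numpy as np

def apply_affine(A, points):
    """Apply a 3x3 homogeneous affine matrix to an (n, 2) array of points."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (homog @ A.T)[:, :2]

def inverse_consistency_error(T_sr, T_rs, points):
    """Mean distance between T_sr(p) and T_rs^{-1}(p).

    The value is zero when T_rs is the exact inverse of T_sr.
    """
    forward = apply_affine(T_sr, points)
    backward_inv = apply_affine(np.linalg.inv(T_rs), points)
    return float(np.linalg.norm(forward - backward_inv, axis=1).mean())

def transitivity_error(h12, h23, h31, points):
    """Mean distance between h12(h23(h31(x))) and x over the sample points."""
    loop = h12 @ h23 @ h31  # h31 is applied first, then h23, then h12
    moved = apply_affine(loop, points)
    return float(np.linalg.norm(moved - points, axis=1).mean())

# Illustrative data (assumed): a pure translation by (2, 3).
T = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]])
identity = np.eye(3)
pts = np.array([[0.0, 0.0], [5.0, 1.0], [2.0, 7.0]])
```

With these sample maps, both errors vanish when the backward map is the true inverse and equal the translation length otherwise, which is the behavior the two criteria are designed to expose.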

However, the objective image registration quality assessment algorithms currently in use cannot meet the two requirements stated above (handling image diversity and universal applicability), and their results are difficult to reconcile with assessments made by human eyes. Inspired by the image quality evaluation criterion based on structural similarity (SSIM), we propose a new robust assessment model that combines the advantages of subjective and objective evaluation methods for multisensor image registration algorithms.

Image registration assessment model based on SSIM

Multisensor image registration is applied to different types of images, including optical images, infrared images, and SAR images. Therefore, an objective assessment method for multisensor image registration must be able to handle image diversity. In addition, because image scenes are diverse and complex, the image registration assessment method must have wide applicability and be able to handle a variety of complex situations.

Note that our proposed assessment model is inspired by the structural similarity model, which approximates the processing of human eyes. With the development of HVS research, Wang and his coauthors proposed an image quality evaluation criterion based on structural similarity.⁴³ They argue that image structure information is independent of brightness and contrast, so the evaluation of image quality can be approximated by evaluating image structure similarity, brightness, and contrast.³⁹ In other words, this method evaluates image quality directly by calculating the similarity of image structures, which overcomes the difficulties of complex image scenes and multichannel decorrelation. Moreover, according to our previous research and analysis, we introduce eight evaluation factors as parameters of our proposed model, which assesses the performance of image registration methods from different aspects. The vision registration assessment model (V-RAM) can then give a reasonable and robust assessment result.

The SSIM model presented by Wang et al. describes the relationship among the correlation contrast c(r, s), brightness l(r, s), and structure a(r, s).⁴⁶ The relationship among the three elements is shown as follows

SSIM (r, s) = {[l (r, s)]}^{α} \times {[c (r, s)]}^{β} \times {[a (r, s)]}^{γ}

where

l (r, s) = \frac{2 μ_{r} μ_{s} + C_{1}}{μ_{r}^{2} + μ_{s}^{2} + C_{1}}, c (r, s) = \frac{2 σ_{r} σ_{s} + C_{2}}{σ_{r}^{2} + σ_{s}^{2} + C_{2}}, a (r, s) = \frac{σ_{r s} + C_{3}}{σ_{r} σ_{s} + C_{3}}

where r and s are two images with a certain correlation, whose gray-scale means are μ_r and μ_s and whose standard deviations are σ_r and σ_s, respectively, and σ_rs is their covariance. To ensure the stability of the brightness, contrast, and correlation terms and to guarantee that the denominators are non-zero, the small positive constants C₁, C₂, and C₃ are introduced. Besides, α > 0, β > 0, and γ > 0.
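As a minimal sketch, the three SSIM terms can be computed over a whole image pair as follows. The single global window, the constants C₁ = (0.01 L)², C₂ = (0.03 L)², C₃ = C₂/2 with L = 255, and the default α = β = γ = 1 are common choices that are assumed here; they are not values stated in this article.

```python
import numpy as np

def ssim_global(r, s, alpha=1.0, beta=1.0, gamma=1.0, dynamic_range=255.0):
    """Global SSIM of two equally sized gray-scale images (single window)."""
    r = np.asarray(r, dtype=np.float64)
    s = np.asarray(s, dtype=np.float64)
    # Stabilising constants (assumed conventional values, not from the article).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    c3 = c2 / 2.0
    mu_r, mu_s = r.mean(), s.mean()
    sigma_r, sigma_s = r.std(), s.std()
    sigma_rs = ((r - mu_r) * (s - mu_s)).mean()
    lum = (2 * mu_r * mu_s + c1) / (mu_r ** 2 + mu_s ** 2 + c1)             # l(r, s)
    con = (2 * sigma_r * sigma_s + c2) / (sigma_r ** 2 + sigma_s ** 2 + c2)  # c(r, s)
    struct = (sigma_rs + c3) / (sigma_r * sigma_s + c3)                      # a(r, s)
    return float(lum ** alpha * con ** beta * struct ** gamma)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32))
```

Comparing an image with itself gives a value of 1, while comparing it with its gray-level inverse gives a value below 1 because the structure term becomes negative. Practical SSIM implementations usually average the index over local windows rather than using one global window.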

Superimposed image

Generally, the reference image r(x, y) and the registered image g(x, y) are used in the image registration evaluation.^{47–49} However, the assessment model proposed in this article evaluates only the superimposed image SI(x, y), which is defined as follows

SI (x, y) = \sum_{x = 1}^{x = M'} \sum_{y = 1}^{y = N'} \frac{| r_{ext} (x, y) - g_{ext} (x, y) |}{2}

where r_ext(x, y) is an external image of the reference image r(x, y) and g_ext(x, y) is an external image of the registered image g(x, y). M′ is the number of rows of the external image and N′ is the number of columns of the external image. In our manuscript, M′ and N′ are determined by the following formula

\begin{cases} M' = κ \times M \\ N' = κ \times N \end{cases}

where M and N are the width and height of the reference image and κ is the weight. In order to include all of the overlapping regions, we make κ > 1.5.
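A minimal sketch of how a superimposed image could be built is given below. It assumes that the sums in the definition range over all pixel positions, so that SI is formed pixel by pixel as |r_ext − g_ext| / 2, and that each external image is obtained by placing the original at the center of a zero-valued canvas of size κM × κN. Both the centering and the zero padding are assumptions of this sketch, and the function names are hypothetical.

```python
import numpy as np

def extend(image, kappa=1.6):
    """Place `image` at the center of a zero canvas of size (kappa*M, kappa*N)."""
    m, n = image.shape
    m_ext, n_ext = int(np.ceil(kappa * m)), int(np.ceil(kappa * n))
    canvas = np.zeros((m_ext, n_ext), dtype=np.float64)
    top, left = (m_ext - m) // 2, (n_ext - n) // 2
    canvas[top:top + m, left:left + n] = image
    return canvas

def superimposed_image(r, g, kappa=1.6):
    """Pixelwise half absolute difference of the two external images.

    `r` and `g` are assumed to be gray-scale arrays of the same shape,
    with `g` already resampled into the coordinate frame of `r`.
    """
    r_ext, g_ext = extend(r, kappa), extend(g, kappa)
    return np.abs(r_ext - g_ext) / 2.0

r_demo = np.full((10, 20), 200.0)
g_demo = np.full((10, 20), 100.0)
si_demo = superimposed_image(r_demo, g_demo, kappa=2.0)
```

In this toy case the superimposed image is zero in the padded border and 50 inside the overlap, and it is identically zero when the two inputs coincide.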

SSIM-motivated registration assessment model

In our model, some previous indicators are introduced. In the first place, the function of this model is shown as follows

V - RAM (g, r, g t) = F ({RO}_{gr}, {EVA}_{add}, {ST}_{gr}, {NRMSE}_{rg}, {TE}_{g}, {ICE}_{g}, {DEn}_{rg}, T_{r g})

According to equation (3), there are eight indicators in V-RAM. According to their relationship with the registration algorithm, those indicators can be divided into two categories: (1) the indicator class PR, which is proportional to the performance of the algorithm, that is, the greater the value of the indicator, the better the performance of the registration algorithm; and (2) the indicator class NE, which is inversely proportional to the performance of the image registration algorithm, that is, the smaller the value of the indicator, the better the performance of the registration algorithm. According to this classification criterion, the indicators are classified as follows

\{{RO}_{gr}, {EVA}_{add}, {ST}_{gr}\} \in PR

\{{NRMSE}_{rg}, {TE}_{g}, {ICE}_{g}, {DEn}_{rg}, T_{r g}\} \in NE

where RO_gr is the relative overlap region between r(x, y) and g(x, y) and EVA_add is the image definition of the superimposed image SI(x, y).⁵⁰ ST_gr is the structural contrast between r(x, y) and g(x, y), NRMSE_rg is the normalized RMSE between r(x, y) and g(x, y), TE_g is the TE of g(x, y), ICE_g is the inverse consistency error of g(x, y), DEn_rg is the entropy of the difference image of the overlap region of r(x, y) and g(x, y), and T_rg is the running time of the image registration method. The eight indicators are defined as follows

R O_{gr} = \frac{\sum_{i = 1}^{i = W} \sum_{j = 1}^{j = H} (r (i, j) \cap g (i, j))}{\sum_{i = 1}^{i = W} \sum_{j = 1}^{j = H} (r (i, j) \cup g (i, j))}

where W × H is the size of the relative overlap region

{EVA}_{add} = \frac{\sum_{i = 1}^{i = M} \sum_{j = 1}^{j = N} (\frac{dSI}{d (i, j)})}{| σ_{SI} |}

where $\frac{dSI}{d (i, j)}$ represents the rate of the gradation changes

S T_{gr} = \frac{σ_{g r} + C_{g r}}{σ_{g} + σ_{r}}

where C_gr is the covariance of r(x, y) and g(x, y)

σ_{g r} = \frac{1}{N - 1} \sum_{i = 1}^{i = M \times N} (g (x_{i}, y_{i}) - μ_{g}) (r (x_{i}, y_{i}) - μ_{r})

where σ_g and σ_r are the variances of the registered image g(x, y) and the reference image r(x, y), respectively

{NRMSE}_{rg} = \frac{1}{2^{8} - 1} \times \sqrt{\frac{\sum_{x = 1}^{x = M} \sum_{y = 1}^{y = N} | g (x, y) - r (x, y) |^{2}}{M \times N}}

TE_{t_{1}} (x) = \frac{1}{(H - 1) (H - 2)} \sum_{\substack{t_{2} = 1 \\ t_{2} \neq t_{1}}}^{H} \sum_{\substack{t_{3} = 1 \\ t_{3} \neq t_{1} \\ t_{3} \neq t_{2}}}^{H} \| h_{t_{1} t_{2}} (h_{t_{2} t_{3}} (h_{t_{3} t_{1}} (x))) - x \|

where t₁ denotes the reference image, t₂ denotes the sensed image, and t₃ denotes the registered image. H denotes the number of test images that are used for evaluating the performance of the algorithm and || • || is the standard Euclidean norm

{ICE}_{i} (p) = \frac{1}{ImgN} \sum_{j = 1}^{ImgN} \| h_{j i} (p) - h_{i j}^{- 1} (p) \|

where ImgN is the number of the test images, h_ji (p) is the mapping function between r(x, y) and g(x, y), and $h_{i j}^{- 1} (p)$ is the reverse mapping function between g(x, y) and r(x, y)

{DEn}_{rg} = \sum_{i = 1}^{i = M} \sum_{j = 1}^{j = N} P (i, j) \times (- ln P (i, j))

where P(i, j) is the probability of the superimposed image SI(x, y).
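The pixel-based indicators above can be sketched in a few lines. The sketch makes several assumptions that go beyond the text: RO_gr is computed as the intersection over union of two binary footprint masks, NRMSE_rg assumes 8-bit images (matching the 2⁸ − 1 factor), and DEn_rg takes P to be the normalized gray-level histogram of the superimposed image. The function names are hypothetical.

```python
import numpy as np

def relative_overlap(mask_r, mask_g):
    """Intersection over union of two boolean footprint masks."""
    mask_r = np.asarray(mask_r, dtype=bool)
    mask_g = np.asarray(mask_g, dtype=bool)
    union = np.logical_or(mask_r, mask_g).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_r, mask_g).sum() / union)

def nrmse(r, g, bit_depth=8):
    """RMSE between two images, normalized by the maximum gray level."""
    r = np.asarray(r, dtype=np.float64)
    g = np.asarray(g, dtype=np.float64)
    rmse = np.sqrt(np.mean((g - r) ** 2))
    return float(rmse / (2 ** bit_depth - 1))

def difference_entropy(si, bins=256):
    """Shannon entropy (natural log) of the gray-level histogram of `si`."""
    hist, _ = np.histogram(si, bins=bins)
    p = hist[hist > 0] / hist.sum()  # drop empty bins to avoid log(0)
    return float(-(p * np.log(p)).sum())

# Illustrative masks (assumed): two 4x2 footprints shifted by one column.
mask_a = np.zeros((4, 4), dtype=bool)
mask_a[:, :2] = True
mask_b = np.zeros((4, 4), dtype=bool)
mask_b[:, 1:3] = True
half = np.array([[0, 255], [0, 255]])
```

For the two shifted masks the overlap is one third; a two-level image with equal counts has entropy ln 2, and a constant image has entropy 0.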

Note that we assume that PR and NE are independent of each other in our article. Additionally, based on previous research on SSIM, the abovementioned eight indicators are combined in the following function

V - RAM (g, r, s, SI) = \frac{w_{1} \times {RO}_{gr} + w_{2} \times {EVA}_{add} + w_{3} \times {ST}_{gr}}{(v_{1} \times {NRMSE}_{rg} + v_{2} \times {TE}_{g} + v_{3} \times {ICE}_{g} + v_{4} \times {DEn}_{rg} + E P S) \times T_{r g}}

where g is the registered image, r is the reference image, s is the sensed image, and SI is the superimposed image. The vector W consists of w₁ > 0, w₂ > 0, and w₃ > 0, which are the weights of PR. The vector V consists of v₁ > 0, v₂ > 0, v₃ > 0, and v₄ > 0, which are the weights of NE. EPS is a small positive constant used to maintain the stability of our evaluation model. V-RAM(g, r, s, SI) is used only for comparison, so it does not have a physical unit. The larger the V-RAM value, the better the performance of the image registration algorithm.
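The combined function can be written directly as a short sketch. The default weights are the values reported in the “Weight acquisition” section of this article; the EPS value of 1e-6, the function name, and the sample inputs are assumptions chosen only for illustration, and the indicators are assumed to be already normalized.

```python
def v_ram(ro, eva, st, nrmse, te, ice, den, runtime,
          w=(0.65, 0.24, 0.11), v=(0.52, 0.20, 0.18, 0.10), eps=1e-6):
    """Ratio of the weighted PR indicators to the weighted NE indicators.

    `eps` is an assumed stabilising constant; `runtime` is the normalized
    running time T_rg and must be positive.
    """
    pr = w[0] * ro + w[1] * eva + w[2] * st
    ne = v[0] * nrmse + v[1] * te + v[2] * ice + v[3] * den
    return pr / ((ne + eps) * runtime)

# Hypothetical normalized indicator values, not measurements from the article.
baseline = v_ram(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, eps=0.0)
better_overlap = v_ram(0.9, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, eps=0.0)
larger_error = v_ram(0.5, 0.5, 0.5, 0.9, 0.5, 0.5, 0.5, 0.5, eps=0.0)
```

The score rises when a PR indicator such as RO_gr increases and falls when an NE indicator such as NRMSE_rg or the running time increases, which matches the intended reading that a larger V-RAM value indicates a better registration.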

Weight acquisition

To determine the values of W and V, we introduce canonical correlation analysis (CCA) to deal with the interdependence between two sets of variables. CCA, principal component analysis (PCA), and discriminant analysis are effective methods for multivariate analysis.⁵¹ The relationship between {w₁, w₂, w₃} and {v₁, v₂, v₃, v₄} is studied in the same way as in PCA. First, we find a linear combination of the variables in PR and a linear combination of the variables in NE such that the two combinations have the largest correlation coefficient. Then, we look for a second pair of linear combinations that is uncorrelated with the first pair. Continuing in this way, we can extract all the dependencies between the two sets of variables. Based on this approach, the optimal vectors W and V are those that maximize the correlation between the linear combinations of PR and NE, as given by equation (15)

ρ_{P Q} = \frac{W^{T} C_{12} V}{\sqrt{W^{T} C_{11} W} \sqrt{V^{T} C_{22} V}}

where

P = w_{1} \times {RO}_{gr} + w_{2} \times {EVA}_{add} + w_{3} \times {ST}_{gr}

Q = v_{1} \times {NRMSE}_{rg} + v_{2} \times {TE}_{g} + v_{3} \times {ICE}_{g} + v_{4} \times {DEn}_{rg}

C₁₁ is the covariance matrix of PR, C₂₂ is the covariance matrix of NE, and C₁₂ is the cross-covariance matrix between PR and NE. The corresponding W and V are the optimal weights when ρ_PQ reaches its maximum. To maximize equation (15), the following constraints are imposed

Var (P) = W^{T} C_{11} W = 1

Var (Q) = V^{T} C_{22} V = 1

The weights of each indicator are obtained by solving equation (15)

W^{i} = C_{11}^{- \frac{1}{2}} μ^{i}

V^{i} = C_{22}^{- \frac{1}{2}} ν^{i}

where i = 1, 2, 3, ..., r and r = rank(C₁₂) is the number of canonical projection vectors. μ and ν are unit orthogonal eigenvectors. In our experiment, the weights of W and V in V-RAM are as follows

w_{1} = 0.65, w_{2} = 0.24, w_{3} = 0.11

v_{1} = 0.52, v_{2} = 0.20, v_{3} = 0.18, v_{4} = 0.10
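The CCA step can be sketched with plain linear algebra. The sketch assumes the standard formulation: both indicator sets are centered, whitened with the inverse square roots of their covariance matrices, and the first pair of singular vectors of the whitened cross-covariance gives the first canonical pair. The synthetic data and the function names are hypothetical.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def cca_first_pair(X, Y):
    """First canonical pair for X (n, p) PR indicators and Y (n, q) NE indicators."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    n = len(X)
    C11 = Xc.T @ Xc / (n - 1)
    C22 = Yc.T @ Yc / (n - 1)
    C12 = Xc.T @ Yc / (n - 1)
    C11_is, C22_is = inv_sqrt(C11), inv_sqrt(C22)
    U, S, Vt = np.linalg.svd(C11_is @ C12 @ C22_is)
    W = C11_is @ U[:, 0]   # W = C11^{-1/2} mu
    V = C22_is @ Vt[0]     # V = C22^{-1/2} nu
    return W, V, float(S[0])

# Synthetic indicator samples (assumed data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X @ rng.normal(size=(3, 4)) + 0.5 * rng.normal(size=(200, 4))
W, V, rho = cca_first_pair(X, Y)
P, Q = X @ W, Y @ V
```

The projections P and Q have unit variance and their correlation equals the largest canonical correlation. Canonical weights obtained this way can be negative and do not sum to 1; how they are rescaled to the positive weights reported above is not described in this article.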

Flowchart of our proposed algorithm

The outlines of our algorithm are shown as follows:

Before evaluating the image registration result, the input image of V-RAM needs to be preprocessed. One of the key steps in this preprocessing stage is to obtain the image overlay, from which the region of interest (ROI) of the overlay image is evaluated. The difference in gray level between heterologous images is known to be very large, so calculating the evaluation indices of r(x, y) and g(x, y) directly would cause a large error. However, we can eliminate this difference based on equation (14). Moreover, the evaluation accuracy of the V-RAM model can be improved by using the superimposed image SI(x, y).

When evaluating the performance of color image registration, it is necessary to decompose the image into its different channels for assessment.

After the above two processing steps, the evaluation indicators of each channel are calculated.

According to the application scenario of the image registration algorithms, we obtain the image registration evaluation results based on the V-RAM model.
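The channel decomposition step in the outline above can be sketched as follows. The helper applies any two-image indicator to each color channel separately; the mean absolute difference used here is only a stand-in indicator, and the way the per-channel values are merged afterwards is not specified in this article, so it is left open.

```python
import numpy as np

def per_channel(indicator, r, g):
    """Apply a two-image indicator to each channel of `r` and `g` separately."""
    r = np.atleast_3d(r)  # a gray-scale (M, N) image becomes (M, N, 1)
    g = np.atleast_3d(g)
    return [indicator(r[..., k], g[..., k]) for k in range(r.shape[-1])]

def mean_abs_diff(a, b):
    """Stand-in indicator (assumed): mean absolute gray-level difference."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

# Illustrative color images (assumed): only the first channel differs.
color_r = np.zeros((4, 4, 3))
color_g = color_r.copy()
color_g[..., 0] += 10
channel_scores = per_channel(mean_abs_diff, color_r, color_g)
```

For this pair the result is one score per channel, with a non-zero value only for the channel that was changed; a gray-scale input yields a single score.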

Experimental results and analysis

Materials

In our experiment, a database for image registration was created by our lab. The database has 1200 images, including optical images, infrared images (including near-infrared images with wavelengths of 780 to 3000 nm), and SAR images. Moreover, to test the V-RAM model on multisensor registration algorithms, we divided the database into 20 groups based on scenarios, denoted by {D₁, D₂, ..., D₁₈, D₁₉, D₂₀}. The images in our database have a resolution of either 800 × 800 pixels or 1024 × 800 pixels. An example of our database is shown in Figure 1.

Figure 1.

Example images of D₁. (a) An optical image of a tree. (b) An infrared image of a tree. (c) An optical image of a city. (d) An SAR image of a city. (e) An optical image of the Pentagon. (f) An optical image of the Pentagon.

Three registration algorithms are selected for testing our assessment model: Scale invariable-Features from accelerated segment test (SI-FAST),⁵² Fast Fourier-Mellin Transform (FAST-FMT),⁴⁰ and Difference of Gaussian-Local Binary Patterns (DoG-LBP).³¹

Objective evaluation results and analysis of image registration quality

The images in {D₁, D₂, ..., D₁₈, D₁₉, D₂₀} are registered using SI-FAST, FAST-FMT, and DoG-LBP. Then, we evaluate the image registration quality using V-RAM. In this article, we present the distribution of the eight evaluation indicators for each registration algorithm. Additionally, all eight evaluation indicators are normalized for comparing SI-FAST, FAST-FMT, and DoG-LBP.

After obtaining the registration results, we tested the superimposed images obtained by combining the reference image and the registered image. To test the subjective assessment method, 30 students in our lab were chosen to evaluate the multisensor image registration results visually. The students spent 1 week completing the performance assessment of the image registration algorithms. The subjective results are also used as the ground truth against which the objective results calculated by the V-RAM model are compared. The subjective evaluation results are shown in Figure 2.

Figure 2.

The superimposed image and their zoom-in regions. (a) SI-FAST, (b) FAST-FMT, (c) DoG-LBP.

To obtain the subjective results, each student was required to observe the superimposed images and give a score for each one. Since 1200 images are registered by the three registration algorithms, the number of superimposed images is 3600. Moreover, 30 students were asked to view and score the superimposed images, which gave 10,800 subjective evaluation results. However, owing to limited space, the individual scores for each superimposed image are not shown in Figure 3; we show only each student's mean score over all the superimposed images. Scores range from 0 to 10.

Figure 3.

The subjective results based on human eyes.

According to Figure 3, the subjective evaluation results based on human eyes differ from student to student, even though the students viewed the same superimposed images. These results imply that the subjective assessment method is easily affected by emotional factors. Moreover, the subjective assessment method is necessary for evaluating the multisensor image registration algorithms.

Objective evaluation for multisensor image registration algorithms

In this section, we first analyze the effect of the three registration algorithms introduced in this article on the eight indicators of the V-RAM model. The objective evaluation results of FAST-FMT, SI-FAST, and DoG-LBP are shown in Figures 4 to 6, respectively. The results of V-RAM are presented as box plots, which provide a visualization of summary statistics for sample data.
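The summary statistics behind a box plot can be sketched as follows. The sketch assumes the common convention that points lying more than 1.5 times the interquartile range beyond the quartiles are drawn as outlier markers; the article does not state which convention its plots use, and the sample values are hypothetical.

```python
import numpy as np

def box_stats(values, whisker=1.5):
    """Quartiles, interquartile range, and outliers of a 1D sample."""
    values = np.asarray(values, dtype=np.float64)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1  # the length of the box
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = values[(values < low) | (values > high)]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical normalized indicator values with one mismatched registration.
sample = [0.40, 0.42, 0.44, 0.46, 0.48, 0.95]
stats = box_stats(sample)
```

In this sample only the value 0.95 falls outside the whisker range, so it would be drawn as a separate marker, while a longer box (larger interquartile range) reflects an indicator that varies more across scenes.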

Figure 4.

The evaluation result for FAST-FMT based on the V-RAM model. V-RAM: vision registration assessment model.

Figure 5.

The evaluation result for SI-FAST based on the V-RAM model. V-RAM: vision registration assessment model.

Figure 6.

The evaluation result for Difference of Gaussian-Local Binary Patterns (DoG-LBP) based on the V-RAM model. V-RAM: vision registration assessment model.

From Figure 4, we can find that:

The means of RO_gr, EVA_add, ST_gr, NRMSE_rg, TE_g, ICE_g, DEn_rg, and T_rg are 0.46, 0.48, 0.39, 0.56, 0.48, 0.62, 0.62, and 0.42, respectively.

The box of DEn_rg is longer than that of the other indicators, which indicates that this indicator is heavily affected by the scene of the image itself.

The red marker “+” in the corresponding box diagram of NRMSE_rg indicates that there is an abnormality in the data of the registration result, which is due to a mismatch of the image features. However, there is no data abnormality in the boxes of other indicators. Therefore, we consider that the NRMSE_rg of V-RAM model is sensitive to the accuracy of registration results.

According to Figure 5, we can find that:

The means of RO_gr, EVA_add, ST_gr, NRMSE_rg, TE_g, ICE_g, DEn_rg, and T_rg are 0.54, 0.42, 0.43, 0.45, 0.59, 0.52, 0.53, and 0.58, respectively.

The box length of TE_g is longer than the box length of other indicators, which indicates that this indicator is affected by the scene where the image is taken.

As can be seen from Figure 6:

The means of RO_gr, EVA_add, ST_gr, NRMSE_rg, TE_g, ICE_g, DEn_rg, and T_rg are 0.62, 0.59, 0.50, 0.42, 0.52, 0.50, 0.52, and 0.49, respectively.

The box length of NRMSE_rg is longer than the box length of other indicators, which indicates that NRMSE_rg is affected by the scene where the image is taken.

Comparison of the subjective observation and objective calculation results

To evaluate the performance of our proposed V-RAM model, we compare the subjective and objective assessment results for the same superimposed image. There are 10,800 evaluation results for all the superimposed images. However, owing to space limitations, we show only 30 of the evaluation results in Figure 7.

Figure 7.

V-RAM comparison results. (It is noted that the vertical axis is not the score of the image registration assessment.) V-RAM: vision registration assessment model.

In this article, we focus on the performance of our proposed assessment model, not on the registration algorithms. In other words, we only compare the objective results of the three image registration algorithms based on the V-RAM model with the subjective results obtained by human eyes. Therefore, we use charts to show how the assessment results change between the subjective and the objective evaluation methods. According to Figure 7, we can draw the following two conclusions:

For the same image registration algorithm, the objective result is close to the subjective result for the same superimposed image. Moreover, the trend of the objective results is the same as that of the subjective results.

For the same scene, the objective result agrees with the subjective result for the superimposed images acquired using different registration methods. Taking the ninth results obtained by the three different registration methods as an example, the V-RAM-based assessment rates the result of DoG-LBP as the best one, which is the same as the result based on human eyes.

Figure 8 shows that our proposed model obtains an assessment result that is closer to the result of human eyes than that of any single parameter. It also indicates that a single assessment parameter cannot give an exhaustive evaluation of a multisensor image registration method, whereas our proposed model can achieve an objective and exhaustive assessment.

Figure 8.

V-RAM comparison results among different existing assessment methods. V-RAM: vision registration assessment model.

Based on the above results, we found that our proposed V-RAM model achieved the same registration assessment performance as the human vision system. Although the assessment scores differ from the scores given by humans, the V-RAM model reaches the same conclusion as the human vision system when different registration algorithms are compared.
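One plausible way to quantify this agreement is sketched below. The article does not define how its consistency rate is computed, so the definition used here (the fraction of scenes in which the objective and subjective scores select the same best algorithm) is an assumption, and the score arrays are hypothetical.

```python
import numpy as np

def consistency_rate(objective, subjective):
    """Fraction of scenes where both score sets pick the same best algorithm.

    Both inputs are (n_scenes, n_algorithms) arrays; higher scores are better.
    """
    objective = np.asarray(objective)
    subjective = np.asarray(subjective)
    agree = objective.argmax(axis=1) == subjective.argmax(axis=1)
    return float(agree.mean())

# Hypothetical scores for four scenes and three algorithms.
objective_scores = [[1.2, 0.8, 2.0], [0.5, 0.9, 0.7], [3.0, 1.0, 2.0], [0.1, 0.2, 0.3]]
subjective_scores = [[6, 5, 9], [4, 8, 7], [7, 6, 8], [3, 4, 9]]
rate = consistency_rate(objective_scores, subjective_scores)
```

Because only the ranking matters in this definition, the two score sets can live on different scales, which fits the observation that V-RAM values and human scores differ numerically while leading to the same conclusion.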

Conclusion and future work

To improve the performance of multirobot information fusion systems, we proposed V-RAM, an SSIM-based assessment model for multisensor image registration algorithms that is motivated by the human vision system. V-RAM makes two contributions: (a) to evaluate the quality of image registration, V-RAM evaluates the registration results from eight aspects, and (b) we introduced the superimposed image for building our test database, which was used to test both the subjective image registration assessment method and the V-RAM model. Experimental results implied that our proposed assessment model obtains evaluation results close to those of human eyes but, unlike human beings, is not affected by emotional factors, which indicates that our model is robust and efficient for assessing multisensor image registration algorithms. In future work, we will introduce deep learning methods to search for the ideal weights of the V-RAM model.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was sponsored by the National Natural Science Foundation of China (no. 61401040) and the National Key Research and Development Program (no. 2016YFB0502002).

