Abstract
Microscopy image fusion, an emerging topic in the related research fields, has been extensively used in integrated-circuit defect detection and intaglio-plate-microstructure observation. In this article, a novel microscopy image fusion algorithm based on saliency analysis and an adaptive m-pulse-coupled neural network in the non-subsampled contourlet transform domain is proposed, in which each original image is decomposed into a low-frequency subband and a series of high-frequency subbands. A new measurement technique based on image variance permutation entropy is designed for fusion of the low-frequency subbands, and a novel sum-modified Laplacian is chosen as the external stimulus that motivates the adaptive m-pulse-coupled neural network for the high-frequency subbands. The linking strength of the m-pulse-coupled neural network is determined by five features of the saliency map. The selection rules for the different subbands are then formulated based on the corresponding weight measures. Finally, the fused image is reconstructed via the inverse non-subsampled contourlet transform. Experimental results reveal that the proposed algorithm achieves better fused image quality than other representative traditional algorithms in terms of both objective evaluation and subjective visual quality.
Introduction
In order to better observe the details of small objects in the gravure engraving field, micro-observation techniques are usually employed to capture high-resolution images. As image resolution increases, the numerical aperture becomes higher and higher. Since this value varies inversely with the depth of focus, it is hard to inspect all details of an object in one image. Image fusion is a technique that synthesizes images captured at different focal depths into one fully focused image with better visual quality. The image fusion technique has been widely applied in video applications, 1 medical applications, 2 and sensor applications. 3
Within the last decade, many algorithms have been proposed on the subject of image fusion. These algorithms can be classified into two categories: image fusion algorithms based on the spatial domain and those based on the transform domain. The spatial domain fusion algorithms operate directly on image features such as pixels, 4 blocks, 5 and regions 6 to form their final fused images. The simplest pixel-based fusion algorithm weights the source images to obtain the fused result. However, this algorithm may lead to information loss and decreased clarity. The key point of the block-based algorithms is to select image blocks from the source images by saliency measures such as spatial frequency, contrast visibility, and edge information. However, an image block containing both focused and defocused parts will result in blocking effects. In addition, it is difficult to choose an appropriate block size. In order to overcome these problems, many improved spatial domain fusion algorithms have been proposed. 7 The super-pixel-based saliency analysis method is used to obtain the target region, which leads to accurate extraction of the focus areas. 8
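As a concrete illustration of the simplest spatial-domain rule mentioned above, the following sketch fuses pre-registered source images by pixel-wise weighted averaging. The function name and equal-weight default are our own choices, not from the article:

```python
import numpy as np

def weighted_average_fusion(images, weights=None):
    """Fuse pre-registered source images by pixel-wise weighted averaging.

    The simplest spatial-domain rule: each output pixel is a convex
    combination of the corresponding source pixels. Fast, but it tends
    to lower contrast because focused and defocused pixels are mixed.
    """
    stack = np.stack([np.asarray(im, dtype=np.float64) for im in images])
    if weights is None:
        weights = np.full(len(images), 1.0 / len(images))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()          # normalize to sum to 1
    # contract the weight vector against the image stack: (m,) x (m,H,W) -> (H,W)
    return np.tensordot(weights, stack, axes=1)
```

The contrast loss this rule causes is exactly the motivation for the block-, region-, and transform-domain methods discussed next.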
In recent years, many scholars have discovered that the multi-scale transform (MST) can be a helpful tool for image fusion. The performance of the fusion result significantly depends on two aspects: the choice of MST method and effective fusion rules for the different scales. Common MST approaches include the discrete wavelet transform (DWT), 9 stationary wavelet transform (SWT), 10 discrete fractional random transform (DFRT), 11 internal generative mechanism transform (IGMT), 12 and shearlet transform. 13 However, the PT, DWT, and SWT methods can only capture detail information in a limited number of directions and cannot sparsely represent an image. The curvelet transform, contourlet transform, and shearlet transform have many excellent properties, such as multi-direction, multi-resolution, and anisotropy. Although these characteristics can fully describe image edges, curves, and other details, the lack of shift-invariance easily leads to the pseudo-Gibbs phenomenon and thus reduces image fusion quality. The non-subsampled contourlet transform (NSCT) is a newer method that overcomes the above-mentioned shortcomings and is widely used in image fusion. 14
In addition, an effective fusion rule is another important factor that affects the fusion results. How to design a useful fusion strategy has attracted scholars' attention, and researchers have recently put forward many fusion rules in different transform domains. 15 Common neighborhood evaluation measurements include spatial frequency (SF), 16 the sum-modified Laplacian, 17 and the energy of edges. 18 However, considering only the neighborhood information is not enough, since it may lead to the loss of some details. The pulse-coupled neural network (PCNN), as a global algorithm, can retain more detailed information. PCNN was first introduced by Eckhorn in 1990. 19 Owing to advantages such as global coupling and pulse synchronization, PCNN has become a promising image fusion tool. However, due to its high computational complexity, some existing image fusion algorithms based on PCNN are not suitable for real-time applications. 20 Another limiting factor of the traditional PCNN for image fusion is the large number of parameters that need to be set. 21 In order to improve the performance of image fusion algorithms based on the PCNN model, scholars have proposed many modified PCNN models. The intersecting cortical model (ICM) 22 is a simplified version of the PCNN. Compared with the conventional PCNN model, the ICM has the advantages of low computational complexity and a simple formulation. The m-PCNN, a multi-channel model for medical image fusion, is established by a set of parallel PCNN units with intra-channel and inter-channel linking. 23
There are two main characteristics of the microscopy image fusion field. First, the number of source images for fusion is far more than two. Second, satisfactory fusion results require keeping as many details as possible from the source images. In this article, a new fusion algorithm for microscopy image fusion tasks combining NSCT and m-PCNN is proposed. After the source images are decomposed using the NSCT, in the low-pass subbands, a novel permutation entropy (PE) based on variance, which reflects the features of the low-frequency subimages, is used to evaluate the weight of the different source images. In the high-pass subbands, the region local energy, the region standard deviation (SD), the information entropy, the correlation of the co-occurrence matrix, and the local image contrast are combined as five features of the saliency map to construct a new measurement for the linking strength of the m-PCNN. The novel sum-modified Laplacian (NSML)-based method is used to motivate the m-PCNN neurons. The coefficients of the high-pass subbands in the NSCT domain with larger firing times are selected as the final high-frequency coefficients of the fused image. The innovation of this algorithm is that two novel and effective measures for the different frequency subbands are proposed. Unlike traditional algorithms, the proposed algorithm not only takes into account the characteristics of each single image but also considers the relationship between the image sequences at the same position.
The structure of this article is organized as follows: section “NSCT” introduces the theory of the NSCT. In section “PCNN,” brief reviews of the traditional PCNN, the m-PCNN, and the adaptive m-PCNN are presented in detail. Section “Proposed fusion scheme–based NSCT and m-PCNN” introduces the fusion algorithm of this article. Section “Experimental analysis” presents several experimental results and compares the proposed algorithm with other algorithms. Concluding remarks are given in section “Conclusion.”
NSCT
As a new MST analysis tool, NSCT differs from other MST methods, such as DWT and SWT, and has many advantages for image processing. First, the use of upsampling in the filters overcomes the contourlet transform's lack of shift-invariance. Second, directional filter banks are employed to implement multi-direction decomposition, so the NSCT can retain more image details. Finally, since NSCT has the characteristic of anisotropy, it can represent high-dimensional singularities more effectively than wavelets. In summary, NSCT shows good performance in applications with a low constraint on data redundancy, such as image fusion.
The structure of the NSCT is shown in Figure 1. NSCT can be divided into two parts: the non-subsampled pyramid (NSP) and the non-subsampled directional filter bank (NSDFB). The former maintains the multi-scale property using a pair of non-subsampled filter banks. A high-frequency image and a low-frequency image are obtained at each decomposition level, and the low-frequency image is used as the input of the next decomposition level. As a result, after k decomposition levels, NSP produces k high-frequency subbands and one low-frequency subband, all of the same size as the source image.

Structure of the NSCT.
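The pyramid idea behind the NSP can be sketched with a simple undecimated (à trous-style) Gaussian decomposition. The real NSCT uses dedicated non-subsampled filter banks followed by the NSDFB, so this is only an illustration of the "k high-frequency subbands plus one low-frequency subband, all at full resolution" structure:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nsp_decompose(image, levels=3):
    """Sketch of a non-subsampled pyramid: each level splits the current
    approximation into a low-pass image (same size, no downsampling) and
    a high-frequency residual. k levels yield k high-frequency subbands
    plus one final low-frequency subband, all at full resolution."""
    img = np.asarray(image, dtype=np.float64)
    highs = []
    low = img
    for lev in range(levels):
        # widen the smoothing kernel at each level (a trous style)
        smoothed = gaussian_filter(low, sigma=2.0 ** lev)
        highs.append(low - smoothed)   # detail retained at full resolution
        low = smoothed
    return low, highs
```

Because each level is a simple split, summing the low-frequency subband and all high-frequency residuals reconstructs the input exactly, mirroring the invertibility of the NSCT.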
PCNN
The PCNN, whose processing differs from that of traditional neural networks, was proposed by Eckhorn to explain the synchronous pulse bursts in the visual cortex of the cat. It has been proved that PCNN is a useful tool that effectively extracts valid information from complex backgrounds. 19 As PCNN does not need to be trained and its processing is more consistent with the physical foundation of the human visual neural system, it is widely used in image segmentation, image de-noising, and image fusion. 24
The basic model of PCNN is depicted in Figure 2. It contains three parts: the receptive field, the modulation field, and the pulse generator. The receptive field is divided into two channels according to the received signal: the linking input and the feeding input. Only the feeding input accepts the external stimulus

F_ij[n] = e^{−α_F} F_ij[n−1] + V_F Σ_{kl} M_{ij,kl} Y_kl[n−1] + S_ij

L_ij[n] = e^{−α_L} L_ij[n−1] + V_L Σ_{kl} W_{ij,kl} Y_kl[n−1]

where S_ij is the external stimulus of neuron (i, j); F_ij and L_ij are the feeding and linking inputs; Y_kl is the pulse output of the neighboring neurons; M and W are the synaptic weight matrices; α_F and α_L are the decay constants; and V_F and V_L are the amplification factors.

PCNN model.
The main function of the modulation field is to regulate the ratio between the two channels. The internal state is obtained by modulating the feeding input with the linking input

U_ij[n] = F_ij[n](1 + β L_ij[n])

where β is the linking strength.
The pulse generator compares the internal state U_ij[n] with a dynamic threshold θ_ij[n] to produce the pulse output

Y_ij[n] = 1 if U_ij[n] > θ_ij[n], and 0 otherwise

θ_ij[n] = e^{−α_θ} θ_ij[n−1] + V_θ Y_ij[n]

where α_θ is the threshold decay constant and V_θ is the threshold amplification factor. Whenever a neuron fires, its threshold is sharply raised and then decays until the next firing.
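The iteration described above can be sketched as follows. The parameter values and the 3×3 linking kernel are illustrative defaults, not the article's settings:

```python
import numpy as np
from scipy.ndimage import convolve

def pcnn_fire_times(S, beta=0.2, alpha_f=0.1, alpha_l=1.0, alpha_t=0.2,
                    v_f=0.5, v_l=0.2, v_t=20.0, n_iter=50):
    """Run a standard PCNN on stimulus S and return per-neuron firing counts.

    Firing counts (accumulated pulse outputs) are the quantity typically
    compared when PCNN is used as a fusion rule.
    """
    S = np.asarray(S, dtype=np.float64)
    F = np.zeros_like(S); L = np.zeros_like(S)
    Y = np.zeros_like(S); theta = np.ones_like(S)
    fire = np.zeros_like(S)
    # 3x3 linking kernel (center excluded), a common illustrative choice
    k = np.array([[0.5, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.5]])
    for _ in range(n_iter):
        conv = convolve(Y, k, mode="constant")
        F = np.exp(-alpha_f) * F + v_f * conv + S   # feeding input
        L = np.exp(-alpha_l) * L + v_l * conv       # linking input
        U = F * (1.0 + beta * L)                    # modulation
        Y = (U > theta).astype(np.float64)          # pulse output
        theta = np.exp(-alpha_t) * theta + v_t * Y  # dynamic threshold
        fire += Y
    return fire
```

Brighter stimuli recharge the internal state faster after each threshold jump, so they accumulate more firings, which is what makes firing counts usable as a focus comparison.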
m-PCNN
Although PCNN methods have a wide range of advantages in image processing, they still have some shortcomings that cannot be ignored. First, parameter setting mainly depends on empirical knowledge, which may lead to incorrect results. Second, the iteration number is usually set to a constant, which may be time-consuming. Third, fusing more than two source images requires an iterative process with a sophisticated mechanism. In order to overcome the above drawbacks, a new model called m-PCNN has been proposed by Wang. 23 As shown in Figure 3, unlike the traditional PCNN, the m-PCNN model can be applied to the microscopy image fusion field, which has special requirements for good expandability. The mathematical equations of m-PCNN can be described as follows
H^k_ij[n] = S^k_ij + Σ_{pq} W_{ij,pq} Y_pq[n−1], k = 1, 2, …, m

U_ij[n] = ∏_{k=1}^{m} (1 + β^k H^k_ij[n]) + σ

Y_ij[n] = 1 if U_ij[n] > θ_ij[n], and 0 otherwise

θ_ij[n] = e^{−α_θ} θ_ij[n−1] + V_θ Y_ij[n]

where m is the number of channels; S^k_ij and H^k_ij are the external stimulus and the channel input of the kth channel; β^k is the linking strength of the kth channel; and σ is a level adjustment factor.

m-PCNN model.
The main advantage of the m-PCNN model over the traditional PCNN is that it has more external inputs. With the traditional PCNN, a multiple-image fusion task requires iterating the whole process, as shown in Figure 4. The m-PCNN, however, can receive multiple inputs at the same time, so it accomplishes the whole fusion process with a single PCNN instead of multiple PCNNs.

Multiple image fusion in traditional PCNN model.
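A minimal sketch of the multi-channel idea follows, assuming the general m-PCNN form in which each channel's input is coupled multiplicatively into one internal activity. The channel equations, kernel, and parameters here are our assumptions for illustration, not the article's exact model:

```python
import numpy as np
from scipy.ndimage import convolve

def mpcnn_fuse(stimuli, betas=None, sigma=1.0, alpha_t=0.2, v_t=20.0,
               n_iter=50):
    """Sketch of a multi-channel PCNN: every source image feeds its own
    channel, the channels are combined multiplicatively into one internal
    activity, and a single neuron array pulses for all inputs at once.
    Returns per-neuron firing counts over n_iter iterations."""
    stimuli = [np.asarray(s, dtype=np.float64) for s in stimuli]
    m = len(stimuli)
    betas = [1.0 / m] * m if betas is None else betas
    shape = stimuli[0].shape
    Y = np.zeros(shape); theta = np.ones(shape); fire = np.zeros(shape)
    k = np.array([[0.5, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.5]])
    for _ in range(n_iter):
        conv = convolve(Y, k, mode="constant")
        U = np.ones(shape)
        for S, beta in zip(stimuli, betas):
            H = S + conv                      # channel input
            U = U * (1.0 + beta * H)          # multiplicative coupling
        U = U + sigma                         # level adjustment
        Y = (U > theta).astype(np.float64)
        theta = np.exp(-alpha_t) * theta + v_t * Y
        fire += Y
    return fire
```

Note how the loop over `stimuli` replaces the multiple independent PCNN runs of Figure 4: adding a source image only adds one more factor to the internal activity.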
Adaptive m-PCNN
In m-PCNN, the linking coefficient β^k weights the contribution of each channel to the internal activity, so its value directly affects how much each source image influences the fused result.
In recent years, researchers have developed many methods for determining the linking strength. A new gradient measurement took a sigmoid function as the linking strength. 11 Region features were used to strengthen self-adaptivity in Wang et al.'s 18 work. The local SD as the linking strength can express the intensity of local information in transform domains more exactly. 25 However, for microscopy image fusion, the requirement of detail preservation is hard to satisfy with any single evaluation method.
In this article, a multi-judgment measure of the clarity of each source image is designed as the linking strength of the corresponding neuron. It is well known that the gradient is one of the most striking features of an image and reflects its clarity. First, we determine the saliency analysis method for the multi-focus images and obtain the saliency maps. Second, based on the trend analysis of the saliency maps, the formula for the linking strength is derived.
Saliency analysis based on image
A region or pixel that is in focus has higher saliency. In order to enhance the fusion result, an effective saliency feature extraction method should be carefully designed. Moreover, the human visual system has a strong ability to identify texture in an image, and regions with large contrast are easily captured by human eyes in a complex scene. Inspired by this, a new method for saliency analysis based on the comparison between the pixel value at the block center and the values of the neighborhood pixels is proposed. A saliency map with the same size as the source image
where

Experiment on clock to illustrate the saliency map of different methods: (a) source image, (b) our saliency map, and (c) Sobel result.
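Since the article's exact saliency formula is not reproduced in this extraction, the following hypothetical sketch assumes saliency is measured as the absolute difference between each pixel and the mean of its local neighborhood, which is large in focused, high-contrast regions:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def saliency_map(image, size=3):
    """Hypothetical center-vs-neighborhood saliency sketch: each pixel's
    saliency is its absolute deviation from the local neighborhood mean.
    Flat (defocused) regions score near zero; sharp edges and textures,
    which attract the human eye, score high."""
    img = np.asarray(image, dtype=np.float64)
    local_mean = uniform_filter(img, size=size, mode="reflect")
    return np.abs(img - local_mean)
```

On the clock example of the figure above, such a map highlights in-focus edges much like a Sobel gradient, but responds to any local contrast rather than only to directional gradients.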
Trend analysis of saliency map
In order to analyze the change trend of the saliency map sequences, the measurement method considers not only the internal structure information of an image but also the variation at the same pixel position across consecutive saliency maps. Therefore, five features are used to quantify this trend.
Feature
where
Feature
where
Feature
where
Feature
where
Feature
where
An index function can be defined to get the maximum index among the five features
where
where
Proposed fusion scheme–based NSCT and m-PCNN
Rule of low-frequency subbands
The low-frequency subbands represent the approximation component of the multi-focus images, so an effective fusion rule is essential for good fused-image performance. Researchers have proposed many fusion methods for the low-frequency subbands, such as the local energy-based selection scheme, 26 the novel sum-modified Laplacian, 27 and the weighted average. 28 However, these methods may reduce the contrast of the fused image. It is well known that low-level visual features such as edges and corners are very important to the human visual system, so the above methods are not sufficient for fusing the low-frequency subbands. In order to capture the small variations of the low-frequency subbands, a novel measure based on image variance permutation entropy (VPE) is proposed to meet this requirement.
PE is an effective analytic tool for measuring the complexity of a time series. 29
The PE algorithm has many advantages, such as low computational complexity, and it effectively amplifies slight changes in a time series. For the low-frequency subbands, a measurement of the regional variance is treated as the time series in the PE algorithm, because most of the image energy is retained in the low-frequency image. The low-frequency image is divided into several non-overlapping regions whose size is
where
Then, the reconstruction vector
The symbol sequence
There are
After normalization processing, the new PE can be obtained by the following equation
Then, the PE of the next subsequence
As a result, the final low-frequency coefficients of the block image
In this article, the embedding dimension
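The PE computation described above can be sketched as follows. The use of block variances as the time series follows the text, while the helper names, block size, and embedding defaults are our own illustrative choices:

```python
import math
from collections import Counter
import numpy as np

def variance_series(low_subband, block=8):
    """Regional variances of non-overlapping blocks, scanned in order,
    treated as the time series for PE (block size is illustrative)."""
    h, w = low_subband.shape
    return np.array([low_subband[i:i + block, j:j + block].var()
                     for i in range(0, h - block + 1, block)
                     for j in range(0, w - block + 1, block)])

def permutation_entropy(series, m=3, delay=1):
    """Normalized permutation entropy of a 1-D series: embed with
    dimension m, map each window to its ordinal (rank) pattern, and take
    the Shannon entropy of the pattern distribution, normalized by
    log(m!) so the result lies in [0, 1]."""
    x = np.asarray(series, dtype=np.float64)
    n = len(x) - (m - 1) * delay
    patterns = Counter()
    for i in range(n):
        window = x[i:i + (m - 1) * delay + 1:delay]
        patterns[tuple(np.argsort(window))] += 1   # ordinal pattern
    probs = np.array([c / n for c in patterns.values()])
    h = -np.sum(probs * np.log(probs))
    return h / math.log(math.factorial(m))
```

A perfectly monotone series has a single ordinal pattern and PE of zero, while richly varying block variances push PE toward one; this sensitivity to slight changes is what the VPE rule exploits when weighting the low-frequency subbands.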
Rule of high-frequency subbands
The high-frequency subbands represent the details of the source image. The traditional method motivates the neurons directly with the high-frequency coefficients, which is problematic because the coefficients represent not only details but also possible noise. As features can represent richer information than pixels, it is reasonable to replace pixels with features. The NSML-based method is therefore used to motivate the adaptive m-PCNN neurons. The expression of NSML is defined as follows
where
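The base sum-modified Laplacian focus measure can be sketched as below. The article's NSML modifies this standard form, and its exact expression is not reproduced in this extraction:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sum_modified_laplacian(image, window=3, step=1):
    """Standard SML focus measure: the modified Laplacian (ML) takes the
    absolute second difference along each axis separately so that
    opposite-sign responses cannot cancel, and SML sums ML^2 over a
    local window."""
    img = np.asarray(image, dtype=np.float64)
    s = step
    p = np.pad(img, s, mode="edge")
    ml = (np.abs(2 * img - p[:-2 * s, s:-s] - p[2 * s:, s:-s])      # vertical
          + np.abs(2 * img - p[s:-s, :-2 * s] - p[s:-s, 2 * s:]))   # horizontal
    # local mean times window area = local sum of squared ML
    return uniform_filter(ml ** 2, size=window, mode="reflect") * window ** 2
```

Feeding a measure like this, rather than the raw coefficients, into the m-PCNN makes the stimulus respond to genuine detail instead of isolated noisy coefficients.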
Image fusion algorithm
In this article, a novel image fusion algorithm using the NSCT and the adaptive m-PCNN technique is presented for microscopy image fusion. The source multi-focus images are assumed to be strictly registered. Figure 6 displays the whole process in detailed steps.
Decompose the source multi-focus images into low-frequency subbands and a series of high-frequency subbands via NSCT.
For the low-frequency component fusion, the VPE is calculated via equation (24) in a sliding local window of each low-frequency subband, and then the fusion coefficients for the low-frequency subbands from the source images are obtained via equation (26).
According to equation (10), a series of saliency maps can be obtained from the source images to prepare for the following feature analysis. From equations (11) to (18), the final formula of the linking strength
For the high-frequency component fusion, the NSML is computed via equations (27) and (28). The NSML values of the high-frequency subbands are then input to the m-PCNN to motivate the neurons, and the high-frequency coefficients with larger firing times are selected as the final high-frequency coefficients of the fused image.
The new fused coefficients of subimages are used to reconstruct the fused image by taking an inverse NSCT.

Proposed image fusion algorithm.
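The steps above can be illustrated end to end with stand-in components: a one-level undecimated split instead of NSCT, local variance instead of VPE weighting, and local detail energy instead of the NSML/m-PCNN firing comparison. This is a structural sketch only, not the article's method:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, uniform_filter

def fuse_two_level(images):
    """Structural sketch of the pipeline: decompose each source into a
    low and a high band, weight the low bands by local clarity, pick the
    high band per pixel from the sharpest source, then invert the split."""
    lows, highs = [], []
    for im in images:
        im = np.asarray(im, dtype=np.float64)
        low = gaussian_filter(im, sigma=2.0)   # stand-in for NSCT low band
        lows.append(low)
        highs.append(im - low)                 # stand-in for high bands
    # low band: weight each source by its local variance (clarity proxy)
    w = np.stack([uniform_filter(l ** 2, 8) - uniform_filter(l, 8) ** 2
                  for l in lows])
    w = np.maximum(w, 0.0) + 1e-12             # guard float round-off
    w = w / w.sum(axis=0)
    low_f = np.sum(w * np.stack(lows), axis=0)
    # high band: per pixel, keep the source with the largest detail energy
    e = np.stack([uniform_filter(h ** 2, 3) for h in highs])
    idx = np.argmax(e, axis=0)
    high_f = np.take_along_axis(np.stack(highs), idx[None], axis=0)[0]
    return low_f + high_f                      # inverse of the one-level split
```

In the article itself, each stand-in is replaced by the stronger component it imitates (NSCT, VPE, NSML-motivated m-PCNN), but the data flow, decompose, fuse per band, reconstruct, is the same.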
Experimental analysis
Linking strength of m-PCNN
This section discusses the influence of the linking strength on the fusion results; three sample points in the source image are analyzed.

Source image.
Figure 8(a)–(e) shows the five features of sample point 1, and the resulting linking strength is shown in Figure 8(f).

Sample point 1: (a) feature 1, (b) feature 2, (c) feature 3, (d) feature 4, (e) feature 5, and (f) the resulting linking strength.
Figure 9(a)–(e) shows the five features of sample point 2. Features 4 and 5 produce ambiguous results, while the proposed method eliminates the differences among the features.

Sample point 2: (a) feature 1, (b) feature 2, (c) feature 3, (d) feature 4, (e) feature 5, and (f) the resulting linking strength.
Figure 10(a)–(e) shows the five features of sample point 3. It can be seen from the figures that these features produce consistent results. Furthermore, compared with the result in Figure 14(h), the fusion results show the validity of the proposed saliency analysis method.

Sample point 3: (a) feature 1, (b) feature 2, (c) feature 3, (d) feature 4, (e) feature 5, and (f) the resulting linking strength.
Parameter setting
Seven representative fusion algorithms are compared with the proposed algorithm. All the NSCT-based methods use the “9-7” filters for scale decomposition and the “pkva” filters in the NSDFB.
Method 1: The direct average method. 30
Method 2: PCA-based method. 31
Method 3: Wavelet-based algorithm WT-AVG-MAX in which the averaging is used for the low-frequency subbands and maximum selection rule for the high-frequency subbands. The decomposition level is set to five and the wavelet filter is set to “db4.” 32
Method 4: Shift-invariant discrete wavelet transform–based algorithm with averaging and maximum selection for the low- and high-frequency subbands, respectively. “db4” wavelets and five-level decomposition are adopted. 33
Method 5: The NSCT algorithm NSCT-AVG-MAX, in which averaging is used for the low-frequency subbands and absolute-maximum selection for the high-frequency subbands. The decomposition level is
Method 6: The NSCT-domain algorithm NSCT-AVG-PCNN, in which the averaging scheme and the PCNN scheme are used for the low-frequency and high-frequency coefficients, respectively. The NSCT decomposition level is
Method 7: The NSCT-based method NSCT-AVG-SFPCNN, which uses the spatial frequency–motivated PCNN model, with the decomposition level
Method 8: The proposed fusion method, where the NSML and m-PCNN are used for high-frequency subbands and the VPE is used for low-frequency subbands with the decomposition level
To evaluate the above methods objectively, the entropy (EN), the SD, and Petrovic metric (Q) are regarded as the evaluation indexes.
The entropy reflects the amount of information in an image and is defined as

EN = −Σ_{i=0}^{L−1} p_i log2 p_i

where L is the number of gray levels and p_i is the probability of gray level i in the image.
The SD measures the contrast of an image; the larger the SD, the higher the contrast

SD = sqrt((1/(M × N)) Σ_{x=1}^{M} Σ_{y=1}^{N} (F(x, y) − μ)^2)

where F is the fused image of size M × N and μ is its mean gray value.
The Petrovic metric Q measures how much edge information is transferred from the source images to the fused image; larger values indicate better fusion.
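The first two indexes can be computed as follows (standard definitions; the Petrovic metric is omitted because its gradient-based formulation is more involved):

```python
import numpy as np

def entropy(image, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram; higher
    values indicate more information content in the image."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]                       # 0 log 0 is taken as 0
    return float(-np.sum(p * np.log2(p)))

def std_dev(image):
    """Standard deviation of gray values, a proxy for global contrast."""
    img = np.asarray(image, dtype=np.float64)
    return float(np.sqrt(np.mean((img - img.mean()) ** 2)))
```

For example, a two-level image split evenly between gray values 0 and 128 has an entropy of exactly 1 bit and an SD of 64.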
Performance analysis
In order to confirm the performance of the proposed algorithm, three different groups of microscopy images are used for testing in this section, as shown in Figures 11, 12, and 14. All source images have the same size of

Source images for experiment 1.

Source images for experiment 3.

The fusion images of experiment 2: (a) method 1, (b) method 2, (c) method 3, (d) method 4, (e) method 5, (f) method 6, (g) method 7, and (h) method 8.

Source images for experiment 2.

Fusion images of experiment 1: (a) method 1, (b) method 2, (c) method 3, (d) method 4, (e) method 5, (f) method 6, (g) method 7, and (h) method 8.

The fusion images of experiment 3: (a) method 1, (b) method 2, (c) method 3, (d) method 4, (e) method 5, (f) method 6, (g) method 7, and (h) method 8.
The fusion images of the first group of microscopy images are shown in Figure 15. Figure 15(h) shows the result of the proposed algorithm, and Figure 15(a)–(g) shows the fusion results of the other seven algorithms described above for comparison. It is obvious that the proposed algorithm achieves a better visual effect than the others. In Figure 15(a) and (b), the contour of the object is difficult to identify since the fused images have low contrast. As shown in Figure 15(c), the fusion result of method 3 shows many artifacts around slope areas because the wavelet lacks shift-invariance. Figure 15(e) and (f) indicates that the NSCT methods perform better than the wavelet method, but not all useful information of the source images is retained in their fusion results. In Figure 15(g), although the details of the fused image have been improved, the contrast of the source images is not well preserved. In contrast, the proposed fusion algorithm not only keeps most of the useful information of the source images but also reduces artifacts throughout the fusion process.
The microscopy images of the second experiment are shown in Figure 14. The metallic surface is highly reflective. Figure 13(a)–(h) shows the fused results of the different fusion algorithms. It is clear that Figure 13(a) and (b), generated by methods 1 and 2, fail to retain the information of the source images at the same positions. In Figure 13(c), the information is significantly distorted. The NSCT results (Figure 13(e) and (f)) contain artifacts in the edge regions. In Figure 13(g), although the fusion result generated by method 7 looks reasonable to the human eye, the contrast in the slope regions is lower than that of the image obtained by the proposed algorithm. These results indicate that the proposed algorithm retains the contrast information more effectively while preserving the detailed information of the metal profile better than the others.
The remaining microscopy images are shown in Figure 12, and the fusion images produced by the above algorithms are shown in Figure 16. The region labeled by a red rectangle has lower contrast in the source images. Methods 1 and 2 cannot refine the detailed information of the original images. In Figure 16(c), although the contrast of the fused image is improved, jaggy artifacts exist at the image edges due to the pseudo-Gibbs effect. The fused results in Figure 16(d) and (e) also suffer from artifacts. The fused result of method 6 is blurred, for example, at the slope in the red box. Compared with Figure 16(g), the final result of the proposed algorithm has a much more reasonable contrast level and better visual effects. In Figure 16(h), the proposed algorithm well preserves the details of the white areas in the original images. In summary, the comparison results indicate that the proposed algorithm produces clearer and more accurate results than the others, especially in dimly lit areas.
It is not sufficient to evaluate the fusion results using only subjective evaluation, so three different performance measures are used for quantitative analysis. The results of the different fusion algorithms are shown in Table 1. The SD values show that the proposed algorithm produces fused images with better contrast than the others. The entropy results likewise show that the proposed fusion algorithm achieves higher values than the others. Another metric value is
Objective evaluation of fusion image quality.
Conclusion
In this article, a new microscopy image fusion algorithm based on saliency analysis and adaptive m-PCNN in the NSCT domain is proposed. The NSCT is used to decompose each source image into a low-frequency subband and a series of high-frequency subbands. Then, a measurement based on the image VPE, which reflects slight changes in the subimages, is used to evaluate the low-frequency subbands. For the high-frequency subimages, the NSML method is selected to motivate the adaptive m-PCNN neurons, whose linking strength is determined by a multi-judgment measure of the clarity of each source image. Finally, the selection rules for the different subbands are given based on the weight measures. Experiments are carried out to validate the effectiveness of the proposed algorithm. The objective evaluation and visual quality results prove that the proposed algorithm is superior to other fusion algorithms in terms of both detail preservation and contrast enhancement.
Footnotes
Academic Editor: Shahnawaz Khan
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National High Technology Research and Development Program of China under grant no. 2015AA015408 and the West Light Foundation of The Chinese Academy of Sciences under grant no. 2011180.
