Super-resolution reconstruction for a single image based on self-similarity and compressed sensing

Abstract

Super-resolution image reconstruction supports more favorable feature extraction and image analysis. This study first investigated the image’s self-similarity and constructed high-resolution and low-resolution learning dictionaries; then, based on sparse representation and the reconstruction algorithms of compressed sensing theory, super-resolution reconstruction (SRSR) of a single image was realized. The proposed algorithm adopted an improved K-SVD algorithm for sample training and learning dictionary construction; additionally, the matching pursuit algorithm was improved to achieve single-image SRSR based on the image’s self-similarity and compressed sensing. The experimental results reveal that the image reconstructed by the proposed algorithm shows better visual effect and image quality than the degraded low-resolution image; moreover, compared with the images reconstructed by bilinear interpolation and by sparse-representation-based algorithms, the image reconstructed by the proposed algorithm has a higher PSNR value and thus exhibits more favorable super-resolution reconstruction performance.

Introduction

Super-resolution image reconstruction technology aims to acquire enhanced high-resolution images with better visual effects by estimating the complementary information among images, and thus provides more favorable feature extraction and image analysis results. Super-resolution image reconstruction technology mainly includes two types: reconfiguration-based super-resolution reconstruction (SRSR) and learning-based SRSR.

With reconfiguration-based SRSR, high-resolution images are acquired mainly on the basis of an established degradation model. A high-resolution image is degraded to a low-resolution image through a series of operations, mainly the addition of noise, geometrical movement, down-sampling and optical blurring. Then, using the inverse operations, the low-resolution image can be reconstructed into a high-resolution image. Figure 1(a) shows the degradation process from a high-resolution image to a low-resolution image, and Figure 1(b) shows the comparison between the degraded image and the original image.

Figure 1.

Illustration of an image’s degradation analysis. (a) Degradation process of an image from high resolution to low resolution. (b) Comparison between the original image and the image after degradation.

The degradation model can be described as:¹

G_{k} = V_{k} D M f + N_{k}

(1)

where $V_{k}$ denotes the geometrical rotation and translation matrix, k denotes the image’s serial number, M denotes the blurring operation matrix, D denotes the down-sampling matrix, $N_{k}$ denotes the added noise, f denotes a high-resolution image and $G_{k}$ denotes the corresponding low-resolution image.
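As an illustration of equation (1), the following minimal Python sketch simulates one degraded frame. It is not the implementation used in this study: the box-filter blur standing in for M, the decimation standing in for D, the integer circular shift standing in for $V_{k}$, and all function and parameter names (`degrade`, `shift`, `blur_size`, `factor`, `noise_sigma`) are assumptions chosen for brevity.

```python
import numpy as np

def degrade(f, shift=(1, 0), blur_size=3, factor=2, noise_sigma=2.0, seed=0):
    """Simulate G_k = V_k D M f + N_k for one low-resolution frame.

    All operators below are simplified stand-ins (assumptions), not the
    operators of the original study. `blur_size` is assumed to be odd.
    """
    rng = np.random.default_rng(seed)
    f = f.astype(np.float64)

    # M: optical blur, here a box filter applied with edge padding.
    pad = blur_size // 2
    padded = np.pad(f, pad, mode="edge")
    blurred = np.zeros_like(f)
    for dy in range(blur_size):
        for dx in range(blur_size):
            blurred += padded[dy:dy + f.shape[0], dx:dx + f.shape[1]]
    blurred /= blur_size ** 2

    # D: down-sampling by keeping every `factor`-th pixel.
    down = blurred[::factor, ::factor]

    # V_k: geometric movement, here an integer circular shift on the LR grid.
    moved = np.roll(down, shift, axis=(0, 1))

    # N_k: additive Gaussian noise.
    return moved + rng.normal(0.0, noise_sigma, size=moved.shape)
```

Calling `degrade` several times with different `shift` and `seed` values would give a set of frames $G_{k}$ of the same scene, which is the multi-frame setting that the reconstruction has to invert.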

As stated above, the additive noise, geometrical movements, down sampling and optical blurring were considered in this degradation model (as described in equation (1)). Next, by taking into account the atmospheric disturbance (denoted as $H_{a t m}$ ) and the deformation (denoted as X), the degradation model was expanded as¹:

G_{k} = V_{k} H_{a t m} D X M f + N_{k}

(2)

Although super-resolution image reconstruction has produced many useful results, it still needs further investigation and improvement in many respects, such as the algorithm’s temporal and spatial efficiency, the establishment of the degradation model, blind SRSR and the construction of the learning model.

First, the degradation model plays a decisive role in super-resolution image reconstruction and directly affects noise estimation and blur estimation. The degradation models established so far are generally nonlinear; however, simple linear degradation models are typically used in practical reconstruction processes.

Second, since part of the blurring process is unknown in SRSR, blur identification should be introduced to realize blind SRSR. Meanwhile, since only partial information is given, it is also difficult to estimate the point spread function. The statistical model within the degradation model generally assumes a linear space-invariant blur. In practical applications, however, the point spread function varies over space, and blind SRSR algorithms based on a nonlinear degradation model need further improvement.

Third, the established learning model significantly affects the completeness and effectiveness of knowledge acquisition. Markov random field (MRF), image pyramid and neural network models are commonly used to establish the learning model; however, the established learning models still do not capture all the prior knowledge needed in super-resolution image reconstruction and thus show relatively poor reconstruction performance. In addition, learning with advanced features should also be further investigated.

Fourth, SRSR always needs many frames of images for achieving favorable reconstruction performance.² On the one hand, the images always have large sizes, and massive data should be processed in real-time when using multiple images for reconstruction; on the other hand, in some cases, only one low-resolution image was acquired, which should be reconstructed as the corresponding high-resolution image. Therefore, it will be of great significance to investigate the SRSR for a single image.

In the early stage, single-image super-resolution reconstruction mainly adopted interpolation algorithms. Specifically, the value of a pixel was estimated from the pixel information around that point in the image’s spatial domain or frequency domain. Commonly used interpolation algorithms include linear interpolation, bilinear interpolation, nearest-neighbor interpolation and bicubic interpolation. With interpolation, the pixel values of the high-resolution image are estimated only from neighboring pixel values, while other information contained in the image is not taken into account, which easily causes problems such as blocking artifacts, edge blurring and the loss of detail.
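To make the interpolation baseline concrete, a minimal bilinear up-scaling sketch in Python is given here. It is only an illustration of the general technique; the function name `bilinear_upscale`, the pixel-centre coordinate convention and the clamping at the image border are assumptions, not details taken from this study.

```python
import numpy as np

def bilinear_upscale(img, factor=2):
    """Upscale a 2-D grayscale image by bilinear interpolation."""
    img = img.astype(np.float64)
    h, w = img.shape
    out_h, out_w = h * factor, w * factor

    # Map each output pixel centre back to source coordinates (assumed
    # convention), clamping to the valid range at the borders.
    ys = np.clip((np.arange(out_h) + 0.5) / factor - 0.5, 0, h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) / factor - 0.5, 0, w - 1)

    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]   # vertical weights, shape (out_h, 1)
    wx = (xs - x0)[None, :]   # horizontal weights, shape (1, out_w)

    # Blend the four neighbouring source pixels of every output pixel.
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy
```

Every output value is a weighted average of at most four input pixels, so no detail beyond what those neighbours contain can be recovered; this is the limitation that motivates the learning-based methods.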

In order to effectively enrich the information in the reconstructed high-resolution image and achieve a better reconstruction effect, learning-dictionary-based single-image super-resolution reconstruction has been developed recently. The dictionaries of low-resolution and high-resolution images are first acquired by constructing a large sample database and training on the samples; then, based on the image degradation model, the correspondence between the low-resolution dictionary and the high-resolution dictionary is established to achieve single-image reconstruction. Yang et al. proposed a multiple-geometric-dictionaries-based clustered sparse coding scheme for single-image super-resolution (SISR).³ Pan et al. proposed a single-image super-resolution method that learns local self-similarities from the original image itself.⁴

This study first examined the image’s self-similarity and constructed high-resolution and low-resolution learning dictionaries; then, using sparse representation and reconstruction algorithm in compressed sensing, the SRSR for a single image was realized.

Image’s self-similarity

Similar structures may exist at the same scale or at different scales within an image; this property is referred to as the image’s multi-scale structural self-similarity. Figure 2 shows two images with multi-scale structural self-similarity; specifically, Figure 2(a) includes many image blocks of aircraft and Figure 2(b) includes many image blocks of buildings. These image blocks show self-similarity on different scales. Images with self-similarity are common, and the self-similarity provides much additional information for the image’s SRSR.

Figure 2.

Images with multi-scale structural similarity. (a) includes many image blocks of aircraft. (b) includes many image blocks of buildings.

For images with self-similarity, an image with self-similarity at the same scale can serve as the low-resolution image for realizing the SRSR of local image blocks. Figure 3(a) illustrates the image reconstruction process using image blocks from multiple images with self-similarity at the same scale, and Figure 3(b) illustrates the image reconstruction process using image blocks from a single image with self-similarity at the same scale.

Figure 3.

Super-resolution image reconstruction using the image blocks (from multiple images (a) and a single image (b)) with self-similarity on a same scale, in which LR denotes low-resolution image and HR denotes high-resolution image.

Image’s SRSR can be achieved using the auxiliary information provided by the self-similarity image blocks with sub-pixel displacement in the low-resolution image.

For image blocks with self-similarity on different scales, the low-resolution image can be scaled to different extents using the pyramid method; then, using the K-SVD dictionary learning algorithm, the information of the image blocks that are similar on different scales is added to the reconstructed image,⁵ and the image’s SRSR is achieved based on the auxiliary information provided by these similar image blocks. Figure 4 illustrates the principle of super-resolution image reconstruction using the image blocks with self-similarity on different scales,⁶ in which LR denotes the low-resolution image, HR denotes the high-resolution image, $S_{1}^{L R}$ and $S_{2}^{L R}$ denote the image blocks in the low-resolution image with self-similarity on different scales, and $S_{1}^{H R}$ and $S_{2}^{H R}$ denote the image blocks in the high-resolution image with self-similarity on different scales.

Figure 4.

Super-resolution image reconstruction using the image blocks (from multiple images (a) and a single image (b)) with self-similarity on different scales, in which LR denotes low-resolution image and HR denotes high-resolution image.
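The idea of exploiting self-similarity across scales can be sketched in a few lines of Python. The sketch below is a simplified illustration under stated assumptions, not the procedure of references 5 and 6: the pyramid level is built by 2 × 2 averaging, similarity is measured by the sum of squared differences (SSD) with an exhaustive search, and the names `downscale2`, `best_match` and `cross_scale_pair` are hypothetical.

```python
import numpy as np

def downscale2(img):
    """Halve the resolution by averaging 2x2 pixel blocks (one pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[0::2, 1::2]
            + img[1::2, 0::2] + img[1::2, 1::2]) / 4.0

def best_match(query, image):
    """Return (row, col, ssd) of the patch in `image` closest to `query`."""
    p = query.shape[0]
    best = (0, 0, np.inf)
    for r in range(image.shape[0] - p + 1):
        for c in range(image.shape[1] - p + 1):
            ssd = float(np.sum((image[r:r + p, c:c + p] - query) ** 2))
            if ssd < best[2]:
                best = (r, c, ssd)
    return best

def cross_scale_pair(lr, row, col, patch=4):
    """Pair a patch of `lr` with a 2x larger region of `lr` that looks alike.

    The query patch is searched for on the coarser pyramid level. The matched
    coarse patch is the down-scaled version of a 2x larger region of `lr`,
    so that region can act as a high-resolution example for the query.
    """
    coarse = downscale2(lr)
    query = lr[row:row + patch, col:col + patch].astype(np.float64)
    r, c, ssd = best_match(query, coarse)
    parent = lr[2 * r:2 * r + 2 * patch, 2 * c:2 * c + 2 * patch]
    return query, parent, ssd
```

Pairs of this kind (a small block and a similar but larger block from the same image) are the sort of auxiliary information that can be fed into dictionary training without any external high-resolution image.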

Establishment of the learning dictionary

The learning dictionary is established to acquire an image with a higher resolution based on the prior knowledge in the training samples and to supplement the low-resolution image, so as to effectively enhance the image’s super-resolution effect. Images often contain a variety of geometric structures, including edges, corners, contours and textures, and images with different structures can be represented using different transforms. In this paper, several characteristic dictionaries are therefore applied to the corresponding feature regions of the image, with the aim of improving the reconstruction of the low-resolution image. Figure 5 shows the training process of the learning dictionary.

Figure 5.

Training of the learning dictionary.

In this study, the samples were trained using the improved K-SVD algorithm (for the implementation of the algorithm, see reference⁷), and the high- and low-resolution dictionary pair was constructed by means of sparse representation. Algorithm 1 describes the establishment procedure of the learning dictionary in detail.

Algorithm 1. Detailed establishment procedure of the learning dictionary

Step 1: Select the characteristic samples from the training samples and conduct pre-processing on the selected characteristic image blocks (the characteristic image block was set as 16 pixel * 16 pixel);

Step 2: Classify the selected characteristic image blocks with self-similarity according to characteristics including uniformity, detail, texture and complex edge, and train on each class;

Step 3: Rearrange the characteristic image blocks after classification and construct the high-resolution and low-resolution dictionaries using the improved K-SVD algorithm.
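The dictionary construction in Step 3 can be illustrated with a compact sketch of the standard K-SVD algorithm. This is plain K-SVD with orthogonal matching pursuit (OMP) for the sparse coding stage; the improved variant of reference 7 is not reproduced here, and the function names (`omp`, `ksvd`) and parameters (`n_atoms`, `sparsity`, `n_iter`) are assumptions made for the example.

```python
import numpy as np

def omp(D, y, sparsity):
    """Orthogonal matching pursuit: approximate y with at most `sparsity` atoms."""
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(sparsity):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:          # no new atom improves the fit
            break
        support.append(k)
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef[support] = sol
    return coef

def ksvd(Y, n_atoms, sparsity, n_iter=10, seed=0):
    """Plain K-SVD: learn D (columns = atoms) so that Y ~ D @ A with sparse A.

    Y holds one vectorised training block per column.
    """
    rng = np.random.default_rng(seed)
    # Initialise the atoms with randomly chosen, normalised training columns.
    D = Y[:, rng.choice(Y.shape[1], n_atoms, replace=False)].astype(np.float64)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    for _ in range(n_iter):
        # Sparse coding stage.
        A = np.column_stack([omp(D, y, sparsity) for y in Y.T])
        # Dictionary update stage: refit one atom at a time by a rank-1 SVD.
        for k in range(n_atoms):
            users = np.nonzero(A[k])[0]
            if users.size == 0:
                continue
            A[k, users] = 0.0
            E = Y[:, users] - D @ A[:, users]
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            A[k, users] = s[0] * Vt[0]
    return D, A
```

With the 16 × 16 blocks of Step 1, each column of `Y` would be a 256-dimensional vector, and one dictionary would be trained per class from Step 2. How the high-resolution and low-resolution dictionaries are coupled is not shown in this sketch.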

Single-image super-resolution reconstruction based on compressed sensing and learning dictionary

After the construction of the high-resolution and low-resolution learning dictionaries, this study conducted SRSR on a single image based on the theoretical framework of compressed sensing and the related reconstruction algorithms.

The observation model of super-resolution image reconstruction can be written as:

F = D H X + v

(3)

where F denotes the original low-resolution image, X denotes the ideal high-resolution image, D denotes the down-sampling matrix, H denotes the blurring matrix and v denotes the additive Gaussian noise.

In order to reduce the correlation between the measurement matrix and wavelet basis, a low-pass filter ( $Φ$ ) was introduced in super-resolution image reconstruction. $Φ$ can be described as the product of the filter matrix G and the transformation matrix,⁸ i.e. $Φ = φ^{H} G φ$ . Assuming that an ideal high-resolution image X can be represented sparsely, i.e. $a_{X} = Ψ^{- 1} X$ (where $Ψ^{- 1}$ denotes the sparse forward transformation and $Ψ$ denotes the sparse inverse transformation), sampling was conducted on the low-resolution image F based on compressed sensing theory, and a new signal can be acquired as⁹:

f = Φ F = Φ D H X + v = \tilde{Φ} X + v = \tilde{Φ} Ψ a_{X} + v = Θ a_{X} + v

(4)

where $\tilde{Φ} = Φ D H$ and $Θ = \tilde{Φ} Ψ$. If $Θ$ satisfies the restricted isometry property (RIP), i.e. $\tilde{Φ}$ and $Ψ$ are irrelevant, the high-resolution image $\hat{X}$ can then be reconstructed based on compressed sensing theory. The reconstruction formula can be written as:

\hat{X} = \arg\min_{X} \| Ψ^{- 1} X \|_{1} \quad \text{s.t.} \quad \| f - \tilde{Φ} X \|_{2} \leq ε

(5)

In order to guarantee that $Θ$ satisfies RIP, a locally improved Hadamard measurement matrix was used in this study as $φ$ , which was acquired by randomly selecting M rows from an N-dimensional improved Hadamard matrix.¹⁰ Do et al. designed a locally improved Hadamard matrix with high construction speed and low storage requirements:

φ = Q_{M} W_{B} P_{N}

(6)

where $Q_{M}$ denotes the operator that selects M rows, $W_{B}$ denotes a block-diagonal matrix (each block is an improved Hadamard matrix) and $P_{N}$ denotes a random scrambling matrix for ensuring the irrelevance between $φ$ and the original signal.
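The structure of equation (6) can be sketched in Python as follows. The sketch builds the three factors explicitly for small sizes, which is useful for checking the construction but not how a fast implementation would apply them. It uses the standard Sylvester Hadamard matrix as a stand-in, since the "improved" Hadamard matrix is not specified here; the function names and the parameters `N`, `M`, `B` and `seed` are assumptions.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def block_hadamard_measurement(N, M, B, seed=0):
    """Build phi = Q_M W_B P_N as an explicit M x N matrix."""
    assert N % B == 0 and (B & (B - 1)) == 0, "B must be a power of two dividing N"
    rng = np.random.default_rng(seed)

    # P_N: random scrambling (permutation) of the N signal entries.
    P = np.eye(N)[rng.permutation(N)]

    # W_B: block-diagonal matrix whose B x B blocks are normalised
    # (standard, not "improved") Hadamard matrices.
    W = np.kron(np.eye(N // B), hadamard(B) / np.sqrt(B))

    # Q_M: keep M randomly chosen rows out of N.
    rows = np.sort(rng.choice(N, size=M, replace=False))
    return (W @ P)[rows]
```

Because $W_{B}$ is orthogonal and $P_{N}$ is a permutation in this sketch, the selected rows are orthonormal, and only the permutation and the row indices need to be stored rather than a dense M × N matrix.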

Due to the scrambling by $P_{N}$ , the image after being sampled by D is irrelevant to $φ$ ; the blurring matrix H was then convolved with the original image. As reported in reference,⁹ including H in the operation increases the irrelevance between $φ$ and D. Thus, $Θ = Φ D H Ψ$ satisfies RIP.

Based on the SRSR observation model and dictionary learning, this study also made appropriate improvements to the matching pursuit algorithm so as to achieve single-image SRSR on the basis of compressed sensing and the learning dictionary; the specific procedure is described in Algorithm 2.¹¹ The improved algorithm projects the low-resolution image according to the degradation model and obtains the high-resolution image by iterative solution.

Algorithm 2. Detailed single-image super-resolution reconstruction procedures based on compressed sensing and learning dictionary

Step 1: Input high-resolution and low-resolution dictionary pair $D_{h}$ and $D_{l}$ , as well as low-resolution image F;

Step 2: Initialize the iteration number (n) and the reconstructed image as 1 and 0, respectively (i.e. n = 1 and $\hat{X}$ = 0);

Step 3: Based on the degradation model ( $F = D H X + v$ ), perform projection observation on the low-resolution image ( $f = Φ F = \tilde{Φ} X$ );

Step 4: Do the iteration and solve $X_{n + 1}$ according to the following equation (7):

X_{n + 1} = Ψ (Γ_{T} (Ψ^{- 1} (X_{n} + {\tilde{Φ}}^{- 1} (f - \tilde{Φ} X_{n}))))

(7)

where $Γ_{T} (x) = \begin{cases} x, & | x | \geq T \\ 0, & | x | < T \end{cases}$ and ${\tilde{Φ}}^{- 1} = P^{- 1} D^{- 1} Φ^{- 1}$ , in which $P^{- 1}$ denotes the deblurring operator and $Φ^{- 1}$ denotes the inverse transformation on the locally improved Hadamard matrix.⁷

Step 5: Calculate the sparse representation coefficient $a'$ of the low-resolution feature g with respect to dictionary $D_{l}$ , and construct high-resolution image based on the sparse coefficient $a'$ and $D_{h}$ according to the following equation (8):

a' = \underset{a}{\arg \min} \| g - D_{l} a \|_{2}^{2} + \| Z D_{h} a - w \|_{2}^{2} + λ \| a \|_{1}

(8)

where Z denotes the sparse representation of low resolution images, and $w$ denotes the balanced reconstruction error.

Step 6: Update the reconstructed image subset $(\hat{X} = \hat{X} + X_{n + 1})$ with an iteration number of (n + 1);

Step 7: Calculate the residual of the observed signal $(γ, γ = f - \tilde{Φ} \hat{X})$ ;

Step 8: Repeat Step 3–Step 7 and finish the iteration-based reconstruction until $γ = σ^{2}$ .
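The thresholded iteration of Step 4 (equation (7)) can be sketched in Python as below. This is a minimal illustration of that single step, not of the complete Algorithm 2. It assumes that $\tilde{Φ}$ is available as an explicit matrix, that $Ψ$ is orthonormal so that $Ψ^{- 1}$ is its transpose, and that the Moore-Penrose pseudo-inverse may stand in for ${\tilde{Φ}}^{- 1} = P^{- 1} D^{- 1} Φ^{- 1}$; the names `hard_threshold` and `iterative_thresholding` and the stopping tolerance `tol` are hypothetical.

```python
import numpy as np

def hard_threshold(x, T):
    """Gamma_T: keep entries with |x| >= T and set the rest to zero."""
    return np.where(np.abs(x) >= T, x, 0.0)

def iterative_thresholding(f, Phi_t, Psi, T, n_iter=50, tol=1e-6):
    """Iterate equation (7) starting from X_0 = 0.

    f     : measurement vector, shape (M,)
    Phi_t : combined operator (tilde Phi) as an explicit M x N matrix
    Psi   : orthonormal sparsifying basis (N x N), so Psi^{-1} = Psi.T
    """
    # Assumption: the pseudo-inverse replaces P^{-1} D^{-1} Phi^{-1}.
    Phi_inv = np.linalg.pinv(Phi_t)
    X = np.zeros(Phi_t.shape[1])
    for _ in range(n_iter):
        residual = f - Phi_t @ X
        if np.linalg.norm(residual) <= tol:
            break
        back_projected = X + Phi_inv @ residual
        X = Psi @ hard_threshold(Psi.T @ back_projected, T)
    return X

# Usage on a synthetic sparse signal (all values are illustrative).
rng = np.random.default_rng(0)
N, M = 64, 32
Psi = np.eye(N)
Phi_t = rng.normal(size=(M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[[5, 20, 41]] = [4.0, -3.0, 5.0]
X_hat = iterative_thresholding(Phi_t @ x_true, Phi_t, Psi, T=1.0)
```

The threshold T controls how many coefficients survive each pass; a fixed value is used here for simplicity, and convergence of the iteration is not guaranteed for every choice of T.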

Experimental analysis

Currently, super-resolution image reconstruction results are mainly evaluated subjectively or objectively; they can also be evaluated indirectly in practical applications by analyzing the reconstructed images through feature extraction, target detection and identification.

Subjective evaluation refers to experts’ subjective judgment of super-resolution image reconstruction through direct observation or grading. Subjective evaluation lacks objective and accurate evaluation parameters; moreover, different experts may give different evaluation results, which are greatly affected by the experts’ visual perception.

Objective evaluation refers to the quantitative analysis of the reconstructed super-resolution image. Since the reconstructed image has no fixed reference image for comparison, some traditional parameters including mean-square error and cross entropy are no longer applicable. Objective evaluation mainly focuses on the quantitative analysis of the reconstructed image, with the common evaluation parameters including peak signal-to-noise ratio (PSNR), information entropy and mean gradient. In this study, PSNR was used as the evaluation parameter.
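For reference, PSNR can be computed as in the following minimal Python sketch. PSNR compares the reconstruction against a reference image, which in the experiment reported here is the original high-resolution image (Figure 10(b)). The peak value of 255 assumes 8-bit grayscale images, and the function name `psnr` is chosen for the example.

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)      # mean-square error
    if mse == 0:
        return float("inf")              # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A higher value means a smaller mean-square error relative to the reference, so the measure rewards pixel-wise fidelity rather than perceived sharpness.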

In the dictionary training experiment, dictionaries were trained separately for images with different characteristics: grassland images, road texture images, self-similar images and complex scene images. The training results of the grassland dictionary are shown in Figure 6, where Figure 6(a) is a grassland image block and Figure 6(b) is the dictionary trained from grassland images. The training results of the road texture dictionary are shown in Figure 7, where Figure 7(a) is a road image block and Figure 7(b) is the dictionary trained from road texture images. The training results of the self-similar image dictionary are shown in Figure 8, where Figure 8(a) is a self-similar image block and Figure 8(b) is the dictionary trained from self-similar images. The training results of the complex scene dictionary are shown in Figure 9, where Figure 9(a) is a complex scene image block and Figure 9(b) is the dictionary trained from complex scene images.

Figure 6.

Dictionary training for feature image of grassland. (a) Grassland image. (b) Grassland image training dictionary.

Figure 7.

Road texture image dictionary training. (a) Road image. (b) Road texture image training dictionary.

Figure 8.

Self similarity image dictionary training. (a) Self similar image. (b) Self similarity image training dictionary.

Figure 9.

Image dictionary training for complex scenes. (a) Complex scene image. (b) Complex scene image training dictionary.

In the present study, an image with a size of 256 pixels * 256 pixels was selected and degraded to an image with a size of 128 pixels * 128 pixels. Then, the image was reconstructed using bilinear interpolation, sparse-representation-based SRSR and the proposed method; the results are shown in Figure 10.

Figure 10.

Comparison of the super-resolution reconstructed images using different methods. (a) Degraded image (128 pixels * 128 pixels). (b) High-resolution reference image (256 pixels * 256 pixels). (c) Reconstructed image using bilinear interpolation (256 pixels * 256 pixels). (d) Reconstructed image using SRSR method (256 pixels * 256 pixels). (e) Reconstructed image using the proposed method (256 pixels * 256 pixels).

Figure 10(a) shows the degraded image with a size of 128 pixels * 128 pixels, Figure 10(b) shows the original high-resolution image for reference, and Figure 10(c)–(e) displays the reconstructed images using bilinear interpolation, SRSR and the proposed algorithm, respectively.

It can be observed that the reconstructed high-resolution images using SRSR and the proposed algorithm show better visual effect and image quality than the degraded low-resolution image. Figure 11 shows the local enlarged results of the reconstructed images using different algorithms.

Figure 11.

Local enlarged results of the reconstructed images. (a) Original high-resolution image for reference. (b) Reconstructed image using bilinear interpolation. (c) Reconstructed image using SRSR method. (d) Reconstructed image using the proposed method.

Meanwhile, the PSNR values of the images reconstructed by the different algorithms were calculated for evaluation. The PSNR values of the images reconstructed by the bilinear interpolation algorithm, the SRSR algorithm and the proposed method are 26.32, 29.57 and 29.81, respectively. In order to further verify the reconstruction effect of the proposed algorithm, the degraded image was magnified by two times, four times and eight times, respectively, and the magnified images were then reconstructed using the different super-resolution image reconstruction algorithms; the results are listed in Table 1.

Table 1.

Comparison of the PSNR values of the reconstructed images using different super-resolution reconstruction algorithms, during which the degraded image was magnified by two times, four times and eight times, respectively.

Algorithm	PSNR (two times)	PSNR (four times)	PSNR (eight times)
Bilinear interpolation	26.32	22.21	18.57
SRSR algorithm	29.57	23.53	20.23
Proposed algorithm	29.81	23.67	20.31

SRSR: sparse-representation-based super-resolution reconstruction.

As listed in Table 1, the greater the magnification of the degraded image, the smaller the PSNR value of the reconstructed image, indicating poorer reconstruction performance. In comparison, the image reconstructed by the proposed single-image SRSR algorithm based on compressed sensing and learning dictionary shows the largest PSNR value at every magnification and thus the most favorable SRSR performance.
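The margins behind this comparison follow directly from Table 1 and can be recomputed with a few lines of Python (the dictionary layout and the function name `gains` are chosen for the example). Relative to bilinear interpolation the proposed algorithm gains 3.49, 1.46 and 1.74 at two, four and eight times magnification, whereas relative to the SRSR algorithm the gains are 0.24, 0.14 and 0.08.

```python
# PSNR values copied from Table 1 (columns: two, four and eight times).
psnr_table = {
    "Bilinear interpolation": [26.32, 22.21, 18.57],
    "SRSR algorithm": [29.57, 23.53, 20.23],
    "Proposed algorithm": [29.81, 23.67, 20.31],
}

def gains(table, method="Proposed algorithm"):
    """PSNR difference of `method` over every other row, per magnification."""
    ref = table[method]
    return {
        name: [round(a - b, 2) for a, b in zip(ref, vals)]
        for name, vals in table.items()
        if name != method
    }

for name, diff in gains(psnr_table).items():
    print(name, diff)
```

The improvement over the sparse-representation baseline is therefore consistent in sign but small, and it shrinks as the magnification increases.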

Conclusions

By introducing compressed sensing theory, which is widely applied in image processing, into super-resolution image reconstruction, this study achieved favorable SRSR for a single image based on compressed sensing and a learning dictionary. In order to reduce the correlation between the measurement matrix and the wavelet basis, a low-pass filter was introduced in SRSR; meanwhile, a local Hadamard matrix with fast reconstruction and low storage requirements was selected as the measurement matrix to ensure that the image’s reconstruction satisfies RIP in compressed sensing theory; finally, a single image’s SRSR was achieved through an iteration-based reconstruction algorithm. According to the experimental results, the proposed algorithm is superior to the bilinear interpolation algorithm and the SRSR algorithm in reconstruction performance.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was financially supported by the Scientific Research Program of Sichuan Province, China (No. 17ZA0453), and the Scientific Research Program of Yibin University, Sichuan, China (No. 2016QD12).

References

1. Zhong JS. Research on super-resolution reconstruction algorithm of optical remote image based on sparse representation. PhD Dissertation, Nanjing Normal University, 2013, pp.65–75.

2. Deng CZ. Sparse representation for images and the applications. PhD Dissertation, Huazhong University of Science and Technology (Wuhan), 2008.

3. Yang, Wang, Chen, et al. Single-image super-resolution reconstruction via learned geometric dictionaries and clustered sparse coding. IEEE Trans Image Process 2012; 21: 4016–4028.

4. Pan, Yan, Zheng. Super-resolution from a single image based on local self-similarity. Multimed Tools Appl 2016; 75: 11037–11057.

5. Pan SX. Single image super resolution based on multi-scale structural self-similarity. Acta Automat Sin 2014; 24: 594–603.

6. Pan, Huang. Super-resolution method based on CS and structural self-similarity for remote sensing images. Signal Process 2012; 8: 859–872.

7. Yang. Research on image fusion and super-resolution reconstruction algorithm based on compressed sensing theory. IJSIP 2015; 12: 85–94.

8. Yin, Osher, Goldfarb, et al. Bregman iterative algorithms for ell-1 minimization with applications to compressed sensing. SIAM J Imag Sci 2008; 1: 143–168.

9. Blumensath, Davies ME. Gradient pursuits. IEEE Trans Signal Process 2008; 56: 2370–2382.

10. Karahanoglu, Erdogan. An orthogonal matching pursuit: Best-first search for compressed sensing signal recovery. Digital Signal Process 2012; 22: 555–568.

11. Yang, HuaJun, Luo. Study on the super-resolution reconstruction technique for remote sensing image based on compressed sensing. IJSIP 2015; 8: 18.