Abstract
Regarding the problems of insufficient image segmentation intelligence, low compression rate, slow speed for global searching to find the optimal fractal image compression encoding, and bad decoding effect, this article proposes the fractal image compression hybrid algorithm based on convolutional neural network and gene expression programming. Firstly, according to the accurate and fast image classification of deep convolutional neural network and the fast search and matching encoding advantages of gene expression programming, it realizes theoretically the action mechanism of fractal image compression hybrid encoding by combining the convolutional neural network and the gene expression programming; then, it uses the deep convolutional neural network to train and classify the image, and uses the adaptive quadtree segmentation method to segment the classified image, thus generating the domain block and range block classification set. According to the action mechanism of gene expression programming in fractal image compression encoding, it then quickly obtains the optimal solution of fractal image compression encoding by searching and encoding the sub-blocks of range block classification set and the classification set corresponding to the domain. Finally, in the CPU/GPU environment, it conducts the comparative experiment with basic fractal image compression algorithm and fractal image compression algorithm based on convolutional neural network. The experimental results show that this proposed algorithm outperforms similar algorithms in terms of image segmentation speed and accuracy as well as fractal compression encoding speed and compression ratio. Therefore, this algorithm is a fractal image compression algorithm with intelligent segmentation, fast encoding and high compression ratio.
Keywords
Introduction
Fractal image compression is an encoding method completely different from traditional compression technology. Barnsley 1 performs the fractal compression encoding on several specific images and obtains a high compression ratio of 10,000:1. Although its encoding process requires the manual participation, it also shows the great potential of fractal technology in image compression encoding. The existing fractal image compression encoding has the problems of being lack of optimization in segmentation method, long calculation time of fractal compression encoding, and large error in encoding and decoding control. To solve these problems, it requires the strong computing power, intelligent segmentation technology and fast search optimization methods. With the advent of CPU/GPU high-performance computing platforms, its powerful graphics processing capabilities and parallel computing capabilities have made it become a new tool in engineering fields such as image processing. The parallel computing model based on the new platform of CPU/GPU (graphic processing unit), multi-thread, multi-process, and thread hybrid programming method and its application in various fields have become the current research hotspots. The artificial intelligence based on deep learning has made great achievements in recent years, and the Alpha-dog upgraded version Master has consistently defeated 50 top professional GO players in the world, and it has also shown great power in the fields of face and voice recognition, thus all governments greatly increase the input in artificial intelligence, and the deep learning and its application have become the research hotspots. Gene expression programming has its unique advantages in optimization accuracy and convergence speed, and is widely used in image processing and pattern recognition, automatic control, artificial life, machine learning and other fields. Therefore, this paper combines the deep-learning neural network and gene expression programming optimization method on CPU/GPU parallel platform to study the intelligent segmentation and compression hybrid encoding algorithm of fractal image, which has important scientific significance and application value.
Related work
At present, the image compression standard mainly uses the discrete cosine transform (DCT), wavelet transform (DWT) and other technologies, these technologies are mature, but their compression ratio is not high. Fractal image compression technology is a completely different encoding method from traditional compression technology, and it is mainly realized by fractal self-similarity and iterated function system (IFS). For the first time, Barnsley applied the IFS theory to image compression encoding, which achieved a very high compression ratio, but the encoding process required the manual participation. Jacquin 2 proposed a partitioned iterated function system (PIFS) scheme, in which the encoding process can be carried out automatically, but its algorithm operation is huge, resulting in too long encoding time, thus limiting its practicability. 3 Since then, people have proposed various improvement plans for image segmentation, image matching search and encoding, and calculation speed increasing in the fractal compression process.
Fractal image segmentation
Traditional image segmentation methods mainly include threshold method, boundary detection method and regional method. While, the fractal image segmentation mostly uses the block-shaped region segmentation methods such as squares, rectangles, and triangles. For example, Kang et al. 4 divided the domain block and the range block into the squares, and divided all blocks of the image into 72 classes according to the average brightness of the pixels and the corresponding variance of the brightness; Chen et al. 5 divided the image into 360 classes by rectangular blocks; Sun et al. 6 divided it into 72 classes according to the triangle block; Yi 7 proposed an adaptive quadtree partitioning method, which is divided by layers, but its essence is also the square partitioning. Such methods of segmentation by different blocks and performing the matching search among the same class can improve the compression speed. However, it is not conducive to global optimization, and the compression ratio and accuracy of the image will be decreased exponentially. In recent years, the graph-based and cluster-based interactive super pixel segmentation method 4 has emerged, and its representative algorithms include that: Boykov proposed the Graph Cuts algorithm of the image, and Mustafa et al. proposed the multi-scale segmentation wide-band scene reconstruction feature detection algorithm.8,9 This kind of algorithm has the group pixels with similar features, so that the image block contains the image content information that is not possessed by a single pixel, which improves the accuracy of segmentation; however, the semi-automatic interaction method is time-consuming and laborious, which affects the efficiency of the algorithm. At present, the image segmentation mainly adopts the semi-automatic interaction mode, while the automatic block segmentation effect is not ideal. Image segmentation is a key step in image processing (including fractal compression encoding); however, there is no fast automatic segmentation method or general theory for image segmentation. With the introduction of new deep learning artificial intelligence technology and gene expression programming optimization method, it provides a good theoretical and technical basis for the establishment of image intelligent segmentation theory and method. To this end, we use the convolutional neural network (CNN) deep learning to study the image segmentation and construct an image intelligent segmentation method.
Fractal image encoding
In order to improve the encoding quality of fractal images and reduce the search time and range, people conduct the hybrid encoding of discrete transform, genetic algorithm and other methods with fractal images. (1) Commonly used method includes the hybrid encoding of DCT and wavelet transform (WT) with fractal images. They concentrate the image information mainly into the low frequency part of frequency domain through the conversion from time domain to frequency domain, which reduces the encoding range and improves the encoding efficiency and quality. For example, Wang et al. 10 proposed a DCT domain fractal image compression encoding method, which maps the images to the DCT domain for encoding; Karthikeyan et al. 11 proposed a fractal image compression method based on wavelet analysis. Soyjaudah and Jahmeerbacus 12 proposed a wavelet domain fractal image compression algorithm based on quadtree. Their purpose is to map all the images to the wavelet domain, and then perform the block segmentation on the wavelet domain, thus reducing the search amount and range, which can obtain a better compression ratio. However, the cosine and wavelet transformations do not make full use of the similarity between sub-bands in the wavelet domain, and the compression ratio of hybrid image encoding is discounted. (2) Another hybrid encoding method is to combine the optimization methods such as genetic algorithm, ant colony algorithm and gene expression programming with fractal images, and use the optimizing precision of optimization algorithm to automatically classify, thus realizing the matching search within the class, improving the encoding speed, and reducing the blocking effect. For example, Soyjaudah et al. 12 proposed a fractal image compression method based on genetic algorithm by using the fractal image compression of self-organizing map 13 ; Zhao Deping 14 proposed a fast fractal image compression method based on ant colony algorithm; Menassel et al. 15 proposed an improved fractal image compression using wolf pack algorithm, which divides the image space into blocks, and the sliding wolf explores in this space to find other similar smaller blocks. They construct an optimization algorithm for classification search matching, which effectively overcomes the shortcomings of block-based search method to a certain extent and improves the compression ratio and fidelity.16,17 Li et al. 18 applied the gene expression programming to fractal image compression encoding and achieved a better compression ratio. In addition, there are many applications such as niche, BFGS and other optimization algorithms and strategies for hybrid encoding of fractal images.19,20 (3) Fractal hybrid encoding improves the encoding speed and the solving accuracy to a certain extent, and the blocking effect produced by decoding is improved. However, the fractal dimension, affine transformation and its parameters of fractal compression are still the key factors of fractal image compression. Chamorro-Posada 21 proposed the compression fractal dimension to achieve high compression ratio by improving the search accuracy; Omari et al. 22 proposed a new image compression mechanism, which constructs a compression map by using the relationship between rational numbers and corresponding quotients and achieves a high compression ratio while maintaining the image quality. Swalpa proposed a new method for calculating the fractal encoding affine parameters to reduce the computational complexity of fractal encoding. (4) The huge computation amount of fractal image encoding still limits the encoding speed and time.23,24 To this end, people have proposed the parallel algorithm for fractal image compression on a variety of parallel platforms, which greatly improves the compression encoding speed and reduces the encoding time. For example, Wang and Zheng 25 proposed a distributed parallel fractal image compression algorithm in the cluster environment; Li et al. 26 proposed the adaptive image fractal compression parallel algorithm based on relative gradient; in multi-core PC platform, Li et al. 27 proposed the GEP-based parallel algorithm for fractal image compression. In addition, other scholars have proposed many different parallel algorithms for fractal image encoding.28,29 Although the parallel encoding of fractal images in parallel platforms such as multi-core PC and clusters can reduce the encoding time, it still does not meet the actual needs of people. The emergence of CPU/GPU systems has provided us with an opportunity to solve this problem.
Gene expression programming
GEP is a new evolutionary computation algorithm first proposed by Portuguese scholar Candida 30 in 2001, and it is a new member of genetic algorithm family. It has strong function discovery ability and high efficiency, and does not need any prior knowledge during function discovery, with no need to pre-store the type of function model, thus avoiding the blindness of pre-selecting the function types in traditional algorithm modeling. After Candida Ferreira proposed the GEP algorithm, it has attracted the research of many scholars from the whole world. Among them, Xu et al. 31 used the GEP algorithm to study the classification rules, and Zhu et al. 32 applied GEP to one-dimensional chaotic mapping. In China, the research team led by Professor Tang Changjie at Sichuan University and the research team of Professor Yuan Changan at Guangxi Normal University have proposed many more efficient and more adaptable algorithms for different applications. 33 Jedrzejowicz and Wierzbowska 34 proposed the gene expression programming in large data set classification parallel environment. Li et al. 27 applied GEP to the application research of fractal image compression, and used GEP's efficient search ability to quickly search the self-similarity of fractal images, which achieved good results. GEP combines the advantages of both genetic algorithms and genetic programming, and is two to four orders of magnitude more efficient than traditional genetic programming methods when solving the complex problems. It will be widely used in function optimization, combinatorial optimization, image processing and pattern recognition, artificial life, genetic programming, automatic control, machine learning, and production scheduling problems. To this end, we integrate GEP and deep CNNs on the CPU/GPU parallel platform to deeply study the fractal image compression encoding, so as to build a new automatic fast parallel algorithm with high compression ratio.
Deep learning
In 2006, Hinton and Osinde Rosthe at University of Toronto proposed the concept of deep learning, which opens the prelude to the development of deep learning. As a typical algorithm for traditional training of multi-layer networks, BP algorithm actually contains only a few layers of networks, and its training method is not ideal. Hinton and Osinde Rosthe 35 proposed the unsupervised greedy layer-by-layer training algorithm based on deep belief networks (DBNs), which brings hope to solving the optimization problem related to deep structure. The CNN proposed by Lecun et al. 36 is the first real multi-layer structure learning algorithm, which uses the spatial relative relations to reduce the number of parameters, thus improving the BP training performance. Since then, there have been many deformation structures in deep learning, such as denoising auto encoder, DCN, sum product, etc.37,38 BalléLaparra and Simoncelli 39 at New York University proposed the end-to-end optimized image compression (EEOIC), and this optimization method has better compression ratio than standard jpeg and jpeg 2000 compression methods; in addition, many scholars have combined the CNNs with fractal image compression and have also achieved some research results.40–43
In summary, various improved algorithms and hybrid algorithms for fractal image compression encoding being proposed mostly use the fixed, simple automatic or semi-automatic image segmentation methods, resulting in too long encoding time and poor decoding image quality, and they are all facing the PC or PC cluster platform, with low encoding and decoding speed and compression efficiency. Therefore, on the CPU/GPU high-performance parallel platform, we use the fast and accurate image classification characteristics of deep CNNs to study the intelligent segmentation method of fractal images, so as to realize the intelligentization of image segmentation. In fractal image encoding, by using the advantage of high optimizing precision and fast convergence of gene expression programming, we apply it to the solution of all transform parameters of fractal image compression encoding IFS, thus improving the solving speed and accuracy.
Basic ideas of CNN and GEP fusion encoding
On the CPU/GPU platform, the fractal image compression encoding based on deep CNN and gene expression programming uses a two-stage processing method: the first is the fractal image segmentation of deep CNNs; the second is that the gene expression programming performs the quick searching and encoding on the range block classification set and the domain block classification set, and finds the optimal solution of fractal image compression IFS.
Fractal image segmentation of deep CNN based on CPU/GPU
Commonly used deep learning models include Restricted Boltzmann Machines (RBMs), AutoEncoders (AE), CNNs, and DBNs. DBN is an algorithm based on RBM and AE, which is a stack of multi-layered unsupervised RBM and a supervised BP network. The greedy learning weight matrix of each layer needs a longer training time; CNN is a supervised learning algorithm with shared convolution kernel, and it can extract the required features by convolution and better handle 2D and high dimensional data, such as 2D/3D image classification. However, when the CNN network level is too deep, using the BP propagation to modify the parameters will make the parameters near the input layer change slowly. Therefore, the algorithm of this paper selects the CNN algorithm with five convolution layers and five maximum pooling layers for image classification.
Image classification principle of CNNs
The deep CNN consists of multiple convolution layers, pooling layers, fully connected layers and upsampling layers, and its image classification process includes the forward network calculation and the reverse error propagation. This algorithm uses the deep CNN consisting of five convolution layers, five maximum pooling layers, one fully connected layer and one upsampling layer.
Forward network calculation
Assuming that the size of the image B is hb × wb and the size of convolution kernel K is hk × wk (where hb ≥ hk, wb ≥ wk), then the size of convolution map matrix G = B ⊕ K is (hb − hk + 1) × (wb − wk + 1).
The calculation of its each layer is as follows
Network calculation of the jth(j = 1, 3, 5, 7, 9) convolution layer. G1 layer: Input n × n image data matrix B, then conduct the convolution with h narrow convolution kernels Network calculation of the kth (k = 2, 4, 6, 8) pooling layer. P2 layer: Perform the pooling operation after the convolution, the size of pooling window is r × r, each n2 × n2 feature map generates a pooling map of n2/2 × n2/2, and a total of h1 pooling maps are generated. P4 layer: Perform the pooling operation again, the size of pooling window is r × r, each n4 × n4 feature map generates a pooling map of n4/2 × n4/2, and a total of h3 pooling maps are generated. The forward calculation formula of pooling layer is
Calculation of fully connected layer. Expand
Reverse error propagation
4. Loss function. Find the best weight (w) and the bias (b) to minimize the value of loss function, thus the loss function J is a function about w and b, where w and b are the set of all weights and deviations in the network. Therefore, assuming that the loss function of the network is
5. Weight correction of fully-connected layer. The reverse derivation of the fully-connected layer of CNN is consistent with the reverse derivation of BP neural network.
6. Weight correction of convolution layer. The convolution layer is obtained by calculating the network weight through the intermediate convolution kernel. The correction of the weight is based on the error of convolution layer, uses the ReLU excitation function, and is modified according to the weight correction formula of reverse neural network. Therefore, the weight and bias (threshold) correction formula is
Segmentation of fractal image range set and domain set
In the CPU/GPU parallel system, after the convolution, pooling, full connection, and upsampling operations of CNN, the image is classified and restored to the same size as the original image. On this basis, the classified images are segmented into non-overlapping range block sets and overlapping domain block sets, and the parallel adaptive quadtree segmentation method is used for segmentation.
The main thread divides the classified image into four equal non-overlapping sub-blocks, and one sub-block is reserved by itself, and the other three sub-blocks are distributed to other slave threads for processing; the main thread divides the reserved sub-block into four grand-subblocks of equal size, one is reserved, and the other three grand-subblocks are distributed again to other slave threads. Similarly, the other three sub-blocks are also divided into four equal parts by their respective slave threads, and one grand-subblock is reserved by itself, and the other three parts are distributed to other slave threads; until the size of sub-block being divided by each thread satisfies the conditions. Image segmentation generates a categorical range block pool; Similarly, the main thread divides the classified image into four equal non-overlapping sub-blocks, and one sub-block is reserved by itself, and the other three sub-blocks are distributed to other slave threads for processing; the main thread again divides the reserved sub-block into four grand-subblocks of equal size, one is reserved, and the other three grand-subblocks are distributed again to other slave threads. Similarly, the other three sub-blocks are also divided into four equal parts by their respective slave threads, and one grand-subblock is reserved by itself, and the other three parts are distributed to other slave threads; until the size of sub-block being divided by each thread satisfies the conditions. Image segmentation generates a categorical domain block pool.
Fractal image classification and segmentation algorithm for CNNs
Image segmentation is a key step in fractal image compression; however, all existing image segmentation methods have the problems of insufficient automation and intelligence and low image recovery accuracy. To this end, we classify the fractal image through a five-layer CNN in CPU/GPU parallel system environment, and use the adaptive quadtree to automatically segment the classification results. The image classification is processed in parallel by CNN, and it creates multiple threads through the multi-core CPU, and each thread is responsible for transmitting the image convolution processing data to the GPU, and performing the scheduling and data receiving of GPU array processor, respectively. The original image is classified in parallel by CNN on the GPU and its processing includes five convolution layers, five maximum pooling layers, one fully connected layer and one upsampling layer. With the automatic segmentation of image range block and domain block, the classified images are divided into the non-overlapping range block and the overlapping domain block by CPU multi-threading through the adaptive quadtree, thus achieving the parallel segmentation of fractal images. The flow chart of its classification and segmentation algorithm is shown in Figure 1.

Automatic segmentation flow chart of fractal image convolutional neural network.
As shown in Figure 1, the GPU module is the convolution process, including G1, G2, G3, G4, G5 convolution layers and P1, P2, P3, P4, P5 pooling layers, 1 fully connected layer, and 1 upsampling layer. The convolution layers G1, G2, G3, G4, and G5 are calculated according to the parameters given in formula (1) and the module; the P1, P2, P3, P4, and P5 pooling layers are calculated according to the parameters given in formula (2) and the module, and the fully connected layer is calculated according to the parameters given in Formula (3) and the module; the BP reverse error calculation as well as the weight and the convolution weight correction of fully connected layer are calculated according to formulas (4) to (6) respectively. UpSampling restores the feature map of full convolution layer (FullConv) back to the original image.
Fractal image compression based on gene expression programming
Basic principle of fractal compression
If the norm
A grayscale image has the two-dimensional array of grayscale, i.e. z = f(x,y), where (x, y) is the spatial position and z is the grayscale value at the corresponding location. In order to adapt to the processing of grayscale images, the two-dimensional affine transformation is extended to three-dimensional affine transformation. Three-dimensional affine transformation for grayscale images ω:
This affine transformation can then be viewed as the combination of the 2D affine transformation on the (x, y) plane and the grayscale transformation in the Z direction. Where si controls the contrast of grayscale, and oij controls the offset of grayscale. Because the sub-block set in range block classification pool is in the same class with the sub-graph of corresponding sub-block set in domain block classification pool after the classification by the above CNN, and it has self-similarity. The grayscale compression factor sij and the grayscale offset factor oij will not be considered in the matching search. The definition and collage theorem of IFS are described in Jedrzejowicz and Wierzbowska. 34
Action mechanism of GEP in fractal image compression
Gene expression programming, like genetic programming, is developed based on genetic algorithms. However, it uses a new individual description method different from genetic algorithm, and its formal description requires two kinds of symbols: terminators and functions. The formal definition of GEP is as follows:
The chromosomes of gene expression programming are composed of K-expressions. 34 The definition, gene head, tail and GEP basic algorithm of K-expression are described inJedrzejowicz and Wierzbowska. 34
Approximation solution mechanism of GEP
The mechanism of GEP approximation to find the optimal solution is: assuming that A is a color or grayscale image, according to the self-similarity and collage theorem of fractal image and the powerful evolutionary search ability of GEP algorithm, the true value solution of original image is quickly approached in the space R3. However, in the approaching process, the following three problems must be solved.
First, the encoding representation of GEP gene and chromosome; second, the fractal compression fitness function shall be designed to select the good individuals and accelerate the evolution of the system and the solving of the problem; third, a finite number of individuals (chromosomes) are randomly generated to quickly approach the true value solution of original binary image after nine genetic evolution operations, such as individual selection, mutation, reversal, interpolation, root interpolation, gene transformation, single-point recombination, two-point recombination and gene recombination.
Encoding of genes and individuals (chromosomes)
Gene encoding
As known from formula (2), there is a set of finite compression transforms ωij that make up an IFS Fi = [ωi1, ωi2, ωi3, … , ωin]. The ωij compression transform is mapped to the jth gene of the ith chromosome of GEP, Fi represents the ith IFS (chromosome), and several IFSs Fij form a population (i, j = 1, 2, … , n). The function set of the gene is F = {+, −, *, A}; the set of variables is T = {aij, bij, cij, dij, eij, fij, xij, yij | aij, bij,cij, dij, eij, and fij satisfies the condition of formula (2), x ∈ [0, 640], y ∈ [0, 480]}. Assuming that the gene has a head length of h = 9, n = 2, and the tail length t = 10, then the total length of the gene is 19 bits. Each gene can be expressed as the following K-expression
Several IFSs form a population
Then the encoding of an IFS (chromosome) is
Fractal compression fitness function design
The fitness of the IFS of GEP image fractal compression is mainly shown in: the similarity between the decoded image and the original image, the small compression factor, and the small number of compressed affine transformations.
The fitness function is defined as follows
The cross similarity metric formula is as follows
The compression function c(λ) is defined as follows
The affine transformation number L(ξ) function is defined as follows
Genetic operation of GEP
GEP adopts the linear isometric encoding, and the genetic operation process satisfies the principle of “the length of the gene keeps unchanged and only the terminator can appear in the tail,” and the genes of the progeny chromosomes produced by the inheritance are still legal. Its genetic basic operators include nine genetic evolution operations such as selection, mutation, reversal, interpolation, root interpolation, gene transformation, single-point recombination, two-point recombination and gene recombination. Due to limited space, they are not described here, please refer to Hinton and Osinde Rosthe. 35
Hybrid algorithm of deep CNN and gene expression programming
According to the design idea of ‘Image classification principle of CNNs’ section, through the fractal image parallel segmentation of deep CNN and the fractal image compression encoding of gene expression programming, the fractal image compression hybrid algorithm based on deep CNN and gene expression programming is obtained, and its steps are as follows:
Input: Read the original image data; initial parameters of convolution training; initial parameters of gene expression programming such as population size, gene head and tail length, number of genes, maximum iteration number, termination iteration fitness value, mutation rate, interpolation rate, and recombination rate.
Output: Output the optimal coded IFS.
Begin:
Step 1: Create multiple (set k = 2, 4, 8, 16…) threads from the multi-core CPU, one of which is the main thread and the other is the slave thread.
Step 2: Read the original image data, and send the image data to the corresponding array computing core of the GPU by multiple threads.
Step 3: Call the CNN fractal image classification and segmentation algorithm. The image classification is processed in parallel by CNN, which creates multiple threads through the multi-core CPU, and the multiple threads are responsible for transmitting the image convolution processing data to the GPU, and respectively performing the scheduling and data receiving on GPU array processor. The CNN is used to classify the original image in parallel in GPU, and its processing includes five convolution layers, five maximum pooling layers, one fully connected layer and one upsampling layer, and finally the Soft-max layer obtains the classification result, wherein there is one ReLU activation function layer behind each convolution layer, and the pooling layer adopts the maximum pooling.
Step 4: The main thread divides the classified image B of the size
Step 5: The main thread divides the classified image B of the size
Step 6: Initialize the population. In the main thread, input the initial parameters such as population size, gene head and tail length, number of genes, maximum iteration number (maxg), termination iteration fitness value (minf), mutation rate, interpolation rate, recombination rate, and distribute them to each slave thread.
Step 7: Adopt the dynamic task allocation–work pool parallel search and encoding method. The main thread is responsible for the task management and allocation of range block classification pool. First, the complete ith (i = 1, 2… , m) class range sub-block set is sent to the kth (k = 1, 2, … , p) slave thread according to the classification order. If the distribution cannot be fully completed in one time, the remaining range sub-block sets (p + 1, … , m) are allocated as required by slave thread.
Step 8: After the kth (k = 1, 2, … , p) slave thread receives the first range sub-block set, the sub-blocks are grouped together and the sub-block diagrams Rij(i = 1, 2… , m;j = 1, 2… , g) are taken out one by one for encoding with the corresponding domain sub-block set Dij(i = 1, 2… , n; j = 1, 2… , q) in domain block classification pool.
Step 9: In the encoding process, each thread calculates the fitness value according to formulas (11) to (14).
Step 10: The kth (k = 1, 2… , p) slave thread sends nine basic operators of gene expression programming used in the programming process (including selection, mutation, reversal, interpolation, root interpolation, gene transformation, single-point recombination, two-point recombination, and gene recombination) and the calculation of individual fitness values to the k*10 kernels of GPU for parallel computations: do { U
k
1 nuclear calculation: selection operation; U
k
2 nuclear calculation: mutation operation; U
k
3 nuclear calculation: reversal operation; U
k
4 nuclear calculation: interpolation operation; U
k
5 nuclear calculation: root interpolation operation; U
k
6 nuclear calculation: gene transformation; U
k
7 nuclear calculation: single-point recombination; U
k
8 nuclear calculation: two-point recombination; U
k
9 nuclear calculation: gene recombination; U
k
10 nuclear calculation: calculation of individual fitness values; } while (fitness<=minf or gen<maxg)
Step 11: k(k = 1, 2… , p) slave threads will complete the encoding of a range block Rij, and obtain the parameters of classification sub-block sets such as the transform parameters, the domain block Di and the compression transform ωij corresponding to the range block Rij, which are then sent back to the main thread. The main threads form them into the complete and optimal coded IFS according to the original image classification order.
Step 12: The main thread outputs the optimal coded IFS.
Experimental results and analysis
Experimental environment and parameter settings
The hardware environment is: Intel DBS2600CW2 server board, dual XEON W3520 processor, GeForce RTX 2080 8 G memory card, 128GB memory, 80GB SSD and Gigabit Ethernet card. The software is: Windows 7, matlabR2014a, VS2013, CUDA 7.0, and Open CV 3.1.
The experiment uses the Camvid image set for training. Wherein, there are 2169 training sets, 406 evaluation sets and 861 test sets. According to the characteristics of image set, the original 32 classification is roughly divided into 13 categories, namely the cars, roads, trees, pedestrians, sky, bicycles, children, baby carriages, traffic lights, houses, street lamps, sidewalks, and billboards. Since the number of classifications is reduced, the loss weights of different categories integrate the number of each classification according to the training data set, and the weight wi corresponding to the ith classification is equal to the pixel number ni/total pixels in this classification, and the loss weight of each classification is as shown in Table 1. Initial parameters of convolution training. Including setting the mini-batches size as 256, the initial learning rate is 0.01, and the learning rate is divided by 10 for every 10,000 iterations. The method used for weight initialization is the Gaussian distribution with a mean of 0 and a variance of 0.01, and the maximum number of iterations is set at 50,000. Wherein, the parameters of convolution layer 1 are size = 3, pad = 1, stride = 1, and num = 64; the parameters of convolution layer 2 are size = 3, pad = 1, stride = 1, and num = 128; the parameters of convolution layer 3 are size = 3, pad = 1, stride = 1, and num = 256; the parameters of convolution layer 4 and convolution layer 5 are both size = 3, pad = 1, stride = 1, and num = 512; the parameters of pooling layer 1 to pooling layer 5 are all taken as MAX = 2 × 2, size = 2 × 2, and stride = 2; the upsampling layer restores the image to the original image size; the fully-connected layer takes num = 4096; the Soft-max layer obtains the classification results. The main parameter settings of gene expression programming (GEP) are shown in Table 2. Using the binary, grayscale and three color 512 × 512 images to conduct the comparative experiment on the fractal image compression encoding algorithm based on gene expression programming and its parallel algorithm (four threads) and this algorithm.
Classification loss weight.
Basic parameter settings of GEP.
GEP: gene expression programming.
Test results and analysis
The experimental results show that: when the learning rate is set at 0.01, the network converges rapidly. After 30,000 iterations, the loss function value hardly changes, and the correct rate and loss of training set are basically stable. The classification effect of this algorithm and the original SegNet algorithm on the Camvid dataset is shown in Figure 2. Table 3 shows that, among 13 classifications, 8 classifications have the higher classification results than the original SegNet algorithm. This algorithm can also classify the smaller or lesser classification, and the segmentation effect is more refined. Table 4 is the peak signal-to-noise ratio (PSNR) test results based on four compression algorithms: GEP algorithm,
18
GEP parallel algorithm,
27
EEOIC
39
(compression ratio better than jpeg and jpeg 2000) and the algorithm in this paper. Table 5 is the experimental result of their compression ratio, and Table 6 is the running time of these algorithms. As seen from the peak signal-to-noise ratio (PSNR) test results in Table 4, the EEOIC algorithm has the best reconstruction effect, followed by the algorithm in this paper. Because the EEOIC algorithm optimizes the DCT and multi-scale orthogonal WT of JPEG and JPEG 2000 for pixel blocks, its inverse transform has less decoding distortion. While the fractal compression algorithm reconstructs the image by transforming the parameters, the degree of distortion is large, and the reconstruction effect is relatively poor. This algorithm conducts the classification by deep convolutional network, and the segmented range block set and the domain block set have better search and matching precision in the encoding process of gene expression programming, which improves the imaging quality and obtains the better decoding effect than GEP or GEP parallel algorithm. The decoding reconstruction effect is shown in Figure 3. As seen from the compression ratio test results in Table 5, “C-curve and Lena diagram” are binary images, and a relatively high compression ratio is obtained, mainly because they do not need classification and have strong self-similarity characteristics, and the IFS being obtained has fewer transformation parameters, and the compression ratio is relatively high. The color image is relatively complicated, as it contains more information, thus, after the convolution classification and the segmentation, it obtains more IFS parameters, and the compression ratio is relatively small. Although the EEOIC compression ratio is better than jpeg and jpeg 2000, it only optimizes the DCT of JPEG on pixel blocks and the multi-scale orthogonal wavelet decomposition used by JPEG 2000, its compression rate is lower than fractal compression algorithm, and its decompression effect is better than fractal algorithm. Compared with the other three algorithms, this algorithm obtains a higher compression ratio, but the decoding effect is worse than the EEOIC algorithm. As seen from the compression time of the algorithm in Table 6, the compression time of this algorithm in “C curve and Lena map” is about 30–50% faster than “Street view”. Its reason is that, when processing “C curve and Lena map”, it is not necessary to perform classification calculation, which reduces the time expenditure. It is 6–10 times faster than the GEP-based serial algorithm, 3–4 times faster than the GEP-based parallel algorithm (4 threads), and 4–5 times faster than the EEOIC algorithm. It is mainly because this algorithm uses the deep CNN to classify the image first and then performs the segmentation, and its encoding search and matching number are reduced. Meanwhile, the compression encoding time is accelerated by the multi-thread scheduling of CPU/GPU platform and the accelerated operation of CUDA array processor. While the other two GEP algorithms directly segment the original image, which has large block search encoding, and the EEOIC algorithm uses serial coding and its calculation speed is slow.
In summary, this algorithm has a larger advantage in compression ratio and compression time compared to the advanced EEOIC algorithm, but it is slightly worse in compression distortion.

Comparison of segmentation effects on CamVid annotation dataset.
CamVid dataset segmentation results.
Signal-to-noise ratio (PSNR) test results.
GEP: gene expression programming; EEOIC: end-to-end optimized image compression.
Compression ratio test results.
GEP: gene expression programming; EEOIC: end-to-end optimized image compression.
Compression time test results (time unit: second).
GEP: gene expression programming; EEOIC: end-to-end optimized image compression.

Fractal compression encoding and decoding test comparison chart. (a) C curve original image (b) GEP serial reconstruction (c) GEP parallel reconstruction (d) This algorithm reconstruction (e) Lena original image (f) GEP serial reconstruction (g) GEP parallel reconstruction (h)This algorithm reconstruction (i) Street view original image (j) GEP serial reconstruction (k) GEP parallel reconstruction (l) This algorithm reconstruction.
Conclusion
This paper adopts a two-stage CUDA parallel programming hybrid processing method on CPU/GPU platform to segment the fractal image of deep CNN, and uses the gene expression programming to quickly search and encode the classification range block set and the classification domain block set, thus obtaining a higher compression ratio and better decoding reconstruction effect. This algorithm makes full use of the high-performance computing speed of CPU/GPU parallel system, the accurate and fast image classification of CNN, and the fast search and evolution convergence advantages of gene expression programming, thus its convergence speed is fast and its precision is high. However, when applying the CNN on CPU/GPU platform for image segmentation and classification numerical training, because the training data set is not universal and the training classification sample is too small, it has better segmentation and classification effect for the same type of image as the data set, but its segmentation and classification effect are not ideal for non-dataset images. Therefore, it is our next major research work to train different data sets on the CPU/GPU heterogeneous cluster platform in order to obtain the general multi-sample classifier, so as to adapt to the efficient classification and segmentation of various images.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is supported by the National Natural Science Foundation of China under Grant Nos. 61866006,61741203; Guangxi Natural Science Foundation (2016GXNSFAA380243); Guangxi innovation-driven development of special funds project (Gui Ke AA17204091); Guangxi Nanning Science and Technology Development Planning Project (20181015–5).
