Abstract
In recent years, reversible data hiding technology has been widely used in JPEG images for special purposes such as file management and image authentication. Histogram shifting is one of the most popular techniques for achieving reversible data hiding technology. However, invalid shifting in histogram shifting limits the performance of existing reversible data hiding schemes. Therefore, we propose a two-dimensional histogram shifting-based reversible data hiding scheme in this article to improve the performance of marked JPEG images in terms of visual quality and file size. In the proposed histogram shifting method, only the coefficient pairs containing two non-zero quantized discrete cosine transform coefficients are changed for embedding data. Specifically, the coefficient pairs with at least one quantized discrete cosine transform coefficient valued −1 or +1 are shifted and the rests leave room for embedding data. With our proposed reversible data hiding scheme, the number of invalid shifting pixels is reduced so that it improves the performance of marked JPEG images. The experimental results show that the proposed method achieves a higher peak signal-to-noise ratio and has a lower increase in file size than state-of-art methods.
Introduction
For the last few decades, reversible data hiding (RDH) methods are springing up like mushrooms. The RDH is also called lossless data hiding in many cases. The difference between RDH methods and traditional information hiding methods is that the former can recover the original carrier image without losing any information.1–4 And by RDH, not only the hidden data but also the original cover medium can be completely recovered by the authorized users. 5 There are four main categories for existing RDH methods: (1) lossless compression,6–11 (2) difference expansion (DE),12–14 (3) histogram shifting (HS),15–19 and (4) integer-to-integer transform-based methods.20–23 Recently, by applying RDH, the embedding distortions can be eliminated when extracting embedded data. RDH is usually used in some sensitive scenarios with distortions, such as medical images, military images, legal forensics, copyright protection, and other related fields, in which any distortion is not prohibited.
So far, various RDH algorithms have emerged. Among them, the prediction error expansion (PEE) algorithm17,24–36 is widely used. For example, in Qu and Kim, 33 an improved pixel-based pixel value ordering (PPVO) method was proposed to predict each pixel. Compared with previous pixel value ordering (PVO) methods, it was not constrained by blocks and secret data are embedded in some smoother areas. To further reduce distortions and improve embedding capacity, a new scheme was proposed in Long et al., 36 in which an image is divided into three L-shaped blocks with non-overlapping pixels. After these blocks are scrambled with dromion and stream encryption, the difference is calculated, and the paired prediction errors corresponding to each block are accumulated to form a two-dimensional (2D) prediction error histogram to carry data. Then, a halftone image RDH algorithm based on minimizing the visual distortions of pixel flipping and an RDH scheme based on the image texture to reduce invalid pixels shifting were proposed in Yin et al. 37 and Jia et al., 38 respectively. A method proposed by Li et al., 39 to reduce invalid pixel shifting is based on PVO and PEE, which can not only reduce the number of shiftable pixels, but also improve the image visual quality for using PVO in the PEE scheme.
For RDH schemes, PVO is very important and practical. The literature40–43 used block-based ordering (neighboring pixel ordering). Sachnev et al. 44 proposed a popular efficient data hiding using a sorting prediction technique. In Weng et al., 45 a method based on improved dynamic pixel value ordering (IPVO) was proposed, which can flexibly modify the number of pixels in a block by classifying the local complexity into multiple levels.
HS is a very successful method of RDH, which was first proposed by Ni et al. 1 In this method, the histogram is generated and some values in the histogram are extended to carry data. Based on the sensitivity of Human Vision System, Chang et al. 46 introduced discrete cosine transform (DCT) technique and discarded the low-frequency components, and directly modified the continuous zero value coefficients in the mid-frequency coefficients. Xuan et al. 47 proposed that HS technique can be used in JPEG images to extend multiple pairs of bins of DCT coefficient histograms and modify them to carry secret data. Peng et al. 48 proposed a method in which secret data are first encoded into grouped pixels through some specific pixel transformation operations. Then, multi-segment left and right HS based on threshold operation is used to record the pixel change operation. This method is one of the checkerboard prediction techniques and multiple embedding strategies. Gao et al. 49 proposed an RDH method based on prediction error histogram (PEH), which makes the full use of correlations of coded channels, and designed a quadratic time prediction RDH scheme based on 2D PEH. He et al. 50 proposed an improved RDH scheme for JPEG images based on block sorting and frequency selection. According to the distribution of DCT coefficients, the scheme accurately estimates the distortion caused by the embedded data in each block.
The rest of this article is organized as follows. In section “Related works,” the related works are introduced. The proposed scheme is described formally in section “The proposed method.” The experimental results are given in section “Experimental results” and followed by a conclusion in section “Conclusion.”
Related works
As presented in section “Introduction,” lots of RDH schemes are based on HS. We review three main RDH schemes proposed in Huang et al., 51 Wedaj et al., 52 and Xiao et al., 53 respectively, in this section.
Huang et al.’s scheme
Huang et al.
51
selected quantized alternating current (AC) coefficients values of +1 and −1 in some specific smooth blocks for extended embedding. The smoothness of the block is defined according to the number of AC coefficients with 0 value in each block. That is, the smoothness of the
where # means the cardinality of a set. Note that, for a JPEG image of size
where
Wedaj’s scheme
The embedding method in Wedaj et al.
52
is the same as that in Huang et al.
51
The difference is that the former optimized the selection of cover blocks. The embedding efficiency of each AC frequency band should be calculated, compared, and descending ranked, and p blocks with maximum embedding efficiency should be selected to carry data. Then, for each AC band
where
After the embedding efficiency is calculated, descending order is performed from high to low. Therefore, the embedding can be done for the corresponding AC coefficients that conform to the embedding rules until all embedding payloads are satisfied.
Li’s scheme
Similarly, Xiao et al.
53
also calculated the embedding efficiency of all AC coefficient bands. However, different from Huang et al.
51
and Wedaj et al.,
52
they did not embed data in the AC coefficients with value ±1. Instead, they selected a pair of bins
where
The proposed method
In this section, we will describe the proposed scheme in details.
Figure 1 shows the flowchart of the proposed scheme, in which data hiding is applied in quantized discrete cosine transform (QDCT) coefficients of cover images, and it includes two stages, block selection and data embedding. In block selection, every two adjacent QDCT coefficients are paired, and then embedding efficiency is evaluated and sorted in descending order. According to the ordered coefficient pairs, the minimum number of blocks that can meet the requirement of embedding capacity is p, and the secret bits are then embedded.

Flowchart for the proposed method.
Coordinate classification based on QDCTcoefficients pairs
As shown in Figure 2, JPEG encoder includes three main components, namely, DCT, quantitative, and entropy code. Through JPEG compression, the original image is orthogonally transformed. The specific method is used to perform 2D DCT on the non-overlapping 8 × 8 blocks and transport the DCT coefficients into the quantizer. The quantized DCT coefficients are arranged in a zig-zag scanning mode, and direct current (DC) coefficients are modulated by differential pulse code on one hand. Then, the run-length encoding (RLE) on the AC coefficients is pre-compressed and encoded by Huffman so that we can get the JPEG file.

Flowchart of JPEG encoding.
What is more, in Figure 2, all coefficients are obtained by zig-zag scanning in the encoder, and then every two adjacent coefficients are paired. The subsequent embedding is based on the embedding of coefficient pairs. For example,
Figure 3 shows a quantized

Zig-zag scan in

Diagram of the proposed 2D histogram shifting method.
In Figure 4, all the coefficient pairs are divided into three categories: embeddable, immutable, and shiftable pairs. Note that the purple coordinates in Figure 4 (i.e. coefficient pairs) contain at least one 0 QDCT coefficient, and they are immutable coefficient pairs. Since modifying the coefficient 0 will produce more distortions and lead to the degradation of image quality, coefficient pairs containing the coefficient 0 are kept unchanged.
As shown in Figure 4, the red arrows of the solid line indicate that the coefficient pairs are embeddable, and the binary secret message bit is “0” or “1” around these arrows. In addition, the arrows also indicate the corresponding expansion directions of the coefficient pairs. The black dotted arrows indicate the shiftable pairs and the pairs can only be shifted to reserve room. In addition, there are four special circular red arrows, corresponding to four points, (1, 1), (−1, 1), (−1, −1), and (1, −1), when the embedded secret message is “10,” coefficients of these four points will be kept unchanged. What is more, if the embedded secret message is “11,” then the vertical operation is performed, that is, the coordinate value is modified up or down accordingly, as shown in Figure 4.
Block selection
In the method proposed in Huang et al., 51 blocks are sorted according to the number of coefficients with 0 values in the blocks, and the experimental results also accurately show that in blocks with more zero AC coefficients, there are more AC coefficients of +1 or −1 generally. Therefore, taking full advantage of this statistical property, selecting the first p blocks sorted blocks, pairing the coefficients in these blocks, and embedding the data into coefficient pairs with a value of +1 or −1 can reduce the distortion of the file and the incremental value of storage size.
However, we do not take the above scheme into two accounts here. On one hand, all AC coefficients of each block can be classified into three categories: embeddable, immutable, and shiftable AC coefficients. In some cases, the number of embeddable coefficients at a given spatial frequency is greater than the number of shiftable coefficients. Figure 5 shows the total number of embeddable and shiftable coefficients on each AC coefficient bit of all blocks of the aircraft F-16 image and the boat image, respectively. Figure 5 shows that the embeddable AC coefficient at position 25 is obviously more suitable for embedding. On the other hand, the modification cost of each location is different, that is, the non-uniform quantization table structure leads to less distortion caused by modifying the AC coefficients near the DC coefficient than those at high frequency. Therefore, we will choose a method to select the position of the embedding coefficient below.

The number of embeddable and shiftable coefficients at each AC coefficient location: (a) F-16 with QF = 70 and (b) boat with QF = 70.
Fortunately, for a JPEG image with size
So based on the idea of Wedaj et al.
52
and according to equation (3), the embedding efficiency
Data embedding
Compared with the three schemes mentioned in section “Related works,” our method is different in the following two aspects:
In all coefficient pairs, those containing at least one zero element are kept unchanged.
Only coefficient pairs including at least one of the AC coefficients of +1 and −1 are expanded for data embedding.
For convenience, we use
The embedding is implemented as follows:
Step 1: get the sequence
Step 2: if the coordinate
Step 3: next, if
Step 4: when
Step 5: in this step, all operations are based on
When
where |x| represents the absolute value of x.
Otherwise, shifting
Data extraction
Data extraction is reverse processing of data embedding. For the shiftable coefficient pairs, they are shifted back to their original place, while other parts require reversible extraction. Denote
Step 1: this step is to recover all shifted coefficient pairs
where
Step 2: in this step, we do not only need to restore the paired pixel values to their original values
but also extract the bit of the embedded secret message whose binary bit is “0,” that is,
Step 3: the following formula is used to extract the binary bit “1,” that is,
where |x| represents the absolute value of x.
Step 4: for (1, 2), (−1, 2), (−1, −2), and (1, −2), two bits of secret information need to be extracted continuously, that is,
Step 5: finally, for (1, 1), (1, −1), (−1, −1), and (−1, 1), the coefficient pair will be recovered according to formula (16). At the same time, two consecutive binary bits 10, that is,
If the marked image is transmitted in a lossless channel,
Experimental results
Visual quality and file size increment are two performance evaluation metrics commonly used for evaluating RDH methods. In this section, the proposed method is evaluated against three state-of-the-art methods, that is, Huang et al.’s method, 51 Wedaj et al.’s method, 52 and Li et al.’s method. 53
To better show the performance, we give simulations for four methods, that is, the above three methods and ours, and compared each other in both the visual quality and the file size increment, where the setting is coherent.
Six color images—that is, Lena, Sailboat, F-16, Baboon, Peppers, and Splash—with the size of
Visual quality
In this section, we take advantage of PSNR (Peak-Signal-to-Noise-Rate) to evaluate the visual quality of marked JPEG images with fixed embedding payloads. The relationship between PSNRs and embedding payloads on the six marked JPEG images is shown in Figure 6, where the horizontal axis represents the embedding capacity and the vertical axis represents the PSNRs. As shown in Figure 6, with the increasing of embedding payload, PSNRs for the four RDH schemes decrease. However, although PSNRs of the four methods decrease, it can still be found that under the same embedding payload condition, PSNRs of the proposed method are higher than those of the other three methods.

PSNRs and embedding capacity for six test images.
It can be clearly concluded as follows:
PSNRs in the proposed method are higher than those in other methods with the same embedding capacity.
With the increasing of the embedding capacity, PSNRs of the proposed method are closed to those in Xiao et al. 53 The reason is that the proposed method in this article combines every two coefficients in each block into a coefficient pair and maps them to a 2D histogram, which has less embedding capacity than the method in Xiao et al. 53 However, the method in this article uses more blocks, but makes fewer changes to the coefficients within each block.
In order to clearly explain the performance, we list PSNRs of six test images with 2500 bits and 4096 bits embedded by the four schemes in Tables 1 and 2. It can be found that the scheme proposed in this article is with high PSNRs. Comparing Tables 1 and 2, it is found that PSNRs of our scheme are higher than those of the other three schemes with less embedded bits. However, as the number of embedded secret bits increases, PSNRs of the method presented in this article become closer to those of the other three schemes, because when more and more secret data are embedded, more blocks are needed; thus, block selection is equivalent to no selection at all. However, the method of coefficient pairing adopted in this article has less embedding capacity compared with the other three methods.
Comparison of PSNR (dB) when the embedded bits are 2500 bits.
PSNR: peak signal-to-noise ratio.
Comparison of PSNR (dB) when the embedded bits are 4096 bits.
PSNR: peak signal-to-noise ratio.
These experiment results show that the proposed scheme goes well, which is profited from the 2D RDH: on one hand, we select relatively smooth regions for embedding through the pairwise method and block selection on the 2D JPEG domain; on the other hand, we discard all the AC coefficients containing 0. The reason is that whenever a zero AC coefficient is modified to non-zero, an extra symbol is needed to be coded, which will increase the size of the file.
Also, in some images, such as Lena and AircraftF-16 in Figure 6, PSNRs in Huang et al. 51 are larger than those of the other three schemes, which may be due to the texture features. It is worth mentioning that our proposed method is based on 2D HS method, and only the AC coefficient ±1 is selected for embedding, which is different from Xiao et al., 53 so the improvement of PSNRs is not significant in some images.
File size increment
For RDH, the file size increment is also an important comparison index. In general, a better RDH scheme of JPEG images should minimize the increase storage size as much as possible while ensuring the embedding capacity. Table 3 lists the size increasing of each test image compared with the original image at different embedding capacities when QF = 70 and QF = 80. In Table 3, it demonstrates that in the four schemes, all the file sizes increase. It means that embedding secret message necessarily results in increasing the file size of the marked image compared to the original image. However, although in this article, two values in a coefficient pair may be increased or decreased by 1, respectively, or kept unchanged in each embedding operation, the size of the final file does not increase significantly.
The file size increment (bits) of each test image compared with the original image at different embedding capacities when QF = 70 and QF = 80.
Figure 7 shows the average file size increment for the three reference methods and the proposed method for different embedding capacities. The horizontal axis represents the size of embedded capacity (in bits), and the vertical axis represents the value of increment of the file.

As shown in Figure 7, the file size increment is compared with that in Huang et al. 51 and Wedaj et al. 52 under the same embedded capacity condition. It shows that the file size increment of the method in this article is smaller and better. However, this advantage is not obvious comparing with that in Xiao et al. 53 Similarly, as shown in Figure 7, the comparison of file size increment between the methods in this article and Xiao et al. 53 under the same embedding capacity is very close or even slightly insufficient. The reason is that data are embedded in coefficient pairs. Each embedding operation is to modify at least one AC coefficient, and in most cases, both two AC coefficients will be modified. Xiao et al. 53 did not embed data in all the AC coefficients with value +1 or −1, but adaptively select AC coefficients.
Figure 8 shows the comparison of the file size increment with different payloads when QF = 80. The scheme in this article has smaller file size increment. What is more, combining the results in Figure 7 with Figure 8, it can be found that with different QF conditions, the growth trend of file size increment is same.

Computational complexity
Usually, time complexity is used to illustrate the computational complexity. And the time complexity of schemes in this article and the research studies51–53 is
Average runtime (seconds) of the six test images.
Conclusion
JPEG image format is becoming a research hotspot because of its universality. In this article, a novel RDH method based on 2D HS is proposed. It is based on non-zero AC coefficient pairs. In general, modifications to JPEG images introduce distortions and file size increment. The experimental results have shown that our proposed RDH scheme can reduce the distortions while ensuring the image quality, which is better than some advanced RDH schemes. Also, the bit rate increase of the marked image is lower than state-of-the-art schemes. This proposed method can be applied to the sensitive scenarios with distortions, especially for the embedding distortions. In this article, secret data are only embedded in QDCT coefficients with +1 and −1 values. However, more embedding locations would be discussed for more payloads and the application of the proposed scheme would be an interesting topic in the future.
Footnotes
Handling Editor: Peio Lopez Iturri
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the National Key Research and Development Program of China (grant no. 2020YFC0833406), the National Natural Science Foundation of China (NSFC) under the grant nos 62102112, 61901096, and 61902085, and the Basic Research Plan of Guizhou Province (grant no. Qiankehejichu-ZK (2021) Yiban310).
