
Abstract

Reversible data hiding based on video compression standards, such as H.264/advanced video coding, has attracted increasing attention from researchers because it can be applied to tasks such as error concealment and privacy protection. This has motivated us to propose a novel two-dimensional reversible data hiding method with high embedding capacity in this article. In this method, all selected quantized discrete cosine transform coefficients are first paired two by two. Each zero coefficient-pair can then embed 3 information bits, and the coefficient-pairs containing only one zero coefficient can embed 1 information bit. In addition, only one coefficient of each of the remaining coefficient-pairs needs to be changed for reversibility. Therefore, the proposed two-dimensional reversible data hiding method obtains higher embedding capacity than the related work. Moreover, the proposed method causes less degradation in terms of peak-signal-to-noise ratio and structural similarity index, and has less impact on bit-rate increase.

Keywords

Two-dimensional reversible data hiding, embedding capacity, quantized discrete cosine transform coefficients, H.264/advanced video coding, privacy protection

Introduction

With the introduction of sensor networks, many smart devices are used to collect large amounts of data, including images, videos, speech, and text, for smart homes, health monitoring, traffic control, and so on. This makes people’s lives more convenient, but it may also lead to the leakage of personal information. For example, face and fingerprint information in public videos can be abused. Therefore, it is important to preserve the key content of a video without leaking personal information. Nowadays, distributing digital videos worldwide has become easier because of the rapid development of high-speed broadband Internet and video coding standards. For instance, many social applications, such as Skype, Facebook, WhatsApp, WeChat, Blog, and QQ, can be used to spread digital videos. This has raised many concerns and may lead to many problems, even criminal activities, such as the illegal distribution of digital movies and the leakage of personal privacy information in public videos. Therefore, finding effective ways to solve or prevent these problems has become necessary.

Currently, encryption and digital watermarking are two techniques commonly applied to digital multimedia, such as images and videos, to address these problems. Applying encryption to images or videos (also referred to as motion images) may incur high computational complexity. In addition, the encrypted result may no longer be format-compatible with a video codec. Encryption also usually makes the content of images or videos unavailable, whereas in some cases, such as copyright protection, the content should remain available. Hence, many researchers have studied digital watermarking in videos, not only to solve these problems but also for other purposes, such as broadcast monitoring, copy or playback control, online location, and content filtering.¹ Digital watermarking is a part of data hiding (DH),^1,2 and it is classified into irreversible and reversible watermarking, corresponding to irreversible DH and reversible data hiding (RDH), respectively. Compared with irreversible data embedding, RDH has attracted much more attention because it can embed additional information into digital media, such as images and videos, and recover the original media content after the embedded information is extracted from the marked media.^3,4

In the past two decades, RDH in images has developed rapidly and produced many achievements.⁵ For instance, Wu et al.⁶ designed an RDH scheme for encrypted palette images, since palette images are widely used in real life. In their scheme, a color partitioning method uses the palette colors to construct a certain number of embeddable color triples for embedding the secret data. As a result, their scheme provides a relatively high data-embedding payload at low computational complexity. Recently, Yang et al.⁷ proposed an adaptive real-time RDH scheme for JPEG images, which is realized by using successive zero coefficients in the zig-zag order of discrete cosine transform (DCT) blocks. Their experimental results verify that the scheme enhances embedding capacity while maintaining image quality. Moreover, Chen and Wang⁸ proposed an RDH method with high embedding payload for JPEG images. In this method, each quantized discrete cosine transform (QDCT) coefficient is changed to carry 1 information bit, which leads to high embedding payloads.

In videos, there are many kinds of coding parameters that can be modified for RDH, and for DH in general. Therefore, compared with images, videos offer much more room for developing RDH and DH techniques. Recently, with the development of video compression standards, such as H.264/advanced video coding (AVC)⁹ and high-efficiency video coding (HEVC),¹⁰ some video DH methods^2,11–16 and video RDH methods^17–23 have been reported. The DH methods aim at improving embedding capacity,^12,14 stopping intra-frame drift,^11,13,16 and reducing bit-rate increase.²⁴ The RDH methods aim at improving error concealment performance,¹⁸ enlarging embedding capacity,¹⁷ protecting privacy,^20,23 and preventing inter-frame distortion drift.¹⁹ However, compared with the development of RDH in images, this progress is not enough, which has motivated us to continue researching RDH in video. Recently, Xu and Wang¹⁸ proposed a two-dimensional (2D) RDH method, as shown in Figure 1, for intra-frame error concealment in videos. Compared with a one-dimensional RDH method, 2D RDH achieves better performance in terms of peak-signal-to-noise ratio (PSNR) and structural similarity index (SSIM). Thus, Xu and Wang’s scheme is better than Chung et al.’s scheme.²⁵ However, Xu and Wang did not make full use of the point (0,0), and only a part of the coefficient-pairs containing one zero coefficient are mapped for DH. This has motivated us to propose a novel 2D RDH method in this article to improve the embedding capacity. In our experiments, we adopt the block selection method of Chen et al.¹¹ and implement both the 2D RDH method of Xu and Wang¹⁸ and our proposed 2D RDH method in the H.264/AVC reference software JM12.0.²⁶ In other words, we compare them under identical conditions. Experimental results verify that our proposed method outperforms Xu and Wang’s method in terms of embedding capacity. Furthermore, our proposed 2D RDH method causes little degradation in visual quality and has little impact on coding efficiency in terms of bit-rate increase.

Figure 1.

Illustration of Xu and Wang’s two-dimensional histogram modification.¹⁸

The remainder of this article is organized as follows. Section “Proposed method” presents the proposed 2D RDH method. Experimental results and analysis are given in section “Experimental results and analysis.” Finally, section “Conclusion” draws some conclusions.

Proposed method

In common RDH methods, zero QDCT coefficients are not exploited to embed information in compressed images and videos. Chen et al.¹⁷ therefore presented a video RDH scheme that uses zero QDCT coefficient-pairs from high-frequency areas. Based on Chen et al.’s¹⁷ work, we propose a novel 2D RDH method for H.264/AVC videos as follows. The histogram modification is shown in Figure 2.

Figure 2.

Illustration of the proposed two-dimensional histogram modification.

Data embedding

During data embedding, our proposed 2D RDH method operates on paired QDCT coefficients, so all selected coefficients are first paired two by two. In the following, each coefficient-pair is also called a point, $A(x, y)$, denoted as $(x, y)$ for short. All points constitute a set denoted as $A$, and the points in $A$ are changed for data embedding. In the embedding procedure, $b$ denotes the binary string of information bits carried by one point. Each coefficient-pair is changed as follows.

Shifting

1. If $y \neq 0$ , the point $A (x, y)$ is changed by

(x', y') = \begin{cases} (x, y + 1), & \text{if } y > 0 \\ (x, y - 1), & \text{if } y < 0 \end{cases}

(1)

2. If $A (x, y) = (- 1, 0)$ , that is, $x = - 1$ and $y = 0$ , the point $A (x, y)$ is changed by

(x', y') = (x, y - 1) = (- 1, - 1)

(2)

3. If $A (x, y) = (1, 0)$ , the point $A (x, y)$ is changed as follows

(x', y') = (x + 1, y) = (2, 0)

(3)

Data embedding

1. If $x < - 1$ and $y = 0$ , the point $A (x, y)$ is changed by equation (4) for data embedding

(x', y') = \begin{cases} (x, y - 1), & \text{if } b = 1 \\ (x - 1, y), & \text{if } b = 00 \\ (x, y + 1), & \text{if } b = 01 \end{cases}

(4)

2. If $x > 1$ and $y = 0$ , the point $A (x, y)$ is changed by equation (5) for data embedding

(x', y') = \begin{cases} (x, y - 1), & \text{if } b = 1 \\ (x + 1, y), & \text{if } b = 00 \\ (x, y + 1), & \text{if } b = 01 \end{cases}

(5)

3. If $x = y = 0$ , the point $A (x, y)$ is changed by equation (6) for data embedding

(x', y') = \begin{cases} (x, y), & \text{if } b = 000 \\ (x, y - 1), & \text{if } b = 001 \\ (x - 1, y), & \text{if } b = 010 \\ (x + 1, y), & \text{if } b = 100 \\ (x, y + 1), & \text{if } b = 011 \\ (x + 1, y + 1), & \text{if } b = 101 \\ (x + 1, y - 1), & \text{if } b = 110 \\ (x - 1, y + 1), & \text{if } b = 111 \end{cases}

(6)

Exploiting equations (1)–(6), information can be embedded into the videos reversibly.
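The shifting and embedding rules of equations (1)–(6) can be sketched in a few lines of Python. This is only an illustrative sketch, not the JM12.0 implementation used in the experiments: the names `embed_pair`, `embed`, and `ZERO_PAIR_TARGETS` are hypothetical, the payload is assumed to be a string of `'0'`/`'1'` characters, and the caller is assumed to supply enough bits for every embeddable pair (otherwise an exception is raised).

```python
from typing import List, Tuple

# Equation (6): 3-bit code word -> marked point for the zero pair (0, 0).
ZERO_PAIR_TARGETS = {
    "000": (0, 0), "001": (0, -1), "010": (-1, 0), "100": (1, 0),
    "011": (0, 1), "101": (1, 1), "110": (1, -1), "111": (-1, 1),
}


def embed_pair(x: int, y: int, bits: str) -> Tuple[int, int, int]:
    """Map one coefficient-pair to its marked pair.

    Returns (x', y', n), where n is the number of leading characters of
    `bits` that were embedded (0 for pairs that are only shifted).
    """
    if y != 0:                      # equation (1): shift away from the x-axis
        return (x, y + 1, 0) if y > 0 else (x, y - 1, 0)
    if x == -1:                     # equation (2)
        return (-1, -1, 0)
    if x == 1:                      # equation (3)
        return (2, 0, 0)
    if x == 0:                      # equation (6): three bits per zero pair
        xp, yp = ZERO_PAIR_TARGETS[bits[:3]]   # KeyError if fewer than 3 bits
        return (xp, yp, 3)
    # Equations (4) and (5): |x| > 1 and y = 0, prefix code {1, 00, 01}.
    if bits[:1] == "1":
        return (x, -1, 1)
    if bits[:2] == "00":
        return (x - 1 if x < 0 else x + 1, 0, 2)
    if bits[:2] == "01":
        return (x, 1, 2)
    raise ValueError("payload exhausted or not a 0/1 string")


def embed(pairs: List[Tuple[int, int]], bits: str) -> Tuple[List[Tuple[int, int]], int]:
    """Embed `bits` into `pairs` in order; returns (marked pairs, bits used)."""
    marked, pos = [], 0
    for x, y in pairs:
        xp, yp, n = embed_pair(x, y, bits[pos:])
        marked.append((xp, yp))
        pos += n
    return marked, pos


if __name__ == "__main__":
    print(embed([(0, 0), (2, 0), (1, 0), (0, 3)], "11001"))
```

In this toy run the zero pair carries `110`, the pair (2, 0) carries `01`, and the pairs (1, 0) and (0, 3) are only shifted to keep the mapping invertible.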

Data extraction and video recovery

Corresponding to the procedure of data embedding, the data extraction and the video recovery are addressed as follows.

Data extraction

1. For a point $A'(x', y')$, if $x' < -1$ or $x' > 2$, and the absolute value of $y'$ is not greater than 1, the embedded information $b$ is extracted by

b = \begin{cases} 1, & \text{if } y' = -1 \\ 00, & \text{if } y' = 0 \\ 01, & \text{if } y' = 1 \end{cases}

(7)

2. If $A'(x', y') = (2, y')$ and $|y'| = 1$, the embedded information $b$ is extracted by

b = \begin{cases} 01, & \text{if } y' = 1 \\ 1, & \text{if } y' = -1 \end{cases}

(8)

3. For a point $A'(x', y')$, if the absolute values of $x'$ and $y'$ are both not greater than 1 and $x'$ and $y'$ are not both −1, the embedded information $b$ is extracted by

b = \begin{cases} 000, & \text{if } (x', y') = (0, 0) \\ 001, & \text{if } (x', y') = (0, -1) \\ 010, & \text{if } (x', y') = (-1, 0) \\ 100, & \text{if } (x', y') = (1, 0) \\ 011, & \text{if } (x', y') = (0, 1) \\ 101, & \text{if } (x', y') = (1, 1) \\ 110, & \text{if } (x', y') = (1, -1) \\ 111, & \text{if } (x', y') = (-1, 1) \end{cases}

(9)

Video recovery

1. For a point $A'(x', y')$, if $x' < -1$ and the absolute value of $y'$ is not greater than 1, this point is restored by

(x, y) = \begin{cases} (x' + 1, y'), & \text{if } y' = 0 \\ (x', 0), & \text{if } y' \neq 0 \end{cases}

(10)

2. For a point $A'(x', y')$, if $x' > 1$ and the absolute value of $y'$ is not greater than 1, this point is restored by

(x, y) = \begin{cases} (x' - 1, y'), & \text{if } y' = 0 \\ (x', 0), & \text{if } y' \neq 0 \end{cases}

(11)

3. For a point $A'(x', y')$, if the absolute values of $x'$ and $y'$ are both not greater than 1 and $x'$ and $y'$ are not both −1, this point is restored by

(x, y) = (0, 0)

(12)

4. For a point $A'(x', y')$, if the absolute value of $y'$ is greater than 1, this point is restored by

(x, y) = \begin{cases} (x', y' - 1), & \text{if } y' > 1 \\ (x', y' + 1), & \text{if } y' < -1 \end{cases}

(13)

5. If $A'(x', y') = (2, 0)$, this point is restored by

(x, y) = (1, 0)

(14)

6. If $A' (x', y') = (- 1, - 1)$ , this point is restored by

(x, y) = (-1, 0)

(15)

By using equations (10)–(15), the original compressed videos can be restored.
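The extraction and recovery rules can be sketched as a single inverse mapping per marked point. The sketch below is illustrative only; `recover_pair`, `recover`, `ZERO_PAIR_BITS`, and `AXIS_BITS` are hypothetical names. It assumes that the code word for $y' = 1$ is `01` in both equations (7) and (8), so that extraction inverts equations (4) and (5), and that a marked point on the boundary $x' = \pm 2$ with $|y'| = 1$ is restored to $(x', 0)$.

```python
from typing import List, Tuple

# Equation (9): marked point around the origin -> 3-bit code word.
ZERO_PAIR_BITS = {
    (0, 0): "000", (0, -1): "001", (-1, 0): "010", (1, 0): "100",
    (0, 1): "011", (1, 1): "101", (1, -1): "110", (-1, 1): "111",
}
# Equations (7) and (8) for y' != 0: y' -> code word.
AXIS_BITS = {-1: "1", 1: "01"}


def recover_pair(xp: int, yp: int) -> Tuple[int, int, str]:
    """Return (x, y, b): the restored pair and the bits it carried ('' if none)."""
    if abs(yp) > 1:                              # equation (13): undo the shift
        return (xp, yp - 1, "") if yp > 1 else (xp, yp + 1, "")
    if (xp, yp) == (-1, -1):                     # equation (15)
        return (-1, 0, "")
    if (xp, yp) == (2, 0):                       # equation (14)
        return (1, 0, "")
    if (xp, yp) in ZERO_PAIR_BITS:               # equations (9) and (12)
        return (0, 0, ZERO_PAIR_BITS[(xp, yp)])
    if (xp, yp) == (-2, 0):
        raise ValueError("(-2, 0) is never produced by the embedding rules")
    # Remaining points have |x'| >= 2 and |y'| <= 1.
    if yp != 0:                                  # equations (7)/(8), (10)/(11)
        return (xp, 0, AXIS_BITS[yp])
    return (xp + 1 if xp < 0 else xp - 1, 0, "00")


def recover(marked: List[Tuple[int, int]]) -> Tuple[List[Tuple[int, int]], str]:
    """Restore all pairs in order and concatenate the extracted bits."""
    pairs, bits = [], []
    for xp, yp in marked:
        x, y, b = recover_pair(xp, yp)
        pairs.append((x, y))
        bits.append(b)
    return pairs, "".join(bits)


if __name__ == "__main__":
    print(recover([(1, -1), (2, 1), (2, 0), (0, 4)]))
```

Checking the branches in this order (shifted points first, then the two special points, then the zero-pair neighbourhood) keeps the regions disjoint, which is what makes the mapping invertible.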

Analysis of embedding capacity and distortion

To analyze the embedding capacity and distortion of our proposed method, we first define three sets as follows

\begin{aligned} A_{1} &= \{(0, 0)\} \\ A_{2} &= \{(x, y) \mid x < -1, y = 0\} \cup \{(x, y) \mid x > 1, y = 0\} \\ A_{3} &= \{(x, y) \mid x \in \mathbb{Z}, y > 0\} \cup \{(x, y) \mid x \in \mathbb{Z}, y < 0\} \cup \{(1, 0)\} \cup \{(-1, 0)\} \end{aligned}

where $A_{1}$ and $A_{2}$ are used for data embedding and $A_{3}$ is shifted for reversibility. Therefore, the embedding capacity is calculated by

EC = k_{A_{1}} \times h (A_{1}) + k_{A_{2}} \times h (A_{2})

(16)

where $k_{A_{1}}$ and $k_{A_{2}}$ denote the average embedding rates per coefficient-pair for the sets $A_{1}$ and $A_{2}$, respectively, and $h(A)$ is defined by

h(A) = \# \{A_{set} \mid A_{set} = A\}

(17)

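where $\#$ denotes the cardinality of a set, so that $h(A)$ is the number of points belonging to the set $A$. Each point in $A_{1}$ carries 3 bits, and each point in $A_{2}$ carries 1 or 2 bits, that is, $\frac{3}{2}$ bits on average if the information bits are assumed to be uniformly distributed. Thus, equation (16) can be rewritten as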

EC = 3 \times h (A_{1}) + \frac{3}{2} \times h (A_{2})

(18)

In our experiments, encoding Foreman with the H.264/AVC encoder at quantization parameter (QP) = 26 gives $h(A_{1}) = 2375$, $h(A_{2}) = 149$, and $h(A_{3}) = 1636$. Therefore

EC = 3 \times 2375 + \frac{3}{2} \times 149 = 7348.5

This estimate is very close to the result shown in Table 1, that is, 7343 bits. Moreover, the embedding distortion can be defined by

ED = d_{A_{1}} \times h (A_{1}) + d_{A_{2}} \times h (A_{2}) + d_{A_{3}} \times h (A_{3})

(19)

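where $d_{A_{1}}$, $d_{A_{2}}$, and $d_{A_{3}}$ denote the average modification rates per coefficient-pair for the sets $A_{1}$, $A_{2}$, and $A_{3}$, respectively. Among the eight targets of equation (6), one leaves the pair unchanged, four change one coefficient, and three change both coefficients, which gives $\frac{10}{8} = \frac{5}{4}$ when the targets are equally likely; every point in $A_{2}$ and $A_{3}$ changes exactly one coefficient. Thereby, equation (19) can be rewritten as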

ED = \frac{5}{4} \times h (A_{1}) + h (A_{2}) + h (A_{3})

(20)

Table 1.

Maximum embedding capacity (bits) on video sequences.

Sequences	No. of frames	QP = 24, Xu and Wang¹⁸	QP = 24, Proposed	QP = 26, Xu and Wang¹⁸	QP = 26, Proposed	QP = 28, Xu and Wang¹⁸	QP = 28, Proposed
Akiyo	300	6022	9263	5226	8088	4108	6291
Carphone	382	6018	9309	5296	8153	5502	8422
Claire	494	3894	6043	2934	4587	2412	4685
Coastguard	300	5792	9158	5780	9060	5696	8831
Container	300	3046	4932	3430	5448	2616	4156
Foreman	300	4510	7015	4750	7343	4834	7461
Hall Monitor	300	5466	8457	5432	8289	5398	8254
Miss America	150	2086	3204	1720	2628	1664	2543
Mobile	300	2908	5097	3234	5561	3982	6586
Mother–Daughter	300	4688	7227	4778	7345	4352	6634
News	300	4862	7705	4370	6955	4964	7768
Suzie	150	2786	4235	2570	3946	2348	3598
Average	−	4339	6803	4126	6450	3989	6269

QP: quantization parameter.

Furthermore

ED = \frac{5}{4} \times 2375 + 149 + 1636 = 4753.75

In fact, the embedding distortion cannot be measured by equation (19), since H.264/AVC uses intra-frame and inter-frame prediction. Equation (19) gives the total number of modifications of QDCT coefficients rather than the embedding distortion. Finding a reasonable way to calculate the embedding distortion is a big challenge and a direction for our future research; we do not address it further in this article.
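The constants in equations (18) and (20) and the two numerical values for Foreman can be re-derived with exact fractions. The sketch below is only a check of the arithmetic; it assumes that the information bits are independent and uniformly distributed, so that the code words 1, 00, and 01 occur with probabilities 1/2, 1/4, and 1/4 and the eight targets of equation (6) are equally likely. The variable names are hypothetical.

```python
from fractions import Fraction

# Average bits per pair in A_2: prefix code {1, 00, 01} under uniform bits.
k_a2 = Fraction(1, 2) * 1 + Fraction(1, 4) * 2 + Fraction(1, 4) * 2
k_a1 = 3  # every zero pair carries exactly three bits

# Average number of changed coefficients per pair in A_1: the eight
# equally likely targets of equation (6), counted by how far they move.
targets = [(0, 0), (0, -1), (-1, 0), (1, 0), (0, 1), (1, 1), (1, -1), (-1, 1)]
d_a1 = Fraction(sum(abs(dx) + abs(dy) for dx, dy in targets), len(targets))
d_a2 = d_a3 = 1  # exactly one coefficient changes by one

# Point counts reported for Foreman at QP = 26.
h_a1, h_a2, h_a3 = 2375, 149, 1636

ec = k_a1 * h_a1 + k_a2 * h_a2                  # equation (18)
ed = d_a1 * h_a1 + d_a2 * h_a2 + d_a3 * h_a3    # equation (20)

if __name__ == "__main__":
    print("k_A2 =", k_a2, " d_A1 =", d_a1)
    print("EC =", float(ec), "bits; ED =", float(ed), "modifications")
```

Under these assumptions the script reproduces the factors 3/2 and 5/4 as well as the values 7348.5 and 4753.75 quoted in this subsection.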

Experimental results and analysis

This section contains four subsections, that is, setup, embedding capacity, visual quality, and bit-rate variation.

Setup

To evaluate the performance of the proposed 2D RDH method, we implemented it in the H.264/AVC reference software JM12.0.²⁶ Twelve standard video sequences, that is, Akiyo, Carphone, Claire, Coastguard, Container, Foreman, Hall Monitor, Miss America, Mobile, Mother–Daughter, News, and Suzie (as shown in Figure 3), downloaded from the website²⁷ at a resolution of $176 \times 144$, are used in our experiments. The main configuration parameters of JM12.0 are given in Table 2; QP is discussed in the following subsections, and the parameters not mentioned keep their default values. The group of pictures (GOP) structure is IBPBPBPBPBPBPBPB. To fairly compare the proposed 2D RDH method with the related method proposed by Xu and Wang,¹⁸ we use the block selection method proposed by Chen et al.¹¹ to select blocks for data embedding. In other words, the two methods are compared under identical conditions.

Figure 3.

Test video sequences: (a) Akiyo, (b) Carphone, (c) Claire, (d) Coastguard, (e) Container, (f) Foreman, (g) Hall Monitor, (h) Miss America, (i) Mobile, (j) Mother–Daughter, (k) News, and (l) Suzie.

Table 2.

Configuration parameters of the JM12.0 software.

Parameter	Configuration
Profile	Main
Frame rate	30
Rate distortion optimization	On
IntraPeriod	8
Symbol mode	0: CAVLC
FrameSkip	1
NumberBFrame	1

CAVLC: context-adaptive variable length coding.

In addition, we use embedding capacity, visual quality, and bit-rate variation to measure the performance of our proposed 2D RDH method. PSNR and SSIM²⁸ are used to objectively evaluate the visual quality of marked videos. Bit-rate comparisons show the impact of our proposed 2D RDH method on the coding efficiency of the H.264/AVC encoder. In the following subsections, the “Original” values of PSNR, SSIM, and bit-rate are computed with the original H.264/AVC encoder, and the other values are computed with the H.264/AVC encoder combined with the corresponding DH method. More analyses are given as follows.

Embedding capacity

Table 1 shows the maximum embedding capacities obtained on the 12 video sequences mentioned in section “Setup” by Xu and Wang’s method¹⁸ and by our proposed method. In Table 1, QP takes three values, that is, 24, 26, and 28, and it determines the quantization step of the H.264/AVC encoder.⁹ According to Table 1, our proposed method achieves a larger maximum embedding capacity than Xu and Wang’s method¹⁸ on all 12 video sequences. For example, on Miss America, our proposed method obtains 3204, 2628, and 2543 bits for QP = 24, 26, and 28, respectively, whereas Xu and Wang’s method¹⁸ obtains 2086, 1720, and 1664 bits. Moreover, our proposed method obtains average maximum embedding capacities of 6803, 6450, and 6269 bits, which are greater than those of Xu and Wang’s method,¹⁸ that is, 4339, 4126, and 3989 bits. These results verify that our proposed 2D RDH method indeed has an advantage in embedding capacity over Xu and Wang’s method.¹⁸

Visual quality

In this subsection, we measure the visual quality of videos marked by our proposed method from two perspectives. On the one hand, Figures 4 and 5 are given to subjectively evaluate the visual quality of marked videos. Figures 4 and 5 show the 31st frame of Hall Monitor and Mobile, which represent videos with more smooth areas and more textured areas, respectively. Figures 4(a)–(c) and 5(a)–(c) are obtained with the original H.264/AVC encoder. Figures 4(d)–(f) and 5(d)–(f) are generated by the H.264/AVC encoder with Xu and Wang’s method,¹⁸ and Figures 4(g)–(i) and 5(g)–(i) are generated by the H.264/AVC encoder with our proposed method. According to Figures 4 and 5, we do not observe any distortion of the video frames caused by Xu and Wang’s method¹⁸ or by our proposed method when compared with the original video frames. In this case, our proposed method outperforms Xu and Wang’s method¹⁸ in terms of embedding capacity.

Figure 4.

The 31st frame of Hall Monitor (with more smooth areas): (a)–(c) original frames, (d)–(f) marked frames by Xu and Wang’s method,¹⁸ and (g)–(i) marked frames by the proposed two-dimensional reversible data hiding method. For (a), (d), and (g), $QP = 24$ . For (b), (e), and (h), $QP = 26$ . For (c), (f), and (i), $QP = 28$ .

Figure 5.

The 31st frame of Mobile (with more rich areas): (a)–(c) original frames, (d)–(f) marked frames by Xu and Wang’s method,¹⁸ and (g)–(i) marked frames by the proposed two-dimensional reversible data hiding method. For (a), (d), and (g), $QP = 24$ . For (b), (e), and (h), $QP = 26$ . For (c), (f), and (i), $QP = 28$ .

On the other hand, we use PSNR and SSIM to objectively evaluate the visual quality of marked videos; the results are shown in Tables 3 and 4. In Table 3, for a fixed QP, the “Original” PSNR value is greater than the PSNR values obtained with Xu and Wang’s method¹⁸ and with our proposed method. In addition, Xu and Wang’s method¹⁸ provides larger PSNR values than our proposed method. For instance, with QP = 24 on Mobile, the original encoder, Xu and Wang’s method,¹⁸ and our proposed method provide 37.83, 37.55, and 37.47 dB, respectively. The average PSNR values show the same trend. Note, however, that Tables 3–5 correspond to Table 1, that is, each method is measured at its own maximum embedding capacity, and our proposed method embeds more bits. Therefore, it is not fair to compare the two 2D RDH methods in this way. To better compare the two methods, we define PSNR variation (PSNRV) by

PSNRV = \frac{PSNR_{Ori} - PSNR_{2DRDH}}{EC} \times 100\%

(21)

where $PSNR_{Ori}$ and $PSNR_{2DRDH}$ are obtained by the H.264/AVC encoder without and with the 2D RDH method, respectively, and $EC$ denotes the embedding capacity. $PSNRV$ is used to evaluate the embedding distortion per embedded bit. Based on this, Figure 6 shows the results for $QP = 28$. As shown in Figure 6, for these 12 video sequences, the proposed method leads to embedding distortion very close to that of Xu and Wang’s method.¹⁸ In particular, although our proposed method causes greater embedding distortion on Suzie than Xu and Wang’s method,¹⁸ it causes less embedding distortion on the other 11 video sequences. In other words, our proposed method has less impact per embedded bit.
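As a small worked example of equation (21), the sketch below evaluates PSNRV for two sequences at QP = 28, using the PSNR values from Table 3 and the embedding capacities from Table 1. The function name `psnrv` and the variable names are hypothetical; the numbers are copied from the tables rather than measured again.

```python
def psnrv(psnr_ori: float, psnr_marked: float, ec_bits: int) -> float:
    """Equation (21): PSNR loss per embedded bit, scaled by 100."""
    return (psnr_ori - psnr_marked) / ec_bits * 100.0


# (original PSNR, marked PSNR, embedding capacity) at QP = 28.
foreman_xu = psnrv(36.93, 36.68, 4834)
foreman_proposed = psnrv(36.93, 36.58, 7461)
suzie_xu = psnrv(37.96, 37.79, 2348)
suzie_proposed = psnrv(37.96, 37.68, 3598)

if __name__ == "__main__":
    print(f"Foreman: Xu and Wang {foreman_xu:.5f}, proposed {foreman_proposed:.5f}")
    print(f"Suzie:   Xu and Wang {suzie_xu:.5f}, proposed {suzie_proposed:.5f}")
```

With these table values, the proposed method has the smaller PSNRV on Foreman and the larger one on Suzie, which agrees with the discussion of Figure 6.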

Figure 6.

Peak-signal-to-noise-ratio variation comparisons $(QP = 28)$ .

Table 3.

Comparisons of PSNR (dB) between Original, Xu and Wang’s method,¹⁸ and the proposed method.

Sequences	No. of frames	QP = 24, Original	QP = 24, Xu and Wang¹⁸	QP = 24, Proposed	QP = 26, Original	QP = 26, Xu and Wang¹⁸	QP = 26, Proposed	QP = 28, Original	QP = 28, Xu and Wang¹⁸	QP = 28, Proposed
Akiyo	300	42.29	41.59	41.25	40.82	40.27	39.97	39.52	38.91	38.61
Carphone	382	40.73	40.44	40.34	39.18	38.98	38.81	37.71	37.51	37.39
Claire	494	43.85	43.52	43.31	42.47	42.25	42.07	41.16	40.88	40.81
Coastguard	300	38.47	38.14	38.02	36.81	36.50	36.29	35.30	34.91	34.75
Container	300	39.83	39.45	39.27	38.44	38.03	37.84	37.18	36.86	36.70
Foreman	300	39.81	39.59	39.34	38.32	38.14	37.77	36.93	36.68	36.58
Hall Monitor	300	40.74	40.21	40.01	39.53	38.95	38.71	38.33	37.75	37.42
Miss America	150	43.30	43.00	42.83	42.15	41.89	41.70	41.02	40.69	40.52
Mobile	300	37.83	37.55	37.47	36.00	35.74	35.64	34.32	34.06	33.98
Mother–Daughter	300	41.35	40.93	40.76	39.86	39.57	39.28	38.48	38.21	38.08
News	300	41.11	40.56	40.37	39.52	38.95	38.72	38.06	37.49	37.22
Suzie	150	40.63	40.42	40.27	39.23	39.01	38.79	37.96	37.79	37.68
Average	–	40.83	40.45	40.27	39.36	39.02	38.80	38.00	37.65	37.43

PSNR: peak-signal-to-noise ratio; QP: quantization parameter.

Table 4.

Comparisons of SSIM between Original, Xu and Wang’s method,¹⁸ and the proposed method.

Sequences	No. of frames	QP = 24, Original	QP = 24, Xu and Wang¹⁸	QP = 24, Proposed	QP = 26, Original	QP = 26, Xu and Wang¹⁸	QP = 26, Proposed	QP = 28, Original	QP = 28, Xu and Wang¹⁸	QP = 28, Proposed
Akiyo	300	0.9857	0.9834	0.9812	0.9792	0.9791	0.9780	0.9762	0.9737	0.9723
Carphone	382	0.9840	0.9832	0.9840	0.9790	0.9780	0.9773	0.9726	0.9714	0.9706
Claire	494	0.9884	0.9877	0.9872	0.9857	0.9851	0.9847	0.9823	0.9817	0.9814
Coastguard	300	0.9722	0.9706	0.9700	0.9597	0.9580	0.9570	0.9444	0.9422	0.9409
Container	300	0.9634	0.9622	0.9614	0.9545	0.9529	0.9512	0.9465	0.9450	0.9440
Foreman	300	0.9798	0.9789	0.9797	0.9729	0.9718	0.9706	0.9647	0.9633	0.9627
Hall Monitor	300	0.9822	0.9806	0.9797	0.9795	0.9772	0.9759	0.9763	0.9738	0.9722
Miss America	150	0.9808	0.9799	0.9795	0.9770	0.9761	0.9755	0.9722	0.9711	0.9703
Mobile	300	0.9908	0.9902	0.9900	0.9866	0.9858	0.9855	0.9807	0.9795	0.9792
Mother–Daughter	300	0.9813	0.9799	0.9799	0.9745	0.9730	0.9719	0.9654	0.9638	0.9631
News	300	0.9848	0.9836	0.9829	0.9800	0.9786	0.9779	0.9743	0.9723	0.9711
Suzie	150	0.9739	0.9729	0.9724	0.9656	0.9644	0.9633	0.9558	0.9545	0.9538
Average	–	0.9806	0.9794	0.9789	0.9747	0.9733	0.9724	0.9676	0.9660	0.9651

SSIM: Structural similarity index; QP: quantization parameter.

Table 5.

Comparisons of bit-rate (kbps) between Original, Xu and Wang’s method,¹⁸ and the proposed method.

Sequences	No. of frames	QP = 24, Original	QP = 24, Xu and Wang¹⁸	QP = 24, Proposed	QP = 26, Original	QP = 26, Xu and Wang¹⁸	QP = 26, Proposed	QP = 28, Original	QP = 28, Xu and Wang¹⁸	QP = 28, Proposed
Akiyo	300	84.67	86.01	86.52	68.00	69.17	69.60	55.77	56.69	57.02
Carphone	382	316.56	317.59	317.98	243.27	244.18	244.55	188.02	188.95	189.33
Claire	494	76.90	77.43	77.63	60.56	60.95	61.10	48.40	48.64	48.95
Coastguard	300	423.93	425.22	425.71	316.93	318.21	318.70	236.23	237.47	237.97
Container	300	123.48	124.23	124.49	93.87	94.69	94.99	73.85	74.48	74.70
Foreman	300	260.78	261.75	262.14	200.47	201.50	201.91	158.10	159.13	159.55
Hall Monitor	300	144.26	145.48	145.93	109.34	110.55	111.02	87.08	88.27	88.73
Miss America	150	78.79	79.69	80.05	58.86	59.62	59.92	45.62	46.36	46.65
Mobile	300	725.76	726.63	726.85	542.45	543.34	543.60	408.49	409.47	409.78
Mother–Daughter	300	109.97	110.96	111.38	84.55	85.58	86.02	66.24	67.18	67.57
News	300	177.73	178.84	179.24	143.87	144.89	145.25	117.35	118.50	118.90
Suzie	150	171.67	172.86	173.33	125.88	126.98	127.42	94.78	95.78	96.20
Average	–	224.54	225.56	225.94	170.67	171.64	172.01	131.66	132.58	132.95

QP: quantization parameter.

In Table 4, for “Original,” Xu and Wang’s method,¹⁸ and our proposed method, the SSIM values decrease as the QP value increases. Although our proposed method gives the lowest SSIM value, 0.9409 on Coastguard when QP = 28, the differences among the three are small. Considering the embedding capacity (shown in Table 1), the performance of our proposed method is acceptable. Overall, our proposed method provides larger embedding capacity and leads to similar embedding distortion when compared with the related work.

Bit-rate variation

The bit-rate before and after embedding data into video sequences is often used to evaluate the coding efficiency of the H.264/AVC codec. In this subsection, Table 5 shows the bit-rate without and with the DH methods. Likewise, Table 5 also corresponds to Table 1.

According to Table 5, our proposed method leads to coding efficiency close to that of Xu and Wang’s method.¹⁸ Compared with “Original,” our proposed method has only a small impact on coding efficiency. In addition, to better compare our proposed method with Xu and Wang’s method,¹⁸ we define the bit-rate variation (BVE) by

BVE = \frac{B_{2DRDH} - B_{Ori}}{EC} \times 100\%

(22)

where $B_{2DRDH}$ and $B_{Ori}$ are the bit-rates generated by the H.264/AVC encoder with and without the RDH method, respectively. Based on this, we draw Figure 7, which corresponds to Figure 6 and Table 1. As seen from Figure 7, except on Claire, our proposed method leads to a smaller bit-rate increase per embedded information bit than Xu and Wang’s method.¹⁸ That is, our proposed method outperforms Xu and Wang’s method¹⁸ in terms of the impact per embedded information bit.
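Equation (22) can be checked in the same way as PSNRV. The sketch below uses the bit-rates from Table 5 and the embedding capacities from Table 1 at QP = 28 for Foreman and for Claire, the exception mentioned above; `bve` and the variable names are hypothetical, and the values are copied from the tables.

```python
def bve(rate_marked_kbps: float, rate_ori_kbps: float, ec_bits: int) -> float:
    """Equation (22): bit-rate increase per embedded bit, scaled by 100."""
    return (rate_marked_kbps - rate_ori_kbps) / ec_bits * 100.0


# (marked bit-rate, original bit-rate, embedding capacity) at QP = 28.
foreman_xu = bve(159.13, 158.10, 4834)
foreman_proposed = bve(159.55, 158.10, 7461)
claire_xu = bve(48.64, 48.40, 2412)
claire_proposed = bve(48.95, 48.40, 4685)

if __name__ == "__main__":
    print(f"Foreman: Xu and Wang {foreman_xu:.5f}, proposed {foreman_proposed:.5f}")
    print(f"Claire:  Xu and Wang {claire_xu:.5f}, proposed {claire_proposed:.5f}")
```

For these two rows, the proposed method has the lower BVE on Foreman and the higher BVE on Claire, in line with the reading of Figure 7.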

Figure 7.

BVE comparisons $(QP = 28)$ .

Conclusion

This article presents a novel 2D RDH method with high embedding capacity, based on the H.264/AVC compression standard. In the proposed 2D RDH method, almost all points (coefficient-pairs) that contain a coefficient with a value of 0 are used for data embedding. In addition, only one coefficient in each of the other points is changed to vacate room for reversibility. Therefore, the proposed method provides higher embedding capacity than the related method. When the variations of PSNR and SSIM caused by each embedded information bit are compared, the proposed method keeps better visual quality. Moreover, the increase in bit-rate per embedded bit caused by the proposed method is smaller.

Footnotes

Handling Editor: Yulei Wu

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61972269, the Fundamental Research Funds for the Central Universities under Grant No. YJ201881, and Doctoral Innovation Fund Program of Southwest Jiaotong University under Grant No. DCX201824.

ORCID iD

Yi Chen

References

1. Asikuzzaman, Pickering MR. An overview of digital video watermarking. IEEE T Circ Syst Vid 2018; 28(9): 2131–2153.

2. Tew, Wong. An overview of information hiding in H.264/AVC compressed video. IEEE T Circ Syst Vid 2014; 24(2): 305–319.

3. Shi, Ansari, et al. Reversible data hiding. IEEE T Circ Syst Vid 2006; 16(3): 354–362.

4. Tian. Reversible data embedding using a difference expansion. IEEE T Circ Syst Vid 2003; 13(8): 890–896.

5. Shi, Zhang, et al. Reversible data hiding: advances in the past two decades. IEEE Access 2016; 4: 3210–3237.

6. Wu, Shi, Wang, et al. Separable reversible data hiding for encrypted palette images with color partitioning and flipping verification. IEEE T Circ Syst Vid 2017; 27(8): 1620–1631.

7. Yang, Kim, YH. Adaptive real-time reversible data hiding for JPEG images. J Real-Time Image Pr 2018; 14(1): 147–157.

8. Chen, Wang HX. An improved reversible data hiding scheme by changing modification direction of partial coefficients in JPEG images (arXiv preprint arXiv:1804.06645v3), 2018, https://arxiv.org/abs/1804.06645

9. Wiegand, Sullivan, Bjontegaard, et al. Overview of the H.264/AVC video coding standard. IEEE T Circ Syst Vid 2003; 13(7): 560–576.

10. Sullivan, Ohm, et al. Overview of the high efficiency video coding (HEVC) standard. IEEE T Circ Syst Vid 2012; 22(12): 1649–1668.

11. Chen, Wang, et al. A data hiding scheme with high quality for H.264/AVC video streams. In: Proceedings of the 4th international conference on cloud computing and security, Haikou, China, 8–10 June 2018, pp.99–110. Cham: Springer.

12. Fallahpour, Shirmohammadi, Ghanbari. A high capacity data hiding algorithm for H.264/AVC. Secur Commun Netw 2015; 8(16): 2947–2955.

13. Liu, Zhao, et al. A new data hiding method for H.265/HEVC video streams without intra-frame distortion drift. Multimed Tools Appl 2019; 78(6): 6459–6486.

14. Wang, Zhu. Tunable data hiding in partially encrypted H.264/AVC videos. J Vis Commun Image R 2017; 45: 34–45.

15. Kumar, Singh. An improved data-hiding approach using skin-tone detection for video steganography. Multimed Tools Appl 2018; 77(18): 24234–24268.

16. et al. A data hiding algorithm for H.264/AVC video streams without intra-frame distortion drift. IEEE T Circ Syst Vid 2010; 20(10): 1320–1330.

17. Chen, Wang, et al. Reversible video data hiding using zero QDCT coefficient-pairs. Multimed Tools Appl 2019; 78(16): 23097–23115.

18. Xu, Wang. Two-dimensional reversible data hiding-based approach for intra-frame error concealment in H.264/AVC. Signal Process: Image 2016; 47: 369–379.

19. Yao, Zhang. Inter-frame distortion drift analysis for reversible data hiding in encrypted H.264/AVC video bitstreams. Signal Process 2016; 128: 531–545.

20. Long, Peng. Separable reversible data hiding and encryption for HEVC video. J Real-Time Image Pr 2018; 14(1): 171–182.

21. Niu, Yang, Zhang. A novel video reversible data hiding algorithm using motion vector for H.264/AVC. Tsinghua Sci Technol 2017; 22(5): 489–498.

22. Zhao, Feng. A novel two-dimensional histogram modification for reversible data embedding into stereo H.264 video. Multimed Tools Appl 2016; 75(10): 5959–5980.

23. Yang, Xiang, et al. Fully reversible privacy region protection for cloud video surveillance. IEEE T Cloud Comp 2017; 5(2): 510–522.

24. Chen, Wang, et al. An adaptive data hiding algorithm with low bitrate growth for H.264/AVC video stream. Multimed Tools Appl 2018; 77(15): 20157–20175.

25. Chung, Huang, Chang, et al. Reversible data hiding-based approach for intra-frame error concealment in H.264/AVC. IEEE T Circ Syst Vid 2010; 20(11): 1643–1647.

26. H.264/AVC reference software JM12.0 (online referencing), http://iphome.hhi.de/suehring/tml/download/old_jm/ (2003, accessed 12 October 2018).

27. Test video sequences (online referencing), http://trace.eas.asu.edu/yuv/ (accessed 12 October 2018).

28. Lie, Klaue. Evalvid-RA: trace driven simulation of rate adaptive MPEG-4 VBR video. Multimedia Syst 2008; 14(1): 33–50.

A novel two-dimensional reversible data hiding method with high embedding capacity in H.264/advanced video coding
