Sage Journals: Discover world-class research

Abstract

Video data for the Internet traffic is increasing, and video data transmission is important for consideration of real-time process in the Internet of Things (IoT). Thus, in the IoT environment, video applications will be valuable approach in networks of smart sensor devices. High Efficiency Video Coding (HEVC) has been developed by the Joint Collaborative Team on Video Coding (JCT-VC) as a new generation video coding standard. Recently, HEVC includes range extensions (RExt), scalable coding extensions, and multiview extensions. HEVC RExt provides high resolution video with a high bit-depth and an abundance of color formats. In this paper, a fast intraprediction unit decision method is proposed to reduce the computational complexity of the HEVC RExt encoder. To design intramode decision algorithm, Local Binary Pattern (LBP) of the current prediction unit is used as texture feature. Experimental results show that the encoding complexity can be reduced by up to 12.35% on average in the AI-Main profile configuration with only a small bit-rate increment and a PSNR decrement, compared with HEVC test model (HM) 12.0-RExt4.0 reference software.

1. Introduction

The Internet of Things (IoT) is a sensing network that connects any object with the Internet using many kinds of sensor equipment. Along with the rapid development of IoT applications, new generations of mobile broadband networks, cloud computing, and video coding technology for video streaming in real-time all represent an interactive and realistic development direction for next generation multimedia application networks. It will play a valuable role in industrial, medical, and television fields [1–3].

MPEG has already started to investigate standardization activities to define network protocols for the Internet of Things (e.g., how to connect things). The variety and heterogeneity of “Things” make it difficult to standardize descriptions, data formats, and APIs in a global manner; however, when the environment is well established, this can be done. Therefore, MPEG is exploring representations of multimedia things as part of complex distributed systems implying interaction between things and between humans and things. The multimedia data type elements are corresponding to descriptions of devices and messages for “talking to” and “adapting to” either devices or services in the Internet of Things.

Recently, there has been a change in the video content service in video communication technologies from lower resolution video to an ultrahigh definition (UHD) video format. Mobile device, storage, and network technologies are striving to keep pace with rapid changes in the market. Modern data compression techniques can store or transmit based on allocation of significant amounts of data while UHD video content has a large data transfer rate. Many applications of existing video compression technology are used to broadcast high definition (HD) TV signals over satellite, cable, and terrestrial transmission systems, video content acquisition and editing systems, camcorders, security applications, Internet and mobile network video, Blu-ray discs, and real-time conversational applications, such as video chat, video conferencing, and telepresence systems for lower dimensional video sequences.

However, the growing popularity of HD video and an increasing diversity of services and emergence of beyond-HD formats (4k × 2k or 8k × 4k resolution, called UHD) requires stronger video coding with efficiency that is superior to previous video compression standards. Moreover, the traffic caused by video applications targeting mobile devices and tablet-PCs and the transmission requirement for video on demand services are imposing severe pressures on existing networks. An increased desire for higher quality and better resolution is also driving mobile applications.

The H.264/MPEG-4 AVC [4] is still widely used for most of many applications, both in real-time and non-real-time. However, this standardization suffers a bit-rate increment and significant computational complexity for beyond-HD resolution applications.

A next-generation video coding scheme, called High Efficiency Video Coding (HEVC), was developed by the Joint Collaborative Team on Video Coding (JCT-VC) group of ISO/IEC MPEG and ITU-T VCEG [5]. The HEVC version 1 has the primary goal of achieving a 50% high compression rate than the H.264/MPEG-4 AVC, especially with a primary focus on 8-bit/10-bit YUV 4:2:0 video. Although this standardization supports high compression rate using improved and modified coding tools, the HEVC standard still requires a large amount of time for compression.

HEVC is developing extensions to support several additional application scenarios, including professional uses with enhanced precision and color format support, scalable video coding, and 3D/stereo/multiview coding. Among these extensions, the HEVC range extension (RExt) provides a high bit-depth (larger than 10 bits) and different color formats in high resolution sequences.

HEVC RExt has the same structure as HEVC, but additional coding tool options have been added to support 10 bits per sample and different color formats. The 4:2:2 and 4:4 :4 enhanced chroma sampling structures and sample bit-depths beyond 10 bits per sample are supported [6].

UHD resolution is expected to emerge in the near future and will be supported by next generation displays. This kind of data rate increase will put additional pressure on all types of networks, and data rates for video content are increasing faster than network infrastructure capacities for economical delivery. HEVC and its extensions provide good performance based on a large computational complexity because of heavy and complicated coding tools in order to improve the coding efficiency, support for in-deep color formats, and a high bit-depth.

To reduce the computational complexity in the HEVC RExt encoder, a fast intramode decision algorithm is proposed based on block texture information. This paper is organized as follows: in Section 2, the HEVC structure and related works are introduced. Local Binary Patterns (LBPs) and fast intramode decision method based on LBP are described for the proposed algorithm in Section 3. Section 4 presents the coding performance of the algorithm, and Section 5 presents concluding remarks.

2. HEVC Encoding Structure and Related Works

The HEVC standard has adopted a highly flexible and efficient block partitioning based on introduction of the coding tree unit (CTU). There are three structures of three block units as a coding unit (CU), prediction unit (PU), and transform unit (TU). The CU represents a basic block type like macroblock of the H.264/AVC. The PU is used for the coding mode decision, including motion estimation and rate-distortion (RD) optimization (RDO). Transform and entropy coding are performed based on the TU. Initially, a frame is divided into largest CU size which is called a coding tree unit (CTU). The CTU consist of a coding tree block (CTB), on luma block, and two chroma blocks. Each CTB is an assemblage of square shaped coding blocks (CB) that are divided based on a quad tree structure.

The structure of each CB is square and the size can be 8, 16, 32, or 64. This kind of change is more effective and beneficial, unlike the conventional previous H.264 method that used a 16 × 16 macroblock (MB). A larger and more flexible block structure is effective for encoding high resolution video.

The CTU size is $2 N \times 2 N$ , where N is 32, 16, or 8. A CTU can contain a single CU with a $2 N \times 2 N$ dimension, or it can be split into four smaller CUs of equal size ( $N \times N$ ). Each CTU is selected and recursively split into four CUs based on a quadtree structure.

Each CB is predicted based on an intra- or interprediction process that is performed in the PU. The intraprediction process uses two modes (PART_ $2 N \times 2 N$ and PART_ $N \times N$ ) based on the encoded size of the PU. Figures 1(a) and 1(b) show different intraprediction direction and mode between HEVC and H.264/AVC. In order to improve the coding efficiency of video coding, HEVC uses 35 intraprediction modes in the PU from $4 \times 4$ to $64 \times 64$ . The H.264/AVC standard only used 4 and 9 intraprediction modes for block sizes of $4 \times 4$ and $16 \times 16$ macroblocks. The increased numbers of intraprediction modes increase the computational complexity of HEVC, compared with H.264.

Figure 1

Different intraprediction direction and mode for (a) HEVC and (b) H.264/AVC.

In order to reduce the time required for the intraprediction coding, Yoo and Suh [8] proposed an early termination algorithm for inter- and intra-PUs that checked a coded block flag (CBF) value and the RD cost of the inter-PU. If conditions were satisfied based on these two values, each PU was skipped for an inter-PU and intra-PU. A two-stage prediction unit size decision method has also been presented [9] in which texture complexities are analyzed according to the video content using variance in order to filter out unnecessary PUs. Next, for intraprediction coding, skipped small PU sizes are selected based on PU sizes of encoded upper-left, upper, and left blocks. Some fast algorithms have been reported that use dominant edge information [10] with a subset of tree level PUs [11].

Cho and Kim [12] proposed fast CU splitting and pruning methods based on Bayes decision rules in order to reduce the computational complexity in the HEVC intraprediction process. Fast intraprediction approaches based on gradients have been used [7, 13, 14]. Wang and Siu [15] reported an adaptive intramode skipping algorithm and signaling processes using statistical properties of reference samples. An intramode decision strategy arranging candidate modes into different groups has been presented using a notation of a circle [16]. In HEVC RExt, an advanced color Table and Index Map (cTIM) [17], intrablock copy (IntraBC) [18], and angular prediction with a weight function and a modification filter based on a blending filter for DC mode [19] have been proposed.

3. Proposed Work

3.1. Local Binary Patterns (LBPs)

Intraprediction process has usually been analyzed based on use of image texture information. Local Binary Pattern (LBP) features were originally designed for texture description [20]. This LBP operator transforms an image into an array or image of integer labels describing the small-scale appearance of the image. These labels or their statistics, most commonly in the form of a histogram, are then used for further image analysis. This approach has advantages, such as gray-scale invariance and normalization. The LBP represents texture information without any time consumption because the LBP operator is simply calculated.

The LBP operator is based on the assumption that texture has two locally complementary aspects for a pattern and a pattern strength. In H.264/AVC, the LBP is used to extract moving objects with motion vectors and to use edge information in motion estimation process [21–23]. The pixels in a particular block area are thresholded based on a center pixel value, multiplied by powers of two and then summed to obtain a label for the center pixel. If the neighborhoods consist of 8 pixels, a total of $2^{8} = 256$ different labels can be obtained depending on the relative gray values of the center and the pixels in the neighborhood.

Circular symmetrical neighbor sets for different $(P, R)$ are illustrated in Figure 2. P is the node for the number of neighboring pixels and R is the radius of circle. By combining different values of P and R, the LBP is composed of variety of sets. Let $g_{p}$ denote the gray value of the sampled pixel in an evenly spaced circular neighborhood of P sampling points of radius R around point $(x, y)$ . $I (x, y)$ and $g_{c}$ denote the image of a frame and the gray level of center position. $x_{p}, y_{p}$ is given by ( $- R \sin (2 π / P), R \cos (2 π / P)$ ):

\begin{matrix} g_{p} = I (x_{p}, y_{p}), p = 0, \dots, P - 1 . \end{matrix}

(1)

Figure 2

Different set for Local Binary Patterns. P is the number of neighboring pixels and R is circle of radius.

For analysis of local texture patterns, the joint distribution of differences with spatial characteristics can be modeled as

\begin{matrix} T \approx t (g_{0} - g_{c}, g_{1} - g_{c}, \dots, g_{p - 1} - g_{c}) . \end{matrix}

(2)

Analyzed LBP is a discriminative pattern for different patterns between neighborhood pixels and center pixel. The LBP code can represent a bright/dark spot, flat areas, edges, edge ends, and curves if differences are zero in a constant region.

Equation (3) represents the binary bit value which is calculated at the ith neighbor. Let $B (i)$ denote the binary bit value of the neighboring pixel intensity I. $I (c)$ presents the pixel location of the center position at $(0,0)$ . The $I_{(x, y)}$ coordinates of circular neighbor sets $I (i)$ are given by ( $- R \sin (2 π / P)$ , $R \cos (2 π / P)$ ).

The LBP operator when P is 8 and R is 1 is shown in Figure 2(b). Binary bits can be transformed into integer values as pattern number using (4) when the binary bit stream consists of a combination of each bit calculated using (3) as the thresholding function. For example, Figure 3 illustrates the ${L B P}_{(8,1)}$ operator:

\begin{matrix} B (i) = \{\begin{cases} 1, & if I (i) \geq I (c), \\ 0, & otherwise, \end{cases} \end{matrix}

(3)

\begin{matrix} {L B P}_{(P, R)} = \sum_{i = 0}^{P - 1} B (i) * 2^{i} . \end{matrix}

(4)

Figure 3

${L B P}_{(8,1)}$ operator.

Many texture analysis applications are required for invariant or robust rotations of the input image. ${L B P}_{P, R}$ patterns are obtained by circularly sampling around the center pixel. Most of the Local Binary Patterns in natural images are uniform. Use of uniform patterns is the statistical robustness. Local primitives detected by the LBP include spots, flat areas, edges, edge ends, and curves. Figure 4 illustrates examples with the $L B P_{8, R}$ operator which are represented as gray circles and zeros are white. The LBP distribution can be successfully used in recognizing a wide variety of different textures, to which statistical and structural methods have normally been applied separately.

Figure 4

Different textures detected using LBP_(8,R).

The texture of the current PU can be identified as discriminative textures using the LBP. In the HEVC encoder, interpolation is used for application of the LBP model to coordinate location of neighboring node sets appropriately using (5). The designated center position and the neighboring location from the center in the LBP based on the PU size are shown in Figure 5. The number of neighboring pixels as P is 8 and R is $(s / 2) - 1$ . The value of s is the PU size of $8 \times 8$ , $16 \times 16$ , $32 \times 32$ , and $64 \times 64$ , except $4 \times 4$ . Consider

\begin{matrix} {\hat{I}}_{x, y} = \frac{1}{N M} \sum_{i = 1}^{N = 2} \sum_{j = 1}^{M = 2} I_{(x - i - 1, y - j - 1)} . \end{matrix}

(5)

Figure 5

$L B P_{(8, (s / 2) - 1)}$ in the HEVC with interpolated center and its neighboring positions. s is the PU size.

Therefore, ${L B P}_{(P, R)}$ is calculated using $I (i) = {\hat{I}}_{x, y}$ in (5) as neighborhood pixels for application and analysis of local texture information in the HEVC block structure. For analysis of binary textural information and different intraprediction modes, a probability distribution is first analyzed for binary patterns that occurred in text sequences of natural video content. Next, relationship with bit patterns that exhibit higher frequencies of occurrence than other patterns and encoded intramodes is analyzed. Probability distributions of patterns and modes in four sequences with 20 frames using texture information for LBP are shown in Figures 6 and 7.

Figure 6

Probability distribution about pattern number according to various sequences.

Figure 7

Probability distribution of intramode which is best for most probable LBP.

Sequences have a 4:2:2 color format, 10 bits, and 1920 × 1080 resolutions in HM version 12.0-RExt4.0. Similar distribution graphs for all test sequences are shown in Figure 6, indicating that the texture feature appears in different sequences. Furthermore, before the encoding stage, sets of most probable patterns with high distribution rates in the LBP are already prepared using a look-up table.

The mode distribution for different most probable patterns that are similar to other most probable patterns for Intra_Planar, Intra_DC, and vertical mode (0, 1, and 26) is shown in Figure 7. Therefore, in order to use only DC, Planar, and 26 modes, the most probable patterns are used to quickly make a mode decision using texture information.

3.2. The Overall Procedure of the Proposed Fast Scheme

The complexity of HEVC is significantly increased than H.264/AVC by improvement of encoding efficiency. Consequently, HEVC requires improved rapid coding process as well as guarantee of efficient compression. In HEVC, there are fast encoding tools in each prediction, transform, and filtering processes.

To support high speed encoding, the intramode prediction process in the original HM-12.0-RExt4.0 is performed using a rough mode decision (RMD) and most probable mode (MPM). RMD and MPM are contributed to speed-up intraprediction process. Intraprediction selects the N best candidate modes based on RMD where all modes are tested based on the minimum absolute sum of Hadamard transformed coefficients of the residual signal (HSAD) and the number of mode bits in the RMD. The number of N best RMD candidates is 8 for $4 \times 4$ and $8 \times 8$ and 3 for $16 \times 16$ , $32 \times 32$ , and $64 \times 64$ . The RD optimization is only used for N + MPM candidates. However, the computation load on the encoder is still high. The overall procedure of the proposed fast mode decision scheme is illustrated in Figure 8 and follows the following process.

Figure 8

The overall procedure of the proposed algorithm.

Step 1.

Initially, the LBP is calculated for the current encoded PU.

Step 2.

If the LBP is included in the most probable patterns, which are already defined in the look-up table, go to Step 3. Otherwise, go to Step 4.

Step 3.

Prediction is only performed three times for a set of 0, 1, and 26 candidate modes. Next, go to Step 5.

Step 4.

Prediction is performed for the number of modes based on the PU size. Go to Step 5.

Step 5.

The best mode is selected with the minimum RD cost.

In the proposed scheme, the MPM condition is used based on the most probable LBP in the look-up table. If the local texture pattern of the LBP encoded block satisfies the condition, the intraprediction process is only performed three times for the three modes Intra_Planar 0, Intra_DC 1, and vertical mode 26.

4. Experimental Results

The proposed fast scheme was implemented on HM-12.0-RExt4.0 (HEVC RExt reference software). Test environments were all intra using AI-Main. For wireless video communication, in the past, IPPP structure which one I frame followed by all P frames is usually employed. Recently, wireless video communication is required to support high resolution video service due to rapid advance in network technology. To provide better quality than IPPP structure, all intrastructure should be used necessarily. Standard sequences with 50 frames were used for three to four sequences with different quantization parameter (QP) range (12, 17, 22, and 27) defined by superhigh tier (SHT) [24]. Test sequences were classified for color formats of RGB4:4:4, YCbCr4:4:4, and YCbCr4:2:2. Each class had a 1920 × 1080 resolution. Details of the encoding environment can be seen in JCTVC-N1006 [24].

To evaluate performance, measurements of $Δ B i t$ , $Δ {P S N R}_{Y}$ , and $Δ T i m e$ were used as

\begin{matrix} Δ B i t = B i t_{p r o p o s e d} - B i t_{o r i g i n a l}, \end{matrix}

(6)

\begin{matrix} {Δ P S N R}_{Y} = {P S N R}_{Y_{p r o p o s e d}} - {P S N R}_{Y_{o r i g i n a l}}, \end{matrix}

(7)

\begin{matrix} Δ T i m e = \frac{T i m e_{P r o p o s e d} - T i m e_{A n c h o r}}{T i m e_{A n c h o r}} \times 100 . \end{matrix}

(8)

$Δ T i m e$ is a complexity comparison factor used to indicate the amount of total encoding time saving (8). From (8), ${T i m e}_{(x)}$ indicates the total consumed time of the method x for encoding.

The performance results between the proposed algorithm and Jiang et al. [7] algorithm on the HM-12.0-RExt4.0 software are shown in Tables 1, 2, and 3. Each performance in the tables is based on different color formats of RGB4:4:4, YCbCr4:4:4, and YCbCr4:2:2. Bjøntegaard delta bitrates (BDBR) [25] are shown in Tables 1, 2, and 3 as a performance measurement. The time reduction performance of the proposed method in RGB4:4:4 was almost 11.29%, on average with 0.28%, 0.14 (dB), and 1.13% losses in bit-rate, Y-PSNR, and BDBR, respectively. In RGB4:4:4, Jiang's algorithm achieved 7% of complexity reduction, on average with 0.48% in bitrate increment, 0.12 (dB) loss of Y-PSNR, and 1.25% BDBR.

Table 1

The performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in RGB4:4:4 with superhigh tier (SHT) of QP range.

	Sequence	QP	Proposed				Jiang [7]
	Sequence	QP	ΔBit (%)	ΔPSNR $_{Y}$ (dB)	ΔTime (%)	BD rate (%)	ΔBit (%)	ΔPSNR $_{Y}$ (dB)	ΔTime (%)	BD rate (%)
RGB4:4:4	Kimono	12	−0.49	−0.12	−11.90	0.6	−0.43%	−0.10	−8.95	0.5
		17	−0.56	−0.09	−11.16		−0.43%	−0.07	−8.31
		22	0.25	−0.01	−10.44		0.16%	−0.01	−7.40
		27	0.90	−0.01	−10.61		0.46%	−0.01	−7.07
		avg	0.02	−0.06	−11.03		−0.06%	−0.05	−7.93
	DucksAndLegs	12	0.49	−0.49	−11.92	1.3	0.42%	−0.45	−9.40	1.2
		17	0.01	−0.38	−11.47		0.00%	−0.34	−8.83
		22	0.08	−0.13	−10.73		0.04%	−0.13	−7.98
		27	0.39	−0.04	−10.92		0.24%	−0.06	−7.86
		avg	0.24	−0.26	−11.26		0.18%	−0.25	−8.52
	OldTownCross	12	0.19	−0.15	−11.71	1.1	0.73%	−0.11	−8.25	2.1
		17	0.21	−0.13	−11.73		1.41%	−0.10	−5.51
		22	0.47	−0.10	−11.27		1.86%	−0.07	−4.95
		27	0.62	−0.06	−11.23		2.50%	−0.05	−4.52
		avg	0.37	−0.11	−11.48		1.62%	−0.08	−5.81
	Park Scene	12	0.02	−0.18	−11.88	1.6	−0.07%	−0.15	−7.72	1.2
		17	0.07	−0.15	−11.69		−0.09%	−0.14	−5.66
		22	0.63	−0.08	−11.12		0.24%	−0.07	−5.16
		27	1.25	−0.07	−10.88		0.61%	−0.06	−4.43
		avg	0.49	−0.12	−11.39		0.17%	−0.10	−5.74
	Average	0.28	−0.14	−6.29	−11.29	1.13	0.48%	−0.12	−7.00	1.25

Table 2

The performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in YCbCr4:4:4 with superhigh tier (SHT) of QP range.

	Sequence	QP	Proposed				Jiang [7]
	Sequence	QP	ΔBit (%)	ΔPSNR $_{Y}$ (dB)	ΔTime (%)	BD rate (%)	ΔBit (%)	ΔPSNR $_{Y}$ (dB)	ΔTime (%)	BD rate (%)
YCbCr4:4:4	Kimono	12	−0.70	−0.11	−11.89	0.8	−0.57	−0.09	−9.18	0.7
		17	−0.36	−0.05	−11.17		−0.16	−0.03	−8.56
		22	0.61	0.00	−10.84		0.35	−0.01	−8.15
		27	2.20	−0.01	−11.46		1.52	−0.02	−8.38
		avg	0.43	−0.04	−11.34		0.28	−0.04	−8.57
	BirdsInCage	12	0.12	−0.08	−12.08	1.8	0.01	−0.08	−8.73	1.5
		17	−0.02	−0.06	−11.48		−0.09	−0.05	−8.47
		22	0.46	−0.06	−11.49		0.32	−0.04	−8.17
		27	2.18	−0.02	−12.22		1.60	−0.02	−8.50
		avg	0.69	−0.05	−11.82		0.46	−0.05	−8.47
	Crowd Run	12	0.49	−0.18	−12.07	3.0	0.22	−0.16	−8.38	2.1
		17	0.82	−0.15	−12.23		0.38	−0.14	−8.41
		22	1.87	−0.10	−12.25		0.97	−0.09	−8.43
		27	3.12	−0.10	−12.24		1.75	−0.09	−7.87
		avg	1.58	−0.14	−12.20		0.83	−0.12	−8.27
	Average	0.28	0.90	−0.08	−11.78	1.87	0.52	−0.07	−8.43	1.44

Table 3

The performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in YCbCr4:2:2 with superhigh tier (SHT) of QP range.

	Sequence	QP	Proposed				Jiang [7]
	Sequence	QP	ΔBit (%)	ΔPSNR $_{Y}$ (dB)	ΔTime (%)	BD rate (%)	ΔBit (%)	ΔPSNR $_{Y}$ (dB)	ΔTime (%)	BD rate (%)
YCbCr4:2:2	Kimono	12	−1.23	−0.17	−14.00	1.1	−1.15	−0.16	−12.07	0.9
		17	−0.77	−0.06	−13.16		−0.64	−0.05	−11.09
		22	1.30	0.00	−12.50		0.76	−0.01	−10.53
		27	2.51	−0.01	−13.12		1.67	−0.02	−10.67
		avg	0.45	−0.06	−13.19		0.16	−0.06	−11.09
	DucksAndLegs	12	0.16	−0.25	−13.58	2.7	−0.02	−0.21	−12.04	2.0
		17	0.26	−0.24	−13.71		−0.04	−0.20	−12.17
		22	0.89	−0.16	−13.56		0.37	−0.13	−12.18
		27	2.11	−0.11	−13.15		1.23	−0.10	−11.73
		avg	0.85	−0.19	−13.50		0.38	−0.16	−12.03
	OldTownCross	12	2.17	−0.25	−13.50	5.8	1.42	−0.19	−12.19	4.2
		17	2.98	−0.26	−13.30		1.73	−0.21	−11.82
		22	3.22	−0.24	−13.40		2.37	−0.18	−11.69
		27	5.69	−0.16	−13.10		4.69	−0.11	−11.26
		avg	3.51	−0.22	−13.32		2.55	−0.17	−11.74
	Park Scene	12	−0.25	−0.23	−16.14	2.0	−0.26	−0.21	−11.26	1.7
		17	−0.23	−0.21	−16.07		−0.32	−0.20	−11.13
		22	0.02	−0.13	−15.82		−0.16	−0.12	−11.04
		27	0.95	−0.08	−15.37		0.52	−0.07	−10.41
		avg	0.12	−0.16	−15.85		−0.06	−0.15	−10.96
	Average	0.28	1.24	−0.16	−13.97	2.90	0.76	−0.13	−11.45	2.17

For sequences with YCbCr4:4:4 (Table 2), the proposed algorithm achieved a BDBR loss rate of 1.87% with a bit increment of 0.9% and a 0.08 (dB) PSNR decrement, on average. An 11.78% speed-up gain was achieved in sequences with YCbCr4:4:4. Performance result of [7] in YCbCr4:4:4 achieved 0.52%, 0.07 (dB), and 1.44% losses in bit-rate, Y-PSNR, and BDBR, respectively. The time reduction of Jiang's method gained 8.43% on average value. Losses and gains of the proposed HEVC encoding system with YCbCr4:2:2 are shown in Table 3. Simulated sequences with YCbCr4:2:2 achieved a 13.97% time-saving factor and a 2.9% BDBR with a 1.24% bit-rate loss and a 0.16 (dB) PSNR loss, on average value. Jiang's algorithm achieved 11.45% on average of complexity improvement and a 2.17% BDBR with a 0.76% in bit-rate and 0.13 (dB) in Y-PSNR.

The proposed algorithm achieved a speed-up gain up to 16.14% with a smaller bit increment in the Seeking sequence with QP = 12 and YCbCr4:2:2. In DucksAndLegs sequence with QP = 22 and YCbCr4:2:2, Jiang's algorithm achieved a speed-up factor up to 12.18% with a smaller bit-rate loss. Bit-rate performance was increased in nonnatural sequences and videos with many moving objects in the EBUGraphics sequence. Using the proposed approach, an average speed-up gain of over 12.35% was obtained, compared to the original intraprediction process with a negligible bit-rate increment. Gradients based fast intramode decision algorithm of Jiang et al. [7] achieved encoding speed-up gain of 8.98% on average. Although Jiang's algorithm can give better performance of BDBR than the proposed method, the proposed one can reduce more encoding time without the performance degradation from gradients based fast intramode decision scheme [7].

RD performance for test sequences classified by color and line-style with a QP range of SHT in AI-Main is shown in Figure 9. The original standard and proposed method performance values exhibited similar peaks in a graph (Figures 9(a) and 9(b)). A Seeking sequence with YCbCr4:2:2 (Figure 9(c)) showed a negligible loss of bit-rate and quality.

Figure 9

Rate-distortion (RD) curves for (a) Kimono (RGB4:4:4), (b) BirdsInCage (YCbCr4:4:4), and (c) Seeking (YCbCr4:2:2) sequences in AI-SHT.

A large loss of image quality was observed using the original HM encoder (up to 0.22 dB, on average, for the EBUGraphics sequence). However, the proposed method provided high quality maintenance of 40 (dB) when the QP value was less than or equal to 22, and the video quality was greater than or equal to 50 (dB) when the QP value was set to 12, except for the OldTownCross sequence. Furthermore, the proposed fast intramode decision scheme supported rapid compression, even with a small Y-PSNR loss. The proposed algorithm was efficient for use in a real-time encoding system without significant degradation of encoding performance for large video resolution, using a deep color format, and high bit-rate sequences.

5. Conclusions

A fast intramode decision scheme is proposed for high resolution video with a high bit-depth and a rich color format. The proposed algorithm achieved a 12.35% time saving and a BDBR value of 1.96%, on average, over the original HM-12.0-RExt4.0 software with a Y-PSNR loss of 0.12 (dB) and a 0.81% bit increment. The proposed algorithm should be considered for the Internet of Things (IoT) environment in real-time and can be useful for real-time HEVC video encoding systems for maintenance of video quality.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by ICT R&D program of MSIP/IITP. [B0101-15-1280, Development of Cloud Computing Based Realistic Media Production Technology].

References

Andrepoulos

Xiao

van der Schaar

Non-stationary resource allocation policies for delay-constrained video streaming: application to video over internet-of-things-enabled networks

IEEE Journal on Selected Areas in Communications 2014 32 4 782 794

10.1109/jsac.2014.140410

2-s2.0-84897129959

Pereira

E. G.

Video streaming considerations for internet of things

Proceedings of the 2nd International Conference on Future Internet of Things and Cloud (FiCloud '14)

August 2014

48 52

10.1109/ficloud.2014.18

2-s2.0-84922554270

Liu

Yan

Study on multi-view video based on IOT and its application in intelligent security system

Proceedings of the International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC '13)

December 2013

1437 1440

2-s2.0-84918576538

Wiegand

Sullivan

G. J.

Bjøntegaard

Luthra

Overview of the H.264/AVC video coding standard

IEEE Transactions on Circuits and Systems for Video Technology 2003 13 7 560 576

10.1109/tcsvt.2003.815165

2-s2.0-0042631515

Sullivan

G. J.

Ohm

J.-R.

Han

W.-J.

Wiegand

Overview of the high efficiency video coding (HEVC) standard

IEEE Transactions on Circuits and Systems for Video Technology 2012 22 12 1649 1668

10.1109/tcsvt.2012.2221191

2-s2.0-84872253926

Sullivan

G. J.

Boyce

J. M.

Chen

Ohm

J. R.

Segall

C. A.

Vetro

Standardized extensions of high efficiency video coding (HEVC)

IEEE Journal of Selected Topics in Signal Processing 2013 6 6

Jiang

Chen

Gradient based fast mode decision algorithm for intra prediction in HEVC

Proceedings of the International Conference on Consumer Electronics, Communications and Networks (CECNet '12)

April 2012

Yoo

H.-M.

Suh

J.-W.

Fast coding unit decision algorithm based on inter and intra prediction unit termination for HEVC

Proceedings of the IEEE International Conference on Consumer Electronics (ICCE '13)

January 2013

Las Vegas, Nev, USA

IEEE

300 301

10.1109/icce.2013.6486903

2-s2.0-84876386508

Tian

Goto

Content adaptive prediction unit size decision algorithm for HEVC intra coding

Proceedings of the Picture Coding Symposium (PCS '12)

May 2012

10.

da Silva

T. L.

Agostini

L. V.

Cruz

L. A. D. S.

Fast HEVC intra prediction mode decision based on EDGE direction information

Proceedings of the 20th European Signal Processing Conference (EUSIPCO '12)

August 2012

Bucharest, Romania

IEEE

1214 1218

2-s2.0-84869808585

11.

da Silva

T. L.

da Silva Cruz

L. A.

Agostini

L. V.

Fast HEVC intra mode decision based on dominant edge evaluation and tree structure dependencies

Proceedings of the 19th IEEE International Conference on Electronics, Circuits, and Systems (ICECS '12)

December 2012

Seville, Spain

IEEE

568 571

10.1109/icecs.2012.6463683

2-s2.0-84874623069

12.

Cho

Kim

Fast CU splitting and pruning for suboptimal CU partitioning in HEVC intra coding

IEEE Transactions on Circuits and Systems for Video Technology 2013 23 9 1555 1564

10.1109/TCSVT.2013.2249017

2-s2.0-84883762073

13.

Chen

Pei

Sun

Liu

Ikenaga

Fast intra prediction for HEVC based on pixel gradient statistics and mode refinement

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP '13)

July 2013

Beijing, China

514 517

10.1109/chinasip.2013.6625393

2-s2.0-84889597799

14.

Chen

Sun

Liu

Ikenaga

Fast mode and depth decision HEVC intra prediction based on edge detection and partitioning reconfiguration

Proceedings of the International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS '13)

November 2013

Naha, Japan

IEEE

38 41

10.1109/ISPACS.2013.6704518

15.

Wang

L.-L.

Siu

W.-C.

Novel adaptive algorithm for intra prediction with compromised modes skipping and signaling processes in hevc

IEEE Transactions on Circuits and Systems for Video Technology 2013 23 10 1686 1694

10.1109/TCSVT.2013.2255398

2-s2.0-84885607506

16.

Quanhe

Yaocheng

Yun

Fast intra mode decision strategy for HEVC

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP '13)

July 2013

Beijing, China

500 504

10.1109/chinasip.2013.6625390

2-s2.0-84889591562

17.

Wang

Advanced screen content coding using color table and index map

IEEE Transactions on Image Processing 2014 23 10 4399 4412

10.1109/tip.2014.2346995

18.

Kwon

D.-K.

Budagavi

Fast intra block copy (IntraBC) search for HEVC screen content coding

Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '14)

June 2014

9 12

10.1109/iscas.2014.6865052

2-s2.0-84907417810

19.

Matsuo

Takamura

Shimizu

Intra angular prediction with weight function and modification filter

Proceedings of the Picture Coding Symposium (PCS '13)

December 2013

San Jose, Calif, USA

IEEE

77 80

10.1109/pcs.2013.6737687

2-s2.0-84897731158

20.

Ojala

Pietikäinen

Mäenpää

Multiresolution gray-scale and rotation invariant texture classification with local binary patterns

IEEE Transactions on Pattern Analysis and Machine Intelligence 2002 24 7 971 987

10.1109/tpami.2002.1017623

2-s2.0-0036647193

21.

Wang

Liang

Wang

Background modeling using local binary patterns of motion vector

Proceedings of the IEEE Visual Communications and Image Processing (VCIP '12)

November 2012

San Diego, Calif, USA

IEEE

1 5

10.1109/vcip.2012.6410784

2-s2.0-84874047520

22.

Verma

Dabbagh

M. Y.

Binary pattern based edge detection for motion estimation in H.264/AVC

Proceedings of the IEEE Canadian Conference of Electrical and Computer Engineering (CCECE '13)

May 2013

23.

Yang

Wang

Lei

Zhao

S. Z.

Spatio-temporal LBP based moving object segmentation in compressed domain

Proceedings of the 9th International Conference on Advanced Video and Signal-Based Surveillance (AVSS '12)

September 2012

Beijing, China

IEEE

252 257

10.1109/avss.2012.68

2-s2.0-84868241403

24.

Flynn

Rosewarne

Common test conditions and software reference configurations for HEVC range extensions

Proceedings of the 14th Meeting of Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11

August 2013

Vienna, Austria

25.

Bjøntegaard

Calculation of average PSNR differences between RD-curves (VCEG-M33)

Proceedings of the VCEG Meeting

April 2013

Austin, Tex, USA

ITU-T SG 16 Q.6 Document

Fast Video Encoding Algorithm for the Internet of Things Environment Based on High Efficiency Video Coding

Abstract

1. Introduction

2. HEVC Encoding Structure and Related Works

3. Proposed Work

3.1. Local Binary Patterns (LBPs)

3.2. The Overall Procedure of the Proposed Fast Scheme

Step 1.

Step 2.

Step 3.

Step 4.

Step 5.

4. Experimental Results

5. Conclusions

Footnotes

Conflict of Interests

Acknowledgment

References