Abstract
Video data accounts for a growing share of Internet traffic, and real-time video transmission is an important consideration in the Internet of Things (IoT). In the IoT environment, video applications will therefore be a valuable component of networks of smart sensor devices. High Efficiency Video Coding (HEVC) was developed by the Joint Collaborative Team on Video Coding (JCT-VC) as a new-generation video coding standard. Recently, HEVC has added range extensions (RExt), scalable coding extensions, and multiview extensions. HEVC RExt supports high-resolution video with a high bit-depth and an abundance of color formats. In this paper, a fast intraprediction unit decision method is proposed to reduce the computational complexity of the HEVC RExt encoder. For the intramode decision algorithm, the Local Binary Pattern (LBP) of the current prediction unit is used as a texture feature. Experimental results show that the encoding complexity can be reduced by 12.35% on average in the AI-Main profile configuration, with only a small bit-rate increment and PSNR decrement, compared with the HEVC test model (HM) 12.0-RExt4.0 reference software.
1. Introduction
The Internet of Things (IoT) is a sensing network that connects objects to the Internet using many kinds of sensor equipment. Along with the rapid development of IoT applications, new generations of mobile broadband networks, cloud computing, and video coding technology for real-time video streaming all represent an interactive and realistic development direction for next-generation multimedia application networks. These technologies will play a valuable role in the industrial, medical, and television fields [1–3].
MPEG has already started standardization activities to define network protocols for the Internet of Things (e.g., how to connect things). The variety and heterogeneity of “Things” make it difficult to standardize descriptions, data formats, and APIs in a global manner; however, once the environment is well established, this can be done. Therefore, MPEG is exploring representations of multimedia things as parts of complex distributed systems that involve interaction between things and between humans and things. The multimedia data type elements correspond to descriptions of devices and to messages for “talking to” and “adapting to” either devices or services in the Internet of Things.
Recently, there has been a shift in video content services from lower-resolution video formats to ultrahigh definition (UHD) video. Mobile device, storage, and network technologies are striving to keep pace with rapid changes in the market. Modern data compression techniques make it possible to store and transmit the significant amounts of data that UHD content demands, since UHD video has a very high data rate. Existing video compression technology is used in many applications: broadcasting high definition (HD) TV signals over satellite, cable, and terrestrial transmission systems; video content acquisition and editing systems; camcorders; security applications; Internet and mobile network video; Blu-ray discs; and real-time conversational applications such as video chat, video conferencing, and telepresence systems for lower-resolution video sequences.
However, the growing popularity of HD video, an increasing diversity of services, and the emergence of beyond-HD formats (4k × 2k or 8k × 4k resolution, called UHD) require video coding with efficiency superior to previous video compression standards. Moreover, the traffic caused by video applications targeting mobile devices and tablet PCs and the transmission requirements of video-on-demand services are imposing severe pressure on existing networks. An increasing desire for higher quality and better resolution is also driving mobile applications.
H.264/MPEG-4 AVC [4] is still widely used in many applications, both real-time and non-real-time. However, this standard suffers from bit-rate increments and significant computational complexity for beyond-HD resolution applications.
A next-generation video coding scheme, called High Efficiency Video Coding (HEVC), was developed by the Joint Collaborative Team on Video Coding (JCT-VC) of ISO/IEC MPEG and ITU-T VCEG [5]. HEVC version 1 had the primary goal of achieving a 50% higher compression rate than H.264/MPEG-4 AVC, with a primary focus on 8-bit/10-bit YUV 4:2:0 video. Although the standard achieves this high compression rate using improved and modified coding tools, it still requires a large amount of time for compression.
HEVC extensions are being developed to support several additional application scenarios, including professional uses with enhanced precision and color format support, scalable video coding, and 3D/stereo/multiview coding. Among these extensions, the HEVC range extension (RExt) provides a high bit-depth (larger than 10 bits) and different color formats for high-resolution sequences.
HEVC RExt has the same structure as HEVC, but additional coding tool options have been added to support higher sample bit-depths and different color formats. The 4:2:2 and 4:4:4 enhanced chroma sampling structures and sample bit-depths beyond 10 bits per sample are supported [6].
UHD resolution is expected to emerge in the near future and will be supported by next-generation displays. This kind of data rate increase will put additional pressure on all types of networks, and data rates for video content are increasing faster than network infrastructure capacities for economical delivery. HEVC and its extensions achieve good performance at the cost of large computational complexity, because heavy and complicated coding tools are used to improve coding efficiency and to support deep color formats and high bit-depths.
To reduce the computational complexity in the HEVC RExt encoder, a fast intramode decision algorithm is proposed based on block texture information. This paper is organized as follows: in Section 2, the HEVC structure and related works are introduced. Local Binary Patterns (LBPs) and fast intramode decision method based on LBP are described for the proposed algorithm in Section 3. Section 4 presents the coding performance of the algorithm, and Section 5 presents concluding remarks.
2. HEVC Encoding Structure and Related Works
The HEVC standard has adopted highly flexible and efficient block partitioning based on the introduction of the coding tree unit (CTU). There are three block units: the coding unit (CU), the prediction unit (PU), and the transform unit (TU). The CU is the basic block type, analogous to the macroblock of H.264/AVC. The PU is used for the coding mode decision, including motion estimation and rate-distortion (RD) optimization (RDO). Transform and entropy coding are performed on the TU. Initially, a frame is divided into largest-size CUs, called coding tree units (CTUs). A CTU consists of one luma coding tree block (CTB) and two chroma CTBs. Each CTB is an assemblage of square-shaped coding blocks (CBs) that are divided based on a quadtree structure.
Each CB is square, and its size can be 8, 16, 32, or 64. This change is more effective and beneficial than the previous H.264/AVC approach, which used a fixed 16 × 16 macroblock (MB): a larger and more flexible block structure is effective for encoding high-resolution video.
The CTU size can be 16 × 16, 32 × 32, or 64 × 64; a 64 × 64 CTU is typically used for high-resolution video.
Each CB is predicted by an intra- or interprediction process that is performed on the PU. The intraprediction process uses two partition modes (PART_2N×2N and PART_N×N) and supports 35 prediction modes: planar (mode 0), DC (mode 1), and 33 angular modes, a much finer set of directions than the 9 modes of H.264/AVC.

Different intraprediction direction and mode for (a) HEVC and (b) H.264/AVC.
In order to reduce the time required for intraprediction coding, Yoo and Suh [8] proposed an early termination algorithm for inter- and intra-PUs that checks a coded block flag (CBF) value and the RD cost of the inter-PU. If the conditions based on these two values are satisfied, the remaining inter-PU and intra-PU evaluations are skipped. A two-stage prediction unit size decision method has also been presented [9], in which texture complexities are analyzed according to the video content using the variance in order to filter out unnecessary PUs. Then, for intraprediction coding, small PU sizes to skip are selected based on the PU sizes of the encoded upper-left, upper, and left blocks. Other fast algorithms use dominant edge information [10] or a subset of tree-level PUs [11].
Cho and Kim [12] proposed fast CU splitting and pruning methods based on Bayes decision rules in order to reduce the computational complexity in the HEVC intraprediction process. Fast intraprediction approaches based on gradients have been used [7, 13, 14]. Wang and Siu [15] reported an adaptive intramode skipping algorithm and signaling processes using statistical properties of reference samples. An intramode decision strategy arranging candidate modes into different groups has been presented using a notation of a circle [16]. In HEVC RExt, an advanced color Table and Index Map (cTIM) [17], intrablock copy (IntraBC) [18], and angular prediction with a weight function and a modification filter based on a blending filter for DC mode [19] have been proposed.
3. Proposed Work
3.1. Local Binary Patterns (LBPs)
The intraprediction process has usually been analyzed using image texture information. Local Binary Pattern (LBP) features were originally designed for texture description [20]. The LBP operator transforms an image into an array or image of integer labels describing the small-scale appearance of the image. These labels, or their statistics, most commonly in the form of a histogram, are then used for further image analysis. This approach has advantages such as gray-scale invariance and normalization. The LBP represents texture information with negligible time overhead because the LBP operator is simple to compute.
The LBP operator is based on the assumption that texture has two locally complementary aspects: a pattern and its strength. In H.264/AVC, the LBP has been used to extract moving objects with motion vectors and to exploit edge information in the motion estimation process [21–23]. The pixels in a particular block area are thresholded against the center pixel value, multiplied by powers of two, and then summed to obtain a label for the center pixel. If the neighborhood consists of 8 pixels, a total of 2^8 = 256 different patterns can be produced.
Circularly symmetric neighbor sets for different values of (P, R) are shown in Figure 2.

Different neighbor sets for Local Binary Patterns: P is the number of neighboring pixels and R is the radius of the circle.
For analysis of local texture patterns, the joint distribution of gray-level differences with spatial characteristics can be modeled as T ≈ t(g_0 − g_c, g_1 − g_c, …, g_(P−1) − g_c), where g_c is the gray value of the center pixel and g_0, …, g_(P−1) are the gray values of its P neighbors.
The resulting LBP is a discriminative descriptor of the local pattern formed by the neighboring pixels and the center pixel. The LBP code can represent a bright/dark spot, a flat area, an edge, an edge end, or a curve; in a constant region, all differences are zero.
Equation (3) represents the binary bit value calculated at the ith neighbor. Let g_c denote the gray value of the center pixel and g_i the gray value of the ith neighbor; the thresholding function in (3) is then s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise.
The LBP operator when P is 8 and R is 1 is shown in Figure 2(b). The binary bits can be transformed into an integer value, used as the pattern number, with (4), where the binary bit stream is the combination of the bits calculated by the thresholding function (3). For example, Figure 3 illustrates the calculation of an LBP(8,1) pattern number over a 3 × 3 neighborhood.

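The thresholding and weighted summation of (3) and (4) can be sketched as follows. This is a minimal illustration, not the paper's implementation; in particular, the neighbor visiting order is an assumption, since the paper's figure is not reproduced here.

```python
def lbp_8_1(block, cy, cx):
    """Compute the LBP(P=8, R=1) code for the pixel at row cy, column cx.

    block is a 2-D list of gray values. Each neighbor g_i with
    g_i >= g_c contributes 2**i to the pattern number; the visiting
    order below (counter-clockwise from the right) is an assumption.
    """
    center = block[cy][cx]
    # Offsets (dy, dx) of the 8 neighbors on a radius-1 circle.
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    code = 0
    for i, (dy, dx) in enumerate(offsets):
        if block[cy + dy][cx + dx] >= center:  # thresholding s(g_i - g_c)
            code |= 1 << i
    return code
```

A perfectly flat neighborhood yields the all-ones pattern (255), since every difference is zero and s(0) = 1.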
Many texture analysis applications require invariance or robustness to rotations of the input image.

Different textures detected using LBP(8,R).
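Rotation invariance is commonly obtained by mapping each LBP code to a canonical representative. The sketch below uses the standard approach of taking the minimum value over all circular bit rotations; the paper does not detail its own mapping, so this is illustrative only.

```python
def lbp_rotation_invariant(code, p=8):
    """Map a P-bit LBP code to its rotation-invariant form by taking
    the minimum value over all circular bit rotations."""
    best = code
    for _ in range(p - 1):
        # Rotate the P-bit pattern right by one position.
        code = ((code >> 1) | ((code & 1) << (p - 1))) & ((1 << p) - 1)
        best = min(best, code)
    return best
```

All rotations of the same local structure (e.g., a four-pixel run of ones) collapse to a single pattern number under this mapping.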
The texture of the current PU can be identified as a discriminative texture using the LBP. In the HEVC encoder, interpolation is used to place the neighboring sample locations of the LBP appropriately, using (5). The designated center position and the neighboring locations relative to the center, which depend on the PU size, are shown in Figure 5. The number of neighboring pixels P is 8, and R is chosen according to the PU size.

Therefore, the probability distribution of LBP pattern numbers was analyzed over various test sequences.

Probability distribution of pattern numbers for various sequences.

Probability distribution of the best intramode for the most probable LBPs.
The sequences have a 4:2:2 color format, 10 bits per sample, and 1920 × 1080 resolution in HM version 12.0-RExt4.0. Similar distribution graphs for all test sequences are shown in Figure 6, indicating that the same texture features appear across different sequences. Furthermore, before the encoding stage, the set of most probable patterns, those with high distribution rates in the LBP histogram, is prepared in advance as a look-up table.
The mode distributions for the different most probable patterns are shown in Figure 7; they are similar to one another and concentrate on Intra_Planar, Intra_DC, and the vertical mode (modes 0, 1, and 26). Therefore, when the current pattern is one of the most probable patterns, only the DC, Planar, and 26 modes are tested, making a quick mode decision based on texture information.
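The offline preparation of the look-up table can be sketched as a simple frequency count over training sequences. The `top_k` parameter is hypothetical; the paper does not state how many patterns its table holds.

```python
from collections import Counter

def build_most_probable_patterns(lbp_codes, top_k=8):
    """Offline sketch: histogram the LBP codes observed in training
    sequences (as in Figure 6) and keep the top_k most frequent codes
    as the look-up table of most probable patterns."""
    counts = Counter(lbp_codes)
    return frozenset(code for code, _ in counts.most_common(top_k))
```

At encoding time, membership in this frozen set is an O(1) check, so the table adds essentially no overhead to the mode decision.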
3.2. The Overall Procedure of the Proposed Fast Scheme
The complexity of HEVC is significantly higher than that of H.264/AVC because of its improved encoding efficiency. Consequently, HEVC requires fast coding processes that still guarantee efficient compression. HEVC therefore includes fast encoding tools in the prediction, transform, and filtering processes.
To support high-speed encoding, the intramode prediction process in the original HM-12.0-RExt4.0 is performed using a rough mode decision (RMD) and the most probable modes (MPMs); both contribute to speeding up the intraprediction process. Intraprediction selects the N best candidate modes with the RMD, in which all modes are evaluated by the sum of absolute Hadamard-transformed coefficients of the residual signal (HSAD) plus the number of mode bits. The number of best RMD candidates N is 8 for 4 × 4 and 8 × 8 PUs and 3 for 16 × 16, 32 × 32, and 64 × 64 PUs.

The overall procedure of the proposed algorithm.
Step 1.
Initially, the LBP is calculated for the current encoded PU.
Step 2.
If the LBP is included in the most probable patterns, which are already defined in the look-up table, go to Step 3. Otherwise, go to Step 4.
Step 3.
Prediction is only performed three times for a set of 0, 1, and 26 candidate modes. Next, go to Step 5.
Step 4.
Prediction is performed for the number of modes based on the PU size. Go to Step 5.
Step 5.
The best mode is selected with the minimum RD cost.
In the proposed scheme, the MPM-style condition is based on the most probable LBPs in the look-up table. If the local texture pattern of the encoded block satisfies this condition, the intraprediction process is performed only three times, for the three modes Intra_Planar (0), Intra_DC (1), and vertical (26).
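Steps 1–5 above can be sketched as a single decision function. The `compute_lbp` and `rd_cost` arguments are assumed hooks into the encoder, not real HM functions.

```python
PLANAR, DC, VERTICAL = 0, 1, 26  # reduced candidate set of Step 3

def fast_intra_mode_decision(pu, most_probable_patterns,
                             compute_lbp, rd_cost, full_modes):
    """Sketch of the proposed Steps 1-5: compute the PU's LBP, shrink
    the candidate list when the pattern is in the look-up table, then
    pick the mode with minimum RD cost."""
    code = compute_lbp(pu)                        # Step 1
    if code in most_probable_patterns:            # Step 2
        candidates = [PLANAR, DC, VERTICAL]       # Step 3
    else:
        candidates = full_modes                   # Step 4
    return min(candidates, key=lambda m: rd_cost(pu, m))  # Step 5
```

The speed-up comes entirely from Step 3: for blocks whose texture matches a most probable pattern, 35 candidate evaluations shrink to 3.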
4. Experimental Results
The proposed fast scheme was implemented on HM-12.0-RExt4.0 (the HEVC RExt reference software). The test environment was all-intra, using the AI-Main configuration. For wireless video communication, the IPPP structure, in which one I frame is followed by all P frames, was usually employed in the past. Recently, wireless video communication has been required to support high-resolution video services due to rapid advances in network technology, and the all-intra structure is needed to provide better quality than the IPPP structure. Standard sequences of 50 frames were used, with three to four sequences per class, over the quantization parameter (QP) range (12, 17, 22, and 27) defined by the superhigh tier (SHT) [24]. The test sequences were classified by color format into RGB4:4:4, YCbCr4:4:4, and YCbCr4:2:2. Each class had a 1920 × 1080 resolution. Details of the encoding environment can be found in JCTVC-N1006 [24].
To evaluate performance, measurements of the bit-rate increment, Y-PSNR decrement, Bjøntegaard delta bit-rate (BDBR), and encoding time saving were used, where the time saving is the relative reduction in total encoding time with respect to the reference encoder.
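The time-saving figures in the tables follow the usual definition, ΔTime = (T_ref − T_prop) / T_ref × 100 (%); the exact formula is an assumption, since the paper does not spell it out, but it is consistent with the reported numbers.

```python
def time_saving(t_ref, t_prop):
    """Encoding time saving in percent relative to the reference
    encoder: (T_ref - T_prop) / T_ref * 100."""
    return (t_ref - t_prop) / t_ref * 100.0
```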
The performance results of the proposed algorithm and of the algorithm of Jiang et al. [7] on the HM-12.0-RExt4.0 software are shown in Tables 1, 2, and 3, one table per color format (RGB4:4:4, YCbCr4:4:4, and YCbCr4:2:2). Bjøntegaard delta bit-rates (BDBR) [25] are reported as a performance measurement. The time reduction of the proposed method in RGB4:4:4 was 11.29% on average, with 0.28%, 0.14 dB, and 1.13% losses in bit-rate, Y-PSNR, and BDBR, respectively. In RGB4:4:4, Jiang's algorithm achieved a 7% complexity reduction on average, with a 0.48% bit-rate increment, a 0.12 dB loss of Y-PSNR, and a 1.25% BDBR.
The performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in RGB4:4:4 with superhigh tier (SHT) of QP range.
The performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in YCbCr4:4:4 with superhigh tier (SHT) of QP range.
The performance of the proposed algorithm and Jiang's algorithm [7] on the HM-12.0-RExt4.0 reference software in YCbCr4:2:2 with superhigh tier (SHT) of QP range.
For sequences with YCbCr4:4:4 (Table 2), the proposed algorithm incurred a BDBR loss of 1.87%, with a 0.9% bit-rate increment and a 0.08 dB PSNR decrement, on average, while achieving an 11.78% speed-up gain. In YCbCr4:4:4, the algorithm of [7] incurred 0.52%, 0.07 dB, and 1.44% losses in bit-rate, Y-PSNR, and BDBR, respectively, with a time reduction of 8.43% on average. Losses and gains of the proposed HEVC encoding system with YCbCr4:2:2 are shown in Table 3: the simulated sequences achieved a 13.97% time saving and a 2.9% BDBR, with a 1.24% bit-rate loss and a 0.16 dB PSNR loss, on average. Jiang's algorithm achieved an 11.45% average complexity improvement and a 2.17% BDBR, with a 0.76% bit-rate increment and a 0.13 dB Y-PSNR loss.
The proposed algorithm achieved a speed-up gain of up to 16.14% with a small bit-rate increment in the Seeking sequence with QP = 12 and YCbCr4:2:2. In the DucksAndLegs sequence with QP = 22 and YCbCr4:2:2, Jiang's algorithm achieved a speed-up factor of up to 12.18% with a smaller bit-rate loss. The bit-rate increment was larger for nonnatural sequences and videos with many moving objects, such as the EBUGraphics sequence. Overall, the proposed approach obtained an average speed-up gain of 12.35% compared with the original intraprediction process, with a negligible bit-rate increment, whereas the gradient-based fast intramode decision algorithm of Jiang et al. [7] achieved an average encoding speed-up of 8.98%. Although Jiang's algorithm gives better BDBR performance than the proposed method, the proposed method reduces more encoding time without significant degradation relative to the gradient-based scheme [7].
RD performance for test sequences, classified by color and line style, over the SHT QP range in AI-Main is shown in Figure 9. The original standard and the proposed method exhibited nearly identical RD curves (Figures 9(a) and 9(b)). The Seeking sequence with YCbCr4:2:2 (Figure 9(c)) showed a negligible loss of bit-rate and quality.

Rate-distortion (RD) curves for (a) Kimono (RGB4:4:4), (b) BirdsInCage (YCbCr4:4:4), and (c) Seeking (YCbCr4:2:2) sequences in AI-SHT.
The largest loss of image quality relative to the original HM encoder was observed for the EBUGraphics sequence (up to 0.22 dB on average). However, the proposed method maintained high quality above 40 dB when the QP value was less than or equal to 22, and the video quality was greater than or equal to 50 dB when the QP value was set to 12, except for the OldTownCross sequence. Furthermore, the proposed fast intramode decision scheme supports rapid compression with only a small Y-PSNR loss. The proposed algorithm is therefore suitable for real-time encoding systems, without significant degradation of encoding performance, for large video resolutions, deep color formats, and high bit-depth sequences.
5. Conclusions
A fast intramode decision scheme is proposed for high-resolution video with a high bit-depth and rich color formats. The proposed algorithm achieved a 12.35% time saving and a BDBR of 1.96%, on average, over the original HM-12.0-RExt4.0 software, with a Y-PSNR loss of 0.12 dB and a 0.81% bit-rate increment. The proposed algorithm is well suited to real-time Internet of Things (IoT) environments and can be useful for real-time HEVC video encoding systems while maintaining video quality.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-1280, Development of Cloud Computing Based Realistic Media Production Technology).
