Hierarchical Objective Quality Assessment for CS Video in WMSN

Abstract

Compressive sensing (CS) samples signals at sub-Nyquist rates while still enabling exact reconstruction, which makes it well suited to wireless multimedia sensor networks (WMSN). In this paper, based on the characteristics of CS video in WMSN, we propose a hierarchical objective CS video quality assessment (HOCSVQA) approach that yields a CS video quality index (CSVQI) at three levels: measurement level, stream level, and packet level. The approach not only keeps the convenience and real-time character of objective video assessment but also reflects the QoE to a certain extent, because its coefficients are regressed from subjective video assessment experiments. One set of subjective CS video quality assessment experiments and another set of verification experiments were designed and conducted. The CSVQI given by the proposed model shows a high correlation with the data from the verification experiments under statistical correlation measures.

1. Introduction

Compressive sensing (CS), an emerging approach to data acquisition, has attracted considerable attention in the signal processing community. In the framework of CS theory [1–4], a sparse signal, that is, a signal that is compressible with respect to some sparsifying transform, can be sampled at sub-Nyquist rates while still enabling exact reconstruction of the original signal. In contrast to the local sampling of the classic Nyquist sampling theorem, the entire extent of the signal is observed through a linear random projection, the sensing matrix, which yields a small set of measurements; this can be viewed as a global way to acquire and compress data. Moreover, because CS breaks the Nyquist rate limitation and combines sampling with compression, it fits low-consumption and low-complexity image/video acquisition systems, especially the wireless multimedia sensor network (WMSN). CS-based video processing, called CS video in this paper, has been studied in recent years [5–10] and shows considerable promise.

In a sensor network, the performance of the wireless channel is essential to assuring transmission quality. In a video content network, a video quality index can serve as a network performance measure for optimizing the network. A considerable gap remains between CS theory and video application services in sensor networks, and how to evaluate CS video quality in a sensor network is an open problem. In conventional video signal processing systems, many practical video quality assessment (VQA) algorithms [11–13] can be used to evaluate, control, and improve the perceptual quality of the video signal, and such assessment is equally important for CS video processing. However, the CS video codec scheme differs fundamentally from the conventional scheme, and current reconstruction algorithms for CS video [14, 15] generally use the Peak Signal-to-Noise Ratio (PSNR) as the recovery quality metric, which is only a signal fidelity measure and is not correlated with human perception of quality [16]. Therefore, solving the CS video quality evaluation problem requires an effective and practical metric that focuses not on measurable physical distortion but on how the video quality is experienced subjectively, that is, on the Quality of Experience (QoE) rather than the Quality of Service (QoS). More importantly, this metric must suit the characteristics of the CS video signal. A CS video quality evaluation approach based on redundant measurements was proposed in our previous work [17], as was a parity-based error control method for improving CS video quality [18].

In this paper, we propose a hierarchical objective CS video quality assessment (HOCSVQA) approach that yields a CS video quality index (CSVQI) for WMSN. The first step in defining the video quality assessment model is to determine the principal components at three levels: measurement level, stream level, and packet level. Then, to obtain the model coefficients by regression analysis, one set of subjective CS video quality experiments and another set of verification experiments are designed and conducted. Finally, the CSVQI given by the proposed model is shown to maintain a high correlation with the data from the verification experiments under statistical correlation measures. The quality indices at the different levels have different applications: the stream-level index can be used for rate control, the packet-level index for error resistance, and the measurement-level index for CS-based video coding. They can also be used together to obtain video quality information.

The remainder of this paper is organized as follows. Section 2 gives a brief overview of CS theory, the CS video framework, and video quality assessment. Section 3 presents the proposed HOCSVQA approach and the hierarchical video quality assessment model. Section 4 describes the preparation and procedure of the subjective CS video quality assessment experiments. Experimental results are presented in Section 5. Section 6 concludes the paper and provides some further discussion.

2. Related Work

2.1. Compressive Sensing Theory

CS theory [1–4] provides a novel theoretical framework for signal representation, sampling, and reconstruction. It states that if a signal $x \in R^{N}$ is sparse in the time domain or with respect to some transform basis Ψ ( $N \times N$ ), then a global measurement process, which uses a measurement matrix Φ ( $M \times N$ ) as a linear projection at a sampling rate far below the Nyquist limit, can be employed instead of local sampling; the dimension of the measurements $y \in R^{M}$ acquired by this projection is much lower than that of the original signal x, that is, $M ≪ N$ . The CS measurement process can therefore be regarded as performing signal acquisition and dimensionality reduction simultaneously. The original high-dimensional signal x can then be recovered accurately from the low-dimensional measurements y by optimization with an appropriate reconstruction algorithm [14, 15]. This optimization searches for the set of coefficients with the minimum $l_{0}$ norm that agrees with the measurements y . Consider

\begin{matrix} \hat{x} = \arg\min {‖x‖}_{0} \\ \text{s.t. } y = Φ x . \end{matrix}

(1)

The sampling rate depends not on the bandwidth of the original signal but on two basic criteria: the sparsity of the original signal x and the restricted isometry property (RIP), which can be simplified as incoherence between the measurement matrix Φ and the sparse transform basis Ψ [19]. The theoretical framework of compressive sampling is shown in Figure 1.

Figure 1

Compressive sampling framework.

Sparsification, the measurement matrix, and the reconstruction algorithm are the three key parts of CS theory in the steps above.
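The three parts can be illustrated with a minimal Python sketch. It is not the pipeline used in this paper: the signal is assumed to be sparse in the identity basis, Φ is assumed to be an i.i.d. Gaussian matrix, and orthogonal matching pursuit (OMP), a standard greedy approximation to problem (1), stands in for the reconstruction algorithm. The sizes N, M, and K, the stopping rule, and the function name `omp` are illustrative choices.

```python
import numpy as np

def omp(Phi, y, max_iter, tol=1e-10):
    """Orthogonal matching pursuit: greedy approximation to problem (1)."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(max_iter):
        if np.linalg.norm(residual) <= tol:
            break
        # Select the column of Phi most correlated with the current residual.
        idx = int(np.argmax(np.abs(Phi.T @ residual)))
        if idx in support:
            break  # nothing new to add
        support.append(idx)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
N, M, K = 256, 96, 4                      # assumed sizes, K < M < N

# Sparsification: a K-sparse signal (sparse in the identity basis here).
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.choice([-1.0, 1.0], size=K)

# Measurement: y = Phi x with a random Gaussian measurement matrix.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x

# Reconstruction from the M measurements only.
x_hat = omp(Phi, y, max_iter=3 * K)
print("max reconstruction error:", np.max(np.abs(x_hat - x)))
```

The sketch samples the whole signal through M global projections instead of N local samples, which is the acquisition-plus-compression behavior described above.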

2.2. The CS Video

Besides the benefits in sampling rate and consumption, another advantage of applying CS to video is the unstructured representation of the signal in the measurements, which allows the video to withstand a certain amount of packet loss. CS video was first proposed in [5] under the name compressive video sampling, where each video block is classified as either sparse or nonsparse. A series of studies on block-based CS video was conducted by Fowler et al. [6, 7]. In addition, distributed CS video [9, 10], which combines CS theory with distributed source coding (DSC), shifts the complicated motion estimation from the encoder to the decoder and thus further reduces consumption at the encoder. The schemes studied in this literature are referred to as CS video in this paper, to distinguish them from conventional video.

When applied to video, however, CS still faces several challenges, including reconstruction quality that is not yet good enough, a computationally expensive reconstruction process, and the large memory required to store the random measurement matrix.

2.3. Video Quality Assessment

Video quality assessment methods fall into two basic categories: subjective and objective. In subjective quality assessment, a group of subjects rates the perceived video quality under specified environmental conditions; most such methods are based on the Absolute Category Rating (ACR) recommended in ITU-T P.910 [20]. Subjective assessment provides high accuracy in terms of human perception, also regarded as QoE, but at high cost and without real-time capability. Objective quality assessment evaluates the quality of the video stream by a function of measurable parameters. Objective methods are classified into three categories according to the availability of reference video information: Full Reference (FR), Reduced Reference (RR), and No Reference (NR). Mean Square Error (MSE) and Peak Signal-to-Noise Ratio (PSNR), for example, compare the video with the reference pixel by pixel. In general, objective methods are convenient and work in real time but are not correlated with QoE. Video Quality Metrics (VQM) and Video Structural Similarity (VSSIM) [11, 12] are objective quality assessment methods in active use.

From a practical point of view, what VQA needs most is an NR or RR objective quality assessment method. The proposed HOCSVQA has this property. Moreover, the coefficients in the model are obtained by regression analysis of subjective experiments, so the quality index, CSVQI, given by our model reflects QoE.

3. The Hierarchical Objective CS Video Quality Assessment Model

Many factors affect the perceived quality of video, including bit rate, frame rate, network service quality, resolution, video format, delay, and jitter. In practice, the volume of data carried per second of video is the most influential factor; equivalently, it is the amount of information received at the viewing end. This paper therefore focuses on degradation factors of this type as the parameters of the CS video quality assessment model. The proposed model can also guide the treatment of other degradation factors such as delay and jitter, provided that the degradation is quantified and the distortion is simulated.

Because our emphasis is on low consumption and low complexity at the CS video encoder, the video quality assessment model should be objective and straightforward (without reference), and the parameters selected for the model should likewise be obtainable at low complexity.

Because CS video differs from conventionally encoded video, the measurement parameters of CS video are selected as the primary elements. CS video is also a form of media stream, so stream-level parameters are taken into account. Finally, considering the impact of network service quality, packet loss is included in the model as the packet-level parameter. In summary, we establish an empirical model of CS video quality degradation, the hierarchical objective CS video quality assessment (HOCSVQA) model, at three levels: measurement level, stream level, and packet level.

The proposed hierarchical model is based on subjective CS video quality assessment experiments. Subjects were organized to assess the Mean Opinion Score (MOS) of each CS video sequence sample. The coefficients of the model were then obtained by multivariate nonlinear regression analysis. The CSVQI given by the model was verified against corresponding verification experiments. The model and experiments are shown in Figure 2.

Figure 2

The HOCSVQA model and design of experiments.

3.1. Measurement Level

At the measurement level, we select as parameters the measurement rate per frame and the measurement quantization step, that is, the number of bits per measurement. These two parameters are specific to CS video and behave differently from those of conventional video. The linear relation between measurement rate and quality can be seen clearly in Figure 3, which plots the reconstruction quality of the 30th frame of the “coastguard” video sequence against the measurement rate. Figure 4 shows the probability distribution of the measurement values of the 30th frame of “coastguard” with 10000 measurements. The measurement values lie between −204 and 246 and approximately follow a Gaussian distribution. Accordingly, the initial empirical model at the measurement level was assumed to be

\begin{matrix} Q_{c} = \begin{cases} c_{1} M_{r} + c_{a 1} & Q_{step} = 6 \\ c_{2} M_{r} + c_{a 2} & Q_{step} = 7 \\ c_{3} M_{r} + c_{a 3} & Q_{step} = 8 \\ c_{4} M_{r} + c_{a 4} & Q_{step} = 9, 10, 11, \end{cases} \end{matrix}

(2)

where $Q_{c}$ is the quality at the measurement level, $M_{r}$ is the measurement rate, and $Q_{step}$ is the quantization step. $c_{1}$ , $c_{2}$ , $c_{3}$ , and $c_{4}$ are the slope coefficients of the linear segments; $c_{a 1}$ , $c_{a 2}$ , $c_{a 3}$ , and $c_{a 4}$ are constant coefficients. The piecewise linear function, with one segment per quantization step, expresses the linear change of quality with measurement rate at each $Q_{step}$ . Note that the range of the quantization step matters: a short $Q_{step}$ cannot represent each measurement adequately, and a long one is wasteful. The literature on the quantization problem in CS theory is still limited. Here, we simplified the measurement quantization process by using scalar quantization to produce our experimental video materials.
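The trade-off in the choice of $Q_{step}$ can be sketched with a uniform scalar quantizer. This is a minimal illustration under stated assumptions, not the quantizer used to produce the experimental materials: the measurements are replaced by synthetic Gaussian values (the mean and standard deviation are invented), clipped to the range −204 to 246 reported for Figure 4, and the function name `quantize` is hypothetical.

```python
import numpy as np

def quantize(y, bits, lo, hi):
    """Uniform scalar quantizer with 2**bits levels on [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    # Index of the cell each value falls into, then its midpoint.
    idx = np.clip(np.floor((y - lo) / step), 0, levels - 1)
    return lo + (idx + 0.5) * step, step

rng = np.random.default_rng(1)
lo, hi = -204.0, 246.0                      # range reported for Figure 4
# Synthetic stand-in for CS measurements (assumed mean and spread).
y = np.clip(rng.normal(loc=21.0, scale=60.0, size=10000), lo, hi)

max_err = {}
for bits in (6, 7, 8, 9, 10, 11):           # Q_step values used in this paper
    y_q, step = quantize(y, bits, lo, hi)
    max_err[bits] = float(np.max(np.abs(y_q - y)))
    print(bits, "bits: cell width", round(step, 3),
          "max error", round(max_err[bits], 3))
```

Each additional bit halves the cell width, so the quantization error shrinks quickly at first and then becomes negligible relative to the reconstruction error, which is consistent with the similar scores observed for the larger quantization steps.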

Figure 3

Reconstruction quality of 30th frame of “coastguard” with different measurement rate.

Figure 4

Probability distribution of measurements value of 30th frame of “coastguard.”

3.2. Stream Level

As mentioned previously, bit rate and frame rate affect quality at the stream level and, as with traditional video streams, are easy to obtain in the video system. They are added to the communication protocol at the beginning of the video stream and are known to both the sending end and the receiving end. The empirical model at the stream level was assumed to be

\begin{matrix} Q_{s} = s_{1} B_{r} + s_{2} \frac{1}{B_{r}} + s_{3} F_{r} + s_{4} \frac{1}{F_{r}} + s_{a}, \end{matrix}

(3)

where $Q_{s}$ is the quality at the stream level, $B_{r}$ is the bit rate, and $F_{r}$ is the frame rate. $s_{1}$ , $s_{2}$ , $s_{3}$ , and $s_{4}$ are coefficients to be determined; $s_{a}$ is a constant coefficient. This model is a hyperbolic function of two variables. The linear terms dominate the trend, and the reciprocal terms refine the details. The coefficients in the model can be zero or negative.
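Although model (3) is nonlinear in $B_{r}$ and $F_{r}$, it is linear in its coefficients, so one way to fit it is ordinary least squares on the features $B_{r}$, $1/B_{r}$, $F_{r}$, $1/F_{r}$, and a constant. The sketch below illustrates this on the experiment A grid of Section 4.2. The scores are synthetic and noise free, generated from the coefficients of Equation (8), so the fit only checks that the procedure recovers known coefficients; it does not use the subjective data, and the function name `design_matrix` is hypothetical.

```python
import numpy as np

def design_matrix(bit_rate, frame_rate):
    """Columns [B_r, 1/B_r, F_r, 1/F_r, 1] of the stream-level model (3)."""
    b = np.asarray(bit_rate, dtype=float)
    f = np.asarray(frame_rate, dtype=float)
    return np.column_stack([b, 1.0 / b, f, 1.0 / f, np.ones_like(b)])

# Experiment A grid: 10 bit rates (Mbps) x 5 frame rates (fps).
bit_rates = np.array([0.9, 1.15, 1.4, 1.65, 1.9, 2.15, 2.4, 2.65, 2.9, 3.15])
frame_rates = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
B, F = np.meshgrid(bit_rates, frame_rates)
X = design_matrix(B.ravel(), F.ravel())

# Synthetic scores from the coefficients of Equation (8): s1, s2, s3, s4, s_a.
coef_true = np.array([0.455, -1.6655, 0.0165, 6.408, 2.6547])
q = X @ coef_true

# Least-squares estimate of the five coefficients.
coef_fit, *_ = np.linalg.lstsq(X, q, rcond=None)
print(np.round(coef_fit, 4))
```

With real rating scores in place of `q`, the same call returns the least-squares coefficients, and rows can be dropped from `X` and `q` to exclude selected test conditions.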

3.3. Packet Level

At the packet level, the main parameter affecting quality is the packet loss rate. The empirical model at the packet level was assumed to be

\begin{matrix} Q_{l} = l_{1} \log_{a} (l_{2} (b - P_{r})) + l_{a}, \end{matrix}

(4)

where $Q_{l}$ is the quality at the packet level and $P_{r}$ is the packet loss rate per frame. $l_{1}$ and $l_{2}$ are coefficients to be determined; $l_{a}$ is a constant coefficient. A logarithmic function is used to reflect the impact of packet loss on quality, which declines slowly at low loss rates and sharply at high loss rates. The constant b shifts the logarithm so that it behaves in this way. The logarithmic base a is determined by regression.
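As a side observation that follows from the rules of logarithms alone, and is not part of the model definition, Equation (4) can be rewritten for $l_{2} > 0$ and $P_{r} < b$ as

\begin{matrix} Q_{l} = l_{1} \log_{a} (l_{2} (b - P_{r})) + l_{a} = \frac{l_{1}}{\ln a} \ln (b - P_{r}) + (l_{1} \log_{a} l_{2} + l_{a}) . \end{matrix}

In this form the base a only rescales $l_{1}$ , and $l_{2}$ only shifts the constant term, so a , $l_{2}$ , and $l_{a}$ cannot be identified separately from data, and fixing a convenient base such as e loses no generality.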

How the video data are packetized to produce gradually degraded video materials depends on the specific communication protocol, which is not our main concern. We employed a full block line packet form for CS video, shown in Figure 5.

Figure 5

Full block line packet form for CS video.

The data analysis in Section 5 shows that the model agrees closely with the verification experiment results, which indicates that it is of practical use beyond the video material used in the regression experiments.

The hierarchical model can be applied as a single-level model in specific applications, such as rate control, error resistance, or CS-based video coding, and it can also be applied as a whole, as in the following expression:

\begin{matrix} Q = A Q_{c} + B Q_{s} + C Q_{l}, \end{matrix}

(5)

where Q is the overall video quality and $Q_{c}$ , $Q_{s}$ , and $Q_{l}$ are the measurement-level, stream-level, and packet-level quality, respectively. A, B, and C are the weighting coefficients of the three levels. Determining these weighting coefficients is left to future research; they can be obtained in the same way, by building video materials with different parameters and applying regression analysis.

4. Design of the Experiments

Two sets of experiments were designed. One provides the experimental data from which the coefficients of the proposed model are obtained by regression analysis; the other is a set of verification experiments. They are referred to as experiment A and experiment B, respectively. The experiment environment, material, methodology, and subjects are described in detail below.

4.1. Experiment Environment

Experiments were performed in a quiet room, and viewing conditions followed ITU standards [21]. A Surface RT was used to view the degraded CS video, with a 10.6′′ IPS screen, 1366 × 768 resolution, and 148 PPI. Data statistics and analysis were carried out in MATLAB R2012a.

4.2. Experiment Material

Experiment materials were taken from the http://xiph.org/ video test media [22]. The video sequences “Akiyo,” “Claire,” “Deadline,” “Grandma,” “Silent,” and “Suzie” were adopted, as shown in Figure 6. These “head and shoulder” sequences were chosen because they are representative of low motion and low complexity. The main content of each video is one speaker in front of some background, so the viewer's attention is primarily on the speaker. The reason for this choice is that subjects' perceptual sensitivity differs with video content. For instance, fast-motion video such as soccer has more than one focus of attention, such as the moving ball and the appearance and location of the athletes. If these focuses degrade even slightly, the overall perceptual quality of the video degrades greatly. In addition, the quality of fast-motion video is lower than that of head-and-shoulder video at the same rate. Hence, video sequences with uniform content were adopted in our experiments.

Figure 6

(a)–(f) are the original first frames of the video sequences “Akiyo,” “Claire,” “Deadline,” “Grandma,” “Silent,” and “Suzie,” respectively.

Each test sample was clipped from the video sequences described above. The duration of each sample was set to 10 seconds, and QCIF (Quarter Common Intermediate Format) was chosen as the resolution format in consideration of computing speed and the heavy computation needed to produce the degraded videos. Ten seconds is sufficient for subjects to arrive at a reliable quality rating [23]. To eliminate the influence of color on subjective quality, each test sample was converted to gray scale.

In the CS procedure for producing degraded video, the scrambled block Hadamard ensemble (SBHE) [24] and smooth projected Landweber (SPL) [25] were selected as the measurement operator and the reconstruction algorithm, respectively. The block size in each video frame was set to 8 × 8.

Based on the empirical quality model proposed above, degraded videos with different parameters were constructed at the measurement, stream, and packet levels. At the measurement level for experiment A, the measurement rate per frame took the values 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, and 0.6 and the quantization step took the values 6, 7, 8, 9, 10, and 11; combining the two parameters produced a total of 54 samples, with the frame rate set to 25 fps. For experiment B, the measurement rate per frame took the values 0.25, 0.35, 0.45, and 0.55, which produced a total of 24 samples. At the stream level for experiment A, the bit rate of each test sample took the values 0.9, 1.15, 1.4, 1.65, 1.9, 2.15, 2.4, 2.65, 2.9, and 3.15 Mbps and the frame rate took the values 10, 15, 20, 25, and 30 fps; combining the two parameters produced a total of 50 samples, with the quantization step set to 9 and the measurement rate per frame set to 0.4. For experiment B, the bit rate took the values 1.15, 1.65, 2.15, and 2.65 Mbps, which produced a total of 20 samples. At the packet level, the packet loss rate per frame took the values 0.056, 0.111, 0.167, 0.222, 0.278, 0.333, 0.389, and 0.444, which produced a total of 8 samples for both experiment A and experiment B, with the quantization step set to 9, the measurement rate per frame set to 0.4, and the frame rate set to 25 fps. The experimental data are given in Tables 1–5, and some degraded samples are shown in Figure 7.
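The sample counts for experiment A can be checked by enumerating the parameter grids, as in the short sketch below. The packet loss values are additionally reproduced under an assumption that is not stated in the text: a QCIF frame has 144 rows, so 8 × 8 blocks give 18 block lines, and losing k of these lines corresponds to a loss rate of k/18. Variable names are illustrative.

```python
from itertools import product

# Experiment A parameter grids as listed in the text.
measurement_rates = [0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6]
q_steps = [6, 7, 8, 9, 10, 11]
bit_rates_mbps = [0.9, 1.15, 1.4, 1.65, 1.9, 2.15, 2.4, 2.65, 2.9, 3.15]
frame_rates_fps = [10, 15, 20, 25, 30]

measurement_samples = list(product(q_steps, measurement_rates))
stream_samples = list(product(frame_rates_fps, bit_rates_mbps))

# Assumption: one packet per block line, 144 rows / 8 rows per block = 18 lines.
block_lines = 144 // 8
packet_loss_rates = [round(k / block_lines, 3) for k in range(1, 9)]

print(len(measurement_samples), len(stream_samples))
print(packet_loss_rates)
```

Under this assumption the eight loss rates coincide with the values used at the packet level, which suggests, but does not prove, that each test removed a whole number of block-line packets per frame.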

Table 1

CS video subjective rating scores with different measurement rate and Q-step in experiment A.

	Measurement rate per frame
Q-step	0.2	0.25	0.3	0.35	0.4	0.45	0.5	0.55	0.6
6	0.89	1.07	1.27	1.40	1.53	1.68	1.83	2.07	2.35
7	1.68	2.12	2.37	2.59	2.77	3.05	3.13	3.20	3.46
8	2.15	2.67	2.98	3.26	3.49	3.76	4.02	4.24	4.41
9	2.23	2.76	3.13	3.42	3.66	3.90	4.13	4.35	4.53
10	2.14	2.81	3.17	3.35	3.65	3.72	4.21	4.29	4.48
11	2.17	2.71	3.09	3.41	3.58	3.71	4.08	4.36	4.56

Table 2

CS video subjective rating scores with different measurement rate and Q-step in experiment B.

	Measurement rate per frame
Q-step	0.25	0.35	0.45	0.55
6	0.96	1.29	1.53	1.98
7	1.80	2.47	3.10	3.36
8	2.43	3.37	3.89	4.19
9	2.79	3.26	3.86	4.27
10	2.98	3.07	3.98	4.32
11	3.03	3.29	3.70	4.22

Table 3

CS video subjective rating scores with different bit rate and frame rate in experiment A.

	Bit rate (Mbps)
Frame rate (fps)	0.9	1.15	1.40	1.65	1.90	2.15	2.40	2.65	2.90	3.15
10	2.23	2.46	2.79	3.10	3.34	3.69	3.72	3.75	3.68	3.71
15	2.32	2.52	2.84	3.24	3.57	3.77	3.90	4.10	4.29	4.39
20	2.07	2.49	2.96	3.28	3.67	3.81	4.01	4.25	4.38	4.56
25	1.98	2.25	2.71	3.08	3.43	3.57	3.83	4.02	4.19	4.35
30	1.78	1.93	2.19	2.67	3.06	3.24	3.40	3.65	3.86	3.93

Table 4

CS video subjective rating scores with different bit rate and frame rate in experiment B.

	Bit rate (Mbps)
Frame rate (fps)	1.15	1.65	2.15	2.65
10	2.35	3.01	3.56	3.70
15	2.45	3.37	3.68	3.94
20	2.55	3.06	3.75	4.23
25	2.31	2.93	3.65	3.89
30	2.08	2.56	3.10	3.54

Table 5

CS video subjective rating scores with different packet loss rate in experiments A and B.

	Packet loss rate per frame
	0.056	0.111	0.167	0.222	0.278	0.333	0.389	0.444
Experiment A	4.12	4.06	3.85	3.67	3.35	3.03	2.70	2.36
Experiment B	4.31	3.98	3.73	3.61	3.42	2.89	2.54	2.25

Figure 7

(a) The degraded 30th frame in “Akiyo” with $Q_{step} = 7$ , $M_{r}$ = 0.35; (b) the degraded 140th frame in “Akiyo” with $Q_{step} = 9$ , $M_{r}$ = 0.55; (c) the degraded 30th frame in “Suzie” with $B_{r}$ = 1.4 Mbps, $F_{r}$ = 10 fps; (d) the degraded 90th frame in “Suzie” with $B_{r}$ = 2.4 Mbps, $F_{r}$ = 25 fps; (e) the degraded 150th frame in “Grandma” with $P_{r}$ = 0.056; (f) the degraded 250th frame in “Grandma” with $P_{r}$ = 0.389.

4.3. Experiment Methodology

The double stimulus continuous quality scale (DSCQS) methodology was used throughout experiment A. This methodology is a standardized subjective quality assessment technique that requires a reference signal. Each test consists of two video presentations: one is the reference (undegraded source material), and the other is the degraded material to be rated. Both presentations are identical in content, and subjects rated the quality of both. The order of presentation within a test was randomized, so subjects did not know which presentation was the reference and which was the degraded one.

A continuous rating scale was used for quality rating, as shown in Figure 8. The scale was presented electronically on the monitor after subjects finished viewing, and subjects moved a pointer to give their rating. The scale ranges from 0 to 100. The pointer was reset to the middle of the scale before each test.

Figure 8

Continuous rating scale for subjective experiments.

The single stimulus 5-grade quality scale (SSQS) methodology was used throughout experiment B. Each test presents one degraded sample for subjects to rate. A 5-grade category rating scale (excellent, good, fair, poor, and bad) was used. As in experiment A, the scale was presented on the monitor and subjects moved a pointer to rate. The difference is that subjects could see only the coarse scale, with 5 labeled excellent and 1 labeled bad. They were informed, however, that the pointer could be placed between two grades, so we could read a more precise rating to two decimal places, which was hidden from subjects.

In summary, DSCQS in experiment A is a fine-grained rating method with reference, and its rating scores were used for the regression analysis of the model; the reference helps subjects distinguish the degree of degradation more carefully. In experiment B, the rating scores from SSQS were used to validate the model, and no reference was shown, in keeping with normal viewing conditions. In experiment A, the difference between the scores of the two presentations was calculated for each test; a larger value means poorer perceptual quality. To unify the results of the two experiments, the data from experiment A were converted to the 5-grade scale of experiment B by the following formula:

\begin{matrix} Q_{5-grade} = 5 - 5 \times \frac{Q_{DS}}{100}, \end{matrix}

(6)

where $Q_{DS}$ is the difference score of each test in experiment A and $Q_{5-grade}$ is the corresponding rating score of that test on the 5-grade scale.
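A minimal sketch of the conversion in Equation (6), followed by the averaging over five subjects described in this section, is given below. The difference scores are hypothetical values chosen only to show the arithmetic, and the function name `to_five_grade` is an illustrative choice.

```python
def to_five_grade(q_ds):
    """Map a DSCQS difference score (0-100 scale) to the 5-grade scale, Equation (6)."""
    return 5.0 - 5.0 * q_ds / 100.0

# Hypothetical difference scores given by 5 subjects for one test.
difference_scores = [20.0, 30.0, 25.0, 35.0, 15.0]

grades = [to_five_grade(d) for d in difference_scores]
final_score = sum(grades) / len(grades)   # average over the 5 subjects
print(grades, final_score)
```

A difference score of 0 (no perceived degradation) maps to 5, and larger differences map to lower grades, so the converted scores point in the same direction as the SSQS ratings of experiment B.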

For all experiments, subjects were introduced to the experiment procedure and confirmed that they understood the goal of the experiments. The order of the degraded video samples was randomly assigned. Each test in both experiment A and experiment B was rated by 5 different subjects, and the average of their 5 scores was taken as the final score of that test. Each test began with two practice trials for subjects to become familiar with the experiments, and the scores were recorded at the end of each test.

4.4. Subjects

All subjects were the author's undergraduate schoolmates. They were college educated but had no professional knowledge of image or multimedia processing, and none of them worked in that field, because familiarity with patterns of video error would bias the subjective results. All of them were male, so that differences due to gender were eliminated, and their ages ranged from 24 to 30.

5. Experimental Data and Analysis

Tables 1–5 give the CS video subjective rating scores obtained with different parameters in the subjective quality assessment experiments A and B. Based on the data from experiment A and the quality model proposed above, the model coefficients were obtained by regression at the measurement level, stream level, and packet level. The regressed models at the three levels are as follows:

\begin{matrix} Q_{c} = \begin{cases} 3.4122 M_{r} + 0.2002 & Q_{step} = 6 \\ 4.1133 M_{r} + 1.0624 & Q_{step} = 7 \\ 5.4433 M_{r} + 1.2649 & Q_{step} = 8 \\ 5.4989 M_{r} + 1.3412 & Q_{step} = 9, 10, 11, \end{cases} \end{matrix}

(7)

\begin{matrix} Q_{s} = 0.455 B_{r} - 1.6655 \frac{1}{B_{r}} + 0.0165 F_{r} + 6.408 \frac{1}{F_{r}} + 2.6547, \end{matrix}

(8)

\begin{matrix} Q_{l} = 1.8568 \ln (2.8805 (0.6686 - P_{r})) + 3.1381 . \end{matrix}

(9)

Equations (7), (8), and (9) are the HOCSVQA models at the measurement level, stream level, and packet level, respectively. Note that the natural constant e is taken as the base a in the packet-level model, because the regression error is the same for any base under the same regression method. Also note that, at the stream level, when $F_{r}$ = 10 fps and $B_{r}$ = 2.15, 2.4, 2.65, 2.9, and 3.15 Mbps, the quality rating scores are similar to each other, because the measurements in these tests are already sufficient for video reconstruction; we therefore excluded these test data to improve regression accuracy. At the measurement level, when Q-step = 9, 10, and 11, the quality rating scores are similar at the same measurement rate, which supports the asymmetric piecewise empirical model proposed above.
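As a cross-check of the measurement-level model, the sketch below fits a straight line by ordinary least squares to each segment of Equation (7), using the rounded scores printed in Table 1 and pooling the rows for Q-step = 9, 10, and 11. Ordinary least squares and the use of `numpy.polyfit` are assumptions about the regression procedure, which was carried out in MATLAB on data that may not have been rounded.

```python
import numpy as np

m_r = np.array([0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6])
table1 = {  # Q-step -> scores from Table 1 (experiment A)
    6:  [0.89, 1.07, 1.27, 1.40, 1.53, 1.68, 1.83, 2.07, 2.35],
    7:  [1.68, 2.12, 2.37, 2.59, 2.77, 3.05, 3.13, 3.20, 3.46],
    8:  [2.15, 2.67, 2.98, 3.26, 3.49, 3.76, 4.02, 4.24, 4.41],
    9:  [2.23, 2.76, 3.13, 3.42, 3.66, 3.90, 4.13, 4.35, 4.53],
    10: [2.14, 2.81, 3.17, 3.35, 3.65, 3.72, 4.21, 4.29, 4.48],
    11: [2.17, 2.71, 3.09, 3.41, 3.58, 3.71, 4.08, 4.36, 4.56],
}

def fit_line(q_steps):
    """Least-squares line over the pooled Table 1 rows for the given Q-steps."""
    x = np.tile(m_r, len(q_steps))
    y = np.concatenate([table1[q] for q in q_steps])
    slope, intercept = np.polyfit(x, y, 1)
    return float(slope), float(intercept)

segments = {"6": (6,), "7": (7,), "8": (8,), "9-11": (9, 10, 11)}
fits = {name: fit_line(rows) for name, rows in segments.items()}

# Slope and constant of each segment as printed in Equation (7).
published = {"6": (3.4122, 0.2002), "7": (4.1133, 1.0624),
             "8": (5.4433, 1.2649), "9-11": (5.4989, 1.3412)}
for name in segments:
    print(name, np.round(fits[name], 4), published[name])
```

The fitted slopes and constants agree with Equation (7) to within 0.005 for every segment, which is consistent with a separate linear regression per segment.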

Panels (a) of Figures 9–11 show the squared residuals of the regression at the three levels, which indicate the accuracy of the regression. At the stream level, some squared residuals are numerically large because the regression at this level involves two variables and is nonlinear. At the packet level, the squared residuals remain small, which indicates a good fit; this is because fewer samples are used in the regression at this level. Panels (b) of Figures 9–11 compare the subjective verification rating scores with the CSVQI given by the proposed CS quality model. The distance of the circles from the diagonal line indicates how closely the two sets of data agree. The figures show that the CSVQI given by the proposed model maintains a high correlation with the data from the verification experiments, which verifies the validity of the model.

Figure 9

(a) Squared residuals of regression at measurement level; (b) subjective verification rating scores compared with CSVQI at measurement level.

Figure 10

(a) Squared residuals of regression at stream level; (b) subjective verification rating scores compared with CSVQI at stream level.

Figure 11

(a) Squared residuals of regression at packet level; (b) subjective verification rating scores compared with CSVQI at packet level.

From a statistical perspective, we also used the Pearson correlation coefficient (10) and the Spearman rank correlation coefficient (11) to quantify the correlation between the CSVQI and the data from the verification experiments; the results are shown in Table 6. The closer a coefficient is to 1, the more strongly the two sets of data are correlated. Consider

\begin{matrix} C_{pearson} = \frac{L_{X Y}}{\sqrt{L_{X X}} \sqrt{L_{Y Y}}} = \frac{X^{T} Y}{\sqrt{(X^{T} X) (Y^{T} Y)}}, \end{matrix}

(10)

\begin{matrix} C_{spearman} = 1 - \frac{6 {(X - Y)}^{T} (X - Y)}{N (N^{2} - 1)}, \end{matrix}

(11)

where X and Y are the vectors of CSVQI values and verification rating scores and N is the number of samples.
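The two coefficients can be computed as in the sketch below, which uses their textbook definitions: the Pearson coefficient on mean-centered vectors and the Spearman coefficient on ranks (valid when there are no ties). As an illustration it evaluates the packet-level model of Equation (9) at the tested loss rates and compares the result with the experiment B scores of Table 5. It is not claimed to reproduce the entries of Table 6, and the function names are illustrative.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation of two vectors (mean-centered form)."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(v):
    """Ranks 1..N of the entries of v; valid only when v has no ties."""
    order = np.argsort(v)
    r = np.empty(len(v))
    r[order] = np.arange(1, len(v) + 1)
    return r

def spearman(x, y):
    """Spearman rank correlation via the rank-difference formula."""
    d = ranks(x) - ranks(y)
    n = len(x)
    return float(1.0 - 6.0 * (d @ d) / (n * (n ** 2 - 1)))

def packet_level_csvqi(p_r):
    """Packet-level model, Equation (9)."""
    return 1.8568 * np.log(2.8805 * (0.6686 - p_r)) + 3.1381

p_r = np.array([0.056, 0.111, 0.167, 0.222, 0.278, 0.333, 0.389, 0.444])
experiment_b = np.array([4.31, 3.98, 3.73, 3.61, 3.42, 2.89, 2.54, 2.25])  # Table 5

csvqi = packet_level_csvqi(p_r)
print("Pearson:", pearson(csvqi, experiment_b))
print("Spearman:", spearman(csvqi, experiment_b))
```

Because both the model output and the experiment B scores decrease monotonically with the loss rate, their ranks coincide and the rank correlation reaches its maximum, while the Pearson value measures how close the relation is to linear.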

Table 6

Pearson correlation coefficient and Spearman rank correlation coefficient at three levels.

	Pearson coefficient	Spearman coefficient
Measurement level	0.9988	0.9997
Stream level	0.997	0.999
Packet level	0.9995	0.9987

6. Conclusion

In this paper, we proposed a hierarchical objective CS video quality assessment (HOCSVQA) approach that yields a CS video quality index (CSVQI). The CS video quality model takes measurement rate, quantization step, bit rate, frame rate, and packet loss rate as its principal parameters at three levels: measurement level, stream level, and packet level. The coefficients in the model were obtained by regression analysis of data from subjective CS video quality assessment experiments. The CSVQI given by the proposed model maintained a high correlation with the data from the verification experiments under statistical correlation measures, as illustrated in Figures 9–11. The approach not only keeps the convenience and real-time character of objective video assessment but also reflects the QoE to a certain extent, because its coefficients are regressed from subjective video assessment experiments.

CS is still at a growing stage, and many theoretical results have been achieved. However, because the form of information in a CS system differs from that in a traditional system, many technical issues remain to be solved. The hardware realization of CS data acquisition in particular is attracting much attention, and there is a long way to go before it is put into practice. How to evaluate and guarantee the quality of information transmission in a CS system is the key point of this paper and will remain the focus of our future research.

Conflict of Interests

The authors, Shouning Chen, Baoyu Zheng, and Yujuan Zhao, declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the National Science Foundation of China (61271240); the Innovation Project of Sci & Tech for College Graduates of Jiangsu Province (CXLX12 0475); a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions; the Specialized Research Fund for the Doctoral Program of Higher Education (20113223120002); and the University Natural Science Research Project of Jiangsu Province (11KJB510016). The authors thank the participants and the organizer of the experiments.

References

1. Donoho D. L. Compressed sensing. IEEE Transactions on Information Theory. 2006;52(4):1289–1306. doi:10.1109/tit.2006.871582. MR2241189. 2-s2.0-33645712892.

2. Candès E. J.; Wakin M. B. An introduction to compressive sampling. IEEE Signal Processing Magazine. 2008;25(2):21–30.

3. Baraniuk R. G. Compressive sensing. IEEE Signal Processing Magazine. 2007;24(4):118–124. doi:10.1109/MSP.2007.4286571. 2-s2.0-34548253373.

4. Duarte M. F.; Davenport M. A.; Takhar. Single-pixel imaging via compressive sampling. IEEE Signal Processing Magazine. 2008;25(2):83–91.

5. Stankovic; Cheng. Compressive video sampling. Proceedings of the European Signal Processing Conference (EUSIPCO '08); August 2008; Lausanne, Switzerland.

6. Mun; Fowler J. E. Residual reconstruction for block-based compressed sensing of video. Proceedings of the Data Compression Conference (DCC '11); March 2011; Snowbird, Utah, USA. IEEE; 183–192. doi:10.1109/DCC.2011.25.

7. Fowler J. E.; Mun; Tramel E. W. Block-based compressed sensing of images and video. Foundations and Trends in Signal Processing. 2010;4(4):297–416. doi:10.1561/2000000033. 2-s2.0-84858221586.

8. Liu; Pados D. A. Motion-aware decoding of compressed-sensed video. IEEE Transactions on Circuits and Systems for Video Technology. 2013;23(3):438–444. doi:10.1109/TCSVT.2012.2207269. 2-s2.0-84874873257.

9. Kang L.-W.; C.-S. Distributed compressive video sensing. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '09); April 2009; Taipei, Taiwan. IEEE; 1169–1172. doi:10.1109/icassp.2009.4959797. 2-s2.0-70349200890.

10. T. T.; Chen; Nguyen D. T.; Nguyen; Gan; Tran T. D. Distributed compressed video sensing. Proceedings of the 16th IEEE International Conference on Image Processing (ICIP '09); November 2009; 1393–1396. doi:10.1109/icip.2009.5414631. 2-s2.0-77951951428.

11. Wang; Bovik A. C. Video quality assessment based on structural distortion measurement. Signal Processing: Image Communication. 2004;19(2):121–132. doi:10.1016/s0923-5965(03)00076-6. 2-s2.0-0348170824.

12. Wolf; Pinson. Video quality measurement techniques. NTIA Report 02-392. 2002.

13. Winkler; Mohandas. The evolution of video quality measurement: from PSNR to hybrid metrics. IEEE Transactions on Broadcasting. 2008;54(3):660–668. doi:10.1109/tbc.2008.2000733. 2-s2.0-50549092673.

14. Figueiredo M. A. T.; Nowak R. D.; Wright S. J. Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. IEEE Journal on Selected Topics in Signal Processing. 2007;1(4):586–597. doi:10.1109/jstsp.2007.910281. 2-s2.0-39449126969.

15. Bioucas-Dias J. M.; Figueiredo M. A. T. A new TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Transactions on Image Processing. 2007;16(12):2992–3004. doi:10.1109/tip.2007.909319. MR2472806. 2-s2.0-36749005582.

16. Keimel; Rothbucher; Shen; Diepold. Video is a cube. IEEE Signal Processing Magazine. 2011;28(6):41–49. doi:10.1109/msp.2011.942468. 2-s2.0-80455156034.

17. Chen; Zheng. A method of image quality assessment for compressive sampling video transmission. Journal of Electronics. 2012;29(6):598–603. doi:10.1007/s11767-012-0900-1. 2-s2.0-84868315908.

18. Chen S.-N.; Zheng B.-Y.; Zhou. A parity-based error control method for distributed compressive video sensing. Proceedings of the 10th International Conference on Signal Processing and Multimedia Applications (SIGMAP '13); July 2013; 105–110. 2-s2.0-84887814745.

19. Candès; Romberg. Sparsity and incoherence in compressive sampling. Inverse Problems. 2007;23(3):969–985. doi:10.1088/0266-5611/23/3/008. MR2329927. 2-s2.0-34249687049.

20. International Telecommunication Union. Subjective video quality assessment methods for multimedia applications. Recommendation ITU-T P.910. Geneva, Switzerland: International Telecommunication Union; 1999.

21. International Telecommunication Union. Methodology for the subjective assessment of the quality of television pictures. Recommendation ITU-R BT.500-11. Geneva, Switzerland: International Telecommunication Union; 2002.

22. http://media.xiph.org/video/derf/

23. Hands D. S. A basic multimedia quality model. IEEE Transactions on Multimedia. 2004;6(6):806–816. doi:10.1109/TMM.2004.837233. 2-s2.0-5444272471.

24. Gan; T. T.; Tran T. D. Fast compressive imaging using scrambled block Hadamard ensemble. Proceedings of the 16th European Signal Processing Conference; August 2008; Lausanne, Switzerland. IEEE; 1–5.

25. Fowler J. E.; Mun; Tramel E. W. Multiscale block compressed sensing with smoother projected Landweber reconstruction. Proceedings of the European Signal Processing Conference; 2011; Barcelona, Spain. 564–568.