Systematic scheme for performance measurement in display

Abstract

Camera-based wireless video sensor networks present opportunities to use large numbers of low-cost, low-resolution wireless camera sensors for various applications such as outdoor remote surveillance and ubiquitous data transmission in line-of-sight systems. Owing to the continuously increasing popularity of displays and cameras, communication in sensor networks with display–camera pairs has become a promising area in both the computer vision and wireless communication communities. However, measurement/sampling procedures for sensing data in wireless video sensor networks have not yet been well established. Consistency in channel measurements is an open issue, as precise calibration of the experimental settings has not been fully studied in the literature. This article focuses on establishing a scheme for the precise calibration of the display–camera channel performance. To guarantee high consistency in the experiment, we propose an accurate measurement scheme for the geometric parameters and identify some unstable channel factors such as the Moire effect, rolling-shutter effect, blocking artifact, inconsistencies in autofocus, trembling, and vibrations. In the experiment, we first define the consistency criteria based on the error-prone regions in the bit error rate plots of the channel measurements. It is demonstrated that the consistency of the experimental results can be improved by the proposed precise calibration scheme.

Keywords

Performance evaluation, wireless video sensor networks, measurement procedure, display–camera, repeatability

Introduction

In the era of big data, the huge volumes of data from various sources have made it possible to launch a large number of new applications and technologies. An example is the family of data-driven techniques for computer vision and machine-learning applications.¹ Similar to the traditional wireless sensor network, the camera-based wireless video sensor network consists of multiple spatially distributed camera sensor nodes which operate autonomously and transmit data cooperatively through the network. The traditional closed-circuit television (CCTV) surveillance systems used in the field, with their centralized processing and recording architecture together with a simple multi-monitor visualization of the raw video streams, bear several drawbacks and limitations. One of the most relevant problems in traditional CCTV is that the communication bandwidth required by each camera and the computational requirements on the centralized servers strongly limit such systems in terms of scalability, installation size, and spatial and temporal resolution of each camera. The traditional wireless sensor network, in contrast, processes and transmits physical or environmental data, which are not considered multimedia data. Thus, the volume of data is much larger in a camera-based wireless video sensor network. The camera-based wireless sensor network combines the advantages of traditional CCTV and the wireless sensor network, namely the integration of multimedia data and a flexible network structure, respectively.

The camera-based wireless video sensor network contains a massive amount of data to be transmitted and analyzed through the network. For example, camera systems are employed in locations with potentially high security threats, such as airports, train stations, trains, and aircraft. Moreover, they are also widely deployed in many public facilities, for example, shopping malls, museums, parking garages, and hotels. Camera systems may also be used to generate data for analyzing user behavior for business promotion, traffic control, and so on.

As a ubiquitous sensing device, the mobile camera has been considered a low-cost and effective device for collecting varied types of data. It contains heterogeneous elements that can provide numerous benefits over the traditional homogeneous sensor networks.² More specifically, the display–camera pair has been considered a promising technology for building a wireless sensor network, owing to its advantages in terms of bandwidth, licensing, coexistence, and security.³ Display–camera communication has recently gained significant attention owing to the extensive advances in mobile phone cameras.^4–8 At the transmitter, the message is encoded and modulated into an image frame, for example, a barcode image, and is presented on a conventional display. At the receiving end, a mobile camera serves as the receiver. The advantages of such display–camera channels are nontrivial. First, the communication requires no extra hardware module, except for a camera–display pair, which is available on almost every off-the-shelf mobile phone. Second, unlike the existing short-range wireless communication technologies, it requires no transmissions in the ever-congested spectrum and, therefore, creates no interference to other devices. Last but not least, the security and privacy can be controlled well during communication by adjusting the visible distance and direction.^6,7 The potential applications of display–camera communication include, but are not limited to, information retrieval in a shopping mall from a large display, teaching material distribution in a classroom from the projector, and phone-to-phone multimedia file sharing between different users.

Researchers have been working on improving the channel throughput and reliability. The concept of display–camera communication is motivated by the fact that the capacity of a single barcode is too limited. Multiple barcode images are grouped into a video sequence to boost the capacity. Against this background, stable decoding performance across the whole recording period is important. However, very little attention has been paid to the measurement/sampling procedures (i.e. the data-acquisition setup) for collecting sensing data from the display–camera channel. Such a study is important for both standardizing the experimental procedure and analyzing the existing results. The calibration of experimental parameters is not a trivial problem. First, it is not an easy task to set the experimental parameters accurately. In particular, it is difficult to calibrate the capturing angles between the display and the camera using conventional measurement tools such as a ruler and a protractor. In addition, it is found that the channel performance is sensitive to the experimental settings, especially when high channel throughput is required. As shown in section “Experimental results,” the channel performance often changes significantly, even if the experimental settings deviate by only 2° in angle or 3 cm in distance. In the literature, some works focus on the case without perspective distortion,^4,7 which is the ideal case for practical applications. Some other works consider a wide range of capturing angles, but no details have been provided on the measurement.^5,6,8

Imprecise descriptions of the experimental settings and the sensitivity of the channel performance lead to low consistency in the channel measurement results. However, consistency is a fundamental requirement of scientific experiments, as the demonstration of experimental results should not be based on a single event.⁹ In this article, we aim at addressing the issue of consistency in the display–camera channel measurements. The remainder of this article is organized as follows. Section “Proposed channel calibration scheme” describes our proposed channel calibration scheme. Section “Experimental results” shows the consistency of the channel measurement results with and without the proposed calibration method. Section “Conclusion” concludes this article.

Proposed channel calibration scheme

In this section, some detailed treatments of the experimental setup are proposed to guarantee the consistency of the experimental results. The contributions are mainly in two parts.

Ensure accurate geometric setups for the experiments

To guarantee results with high consistency, the experimental parameters must be measured and set up with the utmost care. We divide the parameters into two independent sets, that is, the geometric and nongeometric parameters. The geometric ones include the display-to-camera distance, the capturing angle between the camera and display, and the barcode resolutions/sizes on the display and camera. In contrast, the ambient light intensity, non-uniformity of the display brightness, and image blurriness are classified as nongeometric parameters, which describe the transmitted signal energy. In this article, we focus on the geometric parameters, as the experimental results are more sensitive to the geometric parameters than to the nongeometric ones. The capturing angle and the captured barcode size, in particular, affect the channel performance heavily⁸ because of the change in the received signal energy. The nongeometric distortions, on the other hand, can be compensated with equalization schemes as shown in Wengrowski et al.¹⁰

Avoid unstable channel states

An experiment with certain experimental settings (experiment state points) is considered as consistent only when “there is an open set around that point within which the result of that experiment is the same.”^11,12 In each setup for the display–camera communication channel, one or multiple dominant error factors are present. The dominant factors can be a specific process of the decoding pipeline, for example, the barcode corner detection, image binarization, and symbol synchronization. It is important to identify these factors and ensure that they remain unchanged in other instances of the experimental setup. Therefore, it is crucial to identify the dominant error sources in the channel and avoid conducting experiments in the unstable states where the experimental results are sensitive to tiny deviations in the setup.

In the following subsections, details on the above-mentioned two aspects will be presented.

Precise setup for geometric parameters

To achieve a precise setup for the geometric parameters, we propose to use four reference points, to align with the barcode corners as illustrated in Figure 1(a). Given a fixed displayed barcode size, the four points are set such that the aligned barcode will possess a desired capturing angle, distance, and image resolution. Once the four barcode corners are aligned with the reference points and the displayed barcode size is fixed, all geometric parameters will be set accurately. The problem of setting the geometric parameters accurately has been simplified to the alignment of the four corners. The calculations of the four alignment points are explained with reference to the display–camera model illustrated in Figure 2.

Figure 1.

Experimental setup: (a) demo with 20° capturing angle using Nexus 5 and the new iPad. The alignment points are circled. (b) Barcode pattern used in our experiment, and the illustration of slicing operation.

Figure 2.

Pin-hole camera geometry model: (a) mapping from display plane $(x' y')$ to image plane $(xy)$ , (b) illustrating the calculation of $M O_{s}$ from $M' O_{d}$ using the principle of similar triangles, and (c) captured barcode on the image plane.

In the pin-hole camera model illustrated in Figure 2(a), the displayed barcode $A' B' C' D'$ on the display plane $x' y'$ is mapped to the captured barcode $ABCD$ on the camera sensor plane $xy$. The camera principal axis $z$ passes through the optical center $O$, the captured barcode center $O_{s}$, and the displayed barcode center $O_{d}$. The distance between the displayed barcode center and the optical center, $O O_{d}$, is denoted as $d$, while $f$ is the focal length parameter, which is fixed in most off-the-shelf phone cameras.^13,14 In this model, for the ease of experimental setup, the capturing angle $α$ is set with 1 degree of freedom, that is, $z$ is coplanar with $x' z'$, and the angle between $z$ and $z'$ is $α$. For experiments that study multiple degrees of freedom, the angles with the other planes can always be added later. For example, the proposed measurement scheme can be extended by dropping the parallel constraint between $AC$ and $A' C'$: we could add an intermediate state of the varying shape between $ABCD$ and $A' B' C' D'$ to allow $AC$ to be transformed to any orientation. The intermediate shape could be an arbitrary quadrilateral generated by the perspective transform. However, the calculation of the preset locations of the four corners would then be much more complicated and would introduce more approximations. To conduct a repeatable experiment, the number of approximation steps should be minimized. Thus, in our work, we have only considered a simplified but representative case of the two geometric shapes.

Let the displayed barcode size be $m \times m$, and let $M$ and $M'$ in Figure 2(a) be the midpoints of $AC$ and $A' C'$, respectively. Based on the principle of similar triangles, for $Δ_{OM O_{s}}$ and $Δ_{OM' P}$, illustrated in Figure 2(b), the following can be calculated

| M O_{s} | = \frac{f \cdot m \cdot \cos α}{2 d - m \cdot \sin α}

(1)

where $| \cdot |$ evaluates the L1 norm of a vector. Moreover, triangles $Δ_{OAM}$ and $Δ_{OA' M'}$ are also similar triangles, that is

\frac{| AM |}{| A' M' |} = \frac{| MO |}{| M' O |} = \frac{| M O_{s} |}{| M' P |}

(2)

Given that $| A' M' | = m / 2$ , $| AM |$ can be computed as

| AM | = \frac{f \cdot m}{2 d - m \cdot \sin α}

(3)

With $| M O_{s} |$ and $| AM |$ , the coordinate of the point $A$ can be calculated with respect to the origin of the image sensor $O_{s}$ ; similarly, the coordinates for the other three corners can also be calculated. The locations of the four corners can be computed as

\begin{matrix} A = (- \frac{f \cdot m \cdot \cos α}{2 d - m \cdot \sin α}, \frac{f \cdot m}{2 d - m \cdot \sin α}) \\ B = (\frac{f \cdot m \cdot \cos α}{2 d + m \cdot \sin α}, \frac{f \cdot m}{2 d + m \cdot \sin α}) \\ C = (- \frac{f \cdot m \cdot \cos α}{2 d - m \cdot \sin α}, - \frac{f \cdot m}{2 d - m \cdot \sin α}) \\ D = (\frac{f \cdot m \cdot \cos α}{2 d + m \cdot \sin α}, - \frac{f \cdot m}{2 d + m \cdot \sin α}) \end{matrix}

It should be noted that the above locations, which are expressed in units of length, should be converted to image coordinates using the known pixel size of the imaging sensor. During the experiment, the four corner locations are calculated according to the predetermined parameters and the above geometric model. The computed coordinates are marked in the preview window of the mobile application. Therefore, the problem of setting the geometric parameters accurately is simplified to the alignment of the four barcode corners to the reference points.
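The corner calculation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `reference_corners` and `to_pixels`, the convention that the image origin is the top-left pixel with the vertical axis pointing down, and the use of a single effective pixel pitch for the preview resolution are assumptions made here for illustration and are not specified in the text.

```python
import math


def reference_corners(f, m, d, alpha_deg):
    """Corners A, B, C, D on the sensor plane, relative to the center O_s.

    f: focal length, m: side length of the displayed barcode,
    d: distance from the optical center to the barcode center
    (all in the same length unit); alpha_deg: capturing angle in degrees.
    """
    alpha = math.radians(alpha_deg)
    den_ac = 2 * d - m * math.sin(alpha)  # denominator shared by A and C
    den_bd = 2 * d + m * math.sin(alpha)  # denominator shared by B and D
    x_ac = f * m * math.cos(alpha) / den_ac
    y_ac = f * m / den_ac
    x_bd = f * m * math.cos(alpha) / den_bd
    y_bd = f * m / den_bd
    return {
        "A": (-x_ac, y_ac),
        "B": (x_bd, y_bd),
        "C": (-x_ac, -y_ac),
        "D": (x_bd, -y_bd),
    }


def to_pixels(point, pixel_pitch, width, height):
    """Convert a sensor-plane point (length units) to image coordinates.

    Assumption: origin at the top-left pixel, y axis pointing down, and
    pixel_pitch being the effective pitch at the preview resolution.
    """
    x, y = point
    return (width / 2 + x / pixel_pitch, height / 2 - y / pixel_pitch)


# Example with illustrative values in millimeters: f = 4.0, m = 100, d = 150.
corners = reference_corners(f=4.0, m=100.0, d=150.0, alpha_deg=20.0)
```

At $α = 0$ the four points form a square centered on $O_{s}$, which gives a quick sanity check before marking the points in the preview window.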

Avoid unstable experiment states

In this part, we discuss several undesirable factors in the experimental setup, which lead to unstable experiment states. These factors include the trembling of the camera, the modulation scheme of the light-emitting diode (LED) display, the autofocus operation of the camera, the sampling of the displayed pixels with the camera sensor, and the effects of tiny vibrations. In the following subsections, detailed discussions are presented for these factors.

Avoid rolling-shutter effect from complementary metal-oxide-semiconductor sensor

The rolling-shutter scheme is a popular data acquisition scheme for digital cameras with complementary metal-oxide-semiconductor (CMOS) sensors. It reads out the imaging data from the CMOS sensor pixels, row-by-row, sequentially, from the top to the bottom.¹⁵ This is a common scheme in the state-of-the-art high-resolution CMOS sensors, which enables the sensor to continuously gather photons during the exposure process, and thus, increase the sensitivity of the imaging sensor. The major disadvantage of the rolling-shutter scheme is that the time difference between retrieving the pixel data from the top and the bottom of the sensor introduces a delay, and therefore, the rolling-shutter effect.¹⁶ Thus, it is not appropriate for use in capturing scenes with fast-changing environments.

Unfortunately, the display–camera channel can be viewed as a periodic fast-changing channel over space due to the pulse width modulation (PWM) scheme popularly used in conventional displays.¹⁷ The PWM scheme is used to control the brightness or the backlight intensity of a screen. Owing to the binary nature of the display backlight, a pixel can be turned either on or off, which corresponds to the maximum or minimum pixel intensity, respectively. To display grayscale images, the PWM driver turns on each pixel for a certain duration in each duty cycle to achieve the intermediate brightness levels. For a typical display driven by a PWM scheme, the full duty cycle is approximately $5.5$ ms.¹⁷ If the brightness level is to be set to 50%, the pixel will be turned on and off for the same duration, that is, $2.75$ ms, in each duty cycle. However, the typical image exposure time for each video frame is from $1 / 200$ to $1 / 30$ s, which implies that the time difference between retrieving the top and bottom rows will be much larger than the on–off duration.¹⁶ Therefore, several bright and dark bands corresponding to the PWM on–off durations in each duty cycle can be observed in a single captured image, owing to the rolling-shutter effect. It should also be noted that the rolling-shutter effect can be observed in the off-the-shelf liquid crystal display (LCD) and organic light-emitting diode (OLED) display as long as the PWM driving scheme is employed.¹⁸
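The timing argument can be made concrete with a rough calculation. The sketch below assumes that the on-time scales linearly with the brightness level and that the top-to-bottom readout delay is of the same order as the quoted frame exposure times; both are simplifying assumptions made here, and the function names are hypothetical.

```python
PWM_PERIOD_S = 5.5e-3  # full duty cycle quoted in the text


def on_time(brightness, period_s=PWM_PERIOD_S):
    """On-duration per duty cycle, assuming it scales linearly with brightness."""
    return brightness * period_s


def cycles_during_readout(readout_s, period_s=PWM_PERIOD_S):
    """Number of PWM cycles that elapse between the first and last row.

    Each elapsed cycle contributes roughly one bright/dark band pair
    whenever the brightness is below 100%.
    """
    return readout_s / period_s


half_brightness_on = on_time(0.5)               # 2.75 ms on, 2.75 ms off
slow_readout = cycles_during_readout(1 / 30)    # about six cycles
fast_readout = cycles_during_readout(1 / 200)   # just under one cycle
```

At full brightness the on-time equals the whole period, so no dark interval remains to be sampled by the rolling shutter.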

A generic binarization algorithm, such as the one in ZXing,¹⁹ is a window-based binarization scheme which can be easily affected by any variation in image intensity, for example, that caused by the PWM effect. A small change in the binarization output leads to different corner-detection results. In turn, any small change in the corner detection leads to a large difference in the subsequent slicing and demodulation steps. The sensitivity of the corner-detection step is also discussed in section “Consistency experiment: with the proposed scheme.”

In order to reduce the rolling-shutter effect on a display with a PWM scheme, the brightness level should be set to full so that the transmitted signals are kept static for each display pixel, and any automatic brightness adjustment should be disabled. As demonstrated in Figure 3, a rolling-shutter effect with alternating dark and bright patterns can be observed clearly in the left image with 50% display brightness, while it is eliminated in the right image with full display brightness. An alternative approach is to avoid using a display with a PWM scheme; a list of possible choices for PWM-free monitors can be found in Baker.²¹

Figure 3.

Barcode images with (left) and without (right) rolling-shutter effect from a Dell E2313H display²⁰ (driven by PWM scheme).

Avoid inconsistencies from autofocus operation

Autofocus refers to the camera operation that automatically adjusts the distance between the lens and the imaging sensor so that the captured image is visually sharp.²² There are two types of autofocus techniques, active and passive autofocus. Active autofocus measures the distance between the camera and the object of interest, using an infrared beam, and adjusts the sensor-to-lens distance accordingly. Passive autofocus captures a series of video frames with varying sensor-to-lens distances and picks the best one by inspecting the image quality metrics, for example, the contrast, in each image. Passive autofocus is the dominant technique employed in the state-of-the-art mobile phone cameras. Thus, it is of our main interest.

Each trial of autofocus may, however, produce different results even if the geometric settings are fixed. This is mainly due to two reasons. First, the resources at hand are limited, for example, the low quality of the image sequences, strict time constraints, and limited computational power available for metric calculations. Second, the metric differences between the adjacent frames are usually very small, owing to the simple quality metric and the large depth-of-field of the mobile phone camera.²³ Therefore, a slight difference in the sensor-to-lens distance across different images produces significant changes in the image quality within a single experiment. In order to maintain the consistency during the experiment, we suggest turning off the autofocus function after one trial at the beginning of each experiment; the barcode image collection should start only after the autofocus operation has finished.

In order to freeze the autofocus function, a modification was made to the autofocus loops. As shown in Algorithm 1, an indicator counter is added to the generic autofocus handling algorithm²⁰ to fix the focus settings after the camera focus is properly set.

Algorithm 1. One-time autofocus algorithm.

Set autofocus states: $counter = 0$ ;
Initialize preview mode parameters;
loop:
    if $counter = 0$ then
        goto AUTO-FOCUS.
close;

procedure AUTO-FOCUS
    if Find best focus lens = SUCCESS then
        $counter = 1$ ;
    else
        $counter = 0$ ;
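The logic of Algorithm 1 can be sketched as a small state holder. This is a minimal illustration and not the actual Android implementation: the class name `OneTimeAutofocus` and the `try_autofocus` callable are hypothetical stand-ins for the platform's autofocus request and its success callback.

```python
class OneTimeAutofocus:
    """Runs autofocus until it succeeds once, then keeps the focus fixed."""

    def __init__(self, try_autofocus):
        # try_autofocus: hypothetical callable that triggers one autofocus
        # attempt and returns True when the best lens position was found.
        self._try_autofocus = try_autofocus
        self.counter = 0  # 0: focus not yet fixed, 1: focus fixed

    def on_preview_frame(self):
        """Call once per preview-loop iteration; True means focus is fixed."""
        if self.counter == 0:
            self.counter = 1 if self._try_autofocus() else 0
        return self.counter == 1


# Usage with a stand-in camera whose first attempt fails.
attempts = iter([False, True])
controller = OneTimeAutofocus(lambda: next(attempts))
ready = [controller.on_preview_frame() for _ in range(4)]
```

Image collection would start only once `on_preview_frame` returns `True`; after that point no further autofocus attempts are issued.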

Avoid Moire effect from the display

It is well known that recapturing an image from the display without careful setup introduces the Moire effect. This is due to the aliasing from sampling the display pixel grid with the camera sensor pixels.²⁴ The barcode detection performance is heavily affected by the Moire pattern, as it introduces false barcode regions. Moreover, the pattern is very sensitive to the geometric settings. A tiny shift in the camera position can lead to huge changes in the Moire pattern, which affects the overall decoding performance. Researchers have been working on possible solutions to avoid the Moire effect in the recaptured image. Some suggestions, such as intentionally avoiding sharp focusing and preprocessing the image using a frequency-domain filter, have been made to eliminate the Moire patterns. However, these operations limit the sharpness of the barcode image and introduce inevitable losses into the images. Moreover, due to the uncontrolled capture environment, the Moire pattern is in irregular shape with spectrum across low- to high-frequency bands. It is difficult to design a band-pass filter to eliminate such artifacts.

Recently, Muammar and Dragotti²⁴ modeled the structure of the display pixel grid in a two-dimensional (2D) square form and showed that the artifacts could be eliminated by simply setting the display–camera distance to a predetermined value. The distance can be calculated from the camera focal length and the display and camera sensor pixel sizes, that is

d_{k} = 2 f (\frac{k T_{s}}{T_{d}} + \frac{T_{d}}{4 k T_{s}} + 1)

(4)

where $d_{k}$ is the desirable distance; $T_{s}$ and $T_{d}$ are the pixel sizes on the camera sensor and display, respectively; and $k$ is an arbitrary positive integer. For any such value of $k$, the calculated distance $d_{k}$ can effectively eliminate the Moire pattern on the captured image. For a generic setting with the new iPad display and the Nexus 4/5 camera, we have $T_{d} = 0.097$ mm (iPad), $T_{s} = 1.1$ µm (Nexus 4) or $1.4$ µm (Nexus 5), and $f = 4.6$ mm (Nexus 4) or $4.0$ mm (Nexus 5).^14,25 The Moire pattern eliminating distances for both sets of equipment are listed in Table 1. The four reference points mentioned in section “Ensure accurate geometric setups for the experiments” will be determined based on these Moire pattern eliminating distances. Therefore, the Moire pattern in the captured image is a good indicator of whether the geometric setting is precise enough or not.

Table 1.

Capture distance with no Moire pattern (in meters).

Equipment	$k = 1$	$k = 2$	$k = 3$	$k = 4$	$k = 5$
New iPad with Nexus 4	0.21	0.11	0.077	0.060	0.050
New iPad with Nexus 5	0.15	0.078	0.055	0.043	0.036
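As a check on Table 1, equation (4) can be evaluated directly. The sketch below takes the sensor pixel sizes as 1.1 and 1.4 µm, which is the unit that reproduces the tabulated distances; the function name `moire_free_distance` is chosen here for illustration.

```python
def moire_free_distance(k, f_mm, t_s_mm, t_d_mm):
    """Distance d_k from equation (4), in millimeters.

    k: positive integer, f_mm: focal length, t_s_mm: camera sensor pixel
    size, t_d_mm: display pixel size (all lengths in millimeters).
    """
    return 2.0 * f_mm * (k * t_s_mm / t_d_mm + t_d_mm / (4.0 * k * t_s_mm) + 1.0)


T_D = 0.097  # new iPad display pixel size in mm
equipment = {
    "Nexus 4": {"f_mm": 4.6, "t_s_mm": 1.1e-3},
    "Nexus 5": {"f_mm": 4.0, "t_s_mm": 1.4e-3},
}

for name, cam in equipment.items():
    # Convert to meters to match the unit used in Table 1.
    row = [moire_free_distance(k, cam["f_mm"], cam["t_s_mm"], T_D) / 1000.0
           for k in range(1, 6)]
    print(name, [f"{d:.3f}" for d in row])
```

For $k = 1$ this gives roughly 0.21 m for the Nexus 4 and 0.15 m for the Nexus 5, which are the distances used in the experimental setup.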

It should be noted that this scheme works only when the display and camera planes are parallel. However, it provides useful clues for the choice of equipment that can alleviate the Moire pattern. As $T_{d} / (4 k T_{s})$ is the dominant term in the bracket of equation (4), a smaller display pixel size results in a smaller range for the Moire pattern eliminating distance, which benefits different parts of the image with different but similar display-to-camera distances. In other words, even when the display and camera planes are not parallel, a display with a higher pixels per inch (PPI) value produces a less obvious Moire effect.

Avoid trembling and vibration

Display–camera communication is pervasive and sometimes ad hoc. The camera devices may be handheld, with no stable support. Irregular hand tremors then make it difficult to produce consistent channel measurements: they cause unexpected motion blur in the captured images, which degrades the image quality and causes the loss of parts of an image frame.^7,26 In order to avoid abrupt changes in the communication channel, it is suggested that a stationary platform with an adjustable viewing angle be used to hold the camera. One example of such equipment is the mini tripod shown in Figure 1(a).

Besides hand trembling, some tiny mechanical vibrations also cause unexpected camera motions. Early studies^27,28 showed that mechanical vibrations introduce inter-pixel interference and limit the performance of an imaging sensor. The situation worsens in our case, since a display is used as the transmitter. On the one hand, some inevitable Moire patterns are produced when the display and camera planes are not parallel. As discussed in section “Avoid Moire effect from the display,” the Moire effect is generated by sampling the display pixel grid with the camera sensor pixels. Thus, it is extremely sensitive to vibrations in the display–camera channel. A small vibration of the camera or display may lead to a big change in the Moire pattern, which heavily affects the image preprocessing results, for example, the binarization output. On the other hand, vibrations cause tiny shifts in the barcode corners in the captured images. This slight difference, at the pixel or even subpixel level, leads to very different results in the corner-detection and symbol-synchronization steps.

In the literature, researchers have proposed several approaches to solve the issues caused by vibrations. Siebert et al.²⁹ showed that the vibrations could be well estimated with a three-dimensional (3D) homography setup using two calibrated cameras. In our setup, a pair of consecutive images can be considered as images from two independent cameras. The vibrations between the two image frames can then be estimated using the same homography model, and the required compensations can be carried out. Advanced mobile phones offer image stabilization functionality at optical and software levels. For example, the iPhone 6 Plus³⁰ employs optical image stabilization, which takes advantage of the gyroscope and motion sensors, to measure motion data, and provides accurate information on the lens movement. The motion blur produced by shaking hands can then be compensated even under low-light conditions.

Therefore, we suggest that the experimental setup be stationed at a desk without any vibration sources such as computer cooling systems and hard disk motors. A tablet, which has no moving components, is a highly suitable display for our experiment, and phone models with image stabilizers for the cameras are preferable.

Experimental results

In this section, the experimental procedure is first introduced, and the experimental results with and without the proposed channel calibration scheme are then shown.

Introduction to experimental procedure

The purpose of the display–camera channel calibration experiment is to identify the range of the error-prone regions, which is important for error correction codes with unequal error protection (UEP)³¹ and other advanced coding schemes. In this part, we first describe our experimental settings proposed to produce consistent results, followed by a methodology to analyze the experimental results.

Experimental setups

Given the considerations discussed in section “Proposed channel calibration scheme,” the experimental parameters are set as follows:

Display: new iPad with Retina display ($2048 \times 1536$ pixels);

Camera: Nexus 4/5 run in video graphics array (VGA) ($640 \times 480$ pixels) preview mode with a customized Android app for retrieving the frames;

Camera stand: mini tripod with adjustable orientation and height;

Barcode pattern: generic binary barcode with datamatrix-like structure as shown in Figure 1(b).

Barcode size: $10 \times 10\ \mathrm{cm}^{2}$ on display;

Barcode dimension: $87 \times 87$ modules;

Distance: 21 and 15 cm from the camera to barcode center for Nexus 4 and Nexus 5, respectively, according to equation (4);

Capture angles: −20°, 0°, and 20°.

Brightness: 250–350 lx as measured by a lux meter.

It should be noted that our barcode pattern is not exactly the same as the datamatrix code.³² The main difference is that our barcode pattern has an odd number of modules in each dimension (e.g. 87 in our setup), while the datamatrix code has an even number of modules. This change has been implemented to improve the detection accuracy of the top-right corner. With an odd number of modules along each dimension, a dark module is presented at each end of the finder pattern. It eliminates the need for estimating the top-right corner based on the detected ones at top-left, bottom-left, and bottom-right. Therefore, the detection accuracy can be improved.

It should be noted that our scheme for channel calibration is not limited to a specific barcode pattern. The barcode dimension is chosen such that it demonstrates high-throughput (85 × 85 bits/frame) performance, which is our main interest. The display–camera distance is fine-tuned to minimize the effect of the Moire pattern, according to equation (4). The settings of the capture angles cover a wide range of practical applications. For each angle setting, we precompute a set of four reference points as shown in section “Ensure accurate geometric setups for the experiments.” The users are required to align the reference points with the barcode corners. A simple demo of the 20° experimental setting is shown in Figure 1(a). It should be noted that the measurement procedure is also applicable to the high-capacity quick response (QR) code or other barcode patterns. This is because the binarization and corner-detection processes of generic 2D barcodes are the same; hence, the factors that lead to inconsistent experimental results are similar for different barcode patterns. Besides the results for the extreme cases (20°), the demodulation results under the 0° condition are also reported. These settings cover both the best and worst cases under a practical setup. Thus, the scheme is generalizable to other configurations, for example, 10°.

In each set of experiments, 500 images are collected to achieve high reliability, that is, a good enough confidence interval. The 95% confidence interval can be computed by³³

I_{C} = 2 \sqrt{p \cdot (1 - p)} / \sqrt{N_{I}}

(5)

where $p$ is the bit error rate (BER) for a given module and $N_{I}$ is the number of images, that is, 500 in our experiment. It means that the actual BER has a 95% probability of lying in the interval $[p - I_{C}, p + I_{C}]$. For $p = 0.1$, $I_{C} = 0.026$ and the 95% confidence interval is $[0.074, 0.126]$. In other words, with 500 images and a BER of 0.1, the 95% confidence interval of the experimental result lies within $\pm 0.026$ of the measured value. The 95% confidence intervals for $p = 0.2, 0.3, 0.4$ can similarly be computed as $\pm 0.036, \pm 0.042, \pm 0.044$, respectively.
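Equation (5) is straightforward to evaluate for other BER values or image counts. The following is a minimal sketch; the function name `ci_half_width` is chosen here for illustration.

```python
import math


def ci_half_width(p, n_images):
    """Half-width I_C of the 95% confidence interval from equation (5)."""
    return 2.0 * math.sqrt(p * (1.0 - p)) / math.sqrt(n_images)


for p in (0.1, 0.2, 0.3, 0.4):
    i_c = ci_half_width(p, 500)
    print(f"p = {p:.1f}: I_C = {i_c:.4f}, interval = [{p - i_c:.3f}, {p + i_c:.3f}]")
```

Evaluated this way, the half-widths agree with the figures quoted above to within about 0.001; the small differences presumably come from rounding or from using 1.96 rather than 2 as the multiplier. The function also shows how the interval narrows with more images, since $I_{C}$ scales as $1 / \sqrt{N_{I}}$.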

Result analysis: consistency measure

After the barcode images are collected, they are forwarded to the decoder and the demodulation BER plot is generated by assembling the error probabilities of each module according to the module position. As illustrated in Figure 4, the error plot is overlaid with the barcode image to show the position correspondences. The BER value at a given position in the xy-coordinates shows the performance of the corresponding barcode module. In this simple example, an error-prone module with an error rate of 0.1 is detected at the center of the barcode region.

Figure 4.

Illustration of the bit error rate plot over the barcode region.
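The assembly of a BER plot can be sketched as follows. This is an illustrative example rather than the decoder used in the experiments: the function name `ber_map` is hypothetical, and it assumes that every captured frame carries the same known reference pattern and has already been demodulated into a grid of bits.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # demodulated bits, indexed as grid[row][column]


def ber_map(decoded_frames: Sequence[Grid], reference: Grid) -> List[List[float]]:
    """Per-module error rate: fraction of frames in which each module is wrong."""
    if not decoded_frames:
        raise ValueError("at least one decoded frame is required")
    rows, cols = len(reference), len(reference[0])
    errors = [[0] * cols for _ in range(rows)]
    for frame in decoded_frames:
        for y in range(rows):
            for x in range(cols):
                if frame[y][x] != reference[y][x]:
                    errors[y][x] += 1
    n_frames = len(decoded_frames)
    return [[count / n_frames for count in row] for row in errors]


# Toy 3 x 3 barcode: the center module is decoded wrongly in 1 of 10 frames.
reference = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
faulty = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
plot = ber_map([reference] * 9 + [faulty], reference)
for row in plot:
    print(row)
```

The toy input mirrors the situation in Figure 4, with a single error-prone module at the center of the barcode region.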

The BER plots of different experiments are compared to evaluate the consistency of the experiments. In this study, the locations of the error-prone regions, rather than the amplitudes of individual peaks, are our main interest, because error-prone locations are the primary concern of some advanced error protection schemes and channel studies. Therefore, the BER plots must be preprocessed to eliminate noisy spikes before the consistency is evaluated. One generic approach is to threshold each BER plot so that only the dominant peaks are retained, and to label the modules within an L1-distance of $D$ modules from the dominant peaks as the error-prone regions, denoted $P_{1}$ and $P_{2}$ for the two plots. In the experiment, the threshold is empirically set to 50% of the mean amplitude of the top 10 dominant peaks, and $D$ is set to 3. The error-prone regions common to both BER plots are given by $S_{c} = P_{1} \cap P_{2}$, and the overall error-prone regions by $S_{a} = P_{1} \cup P_{2}$. The numbers of modules in $S_{c}$ and $S_{a}$ are denoted $E_{c}$ and $E_{a}$, respectively. Thus, the consistency metric is defined as

$$R = \frac{E_{c}}{E_{a}} \tag{6}$$

The higher the metric, the more consistent the two experiments are.
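The metric in equation (6) can be sketched in a few lines. The code below is one possible reading of the preprocessing described above, not the exact implementation used in the experiments. In particular, it assumes that the "top 10 dominant peaks" are simply the 10 largest module values (no local-maximum detection is performed) and that two plots without any error-prone module are fully consistent. All function names are hypothetical.

```python
from typing import Sequence, Set, Tuple

Module = Tuple[int, int]              # (row, column) of a barcode module
BerPlot = Sequence[Sequence[float]]   # per-module BER values


def dominant_peaks(ber: BerPlot, top_k: int = 10, fraction: float = 0.5) -> Set[Module]:
    """Modules whose BER reaches `fraction` of the mean of the `top_k` largest values."""
    top = sorted((v for row in ber for v in row), reverse=True)[:top_k]
    threshold = fraction * sum(top) / len(top)
    return {
        (y, x)
        for y, row in enumerate(ber)
        for x, v in enumerate(row)
        if v > 0 and v >= threshold
    }


def error_prone_region(ber: BerPlot, d: int = 3, top_k: int = 10,
                       fraction: float = 0.5) -> Set[Module]:
    """All modules within L1-distance `d` of a dominant peak (clipped to the plot)."""
    rows, cols = len(ber), len(ber[0])
    region: Set[Module] = set()
    for peak_y, peak_x in dominant_peaks(ber, top_k, fraction):
        for dy in range(-d, d + 1):
            span = d - abs(dy)
            for dx in range(-span, span + 1):
                y, x = peak_y + dy, peak_x + dx
                if 0 <= y < rows and 0 <= x < cols:
                    region.add((y, x))
    return region


def consistency(ber_1: BerPlot, ber_2: BerPlot, d: int = 3, top_k: int = 10,
                fraction: float = 0.5) -> float:
    """Consistency metric R = E_c / E_a from equation (6)."""
    p1 = error_prone_region(ber_1, d, top_k, fraction)
    p2 = error_prone_region(ber_2, d, top_k, fraction)
    union = p1 | p2
    if not union:
        return 1.0  # assumption: two error-free plots count as fully consistent
    return len(p1 & p2) / len(union)


def single_peak(row: int, col: int, size: int = 9, value: float = 0.2):
    """Toy BER plot with one nonzero module, used only for illustration."""
    plot = [[0.0] * size for _ in range(size)]
    plot[row][col] = value
    return plot


# Two toy plots whose only peaks are one module apart.
print(consistency(single_peak(4, 4), single_peak(4, 5)))
```

Since the regions are sets of module positions, $R$ always lies between 0 (no common error-prone module) and 1 (identical error-prone regions), and it is insensitive to the peak amplitudes once they exceed the threshold.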

Consistency experiment: with the proposed scheme

In this part, we demonstrate that the consistency of the experiments can be maintained by accurately setting the geometries with the proposed scheme. As shown in Figure 5, the consistency metrics for the above distributions are computed as 0.56 for −20°, 0.40 for 0°, and 0.77 for 20°. It can be seen that the top-left and bottom-left corner regions are the error-prone positions in the error plots, regardless of the capturing angle. This is due to the datamatrix-like finder and timing pattern design. As can be seen in Figure 1(b), the top and right edges of the barcode consist of alternating black and white modules, which are used in the symbol synchronization. Errors in locating the barcode position heavily affect the symbol-synchronization accuracy, because the red slicing lines for symbol alignment are drawn from top to bottom or from right to left. As a result, the top-left, bottom-left, and bottom-right corner regions suffer the most from the symbol-synchronization errors.

Figure 5.

BER plots for our experimental results at −20°, 0°, and 20° capturing angles with Nexus 5 camera and the new iPad display. The consistency metrics for the three columns are: 0.56, 0.40, and 0.77 for the −20°, 0°, and 20° conditions, respectively.

It is also worth mentioning that the differences in the amplitudes of the two corresponding error plots are due to small alignment differences between the two experimental settings. As can be seen in Figure 6, the first set of images has higher detection accuracy in the bottom-left corner than the second set of images. Thus, smaller symbol-synchronization errors and lower amplitudes in the BER plots are obtained for the first set of images. Although various experimental conditions have been considered and controlled in the measurement process, many uncertainties remain that could generate slightly different experimental results. These factors include, but are not limited to, the pixelated effect of the display pixels, blurring in the capturing process, and various noise sources in the camera sensor.

Figure 6.

Corner-detection and symbol-synchronization results for both experiments in the case of 20° capturing angle. Top: the detected bottom-left corners marked with red dots; Bottom: the corresponding symbol-synchronization performances at the bottom-left region: (a) first set and (b) second set.

The results for both Nexus 4 and Nexus 5 are summarized in Table 2. It can be inferred that, when the channel distortions are strong and deterministic, the experiments should have high consistency. For example, under −20° and 20° perspective distortions, the consistency scores are generally higher than those of the experiments at 0°.

Table 2.

Consistency scores.

Equipment	Angle −20°	Angle 0°	Angle 20°
New iPad with Nexus 4	0.68	0.52	0.82
New iPad with Nexus 5	0.56	0.40	0.77

Inconsistent experiments: without the proposed scheme

Slight deviations in the geometric parameters can lead to very different decoding performances, and therefore, very different error plots. Without precise calibration of the geometric parameters, the consistency of the experimental results cannot be guaranteed from independently collected images. In this section, we show that a small inaccuracy in setting the angle and distance can lead to a completely different decoding performance.

One set of experimental images is collected at an angle of −22° and a distance of 18 cm, using Nexus 4 and the new iPad. Such deviations can easily occur if the angle and distance measurements are not conducted properly, for example, when the distance is measured while the phone and display are not properly aligned ( $O, O_{s}$ , and $O_{d}$ do not lie on the same axis as shown in Figure 2), or when the angle is set roughly using a protractor and ruler. It turns out that the consistency score drops significantly when compared to the error plots with precise geometric setups. As shown in Figure 7(a) and (b), the position of the error-prone region has changed significantly and the consistency score has dropped to 0.22, although the geometric setup has deviated by only 2° in the viewing angle and 3 cm in the capturing distance. The error-prone region near the right edge, which is due to the lens distortion, becomes more severe as the viewing angle increases and the capturing distance decreases, as shown in Figure 7(c).

Figure 7.

Error plots of the experimental results by Nexus 4 and the new iPad with (a) precise geometric settings, that is, −20° and 21 cm; (b) −22° and 18 cm; and (c) synchronization error due to lens distortion in the right edge of the barcode.

It should be noted that the inconsistency between measurement results is an important phenomenon but, to the best of our knowledge, it has not been reported in the literature. Following the suggested experimental procedure, repeatable experimental results can be generated. Moreover, the results are more sensitive under a large viewing angle, which is an extreme experimental condition. Once the error-prone locations under different experimental conditions are understood, an error correction code with UEP can be applied to improve the reliability of the communication system.

Conclusion

In this article, we established a precise calibration scheme for the geometric parameters in the display–camera communication channel. Four reference points were precomputed according to the predetermined geometric parameters. The problem of accurately setting up the parameters was thereby simplified to the alignment of the four barcode corners with the four reference points. Careful setting of the equipment and experimental parameters was needed to avoid unstable channel states such as the Moire effect, rolling-shutter effect, blocking artifacts, autofocus inconsistencies, trembling, and vibrations. In the experiment, the BER plots of the captured barcode images were analyzed and the consistency criteria were defined. We demonstrated that setting the geometric parameters accurately, and avoiding the unstable factors with the proposed scheme, could improve the consistency of the experiment significantly.

It should be noted that the proposed scheme is also applicable to display–camera communication schemes without a fixed pattern. Generally speaking, all display–camera communication systems require proper alignment of the detected pattern, which could be a barcode or an area without a specific landmark. The factors mentioned in our measurement scheme, that is, the rolling-shutter effect, inconsistencies from autofocus, and the Moire effect, induce significant distortions into the captured images. These distortions lead to significant difficulties in the detection process and result in inconsistent experimental results and less stable communication performance. Therefore, the conditions specified in the proposed measurement scheme are also applicable to general display–camera communication schemes without a barcode pattern, such as Jo et al.³⁴

Footnotes

Handling Editor: Zhigao Zheng

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work of the first author was supported by the NSFC Project under grant 61702340, the NSF of Guangdong Province under grant 2017A030310382 and the Faculty Startup Grant of Shenzhen University under grant 2016052. The work of the second author was supported by the General Research Fund of the Hong Kong Research Grants Council under Project 16207114.

References

1. LeCun, Bengio, Hinton. Deep learning. Nature 2015; 521(7553): 436–444.

2. Kulkarni, Ganesan, Shenoy, et al. Senseye: a multi-tier camera sensor network. In: Proceedings of the 13th annual ACM international conference on multimedia, Singapore, 6–11 November 2005, pp.229–238. New York: ACM.

3. Hossain, Jang. A survey of design and implementation for optical camera communication. Signal Process: Image 2017; 53: 95–109.

4. Hranilovic, Kschischang. A pixelated MIMO wireless optical communication system. IEEE J Sel Top Quant 2006; 12(4): 859–874.

5. Perli, Ahmed, Katabi. Pixnet: interference-free wireless links using LCD-camera pairs. In: Proceedings of the sixteenth annual international conference on mobile computing and networking, Chicago, IL, 20–24 September 2010, pp.137–148. New York: ACM.

6. Hao, Zhou, Xing. COBRA: color barcode streaming for smartphone systems. In: Proceedings of the 10th international conference on mobile systems, applications, and services, Ambleside, 25–29 June 2012, pp.85–98. New York: ACM.

7. Wang, et al. Enhancing reliability to boost the throughput over screen-camera links. In: Proceedings of the 20th annual international conference on mobile computing and networking, Maui, HI, 7–11 September 2014, pp.41–52. New York: ACM.

8. Ashok, Jain, Gruteser, et al. Capacity of pervasive camera based communication under perspective distortions. In: Proceedings of the IEEE international conference on pervasive computing and communications, Budapest, 24–28 March 2014, pp.112–120. New York: IEEE.

9. Sprott. Statistical inference in science (Springer series in statistics). Berlin: Springer, 2000.

10. Wengrowski, Yuan, Dana, et al. Optimal radiometric calibration for camera-display communication. In: Proceedings of the IEEE winter conference on applications of computer vision, Lake Placid, NY, 7–10 March 2016, pp.1–10. New York: IEEE.

11. Witten. A note on the structure of system state spaces and its implications on the existence of non-repeatable experiments. B Math Biol 1980; 42(2): 267–272.

12. Campbell. Repeatability of experiments. B Math Biol 1982; 44(4): 593–593.

13. Some thoughts about the iPhone 5S camera improvements, http://www.anandtech.com/show/7329/some-thoughts-about-the-iphone-5s-camera-improvements (accessed 10 September 2017).

14. Google Nexus 5 review, http://www.anandtech.com/show/7517/google-nexus-5-review/6 (accessed 10 September 2017).

15. Bigas, Cabruja, Forest, et al. Review of CMOS image sensors. Microelectr J 2006; 37(5): 433–451.

16. Nakamura. Image sensors and signal processing for digital still cameras: optical science and engineering. Boca Raton, FL: Taylor & Francis, 2005.

17. Hainich, Bimber. Displays: fundamentals and applications. Boca Raton, FL: Taylor & Francis, 2011.

18. Lee J-H, Liu S-T. Introduction to flat panel displays, vol. 20. Hoboken, NJ: John Wiley & Sons, 2008.

19. ZXing Project. Multi-format 1D/2D barcode image processing library, https://github.com/zxing/ (accessed 10 September 2017).

20. Monitor DELL E2313H, 23’, http://www.monitor-details.info/monitor-dell-e2313h-23.html (accessed 10 September 2017).

21. Baker. Flicker free monitor database, TFT Central, http://www.tftcentral.co.uk/articles/flicker_free_database.htm (accessed 10 September 2017).

22. Hirsch. Light and lens: photography in the digital age. Abingdon: Taylor & Francis, 2012.

23. Micheletti. iPhone photography and video for dummies. Hoboken, NJ: John Wiley & Sons, 2010.

24. Muammar, Dragotti. An investigation into aliasing in images recaptured from an LCD monitor using a digital camera. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing, Vancouver, BC, Canada, 26–31 May 2013, pp.2242–2246. New York: IEEE.

25. Apple iPad, https://www.apple.com/hk/en/ipad/compare/ (accessed 10 September 2017).

26. Chen, Kot, Yang. A two-stage quality measure for mobile phone captured 2D barcode images. Pattern Recogn 2013; 46(9): 2588–2598.

27. Wulich, Kopeika. Image resolution limits resulting from mechanical vibrations. Opt Eng 1987; 26(6): 266529.

28. Rudoler, Hadar, Fisher, et al. Image resolution limits resulting from mechanical vibrations. Opt Eng 1991; 30(5): 577–589.

29. Siebert, Wood, Splitthof. High speed image correlation for vibration analysis. J Phys: Conf Ser 2009; 181: 012064.

30. Apple iPhone 6 Plus cameras, https://www.apple.com/iphone-6/cameras/ (accessed 10 September 2017).

31. Goldsmith. Wireless communications. Cambridge: Cambridge University Press, 2005.

32. ISO/IEC 16022:2006. Information technology - automatic identification and data capture techniques - data matrix bar code symbology specification.

33. Belia, Fidler, Williams, et al. Researchers misunderstand confidence intervals and standard error bars. Psychol Methods 2005; 10(4): 389–396.

34. Gupta, Nayar. DisCo: display-camera communication using rolling shutter sensors. ACM T Graphic 2016; 35(5): 150.

Systematic scheme for performance measurement in display–camera-based sensing networks