Sage Journals: Discover world-class research

Abstract

Closed-circuit televisions serve as prevention against crime, and many studies for closed-circuit television deployment have been conducted. The closed-circuit television deployment in downtown is similar to solving coverage problem in wireless camera-based sensor networks. The difference between the two problems is various environmental factors such as buildings, roads, camera capability, and movements of pedestrians. We use a genetic algorithm to increase the efficiency of closed-circuit television deployment in two-dimensional topography. In addition, a parallel experiment using general-purpose computing on graphics processing units is added to improve computing speed, which is a disadvantage in genetic algorithms. The target region is 500 m × 500 m and consists of 50 × 50 grids. The fitness of the evaluation, which refers to a detection rate, is calculated from the corresponding cell when a pedestrian moves to each cell depending on whether the pedestrian is detected. The proposed experiment was superior to the random deployment experiment by approximately 37.5%. There was no significant difference in the detection rate between the CPU experiment and a NVIDIA GeForce GTX 970 experiment in the 95% confidence interval. The efficiency of a CUDA kernel function using the NVIDIA GeForce GTX 970 graphic card was analyzed.

Keywords

Sensor deployments coverage genetic algorithm parallel computing CUDA

Introduction

Boundary coverage^1–3 involves the detection of an entity by forming a boundary when an entity passes a region of interest (RoI). This is applicable to various fields such as border surveillance, protection of resources, surveillance patrol, and invasion detection. Many sensor nodes must be deployed to cover the total RoI in order to reach full coverage. However, the number of deployed sensor nodes can be substantially reduced by forming boundary coverage and detecting the moving entities. The efficiency of boundary coverage increases significantly with wider RoI, but problems occur because coverage holes appear.

Boundary coverage can be classified into weak barrier coverage and strong barrier coverage methods. Weak barrier coverage detects the moving entities that pass the RoI in a fixed path. Strong barrier coverage detects moving entities that pass the RoI in a random path.⁴

Metaheuristic algorithms such as tabu search, gradient ascend, simulated annealing, and evolution operations are used to optimize the boundary coverage. This is a non-deterministic polynomial-time (NP) hardness problem.⁵ A genetic algorithm,^6,7 which is a type of evolution operation, is used in this study to optimize the boundary of closed-circuit television (CCTV) deployment. Adopting parallelism is more convenient in genetic algorithms compared with other metaheuristic-based algorithms because high-quality solutions can be easily gained in a short time during parallel processing.

CCTVs are used in crime prevention, criminal tracking, invasion detection, and boundary surveillance patrol. Uses of CCTV equipment are gradually increasing. CCTV’s efficacy is especially important in urban areas that have concentrated populations.

It is quite difficult to deploy CCTVs efficiently in wide RoIs and on complex roads using a suitable number of CCTVs. This is an NP-hard problem and it is proper to adopt approaches that apply metaheuristic-based algorithms.

Zang⁸ researched the boundary coverage of camera sensor networks and aimed to gain high-sensing coverage. Regardless of the direction from which an object approaches, the camera boundary is generated with full-view coverage. The camera sensor can effectively capture facial images. The authors aimed to generate a boundary using few camera sensors by analyzing a full-view coverage model and camera selection algorithm. Geometric models are used to achieve full-view coverage in both deterministic deployment and random deployment.

Wang and Cao⁹ claimed that boundary coverage in camera sensor networks with directivity was not effective only with the sum of RoI sensing ranges. The direction of an object and the camera angle was used to measure the sensing quality. When the camera direction and direction in which the object faces are close enough, one object is completely covered from all angles regardless of its facing direction. This was used to generate a camera boundary that could completely cover all points within the RoI. They proposed a new camera deployment method to achieve boundary coverage in deterministic environments, and they reduced a redundancy to optimize the number of required cameras.

Tao et al.⁴ researched a sensing model that had directivity. To provide strong boundary coverage, the researchers used a video camera to make sensors randomly control the direction. First, the researchers proposed to study strong boundary coverage in one-dimension (1D) using a polynomial-time algorithm, which can achieve the required coverage using a minimum number of sensors. Moreover, for strong boundary coverage, a geographical relation was found between deployment region boundaries and sensors with directivity. The concept of virtual nodes was introduced to reduce the solution space from a continuous domain into a discrete domain. A directivity boundary graph that can provide quick answers if the directions of directional sensors exist, to achieve strong boundary coverage in a given region, was generated. If such a direction exists, it was claimed that efficient solutions that minimize the maximum rotation angle of directional sensors could be found. The efficiency of the solutions was proven through simulation.

Navin¹⁰ proposed a new distributed genetic algorithm to overcome camera coverage control problem, which has high time complexity. He/She divided one population into k sub-population and repeated genetic process in parallel for each sub-population. We used only one population. Genetic algorithms require a lot of computing time in fitness evaluation. We designed a CUDA kernel function to evaluate all chromosomes at the same time and the attempt led to largely reduce computation time. Simulation results showed that distributed genetic algorithm results near-optimal solution faster than standard genetic algorithm.

Shih et al.¹¹ proposed a distributed algorithm, namely CoBRA (Cone-based Barrier coveRage Algorithm), to achieve barrier coverage in wireless camera sensor networks (WCSNs). The basic concept of CoBRA is that each camera sensor can determine the local possible barrier lines according to the geographical relations with their neighbors. Experiment results show that CoBRA can efficiently achieve barrier coverage in WCSNs. Comparing to the ideal results, CoBRA can use fewer nodes to accomplish barrier coverage in random deployment scenarios.

Navin et al.¹² also showed the ability of genetic algorithm to solve coverage problem in WCSNs. Simulation results of their presented method showed the better coverage ratio compared with the self-orientation algorithm presented in 2008.

Ma et al.¹³ focused on the Minimum Camera Barrier Coverage Problem (MCBCP) in WCSNs in which the camera sensors are deployed randomly in a target field. In addition, they proposed an optimal algorithm for the MCBCP. Simulation results demonstrated that their algorithm outperformed the existing algorithm.

Costa and Guedes¹⁴ surveyed the state of the art of the coverage problem issue in WCSNs, regarding strategies, algorithms, and general computational solutions. Open research areas are also discussed, envisaging promising investigation considering coverage in WCSNs.

This study aimed to optimize the deployment of CCTV in two dimensions. A downtown area was set as a background, and the locations and angles of CCTVs were fixed during deployment. Detection is determined as a random probability when both the pedestrian facing direction and CCTV facing direction overlap. This procedure was optimized using a genetic algorithm, and the movement detection of each pedestrian was processed in parallel. Our research deals with a barrier coverage problem. Barrier coverage problem is used to successfully detect events for some objects to cross sensing area. As a special case, CCTV deployments also aim to detect pedestrians crossing a region of downtown. Deterministic deployment and random deployment are types of sensor deployment. Deterministic sensor deployment is generally convenient for installation and is used in accessible environments. Random deployment is used in randomly scattered form when direct deployment is difficult. In this study, deterministic sensor deployment that identifies the locations and number of sensors in advance is used for installation, and a genetic algorithm that is a metaheuristic-based algorithm is used for optimization. Genetic algorithms include inherent parallelism differently from other heuristic algorithms such as tabu search^15,16 and gradient ascend.^17,18 Our research focused on the improvement on computing time. Tabu search enhances the performance of local search by relaxing a basic rule. At each step, worsening moves can be accepted if no improving move is available. Gradient ascend is a first-order iterative optimization algorithm for finding the maximum of a function. These algorithms do not include any parallelism and they use just one solution to find the optimum.

The remainder of this article is organized as follows. Section “CCTV deployments” explains CCTV deployments based on CPUs. In section “Parallel CCTV Deployments,” we present CCTV placements using a parallel technique such as CUDA. In section “Experiments,” we explain our experimental environments, plan, and results. This article ends with some concluding remarks in section “Concluding remarks.”

CCTV deployments

CCTV deployment is a difficult issue that must be determined by considering many variables. CCTV deployment was simplified as a problem to detect pedestrians by determining only CCTV locations and angles in a two-dimensional (2D) plane. CCTV types and detection rates were set in reference to actual products, and the detection direction was simplified into eight directions.

A downtown was set as the region in the experiment, and virtual topography was used to create Scenarios 1 and 2. Deployment was limited to a 2D plane, and the topography was composed only of buildings and roads. CCTVs can be deployed only on roadsides, and the detection direction and detection distance are different depending on the type of CCTV. The area of the RoI is 500 m × 500 m and is a grid composed of 50 × 50 cells.

The pedestrian moved randomly from a starting point of (x, y) = (0, 0) to a destination point of (x, y) = (49, 49). An A* algorithm¹⁹ is used as a path-searching algorithm. The A* algorithm moves toward the destination point by calculating the shortest path. The weighted value was added to the moving direction of the pedestrian. However, the number of movements toward the shortest path was generally high owing to the properties of the A* algorithm.

CCTV types and detection capabilities

Detectable distance and detection rate were set as listed in Table 1 by referring to CCTV products types sold in the market. CCTV1 is based on the DS-2CD2010-I model and has a maximum visible distance of 30 m. CCTV2 is based on the OPS-BMIP720P3 model and has a maximum visible distance of 40 m. CCTV3 is based on the VSTARCAM-100BP model and has a maximum visible distance of 50 m.

Table 1.

Distance and the detection rates according to CCTV type.

CCTV type	Visibility
CCTV type	Distance (m)	Detection rate	Distance (m)	Detection rate
CCTV1	0–20	0.95	20–30	0.50
CCTV2	0–30	0.75	30–40	0.40
CCTV3	0–40	0.65	40–50	0.30

CCTV: closed-circuit television.

The directions of pedestrians were set in eight types, as shown in Figure 1(a). The detection directions of CCTVs were also set in eight directions. When a CCTV can detect pedestrians, the case is limited to the situation in which the pedestrian facing direction and CCTV facing direction are facing each other, as shown in Figure 1(b).

Figure 1.

Relation between the direction of a moving person and that of a CCTV: (a) moving direction of pedestrian and (b) detectable direction of CCTV.

Calculation of detection ratio in each cell

Probability of detection (POD) of CCTV deployment should be calculated according to the movement direction of pedestrians, because CCTV detection has a limited detection direction, unlike the scalar sensor detection, which can detect in all angles. The POD should be calculated again for new movement because the pedestrian and CCTV facing directions are different when the pedestrian moves to a new cell. After adding the number of cases for both detected and undetected cases of a pedestrian in the relevant cell by comparing the calculated PODs and random values, equation (1) is used to calculate the detection rate. V_d is the number of detected cases and V_nd is the number of undetected cases

Detection ratio (%) = 100 \times \frac{V_{d}}{V_{d} + V_{nd}}

(1)

Genetic algorithm

A genetic algorithm was used to deploy CCTVs in the target region, which was composed of 50 × 50 grids. The size of the population was 100, the number of generations was 100, tournament selection is used for selective operation, and the k value that indicates a selective force was set as 3. A block-even crossover operation suitable for 2D problems was used, and a variation was created with a 5% probability in a gene-wise variation operation. In a substitution operation, the solution with the lowest quality was substituted with the child solution if the child solution in the solution population was better than the solution with the lowest quality.

Selection, crossover, and variation operations are repeated in every generation as many times as the number of populations. Then, the genetic algorithm with generation type is used for a substitution operation. Table 2 lists the parameters of the CPU-based genetic algorithm. Regarding encoding, integers that represent the eight directions are added to the integer-type x and y location coordinates on the grid. A CCTV deployment simulator is used for a fitness evaluation. The evaluation is performed once after deployment of the initial solution and once before the substitution operation for each generation. A fitness evaluation is also sequentially performed in the CPU experiment, and an evaluation was performed in parallel with all fitness solutions within the population in the graphics processing unit (GPU) experiment. A total of 10 pedestrians passed the RoI in the fitness evaluation, in which the number of fitness evaluations became 100 × 10. Figure 2 shows a flowchart of the proposed genetic algorithm.

Table 2.

GA parameters for CCTV deployment.

Parameter	Contents
Fitness function	Calculation of detection rate according to pedestrian movement
Encoding	Two-dimensional integer coordinates (50, 75, and 200 dimension) + 8 directions
Population size	100
Number of generations	100
Selection operation	Selection of solution with the highest quality after randomly selecting solution from population (t = 3)
Crossover operation	Block-even crossover
Variation operation	Gene-wise (5%)
Substitution operation	If quality of the child solution is better than the solution with the lowest quality in the population, the solution with the lowest quality is substituted with the child solution

CCTV: closed-circuit television; GA: genetic algorithm.

Figure 2.

Flowchart of the proposed GA.

Parallel CCTV deployments

POD for each cell in CCTV parallelism should be calculated whenever a pedestrian crosses each cell. The CCTV deployment simulator, which is a fitness function, calculates the POD in each CUDA thread, processes the detection of pedestrians, and returns the number of detected and undetected pedestrians. Our research deals with a barrier coverage problem, of which the goal is how well a deployment detects events for some pedestrians to cross sensing area of CCTVs. Hence, it is very natural that the detection rate is considered as our fitness evaluation measure. The detection rate is calculated by applying equation (1) to the returned result.

The most important part of the parallelism of the CCTV deployment simulator was a path search using the A* algorithm. The A* algorithm priority queue enabled a hip alignment in space with a fixed size. The weighted value was put in the direction of pedestrian movement when setting the level values to diversify moving paths. Moving paths were attempted to be diversified by generating random numbers when setting the level values, but too much computing time was used owing to the frequent uses of random-number generation function. Moreover, the random-number generator used in CPU experiments could not be used in CUDA kernel function in the same way. We made a random-number generator based on the method proposed by Park and Miller.²⁰ The generator effectively reduced the computing time in CUDA kernel function.

Genetic algorithms have a trade-off between solution quality and computation time. When population size or the number of generations is large, genetic algorithm may converge too slowly. On the contrary, if it is small, the algorithm may take less computing time but produce worse solutions. Genetic algorithms usually adopt 100–200 solutions over 100–500 generations.^21,22 Although not shown in this article, we conducted several preliminary experiments varying population size and the number of generations to find better parameter settings. When population size is 100 and the number of generations is 100, we could reach to an appropriate trade-off point.

Calculation of parallel A* algorithm

A path search calculation, which is part of the fitness function, required the longest computing time in the CPU-based CCTV deployment simulation results. Figure 3 shows the configuration of the NVIDIA CUDA platform²³ block used for the parallelism of the path search and thread per block. The number of blocks and thread per block are set as shown in Table 3.

Figure 3.

Blocks and threads for detection of moving persons based on NVIDIA CUDA platform.

Table 3.

Blocks and threads used in detection of moving persons.

Classification	Dimension
Number of blocks per grid	$(x, y, z) = (100, 1, 1)$
Number of threads per block	$(x, y, z) = (10, 1, 1)$

A total of 100 blocks are set, 10 threads work on a block, and each thread is used to calculate a path search of pedestrians. Usually, the number of blocks and that of threads per block used in CUDA kernel function are set considering the number of CUDA cores and the amount of shared memory of the graphic card used in the experiments. In addition, various optimization algorithms are used to reduce computation time in GPU. We set the number of blocks and that of threads per block considering graphic card specifications given in Table 5, shared memory usage used in CUDA kernel function of Figure 4, and implementation simplicity. We set the number of threads per streaming multiprocessor (SM) to be 10 because the amount of shared memory used in the CUDA kernel function was large.

Figure 4.

Pseudocode of kernel function for POD calculation.

Implementation of parallel A* algorithm

Figure 4 shows a pseudocode to detect pedestrians who pass the RoI. Location information on all CCTVs deployed for all solutions were encoded in a 1D array; x represents the starting address of the x coordinate array, and y represents the starting address of the y coordinate array. dir refers to the starting address of the direction array in the direction the CCTV faces. cmap is the constant memory of the GPU used to transfer the topography information of Scenario 1 or Scenario 2. CCTV location and direction information is saved in the shared memory that has a quick processing speed. This information can be shared between threads that are in one block. This enables an efficient use of memory and an improvement in computing speed.

Topography information for the scenarios and CCTV deployment information are saved in the 2D array map. The path of pedestrian movement is searched and saved into the path in advance and then returned. The pedestrian moves according to the path information, and detection is determined by the deployed CCTV. Detection of the pedestrian is judged every time the pedestrian passes each cell. The value of the detected variable is increased by 1 when a pedestrian is detected, and the undetected variable is increased by 1 for accumulation when a pedestrian is not detected. The detection rate is calculated using equation (1).

Experiments

Experimental environments

The experiment is composed of various combinations with different types and numbers of CCTVs. To measure the detection rate of pedestrians, a boundary is formed by considering buildings and roads in a downtown environment. Optimizations of sensor and CCTV deployment maximize the detection rate of detectable entities and use fewer CCTVs. A fitness evaluation is performed by using the NVIDIA CUDA platform to enhance the computing speed, which is a disadvantage in metaheuristic algorithms.⁵

A statistical method is used on the experimental results of the CPU and GPU to compare and analyze detection rate values. The computing speeds of NVIDIA GeForce GTX graphic cards are compared. In addition, improvements in the kernel function are presented through an analysis of the kernel function using the NVIDIA Nsight profiler.^24,25

The C language and CUDA-C language based on NVIDIA CUDA platform 7.0 were used to develop 2D deployment simulators to establish the experimental environment. A CCTV deployment simulator and genetic algorithm based on CPU were implemented using the C language. A parallel CCTV deployment simulator was implemented using the CUDA-C language.

Table 4 shows information about the clock speed, the number of cores, and the amount of CPU memory used in the experiment. (These are commonly used in all of the experiments.) Table 5 lists information about the three types of graphic cards used in the experiment. The NVIDIA GeForce GTX 970 graphic card is mainly used, and the other two cards are used for comparative experiments. The number of CUDA cores and the L2 cache size are the items that have significant influence on enhancing the computing speed of the GPU specification. The L2 cache is used in shared memories and registers. Using the L2 cache can result in great enhancements in the performance of the memory approach speed. The L2 cache of the NVIDIA GeForce GTX 970 is more than three times larger than that of other GPUs, and the time for memory approach and copying can be greatly shortened if this is used effectively.

Table 4.

CPU specifications used in experiments.

CPU components	Intel Core 2 Duo
CPU clock (GHz)	3.0
CPU cores	2
Memory (MB)	4096

CPU: central-processing unit.

Table 5.

GPU specifications used in experiments.

GPU Components	GTX 970	GTX 770	GTX 560
GPU clock (MHz)	1317	1189	1620
CUDA cores	1664	1536	336
Memory clock rate (MHz)	3505	3505	2010
L2 cache size (kB)	1792	512	512
Global memory (MB)	4096	4096	2048

GPU: graphics-processing unit; CUDA: Compute Unified Device Architecture.

A significant reduction in computing time can be achieved when there are many tasks that must be processed at the same time and do not require high-performance operation. In addition, the computing speed can be improved by optimizing software aspects such as block and thread configurations that have high efficiency, using a random-number generator in the GPU environment,²⁰ and by avoiding or reducing remainder operations and division operations.

Results

The experiment was repeated 30 times, which is the minimum statistically significant number.¹⁹ Scenarios 1 and 2 were classified according to topography and were also classified according to the numbers and types of CCTVs. These are listed in Table 6. An analysis of the experiment results included a statistical analysis of the detection rate using 3σ, comparison of computing time, and an analysis of the kernel function using the NVIDIA Nsight profiler. The experiment resulted in an average of the optimum solutions of 30 experiments.

Table 6.

The number and types of CCTVs.

Instances	CCTV1	CCTV2	CCTV3
Instance 1	15	25	30
Instance 2	15	25	30
Instance 3	20	25	40

CCTV: closed-circuit television.

Statistical analysis of CCTV deployment

Table 7 lists the results of the experiment in Scenario 1. The averages of the NVIDIA GeForce GTX 970 and CPU were similar to the analysis of the 95% confidence interval in Figure 5(a). The computing speed of the NVIDIA GeForce GTX 970 was approximately 3.7 times faster.

Table 7.

Test results of CCTV Scenario 1.

	Instance	Average (%)	Standard deviation (%)	GA (s)	Instance (s)	POD (s)	Random (%)
GTX 970	Instance 1	86.2	3.1	3.694	3.651	3.268	56.9
	Instance 2	93.0	1.9	3.716	3.653	3.276	68.9
	Instance 3	96.1	1.2	3.749	3.668	3.284	76.4
CPU	Instance 1	84.9	2.4	12.688	12.648	N/A	55.8
	Instance 2	91.8	1.9	13.300	13.241	N/A	67.2
	Instance 3	96.0	1.3	13.751	13.674	N/A	75.0

CCTV: closed-circuit television; GA: genetic algorithm; POD: probability of detection.

Figure 5.

95% confidence interval graph of CCTV deployment: (a) 95% confidence interval graph of Scenario 1 and (b) 95% confidence interval graph of Scenario 2.

Table 8 lists the results of the experiment in Scenario 2. The averages of the NVIDIA GeForce GTX 970 and CPU were similar to the analysis of the 95% confidence interval in Figure 5(b). The computing speed of the NVIDIA GeForce GTX 970 was approximately 4.2 times faster. Figure 6 shows the results of a comparison between computing time according to scenario and instance.

Table 8.

Test results of CCTV Scenario 2.

	Instance	Average (%)	Standard deviation (%)	GA (s)	Fitness (s)	POD (s)	Random (%)
GTX 970	Instance 1	85.1	2.9	4.061	4.016	3.636	51.9
	Instance 2	91.5	1.9	4.088	4.025	3.646	59.3
	Instance 3	94.9	1.5	4.116	4.035	3.651	68.7
CPU	Instance 1	84.8	2.9	16.096	16.056	N/A	49.3
	Instance 2	90.1	2.2	16.880	16.820	N/A	58.7
	Instance 3	94.2	1.6	17.041	16.966	N/A	69.0

CCTV: closed-circuit television; GA: genetic algorithm; POD: probability of detection.

Figure 6.

Comparison between CPU and GPU computing time according to scenarios and the number of CCTVs: (a) CPU versus GTX 970 (Scenario 1) and (b) CPU versus GTX 970 (Scenario 2).

The computing time of the NVIDIA GeForce GTX 560 and NVIDIA GeForce GTX 770 was similar to the results listed in Tables 9 and 10. The computing time of NVIDIA GeForce GTX 970 was decreased by approximately half. In addition, the NVIDIA GeForce GTX 970, which has an L2 cache with a capacity that was three times larger, showed relatively better performance.

Table 9.

Comparison of Test results according to GPU types for CCTV Scenario 1.

	Instance	Average (%)	Standard deviation (%)	GA (s)	Instance (s)	POD (s)	Random (%)
GTX 970	Instance 1	86.2	3.1	3.694	3.651	3.268	56.9
	Instance 2	93.0	1.9	3.716	3.653	3.276	68.9
	Instance 3	96.1	1.2	3.749	3.668	3.284	76.4
GTX 770	Instance 1	86.2	2.7	7.124	7.079	6.719	57.3
	Instance 2	92.6	1.8	7.055	6.990	6.628	67.5
	Instance 3	96.3	1.0	7.051	6.968	6.600	76.2
GTX 560	Instance 1	87.0	2.1	6.946	6.897	6.420	56.9
	Instance 2	92.7	2.1	7.000	6.935	6.453	67.4
	Instance 3	95.1	1.8	8.589	8.506	8.030	69.1

CCTV: closed-circuit television; GA: genetic algorithm; POD: probability of detection.

Table 10.

Comparison of Test results according to GPU types for CCTV Scenario 2.

	Instance	Average (%)	Standard deviation (%)	GA (s)	Fitness (s)	POD (s)	Random (%)
GTX 970	Instance 1	85.1	2.9	4.061	4.016	3.636	51.9
	Instance 2	91.5	1.9	4.088	4.025	3.646	59.3
	Instance 3	94.9	1.5	4.116	4.035	3.651	68.7
GTX 770	Instance 1	85.7	3.1	7.939	7.894	7.533	51.0
	Instance 2	91.5	2.1	7.873	7.807	7.442	60.4
	Instance 3	94.9	1.9	7.830	7.748	7.384	68.9
GTX 560	Instance 1	84.4	3.2	8.483	8.437	7.972	50.6
	Instance 2	91.4	1.8	8.545	8.479	8.005	59.0
	Instance 3	95.0	1.8	8.597	8.514	8.035	69.6

CCTV: closed-circuit television; GA: genetic algorithm; POD: probability of detection.

Figure 7 shows the comparative experiment of computing time using three types of NVIDIA GeForce GTX graphic cards on Scenario 1. An overall even difference in computing time is shown for Instances 1 and 2. The time width of the operation in Instance 3 increases, and it is shown that the GPU operation time has significant influence when there are more than 100 deployed CCTVs.

Figure 7.

Comparison of computing time of GPUs according to the number of CCTVs in Scenario 1.

Figures 8 and 9 show the results of Instance 3 experiment and examples of CCTV deployment for Scenarios 1 and 2, respectively. The CCTV detection directions of the initial solutions in Figures 8(a) and 9(a) are different, but the cases of CCTV detection direction facing the direction of pedestrian movement increased slightly in the optimum solutions in Figures 8(b) and 9(b). In Figures 8 and 9, CCTV deployment location was concentrated along the path of pedestrian movement.

Figure 8.

Optimized CCTV deployment of Scenario 1: (a) initial solution and (b) improved solution.

Figure 9.

Optimized CCTV deployment of Scenario 2: (a) initial solution and (b) improved solution.

CCTV kernel function analysis

An analysis of the kernel function in the three aspects of occupancy, instruction statistics, and issue efficiency were performed using NVIDIA Nsight profiler. A total of 100 blocks were allocated in the implemented kernel function, and 10 threads per block were set.

In the occupancy analysis, the number of warps that can be used when there are 10 threads per block is shown in the “Varying Block Size,” and low values are shown proportionately to the number of blocks and threads per block that were configured. An enhancement in performance is expected by increasing the number of threads per block. A total of 64 registers per thread are used in the “Varying Register Count.” It is considered that to enhance performance, an adjustment procedure is required to increase the number of registers to use per thread. The number of warps when using a shared memory of 1200 bytes per block is shown in the “Varying Shared Memory Usage.” It is considered that an enhancement in performance can be expected by increasing the used shared memory per block.

In the analysis of instruction statistics, the average of total SMs in the “SM Activity” is 93.50%, but the overall SM activity is unbalanced in that the exculpation time is different or the number of scheduled blocks per SM is variable. Thus, different activities in each SM, that is a Tail Effect, have an unfavorable influence on the total execution time. The balance level of workload distribution is also not effective in the “Instructions per Warp.” This can be considered as an effect of cases that have conditional statements depending on the block index, or when there is a high workload only in a few blocks.

The overall low performance appears to be a result of the concentrated operations in a few CUDA cores. In addition, the path-searching part of the A* algorithm is processed in order as in the CPU experiment. A considerable enhancement in performance is expected by improving the parallelism level in the path-searching part.

The analysis of issue efficiency highlights an efficiency issue of the kernel function in CCTV deployment. The number of warps per SM is considerably small in the “Warps per SM,” and the value of “No Eligible” in the “Warp Issue Efficiency” is very high. This shows that the efficiency of the kernel code is not high. The “No Eligible” part is related to the contents in the “Issue Stall Reasons,” and the memory dependency value in the “Issue Stall Reasons” is high, with a value of 70.25%. The values show a high possibility that the requested resource cannot be used or is being completely used by another process or too many processors have requested the resource. It is considered that this problem can be improved at some point by memory alignment and optimization of the memory approach pattern.

Analysis

A comparison of detection performance, comparison of computation time, and analysis of the kernel function using the NVIDIA GeForce GTX 970 were analyzed in the experiment. The simulator used in the experiment is classified into CPU-based sequential processing simulators and GPU-based parallel processing simulators. The C language was used to develop the sequential processing simulator, and a CUDA-C language based on the NVIDIA CUDA platform 7.0 was used to develop the parallel processing simulator.

It was found in the analysis (on the 95% confidence interval graph using 3σ) that there was no significant difference in detection rate between the experiment results from the CPU and NVIDIA GeForce GTX 970. With regard to computing speed, the NVIDIA GeForce GTX 970 experiment was approximately four times faster than the CPU experiment.

Concluding remarks

A generational genetic algorithm was used to optimize 2D deployment in the boundary coverage of this study. The C language and a CUDA-C language were used to develop the sequential processing simulator and parallel processing simulator for the evaluation of fitness.

The CCTV deployment issues in the two downtown environments were used as the case studies, and the results of various experiments using these cases were analyzed. An experimental analysis consisted of a statistical analysis on the detection rate, a comparison between the computing time of the sequential processing experiment and parallel processing experiment, and an analysis of the CUDA kernel function.

The detection rates of the CPU experiment and the experiment using the NVIDIA GeForce GTX 970 graphic card were similar and showed a considerable improvement in computing time. Problems with the kernel function were investigated, and an analysis of future improvements was conducted.

Like other precedent papers, a direct comparison of values with the results of other research is difficult. However, a new method for the optimization of 2D deployment was presented in from a methodological aspect and is expected to be used in various military, security, and invasion detection fields.

It is considered that more realistic experiment results can be obtained if various metaheuristic-based algorithms are used by finding parameter combinations of high efficiency. It is required to find efficient GPU parameter settings to reduce computing time. It is also considered that the efficiency of block settings and memory usage must be increased in order to enhance the parallelism level of the simulators.

Footnotes

Handling Editor: Bin Gao

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Wonkwang University in 2017.

References

Mostafaei

. Stochastic barrier coverage in wireless sensor networks based on distributed learning automata. Comput Commun 2015; 55: 51–61.

Cardei

. Coverage in wireless sensor networks. In: Ilyas

Mahgoub

(eds) Handbook of sensor networks. Boca Raton, FL: CRC Press, 2004, pp.1–18.

Wang

. Build intrusion barriers. In: Wang

(ed.) Coverage control in sensor networks. London: Springer, 2010, pp.175–186.

Tao

Tang

Zhang

et al . Strong barrier coverage in directional sensor networks. Comput Commun 2012; 35(8): 895–905.

Garey

Johnson

. Computers and intractability: a guide to the theory of NP-completeness. San Francisco, CA: Freeman, 1979.

Holland

. Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. Cambridge, MA: The MIT Press, 1992.

Yoon

Kim

Y-H

. An efficient genetic algorithm for maximum coverage deployment in wireless sensor networks. IEEE T Cybernetics 2013; 43(5): 1473–1483.

Zhang

. COMP 4905-Honours Project. Carleton University, 2013.

Wang

Cao

. Barrier coverage in camera sensor networks. In: Proceedings of the twelfth ACM international symposium on mobile ad hoc networking and computing, Paris, 17–19 May 2011, Article no. 12. New York: ACM.

10.

Navin

. Distributed genetic algorithm to solve overage problem in wireless camera-based sensor networks. Res J Recent Sci 2015; 4(12): 106–109.

11.

Shih

Chou

Liu

et al . On barrier coverage in wireless camera sensor networks. In: 2010 24th IEEE international conference on advanced information networking and applications (AINA), Perth, WA, Australia, 20–23 April 2010, pp.873–879. New York: IEEE.

12.

Navin

Asadi

Pour

et al . Solving coverage problem in wireless camera-based sensor networks by using genetic algorithm. In: 2010 international conference on computational intelligence and communication networks (CICN), Bhopal, 26–28 November 2010, pp.226–229. New York: IEEE.

13.

Yang

et al . Minimum camera barrier coverage in wireless camera sensor networks. In: 2012 Proceedings IEEE, INFOCOM, Orlando, FL, 25–30 March 2012, pp.217–225. New York: IEEE.

14.

Costa

Guedes

. The coverage problem in video-based wireless sensor networks: a survey. Sensors 2010; 10(9): 8215–8247.

15.

Tabu search, https://en.wikipedia.org/wiki/Tabu_search

16.

Glover

. Future paths for integer programming and links to artificial intelligence. Comput Oper Res 1986; 13(5): 533–549.

17.

Gradient decent, https://en.wikipedia.org/wiki/Gradient_descent

18.

Deift

Zhou

. A steepest descent method for oscillatory Riemann-Hilbert problems. Asymptotics for the MKdV equation. Ann Math 1993; 137(2): 295–368.

19.

Hogg

Tanis

. Probability and statistical inference. New York: Macmillan, 2014.

20.

Park

Miller

. Random number generators: good ones are hard to find. Commun ACM 1988; 31(10): 1192–1201.

21.

Deb

Pratap

Agarwal

et al . A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE T Evolut Comput 2002; 6(2): 182–197.

22.

Horn

Nafpliotis

Goldberg

. A niched Pareto genetic algorithm for multiobjective optimization. In: Proceedings of the first IEEE conference on evolutionary computation, 1994. IEEE world congress on computational intelligence, Orlando, FL, 27–29 June 1994, pp.82–87, New York: IEEE.

23.

Krömer

Snåšel

Platoš

et al . Many-threaded implementation of differential evolution for the CUDA platform. In: Proceedings of the 13th annual conference on genetic and evolutionary computation, Dublin, 12–16 July 2011, pp.1595–1602. New York: ACM.

24.

NVIDIA Nsight Visual Studio edition 4.5 user guide, http://docs.nvidia.com/nsight-visual-studio-edition/4.5/Nsight_Visual_Studio_Edition_User_Guide.htm

25.

NVIDIA. Compute Unified Device Architecture (CUDA) C programming guide. NVIDIA Corporation, 2014.

Coverage problem in camera-based sensor networks using the CUDA platform

Abstract

Keywords

Introduction

CCTV deployments

CCTV types and detection capabilities

Calculation of detection ratio in each cell

Genetic algorithm

Parallel CCTV deployments

Calculation of parallel A* algorithm

Implementation of parallel A* algorithm

Experiments

Experimental environments

Results

Statistical analysis of CCTV deployment

CCTV kernel function analysis

Analysis

Concluding remarks

Footnotes

Declaration of conflicting interests

Funding

References