Abstract
While the theoretical foundation of the optimal camera placement problem has been studied for decades, its practical implementation has recently attracted significant research interest due to the increasing popularity of visual sensor networks. The most flexible formulation of finding the optimal camera placement is based on a binary integer programming (BIP) problem. Despite this flexibility, most of the resulting BIP problems are NP-hard, and formulations of any reasonable size are not amenable to exact solutions. There exists a myriad of approximate algorithms for BIP problems, but their applicability, efficiency, and scalability in solving camera placement are poorly understood. Thus, we develop a comprehensive framework for comparing the merits of a wide variety of approximate algorithms in solving optimal camera placement problems. We first present a general approach to adapting these problems into BIP formulations. Then, we demonstrate how they can be solved using different approximate algorithms including greedy heuristics, Markov-chain Monte Carlo, simulated annealing, and linear and semidefinite programming relaxations. The accuracy, efficiency, and scalability of each technique are analyzed and compared in depth. Extensive experimental results are provided to illustrate the strengths and weaknesses of each method.
1. Introduction
Due to the significant progress in visual sensor technology, wireless communication, and pattern recognition algorithms, the deployment of wide-area visual sensor networks has become practical and cost-effective. These networks have a wide range of commercial and military applications from video surveillance to smart home and from traffic monitoring to antiterrorism. A proper placement of different visual sensors in the target environment is an important design problem as it has a direct impact on both the cost and performance of the network. The number of sensors, the dedicated communication networks, and the proper routing of power supply can contribute a significant portion of the overall construction cost. The placement of the sensors also determines the size and shape of the coverage area as well as the visual appearance of all surveillance subjects, which affect the performance of all subsequent computer vision and pattern recognition tasks.
However, even with decades of study in camera placement design, the most ambitious goal of designing a universal camera network configuration tool has not yet been achieved. The difficulties come from a variety of factors. First, cameras are vulnerable to occlusions by both static and dynamic objects. A typical wide-area indoor or outdoor environment is often characterized by complicated topologies, stringent placement constraints, and a constant flux of occupant or vehicular traffic. A decent configuration tool must take into account all these factors so as to minimize the amount of occlusion. Second, different applications impose very different performance objectives and requirements on camera networks. Simply maximizing the coverage is no longer adequate for modern applications. For example, object identification or biometric applications may require minimum object size and specific pose to ensure proper detection. Traffic measurement applications may require observation of full trajectories or proper segmentation of pedestrians. 3D reconstruction or distributed camera hand-off algorithms require multiple camera views on the same object. Resource constraints such as number of cameras, types of cameras, connectivity between neighboring cameras, power consumption, and camera movement controls are routinely used to reflect limitations in realistic deployment. A universal camera placement framework must be able to accommodate all these sometimes contradictory requirements in a single optimization framework.
Classical approaches formulate the camera placement problem as a geometry problem over the continuous space of the physical environment and camera network configuration. With the tremendous complexity in modeling camera visibility, environment topology, and the disparate application requirements mentioned above, it is difficult if not impossible to capture all aspects of the problem via any continuous-domain approach. As such, most camera placement research resorts to discrete approaches, where the physical environment and the camera network configuration search space are first discretized. A binary variable is defined for each point in the search space to indicate the possible presence of a camera at that location and pose. The totality of these camera variables defines the configuration of a camera network, which in turn determines the visibility binary variable at each point of the physical environment. Application-specific constraints and objective functions can then be defined based on these variables to formulate a binary integer programming (BIP) optimization problem.
While the BIP formulation provides great flexibility in modeling the camera placement problem, BIP optimization problems are notoriously difficult to solve in practice. Many BIP formulations of camera placement are NP-complete, and exact solutions are too complex to obtain even for small problems. As a result, a myriad of approximate algorithms have been applied to solving BIP-based camera placement problems. However, many of the proposed approaches [1–8] are customized for specific applications, and a fair evaluation of different approximate algorithms on the general camera placement problem remains elusive.
In this paper, we present the BIP formulation for a wide variety of camera placement problems and provide a comprehensive framework to study various approximation algorithms in solving them. Specifically, we consider two general camera placement problems: the MIN formulation, in which the camera resource is minimized for a target performance, and the FIX formulation, in which the target performance is maximized under a resource constraint. Many other commonly used constraints in camera network design are also discussed. To solve the BIP-based camera placement problems, we have applied a spectrum of approximate algorithms: the greedy method, greedy heuristics, Markov-chain Monte Carlo, simulated annealing, linear programming relaxation, and semidefinite programming relaxation. The accuracy, efficiency, and scalability of each technique are analyzed and compared in depth. Extensive experimental results are also provided to illustrate the strengths and weaknesses of each method. To the best of our knowledge, this is the first comprehensive comparison of various approximate algorithms in solving the camera placement problems. Compared with an earlier version of this work [9], we have further expanded the BIP framework to handle a richer set of objective functions and constraints and included more experimental results to compare various approximate algorithms.
The paper is organized as follows. First, we give a brief literature review in Section 2. Section 3 defines the two camera placement problems and formulates them via a BIP framework to incorporate some of the most commonly used objective functions and constraints. In Section 4, we introduce different approximate algorithms and present details on the adaptation of each algorithm in solving the BIP camera placement problems. Extensive experimental results are presented in Section 5 to compare their performances. We conclude our paper in Section 6.
2. Related Work
The problem of optimal camera placement has been studied for decades. The earliest investigation can be traced back to the “art gallery problem” in computational geometry. This problem is the theoretical study of how to place cameras in an arbitrarily shaped polygon so as to cover the entire area [10–12]. It covers a set of important topics in computational geometry including Delaunay triangulation and vertex covering. While Chvátal has shown in [13] that the upper bound on the number of cameras is ⌊n/3⌋ for a simple polygon with n vertices, finding the minimum number of cameras for a given polygon is NP-hard.
Recent widespread deployments of video camera networks, however, have turned camera placement from a problem of theoretical interest into an important tool that can significantly improve the performance, coverage, and cost-effectiveness of the network. While the original “art gallery” problem was formulated in continuous 2D or 3D space, the complexity of modeling visibility in continuous space increases dramatically when practical considerations such as orientation and multiple views are incorporated. Sophisticated continuous-space models pertinent to visual sensor networks have recently been proposed in [18, 19], but the complexity of their visibility models greatly limits their practical usage in wide-area surveillance environments.
As such, the majority of the recent approaches consider the problem entirely in the discrete domain—instead of optimizing continuous functionals using the calculus of variations, discrete-domain approaches quantize the search space into finitely many candidate positions and search for the best configurations that optimize a target objective function. This strategy naturally leads to combinatorial problems with the camera, environment, and traffic models encoded in different integral constraints and objective functions. Horster and Lienhart [1] made an early effort to use generic BIP to formulate the discrete camera placement problem. Our earlier work in [20] followed a similar optimization strategy but with a more sophisticated probabilistic model to capture the uncertainty of object orientation and mutual occlusion. Similar works can be found in [6, 8]. In Ercan's formulation [8], the objective contains quadratic terms, so quadratic programming was used instead of BIP, which requires all terms to be linear.
While most of the formulations result in NP-hard problems, a myriad of practical solutions including binary integer programming (BIP) solvers, the greedy approach, greedy heuristics, Markov-chain Monte Carlo (MCMC), and semidefinite programming (SDP) relaxations have been proposed [1–8]. Each method has its own merits in terms of ease of formulation, computational complexity, worst- or average-case performance, scalability, and so forth. To further complicate matters, different researchers often tackle slightly different objective functions and design specific approximation techniques accordingly. To the rest of the vision community, it is difficult to discern the merits of different approaches and to identify the appropriate solution for a specific placement problem at hand. Our goal is to lay out the pros and cons of using different approximate algorithms in solving the camera placement problem. Table 1 lists the approximate algorithms tested in this paper and their usage in existing works.
Various approaches for camera placement.
3. Camera Placement in General
In this section, we define the two camera placement problems and formulate them via a BIP framework to incorporate some of the most commonly used objective functions and constraints.
3.1. BIP Formulation
Although there is a myriad of camera placement problems in the literature, the fundamental objectives of these problems almost always fall into two broad categories which we refer to as the MIN and FIX problems.
The goal of the MIN problems is to minimize the number of cameras such that the target coverage p can be achieved subject to other constraints. The goal of the FIX problems is to maximize the coverage subject to a fixed number of cameras m and other application specific constraints.
Both problems can be tackled in the following fashion. First, the space of possible camera configurations, including locations, yaw, pitch angles, and camera types, can be converted into discrete points by either random [1] or uniform sampling [6, 20]. The target space of the camera network can also be discretized into a finite space, which can be the 2D [7] or 3D [18], object positions, object orientations [21], motion paths [18] or even a combination of all the above spaces.
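To make the discretization concrete, the following sketch builds a target-by-camera visibility matrix from sampled camera poses and target points. The pinhole-style visibility test, field of view, and range used here are illustrative assumptions, not the paper's exact model:

```python
import math

def build_visibility(cameras, targets, fov_deg=60.0, max_range=5.0):
    """Return V with V[i][j] = 1 if camera j sees target i.

    cameras: list of (x, y, yaw_radians); targets: list of (x, y).
    A target is visible when it lies within max_range and within half
    the field of view on either side of the camera's yaw.
    (Occlusion by obstacles is omitted in this sketch.)
    """
    half_fov = math.radians(fov_deg) / 2.0
    V = []
    for (tx, ty) in targets:
        row = []
        for (cx, cy, yaw) in cameras:
            dx, dy = tx - cx, ty - cy
            dist = math.hypot(dx, dy)
            # Smallest absolute angle between camera axis and target bearing.
            ang = abs((math.atan2(dy, dx) - yaw + math.pi) % (2 * math.pi) - math.pi)
            row.append(1 if dist <= max_range and ang <= half_fov else 0)
        V.append(row)
    return V

# A camera at the origin facing +x sees the target ahead but not the one behind.
V = build_visibility([(0.0, 0.0, 0.0)], [(2.0, 0.0), (-2.0, 0.0)])
```

Each row of V corresponds to one binary visibility variable per candidate camera pose, which is exactly the data a BIP solver consumes.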
We denote the discretized camera space as
We assume that the coverage function and all constraints are linear in
3.2. Common Constraints Used in Camera Planning
3.2.1. Visibility Constraint
To determine if a particular target grid point
In this example, we only select the first camera, so
For
Similarly, for
It is a straightforward matter to extend the basic case of visibility to the multiple-coverage requirement, in which an object is considered visible when it is seen by at least
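In matrix form, the multiple-coverage requirement is the linear constraint that each target's visibility row, multiplied by the 0/1 camera-selection vector, is at least k. A minimal sketch with toy data (not taken from the paper):

```python
def k_covered(V, x, k=2):
    """Check the k-coverage constraint sum_j V[i][j] * x[j] >= k for every
    target i, i.e. each target is seen by at least k selected cameras.
    V: visibility matrix (targets x cameras); x: 0/1 camera selection."""
    return all(sum(vij * xj for vij, xj in zip(row, x)) >= k for row in V)

# Toy instance: three targets, three candidate cameras.
V = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
print(k_covered(V, [1, 1, 1], k=2))  # every target seen twice -> True
print(k_covered(V, [1, 1, 0], k=2))  # target 1 seen only once -> False
```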
3.2.2. Connectivity
Connectivity requirements impose certain communication limitations on how the camera nodes can communicate with each other. This is a typical requirement in wireless camera networks, where only nearby camera nodes can communicate with each other due to power constraints. To model such a neighborhood relationship, we introduce an adjacency matrix A with each entry
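A sketch of building such an adjacency matrix from a communication radius (the 2D positions and the distance threshold are illustrative assumptions):

```python
import math

def adjacency(positions, radius):
    """A[p][q] = 1 when candidate cameras p and q are within communication
    range of each other. positions: list of (x, y); radius: maximum link
    distance (an assumed power-driven limit)."""
    n = len(positions)
    A = [[0] * n for _ in range(n)]
    for p in range(n):
        for q in range(p + 1, n):
            if math.dist(positions[p], positions[q]) <= radius:
                A[p][q] = A[q][p] = 1
    return A

A = adjacency([(0, 0), (3, 0), (10, 0)], radius=4.0)
```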
3.2.3. Localization Error
Localization error indicates the possible error in estimating the location of an object. This function is not isotropic due to the nature of perspective projection. In [8], the authors model the 2D localization error of tracking a single nonoccluded object, given its first and second order statistics, as the minimum mean-square error of the linear estimate based on the perspectively distorted images at the camera. In the 2D case, this location error is a function of the yaw angle
The linearization proceeds as follows. First, the objective function can be expanded into the following form:
3.2.4. Tracking Performance
Tracking is the process of locating moving objects through a camera network in a continuous fashion. The crucial element for tracking is occlusion handling. For camera placement or selection, this means that the continuous time intervals during which the object is not observed should be minimized. To incorporate such a constraint, we need to introduce an additional path variable
3.2.5. Partial Coverage
In some computer vision applications, a simple binary visibility metric is inadequate. For instance, in facial recognition systems, a lower-resolution face image is more likely to produce an erroneous match than a high-resolution one. Such applications require an objective function that can gracefully degrade from a fully visible state to a complete miss. A fuzzy model has been introduced in [25] for this purpose, and it can be easily incorporated into our BIP model. Let
3.2.6. Group Constraint
A group constraint is used to impose different requirements on a subset of the parameter space. For instance, the control of a PTZ camera can be formulated as a group constraint over binary camera variables with known position but unknown pose. Suppose the spatial positions of the cameras have already been determined and the goal is to determine the optimal pose for the visibility task. At a specific camera location
3.2.7. Placement of Stereo Sensors
BIP can also be used to model the planning of stereo sensors, in which we try to minimize the number of cameras while ensuring that every target is covered by a stereo pair. We provide an alternative BIP formulation to the ones proposed in [4]. The visibility variable
4. Approximate Solutions to BIP Camera Placement Problems
The number of variables in camera placement problems is directly proportional to the volume of the search space and is typically very large even for simple environments. Although there is optimization software capable of solving BIP problems exactly, it is in general impractical or even impossible to obtain an exact solution for any reasonable-size camera placement problem. In this section, we investigate several approximate algorithms for camera placement problems, and in Section 5 we will show by simulation how close the approximate solutions are to the exact ones.
4.1. Greedy Method
The greedy method is probably the most intuitive method in solving camera placement problems. The basic idea is that instead of seeking a global optimum by checking all possible configurations, we choose one camera that optimizes the objective value at each step. The advantages of the greedy algorithm include a simple implementation and tremendous efficiency—most greedy algorithms have low-order polynomial running time.
Algorithm 1 (greedy method for the MIN problem, outline): given the other constraints S, the target mean visibility p, the maximum number of cameras m, the minimum camera coverage, and the set of target grid points in W, repeatedly add the camera that most improves the objective until the target visibility p is achieved or m cameras have been placed; output the selected camera configuration.
In fact, the greedy algorithm has deeper theoretical motivations than intuition. In combinatorial optimization, there is a well-studied class of problems known as the “set cover” problem [27] defined as follows: given a finite set X and a family ℱ of subsets of X, a cover is a subset of ℱ whose union is X. The set covering optimization problem is to find the cover that uses the fewest elements in ℱ.
Feige has shown in [28] that the greedy algorithm is the best polynomial-time approximation for the set covering problem under the worst-case analysis if NP does not admit quasi-polynomial-time algorithms.
The case for using greedy algorithms in solving the FIX camera placement problems is more complicated. For the simple case of single camera coverage, that is,
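The greedy selection for the MIN problem can be sketched as follows, in set-cover style: keep adding the camera with the largest marginal coverage gain. The stopping rule based on a target fraction p is a simplification of the paper's full constraint set:

```python
def greedy_min(V, p=1.0):
    """Greedy for the MIN problem: repeatedly add the camera that covers
    the most still-uncovered targets, until a fraction p of the targets is
    covered. V: visibility matrix (targets x cameras).
    Returns the chosen camera indices in the order they were added."""
    n_targets, n_cams = len(V), len(V[0])
    uncovered = set(range(n_targets))
    chosen = []
    while len(uncovered) > (1.0 - p) * n_targets:
        # Pick the camera with the largest marginal gain.
        best = max(range(n_cams),
                   key=lambda j: sum(1 for i in uncovered if V[i][j]))
        gain = sum(1 for i in uncovered if V[i][best])
        if gain == 0:  # the remaining targets are unreachable
            break
        chosen.append(best)
        uncovered -= {i for i in uncovered if V[i][best]}
    return chosen

V = [[1, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]
print(greedy_min(V))  # camera 0 covers targets 0-1, camera 2 covers target 2
```

This is the (1 + ln n)-approximation algorithm for set cover specialized to the coverage objective; each iteration costs O(targets x cameras).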
4.2. Greedy Heuristics
As mentioned in the previous section, the objective functions of some camera placement problems cannot be computed by adding one camera at a time. Nevertheless, we can still follow the idea of finding local maximum/minimum at each iteration by maintaining a constant number of cameras at every iteration. Algorithm 2 shows such a greedy heuristic for the FIX problem.
Algorithm 2 (greedy heuristic for the FIX problem, outline): given the other constraints S, the objective function, and the maximum number of cameras m, start from an initial set of m cameras; at each iteration, select the pair consisting of one placed and one unplaced camera whose exchange most improves the objective and swap them with each other; break when no exchange improves the objective; output the final camera configuration.
In this greedy heuristic, we have a well-defined objective function in each iteration as the maximum number of cameras is always used. However, there is no sensible way to choose one set of initial
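A common workaround for the initialization issue is to run the swap heuristic from several random starting sets and keep the best local optimum. A sketch, in which the restart count and the coverage objective are illustrative choices standing in for the paper's general objective:

```python
import random

def greedy_swap_fix(V, m, restarts=5, seed=0):
    """Swap-based greedy heuristic for the FIX problem: keep exactly m
    cameras, and exchange a selected camera for an unselected one whenever
    the swap increases coverage; restart from several random initial sets
    to reduce the chance of ending in a poor local optimum."""
    rng = random.Random(seed)
    n_cams = len(V[0])

    def coverage(sel):
        return sum(1 for row in V if any(row[j] for j in sel))

    best_sel, best_val = None, -1
    for _ in range(restarts):
        sel = set(rng.sample(range(n_cams), m))
        improved = True
        while improved:
            improved = False
            for j in list(sel):
                for k in range(n_cams):
                    if k in sel:
                        continue
                    cand = (sel - {j}) | {k}
                    if coverage(cand) > coverage(sel):
                        sel, improved = cand, True
                        break
                if improved:
                    break
        if coverage(sel) > best_val:
            best_sel, best_val = sel, coverage(sel)
    return sorted(best_sel), best_val
```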
4.3. Naïve Sampling Methods
Although the deterministic greedy algorithm is very efficient, we cannot improve the result once a local optimum is achieved due to its deterministic nature. Sampling methods allow a definitive advantage over deterministic approaches—one can always improve the results by sampling more points from the distribution.
The simplest version of a random sampler is to uniformly sample points from the camera space. One can terminate the algorithm when a good enough solution is obtained or the maximum number of iterations is reached. However, the large search space makes it difficult to sample even a near-optimal solution in a reasonable running time. As such, this naive version of random sampling is rarely useful.
A better scheme should relate the objective value of the sampled point to the probability of it being sampled. By assigning a higher probability to sampled points with better objective values, we have a higher chance to sample points from the distribution that are close to the global optimum.
We denote S as the set of all possible combinations of cameras subject to all constraints and
We start with the most straightforward one, which assumes that different camera positions are independent from each other and that the sampling probability at each camera position is directly proportional to the number of target positions it can observe. The first assumption provides an effective means to focus on one camera at a time, and the second assumption naively relates the overall objective value to the coverage of a single camera. While these two assumptions provide a crude approximation to (21), they produce far better samples than the uniform distribution and admit a very efficient implementation. The details of the algorithm are provided in Algorithm 3.
Algorithm 3 (naive sampling, outline): set the sampling distribution P by calculating, for each candidate camera in W, the number of targets it can observe; repeatedly sample one camera c from W according to P until a full configuration is formed, keeping the best configuration encountered; output that configuration.
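A sketch of this coverage-proportional sampling idea, drawing each candidate camera with probability proportional to the number of targets it sees (the trial count is an arbitrary choice):

```python
import random

def sample_fix(V, m, trials=200, seed=0):
    """Naive importance sampling for the FIX problem: each candidate camera
    is drawn with probability proportional to the number of targets it
    observes, m distinct cameras are sampled per trial, and the
    best-scoring trial is kept."""
    rng = random.Random(seed)
    n_cams = len(V[0])
    # Per-camera weight = how many targets that camera can see.
    weights = [sum(row[j] for row in V) for j in range(n_cams)]

    def coverage(sel):
        return sum(1 for row in V if any(row[j] for j in sel))

    best_sel, best_val = None, -1
    for _ in range(trials):
        sel = set()
        while len(sel) < m:
            sel.add(rng.choices(range(n_cams), weights=weights)[0])
        val = coverage(sel)
        if val > best_val:
            best_sel, best_val = sel, val
    return sorted(best_sel), best_val
```

Note that the sketch assumes every candidate camera sees at least one target; zero-weight candidates would otherwise never be drawn.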
We will show in Section 5 that Algorithm 3 provides decent results with complexity comparable to the greedy approach. On the other hand, the assumptions used in Algorithm 3 are very strong and are certainly not applicable in many situations. To cope with arbitrary probability functions, a more general approach is to use MCMC sampling and its many variants. In the next section, we adopt the Metropolis algorithm [33, ch. 5], which samples points in proportion to their probability without calculating the normalization factor.
4.4. MCMC Sampling
The classical Metropolis algorithm follows three simple steps in determining the next sample point to explore: (1) makes a small perturbation around the current sample point, (2) calculates the gain of this perturbation, and (3) decides whether to accept the perturbation by sampling a random number and comparing with the gain. Algorithm 4 describes this process in solving the FIX problem. In order to conform to the notations typically used in optimization literature, we redefine the probability function in (21) as follows:
Algorithm 4 (Metropolis sampling for the FIX problem, outline): set an initial configuration of m cameras; at each iteration, propose a random exchange of a placed camera with an unplaced one, draw a random number u from the uniform distribution on [0, 1], and accept the exchange if u does not exceed the acceptance probability determined by the objective gain; output the best configuration sampled.
Algorithm 4 is very similar to Algorithm 2 except for a simple change in sampling strategy; instead of always exchanging with the camera that maximizes the objective function, it chooses a random candidate to proceed. The probability of selecting this candidate is proportional to the amount by which the objective value of the random candidate exceeds that of the current choice. Such a sampling strategy prevents the algorithm from being trapped at local optima and admits random candidates that can explore the rest of the search space. Also, the algorithm can be adapted by changing the perturbation range—one can simply exchange many cameras at a time or only allow switching with cameras nearby.
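A sketch of this Metropolis swap sampler with a fixed temperature T. The acceptance rule min(1, exp(delta/T)) is the standard Metropolis form; the coverage objective stands in for the paper's general objective:

```python
import math
import random

def metropolis_fix(V, m, iters=2000, T=1.0, seed=0):
    """Metropolis sampler for the FIX problem: propose swapping one
    selected camera for a random unselected one and accept with
    probability min(1, exp(delta / T)), where delta is the coverage gain.
    T is held fixed here (annealing would decrease it over time)."""
    rng = random.Random(seed)
    n_cams = len(V[0])

    def coverage(sel):
        return sum(1 for row in V if any(row[j] for j in sel))

    sel = set(rng.sample(range(n_cams), m))
    best_sel, best_val = set(sel), coverage(sel)
    for _ in range(iters):
        out = rng.choice(sorted(sel))
        inn = rng.choice([j for j in range(n_cams) if j not in sel])
        cand = (sel - {out}) | {inn}
        delta = coverage(cand) - coverage(sel)
        # Always accept improvements; accept worsening moves with prob exp(delta/T).
        if delta >= 0 or rng.random() < math.exp(delta / T):
            sel = cand
            if coverage(sel) > best_val:
                best_sel, best_val = set(sel), coverage(sel)
    return sorted(best_sel), best_val
```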
4.4.1. Gibbs Sampling
An alternative for Metropolis sampling is Gibbs sampling. Similar to Metropolis sampling, the Gibbs sampler draws new sample based on an existing sample. However, instead of applying a random perturbation, the Gibbs sampler draws a partial sample from the conditional probability defined as
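A sketch of a Gibbs sampler in the same setting: each of the m camera slots is redrawn in turn from its full conditional distribution given the other slots, here taken proportional to exp(coverage/T), an assumed Boltzmann-style weighting:

```python
import math
import random

def gibbs_fix(V, m, sweeps=100, T=0.5, seed=0):
    """Gibbs sampler sketch for the FIX problem: cycle over the m camera
    slots and, for each slot, redraw its camera from the full conditional
    P(j | other slots) proportional to exp(coverage / T)."""
    rng = random.Random(seed)
    n_cams = len(V[0])

    def coverage(sel):
        return sum(1 for row in V if any(row[j] for j in sel))

    slots = rng.sample(range(n_cams), m)
    best, best_val = list(slots), coverage(slots)
    for _ in range(sweeps):
        for s in range(m):
            others = [slots[t] for t in range(m) if t != s]
            cands = [j for j in range(n_cams) if j not in others]
            # Full conditional: weight each candidate by the resulting objective.
            w = [math.exp(coverage(others + [j]) / T) for j in cands]
            slots[s] = rng.choices(cands, weights=w)[0]
            if coverage(slots) > best_val:
                best, best_val = list(slots), coverage(slots)
    return sorted(best), best_val
```

Unlike the Metropolis move, every Gibbs draw is accepted; the cost is evaluating the objective once per candidate camera in each slot update.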
4.4.2. Simulated Annealing
The key step in every efficient Monte Carlo method is how it relates the probability of a point being sampled from the distribution to its objective value. Obviously, such a relationship does not need to be linear. Furthermore, this relationship does not need to be static. There is an extensive literature on a class of techniques known as simulated annealing that focuses on how to change this relationship to achieve a faster rate of convergence to an optimum point.
In order to use simulated annealing to solve the FIX program, we need to add another variable called temperature T to our probability function in (22):
Algorithm 5 (simulated annealing for the FIX problem, outline): set the initial temperature and an initial configuration of m cameras; at each iteration, propose a random exchange and accept it with the Metropolis probability at the current temperature, then lower the temperature according to the cooling function; output the best configuration sampled.
The cooling function
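One common choice of cooling function is the geometric schedule T_k = T_0 * alpha^k. A sketch of simulated annealing built on the Metropolis swap move, with illustrative schedule parameters:

```python
import math
import random

def anneal_fix(V, m, iters=3000, T0=2.0, alpha=0.999, seed=0):
    """Simulated annealing for the FIX problem: the Metropolis swap move
    combined with a geometric cooling schedule T_k = T0 * alpha**k (one
    common cooling function; logarithmic schedules are another option)."""
    rng = random.Random(seed)
    n_cams = len(V[0])

    def coverage(sel):
        return sum(1 for row in V if any(row[j] for j in sel))

    sel = set(rng.sample(range(n_cams), m))
    best, best_val = set(sel), coverage(sel)
    T = T0
    for _ in range(iters):
        out = rng.choice(sorted(sel))
        inn = rng.choice([j for j in range(n_cams) if j not in sel])
        cand = (sel - {out}) | {inn}
        delta = coverage(cand) - coverage(sel)
        if delta >= 0 or rng.random() < math.exp(delta / max(T, 1e-9)):
            sel = cand
            if coverage(sel) > best_val:
                best, best_val = set(sel), coverage(sel)
        T *= alpha  # geometric cooling
    return sorted(best), best_val
```

Early iterations behave like a high-temperature explorer; late iterations behave like the greedy swap heuristic, which is exactly the exploration-to-exploitation transition described above.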
4.5. LP and SDP Relaxation
A significant drawback of sampling techniques is that they may take many iterations to converge. Even after convergence, the algorithm provides little clue as to how close the resulting approximation is to the true optimal solution.
One possible remedy is to relax the original formulation by replacing the binary constraints with real values in the interval [0, 1], which turns the BIP into a linear program (LP).
However, the gap between the LP relaxation and original BIP—called the integrality gap—is still unknown. Various methods can be used to reduce the integrality gap by adding more constraints [35–37]. In particular, the SDP introduced in [35] offers an attractive solution in closing the integrality gap.
Note for any binary variable x, an equivalent constraint can be given as
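The standard identity behind the lifting (a reconstruction of the idea, as the original equation is missing here) is that x is binary exactly when x squared equals x; “Lift and Project” then works with the matrix of pairwise products:

```latex
x \in \{0,1\} \;\Longleftrightarrow\; x^{2} - x = 0,
\qquad
Y \;=\;
\begin{pmatrix} 1 \\ x \end{pmatrix}
\begin{pmatrix} 1 \\ x \end{pmatrix}^{\!\top}
\;=\;
\begin{pmatrix} 1 & x^{\top} \\ x & x x^{\top} \end{pmatrix},
\qquad
\operatorname{diag}\bigl(x x^{\top}\bigr) = x .
```

The SDP relaxation keeps the linear constraints rewritten in the entries of Y, imposes Y ⪰ 0, and drops the rank-one requirement on Y.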
In this paper, we adopt the “Lift and Project” method proposed by Lovász and Schrijver [35] in solving the SDP formulation of the camera placement problem. We first generalize MIN and FIX into the following standard form:
The procedure, in outline: define a variable matrix Y; replace each constraint with its lifted counterpart; replace each instance of the x variables with entries of y such that products of binary variables map to entries of Y; replace the binary constraints with a positive-semidefinite constraint on Y; solve the SDP problem over Y and recover an approximate binary solution for x.
It is shown in Section 5 that the “Lift and Project” process provides a much tighter bound than the LP relaxation. In fact, we can obtain an even tighter relaxation by continuing to raise the dimension of the variables. In [38], Laurent analyzed and compared three different hierarchical methods for obtaining a series of SDP relaxations of the 0-1 problem. However, in practical camera placement problems, the number of variables becomes large, and conducting more than one round of SDP relaxation will inevitably run into memory issues.
5. Experimental Results
In our previous paper [5], simulations using a walking humanoid and real-life experiments using a camera network with 7 cameras over a surveillance area of 7.6 m × 3.7 m were conducted to validate the model. We reuse this model in this paper so that we can focus on how to solve the model efficiently. We propose three sets of simulation experiments to illustrate the strengths and weaknesses of each proposed method. First, we compare various fast and simple algorithms on a problem with a simple topology. Then, we compare them with more sophisticated algorithms on complex environments. Last but not least, we apply the LP and SDP relaxations on a small example to see how the SDP relaxation dramatically reduces the integrality gap.
For comparison, we only use the constraints in (10) and (18) with
5.1. Environment with Simple Topology
In this section, we test our algorithms on a simple 2D square environment as in Figure 1(a). The blue hollow circles are discretized camera grid positions, and yellow solid stars are target grid positions. The blue arrows are the placed cameras. Here we have

Performance comparison of four approximation algorithms.
We first compare the running time for different algorithms and different sample sizes in Figure 1(b). In Figures 1(c)–1(f), we compare the results for different algorithms when the number of cameras varies. From these comparisons, we can make the following observations: (1) when the number of cameras is sufficiently large, the greedy algorithm approximates the IP solution well with a fraction of the running time; however, when the number of cameras is small, the greedy algorithm produces much worse results because it completely overlooks the combinatorial structure of the problem; (2) the sampling techniques can trade off performance against computational time; (3) sampling from densities derived from the objective function significantly outperforms uniform random sampling; (4) the greedy heuristic generally outperforms the other approximation methods, but it can still be trapped in a local optimum regardless of the sample size. We will see that this disadvantage incurs a significant penalty when the environment becomes more complex.
5.2. Metropolis Sampling and Simulated Annealing on Complex Problems
As we can see above, for small and simple problems we can choose from the greedy algorithm (Algorithm 1), greedy heuristics (Algorithm 2), or sampling based on marginal distribution (Algorithm 3). Furthermore, these problems can also be solved by a standard IP solver in a reasonable running time. Now, we begin to look at much more complex environments.
In Figure 2, we show two complex environments generated by our camera placement GUI. We present the performances of IP solver, the greedy algorithm, Metropolis sampling, and simulated annealing (SA) approach in Table 2.
Comparison of two Monte Carlo sampling methods with other algorithms.

Two complex topologies. Black objects are obstacles, and blue areas are secured areas with a grid density 4 times higher than their surroundings.
To demonstrate the merit of simulated annealing, Figure 3 shows three independent Markov chains generated by the Metropolis sampling method, each with a different fixed temperature T as in (24). With a higher temperature, we obtain more uniform samples, which is useful for exploring the space. As the temperature cools down, fewer proposals are accepted, but the samples concentrate on regions with higher objective values.

Three separate Metropolis Markov chains with different temperatures.
We can conclude that both sampling algorithms are highly efficient when compared with the IP solver. We can further see that the change of temperature plays an important role in escaping local optima and exploring the entire search space.
5.3. LP Relaxation and SDP Relaxation
Finally, we show the effectiveness of the SDP relaxation on a simple and a moderately complex environment. The results are shown in Table 3. We can see that the SDP relaxation always gives a tighter bound compared with the LP relaxation. We visualize the camera grid variables for the simple topology in Figures 4(a) and 4(b), with the variable indexes on the x-axis and values on the y-axis. We can see that the SDP relaxation gives results closer to binary values with a smaller objective value. In fact, in this particular example, the SDP relaxation solution coincides with the IP solution.
Comparison of the objective values of SDP and LP relaxation.

Comparison of IP and SDP relaxation.
6. Discussion
In this paper, we have presented and compared the strengths and weaknesses of various well-known optimization frameworks for solving the generic camera placement problem, including the greedy approach, MCMC methods, and LP and SDP relaxations.
In addition to the simulation study presented in this paper, it might be interesting to study how these algorithms can be combined to solve the generic camera placement problem even more effectively. For example, the output of a greedy approach can be used as an initialization for the sampling methods. While SDP relaxations outperform the LP relaxation in terms of obtaining tighter bounds, they suffer from the increase in dimension caused by the “Lift and Project” process. Dimension-reduction approaches may be useful to reduce the problem size and allow more layers of SDP relaxation.
Additionally, it might be interesting to apply other emerging optimization tools to our formulation. One example is the least absolute shrinkage and selection operator (LASSO), which replaces a sparse selection problem with a convex one. For example, recall our MIN problem as follows:
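As a hedged sketch (the symbols below are assumptions, since the original equation is not recoverable here: A is the target-by-camera visibility matrix, x the 0/1 camera-selection vector), the MIN problem and a LASSO-style convex surrogate could read:

```latex
\min_{x \in \{0,1\}^{N}} \; \mathbf{1}^{\top} x
\quad \text{s.t.} \quad A x \ge \mathbf{1}
\qquad \longrightarrow \qquad
\min_{x \in [0,1]^{N}} \; \|x\|_{1}
\;+\; \lambda \sum_{i} \max\!\bigl(1 - (Ax)_{i},\, 0\bigr),
```

where λ trades coverage violations against sparsity; since x ≥ 0, the l1 norm coincides with the camera count, which is what makes the LASSO view of camera minimization natural.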
