Abstract
To address the difficult problem of counting the people in a crowd, this paper improves the Harris corner detection algorithm and estimates the number of people with a first-order linear regression model. First, to remedy the shortcomings of the Harris corner algorithm in population statistics, an adaptive gray-difference scheme is proposed and the integral image is introduced, which improve its noise immunity and real-time performance. Second, because the first-order static model produces large errors in population statistics, a dynamic linear regression method is proposed. The method assumes that a proportionality coefficient, which changes over time, relates the number of corner points in each frame to the number of people, and that this coefficient is correlated with the corner points in the previous and current frames. In addition, to eliminate the redundant corners produced during corner counting, the frame difference method is used to filter out stationary points. Finally, the number of people is estimated with the first-order linear model.
Keywords
Introduction
With the continuous progress of science and technology and the expansion of urbanization, public places are becoming increasingly crowded, so population statistics is of great significance for public security monitoring and business information collection. Traditional manual monitoring and counting methods 1 are time-consuming and labor-intensive, and they are prone to errors caused by lapses in concentration during long and stressful working hours. Automatic counting methods based on computer vision have therefore drawn increasing attention. Crowd concentration can also lead to security problems, such as the serious stampede on Shanghai's Bund at midnight on 31 December 2014, so population density estimation and statistics have practical significance. Although scholars in China and abroad have studied population density estimation extensively, counting accuracy remains low under practical conditions, which include interference such as complex indoor and outdoor backgrounds, illumination changes, occlusion within the crowd and moving non-human objects.
Current research on population statistics falls mainly into two categories: pedestrian detection and global feature detection. The first category focuses on detecting the human profile, texture, color and other characteristics. Some scholars use contour matching to locate the target. In literature 2, a semi-circle model is used to search for heads and shoulders in the foreground, and a snake model is then used to track the target. However, the positioning accuracy drops sharply when pedestrians overlap. In literature 3, prior knowledge and a non-means clustering algorithm are used to separate the motion region, which may fail when a pedestrian carries large objects. Literature 4 proposes motion vectors to detect pedestrians when moving objects in the foreground are difficult to segment and merge correctly; the number of pedestrians and their direction of motion are estimated from density and direction clustering of motion-vector feature points. However, the positioning accuracy again drops sharply when pedestrians overlap, and the method fails in more complex real scenes. Pedestrian detection is therefore suitable only for high-quality surveillance video with low crowd density and no overlaps, and its detection speed is relatively slow.
The second category is based on global features. Rather than counting individuals by detecting human characteristics, it treats the crowd as a whole and establishes a relationship between crowd characteristics and the number of people. In literature 5, population statistics are obtained by extracting Gray-Level Difference Matrix (GLDM) features from surveillance scenes of a square. However, this method needs to extract many feature values, and its computation is too complicated. Chang et al. 6 adopt the Gray-Level Co-occurrence Matrix (GLCM) for population statistics and simplify the final feature index through principal component analysis (PCA) to reduce the computational complexity without affecting the accuracy. However, this method does not consider the perspective effect, and it is not very effective in complex real scenes. Literature 7,8 counts people by extracting corner information from the moving areas of the scene without extracting the foreground. However, this algorithm detects only pedestrians in motion and cannot detect stationary pedestrians.
Building on literature 7, this paper improves the Harris algorithm by overcoming its defects in corner detection. In addition, we adopt the idea of the Kalman filter described in Panchal et al. 8 and the first-order statistical regression model in Wu et al. 9 to propose an improved Harris corner detection algorithm for population statistics.
Harris corner detection algorithm and its defects
Harris corner detection method
Corner detection is a mathematical method for locating corners in images. At present, it is mainly used in computer vision fields such as image matching, surveillance video acquisition, target tracking, image mosaicking and 3D model construction. Corners are distinctive points of interest extracted from an image, and the time and complexity of extracting them play a decisive role in real-time population statistics. The detected points of interest also play a key role in describing local image characteristics. Because of their distinctive nature, corners are largely unaffected by rotation, dimensional change and other factors, so it is of great significance to apply corners to population density statistics. Proposed by Harris and Stephens in 1988, the Harris corner detection algorithm is used to extract feature points, and it builds theoretically on the Moravec operator. 10
The Moravec operator defines a local rectangular detection window centered on a pixel of the image. The window can be shifted slightly in the horizontal, vertical, and positive and negative diagonal directions, and the average energy change within the window is computed for each small shift. The change is then approximated with a Taylor series, from which the corners are determined. For a given pixel center C(x, y), the change in grayscale intensity under an arbitrary small shift can be expressed by equation (1),
In the equation,
According to definition of interval in equation (2), there are eight types of intensity changes in different directions through window sliding in equation (1), as shown in Figure 1.

Diagram of window sliding in arbitrary direction.
The Harris algorithm removes the Moravec operator's restriction to a few discrete shift directions and allows the window to move in any direction, as illustrated in Figure 1. The shifted term of equation (1) is expanded in a Taylor series, and the resulting approximation is shown in equation (3)
The approximate value obtained in equation (3) is incorporated into the autocorrelation function of equation (1)
Equation (5) is incorporated into the autocorrelation function of equation (1)
M in the above equation (6) is expressed as
Equation (7) is a matrix. Assuming its eigenvalues are

Mathematical meanings of eigenvalues
Figure 2 shows that ①:
To avoid the large amount of computation required to compute the eigenvalues explicitly in corner detection, the Harris algorithm defines a corner response value R, as shown in equation (8)
In the above equation,
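As a minimal sketch of the corner response, the following NumPy code builds the matrix M from windowed products of image gradients and evaluates R = det(M) - k * trace(M)^2. It follows the standard Harris and Stephens formulation rather than the exact notation of this paper; the function names, the uniform (box) window, the window radius `r` and the value `k = 0.04` are assumptions made for illustration.

```python
import numpy as np

def box_sum(a, r):
    """Sum of `a` over a (2r+1) x (2r+1) window around each pixel (zero padded)."""
    h, w = a.shape
    padded = np.pad(a, r, mode="constant")
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, r=1, k=0.04):
    """Corner response R = det(M) - k * trace(M)^2 for every pixel.

    r and k are illustrative assumptions, not values taken from the paper.
    """
    img = np.asarray(img, dtype=float)
    # np.gradient returns the derivative along axis 0 (rows, y) first.
    iy, ix = np.gradient(img)
    a = box_sum(ix * ix, r)   # windowed sum of Ix^2
    b = box_sum(iy * iy, r)   # windowed sum of Iy^2
    c = box_sum(ix * iy, r)   # windowed sum of Ix * Iy
    det = a * b - c * c
    trace = a + b
    return det - k * trace ** 2
```

Under this definition, R is positive at corners, negative along straight edges and zero in flat regions, which is why a threshold on R can be used to select corners.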
Algorithm defects
During the analysis and application of the algorithm, the following problems are found: (1) As the window slides, the algorithm is easily affected by noise, which has a great impact on the final position of the corner points. (2) When the corner response is calculated, the windows of adjacent pixels overlap, so the same pixels are processed repeatedly, which increases the computational cost. (3) The choice of threshold directly determines which corners are detected, and in the Harris algorithm the threshold can only be chosen empirically. If the threshold is too large, corner information is lost; if it is too small, pseudo corners are produced, corner quality is reduced and the sensitivity to noise increases.
Improved Harris corner detection algorithm
Ideas of improved algorithm
To address the above problems of the Harris algorithm, this paper proposes an adaptive gray-level difference mean-value scheme to improve its poor noise resistance. The method takes the pixels within the sliding window of the image and calculates the mean value of the pixels in the neighborhood. This mean value is used to compute the gray difference of each pixel, and the mean and variance of these differences are then calculated. The integral image is introduced to avoid repeated computation on overlapping pixels as the window slides.
Adaptive gray-scale difference
Equation (1) shows that the corner is determined based on window sliding in Harris algorithm, and it is not noise-proof because of its selection without any treatment in
Calculate
Calculate its contrast value in the neighborhood and find the average value
Calculate the deviation
Estimate dispersion according to the standard deviation
Binary coding of pixel center
The calculation above shows that the dispersion changes with each neighborhood, so it adapts to the selected point. At the same time, the dispersion is little affected by changes in pixel selection, so it varies only slightly under the same conditions. The dispersion can therefore describe the selected pixel and its variation effectively. It is thus effective to choose dispersion
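The idea can be sketched loosely for a single 3 x 3 neighborhood. The sketch rests on assumptions that the text does not state: the neighborhood is the 8 pixels around the center, the contrast is the absolute gray difference to the center, the dispersion is the standard deviation of those contrasts, and a neighbor is coded 1 when its contrast exceeds the dispersion. The function name and these choices are hypothetical and are not the exact equations of this paper.

```python
import numpy as np

def adaptive_binary_code(patch):
    """Return (dispersion, code) for a 3x3 patch.

    code holds one 0/1 entry per neighbor, in row-major order with the
    center skipped. All design choices here are illustrative assumptions.
    """
    patch = np.asarray(patch, dtype=float)
    center = patch[1, 1]
    neighbors = np.delete(patch.ravel(), 4)      # the 8 surrounding pixels
    diffs = np.abs(neighbors - center)           # contrast to the center
    mean_diff = diffs.mean()                     # average contrast
    deviation = diffs - mean_diff                # deviation from the average
    dispersion = np.sqrt((deviation ** 2).mean())  # standard deviation
    # Assumed rule: the dispersion itself acts as the adaptive threshold.
    code = (diffs > dispersion).astype(int)
    return dispersion, code
```

Because the threshold is recomputed from each neighborhood, a flat patch yields a dispersion of zero and an all-zero code, while a single outlying neighbor is the only one coded 1.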
Integral image
The integral image 11 allows fast computation over rectangular regions of an image. Once the integral image has been calculated, the sum over a rectangular region of any size can be obtained with a fixed number of lookups, which reduces the computational complexity and improves the processing speed. The basic idea is shown in Figure 3.

Integral image.
Figure 3 shows that an equation can be obtained from equation (16) in any sliding window region
The original equation (1) is improved based on the above equation so as to solve the problem of double counting in window sliding, as shown in equation (18)
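The constant-time window sum that the integral image provides can be sketched as follows. This is a generic illustration of the technique, not the paper's equations (16) to (18); the function names and the convention of a zero first row and column are assumptions.

```python
import numpy as np

def integral_image(img):
    """Integral image with an extra zero row and column at the top and left.

    ii[r, c] holds the sum of img[:r, :c].
    """
    img = np.asarray(img)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0, dtype=np.int64), axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] using four lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]
```

Each window sum costs four array lookups regardless of the window size, so overlapping windows no longer cause the same pixels to be added repeatedly.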
Related work of population statistics
The proposed algorithm
In this paper, the population statistics experiment assumes the following two conditions: (1) Non-human objects in the video are stationary, and so are their corners. (2) The people in the video are in motion, and so are their corners.
The algorithm first extracts a video frame and an adjacent frame (the previous frame is used in this paper). The improved corner detection method then detects corners in the extracted frame, and the inter-frame difference method filters out the redundant corners. Finally, the number of people is obtained from the first-order dynamic linear model. The system flow consists of the following steps: (1) read two frames separated by an interval in the video; (2) detect corners with the improved corner detection method; (3) filter out the stationary corners with the frame difference method; (4) calculate the number of people.
Filtering out stationary target corners with frame difference method
The inter-frame difference method 12 is insensitive to lighting changes and other external factors. It can quickly detect the moving target area in a dynamic environment with a large crowd. The detailed method is as follows
In the above equation,

Test results after filtering static corners.
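A minimal sketch of filtering stationary corners with the frame difference is given below. It assumes grayscale frames stored as NumPy arrays and corners given as (row, column) pairs; the function name and the threshold value of 25 gray levels are hypothetical and are not taken from this paper.

```python
import numpy as np

def filter_moving_corners(prev_frame, curr_frame, corners, diff_threshold=25):
    """Keep only the corners that lie on pixels that changed between frames.

    diff_threshold is an illustrative assumption, not a value from the paper.
    """
    # Cast to a signed type so that uint8 subtraction cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int32) - prev_frame.astype(np.int32))
    moving = diff > diff_threshold
    return [(r, c) for (r, c) in corners if moving[r, c]]
```

Corners on the static background fall on pixels whose difference stays below the threshold and are discarded, which matches the assumption that non-human objects are stationary.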
Population statistics regressed by first-order dynamic linear model
Statistical regression 13 is an important step in the population statistics algorithm. There is a certain linear relationship between the normalized foreground of a scene and the number of people. In literature 14, the researchers assume that the number of people in a video frame is proportional to the number of corners, as described by the first-order static linear model in equation (20)
In the above equation,
Population statistics regression based on the first-order static linear model is simple and computationally efficient. However, if the proportional coefficient is trained only manually, the result carries a degree of randomness and has little practical value. When the crowd density in the video fluctuates greatly or is very high, manual counting tends to introduce errors, and a deviation in the proportional coefficient has a greater impact on the results. As shown in Figure 4, the number of people is indicated by a black line, the broken line indicates the effect of larger
Figure 5 shows that the number of corners can still change suddenly between adjacent frames even when the crowd density fluctuates only slightly, mainly because corner detection is somewhat unstable. In addition, Chen et al. 15 mention that there is a positive relationship between the number of corners and the number of people. However, a single proportional coefficient can hardly describe a scene in which the population density is constantly changing. Therefore, equation (13) based on the first-order dynamic linear model suits the statistical work in this paper better than equation (20) based on the first-order static linear model.
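For comparison, the static model with a single proportional coefficient can be illustrated with a least-squares fit through the origin. This is a sketch under assumptions: the function names and the training values are hypothetical, and fitting the coefficient by least squares is one possible choice rather than the procedure used in this paper.

```python
def fit_static_coefficient(corner_counts, people_counts):
    """Least-squares coefficient k for the model people = k * corners."""
    numerator = sum(c * n for c, n in zip(corner_counts, people_counts))
    denominator = sum(c * c for c in corner_counts)
    return numerator / denominator

def estimate_people(corner_count, k):
    """Static first-order model: the count is proportional to the corners."""
    return k * corner_count

# Hypothetical labeled frames: (corners, people) = (10, 2), (20, 4), (30, 6)
k = fit_static_coefficient([10, 20, 30], [2, 4, 6])
```

A single fitted k reproduces the training frames only when the ratio of people to corners is constant, which is exactly the condition that fails when the crowd density keeps changing.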

Test results after filtering static corners.
In order to adjust the dynamic proportion coefficient, this thesis introduces the idea of Kalman filtering by continuously adjusting the information obtained in video frames based on the self-renewal of
The accuracy of improved equation (23) cannot be effectively enhanced in the actual experiment, and its transition depends on the proportional coefficient
In order to get an adjustment coefficient, we refer to the first two frames
The experiment result shows that better performance can be obtained through the improved equation (21)
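The idea of Kalman filtering can be illustrated with a generic scalar filter that smooths the per-frame corner counts before a proportional coefficient is applied. This is not the update rule of the equations in this section; the random-walk state model, the noise variances and the function names are assumptions chosen only for illustration.

```python
def kalman_smooth(measurements, process_var=1.0, measurement_var=25.0):
    """Scalar Kalman filter with a random-walk state model.

    process_var and measurement_var are illustrative assumptions.
    """
    estimate = float(measurements[0])
    error_var = measurement_var
    smoothed = [estimate]
    for z in measurements[1:]:
        error_var += process_var                       # predict step
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)              # update step
        error_var *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed

def count_people(corner_counts, coefficient):
    """Apply a proportional coefficient to the smoothed corner counts."""
    return [coefficient * c for c in kalman_smooth(corner_counts)]
```

With this filter, a sudden jump in the corner count of one frame is only partly passed on to the estimate, which damps the frame-to-frame instability of corner detection described above.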
Experimental results and analysis
The experimental environment is VS2013 with OpenCV 2.4.13 on the Windows 10 operating system, with an Intel(R) Xeon(R) CPU E5-2603 v4 @ 2.20 GHz and 32 GB of memory. To verify the effectiveness of the algorithm, the test video frames are taken from the internationally recognized Dataset S2: Person Count and Density Estimation in the PETS2009 database. 16 The database contains several videos, each taken by a single fixed camera at an oblique angle. The video frame rate is 7 frames/s. In this paper, two videos in dataset S2 are used to verify the validity of the algorithm. The first video is S2.L3.Time_14-41 in dataset S2, with a total of 240 frames, in which the crowd moves towards the camera from a distance. The second video is S2.L3.Time_14-41.View_0 in dataset S2, with a total of 240 frames, in which the crowd moves towards the school building and slowly away from the camera.
This paper uses the average absolute error and the relative error to evaluate the performance of the algorithm, as defined in equations (26) and (27)
In the above equation,
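The two error measures can be sketched in their commonly used form: the mean over all frames of the absolute difference between the estimated and actual counts, and the mean of that difference divided by the actual count. These definitions and the function names are assumptions based on common practice in crowd counting and may differ in detail from equations (26) and (27).

```python
def mean_absolute_error(estimated, actual):
    """Average of |estimated - actual| over all frames."""
    return sum(abs(e - a) for e, a in zip(estimated, actual)) / len(actual)

def mean_relative_error(estimated, actual):
    """Average of |estimated - actual| / actual over all frames.

    Every actual count must be nonzero.
    """
    return sum(abs(e - a) / a for e, a in zip(estimated, actual)) / len(actual)
```

The relative error is usually multiplied by 100 when it is reported as a percentage.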
Performance comparison between two algorithms (%).
Figure 6 compares the results of this algorithm and the algorithm of Liu and Zhou 7 with the actual number of people in the two video experiments.

Chart of proportional coefficient correlation. (a) Experimental results of View 1. (b) Experimental results of View 4.
Table 1 and Figure 6 show that, compared with Liu and Zhou, 7 the proposed algorithm achieves good statistical results in both videos. The main reason is that the Harris corner detection algorithm has been adapted to population statistics. In addition, this paper introduces the idea of Kalman filtering and combines it with the data available in the video frames, so that the proportional coefficient of the first-order linear model is adjusted dynamically in real time. The accuracy of the estimate is maintained without increasing the computational complexity.
Conclusion
This paper first addresses the defects of the Harris corner detection algorithm by introducing the adaptive gray difference and the integral image, which enhance the noise resistance of the algorithm and reduce its time complexity so that it meets the requirements of population statistics. Furthermore, a self-adjusting dynamic first-order linear regression model is proposed by introducing the idea of Kalman filtering and combining it with the valid information in the experimental data. Applied to population density statistics, the model reduces the high dependence on the proportional coefficient and the large error of the static linear regression model. Both the method of Liu and Zhou 7 and the proposed method give unsatisfactory results when pedestrians overlap severely or when non-human targets move, because this paper does not yet offer an effective solution to these cases. Further study is needed to address these problems by improving the algorithm.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This paper is supported by Open Fund Program of Key Laboratory in Sichuan Province with project number: 2015WZJ02.
