Abstract
This work presents a fall detection system that is based on image processing technology. The system detects falls by multiple persons through the analysis of video frames. First, the system uses a Gaussian mixture background model to generate information about the background, and noise and shadows are eliminated to extract the possible positions of moving objects. Extraction of the foreground image introduces further noise and damages the image. Therefore, morphological and size filters are utilized to eliminate this noise and repair the damage to the image. Extraction of the foreground image yields the locations of human heads in the image. The median point, height, and aspect ratio of the people in the image are calculated. These characteristics are utilized to trace objects. The change in the characteristics of objects across consecutive images is used to determine whether persons enter or leave the scene. The fall detection method uses the height and aspect ratio of the human body, analyzes images in which one person overlaps with another, and determines whether a person has fallen. Experimental results demonstrate that the proposed method can efficiently detect falls by multiple persons.
Introduction
According to recent statistics from the Ministry of Health and Welfare in Taiwan, falls that lead to death are the second major cause of death, especially among the elderly who are over 65 years of age. 1 Since the elderly commonly suffer from osteoporosis, organ degeneration, and other diseases, even a slight fall can be very dangerous. For most elderly people, such falls cause serious injury when medical treatment is delayed. Such occurrences are reported daily. The field of gerontechnology seeks to solve problems that are associated with elderly people in the areas of housing, mobility, communication, leisure, and work. To ensure that the elderly can live in a safe and comfortable environment, new technologies are being developed. Preventing the elderly from falling and informing rescuers rapidly about a fall are important issues. Information and communication technology (ICT) is useful in the fast detection of, and response to, a fall by an elderly person. A video surveillance system can be designed with a fall detection capability so that accidents can be identified, and such a system even allows a remote care worker to observe over a communication network and respond appropriately.
A fall detection system that is deployed in a hazardous environment or a daily activity space can both detect falls and send emergency signals to an ambulance unit. Injured people are then helped as quickly as possible and promptly receive medical treatment, which reduces mortality. The proposed system combines a video surveillance system with fall detection technology that involves image processing, with the purpose of identifying falls by the elderly. A situation in which multiple people fall consecutively is difficult to detect owing to the possibility that one person overlaps with another. The proposed system solves this problem.
The rest of this article is organized as follows. Section “Related works” briefly describes previous research into fall detection. Section “Proposed fall detection system” describes the proposed fall detection system. Section “Experimental results” summarizes the experimental results and compares them with those concerning other previous designs. Section “Conclusion” draws conclusions and makes recommendations for future research.
Related works
In recent years, many research results concerning the tracking of human posture and the detection of falls have been published. Methods for recognizing human postures, and especially for recognizing falls, involve various important processes such as foreground segmentation,2,3 multi-object tracking,4–9 and fall detection.10–17 Foreground segmentation separates objects of interest from a static background and largely determines the accuracy with which moving objects are recognized in surveillance applications. A better multi-object tracking method is less likely to misidentify overlapping objects. A fast and efficient fall detection method provides rapid assistance to fallen elderly people and also reduces their probability of death.
The major sub-process of foreground segmentation is the construction of a background model. Several methods for constructing background models have been proposed. The arithmetic background model-building method averages the first N consecutive frames to yield the background model. For example, if the method stores the first 10 consecutive frames with
The extraction of characteristics of the images can also be carried out to build a background model; these characteristics include colors and gradients of colors. This method supports rapid and effective model-building, but it performs poorly when a moving object has the same color(s) as the background image. The Gaussian mixture background modeling method uses statistics and probability to represent the intensity of a pixel and constructs relationships among all pixels. 2 This method solves the problem of object residue in the background image when an object is moving slowly. A codebook method has also been proposed to reduce the amount of background information that is generated in the building process; 3 this method uses quantization and clustering to construct a codebook that corresponds to each pixel. Each codebook has one or more code words. When applied to a new image, the method determines whether the pixels in the image have corresponding codebooks and updates the code words of the background model. The method effectively compresses the background information and quickly constructs the background model.
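As an illustration of the simplest of these approaches, the arithmetic (frame-averaging) background model, the following is a minimal sketch rather than code from any of the cited works. It assumes grayscale frames stored as nested lists; the names `mean_background` and `foreground_mask` and the intensity threshold are hypothetical choices made for the example.

```python
from typing import List

Frame = List[List[float]]  # grayscale image: rows of pixel intensities


def mean_background(frames: List[Frame]) -> Frame:
    """Average the given N frames pixel by pixel to form a background model."""
    n = len(frames)
    if n == 0:
        raise ValueError("at least one frame is required")
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def foreground_mask(frame: Frame, background: Frame,
                    threshold: float) -> List[List[int]]:
    """Mark a pixel as foreground (1) when it differs from the background
    by more than `threshold` (an assumed, tunable value)."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


if __name__ == "__main__":
    history = [[[10, 10], [10, 10]], [[12, 12], [12, 12]]]
    bg = mean_background(history)
    print(foreground_mask([[11, 200], [11, 11]], bg, threshold=30))
```

Because every stored frame contributes equally, a slowly moving object leaves a residue in such an averaged background, which is the weakness that the Gaussian mixture method is said above to address.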
Multi-object tracking is useful for tracking multiple objects in a single space. Numerous tracking methods have been proposed to increase the efficiency of the tracking of multiple objects using one or more cameras. They involve a passive camera such as a wide-angle camera or a map camera, or they use an active pan-tilt-zoom (PTZ) camera to capture an image of the scene. A Kalman filter and a particle filter have been utilized in a track forecasting algorithm to forecast people’s tracks. 18 A color model has been designed to track a target, with Bayesian probability used to determine the position of an object by calculating its color probability distribution. 19 A Kalman filter has also been utilized to forecast the future location of a moving object, and this location is used to predict and trace the object. 20
Huang et al. 5 utilized a single-lens reflex camera to trace objects and divided the detection blocks of the image into three parts: long-distance, middle-distance, and short-distance parts. Leibe et al. 6 presented a method for tracing objects using one camera. The method transforms the two-dimensional (2D) coordinates of moving objects to three-dimensional (3D) coordinates. It can identify two objects because no two objects have the same 3D coordinates. The method executes different complex algorithms in three types of detection blocks; it requires a longer calculation time when objects are farther from the camera. A new method that combines several PTZ cameras and one wide-range camera has also been used to trace multiple objects. 8 When the wide-range camera has identified various objects, the method assigns various PTZ cameras to trace these objects. Ristic 9 proposed an algorithm for the online detection of anomalies in the motion and number of objects, based on the output of a multi-object tracking algorithm.
Fall detection identifies people who fall in a space that is being viewed through a camera. The simplest method for recognizing a fall utilizes the aspect ratio of the moving person. 3 Anderson et al. 10 proposed a method that uses the aspect ratio of the human body and its height to identify falling. Meunier et al. 11 utilized a historical momentum map and the ratio of the length to the angle, obtained by an approximate ellipse method, to determine whether a person has fallen. Foroughi et al. 12 used various methods to detect different human behaviors, including those based on a best-fit approximate ellipse around the human body, projection histograms of the segmented silhouette of the body, and change in head position. Their method provides a recognition rate of approximately 90% when one person is detected. Another method uses an ellipse curve to draw the height of a human body and detects the color of skin to locate a human head. This method also uses the position of the head and the central axis through the body to locate a person’s feet. It then uses the position of the head and the feet to identify a fall. 13 Rougier et al. 14 proposed a method for detecting a fall by analyzing deformation of the human shape in a video sequence. They used a shape matching technique to track the person’s silhouette through the video sequence. An online one-class support vector machine (OCSVM) has been used to capture an image, calculate its feature space, and distinguish normal daily postures from abnormal postures such as that following a fall. 15 Another work provides a semi-unsupervised fall detection system; although an OCSVM is utilized, human intervention is required to segment and select video clips of normal postures. Ma et al. 16 utilized a depth camera to detect falls and combined shape-based fall characterization with learning-based classification to distinguish falling from other everyday movements. The method detects falls with an accuracy of 86.83%.
Another method uses a depth camera to track critical joints of the human body. A pose-invariant randomized decision-tree algorithm has also been proposed for extracting important joints and a support vector machine classifier has been utilized to determine whether a fall has occurred. 17 All of these methods perform well when falling by only one person is detected, but they do not effectively detect falling by multiple persons. This work proposes a detection system for identifying falls by multiple persons at once. Experimental results reveal that the proposed system efficiently detects falls by multiple persons.
Proposed fall detection system
The proposed fall detection system uses a single camera to trace multiple persons in a scene. The fall detection process recognizes falls by multiple persons at once. The main processes include foreground capture, object detection, and fall detection. Figure 1 presents the operation flow of the system. Each process comprises various sub-processes, which are described below.

Figure 1. Operation flow of the proposed detection system.
Foreground capture
The accuracy of foreground capture influences the success rate of the subsequent object tracing and action identification. Therefore, the rapid and accurate detection of a moving object is the most important task in the proposed system. When images are input to the system, the system initializes the background model, which is a Gaussian mixture background model. 2 The background model is constantly updated with input images, and each pixel is identified as belonging to the foreground or the background. In the foreground capture process, the input image may contain noise and shadows, which are eliminated using morphological processing.
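The idea of a background model that is updated with every input image can be illustrated with a deliberately simplified sketch. It keeps a single running Gaussian per pixel instead of the mixture of Gaussians that the system actually uses, so it is not the authors' implementation; the class name `PixelGaussian` and the values of the learning rate, match threshold, and variance floor are assumptions chosen for the example.

```python
class PixelGaussian:
    """Single running Gaussian for one pixel (a simplification of a mixture)."""

    def __init__(self, mean: float, var: float = 225.0, alpha: float = 0.05,
                 k: float = 2.5, min_var: float = 4.0):
        self.mean = mean
        self.var = var
        self.alpha = alpha      # learning rate (assumed value)
        self.k = k              # match threshold in standard deviations (assumed)
        self.min_var = min_var  # floor that keeps the variance from collapsing

    def update(self, value: float) -> bool:
        """Return True if `value` is foreground; otherwise treat it as
        background and absorb it into the model."""
        diff = value - self.mean
        is_foreground = diff * diff > (self.k ** 2) * self.var
        if not is_foreground:
            self.mean += self.alpha * diff
            self.var = max(self.min_var,
                           (1.0 - self.alpha) * self.var
                           + self.alpha * diff * diff)
        return is_foreground


if __name__ == "__main__":
    pixel = PixelGaussian(mean=100.0)
    print(pixel.update(101.0))  # small change: treated as background
    print(pixel.update(200.0))  # large change: treated as foreground
```

A full mixture model keeps several such Gaussians per pixel with weights, which lets it represent backgrounds that alternate between a few appearances.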
Noise elimination
When the Gaussian mixture background sub-process is carried out, the foreground image is obtained but noise is generated owing to the shadows in the image. The effective elimination of shadows reduces this noise. The shadows in the image can be eliminated using the shadow elimination method that is presented in Figure 2. The method uses the shadow detection function to eliminate the shadows in the image and then uses the morphological processing function to segment the object. Finally, the area of each foreground block is calculated. If the area is smaller than a set threshold, the block is identified as noise and is eliminated. If the area is larger than the threshold, the block is a moving object.
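The final size-filtering step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a binary foreground mask stored as nested lists, treats 4-connected pixels as one block, and keeps a block whose area equals the threshold (the text does not specify that case). The name `size_filter` is hypothetical.

```python
from collections import deque
from typing import List

Mask = List[List[int]]


def size_filter(mask: Mask, min_area: int) -> Mask:
    """Keep only 4-connected foreground blocks with at least `min_area` pixels."""
    rows, cols = len(mask), len(mask[0])
    kept = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Collect one connected block with a breadth-first search.
            block = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                block.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Blocks below the area threshold are treated as noise and dropped.
            if len(block) >= min_area:
                for y, x in block:
                    kept[y][x] = 1
    return kept


if __name__ == "__main__":
    noisy = [[1, 0, 0, 0],
             [0, 0, 1, 1],
             [0, 0, 1, 1]]
    print(size_filter(noisy, min_area=2))
```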

Figure 2. The proposed shadow elimination method.
Foreground block capture
A moving object is identified after noise elimination, but this object may appear segmented in an image; that is, an object may be represented as several unconnected blocks. In such a case, the relationship among the blocks must be found to determine whether they belong to the same object, and related blocks must be combined to form a complete object. The proposed foreground block capturing method efficiently finds the relationship between two blocks, as presented in Figure 3. For instance, suppose that the foreground image has N blocks, forming the block group B = {B1, B2, B3, …, BN}. The method first selects two blocks Bi and Bj and checks whether their x-axis coordinates are equal. If the two blocks have the same x-axis coordinate, then they are part of the same object and should be connected to each other by updating their vertices and widths. A new and larger block is generated by combining these two blocks. After the proposed method has been applied to all of the blocks, complete objects can be identified in the foreground image. Figure 4 reveals that two persons are segmented in the image. After the proposed method has been implemented, all persons are associated with a complete body, as presented in Figure 5.
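The block-combining idea can be sketched as below. This is an illustration under stated assumptions rather than the method of Figure 3: blocks are axis-aligned boxes `(x, y, width, height)`, and "the same x-axis coordinate" is interpreted loosely as the horizontal extents of two blocks overlapping, with an optional tolerance `tol`. The function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def x_ranges_overlap(a: Box, b: Box, tol: int = 0) -> bool:
    """True when the horizontal extents of two blocks overlap (within `tol`)."""
    return a[0] <= b[0] + b[2] + tol and b[0] <= a[0] + a[2] + tol


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both blocks."""
    x1, y1 = min(a[0], b[0]), min(a[1], b[1])
    x2 = max(a[0] + a[2], b[0] + b[2])
    y2 = max(a[1] + a[3], b[1] + b[3])
    return (x1, y1, x2 - x1, y2 - y1)


def merge_blocks(blocks: List[Box], tol: int = 0) -> List[Box]:
    """Repeatedly combine blocks that share horizontal extent until none remain."""
    boxes = list(blocks)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if x_ranges_overlap(boxes[i], boxes[j], tol):
                    boxes[i] = union(boxes[i], boxes[j])
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes


if __name__ == "__main__":
    # A torso block and a legs block of one person, plus a second person.
    print(merge_blocks([(10, 0, 20, 30), (12, 35, 18, 40), (100, 5, 20, 60)]))
```

A rule based only on horizontal position would also merge two people standing one behind the other along the camera axis, which is one reason the later overlap and separation judgments are needed.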

Figure 3. The proposed foreground block capturing method.

Figure 4. Persons are segmented in the foreground image.

Figure 5. Each person has a complete body after performing the proposed foreground block capturing method.
Object detection
After the foreground of objects has been obtained, the target object must be traced so that its action can be detected. Figure 6 presents the object detection process. In the first step, before an action is identified, the vertices of the object are detected and overlapping objects are separated. The second step extracts the characteristics of the object to enable continued detection. In the third step, objects are tracked and their motion and number are judged. In the final step, information about objects is updated and input to the fall detection process.

Figure 6. The proposed object detection method.
Object vertex detection
The vertex detection sub-process is applied to each foreground area to locate the head of each person and separate objects that obscure each other. This process examines the upper half of each foreground area because the upper halves of objects are usually complete and the most obvious feature of a person is the prominent head, which becomes evident when overlapping people separate.
Object characteristic extraction
The sub-process of object detection extracts the height, width, and center of gravity of an object. The ratio of apparent height to apparent width of an object is a major factor in judging its action. The width can be used to evaluate the overlap or separation of two objects. The center of gravity, expressed as (x,y) coordinates, locates the object. These characteristics of the object are important for positioning and determining its action.
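These characteristics can be computed directly from a binary foreground mask of one object, as in the following minimal sketch. The dictionary keys and the function name `extract_features` are hypothetical, and the center of gravity is taken as the mean position of the foreground pixels, which is an assumption about how it is defined.

```python
from typing import Dict, List, Optional


def extract_features(mask: List[List[int]]) -> Optional[Dict[str, object]]:
    """Return height, width, height-to-width ratio, and center of gravity
    of the foreground pixels in `mask`, or None if the mask is empty."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return {
        "height": height,
        "width": width,
        "aspect_ratio": height / width,  # apparent height over apparent width
        "center": (sum(xs) / len(xs), sum(ys) / len(ys)),
    }


if __name__ == "__main__":
    standing = [[0, 1, 1, 0],
                [0, 1, 1, 0],
                [0, 1, 1, 0],
                [0, 1, 1, 0]]
    print(extract_features(standing))
```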
Detection of overlap and separation of objects
The above characteristic extraction yields the height, width, and center of gravity information that is used to determine the number of objects in a scene. A change in this number determines whether the overlap judgment steps or the separation judgment steps are carried out. If the number of objects decreases, then either objects are overlapping or an object has left the image space. If the number of objects increases, then a new object has appeared in the space or formerly overlapping objects have separated.
Whether objects overlap is determined in the following steps:
1. Capture the current image frame and the previous image frame and find the characteristics of all objects. If the number of objects in the current image is less than that in the previous image, then some objects have left the space or some objects are newly overlapping. An image boundary check is conducted to determine which of these two situations pertains.
2. Find overlapping objects. First, the tagged box of each object, which is drawn from the height and width information of the object, is enlarged. Then, close neighbors within the new, larger tagged box are identified. These neighbors are the most likely to overlap the tagged object in the future.
3. Tag the overlapped objects and update their information. New information about overlapping objects is stored for subsequent tracing.
The sub-process identifies the overlapping of objects using two consecutive frames, such as those in Figure 7(a) and (b). Figure 7(a) shows three persons, of whom two are close to each other. Figure 7(b) presents the frame that follows that of Figure 7(a). The sub-process is composed of the above three steps and finds the overlapping objects in this image. These overlapping objects are tagged with a new, larger box and their information is updated for ongoing tracing.
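The neighbor search in the second step can be sketched as follows. This is an illustrative reading of the step, not the authors' code: boxes are `(x, y, width, height)`, each tagged box is enlarged about its center by an assumed factor `scale`, and two objects are reported as overlap candidates when the enlarged box of one intersects the original box of the other.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def enlarge(box: Box, scale: float) -> Box:
    """Scale a box about its center by `scale` (an assumed factor)."""
    x, y, w, h = box
    dw, dh = w * (scale - 1) / 2, h * (scale - 1) / 2
    return (x - dw, y - dh, w * scale, h * scale)


def intersects(a: Box, b: Box) -> bool:
    """True when two axis-aligned boxes share some area."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2]
            and a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def overlap_candidates(boxes: List[Box],
                       scale: float = 1.5) -> List[Tuple[int, int]]:
    """Index pairs of objects that are close enough to overlap soon."""
    pairs = []
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if (intersects(enlarge(boxes[i], scale), boxes[j])
                    or intersects(enlarge(boxes[j], scale), boxes[i])):
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Two persons close together and a third far away.
    print(overlap_candidates([(0, 0, 10, 20), (11, 0, 10, 20), (100, 0, 10, 20)]))
```

The same search can serve the separation judgment, where the candidates are the objects most likely to have split from a tagged overlapping object.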

Figure 7. Overlap judgment of two persons: (a) previous image frame and (b) current image frame (overlap of two persons).
The separation of objects can be identified by the following steps:
1. Capture the current image frame and the previous image frame that contains information about overlapping objects. If the number of objects in the current image exceeds that in the previous image, then objects have entered the space or objects have separated. An image boundary check is carried out to determine which situation pertains.
2. Identify overlapping objects that are separating. First, the tagged box of each object, which is drawn from the height and width information of the object, is enlarged. Then, close neighbors within the new, larger tagged box are identified. These neighbors are the most likely to have separated from the tagged object.
3. Tag the separated objects and update their information. New information about separated objects is stored for subsequent tracing.
4. Check whether the information concerning overlapping objects belongs to only one object; if it does, then the overlapping objects are regarded as a single object and the information is updated accordingly; otherwise, the separation judgment sub-process is exited.
The sub-process identifies the separation of objects using two consecutive image frames, such as those in Figure 8(a) and (b). Figure 8(a) presents an image of two overlapping persons and two other persons who are separate from each other. Figure 8(b) presents the frame that follows Figure 8(a). The sub-process that comprises the above steps identifies the overlapping objects that have separated from each other in this image. The separated objects are tagged with a new, larger box and their information is updated to support subsequent tracing. Finally, if the separation between the two formerly overlapping objects exceeds a threshold, then a tagged box is drawn around each of these two separated objects in the next frame.

Figure 8. Separation judgment of two overlapped persons: (a) previous image frame (overlap of two persons) and (b) current image frame (separation of two persons).
Figure 9 presents the use of the proposed method to detect the people in a real scene that includes three persons. Figure 9(a) presents three overlapping persons in the scene. A bigger tagged box contains these three persons and their information is updated. Figure 9(b) presents the frame that follows the frame in Figure 9(a); it reveals that a person has moved away from the overlapping object while two persons still overlap in the scene. Therefore, a new tagged box frames the leaving person. The original box that framed the three overlapping persons is reduced in size. Figure 9(c) presents three separate persons, each of whom is tagged with a box. The experimental results reveal that the proposed object detection method identifies whether objects overlap or are separate.

Figure 9. Separation judgment of three overlapped persons in a real scene: (a) overlap of three persons, (b) leaving of two persons, and (c) separation of three persons.
Detection of entering or leaving of scene by object
When an object enters or leaves the image space, it touches the boundaries of the space. Therefore, when an object touches the boundaries of the image space, it triggers the detection of an object entering or leaving. This process uses contact with the boundary, the number of objects, and the object database to determine whether the touching object is an entering object or a leaving object. When an entering or leaving object is detected in the image space, information about this object is updated and recorded in the object information database. Our experimental results show that the detection process can effectively tag the entering or leaving objects and recognize them after updating the object information database.
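The boundary check and its use together with the object count can be sketched as follows. This is a simplified illustration that omits the object database mentioned above; the function names, the `margin` parameter, and the returned labels are hypothetical.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def touches_boundary(box: Box, frame_w: int, frame_h: int,
                     margin: int = 0) -> bool:
    """True when the box reaches the edge of the image (within `margin` pixels)."""
    x, y, w, h = box
    return (x <= margin or y <= margin
            or x + w >= frame_w - margin or y + h >= frame_h - margin)


def classify_count_change(prev_count: int, cur_count: int,
                          boundary_contact: bool) -> str:
    """Interpret a change in the number of objects between two frames.

    A decrease means an object left (boundary contact) or objects overlapped;
    an increase means an object entered (boundary contact) or objects separated.
    """
    if cur_count < prev_count:
        return "left" if boundary_contact else "overlapped"
    if cur_count > prev_count:
        return "entered" if boundary_contact else "separated"
    return "unchanged"


if __name__ == "__main__":
    contact = touches_boundary((0, 40, 30, 80), frame_w=640, frame_h=480)
    print(classify_count_change(2, 3, contact))
```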
Updating object information
This step is the final step in object detection. It handles the case in which several objects move in the image space and no overlapping or separation occurs. In this case, each object is traced separately by matching it to the object whose center of gravity is at the least distance, and the corresponding information is updated accordingly.
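Matching by least distance between centers of gravity can be sketched as below. The greedy assignment used here is an assumption made for the illustration, since the text does not state how ties or competing matches are resolved; `match_objects` and its argument layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def match_objects(previous: Dict[int, Point],
                  current: List[Point]) -> Dict[int, int]:
    """Map each tracked object id to the index of its nearest current center.

    Pairs are taken greedily from the smallest distance upward, and each
    current center is assigned to at most one tracked object.
    """
    pairs = sorted(
        (math.hypot(p[0] - c[0], p[1] - c[1]), obj_id, idx)
        for obj_id, p in previous.items()
        for idx, c in enumerate(current)
    )
    assignment: Dict[int, int] = {}
    used = set()
    for _, obj_id, idx in pairs:
        if obj_id not in assignment and idx not in used:
            assignment[obj_id] = idx
            used.add(idx)
    return assignment


if __name__ == "__main__":
    tracked = {1: (10.0, 10.0), 2: (50.0, 10.0)}
    detections = [(52.0, 11.0), (11.0, 12.0)]
    print(match_objects(tracked, detections))
```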
Identification of action
When a person falls, the height and the aspect ratio of the human body change sharply. However, when one overlapping object moves left and another moves right, these values also change relative to those of the previous overlapping object. Moreover, if a person falls while obstructed by another, standing person, the change in height is small and a misjudgment is made, as presented in Figure 10. Therefore, the identification of a fall is divided into two sub-processes: fall detection of a single person and fall detection of multiple persons (Figures 11 and 12).

Figure 10. A standing person shelters a fallen person.

Figure 11. A standing person occludes a fallen person: (a) two persons overlap in the left scene; (b) two pairs of persons overlap; (c) two overlapped persons separate in the left scene, and two persons still overlap in the right scene; and (d) two overlapped persons separate in the right scene, and all persons separate.

Figure 12. Two persons separate from an overlapped object: (a) two persons walk closely, (b) they overlap together, and (c) one person falls in the overlapped object.
Detection of fall by single person
In a real situation, a change in the height of a person may be caused by a deliberate change of posture, such as from standing to lying down, rather than by a fall. To distinguish the two, the velocity of the change is tested: the difference between the object’s current height and its height 1 s earlier is calculated and compared with a threshold. If the difference exceeds the threshold, the fall is confirmed; otherwise, the object is not deemed to have fallen. The criterion for identifying a fall is as follows
where
Detection of fall by multiple persons
Falls by multiple persons who do not overlap are easily detected using equation (1). If one person is standing while the other person, whom the first is obscuring, falls, then the height of the overlapping object does not change but the width of the object changes rapidly. The speed of the change in width of the overlapping object exceeds the rate of change of distance between two objects that are separating from each other gradually. Figure 10 presents the fall of an overlapping object. A standing person obscures a fallen person, and the width of the object increases. To solve this problem, the detection method comprises the following steps:
1. The trigger condition for the identification of a fall is a change in the width of the overlapping object. The standing object in the overlapping situation is eliminated from the image using the object vertex detection method. When the change in width of the overlapping object exceeds the threshold, the foreground block of the overlapping object is captured and the vertices of the standing person in the block are found. The length of the rectangle is set to the height of the overlapping object; its width equals D times the width of the object, where D is between 0.25 and 0.5. The foreground in the rectangle is cleared, as presented in Figure 10.
2. Noise is eliminated. If the remaining height of the object is less than a certain height, a person is deemed to have fallen.
Most systems for detecting a fall are designed to detect a fall by a single person. The fall detection method that is proposed herein can be used to identify falls by multiple persons consecutively. The proposed method uses two techniques to detect the falls. A change in height can be used to detect the fall of a single person. The elimination of the foreground in an overlapping object is useful for identifying a fall by a person who is behind another, standing person. The combination of the proposed two techniques efficiently detects falls.
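The two techniques can be sketched together as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the height difference is taken as the earlier height minus the current height, the standing person's head is taken as the topmost foreground pixel of the overlapping block, the cleared rectangle is placed around that head position, noise filtering of the remainder is omitted, and all names and thresholds are hypothetical.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]


def is_single_fall(height_now: float, height_1s_ago: float,
                   threshold: float) -> bool:
    """Single person: a fall is a height drop over 1 s that exceeds `threshold`."""
    return (height_1s_ago - height_now) > threshold


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """(left, top, width, height) of the foreground pixels, or None if empty."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v]
    if not pts:
        return None
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def is_occluded_fall(mask: Mask, prev_width: int, width_threshold: int,
                     fallen_height: int, d: float = 0.4) -> bool:
    """Overlapping block: clear the standing person and test what remains."""
    box = bounding_box(mask)
    if box is None:
        return False
    left, top, width, _ = box
    # Trigger: the width of the overlapping object has grown past the threshold.
    if width - prev_width <= width_threshold:
        return False
    # Head vertex of the standing person: middle of the topmost foreground run.
    top_xs = [x for x, v in enumerate(mask[top]) if v]
    head_x = (top_xs[0] + top_xs[-1]) // 2
    # Clear a rectangle as tall as the block and d times as wide (0.25 to 0.5).
    rect_w = max(1, int(round(d * width)))
    x0 = max(left, head_x - rect_w // 2)
    x1 = x0 + rect_w
    remainder = [[0 if x0 <= x < x1 else v for x, v in enumerate(row)]
                 for row in mask]
    rest = bounding_box(remainder)
    return rest is not None and rest[3] < fallen_height


if __name__ == "__main__":
    # Standing person in columns 1-2; fallen person in columns 3-8, rows 6-7.
    scene = [[0] * 10 for _ in range(8)]
    for y in range(8):
        scene[y][1] = scene[y][2] = 1
    for y in (6, 7):
        for x in range(3, 9):
            scene[y][x] = 1
    print(is_single_fall(60, 170, threshold=80))
    print(is_occluded_fall(scene, prev_width=3, width_threshold=3,
                           fallen_height=4))
```

In the example scene the cleared rectangle removes the standing person, so the remaining foreground is only two rows tall and is judged to be a fallen person; if the second person were also standing, the remaining height would stay large and no fall would be reported.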
Experimental results
The proposed system is implemented in Visual C++ 2008 with the open-source computer vision library OpenCV. To evaluate the performance of the system, the distance between the camera and the observed persons is set to approximately 1–6 m. Numerous conditions are considered in evaluating the effectiveness of the system, including falls by multiple persons, falls by two overlapping persons, and a person entering or leaving the image space. The system runs on an Intel Core i7 CPU with 8 GB of memory. Recognizing an actual fall from successive images requires only about 0.3–0.8 s of computation under all simulation conditions.
The rates of success of object tracking (SOT) and fall detection (FD) are defined as follows
where the parameters SR of object tracking (OT) and SR of FD in equations (2) and (3) are as shown in Table 1. OT denotes object tracking, and SOT denotes the success of object tracking. SR of OT refers to the rate of success with which the system traces objects in all frames. Similarly, in equation (3), SR of FD denotes the rate of success of the system in detecting falls, expressed as the number of detected falls as a fraction of the total number of actual falls.
Table 1. Experimental results of our system performance.
SR: success rate; OT: object tracking; FD: fall detection; FFD: failure of fall detection.
Table 1 summarizes the experimental success rates of OT and FD. The experiments involve various scenarios, including one in which multiple persons fall at different times and one in which random persons fall consecutively. For example, in the second row of the table, 40 falls by different people occur but only one person falls during the 310 s that is recorded in the video. The experimental results demonstrate that the rate of SOT is 100% and the rate of success of fall detection is also 100%. The same rates are achieved in the detection of the 80 falls by two persons. The rates of SOT and fall detection are reduced when more persons fall at once. The SR of OT and SR of FD are reduced to approximately 88% when seven persons fall consecutively. The rates are reduced because more objects overlap in, and leave, a crowded space. More mixed object information results in more misjudgments. However, for the proposed system, the rates of SOT and fall detection exceed 90% even when seven persons fall down at once.
As another example, the eighth row of Table 1 describes seven persons falling simultaneously or consecutively, for a total of 140 falls, of which 89% are detected by the proposed system. The failed detection rate is therefore 11%; in other words, about 15 such falls are not detected by our system. Nevertheless, our system still improves on previous systems, which can detect a fall by only one person.
Table 1 also shows that some false falls are detected by our system. The parameter FFD denotes the number of false detections, in which a person does not fall but our system reports a fall. The experimental results show that the scenarios of falling by four, six, and seven persons have one, three, and three false detections, respectively. These false detections occur because multiple overlaps of persons appear in the scene. Table 1 shows that the system achieves a 100% fall detection rate not only for one fallen person but also for two and five fallen persons.
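The arithmetic behind these figures can be checked with a short sketch. It assumes the definition given in the text for the success rate of fall detection (detected falls as a fraction of actual falls, expressed as a percentage); the function names are hypothetical, and the only data used are the 140 falls and the 89% rate quoted above.

```python
def success_rate(successes: int, total: int) -> float:
    """Success rate in percent: successes as a fraction of the total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * successes / total


def missed_count(total: int, rate_percent: float) -> int:
    """Number of undetected events implied by a success rate, rounded."""
    return round(total * (100.0 - rate_percent) / 100.0)


if __name__ == "__main__":
    # 140 falls with an 89% detection rate leave 11%, or 15.4 falls, undetected.
    print(missed_count(140, 89))
```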
Conclusion
The proposed video surveillance system can track all persons in an image space and detect falls by multiple persons. The proposed foreground block capture method combines the segments of each object after the background model has been constructed. The method finds all complete objects in the foreground image. Object detection relies on a method that determines the overlap and separation of objects. The method is effective in identifying whether objects overlap or are separate, and it influences the accuracy of the subsequent fall detection process. Most similar systems are designed only to detect the fall of a single person because they ignore the possibility that overlapping people may fall. This work proposes a combined method that effectively detects consecutive falls by multiple persons. The proposed system achieves a 100% success rate in object tracking and fall detection when only one or two persons fall. The experimental results also reveal that the corresponding success rates exceed 90% when seven persons fall down consecutively. Previous studies focus on detecting a fall by only one elderly person. Our system effectively detects not only a fall by one elderly person but also falls by multiple elderly persons. The major contribution of this article is to improve the accuracy of fall detection and to detect falls by multiple persons effectively.
Footnotes
Acknowledgements
Ted Knoy is appreciated for his editorial assistance.
Academic Editor: Stephen D Prior
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors would like to thank the Ministry of Education of the Republic of China, Taiwan, for financially supporting this research under Contract No. 104E018-02.
