Abstract
This paper presents a rapid and efficient fish tracking method suitable for automatic underwater fish observation in the real world. Fish tracking allows biologists to observe fish and their ecological environment. A distributed real-time underwater video stream system has been developed in Taiwan for large-scale, long-term ecological observation. The system not only archives video data but also incorporates data analysis. However, in real-world underwater environments the water flow causes water plants to drift severely, which makes it difficult to discriminate moving fish from drifting water plants and complicates fish tracking in unconstrained water. To overcome this problem, we propose a bounding-surrounding boxes method that can be integrated with state-of-the-art tracking methods. The method requires fixed cameras, so that moving fish are classified as foreground objects and tracked, whereas drifting water plants are classified as background objects and removed from the set of tracked objects. It enables the efficient, rapid removal of irrelevant information (non-fish objects) from large-scale fish video data. Experimental results show that the proposed method achieves high accuracy.
1. Introduction
Ecological observation is essential for marine scientists studying marine ecosystems. It is, however, difficult to sustain long-term, real-time observation, mainly because the marine environment is hard to access. A distributed real-time underwater video stream system has been developed for long-term observation of ecosystems on the southern tropical coast of Taiwan [6]. The video data are not only broadcast in real time via the Internet, but also archived to form a resource base for further analysis.
In recent years, considerable research has been conducted on video monitoring systems. Detecting and retrieving moving objects are necessary pre-processing steps for real-time monitoring systems [2, 9], and numerous algorithms for tracking objects on land have been proposed. Background subtraction is regarded as an approach that captures the complete shape of tracked objects [4]. In particular, the Gaussian Mixture Model (GMM) proposed by Stauffer and Grimson [8], an adaptive background subtraction method used widely in many areas, enables dynamic background models to be built. Although many applications have been proposed, challenges remain under uncontrolled conditions such as real-life underwater systems [1, 7]. Fish detection and tracking are complicated by the variability of the underwater environment. Water plants drift severely under the influence of the water flow and are therefore regarded as foreground objects, which makes it difficult to discriminate moving fish from drifting water plants. As a result, the accuracy of fish tracking with traditional methods is seriously affected. In this paper, a rapid fish tracking method, bounding-surrounding boxes (BSB), is implemented to deal efficiently with the problem of drifting water plants. It classifies moving fish as foreground objects and drifting water plants as background objects, which enables the efficient, rapid removal of irrelevant information.
The rest of the paper is organized as follows. Section 2 briefly introduces the real world underwater video stream system. Section 3 describes the BSB method for fish tracking. Section 4 shows experimental results and the conclusion is drawn in Section 5.
2. Real World Underwater Video Stream System
In this paper, a distributed real-time underwater video stream system is developed to conduct long-term underwater fish observation. Figure 1 shows the architecture of the underwater video stream system. Four fixed CCTV cameras are set up underwater. The video signal captured by each CCTV camera is converted into a Motion JPEG stream. However, the native Motion JPEG stream can reach 20 gigabytes per hour. With 12 or more hours of usable daylight per day, this amounts to the order of 100 terabytes of data per year. We therefore transcode the source stream into multiple encoded formats and bitrates to reduce the amount of data. To find a format that offers both good quality and a low data volume, we compute the Peak Signal-to-Noise Ratio (PSNR) between the native Motion JPEG stream and each encoded format. Figure 2 shows the PSNR values for the different encoded formats and bitrates. The MPEG-4 format with a 5 Mb bitrate (PSNR = 31.87) is adopted, since it meets our requirements and allows both real-time observation (low data volume) and fish tracking (good data quality).

Figure 1. Architecture of the underwater video stream system.

Figure 2. The PSNR values with different encoded formats and bitrates.
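The format comparison above rests on two simple calculations: the PSNR between a reference frame and its re-encoded version, and the yearly data volume implied by the quoted stream rate. The following is a minimal sketch of both, not the authors' implementation. It assumes 8-bit greyscale frames stored as lists of rows; the function name psnr and the variable names are illustrative only.

```python
import math


def psnr(reference, encoded, peak=255):
    """Peak Signal-to-Noise Ratio between two equally sized frames.

    Frames are lists of rows of pixel intensities (an assumed layout).
    """
    squared_error = 0
    count = 0
    for ref_row, enc_row in zip(reference, encoded):
        for r, e in zip(ref_row, enc_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


# Back-of-envelope storage estimate from the figures quoted in the text:
# 20 GB per hour over 12 hours of daylight, every day of the year.
gb_per_hour = 20
hours_per_day = 12
tb_per_year = gb_per_hour * hours_per_day * 365 / 1000  # decimal terabytes
```

With exactly 12 hours per day the estimate is 87.6 TB per year, which is consistent with the stated order of 100 terabytes once the additional daylight hours are counted.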
In addition, the interlace effect present in the source data may impair fish detection and tracking. Hence, a motion adaptive deinterlacing method [3] is implemented to remove it. First, the motion areas of the video data are detected. Then, intra-field deinterlacing is applied in motion areas and inter-field deinterlacing is applied in static areas. Figure 3 shows the workflow of the motion adaptive deinterlacing method, which combines the advantages of intra-field and inter-field deinterlacing. Figure 4(a) presents the original image with the interlace effect and Figure 4(b) shows the deinterlacing result.

Figure 3. The workflow of the motion adaptive deinterlacing method.

Figure 4. (a) The original image with the interlace effect; (b) the deinterlacing result.
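The two-branch idea described above (interpolate within the field where there is motion, keep the other field where the scene is static) can be sketched as follows. This is a simplified illustration rather than the method of [3]: it assumes greyscale frames stored as lists of rows, treats the even rows as the reference field, and detects motion with a plain per-pixel frame difference against a hypothetical threshold.

```python
def deinterlace(prev_frame, frame, threshold=10):
    """Motion-adaptive deinterlacing of a greyscale frame (list of rows).

    Even rows are treated as the reference field. Odd rows are kept as they
    are (inter-field weave) where the scene is static, and are re-interpolated
    from the neighbouring even rows (intra-field) where motion is detected.
    """
    height = len(frame)
    width = len(frame[0])
    out = [row[:] for row in frame]
    for y in range(1, height, 2):
        above = frame[y - 1]
        # The last odd row may have no row below it; reuse the row above.
        below = frame[y + 1] if y + 1 < height else frame[y - 1]
        for x in range(width):
            # Assumed motion test: simple difference with the previous frame.
            moving = abs(frame[y][x] - prev_frame[y][x]) > threshold
            if moving:
                out[y][x] = (above[x] + below[x]) // 2
    return out
```

Static regions keep the full vertical resolution of both fields, while moving regions trade resolution for the absence of comb artefacts.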
Afterwards, the MPEG-4 stream is processed in two ways: it is converted into MPEG-4 video files that are kept in local storage, and it is transmitted directly to a multicasting pool over ADSL lines for real-time observation. Figure 5 shows the real-time underwater stream observation via the Internet.

Figure 5. The real-time underwater video stream observation via the Internet.
3. Bounding-Surrounding Boxes Method for Fish Tracking
In this paper, we propose a Bounding-Surrounding Boxes (BSB) method, which can be integrated with several tracking algorithms, such as particle filtering or probability hypothesis density filtering, to improve the tracking result. The aim of the BSB method is to efficiently distinguish moving fish from drifting water plants among the foreground objects. Here, the BSB method is integrated with a GMM-based tracking method for fish tracking.
3.1 Background Subtraction
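Fish tracking is implemented using the stored video data and background subtraction is the first step. In this paper, the GMM [8] is utilized to build a background model and to update this model frame by frame. It can be updated and restructured by spatial variation of successive images. Each pixel is modelled by a mixture of G Gaussian distributions. The history of a pixel is defined as a time series $\{X_1, \ldots, X_t\}$ and, following the formulation in [8], the probability of observing the current pixel value is

$$P(X_t) = \sum_{i=1}^{G} \omega_{i,t}\, \eta(X_t, \mu_{i,t}, \Sigma_{i,t})$$

where $\omega_{i,t}$, $\mu_{i,t}$ and $\Sigma_{i,t}$ are the weight, the mean and the covariance matrix of the $i$-th Gaussian at time $t$, and $\eta$ is the Gaussian probability density function for an $n$-dimensional pixel value:

$$\eta(X_t, \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(X_t - \mu)^T \Sigma^{-1} (X_t - \mu)\right)$$

where the covariance matrix is assumed to be of the form:

$$\Sigma_{i,t} = \sigma_{i,t}^2 I$$

The weighted value $\omega_{i,t}$ is updated as

$$\omega_{i,t} = (1 - \alpha)\,\omega_{i,t-1} + \alpha M_{i,t}$$

where $M_{i,t}$ is 1 for the distribution that matches the current pixel value and 0 for the remaining distributions. The parameters of the matched distribution are updated as

$$\mu_{i,t} = (1 - \rho)\,\mu_{i,t-1} + \rho X_t$$

$$\sigma_{i,t}^2 = (1 - \rho)\,\sigma_{i,t-1}^2 + \rho\,(X_t - \mu_{i,t})^T (X_t - \mu_{i,t})$$

where

$$\rho = \alpha\, \eta(X_t, \mu_{i,t-1}, \Sigma_{i,t-1})$$

In the above formulas, $\alpha$ is the learning rate and $\rho$ is the update rate of the matched distribution.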

Figure 6. (a) The current frame; (b) the background model.
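To make the per-pixel update concrete, the following is a minimal sketch of an adaptive mixture update for a single greyscale pixel, in the spirit of [8]. It is not the authors' implementation and makes several simplifying assumptions that are labelled in the comments: scalar pixel values, a fixed update rate rho equal to alpha, a match test of 2.5 standard deviations, and hypothetical default values for the initial variance and the background ratio.

```python
def update_pixel_gmm(x, weights, means, variances, alpha=0.01,
                     match_sigmas=2.5, init_variance=900.0):
    """Update one pixel's mixture in place.

    Returns the index of the matched Gaussian, or -1 if none matched.
    """
    matched = -1
    for i in range(len(weights)):
        if abs(x - means[i]) <= match_sigmas * variances[i] ** 0.5:
            matched = i
            break
    if matched < 0:
        # No match: replace the least probable Gaussian with a wide one
        # centred on the new value (assumed initial weight and variance).
        worst = min(range(len(weights)), key=lambda i: weights[i])
        means[worst] = float(x)
        variances[worst] = init_variance
        weights[worst] = alpha
    else:
        for i in range(len(weights)):
            m = 1.0 if i == matched else 0.0
            weights[i] = (1 - alpha) * weights[i] + alpha * m
        rho = alpha  # simplification: constant update rate
        means[matched] = (1 - rho) * means[matched] + rho * x
        diff = x - means[matched]
        variances[matched] = (1 - rho) * variances[matched] + rho * diff * diff
    total = sum(weights)
    for i in range(len(weights)):
        weights[i] /= total
    return matched


def is_background(matched, weights, variances, background_ratio=0.7):
    """A pixel is background if it matched one of the dominant Gaussians."""
    if matched < 0:
        return False
    order = sorted(range(len(weights)),
                   key=lambda i: weights[i] / variances[i] ** 0.5,
                   reverse=True)
    cumulative = 0.0
    for i in order:
        if i == matched:
            return True
        cumulative += weights[i]
        if cumulative > background_ratio:
            return False
    return False
```

A pixel that keeps matching a heavy, narrow Gaussian is treated as background, whereas a value that matches no Gaussian, or only a recently created one, is treated as foreground.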
3.2 Foreground Segmentation and Tracking
After the background model is constructed, we use background subtraction: the pixel values of the background model are subtracted from those of the current frame in the R, G and B channels separately, which yields the foreground objects. However, because the foreground objects move, residual shadows usually appear in them after background subtraction, which leaves their shapes broken and disordered. To compensate for this, a 3×3 cross-shaped structuring element is used to apply morphological operations, namely erosion and dilation. The foreground pixels are then segmented into regions by the connected components algorithm.
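The clean-up and segmentation steps above can be sketched in a few lines. The text does not state the order of the two morphological operations, so this illustration assumes an opening (erosion followed by dilation) with a 3×3 cross, and it assumes 4-connectivity for the connected components. Masks are lists of rows of 0 and 1; all function names are illustrative.

```python
from collections import deque

# 3x3 cross-shaped structuring element as (dy, dx) offsets.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def _get(mask, y, x):
    """Pixel value, with everything outside the image treated as background."""
    inside = 0 <= y < len(mask) and 0 <= x < len(mask[0])
    return mask[y][x] if inside else 0


def erode(mask):
    return [[int(all(_get(mask, y + dy, x + dx) for dy, dx in CROSS))
             for x in range(len(mask[0]))] for y in range(len(mask))]


def dilate(mask):
    return [[int(any(_get(mask, y + dy, x + dx) for dy, dx in CROSS))
             for x in range(len(mask[0]))] for y in range(len(mask))]


def bounding_boxes(mask):
    """Label 4-connected regions; return (x_min, y_min, x_max, y_max) each."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy, dx in CROSS[1:]:
                    ny, nx = cy + dy, cx + dx
                    if _get(mask, ny, nx) and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Calling bounding_boxes(dilate(erode(mask))) removes isolated noise pixels before the regions are labelled, at the cost of rounding off the corners of larger regions.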
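By using colour information, the foreground images are compared to track the moving objects. In this paper, we apply a correlation coefficient histogram to measure the similarity between two consecutive frames in the time sequence. The most familiar measure of dependence between two values is Pearson's correlation [5]. It has the form:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the values of the $i$-th bins of the two colour histograms being compared, $\bar{x}$ and $\bar{y}$ are their means, and $n$ is the number of bins.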

Figure 7. (a) The binary foreground objects; (b) the bounding boxes of the foreground objects.
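The similarity measure above can be illustrated with a short sketch that builds an intensity histogram for each foreground object and computes Pearson's correlation between two histograms. The bin count, the single-channel histogram and the handling of flat histograms are assumptions made for the example, not details taken from the paper.

```python
def histogram(values, bins=16, max_value=256):
    """Count pixel values into equally wide bins (assumed 8-bit range)."""
    counts = [0] * bins
    for v in values:
        counts[v * bins // max_value] += 1
    return counts


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Flat histogram: the coefficient is undefined, treated as no match.
        return 0.0
    return cov / (var_x * var_y) ** 0.5
```

An object in the current frame would then be associated with the object in the previous frame whose histogram gives the highest coefficient, provided that the value exceeds a chosen threshold.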
3.3 Bounding-Surrounding Boxes Method
Owing to the interference of severely drifting water plants, the real-world underwater environment is unconstrained, which makes it difficult to discriminate moving fish from drifting water plants. Therefore, we propose a BSB method based on the observation that drifting water plants stay within a fixed field, whereas fish move through an unfixed field. Because the CCTV cameras are fixed, the underwater frame is stationary and our method can be applied. Each foreground object is circumscribed by its bounding box with width

Figure 8. Illustration of the bounding and surrounding boxes.
Let

Figure 9. (a) The object (red box) is classified as fish; (b) the object (blue box) is classified as non-fish (drifting water plant).
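The idea of a fixed field for plants and an unfixed field for fish can be illustrated with the sketch below. The exact decision rule of the BSB method is not reproduced here, so the sketch relies on assumptions: the surrounding box is taken to be the first bounding box of a track enlarged about its centre by a hypothetical per-side factor scale, and an object is labelled as fish as soon as its centroid leaves that box. The function names and the default value of scale are illustrative only.

```python
def surrounding_box(bbox, scale=2.0):
    """Enlarge a bounding box (x, y, width, height) about its centre.

    The per-side factor `scale` is an assumption made for this example.
    """
    x, y, w, h = bbox
    cx, cy = x + w / 2, y + h / 2
    sw, sh = w * scale, h * scale
    return (cx - sw / 2, cy - sh / 2, sw, sh)


def classify_track(first_bbox, centroids, scale=2.0):
    """Label a track as 'fish' if any centroid leaves the surrounding box.

    Assumed rule: objects that stay inside the box for the whole track are
    treated as drifting water plants ('non-fish').
    """
    sx, sy, sw, sh = surrounding_box(first_bbox, scale)
    for cx, cy in centroids:
        inside = sx <= cx <= sx + sw and sy <= cy <= sy + sh
        if not inside:
            return "fish"
    return "non-fish"
```

Under these assumptions, a plant that sways back and forth around its anchor point never leaves its surrounding box, while a fish swimming across the scene does so within a few frames.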
4. Experimental Results
To evaluate the effectiveness of the proposed system, ten different underwater videos with multiple complex scenes were tested. Each video is ten minutes long, with a frame rate of 20 fps and a resolution of 640×480. On a PC with an Intel Core i5-3570 CPU at 3.4 GHz, our method processes nine to ten frames per second. The aim of our experiment is to evaluate the accuracy of classifying moving fish as foreground objects and drifting water plants as background objects.
In our experiment, we define the size of the surrounding box as four times (
Where
The
The
Figure 10 shows the fish tracking results from another video obtained with the bounding-surrounding boxes method. Figure 10(a) shows the binary foreground image produced by background subtraction with a morphology operation and Figure 10(b) shows the background model constructed using the GMM. The red box and red line in Figure 10(c) represent the bounding box and the trajectory of a tracked object that is classified as a fish. The blue box and blue line in Figure 10(d) represent the bounding box and the trajectory of a tracked object that is classified as non-fish.

Figure 10. (a) The binary foreground objects; (b) the background model; (c) the object (red box) is classified as a fish; (d) the object (blue box) is classified as non-fish (drifting water plant).
5. Conclusion
In this paper, a distributed real-time underwater stream system is developed for fish observation and analysis in the real world. In particular, we propose a bounding-surrounding boxes method that can be integrated with several state-of-the-art tracking algorithms for fish tracking in complicated underwater scenes. The method requires fixed cameras, so that foreground objects such as moving fish and drifting water plants can be distinguished efficiently. That is, the moving fish are classified as foreground objects and continue to be tracked, while the drifting water plants are identified as background objects and are removed from the set of tracked objects. The performance of our method is computed by a classification success rate (
6. Acknowledgments
The research was funded by the Taiwan National Science Council (grant NSC 100-2933-I-492-001) and the European Commission (FP7 grant 257024) and undertaken in the Fish4Knowledge project (www.fish4knowledge.eu). We thank the Third Nuclear Power Plant of Taiwan Power Company and the National Museum of Marine Biology & Aquarium, Taiwan, for logistical support. We also thank the Ecogrid team at the National Centre for High-Performance Computing, Taiwan, for supplying the fish video data.
