Data-Wave-Based Features Extraction and Its Application in Symbol Identifier Recognition and Positioning Suitable for Multi-Robot Systems

Abstract

In this paper, feature extraction based on data-wave is proposed. The concept of data-wave is introduced to describe the long-term rising and falling trends of data, which are detected by means of ripple and wave filters. Supported by data-wave, a novel symbol identifier with significant structure features is designed, and these features are extracted by constructing pixel chains. On this basis, the corresponding recognition and positioning approach is presented. The effectiveness of the proposed approach is verified by experiments.

Keywords

Data-wave; Symbol identifier; Structure features; Pixel chain; Recognition and positioning

1. Introduction

Vision has been widely studied for many robotic applications such as visual servoing^[1], object tracking^[2], human-robot interaction, etc. Among these applications, visual recognition and positioning plays an important role. In structured environments where illumination conditions are stable, vision-based recognition and positioning has been solved well when the camera is close to the object. However, problems arise when the distance is larger or the environment is complex and unstable. Several methods have been proposed to address these problems. Background subtraction^[3] is often used for object tracking, and the positions of the objects can be calculated with the help of prior knowledge. This method is feasible with immobile cameras; however, it is hard to distinguish different objects effectively. Template matching algorithms^[4] based on image regions and feature point matching^[5] have been used for object recognition. In [6], Yamaguchi and Mizoguchi present an object recognition method based on online template learning. There are also many recognition methods based on the object features of shape, colour and texture^[7–9].

For robots working in unknown and complex environments, especially in a multi-robot system whose members cooperate with each other, real-time recognition and positioning of multiple objects is required, together with adaptability to changing environments. In this case, one solution that has received much attention is to equip objects of interest with identifiers, so that the problem is converted into the recognition and positioning of identifiers. As a typical identifier, the colour mark is commonly used^[10][11]. In order to recognize different robots or objects with corresponding colour marks, a basic method is to use thresholds in RGB space to find the colours of interest and achieve identification. Compared with RGB space, HSI has been shown to be more robust against light changes^[12]. In [13], Minten et al. design a fiducial mark made up of two rectangles with different colours and use a spherical coordinate transform (SCT)^[14]. Although significant improvements have been made in colour mark-based recognition, this method still has many limitations and its environmental adaptability needs to be further enhanced. A feasible way to reduce the influence of illumination is to use shape features instead of colour. In [15], a method based on shape features for robot navigation is given. In addition, some image quality improvement methods, such as colour correction^[16][17] and super-resolution^[18], may also be used for better recognition. Considering the need to handle multiple objects with sufficient discrimination as well as visual angle adaptability, this paper utilizes structure features for better recognition.

The main contributions of this paper are summarized as follows. We introduce the concept of data-wave as well as the corresponding detection means, which is applicable to structure information extraction of an image. On this basis, a novel symbol identifier is designed with its structure features in the form of data-wave extracted from pixel chains.

The remainder of the paper is organized as follows. Section 2 describes the feature extraction based on data-wave. In Section 3, a kind of symbol identifier is designed and the corresponding recognition and positioning approach is also presented. The experiments are given in Section 4 and Section 5 concludes the paper.

2. Feature Extraction Based on Data-wave

2.1 Data-wave

We define a sequence of data with a rising trend and a falling trend over the long term as a data-wave, as shown in Figure 1. There are two types of data-wave: ridge wave and valley wave. In a ridge wave, the rising segment comes before the falling segment, while in a valley wave the rising segment comes after the falling segment. Note that some fluctuations may exist that locally disturb the overall variation trend. A fluctuation in which the middle one of three adjacent data is obviously larger or smaller than the other two is called a ripple. Accordingly, we have ridge ripples and valley ripples (see Figure 1). Given a sequence of data DS, a ripple filter and a wave filter are designed in order to acquire the rising and falling trends.

Figure 1.

Data-wave and ripple

When three adjacent data d₁, d₂, d₃ satisfy the conditions m(d)≥th₁ and sgn(d₂−d₁)×sgn(d₂−d₃)=1, these three data compose a ripple, where m(d)=|2d₂−d₁−d₃| is the fluctuation amplitude and th₁ is a given threshold. The ripple filter replaces the intermediate datum d₂ with (d₁+d₃)/2. An illustration of a ripple filter is given in Figure 2.

Figure 2.

Ripple filter
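
The ripple filter described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `sgn` and `ripple_filter` are hypothetical, and it assumes that ripples are detected on the original data (not on already-filtered neighbours), which the text does not specify.

```python
def sgn(x):
    """Sign function: returns -1, 0 or 1."""
    return (x > 0) - (x < 0)


def ripple_filter(ds, th1):
    """Return a copy of ds in which every detected ripple is flattened.

    Three adjacent values d1, d2, d3 form a ripple when d2 lies strictly
    above or strictly below both neighbours and the fluctuation amplitude
    |2*d2 - d1 - d3| is at least th1.
    Assumption: detection always reads the unfiltered input values.
    """
    out = list(ds)
    for i in range(1, len(ds) - 1):
        d1, d2, d3 = ds[i - 1], ds[i], ds[i + 1]
        amplitude = abs(2 * d2 - d1 - d3)
        is_peak_or_dip = sgn(d2 - d1) * sgn(d2 - d3) == 1
        if amplitude >= th1 and is_peak_or_dip:
            out[i] = (d1 + d3) / 2  # replace the middle value by the mean
    return out


print(ripple_filter([10, 10, 100, 10, 10], th1=60))  # [10, 10, 10.0, 10, 10]
```

Here the isolated spike of 100 is a ridge ripple with amplitude 180, so it is replaced by the mean of its neighbours, while the flat values around it are left untouched.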

In order to detect the data rising segments and data falling segments of DS, a wave filter is applied to the result of the ripple filter as follows.

fw_{D} = \begin{cases} -1 & D_{i} < t_{g2} \\ 0 & t_{g2} \leq D_{i} \leq t_{g1} \\ 1 & D_{i} > t_{g1} \end{cases}

(1)

where D_i is the value of the ith element in DS, t_g₁=avr_D+th₂, t_g₂=avr_D−th₂, avr_D may be given by the average value of the data and th₂ is a given threshold.

From Eq. (1), only three values, −1, 0 and 1, remain after the wave filter is executed, as shown in Figure 3. We regard one −1 followed by a series of 0 and then one 1 as a data rising segment, whose length is the number of data it contains. A data falling segment arises when one 1 comes before a series of 0 and one −1 comes after them. The first and the last values of a segment are called the head value and tail value, respectively. In particular, a data rising or falling segment may contain only a few 0, or even none.

Figure 3.

Data filtering and segment detection based on ripple filter and wave filter
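
As a rough illustration of Eq. (1) and of the segment definition, the following Python sketch maps a data sequence to the values −1, 0, 1 and then scans for rising and falling segments. The names `wave_filter` and `find_segments` are hypothetical, avr_D is assumed to default to the mean of the data, and the length test against l_t introduced in Section 2.2 is deliberately left out.

```python
def wave_filter(ds, th2, avr=None):
    """Map each value to -1, 0 or 1 according to Eq. (1)."""
    if avr is None:
        avr = sum(ds) / len(ds)  # assumption: avr_D is the mean of the data
    tg1, tg2 = avr + th2, avr - th2
    return [-1 if d < tg2 else (1 if d > tg1 else 0) for d in ds]


def find_segments(fw):
    """Locate rising and falling segments in wave-filter output.

    A rising segment is one -1, then zero or more 0s, then one 1;
    a falling segment is one 1, then zero or more 0s, then one -1.
    Returns (kind, head_index, tail_index) tuples, indices inclusive.
    """
    segments = []
    last = None  # index of the most recent non-zero value
    for i, v in enumerate(fw):
        if v == 0:
            continue
        if last is not None and fw[last] == -v:
            kind = "rising" if v == 1 else "falling"
            segments.append((kind, last, i))
        last = i
    return segments


fw = wave_filter([0, 0, 40, 100, 100, 40, 0, 0], th2=5)
print(fw)                 # [-1, -1, 0, 1, 1, 0, -1, -1]
print(find_segments(fw))  # [('rising', 1, 3), ('falling', 4, 6)]
```

In this example the rising segment precedes the falling segment, so the sequence would be read as a ridge wave.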

2.2 The Application of Data-wave in Image Analysis

In an image, we define a sequence of connected pixel points as a pixel chain, whose data may be analysed by data-wave. In a grey image, a pixel chain may be expressed by a single data sequence formed by the grey values of its pixels, while in a colour image it is expressed by a combination of three data sequences corresponding to the R, G and B components of the pixels. When a pixel chain goes across an image edge, there is at least one data rising or falling segment. If a line is crossed by a pixel chain, there is at least one ridge wave or valley wave. By analysing the data rising and falling of flexibly chosen pixel chains, abundant structure information may be described, and this description is robust to local interference.

In the following, data rising and falling are determined from the output of the wave filter. First, we analyse a random DS whose data are uniformly distributed within [0, D_u]. Let avr_D=ηD_u and th₂=kD_u, where η∈[0, 1], k∈[0, 0.5]. Because the data in a DS are independent of each other, from Figure 3, the maximum appearance probability of a rising/falling segment of length l may be calculated as follows.

p_{r}(l) = \frac{\eta D_{u} - k D_{u}}{D_{u}} \cdot \frac{(2 k D_{u})^{fr}}{D_{u}^{fr}} \cdot \frac{D_{u} - \eta D_{u} - k D_{u}}{D_{u}} = (\eta - k) (2k)^{fr} (1 - \eta - k)

(2)

where $fr = \mathrm{floor}\left(\frac{l - 2}{2}\right)$ is the minimum number of data between t_g₁ and t_g₂.

Figure 4 gives the curve of p_r(l) with η=0.5 and k=0.02. From Figure 4, we can see that the maximum appearance probability of a rising/falling segment decreases dramatically as l increases. A smaller l corresponds to a higher probability in a random DS, which means that a segment of length l detected in an actual signal may well arise by chance and carries little reliable information. In actual applications, a larger l is therefore preferred for reliability. In addition, the reliability may also be improved by increasing the numbers l₁, l₂ of head values and tail values before and after the segment. When l₁+l₂+l≥l_t, where l_t is a given threshold, we consider that a data rising or falling segment exists.

Figure 4.

The curve of p_r(l) with η=0.5 and k=0.02
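
To make the decay in Figure 4 concrete, Eq. (2) can be evaluated directly. The sketch below uses the parameter values of Figure 4 (η=0.5, k=0.02); the function name `p_r` simply mirrors the symbol in the text, and the code assumes l≥2 so that fr is non-negative.

```python
def p_r(l, eta=0.5, k=0.02):
    """Maximum appearance probability of a rising/falling segment, Eq. (2)."""
    if l < 2:
        raise ValueError("a segment needs at least a head and a tail value")
    fr = (l - 2) // 2  # floor((l - 2) / 2)
    return (eta - k) * (2 * k) ** fr * (1 - eta - k)


for l in range(2, 9):
    print(l, p_r(l))
```

With these parameters the probability is roughly 0.23 for l=2 and roughly 0.0092 for l=4, i.e., each increase of fr by one multiplies the probability by 2k=0.04.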

3. The Recognition and Positioning Approach of Symbol Identifier

3.1 The Design of Symbol Identifier

In order to better extract the structure features by utilizing data-wave, the symbol identifier of the object is specifically designed so that positioning information is implied, as shown in Figure 5. The symbol identifier is composed of three zones: the top arch zone, the radial central zone and the bottom arch zone. In addition, the radial central zone has four parts: left, up, right and down. The spokes are distributed only in the left and right parts of the radial central zone, and they taper off from the exterior towards the centre point p_sc of the radial central zone. The two arcs, which lie in the top arch zone and bottom arch zone respectively, belong to a circle whose centre is p_sc. In addition, there is a small darker circular area around p_sc.

Figure 5.

The symbol identifier

In each zone, there are four candidate colours: saturated red, saturated green, saturated blue and pure black. RGB space is chosen to describe colours, and a saturated colour is one in which two of the RGB components are 0 while the third is large (usually ≥150). For convenience of expression, the nonzero component of a saturated colour is called its main component, while the other components are called vice-components. Additionally, the number of spokes in the left or right part may be 0, 1, 2, 3 or 4. Therefore, there may be up to 4³×5²=1600 possible combinations for the symbol identifier. Even if some invalid cases are removed, such as those with no spokes in the left and right parts, a substantial number remains, which is sufficient for actual applications.
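
The size of the identifier set can be checked by brute-force enumeration. The sketch below is illustrative only: the name `enumerate_identifiers` is hypothetical, and it assumes that the only invalid case is the one with no spokes on both sides, which is one reading of the exclusion mentioned above.

```python
from itertools import product

COLOURS = ("red", "green", "blue", "black")  # candidate colours per zone
SPOKE_COUNTS = range(5)                      # 0 to 4 spokes per side


def enumerate_identifiers(require_spoke=True):
    """List (top, centre, bottom, left_spokes, right_spokes) combinations.

    Assumption: when require_spoke is True, only the case with zero spokes
    on both the left and the right side is treated as invalid.
    """
    identifiers = []
    for top, centre, bottom in product(COLOURS, repeat=3):
        for left, right in product(SPOKE_COUNTS, repeat=2):
            if require_spoke and left == 0 and right == 0:
                continue
            identifiers.append((top, centre, bottom, left, right))
    return identifiers


print(len(enumerate_identifiers(require_spoke=False)))  # 1600
print(len(enumerate_identifiers(require_spoke=True)))   # 1536
```

Under this assumption, 4³×(5²−1)=1536 distinct identifiers remain after the spokeless cases are dropped.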

3.2 The Features of the Symbol Identifier

In order to extract the structure features of the symbol identifier, a template with N_c+1 pixel chains is constructed, as shown in Figure 6, where N_c is an even number. If the base point of the template is the centre point of the symbol identifier, these chains should have the following features:

Figure 6.

The structure features' extraction template based on pixel chains for the symbol identifier

The data sequences of pixel chains 1, 3,…, N_c−1 corresponding to the vice-components of the spokes' colour should have the same number and arrangement of data rising segments and data falling segments. In addition, the number of valley waves is equal to the number of spokes that the chains cross. Similar conclusions can be drawn for pixel chains 2, 4,…, N_c.

When pixel chain N_c+1 is short enough, the data sequences of this chain corresponding to the vice-components of the spokes' colour should have one valley wave due to the small darker circular area around p_sc. As the length of pixel chain N_c+1 increases, a data rising segment and a data falling segment corresponding to the two arcs in the top arch zone and bottom arch zone arise, and they are almost symmetrical about the base point. Additionally, there is only one valley wave between these two segments.

Besides the features extracted from pixel chains, some other features of the centre point of the symbol identifier can also be found:

Luminance feature: RGB components are lower than a given threshold due to the small darker circular area around p_sc.

Luminance difference feature: RGB components are obviously lower than those of points in the up part and down part of radial central zone.

3.3 Recognition of Symbol Identifier

Without image preprocessing, a series of judgment machines is constructed for recognition based on the features provided in Section 3.2, each corresponding to a certain feature. Every point of an image is judged by the judgment machines one by one, and it is removed directly once it fails to satisfy the criterion of any judgment machine. The points that pass all judgment machines constitute a set of candidate points, which are classified according to the distribution of spokes and the arrangement of colours. Combined with prior knowledge, the centre points of the corresponding symbol identifiers are then acquired.

The order of the judgment machines may also be optimized. We introduce the passing rate p and computing time t to describe the properties of a judgment machine. The passing rate of a judgment machine is the ratio of the number of points that pass the judgment to the total number of points it judges. For an image with a great number of points, this index is statistically relatively stable. When the features used by the judgment machines are independent of each other, the passing rate P of the combination of all judgment machines is the product of the individual passing rates. Thus, we have $P = \prod_{i} p_{i}$, where p_i is the passing rate of the ith judgment machine, which may be obtained experimentally. The total computing time of all judgment machines may be given by
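
The cascade of judgment machines can be sketched as a list of predicates applied in order, with a point discarded at its first failure. The example below is a simplified illustration on a grey image stored as nested lists: the two machines loosely follow the luminance and luminance difference features of Section 3.2, and all names, thresholds and the vertical `offset` are hypothetical.

```python
def make_luminance_machine(image, threshold):
    """Pass points whose grey value is below the given threshold."""
    def machine(pt):
        r, c = pt
        return image[r][c] < threshold
    return machine


def make_difference_machine(image, offset, margin):
    """Pass points clearly darker than the pixels `offset` rows above and below."""
    def machine(pt):
        r, c = pt
        if r - offset < 0 or r + offset >= len(image):
            return False  # neighbours fall outside the image
        centre = image[r][c]
        return (image[r - offset][c] - centre >= margin
                and image[r + offset][c] - centre >= margin)
    return machine


def run_cascade(points, machines):
    """Keep only the points that pass every machine.

    all() short-circuits, so later machines never see a rejected point.
    """
    return [pt for pt in points if all(m(pt) for m in machines)]


image = [
    [200, 200, 200],
    [200,  20,  30],
    [200, 200,  40],
]
points = [(r, c) for r in range(3) for c in range(3)]
machines = [make_luminance_machine(image, 50),
            make_difference_machine(image, offset=1, margin=100)]
print(run_cascade(points, machines))  # [(1, 1)]
```

Only the point (1, 1) is both dark enough and clearly darker than its upper and lower neighbours, so it is the single candidate point left.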

T = \sum_{i} \left( N_{A} t_{i} \prod_{j = 0}^{i - 1} p_{j} \right)

(3)

where N_A is the total number of points in an image, t_i is the computing time of the ith judgment machine and p₀=1 for unity of expression. Ordering the judgment machines thus becomes the problem of minimizing T, which is a typical multi-stage optimization problem. Suppose there are only two stages, so that T₁=N_At₁+N_Ap₁t₂. After the order of these two stages is exchanged, we have T₂=N_At₂+N_Ap₂t₁. Therefore, T₁−T₂=N_A(1−p₂)t₁−N_A(1−p₁)t₂. When T₁<T₂, we get:

\frac{t_{1}}{1 - p_{1}} < \frac{t_{2}}{1 - p_{2}}

(4)

We define $q_{i} = \frac{t_{i}}{1 - p_{i}}$ as the average removing time of the ith judgment machine. The optimized order is obtained by arranging the judgment machines in ascending order of q_i.
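
The ordering rule and Eq. (3) can be tried out on a small example. In the sketch below each judgment machine is represented by a hypothetical (t_i, p_i) pair with made-up values; `total_time` evaluates Eq. (3) for a given order and `order_machines` sorts by q_i. A machine with p_i=1 removes nothing, so it is assumed to belong at the end.

```python
def total_time(n_points, machines):
    """Evaluate Eq. (3) for an ordered list of (t_i, p_i) pairs."""
    total = 0.0
    reach = 1.0  # p_0 * p_1 * ... * p_{i-1}, with p_0 = 1
    for t, p in machines:
        total += n_points * t * reach
        reach *= p
    return total


def order_machines(machines):
    """Sort machines by average removing time q_i = t_i / (1 - p_i)."""
    def q(machine):
        t, p = machine
        # Assumption: a machine that passes everything is placed last.
        return float("inf") if p >= 1 else t / (1 - p)
    return sorted(machines, key=q)


machines = [(4.0, 0.5), (1.0, 0.9), (2.0, 0.2)]  # illustrative values only
ordered = order_machines(machines)
print(ordered)  # [(2.0, 0.2), (4.0, 0.5), (1.0, 0.9)]
print(total_time(1000, machines), total_time(1000, ordered))
```

For these values q is 8, 10 and 2.5, respectively, and reordering reduces the total time from about 5400 to about 2900 time units. Note that the cheapest machine (t=1.0) ends up last because it removes very few points.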

3.4 Positioning of Symbol Identifier

Figure 7 gives the positioning diagram of the symbol identifier. We establish the camera coordinate system, whose origin is the camera lens and whose Z axis is the lens axis, and thus we have

Figure 7.

The positioning diagram of the symbol identifier

\begin{array}{l} \frac{\sqrt{{(u_{2} - u_{1})}^{2} + {(v_{2} - v_{1})}^{2}}}{f} = \frac{d_{A}}{z} \\ \frac{x_{1}}{u_{1}} = \frac{x_{2}}{u_{2}} = \frac{y_{1}}{v_{1}} = \frac{y_{2}}{v_{2}} = \frac{z}{f} \end{array}

(5)

where (x₁, y₁, z) and (x₂, y₂, z) are the coordinates of the centre points of the top arc and bottom arc, respectively, and (u₁, v₁) and (u₂, v₂) are their image coordinates, which may be obtained based on pixel chain N_c+1; f is the focal length of the camera in pixels; d_A is the actual diameter of the circle on which these two arcs lie. Therefore, the space position of the symbol identifier may be given by ((x₁+x₂)/2, (y₁+y₂)/2, z).
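
Eq. (5) can be solved in closed form: the first relation gives z, and the second gives the x and y coordinates of both arc centre points. The following sketch assumes that the image coordinates are measured relative to the point where the lens axis meets the image plane and that both arc points lie at the same depth z, as Eq. (5) implies; the function name `locate_identifier` and the numerical values are hypothetical.

```python
import math


def locate_identifier(p1, p2, f, d_a):
    """Return the position ((x1+x2)/2, (y1+y2)/2, z) obtained from Eq. (5).

    p1, p2: image coordinates (u, v) of the top and bottom arc centre points,
            assumed to be relative to the optical centre of the image.
    f:      focal length in pixels.
    d_a:    actual diameter of the circle carrying the two arcs.
    """
    (u1, v1), (u2, v2) = p1, p2
    pixel_dist = math.hypot(u2 - u1, v2 - v1)
    if pixel_dist == 0:
        raise ValueError("the two arc points coincide in the image")
    z = f * d_a / pixel_dist      # first relation of Eq. (5)
    scale = z / f                 # second relation: x = u * z / f, y = v * z / f
    x1, y1 = u1 * scale, v1 * scale
    x2, y2 = u2 * scale, v2 * scale
    return ((x1 + x2) / 2, (y1 + y2) / 2, z)


# Illustrative values: f = 800 pixels, d_A = 19 cm, arcs 80 pixels apart.
print(locate_identifier((10, -40), (10, 40), f=800, d_a=19))
```

With these values the identifier is placed at a depth of 190 cm, about 2.4 cm to the side of the lens axis. The result is expressed in the same unit as d_A.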

4. Experiments

The experiments are conducted to verify the proposed approach. In order to recognize a symbol identifier from a greater distance, i.e., where it is small in the image, only ridge ripples are filtered and l_t cannot be too large due to the properties of both the identifier and the pixel chains. In the following experiments, the parameters are as follows: l_t=4, th₁=60, th₂=5, N_c=2, d_A=19 cm.

Experiment 1 is conducted to test the adaptability to distance and visual angle. Figure 8 gives the recognition results, which are marked by red circles. In experiment 2, the illumination conditions are varied and the results are depicted in Figure 9. Figure 9(a) and Figure 9(b) show the recognition results in a brighter and a darker environment, respectively.

Figure 8.

The recognition result of the symbol identifier in experiment 1

Figure 9.

The recognition results of the symbol identifier in different illumination conditions

Experiment 3 considers the case of a moving object. Figure 10 gives the positioning result of the symbol identifier placed on the object. Comparing the measured positions of the centre point of the symbol identifier with the actual path, the proposed approach is considered effective.

Figure 10.

The positioning result of experiment 3

5. Conclusions

This paper first describes feature extraction based on data-wave. A specifically designed symbol identifier for identifying objects is then given, and its structure features are extracted from a series of pixel chains. On this basis, the recognition and positioning approach of the symbol identifier is proposed. The results of the experiments show that the proposed approach works well, with a certain adaptability to illumination conditions. In the future, we shall focus on the construction of pixel chains for better extraction of structure features. In addition, the approach will be applied in actual multi-robot systems.

6. Acknowledgments

This work is supported in part by the National Natural Science Foundation of China under grants 61273352 and 61175111, and in part by the National High Technology Research and Development Program of China (863 Program) under grant 2011AA041001.

References

1. Wang, Lang H X, Silva C W (2010) A Hybrid Visual Servo Controller for Robust Grasping by Wheeled Mobile Robots. IEEE/ASME Transactions on Mechatronics, Vol. 15, no. 5, pp. 757–769

2. Xia G H, Xing Z Y (2007) A New Algorithm for Target Recognition and Tracking for Robot Vision System. 2007 IEEE International Conference on Control and Automation, Guangzhou, China, pp. 1004–1008

3. Tsai D M, Lai S C (2009) Independent Component Analysis-Based Background Subtraction for Indoor Surveillance. IEEE Transactions on Image Processing, Vol. 18, no. 1, pp. 158–167

4. Zhong, Jain A K, Dubuisson-Jolly M P (2000) Object Tracking Using Deformable Templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, no. 5, pp. 544–549

5. Kang J S, Jeon, Park (2009) Mobile Vision Robot Using the SIFT Algorithm for Object Recognition. International Conference on Ultra Modern Telecommunications & Workshops, St. Petersburg, Russia

6. Yamaguchi, Mizoguchi (2003) Robot Vision to Recognize both Face and Object for Human-Robot Ball Playing. Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Kobe, Japan, pp. 999–1004

7. Leichter, Lindenbaum, Rivlin (2009) Tracking by Affine Kernel Transformations Using Color and Boundary Cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, no. 1, pp. 164–171

8. Caban J J, Joshi, Rheingans (2007) Texture-based feature tracking for effective time-varying data visualization. IEEE Transactions on Visualization and Computer Graphics, Vol. 13, no. 6, pp. 1472–1479

9. Wang J Q, Yagi (2008) Integrating Color and Shape-Texture Features for Adaptive Real-Time Object Tracking. IEEE Transactions on Image Processing, Vol. 17, no. 2, pp. 235–240

10. Tang H B, Wang, Sun Z Q (2004) Accurate and Stable Vision in Robot Soccer. International Conference on Control, Automation, Robotics and Vision, Kunming, China, pp. 2314–2319

11. Chiang J S, Hsia C H, Chang S H (2010) An Efficient Object Recognition and Self-Localization System for Humanoid Soccer Robot. SICE Annual Conference, Taipei, Taiwan, pp. 2269–2278

12. Ren, Zhong Q B (2008) A New Image Segmentation Method Based on HSI Color Space for Biped Soccer Robot. Proceedings of 2008 IEEE International Symposium on IT in Medicine and Education, Xiamen, China, pp. 1058–1061

13. Minten B W, Murphy R R, Hyams, Micire (2001) Low-Order-Complexity Vision-Based Docking. IEEE Transactions on Robotics and Automation, Vol. 17, no. 6, pp. 922–930

14. Hyams, Powell M W, Murphy (2000) Cooperative Navigation of Micro-rovers Using Color Segmentation. Autonomous Robots, Vol. 9, pp. 7–16

15. Deans, Kunz, Sargent (2005) Combined Feature Based and Shape Based Visual Tracker for Robot Navigation. IEEE Aerospace Conference, Big Sky, Montana, USA

16. Siddiqui, Bouman C A (2008) Hierarchical Color Correction for Camera Cell Phone Images. IEEE Transactions on Image Processing, Vol. 17, no. 11, pp. 2138–2155

17. Choi Y J, Lee Y B, Cho W D (2009) Color Correction for Object Identification from Images with Different Color Illumination. Fifth International Joint Conference on INC, IMS and IDC, Seoul, Korea, pp. 1598–1603

18. Xie L J, D L, Simske S J (2011) Feature Dimensionality Reduction for Example-based Image Super-resolution. Journal of Pattern Recognition Research, Vol. 6, no. 2, pp. 130–139