Abstract
Terrain recognition plays a key role in enhancing the autonomous mobility of quadruped robots in off-road environments, and its accuracy and efficiency depend mainly on the feature extraction method and the classification algorithm. To capture the different surface properties and structures of terrains, low-dimensional and high-dimensional features are extracted using texture descriptors and the wavelet transform and are used as the training features of the classifier. Traditional learning algorithms are inefficient and converge slowly, which makes it difficult for them to meet real-time requirements. Therefore, the extreme learning machine is used to classify the terrain images collected by the robot in real time. Experimental results show that the extreme learning machine classifies terrain more accurately than the traditional neural network algorithm and the support vector machine, and that its efficiency is several times higher for a sample size of 6000, which meets the requirements for accuracy and, in particular, for real-time operation.
Introduction
The ability to recognize terrain in complex unstructured environments is a key factor in improving the motion efficiency of quadruped robots. Terrain recognition can be used to select different passable areas according to the terrain (such as grassland, highway, or stone road), to select different gaits according to the terrain environment, 1 and to implement different motion control strategies. In this process, the terrain feature extraction method and the classification algorithm are particularly important, because they directly affect the results of terrain recognition.
To improve the terrain recognition ability of quadruped robots, an appropriate terrain feature extraction method must be selected. Current methods are mainly based on absolute and relative elevation, 1 on slope and slope direction, 2 or on fractal dimension. 3 The method in Wang et al. 1 classifies terrain by extracting the relative and absolute height features of landform types such as Pingdi (flat land) and Gangqiu (hillocks). This method is only suitable for manual processing and is therefore relatively limited. The method based on slope and slope direction distinguishes terrain by calculating these two quantities, and its main difficulty lies in determining the seed and the region-growing criteria. The fractal dimension method treats the terrain as fractal Brownian motion, but the scale-free interval is difficult to determine. These methods are time-consuming, require manual participation, and have parameters that are difficult to determine. In recent years, extraction methods based on low-level visual features (image color, texture, shape, location, etc.) have become a research focus. Among them, image texture features are highly adaptable and robust and are widely used in image detection and classification. 4 Different terrain images have different texture features, and texture directly reflects the roughness of the terrain surface, so texture is one of the key features in terrain recognition. In addition, GY Sung et al. 5 proposed a classification method based on wavelet features. In their experiment, wavelet features were combined with spatial features as the classification features and classified with a neural network classifier, and the result was better than that of the method using color and spatial features. 6 W Xia et al. 4 also used orthogonal wavelet transform features to classify the complexity of submarine topography and achieved good results. The main idea of image classification based on wavelet features is first to transform the original image into a feature space and then to extract its high-level features in that space to realize image classification. 7 Therefore, this article combines texture and wavelet terrain features as the input of the terrain classification algorithm.
So that the robot can take timely measures such as preventing leg joint subsidence and detecting and compensating for sliding, it must classify the corresponding features with an algorithm that has high classification accuracy and fast learning speed. At present, support vector machine (SVM) and neural network methods are used for robot terrain classification.8,9 SVM requires the kernel function, error control parameters, and so on to be set manually, and these parameters are difficult to determine, so the SVM algorithm spends a lot of time adjusting parameters during learning. 10 Traditional neural network classification methods have low computational efficiency and learn slowly. For example, the back propagation (BP) neural network algorithm is easily trapped in local minima, has low precision, and converges slowly. 11 The radial basis function (RBF) neural network algorithm is more complex, and many of its parameters need to be adjusted manually, which reduces its practicality. 12 In contrast, the extreme learning machine (ELM) is a new algorithm for single hidden layer feed-forward neural networks. Compared with the general BP and RBF neural networks and SVM, its parameters are easy to select, its learning speed is fast, and its generalization performance is good.10,13 To sum up, in order to improve the terrain recognition ability of quadruped robots, texture and wavelet features are used as the classification feature vector, and ELM is introduced to classify the terrain images collected in the field. The experiment involves texture description, wavelet transform, and classification techniques. In addition, ELM is compared with SVM in the recognition of terrain images, which verifies the accuracy and efficiency of the ELM algorithm. The overall design is shown in Figure 1.

Figure 1. Field terrain image recognition flow diagram.
Terrain image feature extraction
Texture feature
A texture feature is a content-based description of an image that can effectively describe the surface properties and structure of the objects in it. 14 Different terrain images have different statistical and texture characteristics, so this article combines the statistical and texture features of the terrain image to form the training data of the classifier and thereby obtain good classification performance.
Texture feature based on histogram
The histogram is one of the simplest statistical descriptors of image texture. It is important in the study of terrain images and can provide theoretical guidance for the terrain image classifier. Figure 2 shows the histograms of different terrains, from which we can see that their statistical characteristics differ significantly, so they can be used for image classification. In this article, the statistical moments of the histogram, such as mean (

Figure 2. Statistical feature analysis of terrain images: (a) grayscale of the terrain image and (b) histogram of the terrain image.
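As an illustration of histogram-based texture statistics, the following Python sketch computes several common moments of the normalized gray-level histogram of an 8-bit image. The function name `histogram_moments` and the particular set of statistics (mean, standard deviation, third central moment, uniformity, and entropy) are assumptions chosen for illustration; they are not necessarily the moments used in this article.

```python
import numpy as np

def histogram_moments(gray, levels=256):
    """First-order texture statistics from the gray-level histogram.

    Assumes `gray` holds integer gray levels in [0, levels).
    The choice of statistics is illustrative, not taken from the article.
    """
    g = np.asarray(gray).astype(np.intp).ravel()
    hist = np.bincount(g, minlength=levels).astype(float)
    p = hist / hist.sum()                 # normalized histogram p(z)
    z = np.arange(len(p), dtype=float)    # gray-level values
    mean = np.sum(z * p)
    var = np.sum((z - mean) ** 2 * p)
    third = np.sum((z - mean) ** 3 * p)   # third central moment
    uniformity = np.sum(p ** 2)
    nz = p[p > 0]                         # skip empty bins to avoid log(0)
    entropy = -np.sum(nz * np.log2(nz))
    return {
        "mean": mean,
        "std": np.sqrt(var),
        "third_moment": third,
        "uniformity": uniformity,
        "entropy": entropy,
    }
```

A flat image gives zero standard deviation and zero entropy, whereas a rough surface spreads the histogram and increases both, which is the kind of difference between terrains that the histograms illustrate.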
Texture feature based on gray-level co-occurrence matrix
Histogram-based statistical features discriminate texture weakly because they lack the two-dimensional spatial information of the image. 16 Since Haralick proposed the gray-level co-occurrence matrix (GLCM), it has been increasingly used in texture classification studies. Among the four classification features commonly used for textures, features based on the GLCM are more suitable for texture description than the Markov model and the Gabor filter model. 17
GLCM is a typical method of extracting texture features, 16 and it is obtained by counting how often pairs of gray levels occur at two pixels separated by a given distance in the image. 14 The GLCM provides information about the gray levels of the image, such as direction, range of variation, and adjacent interval. Usually, second-order statistics are calculated from the GLCM to describe texture features. In this article, four uncorrelated GLCM texture features are selected: angular second moment/energy (ASM), correlation (COR), contrast (CON), and entropy (ENT). 18
Assuming that
In the formula, #(x) represents the number of elements in the set x, and P is an Ng × Ng matrix.
The GLCM of a terrain image varies with direction. The field terrain images are processed to a uniform size of 128 × 128 pixels, and the GLCM is computed at angles of 0°, 45°, 90°, and 135°. The energy, correlation, contrast, and entropy are then calculated in each direction. The comparison results are shown in Figure 3.

Figure 3. GLCM features of the terrain.
Figure 3 shows how the GLCM features change with the direction angle. The energy of the highway is significantly larger than that of the grassland, and the energy of both terrains does not change as the angle increases, which indicates that the highway texture is coarser and hence has larger energy. The correlation of the highway is smaller than that of the grassland. Contrast shows the opposite pattern to energy: the contrast of the grassland is clearly greater than that of the highway, which shows that the texture grooves of the grassland are deep and its texture is clear. In terms of entropy, the grassland is more random than the highway. These four features therefore show significant statistical texture differences between terrains.
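The GLCM and the four features selected above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments: the function names, the unit pixel distance, the non-symmetric counting, the base-2 logarithm in the entropy, and the mapping of angles to pixel offsets are all assumptions.

```python
import numpy as np

# Assumed pixel offsets (dx, dy) at a distance of one pixel; rows grow downward.
ANGLE_OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

def glcm(gray, angle, levels):
    """Normalized co-occurrence matrix P (levels x levels) for one direction.

    Assumes `gray` is a 2D array of integer gray levels in [0, levels).
    """
    gray = np.asarray(gray).astype(np.intp)
    dx, dy = ANGLE_OFFSETS[angle]
    rows, cols = gray.shape
    # Region where both a pixel and its offset neighbor lie inside the image.
    r0, r1 = max(0, -dy), min(rows, rows - dy)
    c0, c1 = max(0, -dx), min(cols, cols - dx)
    first = gray[r0:r1, c0:c1]
    second = gray[r0 + dy:r1 + dy, c0 + dx:c1 + dx]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (first.ravel(), second.ravel()), 1.0)
    return counts / counts.sum()

def glcm_features(P):
    """Energy (ASM), correlation (COR), contrast (CON), and entropy (ENT)."""
    n = P.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    asm = np.sum(P ** 2)
    con = np.sum((i - j) ** 2 * P)
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
    denom = sd_i * sd_j
    # Correlation is undefined for a constant image; return 0 in that case.
    cor = np.sum((i - mu_i) * (j - mu_j) * P) / denom if denom > 0 else 0.0
    nz = P[P > 0]
    ent = -np.sum(nz * np.log2(nz))
    return {"ASM": asm, "COR": cor, "CON": con, "ENT": ent}

def directional_features(gray, levels):
    """Features for each of the four directions 0, 45, 90, and 135 degrees."""
    return {a: glcm_features(glcm(gray, a, levels)) for a in ANGLE_OFFSETS}
```

In practice the gray levels are often quantized to a small number (for example 8 or 16) before counting, so that the matrix is not too sparse; whether such quantization was applied here is not stated.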
Wavelet feature
The orthogonal wavelet transform decomposes the image into a simple multi-level framework, each component of which has unique frequency and spatial characteristics. 4 Here, terrains of different complexity can be regarded as signals of different frequencies, so the terrain can be classified by the time-frequency analysis of the wavelet transform.
In wavelet feature extraction of terrain images,19–21 the input terrain image is processed by a J-level two-dimensional wavelet decomposition, which transforms the different underlying features of the terrain image into different wavelet coefficients. The terrain image is decomposed into four sub-images, and they are

Figure 4. Wavelet decomposition diagram of the terrain.
Assuming that the size of the terrain image f(m, n) is N × N (where N = 128), the average energy is 7
The average energy of the kth wavelet detail map is
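A minimal Python sketch of wavelet energy features is given below. It assumes the Haar wavelet, defines the average energy of a sub-image as the mean of its squared coefficients, and uses the hypothetical names `haar_dwt2` and `wavelet_energy_features`; the wavelet basis, the energy definition, and the number of levels used in this article may differ.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2D orthonormal Haar transform.

    Returns (LL, LH, HL, HH); the sub-band naming is an assumed convention.
    Both image dimensions must be even.
    """
    x = np.asarray(img, dtype=float)
    s = np.sqrt(2.0)
    # Filter along each row: pairwise sums (low-pass) and differences (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Filter along each column of the two intermediate results.
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, lh, hl, hh

def wavelet_energy_features(img, levels=2):
    """Average energy of every detail sub-image plus the final approximation.

    Returns 3 * levels + 1 values ordered as
    [LH_1, HL_1, HH_1, ..., LH_J, HL_J, HH_J, LL_J].
    """
    feats = []
    approx = np.asarray(img, dtype=float)
    for _ in range(levels):
        approx, lh, hl, hh = haar_dwt2(approx)
        feats.extend(np.mean(d ** 2) for d in (lh, hl, hh))
    feats.append(np.mean(approx ** 2))
    return np.array(feats)
```

For a 128 × 128 image, each level halves the side length, so a two-level decomposition yields detail sub-images of 64 × 64 and 32 × 32 pixels. Smooth terrain concentrates its energy in the approximation sub-image, while rough terrain places more energy in the detail sub-images.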
Terrain classification based on ELM
Professor Huang Guangbin studied the single hidden layer feed-forward neural network (SLFN) in depth and proposed the concept of ELM. Neural networks are an important classification method that can approximate any complex nonlinear function with arbitrary accuracy. The most widely used feed-forward neural network model is the BP neural network, and a typical three-layer BP network can achieve high approximation accuracy.
The purpose of BP neural network training is to find the input weight
where N is the number of samples, L is the number of hidden nodes,
All the parameters of the traditional neural network need to be adjusted, and they are obtained by calculating
Compared with the traditional BP neural network, the ELM network lacks the output layer threshold
Let

Figure 5. ELM model based on terrain.
If the nonlinear excitation function is g(x) and infinitely differentiable over any real number interval, and the hidden layer has
where
Equation (12) can be written in matrix form
where
Hidden layer output matrix
The training process of terrain classification based on ELM is equivalent to finding the least-squares solution of linear equation (14)
The solution is
In the formula,
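The training procedure can be illustrated with the standard ELM formulation: the input weights and hidden biases are drawn at random and kept fixed, and the output weights are obtained as the minimum-norm least-squares solution using the Moore-Penrose pseudoinverse of the hidden layer output matrix. The Python sketch below assumes a sigmoid excitation function, one-hot target vectors, and weights drawn uniformly from [-1, 1]; the class name `ELMClassifier` and its parameters are hypothetical, and the default of 250 hidden nodes follows the value at which the accuracy is reported to stabilize in the experiments.

```python
import numpy as np

class ELMClassifier:
    """Minimal single hidden layer ELM for classification (illustrative)."""

    def __init__(self, n_hidden=250, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden layer output matrix H with a sigmoid excitation function.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix T: one row per sample, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Random input weights and biases; these are never updated.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution of H @ beta = T.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]
```

Because the only trained quantity is the output weight matrix, training reduces to one pseudoinverse computation with no iterative weight updates. Feature vectors with very different scales would normally be normalized before training so that the sigmoid does not saturate; that preprocessing step is an assumption and is not shown.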
Experimental results
At present, no public terrain data set is available, so 6000 terrain images were collected offline for this article in a natural field environment through the multi-class sensor. There are six terrain types: grassland, highway, land, sandy land, stone road, and cement ground. Each type has 1000 images in JPG format, and all images are processed to 128 × 128 pixels. The training set contains 5400 images and the test set contains 600 images; for each class, 900 images are used for training and the remaining 100 for testing. The terrain images are preprocessed with MATLAB in a Windows 7 environment.
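The per-class division into training and test images can be sketched as follows. The random selection, the seed, and the function name `split_per_class` are assumptions for illustration, since the article does not state how the 100 test images of each class were chosen.

```python
import numpy as np

def split_per_class(labels, n_test_per_class=100, seed=0):
    """Return (train_idx, test_idx) with a fixed number of test samples per class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        # Shuffle the indices of this class, then cut off the test portion.
        idx = rng.permutation(np.flatnonzero(labels == c))
        test_idx.extend(idx[:n_test_per_class])
        train_idx.extend(idx[n_test_per_class:])
    return np.array(train_idx), np.array(test_idx)

# Six terrain classes with 1000 images each, encoded as integers 0..5.
labels = np.repeat(np.arange(6), 1000)
train_idx, test_idx = split_per_class(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps the test set balanced, so a per-terrain accuracy is always computed over the same number of images.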
In this article, texture and wavelet features are first extracted from each experimental image, and a 17-dimensional feature vector is established for each image. The feature vectors of all the images are then stacked into a 6000 × 17 sample matrix. The feature values of a single image of each terrain are shown in Table 1.
Table 1. Characteristic measure of different terrains.
For ELM, only the number of hidden layer nodes needs to be determined, and this number significantly affects the classification performance. When the number of hidden layer nodes (denoted n) is too small, the network cannot fully express the nonlinear relationship between the input vector and the label. Once the number of hidden layer nodes is enough to express the nonlinear relationship between input and output, the recognition rate tends to be stable, and increasing the number of nodes further does not continue to improve the classification accuracy. 10
Figure 6 shows that the accuracy of terrain classification is stable when n = 250. For the terrain data set, the experiment was repeated several times. As can be seen from Figure 7, the terrain recognition rate is between about 95% and 100%, which shows that the ELM classification algorithm has good generalization and robustness. For one experiment, the results of ELM on the 600 test images are shown in Figure 8. To show the recognition results of the ELM classification algorithm for each terrain more clearly, the classification results are presented in Table 2 and Figure 9.

Figure 6. The influence of hidden layer nodes on recognition accuracy.

Figure 7. The influence of experiment number on recognition accuracy.

Figure 8. ELM running results chart.
Table 2. Terrain correct and error classification based on ELM.
ELM: extreme learning machine.

Figure 9. Terrain classification results based on ELM.
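Overall and per-terrain accuracy of the kind reported for the test set can be computed from the true and predicted labels, as in the following sketch. The function names and the integer encoding of the terrain classes are assumptions for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] counts test samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def accuracies(cm):
    """Per-class accuracy (row-wise) and overall accuracy.

    Assumes every class occurs at least once among the true labels.
    """
    per_class = np.diag(cm) / cm.sum(axis=1)
    overall = np.trace(cm) / cm.sum()
    return per_class, overall
```

With 100 test images per class, each row of the confusion matrix sums to 100, so the diagonal entries can be read directly as percentages of correctly classified images for each terrain.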
Figure 8 shows that the accuracy on the whole test set is 97.5%, and Figure 9 shows that the accuracy for each terrain is more than 94%. This demonstrates the feasibility and accuracy of ELM in field terrain classification. Terrain feature representation and terrain classification methods are the two important components of terrain recognition. In the experiment, the terrain is described using both single features and combinations of features. Furthermore, to show that the fusion features (wavelet and texture features) proposed in this article perform better, the color features of the terrain extracted from RGB are used for comparison. The results are shown in Tables 3 and 4.
Table 3. Comparison of terrain classification based on single feature.
BP: back propagation; SVM: support vector machine; ELM: extreme learning machine.
Table 4. Comparison of terrain classification based on two features.
BP: back propagation; SVM: support vector machine; ELM: extreme learning machine.
Comparing Table 3 with Table 4 shows that the color feature benefits terrain recognition, except when color and texture features are fused for BP. In addition, the method based on texture and wavelet features is superior to the other methods. The accuracy of terrain classification based on ELM is also higher than that of SVM and BP, because parameter selection for the SVM and BP algorithms is more complex. More specifically, the SVM algorithm needs to optimize the model type, the kernel function type, and related parameters such as the manually set penalty coefficient and the test mode. The BP algorithm updates its weights by back propagation and therefore needs many iterations to obtain the optimal solution. For the field terrain data set, the running times of the SVM and ELM image classification methods are close, and both are shorter than that of the BP classification method. Thus, the terrain classification algorithm based on ELM meets both the recognition rate requirement and the strict real-time requirement of robot terrain classification.
Conclusion
Terrain images contain rich information, and terrain can be classified by analyzing their statistical texture features and wavelet features. For ELM, the training process is fast and simple, parameter selection is easy, and generalization performance and robustness are good. In this article, texture and wavelet features are extracted using texture descriptors and the wavelet transform, fused, and used as the training features of the ELM classifier. Compared with the traditional BP neural network algorithm and the SVM algorithm, the experimental results show that ELM achieves higher accuracy with relatively few parameters and iterations, which verifies the accuracy and efficiency of terrain recognition. Therefore, the application of ELM to field terrain image recognition has practical value.
Footnotes
Handling Editor: ZW Zhong
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (CIT&TCD20150314) and the National Natural Science Foundation of China (4142018).
