Computer vision-based techniques and path planning strategy in a slope monitoring system using unmanned aerial vehicle

Abstract

The unmanned aerial vehicle (UAV) is a typical field robot that can work in many unstructured environments such as mines, forests, and even radiation areas. In our mine monitoring system, built in a northeast province of China, a specially designed UAV takes photos and perceives the environment. We select a series of image-based techniques to process the aerial pictures and monitor the slope. The visual features are first refined by histogram equalization. Rocks and cracks are then detected by different digital image processing operators, such as Canny, so that displacements can be assessed. An advanced semantic segmentation model, U-Net, is also applied to the problem. Experimental results show that both Canny and U-Net perceive the edges in the pictures effectively and outperform the other operators. In addition, we model the inspection mission for mine slopes as a traveling salesman problem and plan the UAV path by swarm intelligence-based optimization.

Keywords

Slope monitoring, field robots, UAV, image processing, edge detection, path planning

Introduction

Landslides are regarded as one of the most common geological disasters in many developing countries. There are almost a thousand slope sliding accidents in China per year, threatening people’s lives and property. In mining production, rock slopes are especially fragile because they are weakened by blasting operations, by the mining process, and even by rainfall erosion. It is therefore important to detect and monitor surface displacement and to warn of potential risks so as to avoid rockfall disasters. The classical method, manual measurement, is time-consuming and inefficient. Because field conditions are complicated and arduous, the labor costs of this work are very high.

As pointed out by Whittaker, future robots are expected to work at field sites in unstructured environments (such as mines), relieving humans of duty in hazardous environments and facing more challenges.¹ In recent years, unmanned aerial vehicles (UAVs) have been widely used in many remote sensing and monitoring tasks. Mozas-Calvache et al. utilized unmanned aircraft systems for photogrammetric flights.² Based on the aerial photos, the authors designed effective metrics and methods based on linear elements to analyze the particular behavior of a landslide.² D’Oleire-Oltmanns et al. chose a UAV to monitor soil erosion in Morocco.³ The method is able to reduce the “gap” between satellite-scale data and field-scale data.³

For UAVs in a slope monitoring system, the problem is more complex because it involves coordinate transformation, lens distortion, scene reconstruction, data transmission, and UAV position control. In this article, we mainly focus on two subtasks: (1) image processing techniques that extract slope features for stability analysis and (2) path planning strategies that adaptively generate suitable routes for the UAV under different monitoring requirements.

Our UAV-based slope monitoring system is constructed in a northeast province of China. The open-pit mine has already developed into a deep pit, and the risk of slope instability is very high. Against this background, we adopt a series of image-based techniques to detect rock edges, which are valuable information for mine scene analysis. In addition, a swarm intelligence-based route planning strategy is employed for the UAV.

Related work

Slope monitoring and stability analysis

As landslides have become a growing concern, many displacement monitoring approaches are used in different environments. The global positioning system (GPS) is an effective technique. After GPS positioning instruments are installed on the slopes, the coordinates of each point can be determined from satellite data (signals from at least four satellites are needed for a three-dimensional fix). Gili et al. applied GPS to monitor the landslide of Vallcebre, Eastern Pyrenees (Spain).⁴ Their results show that GPS can obtain a larger monitoring coverage. The main disadvantages of GPS are that the instruments are very expensive and that the precision of civil GPS is lower than that of military GPS. Power supply is another problem for many GPS tools.

Light detection and ranging (LiDAR) is also a common technique for displacement monitoring. Airborne laser altimetry can provide rich terrain information with high resolution.⁵ However, laser scanning time is relatively long, and the device is often used for static measurement.

Another monitoring approach employs remote sensing instruments, for example, synthetic aperture radar (SAR)^6,7 and satellites.⁸ Remote sensing-based methods can also obtain and measure spatial information, but the cost is relatively high. For most small- and medium-sized mines, these methods are not economical.

With the development of photographic equipment, photogrammetry is widely used in many geological projects.^9,10 A UAV can be regarded as a good carrier for a camera. Compared with other methods, a UAV is more flexible: the flight height and shooting frequency can be set and adjusted according to the situation. Furthermore, the instruments are cheap and easy to maintain. In recent years, more and more monitoring systems have adopted more than one measurement device, and a UAV is a good supplement to other methods such as GPS.

After the data, points and displacements, are recorded in the system, they will be further processed and calculated so as to generate timely and effective prediction. Classical methods applied statistical approaches to analyze the observed data, for example, Saito method¹¹ and Fukuzono method.¹² According to different signal receiving strategies, the data can be reconstructed in different styles and then matched in a new feature space.^13,14 In this article, we mainly focus on the image-based techniques for slope perception.

Edge detection techniques

In a slope monitoring system, the most important issue is to detect object edges and key points. To evaluate fissures and rocks for mine monitoring, scholars have carried out much valuable work based on computer vision and digital image processing. The two are correlated but distinct concepts. As the name suggests, digital image processing is closer to signal processing: pictures are regarded as two-dimensional (2-D) signals, and the most common operation is to transform images by mathematical functions so that they are easier for computers to handle. For example, X-ray images can be enhanced and deblurred by filters. Computer vision tasks focus more on understanding visual data, as human vision does: recognizing and analyzing different objects and understanding scenes. Most computer vision methods help computers extract more information and reason with it, such as the categories, sizes, locations, and colors of the objects in a picture, and they are usually coupled with machine learning models. In summary, digital image processing handles image details with pixel-wise mathematical functions, and the processed images are then modeled in computer vision tasks by machine learning algorithms.

Traditional edge detection methods adopt different digital image processing operators. The photos are processed in several steps: filtering,¹⁵ feature extraction (such as color and texture features), and edge detection. Common filters include the median filter, average filter, low-pass filter, and morphological filter.¹⁵ In addition to Red Green Blue (RGB) and Hue Saturation Intensity (HSI)¹⁶ color features, Harris,¹⁷ scale-invariant feature transform,¹⁸ and Gabor¹⁹ are all effective feature extractors for perceiving surface information.

Among these methods, edge detection operators are closely related to slope analysis. Roberts,²⁰ Sobel,²¹ and Prewitt²² are general first-order differential operators for edge detection. The second-order differential operators are represented by Laplacian¹⁵ and Laplacian of the Gaussian function (LoG).²³ Canny²⁴ is a multistage detector that combines Gaussian smoothing, gradient computation, non-maximum suppression, and double thresholding.

In recent years, along with the development of deep learning, a series of deep network models, such as the fully convolutional network (FCN),²⁵ Mask region-based convolutional neural network (Mask R-CNN),²⁶ and U-Net,²⁷ have been used in semantic segmentation tasks. These methods can also extract the edges of different objects in a picture. In general, semantic segmentation models need dense (pixel-level) labels.

Path planning for UAV

For a UAV in a slope monitoring system, a notable problem is to obtain a suitable route that guides the UAV to fulfill its shooting missions in the mine. UAV path planning methods can be divided into the following two categories.

Geometric approach or graphical approach: In these methods, the map for the UAV is modeled according to geometric theory, and the UAV then searches for a possible path in the constructed map (feasible points and obstacles) by certain rules. Graphical approaches include the Voronoi diagram,²⁸ grid,²⁹ probabilistic roadmap,³⁰ visibility graph,³¹ and so on. Since the construction of maps is the key point of these approaches, they are not flexible in dynamic environments. It should also be noted that graphical map modeling strategies are applicable to other methods as well, such as grids for A*³² and ant colony optimization (ACO).³³

Heuristic searching approach: Heuristic methods are guided by heuristic information while searching the solution space; many of them rely on random search, and some adopt swarm intelligence. These methods build many acceptable routes and then gradually optimize them and select the better ones according to heuristic information. The search continues until a stopping criterion is met or a satisfactory solution is obtained. A*,³² D*,³⁴ genetic algorithm,³⁵ particle swarm optimization (PSO),³⁶ and ACO³³ are common heuristic methods. The main disadvantage of these methods is that the computing cost becomes very high for a larger solution space.

There are also some classical methods that do not belong to the categories above, like Dijkstra³⁷ and artificial potential field.³⁸

Background

As mentioned above, our slope monitoring system is built on an iron mine located in a northeast province of China. Figure 1 is the panorama of the mine pit.

After years of mining, the mine has changed from a hill into a pit with many unstable factors.³⁹ The exploitation depth reaches −270 m, and different strata have been exposed. The geological structure of the mine is very complex and differs greatly from part to part. After mechanical mining and blasting operations, the slope is very fragile. Besides, equipment such as excavators and drills is placed beside the pit, which may add load. There are also different kinds of geological compositions, making the problem more complex.³⁹

Figure 1.

The panorama of the iron mine.

According to the analysis using stereographic projection, the pit can be divided into eight parts, and the parts are numbered from A to H, as shown in Figure 2.³⁹

Figure 2.

Region partition of the mine. A: The rock is solid with relatively stable geological faults. B: The periphery is a mountainous region with high terrain where slides exist. C: Affected by river water and a loess layer, the slope is unstable. D: Backed by the mountain, the slope is relatively stable. E: The geological faults are well developed and may trigger landslides. F: The region is at the north end of the mine, with a gentle slope and higher stability. G: Loess and underground water in this region bring landslide risks. H: This part is in the upper section of the slope, with topsoil 30 m thick.³⁹

In this context, we design an integrated monitoring system with GPS monitoring points, photogrammetry observatories, and a UAV to perceive the mine environment comprehensively. The UAV needs to fulfill two tasks: taking slope pictures aerially and inspecting regions (points) periodically. The UAV can not only monitor the rocks and gaps in the slope but also inspect the devices in the system so as to make them work normally.

The UAV used in the system is shown in Figure 3.

Figure 3.

The UAV in our system: (a) UAV, (b) UAV in flight, and (c) remote control. UAV: unmanned aerial vehicle.

Edge detection and scene understanding for slope monitoring

As analyzed in the “Background” section, a UAV is adopted in the monitoring system to relieve people of dangerous, burdensome, and inefficient fieldwork. An appropriate sensing strategy for the mine environment is the key point. In this article, we pay particular attention to edge detection in the photos taken by the UAV. For this task, we use both traditional digital image processing methods and a deep learning-based computer vision segmentation model. Edge detection is essential for the UAV’s perception of the environment: it can help the UAV navigate, and the perceived visual signals are also important data for monitoring.

Preprocessing

In our system, the pictures can be processed both on the terminal (UAV) and at the computer station. Because the original pictures taken by the UAV may be affected by weather and light, we refine them by histogram equalization.¹⁵ We also process the color pictures by the same strategy (equalizing each of the three channels and then merging them). The original pictures and the equalized pictures are shown in Figures 4 and 5.

Figure 4.

Image refinement by histogram equalization, example 1: (a) original, (b) equalized (gray), and (c) equalized (color).

Figure 5.

Image refinement by histogram equalization, example 2: (a) original, (b) equalized (gray), and (c) equalized (color).

It can be seen that after equalization, the pictures are clearer with higher contrast.
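As a minimal sketch of the preprocessing step, the following pure-Python code equalizes a grayscale image through its cumulative histogram and applies the same mapping to each channel of a color image before merging. The function names, the nested-list image representation, and the 256-level assumption are illustrative choices and not the implementation used in the system.

```python
def equalize_histogram(gray, levels=256):
    """Equalize a grayscale image given as rows of ints in [0, levels - 1]."""
    flat = [p for row in gray for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of the intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in gray]
    # Map each level so that the occupied range spreads over [0, levels - 1].
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[p] for p in row] for row in gray]


def equalize_color(rgb):
    """Equalize the three channels independently, then merge them again."""
    channels = []
    for k in range(3):
        plane = [[px[k] for px in row] for row in rgb]
        channels.append(equalize_histogram(plane))
    return [list(zip(*(ch[i] for ch in channels))) for i in range(len(rgb))]


low_contrast = [[100, 100, 110], [110, 120, 120]]
stretched = equalize_histogram(low_contrast)  # now spans the full 0..255 range
```

Equalizing each channel separately follows the strategy described in the text; it can shift hues, which is one reason other pipelines equalize only a luminance channel instead.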

When the visual signals are transmitted to the computer station, the pictures are processed by a median filter to suppress noise introduced during transmission. We adopt a median filter instead of an average filter so as to preserve the edge information. Since Gaussian filtering is one step of the Canny operation (described in the next section), we do not add any other filter to the image processing module.
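The preference for a median filter over an average filter can be illustrated with a small sketch. The 3×3 window, the replicated borders, and the toy image below are assumptions made for illustration only.

```python
from statistics import median


def _window(img, r, c):
    """3x3 neighbourhood around (r, c), replicating border pixels."""
    h, w = len(img), len(img[0])
    return [img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def median_filter(img):
    """Replace every pixel by the median of its 3x3 neighbourhood."""
    return [[median(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def mean_filter(img):
    """Replace every pixel by the average of its 3x3 neighbourhood."""
    return [[sum(_window(img, r, c)) / 9 for c in range(len(img[0]))]
            for r in range(len(img))]


# A vertical step edge (0 -> 200) with one impulse-noise pixel at (2, 1).
image = [[0, 0, 0, 200, 200, 200] for _ in range(5)]
image[2][1] = 255

med = median_filter(image)   # impulse removed, step edge still sharp
avg = mean_filter(image)     # impulse smeared, step edge blurred
```

In this toy case the median output keeps the columns on either side of the step at exactly 0 and 200, whereas the average output produces intermediate values along the edge.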

Edge detection based on Canny

Along with the development of computer vision (CV) techniques, many edge detection methods have appeared.^20,23 The Canny operator has been regarded as one of the most effective methods for many years. Compared with other operators, Canny has better noise immunity and robustness. Because Canny applies two different thresholds for “strong edges” and “weak edges,” and judges the “weak edges” to be “positive” if they are connected to “strong edges,” the operator performs very well on “weak edges.”²⁴

Canny is a four-step algorithm. At the beginning, the pictures are denoised by Gaussian filter. The Gaussian function can be written as equation (1)

G (x, y) = \frac{1}{2 π σ^{2}} exp (- \frac{x^{2} + y^{2}}{2 σ^{2}})

where $σ$ is the standard deviation of the Gaussian. $G (x, y)$ is used to smooth the input image, $f (x, y)$, as in equation (2)

f^{'} (x, y) = G (x, y) ⊙ f (x, y)

where $⊙$ denotes the convolution operation.
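The smoothing step in equations (1) and (2) can be sketched as follows, assuming that σ is the standard deviation of the Gaussian, that the kernel is sampled on a small odd-sized grid and renormalized to sum to 1, and that image borders are replicated. The kernel size, the value of σ, and the function names are illustrative assumptions.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Sample G(x, y) on a size x size grid and normalise it to sum to 1."""
    half = size // 2
    k = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
          for x in range(-half, half + 1)]
         for y in range(-half, half + 1)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def convolve(img, kernel):
    """2-D convolution of img with an odd-sized square kernel."""
    h, w, n = len(img), len(img[0]), len(kernel)
    half = n // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    # Replicate border pixels outside the image.
                    rr = min(max(r + i - half, 0), h - 1)
                    cc = min(max(c + j - half, 0), w - 1)
                    # Flipped kernel index makes this a true convolution.
                    acc += kernel[n - 1 - i][n - 1 - j] * img[rr][cc]
            out[r][c] = acc
    return out


kernel = gaussian_kernel(5, 1.0)
smoothed = convolve([[7] * 6 for _ in range(6)], kernel)  # flat image stays flat
```

Because the sampled kernel is renormalized, the constant factor in front of the exponential does not affect the smoothed image; it is kept only to mirror equation (1).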

In the next step, the gradients in two directions can be calculated by equations (3) and (4)

G_{x} = \frac{\partial f^{'}}{\partial x}

G_{y} = \frac{\partial f^{'}}{\partial y}

These two gradient values can be used to approximately generate the intensity and direction of each pixel gradient, as equations (5) and (6)

M (x, y) = \sqrt{G_{x}^{2} + G_{y}^{2}}

α (x, y) = arctan (\frac{G_{y}}{G_{x}})

where $M (x, y)$ and $α (x, y)$ are the gradient intensity (magnitude) and direction at pixel $(x, y)$.
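Equations (3) to (6) can be sketched with central finite differences standing in for the partial derivatives. This discretization is an assumption (practical Canny implementations often use Sobel kernels instead), and `atan2` replaces the plain arctangent so that a zero horizontal gradient does not cause a division by zero.

```python
import math


def gradients(img):
    """Return (magnitude, angle) maps of a smoothed image given as rows of numbers."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Central differences; indices are clamped at the image border.
            gx = (img[r][min(c + 1, w - 1)] - img[r][max(c - 1, 0)]) / 2.0
            gy = (img[min(r + 1, h - 1)][c] - img[max(r - 1, 0)][c]) / 2.0
            mag[r][c] = math.hypot(gx, gy)   # equation (5)
            ang[r][c] = math.atan2(gy, gx)   # equation (6), in radians
    return mag, ang


# Vertical step edge: the gradient points along +x, so the angle is 0.
step = [[0, 0, 100, 100] for _ in range(4)]
M, alpha = gradients(step)
```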

The third stage is non-maximum suppression, which refines the edges in $M (x, y)$. Only the detected points (pixels) meeting the following condition are treated as “real” edges.

The gradient intensity of a point in $M (x, y)$ is compared with that of its two neighbors along the gradient direction and the reverse direction. If its intensity is larger than that of both neighbors, the point is retained as part of the edge; otherwise, the gradient of this point is set to 0.
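The comparison described above can be sketched as follows. Quantizing the gradient direction to four sectors (0°, 45°, 90°, and 135°), keeping ties, and leaving the one-pixel border at zero are common simplifications assumed here; they are not details given in the text.

```python
import math


def non_maximum_suppression(mag, ang):
    """Keep a pixel only if it is a local maximum along its gradient direction."""
    h, w = len(mag), len(mag[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Fold the angle into [0, 180) and pick the nearest of four sectors.
            deg = math.degrees(ang[r][c]) % 180.0
            if deg < 22.5 or deg >= 157.5:
                dr, dc = 0, 1      # gradient along the row (x direction)
            elif deg < 67.5:
                dr, dc = 1, 1      # diagonal, x and y increasing together
            elif deg < 112.5:
                dr, dc = 1, 0      # gradient along the column (y direction)
            else:
                dr, dc = 1, -1     # anti-diagonal
            ahead = mag[r + dr][c + dc]
            behind = mag[r - dr][c - dc]
            if mag[r][c] >= ahead and mag[r][c] >= behind:
                out[r][c] = mag[r][c]
    return out


# A three-pixel-wide ridge of gradient intensity is thinned to one pixel.
ridge = [[0, 1, 3, 1, 0] for _ in range(5)]
thin = non_maximum_suppression(ridge, [[0.0] * 5 for _ in range(5)])
```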

In the final stage, two thresholds are used to evaluate the edges detected in step 3. If the intensity of a point is above the higher threshold, the point is part of the final edges (“strong edges”). Points whose gradient lies between the lower and the higher threshold are considered “weak edges.” A weak edge point is kept only if it is connected to a strong edge; an isolated weak edge point, like any point below the lower threshold, is suppressed (set to zero).
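The double-threshold stage can be sketched as a region-growing procedure that starts from the strong pixels and spreads into connected weak pixels. The use of 8-connectivity, the inclusive threshold comparisons, and the example threshold values are assumptions for illustration.

```python
from collections import deque


def hysteresis(nms, low, high):
    """Return a boolean edge map from a non-maximum-suppressed gradient image."""
    h, w = len(nms), len(nms[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque()
    # Strong pixels are accepted immediately and seed the search.
    for r in range(h):
        for c in range(w):
            if nms[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))
    # Grow from accepted pixels into 8-connected weak pixels.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < h and 0 <= cc < w
                        and not edges[rr][cc] and nms[rr][cc] >= low):
                    edges[rr][cc] = True
                    queue.append((rr, cc))
    return edges


# One strong pixel (9), two weak pixels attached to it, one isolated weak pixel.
response = [[0, 0, 0, 0, 0, 0],
            [9, 5, 5, 0, 5, 0],
            [0, 0, 0, 0, 0, 0]]
edge_map = hysteresis(response, low=4, high=8)
```

In this example the two weak pixels next to the strong one are kept, while the isolated weak pixel in the fifth column is suppressed.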

In designing the Canny detector, the author defined detection and localization criteria: signal-to-noise ratio (SNR) and localization accuracy.²⁴ Based on these two criteria, the operator design problem reduces to maximizing both simultaneously.²⁴ Owing to the definition and consideration of these criteria, the Canny operator performs better than earlier methods in many applications.

In this article, we adopt three other common metrics, mean squared error (MSE), peak signal-to-noise ratio (PSNR), and Structural SIMilarity index (SSIM), for a more general and appropriate comparison. Their mathematical expressions are given in equations (7), (8), and (9), respectively.

MSE = \frac{1}{m n} \sum_{i = 0}^{m - 1} \sum_{j = 0}^{n - 1} {‖I_{d} (i, j) - I_{g} (i, j)‖}^{2}

where $m \times n$ is the size of the edge image, I_d is the detected edge image, and I_g is the ground-truth edge image.

PSNR = 10 \cdot {log}_{10} (\frac{{MAX}_{I}^{2}}{MSE}) = 20 \cdot {log}_{10} (\frac{{MAX}_{I}}{\sqrt{MSE}})

where ${MAX}_{I}$ is the maximum value of pixels; ${MAX}_{I} = 255$ here.

SSIM (x, y) = \frac{(2 μ_{x} μ_{y} + c_{1}) (2 σ_{x y} + c_{2})}{(μ_{x}^{2} + μ_{y}^{2} + c_{1}) (σ_{x}^{2} + σ_{y}^{2} + c_{2})}

where $μ_{x}$ and $μ_{y}$ are the mean values of x and y, respectively; $σ_{x}^{2}$ and $σ_{y}^{2}$ are the variances of x and y, respectively; $σ_{x y}$ is the covariance of x and y; and c₁ and c₂ are calculated by equations (10) and (11)

c_{1} = {(k_{1} L)}^{2}

c_{2} = {(k_{2} L)}^{2}

where k₁, k₂, and L are set to 0.01, 0.03, and 255, respectively, in our experiments.

A lower MSE indicates better performance, whereas higher PSNR and SSIM values indicate better performance.
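The three metrics can be sketched in a few lines. PSNR is computed in the form 10·log₁₀(MAX²/MSE), and SSIM is evaluated here from whole-image statistics in a single window, which is a simplifying assumption: library implementations usually average SSIM over local windows, so their values will generally differ. Function names and the nested-list image format are illustrative.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized images (equation (7))."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2
               for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n


def psnr(mse_value, max_i=255.0):
    """Peak signal-to-noise ratio in dB (equation (8)); undefined for MSE = 0."""
    return 10.0 * math.log10(max_i ** 2 / mse_value)


def ssim_global(x, y, k1=0.01, k2=0.03, L=255.0):
    """SSIM from whole-image means, variances, and covariance (equation (9))."""
    fx = [p for row in x for p in row]
    fy = [p for row in y for p in row]
    n = len(fx)
    mx, my = sum(fx) / n, sum(fy) / n
    vx = sum((p - mx) ** 2 for p in fx) / n
    vy = sum((p - my) ** 2 for p in fy) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(fx, fy)) / n
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2   # equations (10) and (11)
    return (((2 * mx * my + c1) * (2 * cov + c2))
            / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))


# Cross-check against Table 1: the Roberts row lists MSE 28,744.36197.
roberts_psnr = psnr(28744.36197)   # agrees with the tabulated 3.5453 dB
```

The cross-check shows that the tabulated MSE and PSNR values are consistent with this form of equation (8) and MAX_I = 255.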

In this article, we select two typical scenes in the slope, a “rock slope picture” and an “ore picture,” for processing and analysis. The performance of the algorithms on these two scenes shows their ability on the two kinds of objects of interest in the slope: fissures and dangerous rocks. This is valuable for evaluating edge detection methods in the slope monitoring system. Edge detection on slope pictures supports the measurement of displacements of different objects, and the texture and fissures in the slope illustrate the changes in different layers. Analysis of ore particle sizes can directly reveal dangerous rocks that are large or unstable. This is why we take these two pictures as examples.

We carry out comparative experiments with other operators including Roberts,²⁰ Sobel,²¹ Scharr,⁴⁰ Prewitt,²² Laplacian,¹⁵ and LoG.²³ Statistical results on different metrics are listed in Tables 1 and 2. We also make line graphs for the data in these two tables, as shown in Figures 6 and 7. The detected edges are shown in Figures 8 and 9.

Table 1.

The performance of different detectors on three metrics for rock slope picture.

Detector	MSE	PSNR	SSIM
Roberts	28,744.36197	3.545276876	0.099498412
Sobel	27,212.89701	3.783055828	0.139473202
Scharr	29,164.80789	3.482212408	0.074138022
Prewitt	27,057.61728	3.807908113	0.14380207
Laplacian	30,099.71094	3.345180359	0.053933224
LoG	28,139.11829	3.637698757	0.112268271
Canny	25,144.40231	4.12639044	0.187609315
U-Net	8,734.360886	8.718492288	0.624276275

MSE: mean squared error; PSNR: peak signal-to-noise ratio; SSIM: Structural SIMilarity index.

Table 2.

The performance of different detectors on three metrics for ore picture.

Detector	MSE	PSNR	SSIM
Roberts	29,950.13638	3.366815566	0.030869363
Sobel	29,526.95991	3.428616265	0.023509563
Scharr	29,913.52964	3.372127004	0.020405498
Prewitt	41,677.81424	1.931754262	0.005465028
Laplacian	26,538.99076	3.891959576	0.012372649
LoG	30,966.35057	3.221903348	0.021943793
Canny	25,902.47906	3.997390295	0.069967451
U-Net	13,186.87077	6.929386108	0.514948132

MSE: mean squared error; PSNR: peak signal-to-noise ratio; SSIM: Structural SIMilarity index.

Figure 6.

The performance of different detectors on three metrics for rock slope picture: (a) MSE, (b) PSNR, (c) SSIM. MSE: mean squared error; PSNR: peak signal-to-noise ratio; SSIM: Structural SIMilarity index.

Figure 7.

The performance of different detectors on three metrics for ore picture: (a) MSE, (b) PSNR, (c) SSIM. MSE: mean squared error; PSNR: peak signal-to-noise ratio; SSIM: Structural SIMilarity index.

Figure 8.

Edge detection by different image-based techniques for rock slope picture: (a) original, (b) equalized, (c) Roberts, (d) Sobel, (e) Scharr, (f) Prewitt, (g) Laplacian, (h) LoG, (i) Canny, and (j) U-Net.

Figure 9.

Edge detection by different image-based techniques for ore picture: (a) original, (b) equalized, (c) Roberts, (d) Sobel, (e) Scharr, (f) Prewitt, (g) Laplacian, (h) LoG, (i) Canny, and (j) U-Net.

For the rock slope, as illustrated in Table 1 and Figure 6, Canny has a lower MSE and higher PSNR and SSIM than the other classical edge detection methods (those not based on deep networks). Among the first-order operators, Prewitt and Sobel perform better than Scharr and Roberts, and also better than the two second-order operators, LoG and Laplacian. This is probably because there are many curved edges of different thickness. For LoG, very thin edges may lead to a loss of edge detail at the points where the derivative is zero. Compared with the other operators, the noise reduction of Prewitt is better and more stable.

In Figure 8, it can be observed that the classical methods are more susceptible to noisy points. Canny performs better at perceiving the main textures in the slopes and segmenting objects in loose rock zones. Compared with the edges obtained by Sobel and Prewitt, those produced by LoG and Laplacian are somewhat vague.

As listed in Table 2 and shown in Figure 7, Canny also performs best among all classical detectors on the ore picture. Prewitt performs worst in these experiments. Sobel performs similarly to Scharr and Roberts on all indicators and is weaker than Laplacian in MSE and PSNR. Since the objects and noise in this picture are complicated and varied, noise reduction based on average pixel values, as used in Prewitt, may lead to poor edge localization. This is a possible reason why Prewitt is weaker than the other detectors. Keeping edge details while suppressing noise is a difficult problem, and the two thresholds in Canny help this algorithm obtain better edges.

In Figure 9, there are so many stones of different sizes that the detected edges are fragmented and discontinuous. The edges detected by Canny are more distinct, and the noisy points and lines are relatively few compared with those obtained by the other classical operators.

Edge detection based on U-Net

The popular semantic segmentation framework U-Net²⁷ is also used to detect rock edges in the pictures. U-Net is built on FCN.²⁵ As a typical CNN⁴¹ segmentation model, U-Net also uses pixel-level labels. However, in many real-world applications, such as medical image processing, data with dense labels are scarce and expensive.

In this context, U-Net further designs down-sampling and up-sampling middle layers at different scales and connects them at different levels. The down-sampling layers work as a “contracting path,” which helps to capture the contextual information in the feature maps. The up-sampling layers form the “expanding path,” with which the model can effectively detect objects and locate them. In general, the “contracting path” and “expanding path” are symmetrical, constituting the U-like structure. Owing to this U-like structure, there is no fully connected layer in U-Net, which reduces the number of trainable parameters significantly and makes the model less dependent on large training sets and computing resources.
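To make the symmetry of the two paths concrete, the sketch below traces feature-map sizes through a U-shaped network without building any layers. It assumes “same” padding, 2×2 pooling and up-sampling, four levels, and 64 base filters that double at each level, following the commonly used U-Net layout; the configuration actually used in this work (with a VGG 16 backbone) may differ.

```python
def unet_shapes(height, width, base_filters=64, depth=4):
    """Trace (stage, H, W, channels) through a symmetric U-shaped network."""
    if height % 2 ** depth or width % 2 ** depth:
        raise ValueError("input size must be divisible by 2 ** depth")
    trace, skips = [], []
    h, w, ch = height, width, base_filters
    # Contracting path: convolutions, then 2x2 pooling halves H and W.
    for level in range(depth):
        trace.append((f"down{level}", h, w, ch))
        skips.append(ch)               # feature map saved for the skip link
        h, w, ch = h // 2, w // 2, ch * 2
    trace.append(("bottleneck", h, w, ch))
    # Expanding path: up-sample, concatenate the saved map, convolve again.
    for level in reversed(range(depth)):
        h, w = h * 2, w * 2
        up_ch = ch // 2                # up-convolution halves the channels
        trace.append((f"concat{level}", h, w, up_ch + skips[level]))
        ch = skips[level]              # convolutions reduce back to this width
        trace.append((f"up{level}", h, w, ch))
    return trace


for stage in unet_shapes(256, 256):
    print(stage)
```

The trace shows that every expanding stage recovers the spatial size of its contracting counterpart, which is what allows the two feature maps to be concatenated.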

The application scenario of U-Net is similar to our slope monitoring system: detecting the edges of objects and the main textures in the pictures (slopes) with limited training samples. We transfer the weights pretrained on an electron microscopy segmentation data set,⁴² then further train the model using newly labeled slope images and their corresponding segmentation maps. VGG 16⁴³ is selected as the backbone. To avoid vanishing gradients, we also adopt a residual mechanism.⁴⁴ The training of U-Net is conducted on a workstation with one Intel Core(TM) i7-8700 CPU @2.4 GHz, 16 GB of RAM, and one 11 GB GeForce 1070 GPU. The development environment is PyCharm on Windows 10 (Anaconda). The deep learning platform is Keras 2.2.4 with TensorFlow 1.8 as the backend. The hyperparameters are set as follows: optimizer, Adam; learning rate, 0.0001; batch size, 2; epochs, 50; steps per epoch, 200.

We use the same criteria, MSE, PSNR, and SSIM, given in equations (7) to (9), to evaluate the performance of U-Net. In these experiments, we label 15 images as a group for a typical region of interest.

Similarly, the statistical results and detected edges are recorded and demonstrated along with those obtained by other detectors, as presented in Tables 1 and 2 and shown in Figures 6 to 9.

It can be seen that the performance of U-Net on the rock slope picture is excellent, even better than that of Canny. Compared with the other detectors, U-Net reduces the MSE significantly. Its PSNR is more than double the value obtained by Canny, and its SSIM is more than three times as high as that obtained by Canny. In Figure 8(j), the detected rock edges are distinct and smooth.

In the processing of the ore picture, U-Net, trained with labeled data, also outperforms the other methods. It reduces the MSE by nearly 50% compared with Canny. The PSNR and SSIM obtained by U-Net are also the highest among all the detectors. As shown in Figure 9(j), U-Net is able to detect clear and continuous edge profiles of ores of different sizes. The noise points in this subfigure are also clearly fewer than in the others.

Image processing for slope monitoring

As mentioned in the “Related work” section, the main purpose of image processing is to detect and perceive the valuable information for scene understanding. As analyzed in the “Edge detection techniques” section, edge detection for rock slopes can measure displacements of different objects. The changes in different slope layers can be illustrated by the texture and fissures. The analysis about the ores can directly detect those dangerous rocks. These perceived data will be recorded and reconstructed in the system, and then fed into a predictor. In general, the predictor can be built by statistical analyses^11,12 or data-driven model training, for example, support vector machines and neural networks. The predictors will be described in our future work. In addition, the visual signals are also important references for the expert to make appropriate decisions.

In this article, to analyze the mine status further, we adopt Canny and U-Net²⁷ to process the image sequences taken by the UAV, as shown in Figures 10 and 11. The original pictures in these two figures show two typical slopes in the mine. They also represent two different flight directions relative to the slope: flying close to the slope, or flying near the slope from one side to the other.

Figure 10.

Edge detection by Canny and U-Net on typical slope images (flying close to the slope): (a) original 1, (b) original 2, (c) original 3, (d) Canny 1, (e) Canny 2, (f) Canny 3, (g) U-Net 1, (h) U-Net 2, and (i) U-Net 3.

Figure 11.

Edge detection by Canny and U-Net on typical slope images (flying from left to right near the slope): (a) original 1, (b) original 2, (c) original 3, (d) original 4, (e) original 5, (f) Canny 1, (g) Canny 2, (h) Canny 3, (i) Canny 4, (j) Canny 5, (k) U-Net 1, (l) U-Net 2, (m) U-Net 3, (n) U-Net 4, and (o) U-Net 5.

As shown in Figure 10, the objects are clearly more explicit in (b) and (c) than in (a). Both Canny and U-Net can detect the edges in the pictures, and the detected edges for (b) and (c) fit the ground truth better. The performance of Canny is stable: it is able to perceive the stone grains in the loess, although with noise, as in (e) and (f). U-Net performs better at detecting the main veins in the slope, which appear clear and smooth in (g), (h), and (i).

The perceived visual information in Figure 11 has characteristics similar to those in Figure 8. In the long zone photographed by the UAV, the edges detected by U-Net are more distinct, with good continuity. For small gravels and small structures, Canny is more likely to perceive their edges and surface textures. Compared with U-Net, the edges detected by Canny are more fragmentary.

Swarm intelligence-based UAV path planning

Problem formulation

Apart from visual perception, another basic mission for the UAV is to determine an appropriate flight path, as described in the “Related work” section. In our system, the regions and points of interest in the mine change over time, as does the status of the rocks and cracks in the slope. Besides, different kinds of instruments are installed in the field and need periodic inspection. Considering the limits on power and flight time, it is necessary to plan a suitable path adaptively for the dynamic flight mission. The UAV can then take photos along the route, process them online or off-line, and transmit them to the central computing station.

As demonstrated in Figure 12, the objects in the slope and several instruments are marked on the map with different colors.

Figure 12.

Objects and instruments in the mine.

In the system, shooting points for the objects are recorded by their longitudes, latitudes, and heights. Table 3 lists an example of a few points on one flight.

Table 3.

Points of interest in one flight.

Number	Longitude (°)	Latitude (°)	Height (m)
1	118.2583291	34.9896223	219
2	118.2591323	34.9917996	220
3	118.2593608	34.9937517	231
4	118.2589825	34.9897308	239
5	118.2590167	34.9914123	248
6	118.2579963	34.9889917	253

At the present stage, we measure the UAV flight cost by distance. Because our setting is an open pit with few obstacles for the UAV, the route planning problem can be formulated as a traveling salesman problem (TSP)⁴⁵: the salesman (UAV) needs to visit all the cities (points of interest) along the optimal route.

As shown in Figure 13, for a group of targets in the mine, our UAV needs to inspect them from a set of positions suitable for photography, keeping a proper distance from the slopes while in the air. If we can obtain the coordinates (longitudes and latitudes) of these shooting points, the inspection flight path for the UAV can be obtained by solving a TSP whose nodes are like those recorded in Table 3.

Figure 13.

UAV path planning by solving TSP. UAV: unmanned aerial vehicle; TSP: traveling salesman problem.

Method for cost calculation

With the longitude, latitude, and height of each point, we can first calculate the ground distance between every two points, i and j, by equation (12)

D_{i j}^{'} = 2 arcsin \sqrt{{sin}^{2} (\frac{a}{2}) + cos ({lat}_{i}) \times cos ({lat}_{j}) \times {sin}^{2} (\frac{b}{2})} \times R

where ${lat}_{i}$ and ${long}_{i}$ denote the latitude and longitude of point i in radians, converted from the coordinates in Table 3 by equations (13) and (14); R is the equatorial radius, 6,378,137 m; and a and b are two intermediate variables computed by equations (15) and (16)

{lat}_{i} = {Latitude}_{i} \times π / 180 \quad (13)

{long}_{i} = {Longitude}_{i} \times π / 180 \quad (14)

a = {lat}_{i} - {lat}_{j} \quad (15)

b = {long}_{i} - {long}_{j} \quad (16)

Because the scene is 3-D, we further include the heights listed in Table 3. The distance $D_{ij}$, which is also the cost of segment [i, j], can then be approximated by equation (17)

D_{ij} = \sqrt{(D_{ij}^{'})^{2} + (h_{i} - h_{j})^{2}} \quad (17)

where h_i and h_j are the heights of point i and point j.
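
The distance calculation in equations (12) to (17) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `segment_cost` and the tuple layout (longitude, latitude, height) are assumptions made for the example and are not taken from the monitoring system itself.

```python
import math

R_EQUATOR_M = 6_378_137.0  # equatorial radius R from the text, in metres


def segment_cost(p, q):
    """Cost D_ij between two points given as (longitude_deg, latitude_deg, height_m)."""
    lon_i, lat_i, h_i = p
    lon_j, lat_j, h_j = q
    # Equations (13) and (14): convert degrees to radians.
    lat_i_r, lat_j_r = math.radians(lat_i), math.radians(lat_j)
    lon_i_r, lon_j_r = math.radians(lon_i), math.radians(lon_j)
    # Equations (15) and (16): intermediate differences a and b.
    a = lat_i_r - lat_j_r
    b = lon_i_r - lon_j_r
    # Equation (12): great-circle distance on a sphere of radius R.
    s = math.sin(a / 2) ** 2 + math.cos(lat_i_r) * math.cos(lat_j_r) * math.sin(b / 2) ** 2
    d_surface = 2 * R_EQUATOR_M * math.asin(math.sqrt(s))
    # Equation (17): combine the surface distance with the height difference.
    return math.hypot(d_surface, h_i - h_j)


# Points 1 and 6 of Table 3, written as (longitude, latitude, height).
p1 = (118.2583291, 34.9896223, 219.0)
p6 = (118.2579963, 34.9889917, 253.0)
d_16 = segment_cost(p1, p6)
print(f"D_16 = {d_16:.2f} m")
```

For points 1 and 6 the result is roughly 84 m, of which the 34 m height difference contributes a noticeable share, which is why the height term is worth including for a pit with steep slopes.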

As noted above, the cost of the UAV is measured by flight distance. For a route $R_k$ that visits the nodes in the order $[i, j, k, \dots, N^{'}, N]$, the total cost can be calculated by equation (18)

D = D_{ij} + D_{jk} + \dots + D_{N^{'} N} \quad (18)

where $D_{ij}$ is the distance between two nodes, i and j, given by equation (17).

Once the cost between every two nodes is determined, the path planning problem becomes a TSP, which can be solved by many optimization algorithms.⁴⁶,⁴⁷
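
As a rough check of equation (18), the following Python sketch sums the segment costs along a visiting order for the six points of Table 3. The names `POINTS`, `segment_cost`, and `route_cost` are hypothetical, and the `closed` flag (adding the leg from the last node back to the first) is an assumption of this sketch, since equation (18) does not state it explicitly.

```python
import math

R = 6_378_137.0  # equatorial radius in metres, as given in the text

# Table 3: point number -> (longitude in degrees, latitude in degrees, height in m)
POINTS = {
    1: (118.2583291, 34.9896223, 219.0),
    2: (118.2591323, 34.9917996, 220.0),
    3: (118.2593608, 34.9937517, 231.0),
    4: (118.2589825, 34.9897308, 239.0),
    5: (118.2590167, 34.9914123, 248.0),
    6: (118.2579963, 34.9889917, 253.0),
}


def segment_cost(i, j):
    """D_ij of equation (17) for two point numbers from Table 3."""
    lon_i, lat_i, h_i = POINTS[i]
    lon_j, lat_j, h_j = POINTS[j]
    lat_i, lat_j = math.radians(lat_i), math.radians(lat_j)
    a = lat_i - lat_j
    b = math.radians(lon_i) - math.radians(lon_j)
    s = math.sin(a / 2) ** 2 + math.cos(lat_i) * math.cos(lat_j) * math.sin(b / 2) ** 2
    return math.hypot(2 * R * math.asin(math.sqrt(s)), h_i - h_j)


def route_cost(order, closed=True):
    """Equation (18): total cost of visiting the nodes in `order`.

    If `closed` is True, the return leg to the first node is added (an assumption).
    """
    legs = list(zip(order, order[1:]))
    if closed:
        legs.append((order[-1], order[0]))
    return sum(segment_cost(i, j) for i, j in legs)


sequence = [1, 6, 4, 5, 2, 3]  # the sequence reported later in this section
open_cost = route_cost(sequence, closed=False)
closed_cost = route_cost(sequence, closed=True)
print(f"open: {open_cost:.1f} m, closed: {closed_cost:.1f} m")
```

Under these assumptions the closed tour evaluates to about 1134.2 m, in line with the cost reported for this sequence, which suggests that the reported figure includes the return leg from node 3 to node 1.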

Path planning by ACO

In our system, the power supply limits each UAV flight to less than 1 h. For this reason, each flight also covers fewer than 10 shooting points.

For a TSP with so few nodes, classical optimization methods with a relatively large population size are more than adequate for obtaining an appropriate route. We therefore simply adopt the classical model, ACO,³³,⁴⁸ for this task.

ACO has been a popular heuristic optimization algorithm over the last few decades. It is inspired by the foraging behavior of ants, which release pheromone along the routes they travel. Since each “artificial ant” carries a fixed amount of pheromone, the concentration is higher on short routes and lower on long routes. An ant considers the pheromone concentration on each path when it moves to the next position. After a number of iterations, more and more ants select the shorter route, and the colony gradually approaches the optimal solution. There are many variants and improvements of ACO, such as Ant System, which includes the ant-cycle, ant-density, and ant-quantity models³³; Ant Colony System⁴⁹; and MAX–MIN Ant System.⁵⁰

In this article, we adopt the classical ant-cycle model³³ to plan the route for the UAV. In this model, the probability $P_{ij}^{k} (t)$ that the kth ant moves from position i to position j at time t can be written as equation (19)

P_{ij}^{k} (t) = \begin{cases} \frac{[τ_{ij} (t)]^{α} [η_{ij} (t)]^{β}}{\sum_{s \in {allowed}_{k}} [τ_{is} (t)]^{α} [η_{is} (t)]^{β}}, & j \in {allowed}_{k} \\ 0, & \text{otherwise} \end{cases} \quad (19)

where $τ_{ij} (t)$ denotes the amount of pheromone on edge [i, j]; α and β are two parameters; ${allowed}_{k}$ is a shrinking list containing the cities that the kth ant has not yet visited; and there are K ants (k = 1, 2, …, K) in the algorithm.
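
The transition rule of equation (19) can be illustrated with a small Python sketch. The function name `transition_probabilities` and the three-node distance and pheromone matrices are hypothetical values chosen for the example; the heuristic term is taken as $η_{ij} = 1/d_{ij}$, as in equation (21).

```python
def transition_probabilities(i, allowed, tau, eta, alpha=1.0, beta=2.0):
    """Equation (19): probability of moving from node i to each node in `allowed`."""
    # Numerator of equation (19) for every candidate node j.
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed}
    # Denominator: the same quantity summed over all nodes still allowed.
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Hypothetical 3-node example with uniform pheromone.
dist = [
    [0.0, 10.0, 20.0],
    [10.0, 0.0, 15.0],
    [20.0, 15.0, 0.0],
]
n = len(dist)
tau = [[1.0] * n for _ in range(n)]
eta = [[0.0 if r == c else 1.0 / dist[r][c] for c in range(n)] for r in range(n)]

# An ant at node 0 that has not yet visited nodes 1 and 2.
probs = transition_probabilities(0, [1, 2], tau, eta)
print(probs)
```

With uniform pheromone and β = 2, node 1 (at half the distance of node 2) receives four times the weight, so the ant moves there with probability of about 0.8. Nodes outside the allowed list are simply absent from the result, which corresponds to the zero branch of equation (19).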

$τ_{ij} (t)$ and $η_{ij} (t)$ are given by equations (20) and (21)

τ_{ij} (t + 1) = ρ τ_{ij} (t) + Δ τ_{ij} (t, t + 1) \quad (20)

η_{ij} (t) = \frac{1}{d_{ij}} \quad (21)

where ρ is an important parameter in the model, the coefficient governing pheromone evaporation on the path; $d_{ij}$ is the cost (distance in this scenario) of edge [i, j]; and $Δ τ_{ij} (t, t + 1)$ denotes the variation of pheromone on edge [i, j], which can be obtained from equations (22) and (23)

Δ τ_{ij} (t, t + 1) = \sum_{k = 1}^{K} τ_{ij}^{k} (t, t + 1) \quad (22)

τ_{ij}^{k} (t, t + 1) = \begin{cases} \frac{Q}{L_{k}}, & \text{if ant } k \text{ passed through edge } [i, j] \\ 0, & \text{otherwise} \end{cases} \quad (23)

where Q is a preset constant representing the total amount of pheromone, and $L_k$ is the length of the tour found by the kth of the K ants.
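
The pheromone update of equations (20), (22), and (23) might look as follows in Python. The function name `update_pheromone`, the four-node example, and the two tour lengths are hypothetical. The sketch applies equation (20) literally, so ρ multiplies the previous trail; it also assumes closed tours and a symmetric problem, so each deposit is written to both [i, j] and [j, i].

```python
def update_pheromone(tau, tours, lengths, rho=0.5, q=100.0):
    """Return the pheromone matrix at t + 1 from equations (20), (22), and (23)."""
    n = len(tau)
    delta = [[0.0] * n for _ in range(n)]  # equation (22): deposits summed over ants
    for tour, length in zip(tours, lengths):
        deposit = q / length  # equation (23): Q / L_k on every edge ant k used
        for a, b in zip(tour, tour[1:] + tour[:1]):  # includes the closing edge
            delta[a][b] += deposit
            delta[b][a] += deposit  # symmetric TSP: edge [i, j] equals [j, i]
    # Equation (20): scale the old trail by rho, then add the new deposits.
    return [[rho * tau[i][j] + delta[i][j] for j in range(n)] for i in range(n)]


# Hypothetical example: two ants on four nodes, uniform initial pheromone.
tau = [[1.0] * 4 for _ in range(4)]
tours = [[0, 1, 2, 3], [0, 2, 1, 3]]
lengths = [40.0, 50.0]  # illustrative tour lengths, not measured data
new_tau = update_pheromone(tau, tours, lengths)
print(new_tau[1][2], new_tau[0][2])
```

Edges used by both ants, such as [1, 2], accumulate both deposits, while the shorter tour contributes the larger share, which is how short routes come to attract more ants in later iterations.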

Previous studies have summarized some rules for determining the ACO hyper-parameters. A larger number of ants, K, strengthens the algorithm’s global search ability, but it also raises the computational cost and time. In our application, because one inspection mission has fewer than 10 nodes, we simply use the basic ACO model with a slightly larger K of 50 so as to find a better solution. For the same reason, we also choose a slightly larger MaxIteration of 500. The total amount of pheromone, Q, mainly affects the accumulation of pheromone on each route. It is related to the length of the routes (the environment) and can be determined by experiments; we adopt Q = 100 in this article.

α, β, and ρ are three important parameters of ACO. ρ describes the volatilization rate of the pheromone. A larger ρ increases the randomness of the algorithm and improves its global search ability, but it may also prevent convergence. ACO with a smaller ρ converges rapidly but is likely to fall into a local optimum. In this article, we simply set ρ = 0.5. α and β, which adjust the influences of τ and η, are best selected from [0, 5]. For complex TSPs, other optimization algorithms, such as PSO,³⁶ are often used to optimize these two parameters. We determine them by trial and error, setting α = 1 and β = 2. Since we use slightly larger K and MaxIteration, the ant-cycle model³³ performed well in our experiments.

The obtained sequence for the nodes in Table 3 is [1–6–4–5–2–3], and the cost is 1134.2444 m.
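
For illustration, a compact ant-cycle loop with the parameter values quoted above (K = 50, Q = 100, ρ = 0.5, α = 1, β = 2) could be sketched in Python as follows. This is not the implementation used in our system: the function names, the uniform initial pheromone of 1.0, the random start node for each ant, and the closed-tour cost are all assumptions, and the demo uses six synthetic points on a regular hexagon (not the mine data) with fewer iterations to keep it fast.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour that visits `tour` and returns to its start."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def ant_cycle_tsp(dist, n_ants=50, max_iteration=500, alpha=1.0, beta=2.0,
                  rho=0.5, q=100.0, seed=0):
    """Basic ant-cycle search; returns the best tour found and its length."""
    rng = random.Random(seed)
    n = len(dist)
    eta = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)] for i in range(n)]
    tau = [[1.0] * n for _ in range(n)]  # initial pheromone level: an assumption
    best_tour, best_len = None, math.inf
    for _ in range(max_iteration):
        tours = []
        for _ in range(n_ants):
            tour = [rng.randrange(n)]  # random start node: an assumption
            allowed = [j for j in range(n) if j != tour[0]]
            while allowed:
                i = tour[-1]
                # Equation (19): weights proportional to tau^alpha * eta^beta.
                weights = [(tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed]
                nxt = rng.choices(allowed, weights=weights)[0]
                tour.append(nxt)
                allowed.remove(nxt)
            tours.append((tour, tour_length(tour, dist)))
        # Equation (20), first term: scale the existing trail by rho.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= rho
        # Equations (22) and (23): each ant deposits Q / L_k on its edges.
        for tour, length in tours:
            if length < best_len:
                best_tour, best_len = tour, length
            for a, b in zip(tour, tour[1:] + tour[:1]):
                tau[a][b] += q / length
                tau[b][a] += q / length
    return best_tour, best_len


# Synthetic demo: six points on a unit-radius regular hexagon.
pts = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
dist = [[math.hypot(x1 - x2, y1 - y2) for (x2, y2) in pts] for (x1, y1) in pts]
best_tour, best_len = ant_cycle_tsp(dist, max_iteration=100)
print(best_tour, round(best_len, 3))
```

On this synthetic instance the shortest closed tour is the hexagon perimeter of length 6, which the sketch recovers easily. Replacing `dist` with a matrix built from equation (17) would apply the same loop to shooting points such as those in Table 3. Note that with ρ = 0.5 the pheromone on unused edges halves in every iteration, so a much larger MaxIteration could drive it to zero and would call for a small lower bound on τ.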

Conclusion

In this article, we introduce a UAV-based slope monitoring system and describe two major components of this system: image-based edge detection and swarm intelligence-based path planning.

For visual perception, we employ Canny and U-Net to detect edges, cracks, and ores. Comparative experiments show that both methods detect edges effectively. Canny needs neither a training process nor labeled data, and it is better at detecting the targets on the slope. For close-up pictures and ore pictures, U-Net obtains clearer and smoother edges without manual threshold setting.

The periodic inspection task for the pit is transformed into a TSP. ACO is utilized to deal with the TSP and find an appropriate route for the points of interest in the mine.

In our future work, we will extend the battery life so as to increase the number of monitoring points per flight. More deep learning-based semantic segmentation methods can also be applied, and a data analysis model for landslide prediction will be studied in the system.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the National Natural Science Foundation of China [grant no. 61673098] and in part by the Fundamental Research Funds for the Central Universities, China.

ORCID iD

Peng Chen

References

1. Whittaker. Field robots for the next century. In: Intelligent components and instruments for control applications. Selected papers from the IFAC symposium (eds Ollero, Camacho), Malaga, Spain, 20–22 May 1992, pp. 41–48. Oxford: Pergamon Press. ISBN 0-08-041899-6.

2. Mozas-Calvache, Perez-Garcia, Fernandez-del Castillo. Monitoring of landslide displacements using UAS and control methods based on lines. Landslides 2017; 14(6): 2115–2128.

3. D’Oleire-Oltmanns, Marzolff, Peter, et al. Unmanned aerial vehicle (UAV) for monitoring soil erosion in Morocco. Remote Sens 2012; 4(11): 3390–3416.

4. Gili, Corominas, Rius. Using global positioning system techniques in landslide monitoring. Eng Geol 2000; 55(3): 167–192.

5. Glenn, Streutker, Chadwick, et al. Analysis of LiDAR-derived topographic information for characterizing and differentiating landslide morphology and activity. Geomorphology 2006; 73(1–2): 131–148.

6. Strozzi, Kaab, Frauenfelder, et al. Detection and monitoring of unstable high-mountain slopes with L-band SAR interferometry. In: IGARSS 2003: IEEE international geoscience and remote sensing symposium. Proceedings: learning from Earth’s shapes and sizes, volume I–VII, Toulouse, France, 21–25 July 2003, pp. 1852–1854. New York, USA: IEEE. ISBN 0-7803-7929-2.

7. Atzeni, Barla, Pieraccini, et al. Early warning monitoring of natural and engineered slopes with ground-based synthetic-aperture radar. Rock Mech Rock Eng 2015; 48(1): 235–246.

8. Tralli, Blom, Zlotnicki, et al. Satellite remote sensing of earthquake, volcano, flood, landslide and coastal inundation hazards. ISPRS J Photogramm Remote Sen 2005; 59(4): 185–198.

9. Ohnishi, Nishiyama, Yano, et al. A study of the application of digital photogrammetry to slope monitoring systems. Int J Rock Mech Min Sci 2006; 43(5): 756–766.

10. Lee, Bassett. Application of a photogrammetric technique to a model tunnel. Tunn Undergr Sp Tech 2006; 21(1): 79–95.

11. Saito. Forecasting the time of occurrence of slope failure. In: Proceeding of the sixth international conference on soil mechanics and foundation engineering, Vol. 2, Montreal, 8–15 September 1965, pp. 537–541.

12. Fukuzono. A method to predict the time of slope failure caused by rainfall using the inverse number of velocity of surface displacement. J Jap Landslide Soc 1985; 22: 8–13.

13. Zhang. A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 2000; 22(11): 1330–1334.

14. Zhang, Tsang, Mori, et al. Automatic 3D model reconstruction of cutting tools from a single camera. Comput Ind 2010; 61(7): 711–726.

15. Gonzalez, Woods. Digital image processing, 4th ed. London: Pearson, 2018.

16. Martin, Wabuyele, Chen, et al. Development of an advanced hyperspectral imaging (HSI) system with applications for cancer detection. Ann Biomed Eng 2006; 34(6): 1061–1068.

17. Harris, Stephens. A combined corner and edge detector. In: Proceedings of the Alvey vision conference, AVC 1988, Manchester, UK, September, 1988, pp. 1–6.

18. Lowe DG. Object recognition from local scale-invariant features. In: Proceedings of the international conference on computer vision, Kerkyra, Corfu, Greece, 20–25 September 1999, pp. 1150–1157.

19. Daugman. Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression. IEEE Trans Acoust Speech Signal Process 1988; 36(7): 1169–1179.

20. Roberts. Machine perception of three-dimensional solids. New York: Garland Publishing, 1963. ISBN 0-8240-4427-4 [Outstanding Dissertations in the Computer Sciences].

21. Sobel. Camera models and machine perception. Report, Stanford: Stanford University, 1970.

22. Lipkin, Rosenfeld (eds). Picture processing and psychopictorics. Orlando: Academic Press, 1970. ISBN 0124515509.

23. Marr, Hildreth. Theory of edge detection. Proc R Soc Lond 1980; 207(1167): 187–217.

24. Canny. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 1986; 8(6): 679–698. URL https://doi.org/10.1109/TPAMI.1986.4767851.

25. Shelhamer, Long, Darrell. Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 2017; 39(4): 640–651. URL https://doi.org/10.1109/TPAMI.2016.2572683.

26. Gkioxari, Dollár, et al. Mask R-CNN. In: IEEE international conference on computer vision, ICCV 2017, Venice, Italy, 22–29 October 2017, pp. 2980–2988.

27. Ronneberger, Fischer, Brox. U-net: convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention – MICCAI 2015 – 18th international conference, Munich, Germany, 5–9 October 2015, Proceedings, Part III, pp. 234–241.

28. Takahashi, Schilling. Motion planning in a plane using generalized Voronoi diagrams. IEEE Trans Robot Autom 1989; 5(2): 143–150.

29. Choi, Park, et al. Complete coverage navigation of cleaning robots using triangular-cell-based map. IEEE Trans Ind Electron 2004; 51(3): 718–726.

30. Kavraki, Svestka, Latombe, et al. Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Trans Robotic Autom 1996; 12(4): 566–580.

31. Oommen, Iyengar, Rao NSV, et al. Robot navigation in unknown terrains using learned visibility graphs. Part I: the disjoint convex obstacle case. IEEE J Robot Autom 1987; 3(6): 672–681.

32. Pal, Tiwari, Shukla. Modified A* algorithm for mobile robot path planning. In: Soft computing techniques in vision science, studies in computational intelligence, Vol. 395. Berlin: Springer, 2012, pp. 183–193.

33. Dorigo, Maniezzo, Colorni. Ant system: optimization by a colony of cooperating agents. IEEE Trans Syst Man Cybern Part B 1996; 26(1): 29–41.

34. Stentz. Optimal and efficient path planning for partially-known environments. In: Proceedings of the 1994 IEEE international conference on robotics and automation, Vol. 4, San Diego, CA, USA, 8–13 May 1994, pp. 3310–3317.

35. Tanese. Distributed genetic algorithms for function optimization. PhD Thesis, University of Michigan Ann Arbor, USA, 1989.

36. Kennedy. Particle swarm optimization. In: Proceedings of 1995 IEEE international conference on neural networks, Vol. 4. Perth, Australia, pp. 1942–1948.

37. Wybe Dijkstra. A note on two problems in connexion with graphs. Numer Math 1959; 1: 269–271.

38. Khatib. Real-time obstacle avoidance for manipulators and mobile robots. New York: Springer, 1990, pp. 396–404. ISBN 978-1-4613-8997-2.

39. Sun. Forecast and prevention of slope deformed destruction in Qidashan iron ore mine. Min Eng 2006; 4(3): 18–20.

40. Jähne, Scharr, Körkel, et al. Principles of filter design. Cambridge: Academic Press, 1999. pp. 125–151.

41. LeCun, Bengio. Word-level training of a handwritten word recognizer based on convolutional neural networks. In: 12th IAPR international conference on pattern recognition, Conference B: pattern recognition and neural networks, ICPR 1994, Vol. 2, Jerusalem, Israel, 9–13 October 1994, pp. 88–92.

42. Arganda-Carreras, Turaga, Berger, et al. Crowdsourcing the creation of image segmentation algorithms for connectomics. Front Neuroanat 2015; 9: 142.

43. Simonyan, Zisserman. Very deep convolutional networks for large-scale image recognition. In: Third international conference on learning representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015, Conference Track Proceedings. URL http://arxiv.org/abs/1409.1556.

44. Zhang, Ren, et al. Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, 27–30 June 2016, pp. 770–778.

45. Gambardella, Dorigo. Solving symmetric and asymmetric TSPs by ant colonies. In: Proceedings of IEEE international conference on evolutionary computation, Nagoya, Japan, 20–22 May 1996, pp. 622–627. Piscataway, NJ, USA: IEEE.

46. Zhang, Chen, et al. Ant colony optimization combined with immunosuppression and parameters switching strategy for solving path planning problem of landfill inspection robots. Int J Adv Robot Syst 2016; 13(3): 130.

47. Chen, Zhang, et al. Hybrid chaos-based particle swarm optimization-ant colony optimization algorithm with asynchronous pheromone updating strategy for path planning of landfill inspection robots. Int J Adv Robot Syst 2019; 16(4): 1–11.

48. Colorni, Dorigo, Maniezzo. An investigation of some properties of an ‘ant algorithm’. In: Reinhard, Bernard (eds) PPSN, Vol. 92. Amsterdam: Elsevier Publishing, pp. 509–520.

49. Dorigo, Gambardella. Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans Evol Comput 1997; 1(1): 53–66.

50. Stutzle, Hoos. Max-min ant system and local search for the traveling salesman problem. In: Proceedings of the IEEE international conference on evolutionary computation, Indianapolis, USA, 13–16 April 1997, pp. 309–314. Piscataway, NJ, USA: IEEE.