Abstract
Marine debris is a serious global problem that is not limited to areas where humans live but also drifts around the world with wind and currents. More than 10 million tons of plastic waste flow into the ocean every year, posing a major threat to humanity. This study designs a path planning algorithm for surface garbage-cleaning robots called U*, which aims to improve the efficiency of salvaging marine debris and reduce labor and time costs. The U* algorithm consists of two procedures: exploration and path-planning. The exploration procedure searches for marine debris, while the path-planning procedure predicts the possible location of marine debris using the velocity and direction of ocean currents and finds the shortest path by using a genetic algorithm (GA) to collect the found marine debris. According to the experimental results, the U* method is more efficient in terms of reducing path length and time costs.
Introduction
Marine debris is a general term for waste intentionally or unintentionally produced by human activities and eventually flows into oceans. Plastics are crucial marine debris components because they are challenging to biodegrade. As marine organisms easily consume plastics, marine debris seriously threatens their survival. In addition, marine debris poses a threat to navigational safety, marine and coastal environments, and human health. A Taiwan Biodiversity Research Center team analyzed marine debris in seven oceans worldwide using big data technology and found that wind resistance and ocean currents affect the distribution of debris. The accumulation of marine debris in the Pacific region is the most severe, from subtropical to tropical and polar regions, with 50% of the debris still drifting at sea [1]. If the monitored sea marine debris could be cleared, the best solution would be to assist in garbage clearing using underwater garbage-clearing robots. An underwater garbage-clearing robot can collect data and clear marine debris without geographical limitations [1-3].
Path planning is one of the core areas of underwater robot technology research and is critical for realizing autonomous navigation. With the development of science and technology, many researchers have been working on underwater robots to address path-planning problems using various algorithms. However, these methods assume that the target and the environment are static. In the real environment, there is an influence of ocean currents and wind, causing drifting garbage to change its position continuously. Therefore, designing a path-planning algorithm suitable for marine environments is more realistic. For example, in the real world, roads change permanently in road construction or temporarily in traffic control. However, the location information of the drifting garbage is dynamic, which increases the complexity of path planning.
More than 80% of marine debris is found at or one meter below the water’s surface [12]. For garbage that has entered the sea, clearing marine debris from the water will significantly reduce the amount of marine debris. Figure 1 shows a schematic of the underwater garbage-cleaning robot. Because the robot has good mobility and can collect all the marine debris, we modeled this problem as a Hamiltonian cycle [30]. However, in a real environment, marine debris is also affected by currents and wind, thus changing its location constantly and increasing the difficulty of robot path planning. Owing to the sensors’ detection limitations, underwater garbage-cleaning robots can only effectively detect marine debris within their detection range. Finally, after the robot collects all the marine debris, it must return to the charging station (starting point) to recharge, making salvaging marine debris challenging.

An example of an underwater garbage-cleaning robot.
In this study, a path-planning algorithm based on a genetic algorithm (GA) with real-time collected data, including current velocity and direction, predicts the marine debris location and then plans a traversal path suitable for a floating garbage-cleaning robot called U*. U* improves salvaging marine debris efficiency and reduces labor and time costs. Compared with the Random and the Greedy methods, the U* method proposed in this study can shorten the path length, reducing the time cost of salvaging marine debris.
In summary, the aims of this study are as follows: The ocean constantly changes, and winds and currents can influence and vary the location of marine debris. Most marine debris is being collected. This study considered setting up a charging station, allowing the robot to return to the starting point for charging. This problem formed the basis for calculating the Hamiltonian cycle [30]. For greater efficiency, the expected Hamiltonian cycle was the smallest.
The contributions of this study are summarized into four aspects. In this study, we designed a path-planning algorithm for a floating garbage-cleaning robot based on a GA called U*. We compared four path planning algorithms, namely Random, Greedy, Improved GA [22], and SUMRF [24], and confirmed that U* is superior to the above four methods in terms of average path length and efficiency in salvaging marine debris. U* can increase the marine debris amount collected, thereby improving the efficiency of marine debris collection using a floating garbage-cleaning robot. U* estimates and explores marine debris location based on the ocean currents’ velocity and direction. The robot chooses a shorter path based on the remaining power or task deadline to reduce power consumption. The study also examined the NP-hard Hamiltonian cycle problem to enable the robot to collect time-shared data and charge itself at the station.
The rest of this paper is organized as follows. Section II describes two classes of solutions for cleaning up underwater debris. Section III defines marine debris collection as a dynamic TSP and describes the U* algorithm details. The fourth section is a simulation experiment, comparing the performance of the Random method, the Greedy method, and U*. The last section is the conclusions of this study and future work.
The solutions for clearing the underwater garbage
If the monitored marine debris can be cleared, the best solution is to assist in clearing the garbage through underwater garbage robots. The underwater garbage robot can not only collect data but also clear the marine debris, which is without geographical restrictions [5-13].
In addition to garbage collection boats, the construction of underwater dumpsters is also one of the feasible solutions. At present, there are four types of common underwater dumpster: Mr. Trash Wheel, Seabin, The Ocean Clean-up, and Diandian Plastic underwater dumpster [14-16].
Table 1 summarizes the advantages and disadvantages of the current technologies for the treatment of marine debris.
Comparison of advantages and disadvantages of various solutions
Comparison of advantages and disadvantages of various solutions
Table 2 lists the research objectives and characteristics of references [21-29]. In [21], the authors overcome the problems of premature population growth and the slow convergence speed of traditional genetic algorithms. The author proposes a strategy using multi-domain inversion to increase the number of offspring and conducts a second fitness evaluation to eliminate bad offspring.
Research aims and characteristics of path-planning algorithms in references [21-29]
Research aims and characteristics of path-planning algorithms in references [21-29]
This improvement helps to effectively enhance the local search ability and increase the probability of producing excellent individuals. For the effectiveness of AUV path planning and to allow the algorithm to converge quickly. In [22], the author established the path planning model of AUV based on the improved genetic algorithm. In the model, the shortest path length is used as the main criterion, and the boundary conditions are established with the collision avoidance range and turning angle as constraints. Reasonable crossover and mutation-adaptive probability models are designed.
Ocean protection and monitoring have always been important topics. Today, autonomous underwater vehicles (AUVs) are used for monitoring and exploration of the marine environment. However, the path planning of AUV is not efficient in terms of information content and energy consumption. In view of this, informative path planning (IPP) is an effective alternative, which is defined as the path to maximize the collection of information. In [23], the authors propose a genetic path planner (GPP) that incorporates a genetic algorithm based IPP strategy with the aim of generating a path while maximizing the collected information and the coverage of the inspected area. At present, water surface cleaning mainly relies on manual operation, which is inefficient and dangerous. In [24], the authors designed a fully automatic water surface cleaning robot, called SMURF, to achieve efficient water surface cleaning without manual operation. The authors also propose a novel and easy-to-implement path planning method for water surface coverage and an improved nonlinear model predictive controller.
Another vehicle similar to an AUV, called an unmanned surface vehicle (USV), usually explores a target area of interest and requires full area coverage. The biological inspired neural network (BINN) algorithm has been widely used in the path planning of mobile robots. In [25], the author proposed a completely covered neural network (CCNN) algorithm for USV path planning. CCNN reduces the calculation time by simplifying the calculation process of neural activities. To improve coverage efficiency, CCNN combines coverage and establishes a decision formula for paths. The CCNN algorithm can also reduce the corner of the path trajectory and make the path smoother. In addition, to reduce path turns, the author also proposes an improved A* algorithm. Optimizing coverage path planning (CPP) in robotics has become the key to efficient coverage applications.
In [26], the authors proposed a new approach based on the traveling salesman problem (TSP) and exploiting grid-based maps with deep reinforcement learning (DRL) to solve the CPP problem in large, complex environments. The solution to the TSP is determined by using reinforcement learning (RL) to train a recurrent neural network (RNN) with long-short-term memory (LSTM) layers. With the increasing demand for ocean exploration activities, more and more underwater vehicles with autonomous path planning capabilities have been designed and developed. The traditional A* algorithm for path planning has many limitations, such as too many nodes, too many turns, a small turning angle, etc. The designed path is not conducive to the development of autonomous underwater vehicles (AUVs). In [27], the author uses the improved A* algorithm for path optimization to smooth and optimize the motion path of the AUV.
In the actual environment, there are many obstacles, so the robot’s obstacle avoidance problem is also an important issue in path planning. Therefore, anti-collision path planning is crucial in path planning for the decision to establish a Maritime Autonomous Surface Ship (MASS). Many collision avoidance methods for static and dynamic obstacles have been proposed. In [28], the authors proposed a distributed coordinated path planning algorithm especially suitable for multi-vessel encounters. The algorithm combines constraints of apparent action, time dimension, and the ship’s dynamic properties and uses the Three-Dimensional Generalized Velocity Obstacle (TGVO) algorithm to generate a practical and COLREGs-compliant collision-free velocity for the ship. In [29], the authors propose an alternative to traditional robotic foraging techniques. With this approach, layouts can be traced faster with fewer moves. The algorithm is well suited for autonomous lawnmowers and is expected to allow the robot to cover the entire area, reducing the robot’s traversal of repetitive areas. The algorithm can be applied to fields such as drones and underwater vehicles.
Taken together, these studies explore different approaches and optimization algorithms for AUV path planning. A genetic algorithm is one of the most used among them. The ocean is a dynamic environment, and the location of marine debris is constantly changing due to the influence of factors such as ocean currents and wind. The traditional genetic algorithm cannot cope with the dynamic environment. This study has two research goals: How to maximize the number of garbage collections in a dynamic environment. To the needs of the situation, we expect the robot to return to the charging station to continue charging after collecting marine debris, so we need to find the Hamiltonian cycle, which is an NP-hard problem [30], so this research will predict the location of marine debris and use the genetic algorithm to be executed multiple times to gradually collect marine garbage in time-sharing.
Environment modelling and problem description
Before explaining the U* algorithm, the environment’s information must be represented. Considering the Fig. 2, it is assumed that there is only one garbage cleaning robot in the environment, and a GPS receiver is installed on the robot, which can accurately locate its position. In addition, the lidar and camera are installed on the robot, and the image measurement approach (IMA) localization technology is used [22], which can detect the location of marine debris and obstacles within the effective detection range (DR) [4-8].

An example of the garbage-cleaning mission.
Therefore, we assume the marine debris outside the detection range is invisible, and the marine debris needs to be within the detection range to be accurately detected. The robot is also equipped with ocean current sensors that can detect the speed and direction of the ocean current. In the process of the robot picking up marine debris, it is thought that the marine debris has been picked up when the distance between the robot and the marine debris is less than 0.1 meters.
The author stated in [12] that more than 80% of marine debris is at the water’s surface or one meter below the water’s surface. Therefore, this study takes the marine debris on the sea surface as the main object of removal. In this study’s case, we plan to set up a charging station so that the robot can come back to it after work to get charged. This problem can be modelled as a Hamiltonian cycle problem. The Hamiltonian cycle with the smallest path is the minimum cycle. The calculation of the Hamiltonian cycle is a NP-hard problem [30] and the situation is shown in Fig. 2.
For the robot to be able to traverse all the marine debris once, and only once, and return to the starting point. This problem can be transformed into the Hamiltonian cycle. Assuming that there are n marine debris in the environment, the marine debris can be expressed as {G1, G2, . . , Gn-1, G
n
}. After the robot starts from the starting point V0, each marine debris can be traversed once, and it finally returns to the starting point V0 (Hamiltonian cycle). Therefore, the robot’s movement trajectory,
In U* path planning, it is divided into two procedures: the exploration procedure and the path planning procedure. At the beginning of the task of collecting marine debris, the robot starts from the charging station (the starting point) and detects whether there is marine debris in Through lidar and cameras, this study uses the image measurement approach (IMA) [20] to obtain the location information of marine debris. The primary goal of the exploration procedure is to detect marine debris and record its location. Since the robot does not record any information about marine debris in its initial state, the location of marine debris is detected through the exploration procedure. The following three situations are discussed: There is no marine debris within the detection range. The robot executes the U* exploration procedure. The robot generates a virtual target within the detection range according to the velocity and direction of the ocean current, moves towards the virtual target position, and then to detect marine debris. There is marine debris in the detection range, and the number is less than threshold, k. Where the threshold k is the threshold of the number of marine debris, it determines whether the robot executes the U* path planning procedure. Recording the location information of the marine debris in the GT table and selecting the nearest marine debris as the moving target position. As shown in Table 3, the GT table records the ID, whether or not it has been salvaged (flag
Gi
={ 0, 1 }), and the location information (x
Gi
, y
Gi
) of marine debris G
i
, which k can be adjusted based on demand and is set to 5 in this study. When the number of marine debris collected exceeds the threshold value k, the path planning procedure is started, and the traversal path of the robot is calculated through GA and prediction of the position of marine debris. The reason for setting k to 5 in this study is that when the value of k is too large, it affects the execution time of the genetic algorithm. Up to 10!, that is, up to 3,628,800 path trajectories. In contrast, if the number is too small, there is no need to use a genetic algorithm. Whereas 5! is about 120 kinds of path trajectories. As for the best value of k, there is no conclusion in the current study about the genetic algorithm. There is marine debris in the detection range, and the number is more than or equal to threshold, k. The robot executes the U* path planning procedure. The path planning procedure is described in of Section 3.3. When the marine debris G
i
is successfully collected, G
i
is remarked and avoid being selected in the future. The Algorithm 1 is the pseudo-code of the exploration procedure of U*.
An example of GT table in initial state of robot
An example of GT table in initial state of robot
The exploration procedure of U*
remaining power of robot: RP, desire task time: T desire , present time: T present , speed of the ocean current: C s , direction of the ocean current: C d
1) Predicting the location of marine debris
After the GT table is established, the GT table is converted into a distance matrix,
Figure 3 shows how to predict the possible position of marine debris and obtain

An example of calculating the possible position of marine debris (G j ) after time t i .
The U* algorithm proposed in this paper is used for path planning by an autonomous robot tasked with cleaning up ocean debris. The algorithm consists of two main parts: exploration and path planning. In the path planning stage, the algorithm first predicts the possible locations of ocean debris and calculates the GT table and distance matrix
After figuring out the best way to plan a path, the robot moves along the path to pick up the trash. The algorithm also changes the GT table and the distance matrix
At each time interval t
d
, the GT table
2) Population initialization

An example of path planning procedure of U*.
After the GT table and the distance matrix

An example of population initialization.
There is n ! kinds of individuals from G1 to G n . However, if the number of individuals is large, the calculation is complicated, and the operational efficiency of the genetic algorithm is reduced. Therefore, in this study, we set the parameter p to 5.
3) Calculate the fitness of each individual
To select suitable individuals for crossover and mutation operations, calculate the total length of the moving trajectory
4) Crossover and mutation
After calculating the fitness of the individual, crossover and mutation of the individual are performed to generate new offspring individuals. Intersect the two parents except that the end point is the same as the start point. Suppose there are two parent individuals with parent1 {
From Equation (16), it can be found that when the γ is larger, the probability that the offspring selects the gene segment of the parent
Since the starting point and end point of the generated offspring individuals are consistent, only the intermediate nodes are mutated, as shown in Fig. 6. In the mutation operation, given a probability ω, the initial value is 0.01. We randomly select one set of nodes (the i-th and j-th nodes) from the offspring individual e for mutation operations. As shown in Equation (17), the order of the two nodes is exchanged. Algorithm 2 is the pseudo-code of the genetic algorithm in the path planning procedure.

Mutation operation.
The path planning procedure of U*
The robot salvages marine debris according to the movement trajectory
Simulation configuration parameters
In this study, we used the operating system Windows 11 with 16 GB of DDR4-3000 memory. In terms of simulation development, we use the C/C++ language, MATLAB, and Gnuplot as auxiliary tools for drawing and complex calculations.
To verify the effectiveness of U*, this study compares the four path planning algorithms: the Random, Greedy, Improved GA [22], and SUMRF [24]. Random and Greedy methods have the advantage of being simple to implement. In the Random method, when the robot detects marine debris, it travels directly to the location of the marine debris for salvage. During the detection process, the ID and location information of the marine debris are recorded in the queue, and the detected marine debris is removed one by one in a “first come, first served” strategy. To reduce the robot’s time and power costs, another approach employs a greedy approach. Using information about the location of marine debris in the queue, the robot chooses the garbage closest to itself to salvage first.
We implemented three sets of experiments. Each set of experiments was conducted as follows: (A) changes the detection radius of the robot. (B) Changes the robot’s movement speed. (C) Changes the velocity of ocean current. The parameters of the experimental settings are listed in Table 4.
Experimental parameter settings
Experimental parameter settings
For realism in the simulation, marine debris movement is modeled using a kinematic model [31], which assumes a current field that is a combination of tidal and residual currents. One can make an approximation to estimate the dimensionless velocity field in the kinematic model. where Vx is the speed on the X-axis, Vy is the speed on the Y-axis, and k1, k2, k3, and λ are variables that are closely related to environmental factors, such as tides and bathymetry. The kinematic model parameters of this study are shown in Table 4.
The Equation (21) defines the average number of marine debris collected, Avg num , where NG,m represents the number of marine debris collected in the m-th experiment. In addition, when the distance between the robot and the marine debris is less than 1 m, it is assumed that the marine debris can be successfully collected, where et is the total number of runs.
This set of experiments compared the effects of the five methods on the total path length of the robot and the collection efficiency of marine debris when the detection radius was changed. The robot starts in the upper left corner of the area, the detection radius is set to grow from 10 meters to 150 meters, the robot’s speed is set to 10 m/s, the ocean current velocity is 1.2 m/s, and each experiment takes 3000 seconds to run.
Table 5 and Fig. 7 show the results of the average path length and amount of marine debris collected at different detection radius. In terms of average path length, when the detection radius is small, the difference between the Random and the Greedy methods are not significant due to the smaller amount of marine debris detected. In SUMRF, the influence of the detection radius is small; even if no marine debris is detected, the entire area will be traversed. Therefore, the detection radius size won’t have an impact on SUMRF. To achieve full coverage of SUMRF, the path length will only increase as the area size increases.
Influence of detection radius in term of average path length
Influence of detection radius in term of average path length
DR: detection range (m).
As the detection radius increases, the amount of marine debris detected by the robot gradually increases. Since the Random method does not consider the path cost, many redundant paths will be generated, while the Greedy method only considers the local optimal situation and only selects the shortest distance from the robot to the marine debris collection site, so it is only superior to the Random method. When the detection radius exceeds 60 meters, U* outperforms Improved GA in path length. As the detection radius increases, U* presents a downward and stable trend.
From the perspective of marine debris collection efficiency, the larger the detection radius, the better the efficiency of marine debris collection. Since SUMRF is a static path planning algorithm, it cannot traverse the entire area in a limited time, so the performance is the worst. The Random method does not consider the path cost factor, showing an unstable trend; that is, the first detected marine debris is collected first. The Greedy method only traverses the optimal area, which is slightly better than the Random method. Similar to Improved GA, U* is close to 100% when the detection radius reaches 40 m, as shown in Fig. 7.

Influence of detection radius in term of average number of marine debris collected.
The set of experiments compares the effects of the five methods on the total path length of the robot and the efficiency of marine debris collection when the robot’s speed changes. 50 pieces of marine debris are randomly generated, the robot starts from the upper left corner of the area, the detection radius is set to 50 meters, the movement speed of the robot is increased from 2 m/s to 10 m/s, and the execution time of each experiment is 3000 seconds.
Table 6 and Fig. 8 show the results of the average path length and amount of marine debris collected by the robot at different moving speeds. From Table 6, we can see that the Random and Greedy methods are similar. SUMRF is a static path planning method. The faster the robot moves, the longer the average path length is, showing regular incremental results. It can be seen that when the robot speed increases, both U* and the average path length of the improved genetic algorithm show an upward trend. The average path lengths of the Random method and the Greedy method are not much different.
Influence of robot’s movement speed in term of average path length
Influence of robot’s movement speed in term of average path length
MS: robot’s movement speed (m/s).
In Fig. 8, U* and the improved GA both achieve approximately 100% in terms of the amount of collected marine debris. SUMRF is a predetermined static path trajectory with a time limit, so it performs more uniformly in garbage collection. However, the Random method does not consider the path cost, and marine debris is also found first, which leads to the worst efficiency in salvaging marine debris. In U*, when the amount of marine debris exceeds the threshold k > 5, the genetic algorithm starts to run. Because the genetic algorithm considers the global location of marine debris, it collects more marine debris than the Random, Greedy, and SUMRF algorithms.

Influence of robot’s movement speed in term of average number of marine debris collected.
This set of experiments compares the efficiency of marine debris collection when the velocity of ocean changes. Randomly generate 50 marine debris, the robot starts from the upper left corner of the area, the detection radius is set to 90 meters, the detection frequency is 1 time every ten seconds, the movement speed of the robot is set to 10 m/s, and the ocean velocity is incremented from 1 m/s to 4.6 m/s, each experiment executes for 3000 seconds.
Table 7, and Fig. 9 show the results of the mean path length versus the amount of collected marine debris for different ocean current velocities. In Table 7, when the ocean current velocity increases, the location of marine debris changes rapidly. Since the size of the area is 300 m×300 m, the rapid flow velocity leads to the tendency of marine debris to gather, and it is easier to salvage multiple pieces of garbage at one time. As a result, the average path length tends to decrease. The path length is unaffected by the ocean current velocity because SUMRF is a fixed, static path planning algorithm. The average path lengths of U* and Improved GA are similar, and both show a downward trend. The Random method does not consider the path cost and is relatively unstable. Greedy showed a downward trend when the velocity reached 3.4 m/s.
Influence of ocean velocity in term of aver-age path length
Influence of ocean velocity in term of aver-age path length
OV: ocean velocity (m/s).

Influence of ocean velocity in term of average number of marine debris collected.
In Fig. 9, as the ocean current speed increases, the amount of marine debris collected by U* and Improved GA can reach 100%, while SUMRF is the worst because the robot only traverses a specific path and cannot collect it within a limited time (3000 seconds). traverse the entire area.
Marine debris is a serious global problem. Its scope of influence is not limited to areas where humans live but also drifts around the world with wind or currents. More than 10 million tons of plastic waste flow into the ocean every year, posing a major threat to the marine ecosystem and humans. This study proposes a path planning applied to the surface garbage cleaning robot, called U*. U* is mainly divided into two procedures: exploration and path planning. U* predicts the location of marine debris based on the velocity and direction of the ocean current. Then, the optimal path is found through the genetic algorithm (GA), thereby shortening the path trajectory of the robot, improving the collection efficiency of marine debris, and saving time and power costs. In the future, we will consider using long short-term memory (LSTM) technology to improve the ability to predict the location of marine debris and further improve the accuracy of marine debris collection.
Footnotes
Acknowledgments
This work was supported by the Putian City Science and Technology Project, China (Grant No. 2022GZ2001ptxy14), and the Science Foundation of the Fujian Province Project, China (Grant No. 2020J01924, Grant No. 2023J011015), and Ying Lin was supported by the Putian University Students’ Innovation and Entrepreneurship Training Project (202111498015).
