Abstract
Discovering latent spatiotemporal co-occurrence behavior patterns among large groups of sailing ships is a challenging problem of great importance in many real-world applications. Through spatiotemporal data mining of ship trajectory data, route rules, navigation behavior, and potential anomalies can be mined, providing important support for maritime management, navigation safety, and emergency response. Based on the analysis and mining of ship trajectory data in hotspot sea areas, this paper introduces a ship spatiotemporal co-occurrence pattern mining algorithm based on association rules. Building on a data model and judgment criteria for spatiotemporal co-occurrence regularity, the concepts of candidate set, frequency set, and instance set are introduced, together with the key procedures of the algorithm: pruning and pasting of candidate sets, screening of instance sets, definition of association reasoning, and association rule mining. The process of implementing the spatiotemporal co-occurrence pattern mining algorithm is then devised. Finally, the algorithm is verified using automatic identification system (AIS) data of ships in hotspot sea areas as the source data. The proposed algorithm finds several ship combinations with spatiotemporal co-occurrence regularity in these hotspot sea areas, as well as association rules on the co-occurrence of several ships. The performance of the proposed algorithm is evaluated on a real-world ship trajectory database with a detailed comparative analysis, and the results are promising in terms of computational time. The experimental results show that the algorithm can effectively identify the motion patterns and behavior characteristics of ships, providing an important reference for marine traffic management, ship safety, and marine environment protection.
The research results of this paper are of great significance for improving the efficiency and safety of maritime traffic, and also provide new ideas and methods for further research in related fields.
Introduction
Throughout modern history, great powers have also been strong maritime powers. Today, the ocean has become both a foundation for and a constraint on the sustainable development of China, and China’s national maritime rights and interests depend on ocean security. With the implementation of the “Belt and Road” national development initiative, “Smart Ocean” has gradually been placed on the agenda as a program that aims to combine modern intelligent techniques and big data technology with China’s marine equipment and activities. Supported by an independent, secure, and reliable marine cloud environment, all kinds of marine information resources are integrated to achieve all-directional three-dimensional perception, extensive interconnection, mass data sharing, knowledge analysis and decision-making, and in-depth smart services related to the ocean. In this way, an integrated solution can be developed to enhance maritime military, marine management and control, and ocean development capabilities. 1
In recent years, the capability to gather situation information on ships at sea has steadily improved with the rapid development and wide application of global positioning, sensor networking, and wireless communication technologies. According to statistics, more than 50,000 ships travel at sea every day. 2 These large objects, moving in a complicated marine environment, generate mass data thanks to improving sensor technology. Marine location big data (LBD) refers to marine geographical and human sociological information with spatial and temporal identifiers, including basic electronic charts and electronic navigation charts, real-time navigation information related to ship location, and real-time information from ship-borne equipment during navigation. The automatic identification system (AIS) has been widely deployed, so that mass ship navigation data in a unified format serves as an important information source for marine situation analysis and mining systems. On a daily basis, around 4 GB of AIS data are collected by satellites and shore-based stations worldwide, amounting to up to 1.4 terabytes of AIS data in a year. 3 This information is recorded persistently, accumulating mass historical spatiotemporal ship trajectory data in various fields. The trajectory data not only records the activities of each single moving object, but also embodies hidden knowledge about activity regularities, kinematic features, behavioral patterns, and regional hotspots of the population. 4 Analyzing mass ship trajectory data and mining the hidden knowledge therein are of great significance for improving the construction of marine transportation facilities, enhancing maritime regulation capability, and defending national ocean security.
Analyzing ship trajectory data and mining ship motion patterns is an active research field. Ship trajectory data records a ship’s moving path in the ocean or inland waters and contains the ship’s position, speed, heading, and other key information. These data are not only of great significance to maritime management, safety monitoring, and navigation planning, but also provide rich information resources for marine scientific research, resource development, and environmental protection. With the continuous growth of global maritime traffic and the development of ship automation technology, the scale and complexity of ship trajectory data are also increasing rapidly.
Spatiotemporal data mining is a technique for pattern recognition, correlation analysis and prediction using spatiotemporal information, which plays a vital role in ship trajectory data analysis. Through spatiotemporal data mining of ship trajectory data, route rules, navigation behavior, and potential anomalies can be mined, providing important support for maritime management, navigation safety, and emergency response. 5 This paper will discuss the spatiotemporal data mining technology of ship trajectory data, including key technologies such as data preprocessing, trajectory pattern recognition, and anomaly detection, in order to deeply understand the characteristics and rules of ship trajectory data and mine the useful information contained therein, so as to provide scientific basis for decision-making and management in the maritime field.
With the wide application of Automatic Identification System (AIS) and other technologies, the generation of large amounts of ship trajectory data provides a valuable opportunity to study the movement behavior of single ships and the collective movement patterns at sea. These data not only contain the spatio-temporal information of ships, but also cover multi-dimensional information such as ship type, speed, and width, which makes it possible to conduct in-depth analysis of ship behavior. However, how to mine valuable co-occurrence patterns from these massive spatio-temporal data is still an urgent problem to be solved.
This paper aims to develop an efficient algorithm for mining the spatio-temporal co-occurrence patterns of ships, in order to discover co-occurrence patterns of ships in space, time, and other dimensions from AIS data. Through this pattern mining, frequent encounter situations of different ship types in a specific area can be revealed, providing a scientific basis for maritime traffic safety supervision. In addition, the algorithm will be combined with the Hadoop platform to realize parallel processing, improving computational efficiency and ensuring that it can process large-scale spatio-temporal data sets. Finally, through in-depth analysis of ship trajectory data, this study hopes to provide more accurate maritime traffic situation assessment and decision support for the shipping industry. The remainder of this paper is organized as follows. The background and significance of the research are introduced first. In the third section, the theoretical model of spatio-temporal co-occurrence is established. In the fourth section, the spatio-temporal co-occurrence mining algorithm for ship trajectories is designed. In the fifth section, the algorithm is verified on a large amount of AIS data from a hotspot sea area, and the results of the algorithm are analyzed in detail by adjusting the parameter settings. Finally, the conclusions of the paper are given.
Related work
In recent years, with the growth of marine traffic and the increasing importance of marine environmental protection, research on ship spatio-temporal co-occurrence pattern mining algorithms has attracted much attention, and many scholars have carried out in-depth research in this field. Some studies focus on the application of data mining and machine learning techniques, such as clustering analysis based on trajectory data, pattern recognition, and anomaly detection. Other studies focus on algorithm optimization to improve the accuracy and efficiency of mining algorithms. Still others are devoted to combining ship spatio-temporal co-occurrence pattern mining with related fields, such as intelligent transportation systems and marine resources development, so as to expand the application range and depth of the algorithm.
Spatiotemporal co-occurrence pattern mining is an important research focus of motion pattern mining for moving objects. Co-occurrence refers to the state of motion in which two or more objects or instances exist in a neighborhood both spatially and temporally. 6
The spatiotemporal co-occurrence pattern mining technology based on ship trajectory data is of great significance in the maritime field. This technique can reveal the spatiotemporal association patterns in ship trajectory data, that is, the co-occurrence trends or correlations between different ships in time and space, so as to help analysts better understand maritime traffic rules, navigation behaviors and potential risks. Through the mining of spatiotemporal co-occurrence patterns, the possible relationships, agglomeration phenomena or frequent trajectory patterns between ships can be found, which provides important support for ship control, route planning, and emergency response. 7
Compared with clustering-based mining methods, spatiotemporal co-occurrence pattern mining based on ship trajectory data pays more attention to the spatiotemporal correlation between different ships, rather than simply assigning ships to the same category or cluster. The method can take into account the interaction between ships and the laws of spatiotemporal evolution, helping analysts to understand the complex relationships behind ship trajectory data and thus better grasp the dynamic changes and potential risks of maritime traffic. It can therefore provide more in-depth and comprehensive information support for navigation safety management, route planning optimization, and maritime supervision.
Spatiotemporal co-occurrence mining on mass ship trajectory data is a typical problem in ship motion pattern mining and is highly valuable in practice. For instance, smuggling ships normally navigate in a co-occurrence behavioral pattern for a period, and pirates may follow a target ship closely for a long time before taking any illegal action. In the antiterrorist and military fields, maritime military operations often involve ships navigating in formation, and different formations reflect different operations. Frequently, a great number of ships from different countries and organizations may be disguised under different identities but jointly engage in a specific mission at sea. For such special maritime operations, spatiotemporal co-occurrence pattern mining can be adopted to perform association analysis of AIS trajectory data from many ships, so as to provide technical support for operations planning and tactical activities in the military field. 8 Liu et al.9,10 introduced an AIS data-driven trajectory prediction framework, whose main component is a long short-term memory (LSTM) network, to predict spatiotemporal vessel trajectories. The main benefit of the framework is that it takes full advantage of DBSCAN-based outlier detection and BLSTM-based trajectory reconstruction. To analyze the problem of taxi fraud, Belhadi et al. 11 proposed a two-phase trajectory anomaly detection model. The first phase determines the individual trajectory outliers by computing the distance of each point in each trajectory, whereas the second identifies the group trajectory outliers by exploring the individual trajectory outliers using both feature selection and sliding window strategies. Veena et al. 12 proposed a generic model of fuzzy frequent spatial patterns that may exist in a quantitative spatiotemporal database. A neighborhood pruning technique has been introduced to effectively reduce the search space and the computational cost of finding the desired itemsets. Bommisetty et al. 13 introduced an efficient algorithm to find spatial high utility itemsets in a high-dimensional spatiotemporal database, which relies on two novel concepts: “spatially closed utilities of an itemset” and “spatially non-closed utilities of an itemset.”
Researchers have proposed a variety of methods and models for mining spatiotemporal co-occurrence patterns of ships. For example, the analysis of AIS trajectories using co-clustering algorithms can discover co-occurrence patterns of ships in space, time, and other dimensions. 14 In addition, based on the technologies of satellite navigation, spatio-temporal entity, Web-GIS, and spatial database, the key information of ships, such as basic attributes, real-time location, historical trajectory, association relationship, and running state, can be extracted and entity reorganized, so as to construct ship entity resources and establish a ship trajectory visual analysis system. 15 In order to improve the mining efficiency and accuracy, researchers have proposed a GRU (gated recurrent unit) ship trajectory prediction method based on spatio-temporal attention mechanism, which significantly improves the accuracy of trajectory prediction by enhancing the ability of spatio-temporal information data. 16
In addition, significant progress has been made in ship behavior feature mining and prediction based on AIS data. The research shows that through big data mining and AI analysis, the ship portrait can be formed, the ship operation behavior can be deeply analyzed, and the ship trajectory can be analyzed in detail. 17 These studies not only help to improve the ability of water traffic safety supervision, but also provide support for the modeling and trend prediction of shipping network dynamics. 8
In general, the research on ship spatio-temporal co-occurrence pattern mining algorithms has made some progress, but there are still some challenges and problems to be solved, such as data quality, algorithm efficiency, and real-time performance, which need further discussion and improvement. Future research is needed to further optimize the algorithm and improve the mining efficiency and accuracy to better serve the development of maritime traffic safety and intelligent navigation systems.
In this paper, a novel method is proposed to detect the potential spatiotemporal co-occurrence behavior patterns of large groups of ships while sailing. The ship’s spatiotemporal co-occurrence model is different from the vehicle’s or person’s spatiotemporal co-occurrence model. It can detect more subtle patterns of ship behavior, and can support the discovery of more ships with co-occurrence patterns. 18 The main contributions of this work are summarized as:
1) According to the characteristics of ship trajectory, the data model of ship co-occurrence unit sea area is put forward, and the judgment criterion of spatio-temporal co-occurrence law is studied.
2) Based on the above data model and judgment criterion, such concepts as candidate set, frequency set, and instance set are introduced together with the key procedure of algorithm, including pruning and pasting of candidate sets, screening of instance sets, definition of association reasoning, and association rule mining. Subsequently, the process of implementing the spatiotemporal co-occurrence pattern mining algorithm is devised.
3) In the end, the algorithm is implemented by taking the automatic identification system data of ships in hotspot sea areas as the source data. We obtain several ship combinations with spatiotemporal co-occurrence regularity in these hotspot sea areas, and the association rules on the co-occurrence of several ships.
4) The performance of the proposed algorithm is evaluated on a real-world ship trajectory database, and a detailed comparative analysis is carried out. The results are promising in terms of computational time. The results of this paper can be extended in future work by using a variable time window function.
Model preparations
Before describing the spatio-temporal co-occurrence algorithm for ship trajectories, a suitable theoretical model must be established as a basis. In this section, the co-occurrence unit sea area model of a hotspot sea area is established first. The judgment criterion for the spatio-temporal co-occurrence phenomenon and the judgment criterion for spatio-temporal co-occurrence regularity are then elaborated in detail; these form the basis on which the algorithm is executed. Finally, the conceptual models of candidate set, frequency set, and instance set are defined, which are the basic mathematical concepts of the algorithm. The algorithm described in the following sections consists of a large number of operations on these sets.
Co-occurrence unit sea area
It is assumed that a hotspot sea area
where
If the time period
Spatiotemporal co-occurrence phenomenon judgment criterion
It is assumed that a ship
Spatiotemporal co-occurrence regularity judgment criterion
If two ships
k-element candidate set
In the AIS data, each ship has its unique identification (ID) number. If it is necessary to find out the spatiotemporal co-occurrence regularity between
Hence, a
k-element frequency set
If the survey finds the spatiotemporal co-occurrence relationship between the ships with ID number in a
where
Hence, a
k-element instance set
In order to confirm a
A hotspot sea area is equally divided into several co-occurrence unit sea areas with the same time slot
After traversing all
The
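The data model above can be illustrated with a minimal sketch of how AIS records might be binned into co-occurrence unit sea areas. It is not the authors' implementation: the record layout `(ship_id, t, lat, lon)`, the square cell size `cell_deg`, the origin `(lat0, lon0)`, and the function name `build_instance_sets` are all assumptions made for illustration. Each non-empty (time slot, cell) pair yields one set of ship IDs, which plays the role of an instance set.

```python
from collections import defaultdict

def build_instance_sets(records, lat0, lon0, cell_deg, slot_seconds):
    """Group ship IDs by (time slot, grid row, grid column).

    records: iterable of (ship_id, t_seconds, lat, lon) tuples (assumed layout).
    Returns one frozenset of ship IDs per occupied unit sea area and time slot.
    """
    cells = defaultdict(set)
    for ship_id, t, lat, lon in records:
        key = (
            int(t // slot_seconds),           # index of the time slot
            int((lat - lat0) // cell_deg),    # grid row of the unit sea area
            int((lon - lon0) // cell_deg),    # grid column of the unit sea area
        )
        cells[key].add(ship_id)
    return [frozenset(ships) for ships in cells.values()]

# Hypothetical records: ships A and B share a cell in the first hour.
demo_records = [
    ("A", 0, 20.05, 110.05),
    ("B", 1800, 20.06, 110.04),
    ("C", 1800, 20.25, 110.05),
    ("A", 3700, 20.05, 110.05),
]
demo_instances = build_instance_sets(demo_records, 20.0, 110.0, 0.1, 3600)
```

With a 60 min time slot (`slot_seconds=3600`), ships reported in the same cell during the same hour end up in the same instance set, which is the co-occurrence phenomenon the judgment criterion is meant to capture.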
Algorithm design
A recursive structure is adopted for the spatiotemporal co-occurrence pattern mining algorithm based on association rules. It involves five main steps: pruning candidate sets, pasting frequency sets, screening instance sets, defining association reasoning, and mining association rules. The algorithm is implemented in the following procedure: a
Pruning of candidate sets
Pruning of candidate sets means to calculate how frequently an element in a
It is assumed that, in all
Definition: the number of
The frequency
Definition of frequency judgment criterion: The frequency threshold is set to
The set
This aims to find out a
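The pruning step can be sketched as follows. This is a minimal illustration under one assumption that the text above does not fully confirm: the frequency of a candidate set is taken to be the fraction of instance sets that contain every ship in the candidate. The names `frequency`, `prune_candidates`, and `min_frequency` are hypothetical.

```python
def frequency(candidate, instance_sets):
    """Fraction of instance sets that contain every ship in `candidate`."""
    if not instance_sets:
        return 0.0
    hits = sum(1 for inst in instance_sets if candidate <= inst)
    return hits / len(instance_sets)

def prune_candidates(candidates, instance_sets, min_frequency):
    """Keep only candidates whose frequency reaches the threshold."""
    return [c for c in candidates
            if frequency(c, instance_sets) >= min_frequency]

# Hypothetical instance sets of ship IDs.
instances = [frozenset("ABC"), frozenset("AB"), frozenset("AC"), frozenset("D")]
one_element = [frozenset(s) for s in "ABCD"]
kept = prune_candidates(one_element, instances, 0.5)
```

Here ship `D` appears in only one of four instance sets, so it is pruned at a threshold of 0.5, while `A`, `B`, and `C` are retained as one-element frequency sets.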
Pasting of frequency sets
Pasting of frequency sets means to generate a
If there are at most
It is assumed that the
If
After traversal, all retained
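A minimal sketch of the pasting step is given below. It uses the standard Apriori join, which may differ in detail from the authors' procedure: two k-element frequency sets that share k-1 ships are merged into a (k+1)-element candidate, and the candidate is kept only if all of its k-element subsets are themselves frequency sets. The function name `paste_frequency_sets` is hypothetical.

```python
from itertools import combinations

def paste_frequency_sets(frequency_sets):
    """Generate (k+1)-element candidates from k-element frequency sets."""
    freq = set(frequency_sets)
    if not freq:
        return set()
    k = len(next(iter(freq)))
    candidates = set()
    for a, b in combinations(freq, 2):
        union = a | b
        if len(union) != k + 1:
            continue  # the two sets do not share exactly k-1 ships
        # Apriori property: every k-element subset must also be frequent.
        if all(frozenset(sub) in freq for sub in combinations(union, k)):
            candidates.add(union)
    return candidates

# Hypothetical two-element frequency sets.
two_element = [frozenset("AB"), frozenset("AC"), frozenset("BC"), frozenset("BD")]
three_element = paste_frequency_sets(two_element)
```

In this example only `{A, B, C}` survives, because `{A, B, D}` and `{B, C, D}` each have a two-element subset that is not a frequency set.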
Screening of instance sets
Screening of instance sets means to delete all the instance sets that are not associated with the ships in
It is assumed that there is a
After traversal, all retained sets
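The screening step can be sketched under one plausible reading of the description above, stated here as an assumption: an instance set is retained only if it contains at least one of the current frequency sets as a subset, since an instance set that contains none of them cannot support any larger candidate. The function name `screen_instance_sets` is hypothetical.

```python
def screen_instance_sets(instance_sets, frequency_sets):
    """Drop instance sets that contain none of the current frequency sets."""
    return [inst for inst in instance_sets
            if any(f <= inst for f in frequency_sets)]

# Hypothetical data: only instance sets containing {A, B} or {B, C} survive.
instances = [frozenset("ABD"), frozenset("BC"), frozenset("AD"), frozenset("E")]
frequency_sets = [frozenset("AB"), frozenset("BC")]
retained = screen_instance_sets(instances, frequency_sets)
```

Because the number of retained instance sets shrinks with each cycle, later passes over the data become cheaper.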
Association reasoning
Association reasoning enumerates all possible association rules for the ships belonging to a k-element frequency set, that is, it puts forward the hypothesis: “The co-occurrence of several ships in the k-element frequency set implies that the other ships in the set are also spatiotemporally associated.” On the one hand, this ensures that there is spatiotemporal co-occurrence regularity among all ships involved in the inference. On the other hand, all possible conclusion sets and phenomenon sets in the k-element frequency set are enumerated, which ensures that no association rule of ship co-occurrence is omitted.
For any k-element frequency set, select
Definition:
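The enumeration described above can be sketched as follows, assuming that every non-empty proper subset of a k-element frequency set may serve as a phenomenon set, with the remaining ships forming the conclusion set. Under that assumption a k-element frequency set yields 2^k - 2 candidate inferences. The function name `enumerate_inferences` is hypothetical.

```python
from itertools import combinations

def enumerate_inferences(frequency_set):
    """List every (phenomenon set, conclusion set) split of a frequency set."""
    ships = sorted(frequency_set)
    full = frozenset(ships)
    inferences = []
    for size in range(1, len(ships)):        # non-empty, proper subsets only
        for chosen in combinations(ships, size):
            phenomenon = frozenset(chosen)
            inferences.append((phenomenon, full - phenomenon))
    return inferences

# A hypothetical three-element frequency set gives 2**3 - 2 = 6 inferences.
splits = enumerate_inferences(frozenset("ABC"))
```

For an eight-element frequency set the same enumeration produces 254 candidate inferences, so the number of inferences to check grows quickly with k.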
Association rule mining refers to mining the association rules between ship co-occurrences by computing the confidence with which the co-occurrence of a few ships in a frequency set implies the co-occurrence of the other ships. In other words, it retains every inference of the form “the co-occurrence of a few ships in the k-element frequency set implies the co-occurrence of the other ships in the set” that actually holds. An inference for which this hypothesis holds is an association rule.
Suppose that for some k-element frequency set
Let the confidence of an inference
It means the ratio of the number of instance sets containing the ship number in the phenomenon set
Define the judgment criteria of association rules: set the confidence threshold as
Then
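The confidence computation can be sketched as below. The sketch assumes the usual association rule definition, which appears consistent with the description above: the confidence of an inference is the number of instance sets containing both the phenomenon set and the conclusion set, divided by the number of instance sets containing the phenomenon set. The names `confidence`, `mine_rules`, and `min_confidence` are hypothetical.

```python
def confidence(phenomenon, conclusion, instance_sets):
    """Share of instance sets containing `phenomenon` that also contain `conclusion`."""
    n_phenomenon = sum(1 for inst in instance_sets if phenomenon <= inst)
    if n_phenomenon == 0:
        return 0.0
    both = phenomenon | conclusion
    n_both = sum(1 for inst in instance_sets if both <= inst)
    return n_both / n_phenomenon

def mine_rules(inferences, instance_sets, min_confidence):
    """Keep the inferences whose confidence reaches the threshold."""
    rules = []
    for phenomenon, conclusion in inferences:
        value = confidence(phenomenon, conclusion, instance_sets)
        if value >= min_confidence:
            rules.append((phenomenon, conclusion, value))
    return rules

# Hypothetical instance sets: B always sails with A, but not the reverse.
instances = [frozenset("AB"), frozenset("AB"), frozenset("AC"), frozenset("A")]
inferences = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("A"))]
rules = mine_rules(inferences, instances, 0.8)
```

The example shows that confidence is directional: the inference from `B` to `A` has confidence 1.0, while the inference from `A` to `B` has confidence 0.5 and is rejected at a threshold of 0.8.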
Process of implementing spatiotemporal co-occurrence mining algorithm
Spatiotemporal co-occurrence pattern mining algorithm is used to find a set of all ships with spatiotemporal co-occurrence regularity. The process of implementing the algorithm is as presented in Figure 1.

Process flow of spatiotemporal co-occurrence pattern mining algorithm.
The specific steps of implementing the algorithm are as follows:
Step 1: Initialization of parameters.
We set
A set of all ship ID numbers is denoted by
Step 2: Description of candidate sets
1) Each element in a
2) The total number of
3) The frequency of the
Step 3: Screening of instance sets
1) All
If
If
2) After traversal, all retained sets
Step 4: Processing of candidate sets
1) All
If
If
2) The frequency of each
Step 5: Pruning of candidate sets
1) All
If
If
2) After traversal, all retained
3) If
4) If
Step 6: Pasting of candidate sets
1) As for each
If
If
2) After traversal, all retained
3) We let
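The steps above can be condensed into a short loop. The following is a simplified sketch rather than the authors' implementation: it assumes that frequency is measured against the currently retained instance sets, that screening keeps the instance sets containing at least one frequency set, and that pasting follows the standard Apriori join. The names `mine_cooccurrence` and `min_frequency` are hypothetical, and the order of the steps inside the loop is a simplification.

```python
from itertools import combinations

def mine_cooccurrence(instance_sets, min_frequency):
    """Return {k: set of k-element frequency sets} for every k that has any."""
    instances = [frozenset(inst) for inst in instance_sets]
    # Initialization: every ship ID forms a one-element candidate set.
    candidates = {frozenset([ship]) for inst in instances for ship in inst}
    result = {}
    k = 1
    while candidates and instances:
        total = len(instances)
        # Pruning: keep candidates whose frequency reaches the threshold.
        frequent = {
            c for c in candidates
            if sum(1 for inst in instances if c <= inst) / total >= min_frequency
        }
        if not frequent:
            break
        result[k] = frequent
        # Screening: drop instance sets that contain no frequency set.
        instances = [inst for inst in instances
                     if any(f <= inst for f in frequent)]
        # Pasting: join k-element frequency sets into (k+1)-element candidates.
        candidates = set()
        for a, b in combinations(frequent, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in frequent for sub in combinations(union, k)
            ):
                candidates.add(union)
        k += 1
    return result

# Hypothetical instance sets of ship IDs.
demo = [{"A", "B", "C"}, {"A", "B", "C"}, {"A", "B"}, {"D"}, {"E"}]
found = mine_cooccurrence(demo, 0.4)
```

In this toy run ships `D` and `E` are pruned in the first cycle, their instance sets are screened out, and the loop ends with the single three-element frequency set `{A, B, C}`.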
Experiment and analysis
In this section, all the parameters needed in the algorithm were studied and evaluated. The procedure of spatiotemporal co-occurrence pattern mining algorithm was then implemented to obtain all
Setting of parameters
Considering the actual situation and the relevant references,30,31 the parameters were selected as given in Table 1:
Parameters of the algorithm.
The data of ships in a hotspot sea area divided by time slot and into co-occurrence unit sea areas is summed up in Figure 2:

Data of ships in a hotspot sea area.
In the above figure, the hotspot sea area was divided into 744 layers by time slot. Each layer represented the distribution of ships in the hotspot sea area within 60 min. Each layer was drawn in a different color to distinguish the ships in different time slots.
Selection of frequency threshold
Frequency threshold
The procedure was implemented to obtain the frequency of all one-element candidate sets

Frequency support of one-element candidate sets.
It was assumed that the mean value of the frequencies
It was found that the variation curve of the frequency
The calculation gave
Selection of confidence threshold
Confidence threshold

Eight-element reasoning confidence curve.
It was assumed that the mean value of all eight-member association reasoning confidences
It was found that the variation curve of
The calculation gave
Results of spatiotemporal co-occurrence pattern mining
(1) Candidate sets
The variation curve of the frequency

Frequency support curves: (a) frequency support of two-element candidate sets, (b) frequency support of three-element candidate sets, (c) frequency support of four-element candidate sets, (d) frequency support of five-element candidate sets, (e) frequency support of six-element candidate sets, (f) frequency support of seven-element candidate sets, and (g) frequency support of eight-element candidate sets.
In the figure, the red broken line represents the frequency threshold of
In the above figure, it is found that:
1) When
This is likely due to the fixed threshold used in the traditional Apriori algorithm. When two-element candidate sets are generated, the frequency support of the candidates is still very low, so the algorithm can still screen out a large proportion of candidate sets. When the third cycle starts, many instance sets are removed. The frequency support of the candidate sets then rises noticeably, but the threshold remains as low as before. For this reason, most
2) When
This may be caused by the fact that a large proportion of
Thereafter,
(2) Frequency sets
The number of

Variation of
The above figure reveals that:
1) When
This happens since candidate sets are screened by frequency at a lower rate than the generation of frequency sets from candidate sets when
2) When
When
(3) Ship spatiotemporal co-occurrence results
In the end, a frequency set contained at most eight elements. In other words, there was spatiotemporal co-occurrence regularity between at most eight ships. Two frequency sets with the highest frequency support among
Results of spatiotemporal co-occurrence pattern mining.
Taking the ships in a frequency set in Table 2 as an example, their trajectories were drafted as given in Figure 7.

Ship co-occurrence diagrams: (a) ship trajectories in two-element candidate sets, (b) ship trajectories in three-element candidate sets, (c) ship trajectories in four-element candidate sets, (d) ship trajectories in five-element candidate sets, (e) ship trajectories in six-element candidate sets, (f) ship trajectories in seven-element candidate sets, and (g) ship trajectories in eight-element candidate sets.
The above figure is analyzed to find that
1) The actual trajectories of the ships in the eight-element frequency sets calculated by the association rule mining algorithm are highly similar in shape and very close to each other. This may be attributed to these ships navigating along the preset channel for navigational safety, which results in massive overlapping of their trajectories. This indicates that there is spatiotemporal co-occurrence regularity between these ships and supports the effectiveness and correctness of the proposed algorithm. 29
2) When
3) When the
4) The trajectories “ramify” only at the top, but coincide almost perfectly at the bottom. This is because the left channel serves shipping between Xinlei Port and Haikou Bay, while the right channel runs between Haian Port and Haikou Bay. The two channels start and terminate at the same port on Hainan Island, but at different ports on the Leizhou Peninsula. The two channels coincide in the middle of the strait, which produces this pattern.
5) The ships in two sets with the highest frequency support among all
(4) Association rule analysis results
On the basis of the above experiments, we further carry out the analysis and mining of ship association rules. By studying the confidence
1) When
2) When
Take the group of
Results of association rule analysis.
Observing the above table, we can find that:
1) As the K value increases, the highest confidence among the K-element association rules increases. This is because the screening of instance sets becomes more and more stringent as K grows, so fewer and fewer instance sets meet the screening conditions. The denominator in the confidence expression of K-element association reasoning therefore becomes smaller and smaller, which increases the confidence of K-element association reasoning.
2) The larger the K value, the more association inferences have a confidence of 100%. This is because the screening of instance sets becomes more and more stringent as K increases, and when K is large enough, the instance sets that meet the screening condition become increasingly rare. In fact, when
Conclusions
In this paper, the automatic identification system data of ships in a hotspot sea area is taken as the source data to study a ship spatiotemporal co-occurrence pattern mining algorithm based on association rules. Building on a data model and judgment criteria for spatio-temporal co-occurrence regularity, the concepts of candidate set, frequency set, and instance set are introduced, together with the key procedures of the algorithm: pruning and pasting of candidate sets, screening of instance sets, definition of association reasoning, and association rule mining. The key steps of implementing the algorithm are then illustrated, and the process of the spatiotemporal co-occurrence pattern mining algorithm is designed. Finally, an experiment is presented, and the results of the algorithm are analyzed by adjusting the parameter settings. The proposed algorithm finds several ship combinations with spatiotemporal co-occurrence regularity in the hotspot sea area, as well as association rules on the co-occurrence of several ships. Its performance is evaluated on a real-world ship trajectory database with a detailed comparative analysis. After implementing the procedure for 30 days, the spatiotemporal co-occurrence patterns and association rules for 1386 ships are eventually mined from 7,071,893 pieces of AIS data. The results reveal that a spatiotemporal co-occurrence relationship exists between at most eight ships in the hotspot sea area. Moreover, the maximum confidence of association rules is higher when the number of elements in a frequency set is larger.
The proposed algorithm is an attempt to apply the Apriori algorithm to ship trajectory big data. It yields meaningful results that reveal the aggregation trends of ships in specific times and places, which is of great significance for maritime traffic management, ship path planning, and maritime supervision. In addition, by exploiting the mined instances and the spatio-temporal relationships between them, the method reduces the computation time of the algorithm and improves the efficiency of pattern mining. These results can provide a reference for subsequent research in the field of ship spatio-temporal co-occurrence mining.
However, there is still room for improvement in the ship spatio-temporal co-occurrence pattern mining algorithm based on association rules. For example, the setting of the time window may affect the performance of the algorithm, since different types of vessels have, in theory, different temporal and spatial characteristics. As a next step, a variable time window function can be considered, which adjusts the range of the time window according to the identity of the ship, so as to obtain more realistic results for ship spatio-temporal co-occurrence pattern mining.
Footnotes
Acknowledgements
Not applicable.
Handling Editor: Sharmili Pandian
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by National Science Foundation for Outstanding Young Scholars (grant number 42122025), Natural Science Foundation for Distinguished Young Scholars of Hubei Province of China (grant number 2019CFA086), and Natural Science Foundation of Hubei Province of China (grant number 2017CFB377).
Data availability
Not applicable.
