Abstract
Without ground-truth data, trajectory anomaly detection is a hard work and the result lacks of interpretability. Moreover, in most current methods, trajectories are represented by geometric features or their low-dimensional linear combination, and some hidden features and high-dimensional combined features cannot be found efficiently. Meanwhile, traditional methods still cannot get rid of the limitation of space attributes. Therefore, a novel trajectory anomaly detection algorithm is present in this article. Unsupervised learning mechanism is used to overcome nonground-truth problem and deep representation method is used to represent trajectories in a comprehensive way. First, each trajectory is partitioned into segments according to its open angles, then the shallow features at each point of a segment are extracted and. In this way, each segment is represented as a feature sequence. Second, shallow features are integrated into auto-encoder-based deep feature fusion model, and the fusion feature sequences can be extracted. Third, these fused feature sequences are grouped into different clusters using a unsupervised clustering algorithm, and then segments which quite differ from others are detected as anomalies. Finally, comprehensive experiments are conducted on both synthetic and real data sets, which demonstrate the efficiency of our work.
Introduction
With the continuous development and application of positioning equipment, such as global position system (GPS), Wi-Fi as well as wireless sensor network (WSN), a large volume of moving objects can be tracked with their trajectory data stored in the database. It has been a hot topic of analyzing and extracting knowledge from these data. Hawkins 1 gave the definition of outlier as follows: an observation deviates greatly from others and arouses suspicion that it was generated by a different mechanism. Trajectory anomaly detection2,3 is an important research branch in the field of trajectory data mining. And its task is to discover trajectories or their segments which differ substantially from or are inconsistent with the remaining set. As for trajectory anomaly, an outlier means a data object that is grossly different from or inconsistent with the remaining set of data. 4 Traditional anomaly detection algorithms mainly consist of distance-based algorithms5–7 and density-based algorithms,8,9 but unsupervised methods are rarely used. The distance-based algorithm mostly judges whether the trajectory is anomaly or not according to the space distance of the trajectory. Due to limitations in space distance comparison, the trajectory with the same characteristics but far distance may be detected as an exception, which will affect the accuracy of the trajectory anomaly detection. The density-based algorithm is essentially the extension of distance algorithm, which cannot fundamentally eliminate the limitations of distance-based trajectory anomaly detection algorithm.
In the traditional methods of trajectory anomaly detection, the trajectory is usually represented by hand extracted geometric features or low-dimensional linear combination of geometric features. These representation methods cannot find some hidden features or high-dimensional composite features. For example, Figure 1 shows the shape, angle change, distance change, and speed change of trajectory, respectively. By analyzing the internal characteristic attributes of trajectory such as angle, speed, and distance, it is found that the change degree is different. It can be seen that only measuring the linear combination of a certain geometric feature or a simple geometric feature is not enough to comprehensively reflect the movement trend of the trajectory. It is difficult to comprehensively describe the movement mode of the moving object’s trajectory. Therefore, error may exist in the result of anomaly detection because of incomplete feature extraction and incomplete representation of trajectory feature sequence.

An example of trajectory structural features: (a) shape of trajectory, (b) angle change, (c) distance change, and (d) velocity change.
In view of the shortcomings of existing research methods, this article proposes an unsupervised learning trajectory anomaly detection algorithm based on deep representation (TAD-FM). Through the transformation and processing of GPS data, the trajectory is described comprehensively with deep characteristics. Compared with algorithms based on distance and density, the deep feature fusion model based on auto-encoder is proposed in this article, which transforms the geometric feature representation of trajectory into sequence pattern, represents trajectory by feature sequence, and desalinates the spatial attribute of trajectory itself. Therefore, the limitation of using spatial distance to judge anomalies is fundamentally solved. At the same time, from the perspective of fully exploring the characteristics of the trajectory, the low-dimensional shallow features and the high-dimensional deep features of the trajectory are integrated to describe the trajectory comprehensively. So as to characterize the overall trajectory features according to the description sequence to detect the trajectory anomaly.
The main framework of this article is given in Figure 2 with parts of ideas borrowing from Chen et al. 10 GPS trajectory data can be obtained from a variety of positioning devices. The original GPS trajectory data are a three-dimensional (3D) data composed of longitude, latitude, and time. In the trajectory data division stage, the trajectory is divided into segments according to the open angle. In the stage of deep feature extraction, the trajectory segment is transformed into shallow feature sequence first, and then the shallow feature sequence is put into the deep feature fusion model based on auto-encoder to obtain the deep feature sequence representation of trajectory segment, that is, the original GPS data of trajectory segment are transformed into the fusion feature sequence. In the clustering stage, in order to obtain the trajectory segments of different motion modes, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to cluster by measuring the cosine distance between fusion feature sequences. In the phase of anomaly detection, the anomaly trajectory segments are extracted by comparing the cosine similarity between the fusion feature sequences, and then consider the proportion of the anomaly trajectory segments in the overall trajectory, finally detect the anomaly trajectory.

The framework of this article.
The main contributions of this article are as follows:
On the basis of the idea of deep embedding, a deep feature fusion model based on auto-encoder is proposed to reduce human intervention, extract high-dimensional combined features of trajectory, transform the geometric feature sequence of trajectory into the mixed feature sequence of high-dimensional non-linear representation, and describe the trajectory comprehensively.
The geometric representation of the trajectory is transformed into a sequential pattern. The limitation of using spatial distance to judge the anomaly is fundamentally solved using the way of feature sequence to represent the trajectory and desalinate the spatial attribute of the trajectory itself.
Experiments based on multiple data sets show that the algorithm proposed in this article can discover the anomaly trajectory, which proves the effectiveness and feasibility of the algorithm.
The organization structure of this article is as follows. In section “Related work,” the related work is described. In section “Trajectory representation,” the method of trajectory division is introduced. In section “Feature extraction and transformation,” the deep feature fusion model based on auto-encoder is proposed to describe how to extract and transform the features from the divided trajectory segments. In section “Unsupervised learning clustering,” the multi-category clustering method based on unsupervised learning is introduced. In section “Trajectory outlier detection,” the method of trajectory anomaly detection is described in detail. In section “Experimental analysis,” the algorithm is evaluated by experiments. In section “Some real application scenarios,” some typical application scenarios are discussed. In section “Conclusion,” the full text is summarized, and the future research direction is pointed out.
Related work
Trajectory anomaly detection
Distance-based algorithm
Knorr et al. 7 not only regarded the trajectory as a series of points but also transformed the trajectory into a feature sequence of position, direction, and speed. By comparing the distance, the trajectory anomaly was detected. However, this method only compares the overall characteristics of the trajectory while ignores the local characteristics. As the trajectory grows, the problems exposed by this method become more and more obvious. In order to solve the problem caused by comparing the whole trajectory, Lee et al. 5 proposed the density-based trajectory outlier detection (TRAOD) algorithm. The TRAOD algorithm consists of two steps: division and detection. In the division step, the trajectory is divided into several segments, which are represented by the line segments connected by two sampling points. In the detection step, the distance between the line segments is used to measure its matching, so as to detect the trajectory anomaly. This method solves the problem of complex trajectory anomaly detection, but only by describing the trajectory segment as a directed segment composed of the start and end points, the characteristics of the trajectory itself are inevitably ignored. Yuan et al. 6 put forward the trajectory structure framework and introduced it into the trajectory anomaly detection. By considering the internal characteristics of the trajectory, the accuracy of the analysis is improved, so as to avoid the problem of only comparing the spatial shape of the trajectory. However, due to the different importance of each internal feature of the trajectory, subjective weight will interfere with the results, thereby affecting the accuracy of anomaly detection. Most distance-based algorithms judge whether the trajectory is anomaly or not according to the space distance of the trajectory. Because of the limitation of space distance comparison, it is inevitable that the trajectory with the same characteristics but a long distance is detected as an exception, which affects the accuracy of the trajectory anomaly detection.
Density-based algorithm
Breunig et al. 8 used local anomaly factor (LOF) to determine whether the trajectory is anomaly, but a large number of neighbor queries are required to calculate LOF, so the cost of calculation is very high. 11 Liu et al. 9 introduced the concept of trajectory density and proposed density-based trajectory outlier detection (DBTOD) algorithm to make up for the shortcomings that TRAOD algorithm cannot detect local and dense trajectories. The density between trajectories is calculated by distance, so the density-based algorithm is essentially an extension of distance-based algorithm, which cannot fundamentally solve the limitations of the distance-based algorithm.
Embedding
Sutskever et al. 12 first proposed the method of sequence-to-sequence auto-encoder for machine translation. Chen et al. 10 used auto-encoder fusing a variety of different features to get more semantical and discriminative context representation in the latent space, so as to analyze the taxi destination. Liu et al. 13 used embedding method for real-time personalized search and similar product list recommendation. Dong et al. 14 proposed a novel Auto-encoder Regularized deep neural Network (ARNet) and a trip encoding framework trip2vec to learn drivers’ driving styles directly from GPS records, by combining supervised and unsupervised feature learning in a unified architecture. Embedding method is widely used. Through the analysis of the above methods, this article applies the embedding method to trajectory anomaly detection. Beritelli et al. 15 proposed a radial basis probabilistic neural network to classify non-linear and unstable data. The implemented method has been devised in order to slowly increase, during training, the generalization capabilities of a radial basis probabilistic neural network classifier as well as preventing it from over-generalization and the consequent lack of resulting classification performances.
Trajectory representation
The temporal distance between two consecutive points in the same trajectory is relatively small (with respect to the mobility phenomenon under consideration). In this view, the trajectory is simply a sequence of relatively close in time ordered points. 16 Generally speaking, the segmentation task splits a sequence of data points, in a series of disjoint subsequences consisting of points that are homogeneous with respect to some criteria.17,18 Therefore, this article uses the divided trajectory segment as the basic unit of analysis.
Maike et al. 19 defined the segmentation problem as follows: to split a high-sampling rate trajectory into a minimum number of segments such that the movement characteristic inside each segment is uniform in some sense. The purpose of dividing a trajectory into segments is that trajectory segments contain part of the original trajectory data and hidden information, so they can be more flexible to combine and describe trajectory. 20 The key of trajectory division is to find some key points that can divide the trajectory into multiple smooth trajectory segments. Through the key points to divide the trajectory, it can effectively divide the trajectory segments that retain trajectory characteristics and are relatively smooth. If the trajectory is not divided and the whole trajectory is taken as the input data to extract features, some key information will be ignored. Therefore, compared with not dividing the trajectory, the trajectory division can grasp the internal characteristics of the trajectory. At the same time, keeping the internal characteristics of the trajectory without losing is conducive to anomaly detection. The definition of trajectory and open angle is given here.
Definition 1
Trajectory:
21
Given a trajectory set
Definition 2
Open angle:
22
Three points {

The open angle of trajectory.
Using the two-pass corner detection algorithm, 21 all the points are scanned and the open angle are calculated according to Definition 2, so as to determine the key points of the trajectory finally. The method can accurately divide the trajectory into each smooth trajectory segment with consistent overall movement trend according to the key points.
In order to describe the process of track division more clearly, a simple example is used to illustrate the trajectory division. Figure 4 shows the schematic diagram of trajectory segment division. According to Definition 2,

Trajectory segmentation.
Feature extraction and transformation
Deep neural networks have been proved powerful in learning from speech data,23,24 text data, 25 and image data. 26 Trajectory data are similar to language and text data, and these are also a kind of time series data. However, according to Definition 1, trajectory data are 3D data composed of two-dimensional (2D) coordinates and timestamps. Empirical studies showed that simply treating raw trajectory data as 3D signal inputs for either traditional machine learning or deep learning algorithms just does not work. 27 Therefore, in order to make the trajectory data suitable for the deep neural network algorithm, it is necessary to extract the trajectory features with stronger semantic expression from the original trajectory data. The original trajectory data are represented by feature sequence.
As shown in Figure 5, the core of feature extraction and transformation is divided into two steps. First, the geometric features are extracted from several indefinite length original trajectory segments and transformed into fixed length shallow feature sequence

Feature extraction and conversion.
Shallow feature sequence extraction
The trajectory segment
Definition 3
Shallow feature sequence: A frame encoding statistics of the basic features is called the shallow feature of the trajectory. The set
Dimension expansion requires calculating mean, minimum, maximum, 25%, 50%, and 75% quartiles, and standard deviation of each trajectory segment feature.
1
The calculated results are the shallow feature sequences of each trajectory segment. According to Definition 3, this article analyzes the velocity variation
Deep feature fusion
A deep feature fusion model based on auto-encoder is proposed. The shallow feature sequence is transformed into deep feature sequence using the model, and the shallow feature sequence and deep feature sequence are spliced into mixed feature sequence, so that the features of trajectory segments can be represented comprehensively from deep and shallow aspects.
Auto-encoder
By analyzing the trajectory segments represented by the intuitionistic feature sequences, we can find the anomalies. However, the shallow feature sequence is only the feature representation of low-dimensional trajectory segments, which will limit the performance of machine learning algorithm. The typical problem is that the potential feature information and the association between different types of combined features cannot be found. 22 Therefore, this article uses auto-encoder10,28 to solve this problem. Auto-encoder belongs to the embedding method, which is an unsupervised artificial neural network. After many training, it can capture the most significant features in the input data.
Figure 6 shows a three-layer auto-encoder model.

The process of auto-encoder.
The value
The purpose of using auto-encoder is to extract the significant high semantic features of the original data from the hidden layer; at the same time, it has the function of reducing dimension and noise. In order to make the learned hidden layer meaningful, the data output by the encoder needs to enter the decoder again for decoding. Through repeated iterations, the input and output parameters are updated to make the value of
The updating methods of weight matrix
In this article, in order to make the results more practical, only the vectors in the hidden layer
Mixed feature sequence transformation
In this section, the deep feature fusion model based on auto-encoder is introduced in detail. In order to express conveniently, the definition of deep feature sequence and mixed feature sequence is introduced first.
Definition 4
Deep trajectory feature sequence: The deep trajectory feature sequence
Definition 5
Mixed feature sequence: The feature sequence
The shallow feature sequence and the deep feature sequence can express the shallow and deep features, respectively. Using the deep feature fusion model based on auto-encoder, the shallow feature sequence and the deep feature sequence are transformed into the mixed feature sequence, which not only considers the detail information of the shallow feature but also combines the semantic information of the deep feature. So, the shallow and deep features can be highlighted at the same time.
Figure 7 shows the deep feature fusion model based on auto-encoder. By inputting the shallow feature sequence into auto-encoder 1, we can get the deep feature sequence shown in Definition 4. Using the deep feature sequence

Feature integration.
Obviously, in this article, the dimension of input shallow feature sequence
Unsupervised learning clustering
The purpose of unsupervised learning–based multi-category clustering is to divide different types of trajectory segments into different clusters. The trajectory segments that need to be detected should exist in the same neighborhood, that is, the same cluster with similar motion mode, rather than all the trajectory segments in the set. This is conducive to reducing the complexity of the algorithm and also has practical significance.
However, traditional clustering methods are often based on hand-crafted features. These features have limited capacity and cannot dynamically adjust to capture the prior distribution. 29 Therefore, this article uses the features processed by the deep feature fusion model based on auto-encoder to cluster and breaks through the limitation of manual feature extraction for the overall representation of trajectory.
Definition 6
Cosine similarity: The method to evaluate the similarity of two mixed feature sequences by calculating the cosine value of the angle between them is called cosine similarity. Equation (9) shows the cosine similarity of the mixed feature sequence
The purpose of using cosine similarity to measure distance is that most of the current trajectory clustering algorithms rely on the spatiotemporal similarity measurement of trajectory. They cannot cluster trajectories with similar characteristics that exist in different spatiotemporal regions. In this article, the mixed feature sequence is used to represent the trajectory segments, and all the trajectory segments are described as space vectors of common starting point. The cosine similarity is used to measure the distance between them. The vectors pointing to a certain range are grouped into one category, that is, the trajectory segments with similar features are grouped into one cluster, which breaks through the limitation of space-time.
In this article, the density-based clustering algorithm DBSCAN 30 is used to cluster, and the cosine similarity shown in Definition 6 is used to measure the distance. DBSCAN can find clusters of any shape and can find irregular clusters without knowing the number of clusters in advance. The unsupervised trajectory anomaly detection algorithm proposed in this article has no clear attribute label for the attribute type of trajectory, and the subjective specified clustering number will affect the classification of trajectory. Therefore, the DBSCAN algorithm has a high applicability.
Definition 7
Definition 8
Core sequence: For any mixed feature sequence
Definition 9
Directly density-reachable: If
Definition 10
Density-reachable: If there is a series of mixed feature sequences
Definition 11
Cluster: If a non-empty set
Connectivity: For
Maximality: for
Definition 12
Noise: In the cluster set
For the problem of noise treatment, one view holds that the noise is composed of data points that are not suitable for clustering structure, so it can be considered as divergence. 31 A simple processing solution is introducing a tolerance on the number of noise points inside a segment to smooth out the noise. 32 But another point of view is that noise contains important semantic information and cannot be ignored in dynamic phenomena. In some areas, specialists use the term “excursion” to characterize the temporary absence from the home-range where the animals reside. 33
In this article, GPS coordinates smoothed by Kalman filtering is taken into consideration. 19 At the same time, through feature extraction and transformation, a single trajectory point can be mapped into multi-dimensional features, which can dilute and smooth the possible noise caused by sampling equipment to the maximum extent. Therefore, noise is considered as a part of real data and classified as anomaly.
As shown in Figure 8, the vectors

Illustration of clusters.
The specific algorithm pseudo code of clustering based on unsupervised learning is shown in Algorithm 1. In lines 04–10, the
The time complexity of the algorithm is
Trajectory outlier detection
In this section, the specific method of trajectory anomaly detection will be described in detail. In order to express clearly, some necessary definitions are given here.
Definition 13
Cosine similarity matrix: Calculate the cosine similarity of each element in the mixed feature sequence set
The calculation method of each element in cosine similarity matrix
Definition 14
Similarity index: The sum of row
Definition 15
Anomaly trajectory segment: Given the threshold value
Definition 16
Anomaly trajectory: Given the threshold value
After clustering, the cluster set
Taking Figure 9 as an example. It contains two clusters and multiple trajectory segments. The red lines represent the anomaly trajectory segments, and the black or green ones are the normal trajectory segments. Among them, the noise is the trajectory segment that does not belong to any clusters. According to the previous statement, it is classified as an anomaly first.

Example of anomaly detection.
The specific algorithm pseudo code of trajectory anomaly detection is shown in Algorithm 2. It is easy to see that the whole algorithm can be divided into two parts: lines 02–10 calculate the cosine similarity matrix and similarity index set of each cluster in the cluster set. If the similarity index of a trajectory segment is less than the similarity index threshold
The analysis of the algorithm shows that the time complexity of the algorithm is
Experimental analysis
In this section, abundant experiments are conducted to verify the effectiveness and feasibility of the proposed algorithms. Two data sets, synthetic trajectories 34 and North Atlantic hurricanes, 35 are used in experiments. The operation environment required for the experiment is shown in Table 1.
Experimental environment.
GB: Gigabyte; TB: Terabyte
Data set
Synthetic trajectories
Synthetic trajectory data set is a labeled data set proposed by Piciarelli et al. 36 The data set includes 1000 subsets, each of which is composed of 260 randomly generated trajectories with the length of 16. The trajectories are divided into five clusters. Of the 260 trajectories, 250 are normal trajectories and the remaining 10 are anomaly trajectories. As shown in Figure 10(a), it is a subset of the data set, in which light (blue) represents normal trajectory and dark (red) represents anomaly trajectory.

Graphical representation of data sets: (a) synthetic trajectories data and (b) North Atlantic hurricane data.
North Atlantic hurricanes
North Atlantic hurricanes data set records the hurricane data of the North Atlantic from 1950 to 2006, with a total of 861 trajectories composed of 23,043 points. The information in the data set includes the longitude, latitude, wind speed, pressure, level, and sampling time of the hurricane center. The graphical representation of the data set is shown in Figure 10(b).
Algorithm parameters analysis
The parameters involved in trajectory anomaly detection algorithm include open angle threshold
Open angle threshold ω
The open angle threshold

Effects of
Learning rate
The setting of

Loss with different learning rates.
Nearest neighbor threshold ε and the minimum number of MinPts
The setting of nearest neighbor threshold

Effects of
Similarity index threshold ρ
In the process of anomaly detection, the similarity index obeys normal distribution. At the same time, it can avoid the deviation of detection result caused by the inaccurate threshold value

Distribution of
In order to conform the actual situation, the threshold value
Experimental results analysis
Synthetic trajectories
In this article, labeled artificial trajectory data set is used first to verify the unsupervised trajectory anomaly detection algorithm based on deep representation; then, we compare the detected anomaly with the label in the data set. The validity and feasibility of the algorithm are proved by high accuracy.
Figure 15 shows the loss function, with orange (light) representing the mixed feature sequence

Loss function on synthetic trajectory data.
With iterating experiments for many times, we can get the parameters range or setting strategy. the minimum number threshold value

The selection of
In order to verify the effectiveness of our algorithm and facilitate the observation of clustering effect, it is necessary to visualize the data after clustering. Similar to Figure 13, we also use t-SNE algorithm
37
to compress multi-dimensional data into two-dimensional data to visualize cluster results. t-SNE is especially suitable for the visualization of high-dimensional data. Figure 17 shows the effect of clustering. Figure 17(a) shows the clustering effect of

Clustering results on both models: (a)
There are many parameters in TAD-FM. The setting of different parameters may affect precision and recall of anomaly detection. In this article, many groups of parameters are selected to carry out multiple experiments, and the experimental results are averaged to analyze the causes of different results caused by parameter setting. The evaluation criterion used in the experiments are precision and recall. Precision refers to the proportion of positive cases in the samples identified as positive cases. Recall represents the proportion of positive cases in the sample to be correctly predicted. The two criterion are given as equations (13) and (14)
where
Table 2 shows the precision rate and recall rate of anomaly detection obtained by setting different parameters for experiments. It can be seen that different parameters will have a greater impact on the results. Too excessive
The influence of parameters on the precision of trajectory anomaly detection.
In this article, K nearest neighbors (KNN), convolutional neural networks (CNN), support vector machines (SVM), trajectory anomaly detection with shallow feature sequence (TAD-F), trajectory anomaly detection with deep feature sequence (TAD-FD), and trajectory anomaly detection with mixed feature sequence (TAD-FM) are tested in several subsets of the data set. After many experiments, the precision of KNN is 83.96%, CNN is 87.50%, SVM is 91.29%, TAD-F is 85.46%, TAD-FD is 89.31%, and TAD-FM is 92.67%. TAD-F represents the algorithm using shallow feature sequence

Precision comparison of different algorithms.
North Atlantic hurricanes
The North Atlantic hurricanes is a data set without label. Due to the lack of ground truth, the evaluation indicators in the field of classical machine learning cannot be applied to the evaluation of anomaly detection results. Therefore, the purpose of using the North Atlantic hurricanes data set is to verify that TAD-FM can detect anomaly trajectories in the real unlabeled situation.
Figure 19 shows the loss function of experiment, the orange (light) curve represents the mixed feature sequence

North Atlantic hurricane trajectories data loss function.
Figure 20 shows the clustering effect (

North Atlantic hurricane trajectories data clustering.
In order to verify the effectiveness of DBSCAN in this algorithm, a variety of clustering algorithms are used to compare with DBSCAN. Figure 21 shows the silhouette coefficient of K-means, Gaussian mixture model (GMM), and birch clustering algorithms with the number of clusters between 3 and 9. The performance of clustering is evaluated by silhouette coefficient. It can be seen from the figure that when the number of clusters is 6, the contour coefficient of GMM is 0.635, which is higher than other algorithms. The silhouette coefficient of DBSCAN is 0.633, which is similar to GMM. At the same time, DBSCAN does not need to specify the number of clusters, while the other three algorithms need to determine the number of clusters in advance. Considering the practical data, DBSCAN is more practical.

Comparison of silhouette coefficient.
Table 3 shows the influence of parameters on the number of anomaly trajectories. It can be seen from the table that the setting of different parameters has a significant impact on the results of anomaly detection. It is necessary to select appropriate parameters in the experiment. Too low
The influence of parameters on the number of anomaly trajectories.
In this article, TRAOD algorithm, TAD-FM algorithm, TAD-FD algorithm, and TAD-F algorithm are used for comparison. Due to the difference of anomaly detection methods, parameter settings, and calculation volume, they are not comparable in performance. Therefore, this article compares the experimental results. Figure 22(a) shows the experimental effect of TAD-FM algorithm. Figure 22(b) shows the experimental effect of TRAOD algorithm (

View of trajectory anomaly detection results: (a) TAD-FM, (b) TRAOD, (c) TAD-F, and (d) TAD-FD.
Some real application scenarios
Trajectory anomaly detection can be widely used in numerous spatial-temporal application scenarios such as climate monitor, animal habit analysis, and traffic detection. In this section, we introduce some typical application scenarios to intuitively describe the meaning of studying trajectory anomaly detection and point out future application directions.
Abnormal climate detection is one of the typical application scenarios. There are mainly two ways to find abnormal climate. One is finding outlier trajectories of various storms, typhoon as well as other meteorological data. 5 The other is finding the trends or periods of extreme climate using spatial and temporal weighted regression. 39 Lee et al. 5 developed a trajectory outlier detection algorithm TRAOD. Their algorithm consists of two phases: partitioning and detection. The first phase is to partition a trajectory into a set of line segments, which ensures both high quality and high efficiency. In the second phase, they present a hybrid of the distance-based and density-based approaches. While Fotheringham et al. 39 present a novel model called geographical and temporal weighted regression (GTWR) to account for local effects in both space and time dimensions. In their work, they first detailed both the model formulation and the estimation of GTWR focusing on spatiotemporal kernel function definition and spatiotemporal band width optimization. Then, the performance of GTWR was examined by comparison with basic geographical weighted regression models.
Animal habit analysis is a very important application of trajectory analysis. Li et al. 40 developed an efficient prototype system called MoveMine to discover animal movement patterns from animal movement data. Animal abnormal habits or moving tendencies may be events or observations that do not conform to an expected pattern. These finds can greatly attract the attention from biologists or even botanists.
Urban traffic detection is very important for city life. Both spatial and temporal outliers can be detected from traffic data. Through analyzing trajectories in urban area or with constraint of road network, different traffic flow can be identified. Urban traffic detection is very helpful for smart city plan and smart transportation management. Li et al. 41 proposed a temporal outlier discovery framework for detecting temporal outliers from vehicle traffic data. In their work, agglomerated temporal information of the entire data set is utilized to detect outliers. At each time stamp, similarity of each road segment is checked with other road segments, and historical similarities of each road segment are recorded in a temporal neighborhood vector. Finally, outliers are detected from drastic changes.
Urban safety management has been considered as a new important application of trajectory moving mining. The movement of crowds in a city is strategically important for risk assessment and public safety management. 2 Therefore, through analyzing the status of traffic flow and communication base stations, the direction and density of moving crowds can be extracted, and quick response to the urban safety management can be provided to the city administrator. In the work of Hoang et al., 42 a practical solution for citywide traffic prediction was developed. First, they partition the map of a city into regions using both road network and human historical mobility records. In order to model multiple complex factors that affect crowd flows, they decompose flows into three components: seasonal patterns, trend, and residual flows. Their method is scalable and outperforms compared with some baselines.
Conclusion
Some current trajectory anomaly detection algorithms cannot fully take the characteristics of trajectories into account, and they are also limited by the space attribute of trajectories. In order to solve these problems, we propose an unsupervised trajectory anomaly detection algorithm based on deep representation (TAD-FM) in this article. Attributes of trajectories are deeply studied and their shallow features are extracted. Meanwhile, a deep feature fusion model based on the auto-encoder is present to obtain the mixed features of trajectories expressed by fusing both shallow and deep features. This model can comprehensively consider features of trajectories themselves. The feature sequence is used to represent trajectories and their anomalies, through which, the limitation of spatial attribute can be reduced. The experimental results show that TAD-FM algorithm has higher accuracy and can find the inconspicuous anomaly trajectories and their segments comprehensively.
As discussed above, trajectory anomaly detection is quite important for many fields. Most trajectory data are label free, and traditional classification algorithms cannot work on such data directly. Therefore, how to find anomaly using state-of-art machine learning algorithms is a challenge. Moreover, without ground-truth data, how to assess the results is another challenge.
Footnotes
Acknowledgements
The author(s) would like to thank the Research Center of Innovation on Intelligent Prevention of Disaster and Emergency Rescue (CUMT) for providing experiment facility.
Handling Editor: Yanjiao Chen
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the National Natural Science Foundation of China (Grant No. 71774159) and Fundamental Research Funds for the Central Universities of China (Grant No. 2015QNA38).
