Abstract
In the era of the digital economy, extracting useful knowledge from data streams has attracted significant attention because of its wide range of applications. However, the rapid and unbounded nature of data streams makes it difficult to mine high utility sequential patterns efficiently: the task is subject to strong spatio-temporal constraints, and the search space of sequence data grows combinatorially. To address these challenges and to accommodate a variety of application scenarios, this paper designs an efficient algorithm for high utility sequential pattern mining over data streams based on the sliding window model (HUSP_DS). The algorithm uses a projection mechanism within a sliding window to recursively search for all interesting patterns. It also introduces a novel structure, the dynamic utility index table, which stores information such as the utility and index positions of data stream sequences. This structure proves highly effective for both the recursive search and utility updates. Comprehensive experiments on both real-world and synthetic datasets show that HUSP_DS outperforms state-of-the-art algorithms, particularly in terms of time and space efficiency. Furthermore, the algorithm is suitable for mining sliding windows of arbitrary size and scales stably.
