Abstract
K-nearest neighbor (KNN) queries algorithm has been widely used in many fields. However, as data volume increases, query efficiency descends sharply. Inspired by Persistent Memory, we propose a new method of knowledge base and cache mechanisms to evaluate exact KNN queries. Initially, this method caches all the dataset tuples. Next in importance, this method creates a knowledge base and uses a learning-based method to get the first tuple. With the evaluating of stream KNN queries, the knowledge base gets sufficient information. In the knowledge base, every tuple is thought as a region. The regions of the knowledge base will be clustered into more significant regions. When a query is submitted, our method tries to obtain all of the results from the clustered regions. From this strategy, we can minimize the response time by getting candidates set quickly from the clustered regions and avoiding partially or wholly access to the underlying systems say database or files. Numerous experiments have been conducted using datasets of varying dimensions, including low-dimensional datasets (2, 3, and 4) and high-dimensional datasets (25, 50, and 104). The outcomes of these experiments demonstrate a notable superiority of our proposed method over analogous approaches presented in antecedent literature, particularly concerning the evaluation of a sequence of KNN queries. Our method is not only database friendly but also can be applied to many online systems that need fast and exact KNN retrieval.
Get full access to this article
View all access options for this article.
