Abstract
The exponentially growing volume of digital images poses a major challenge for content-based image retrieval (CBIR) systems because retrieval is computationally intensive. To address this challenge, this paper presents an approach that achieves both effectiveness and scalability for a CBIR system on a large-scale dataset. We propose a cache mechanism that spares the system repeated distance computations during a retrieval task. A MapReduce technique exploits the cached data in parallel, which improves the performance of the CBIR system and ensures its scalability. In addition, a collaborative caching service increases data availability and thereby reduces the network traffic caused by fetching data remotely in a distributed environment. Finally, because the dataset is clustered before a search, the system responds to a user query efficiently: only a portion of the dataset is searched at a time. In experiments, our approach achieves significant gains in response time over other methods while maintaining an acceptable accuracy ratio, which makes it applicable in practical environments.
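As a rough illustration of two of the ideas summarized above (caching distance computations and searching only one cluster of the dataset), the following single-process Python sketch may help. It is not the paper's implementation: the toy feature vectors, the cluster layout, and the names `FEATURES`, `CLUSTERS`, `cached_distance`, and `search` are all hypothetical, an in-memory `functools.lru_cache` stands in for the cache mechanism, and the MapReduce and collaborative caching components are not modeled.

```python
import math
from functools import lru_cache

# Hypothetical toy data: one feature vector per image id.
FEATURES = {
    "img_a": (0.0, 0.0),
    "img_b": (3.0, 4.0),
    "img_c": (10.0, 10.0),
    "img_d": (11.0, 10.0),
}

# Hypothetical clusters, assumed to be computed before any search.
CLUSTERS = {
    0: {"centroid": (1.5, 2.0), "members": ["img_a", "img_b"]},
    1: {"centroid": (10.5, 10.0), "members": ["img_c", "img_d"]},
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


@lru_cache(maxsize=None)
def cached_distance(query, image_id):
    """Distance from a query vector to a stored image, memoized per pair."""
    return euclidean(query, FEATURES[image_id])


def search(query, k=1):
    """Return the k closest images within the cluster nearest to the query."""
    query = tuple(query)  # tuples are hashable, so they can be cache keys
    nearest = min(
        CLUSTERS.values(),
        key=lambda cluster: euclidean(query, cluster["centroid"]),
    )
    # Only members of the selected cluster are compared against the query.
    ranked = sorted(
        nearest["members"],
        key=lambda image_id: cached_distance(query, image_id),
    )
    return ranked[:k]
```

Repeating the same query reuses the memoized distances instead of recomputing them, and images outside the nearest cluster are never compared, which is the kind of saving the abstract attributes to caching and clustering.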
