A Rough Set Model of Information Retrieval

Abstract

This paper defines and investigates document retrieval strategies based on rough sets. A subject classification index is modelled by a family of set-theoretic information systems. The family induces a power set hierarchy built from the set of keywords. The paper defines the task of information retrieval in response to a query that refers to several levels of that hierarchy, then proposes and discusses retrieval strategies for it. Roughly speaking, the strategies are based on various types of similarity between the set of keywords in a query and the set of keywords in a document. A document need not match the query exactly; approximate matching is allowed. For each strategy, measures of the relevance of documents to a query are proposed.
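To make the idea of approximate matching between keyword sets concrete, the following is a minimal sketch using the standard rough-set lower and upper approximations. It is not the paper's model: the partition of keywords into indiscernibility classes, the function names, and both overlap measures are assumptions chosen only for illustration.

```python
from typing import Iterable, Set, Tuple


def approximations(target: Set[str], classes: Iterable[Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return the (lower, upper) rough approximations of `target`.

    `classes` is an assumed partition of the keyword set into blocks of
    keywords treated as indiscernible (e.g. near-synonyms).
    """
    lower: Set[str] = set()
    upper: Set[str] = set()
    for block in classes:
        if block <= target:      # block lies entirely inside the target
            lower |= block
        if block & target:       # block shares at least one keyword with it
            upper |= block
    return lower, upper


def exact_overlap(query: Set[str], document: Set[str]) -> float:
    """Fraction of query keywords that literally occur in the document."""
    if not query:
        return 0.0
    return len(query & document) / len(query)


def rough_overlap(query: Set[str], document: Set[str], classes: Iterable[Set[str]]) -> float:
    """Hypothetical relevance measure: overlap of the upper approximations."""
    classes = list(classes)
    _, q_upper = approximations(query, classes)
    _, d_upper = approximations(document, classes)
    if not q_upper:
        return 0.0
    return len(q_upper & d_upper) / len(q_upper)


# Illustrative data only; these keywords and classes are made up.
classes = [{"rough set", "approximation"}, {"retrieval", "search"}, {"index"}]
query = {"rough set", "search"}
document = {"approximation", "retrieval", "index"}

print(exact_overlap(query, document))           # no keyword is shared literally
print(rough_overlap(query, document, classes))  # the sets match up to indiscernibility
```

In this toy setting the document shares no keyword with the query, yet every query keyword has an indiscernible counterpart in the document, so a measure built on upper approximations ranks the document as relevant where an exact match would reject it.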
