Data handling and text compression

Abstract

Data compression has a role in text storage and data handling, but not at the level of compressing data files. The reason is that decompressing such files adds a delay to the retrieval process, and users may see this delay as a drawback of the system concerned. Compression techniques can, however, be applied with benefit to index files.

A more relevant data handling problem is that posed by the need, in most systems, to store two versions of imported text. The first is the "native" version, as it might have come from a word processor or text editor. The second is the ASCII version, which is what is actually imported. Inverted file indexes form yet another version. The problem arises out of the need for dynamic indexing and re-indexing of revisable documents in very large database applications such as are found in Office Automation systems.

Four mainstream text-management packages are used to show how this problem is handled, and how generic document architectures such as ODA/ODIF and SGML might help.
