This paper is a personal take on the history of evaluation experiments in information retrieval. It describes some of the early experiments that were formative in our understanding, and goes on to discuss the current dominance of TREC (the Text REtrieval Conference) and to assess its impact.
1. K. Spärck Jones (ed.), Information Retrieval Experiment (Butterworths, London, 1981).
2. E.M. Voorhees and D.K. Harman (eds), TREC: Experiments and Evaluation in Information Retrieval (MIT Press, Cambridge, MA, 2005).
3. The Royal Society Scientific Information Conference, 21 June-2 July 1948: Report and Papers Submitted (Royal Society, London, 1948).
4. Proceedings of the International Conference on Scientific Information, Washington, DC (National Academy of Sciences, Washington, DC, 1958).
5. C.W. Cleverdon, Report on the Testing and Analysis of an Investigation into the Comparative Efficiency of Indexing Systems (College of Aeronautics, Cranfield, 1962). [Aslib Cranfield Research Project]
6. C.W. Cleverdon, J. Mills and E.M. Keen, Factors Determining the Performance of Indexing Systems (2 vols) (College of Aeronautics, Cranfield, 1966). [Aslib Cranfield Research Project]
7. D.R. Swanson, Some unexplained aspects of the Cranfield tests of index language performance, Library Quarterly 41(3) (1971) 223-8.
8. D.C. Blair, Some thoughts on the reported results of TREC, Information Processing and Management 38(3) (2002) 445-51.
9. Letters to the editor, Information Processing and Management 39(1) (2003) 153-9.
10. G. Salton (ed.), The SMART Retrieval System: Experiments in Automatic Document Processing (Prentice-Hall, Englewood Cliffs, NJ, 1971).
11. J.J. Rocchio, Relevance feedback in information retrieval. In: G. Salton (ed.), The SMART Retrieval System: Experiments in Automatic Document Processing (Prentice-Hall, Englewood Cliffs, NJ, 1971) 313-23.
12. F.W. Lancaster, MEDLARS: report on the evaluation of its operating efficiency, American Documentation 20(2) (1969) 119-48.
13. K. Spärck Jones, Automatic Keyword Classification for Information Retrieval (Butterworths, London, 1971).
14. K. Spärck Jones, A statistical interpretation of term specificity and its application in retrieval, Journal of Documentation 28(1) (1972) 11-21.
15. S.E. Robertson, Understanding Inverse Document Frequency: on theoretical arguments for IDF, Journal of Documentation 60(5) (2004) 503-20. Available at: http://www.soi.city.ac.uk/~ser/idf.html (accessed 17 August 2007).
16. K. Spärck Jones and C.J. van Rijsbergen, Report on the Need for and Provision of an 'Ideal' Information Retrieval Test Collection (Computing Laboratory, University of Cambridge, Cambridge, 1975).
17. K. Spärck Jones and R.G. Bates, Report on a Design Study for the 'Ideal' Information Retrieval Test Collection (Computing Laboratory, University of Cambridge, Cambridge, 1977).
18. E.M. Keen, Evaluation parameters. In: G. Salton (ed.), The SMART Retrieval System: Experiments in Automatic Document Processing (Prentice-Hall, Englewood Cliffs, NJ, 1971) 74-111.
19. E.M. Keen, The Aberystwyth index languages test, Journal of Documentation 29(1) (1973) 1-35.
20. E.M. Keen, Evaluation of Printed Subject Indexes by Laboratory Investigation: Final Report to the British Library Research and Development Department, BLR&D Report 5454 (BLR&DD, London, 1978).
21. R.N. Oddy, Information retrieval through man-machine dialogue, Journal of Documentation 33(1) (1977) 1-14.
22. N.J. Belkin, Anomalous states of knowledge as a basis for information retrieval, Canadian Journal of Information Science 5(May) (1980) 133-43.
23. N.J. Belkin, R.N. Oddy and H.M. Brooks, ASK for information retrieval. Part I: background and theory; Part II: results of a design study, Journal of Documentation 38(2) (1982) 61-71 and 38(3) 145-64.
24. N. Mitev, G. Venner and S. Walker, Designing an Online Public Access Catalogue: Okapi, a Catalogue on a Local Area Network (The British Library, London, 1985). [Library and Information Research Report no. 39]
25. S. Walker, The Okapi online catalogue research projects. In: C. Hildreth (ed.), The Online Catalogue: Developments and Directions (Library Association, London, 1989) 84-106.
26. S.E. Robertson, Overview of the Okapi projects, Journal of Documentation 53(1) (1997) 3-7. [Introduction to special issue of Journal of Documentation]
27. M. Hancock-Beaulieu, Experiments on interfaces to support query expansion, Journal of Documentation 53(1) (1997) 8-19.
28. W.B. Croft (ed.), Advances in Information Retrieval: Recent Research from the Center for Intelligent Information Retrieval (Kluwer, Boston, MA, 2000).
29. S.E. Robertson and M. Hancock-Beaulieu, On the evaluation of IR systems, Information Processing and Management 28(4) (1992) 457-66.
30. NIST (National Institute of Standards and Technology), Text REtrieval Conference. Available at: http://trec.nist.gov/ (accessed 13 August 2007).
31. S.E. Robertson and K. Spärck Jones, Relevance weighting of search terms, Journal of the American Society for Information Science 27 (1976) 129-46. Available at: http://www.soi.city.ac.uk/~ser/papers/RSJ76.pdf (accessed 17 August 2007).
32. H.R. Turtle and W.B. Croft, Evaluation of an inference-network based retrieval model, ACM Transactions on Information Systems 9(3) (1991) 187-222.
33. K. Spärck Jones, S. Walker and S.E. Robertson, A probabilistic model of information retrieval: development and comparative experiments, Information Processing and Management 36 (2000) 779-808 (Part 1) and 809-40 (Part 2). Available at: http://www.soi.city.ac.uk/~ser/blockbuster.html (accessed 17 August 2007).
34. W.B. Croft and J. Lafferty (eds), Language Modelling for Information Retrieval (Kluwer, Dordrecht/Boston/London, 2003).
35. M.E. Maron and J.L. Kuhns, On relevance, probabilistic indexing and information retrieval, Journal of the ACM 7(3) (1960) 216-44.