Abstract

The exhaustivity of document descriptions and the specificity of index terms are usually regarded as independent. It is suggested that specificity should be interpreted statistically, as a function of term use rather than of term meaning. The effects on retrieval of variations in term specificity are examined, experiments with three test collections showing in particular that frequently-occurring terms are required for good overall performance. It is argued that terms should be weighted according to collection frequency, so that matches on less frequent, more specific, terms are of greater value than matches on frequent terms. Results for the test collections show that considerable improvements in performance are obtained with this very simple procedure.
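The weighting idea in the abstract can be illustrated with a minimal sketch. The formula used here, log2(N / n) + 1 for a term occurring in n of N documents, is an assumed illustrative form and is not quoted from the abstract; the function names and the toy collection are hypothetical. The "+ 1" is there so that frequent terms still contribute to a match, only less than rare ones.

```python
import math
from collections import Counter


def collection_frequency_weights(docs):
    """Assumed weighting: log2(N / n) + 1, with N documents in the
    collection and n documents containing the term."""
    n_docs = len(docs)
    # Count each term once per document (document frequency, not raw count).
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log2(n_docs / n) + 1 for term, n in doc_freq.items()}


def score(query, doc, weights):
    """Sum the weights of the query terms that also occur in the document."""
    return sum(weights.get(term, 0.0) for term in set(query) & set(doc))


# Hypothetical toy collection: each document is a list of index terms.
docs = [
    ["retrieval", "index", "term"],
    ["retrieval", "weight", "term"],
    ["retrieval", "specificity"],
]
weights = collection_frequency_weights(docs)

# A match on the rare term "specificity" counts for more than a match
# on "retrieval", which occurs in every document.
ranked = sorted(
    range(len(docs)),
    key=lambda i: score(["retrieval", "specificity"], docs[i], weights),
    reverse=True,
)
```

With unweighted matching, all three documents would tie on "retrieval" alone; under this sketch the document that also matches the less frequent term is ranked first.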

Keywords

Term (time); Meaning (existential); Information retrieval; Computer science; Interpretation (philosophy); Function (biology); Value (mathematics); Test (biology); Simple (philosophy); Statistical hypothesis testing; Natural language processing; Statistics; Mathematics; Machine learning; Epistemology; Philosophy

Publication Info

Year: 1972
Type: article
Volume: 28
Issue: 1
Pages: 11-21
Citations: 4285
Access: Closed

Cite This

Karen Spärck Jones (1972). A STATISTICAL INTERPRETATION OF TERM SPECIFICITY AND ITS APPLICATION IN RETRIEVAL. Journal of Documentation, 28(1), 11-21. https://doi.org/10.1108/eb026526

Identifiers

DOI: 10.1108/eb026526