Abstract

Most probabilistic retrieval models incorporate information about the occurrence of index terms in relevant and non‐relevant documents. In this paper we consider the situation where no relevance information is available, that is, at the start of the search. Based on a probabilistic model, strategies are proposed for the initial search and an intermediate search. Retrieval experiments with the Cranfield collection of 1,400 documents show that this initial search strategy is better than conventional search strategies both in terms of retrieval effectiveness and in terms of the number of queries that retrieve relevant documents. The intermediate search is shown to be a useful substitute for a relevance feedback search. Experiments with queries that do not retrieve relevant documents at high rank positions indicate that a cluster search would be an effective alternative strategy.
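
As a rough illustration of what an initial search without relevance information can look like, the sketch below ranks documents by summing a per-term weight over matching query terms. The abstract does not state the weighting. The form used here, a constant plus a smoothed log((N - n) / n) collection-frequency term, is an assumption based on how this model is commonly summarized, not a transcription of the paper. The function names, the constant `c`, the 0.5 smoothing, and the toy data are all hypothetical.

```python
import math

def term_weight(n_docs, doc_freq, c=1.0):
    """Assumed initial-search weight for one term.

    n_docs   -- number of documents in the collection (N)
    doc_freq -- number of documents containing the term (n)
    c        -- constant standing in for the unknown relevance odds (assumption)
    The 0.5 terms are a smoothing assumption that keeps the log finite
    when a term occurs in every document.
    """
    return c + math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5))

def score(query_terms, doc_terms, doc_freqs, n_docs, c=1.0):
    """Sum the weights of query terms that also occur in the document."""
    return sum(
        term_weight(n_docs, doc_freqs[t], c)
        for t in set(query_terms)
        if t in doc_terms and t in doc_freqs
    )

def rank(query_terms, docs, doc_freqs, n_docs, c=1.0):
    """Return document ids sorted by descending score."""
    scored = {d: score(query_terms, terms, doc_freqs, n_docs, c)
              for d, terms in docs.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Toy data (hypothetical): a 1,400-document collection and three documents.
N_DOCS = 1400
DOC_FREQS = {"boundary": 100, "layer": 700, "flow": 700}
DOCS = {
    "d1": {"boundary", "layer"},
    "d2": {"layer", "flow"},
    "d3": {"flow"},
}
ranking = rank(["boundary", "layer", "flow"], DOCS, DOC_FREQS, N_DOCS)
```

With `c` set to 0 the score depends only on how rare the matching terms are; larger values of `c` give more credit to the number of matching terms.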

Keywords

Relevance (law), Information retrieval, Computer science, Probabilistic logic, Search engine, Human–computer information retrieval, Divergence-from-randomness model, Ranking (information retrieval), Relevance feedback, Rank (graph theory), Search engine indexing, Document retrieval, Data mining, Artificial intelligence, Mathematics, Image retrieval

Publication Info

Year: 1979
Type: article
Volume: 35
Issue: 4
Pages: 285-295
Citations: 432
Access: Closed

Citation Metrics

432 (OpenAlex)

Cite This

W. Bruce Croft, David J. Harper (1979). Using probabilistic models of document retrieval without relevance information. Journal of Documentation, 35(4), 285-295. https://doi.org/10.1108/eb026683

Identifiers

DOI: 10.1108/eb026683