Abstract
Measures of cluster-based retrieval effectiveness are computed for five composite representations in the cystic fibrosis (CF) Document Collection. The composite representations are constructed from combinations of two subject representations, based on Medical Subject Headings and subheadings, and two citation representations, consisting of the complete list of cited references and a comprehensive list of citations for each document. Experimental retrieval results are presented as a function of the exhaustivity and similarity of the composite representations and reveal consistent patterns from which optimal performance levels can be identified. The optimal performance values provide an assessment of the absolute capacity of each composite representation to associate documents relevant to the same query and discriminate between documents relevant to different queries in single-link hierarchies. The optimal performance values for all composite representations are completely comparable and are superior to the optimal performance of constituent representations. Optimal performance consistently occurs at low levels of exhaustivity. Exhaustive composite representations that include subject descriptions produce the lowest levels of performance; retrieval results derived from random structures are comparable to the observed results. The effectiveness of the exhaustive representation composed of references and citations is materially superior to the effectiveness of exhaustive composite representations that include subject descriptions. © 1991 John Wiley & Sons, Inc.
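As a rough illustration of the setup the abstract describes, the sketch below merges a subject representation and a citation representation into one composite feature set per document, then cuts a single-link hierarchy at a similarity threshold. Everything in it is an assumption made for illustration: the Jaccard coefficient, the function names, the threshold values, and the three toy documents are hypothetical and are not taken from the paper, which does not specify its similarity measure in the abstract.

```python
from itertools import combinations


def composite(subject_terms, citation_terms):
    """Merge a subject and a citation representation into one feature set.

    Features are tagged by origin so a heading and a citation key never collide.
    """
    return {("subj", t) for t in subject_terms} | {("cite", c) for c in citation_terms}


def jaccard(a, b):
    """Jaccard coefficient (assumed here; the paper's measure may differ)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_link_clusters(docs, threshold):
    """Cut a single-link hierarchy at `threshold`.

    Single-link clusters at a given level are the connected components of the
    graph that links every pair of documents with similarity >= threshold.
    """
    parent = {d: d for d in docs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if jaccard(docs[a], docs[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for d in docs:
        clusters.setdefault(find(d), set()).add(d)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical toy documents: one subject heading and a few cited references each.
docs = {
    "d1": composite({"Cystic Fibrosis/genetics"}, {"ref_A", "ref_B"}),
    "d2": composite({"Cystic Fibrosis/genetics"}, {"ref_B", "ref_C"}),
    "d3": composite({"Cystic Fibrosis/therapy"}, {"ref_X"}),
}

loose = single_link_clusters(docs, 0.4)   # d1 and d2 share 2 of 4 features
strict = single_link_clusters(docs, 0.6)  # no pair reaches 0.6
```

In this toy case, lowering the threshold merges d1 and d2 while d3 stays isolated; raising it leaves every document in its own cluster. Adding more features per document (higher exhaustivity) changes the pairwise similarities and therefore the hierarchy, which is the kind of effect the study measures.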
Publication Info
- Year: 1991
- Type: article
- Volume: 42
- Issue: 9
- Pages: 676-684
- Citations: 20
- Access: Closed
Identifiers
- DOI: 10.1002/(sici)1097-4571(199110)42:9<676::aid-asi6>3.0.co;2-2