TOWARDS AUTOMATIC INDEXING: AUTOMATIC ASSIGNMENT OF CONTROLLED‐LANGUAGE INDEXING AND CLASSIFICATION FROM FREE INDEXING

1975 Journal of Documentation 33 citations

Abstract

A number of techniques have been studied for the automatic assignment of controlled subject headings and classifications from free indexing. These techniques involve the automatic manipulation and truncation of the free‐index phrases assigned to a document and the use of a manually‐constructed thesaurus and automatically‐generated dictionaries together with statistical ranking and weighting methods. These are based on the use of a statistically‐generated ‘adhesion coefficient’ which reflects the degree of association between the free‐indexing terms, the controlled subject headings, and the classifications. By the analysis of a large sample of manually‐indexed documents the system generates dictionaries of free‐language and controlled‐language terms together with their associated classifications and adhesion coefficients. Having learnt from the manually‐indexed documents the system uses these dictionaries in the subsequent automatic classification procedure. The accuracy and cost‐effectiveness of the automatically‐assigned subject headings and classifications have been compared with those of the manual system. The results were encouraging and the costs comparable to those of a manual system.
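The learn-then-assign procedure described in the abstract can be illustrated with a small sketch. The abstract does not give the formula for the adhesion coefficient, so the score below is an assumed stand-in: the fraction of training documents containing a (truncated) free term that also carry a given controlled heading. The function names, the five-character truncation rule, and the sample records are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def truncate(term, length=5):
    """Assumed stand-in for the truncation step: keep the first
    `length` characters of each word, lower-cased."""
    return " ".join(word[:length].lower() for word in term.split())


def build_dictionary(training_docs):
    """Learn association scores from manually indexed documents.

    training_docs: iterable of (free_terms, controlled_headings) pairs.
    Returns {free_term: {heading: score}}, where score is the share of
    training documents with that free term that also carry the heading
    (an assumed proxy for the adhesion coefficient).
    """
    term_counts = Counter()
    pair_counts = defaultdict(Counter)
    for free_terms, headings in training_docs:
        for term in {truncate(t) for t in free_terms}:
            term_counts[term] += 1
            for heading in set(headings):
                pair_counts[term][heading] += 1
    return {
        term: {h: n / term_counts[term] for h, n in heads.items()}
        for term, heads in pair_counts.items()
    }


def assign_headings(free_terms, dictionary, top_n=3):
    """Rank controlled headings for a new document by summed scores."""
    scores = Counter()
    for term in {truncate(t) for t in free_terms}:
        for heading, score in dictionary.get(term, {}).items():
            scores[heading] += score
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Invented training records: (free-index phrases, controlled headings).
training = [
    (["magnetic fields", "plasma heating"], ["Plasma physics"]),
    (["magnetic fields", "superconductors"], ["Superconductivity"]),
    (["plasma heating"], ["Plasma physics"]),
]
dictionary = build_dictionary(training)
result = assign_headings(["plasma heating", "magnetic field"], dictionary)
print(result)
```

In this toy setup, truncation lets the singular "magnetic field" match the plural form seen in training, and summing scores across terms plays the role of the ranking and weighting step. A real system would also draw on the manually constructed thesaurus and on classification codes, which this sketch omits.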

Keywords

Search engine indexing, Automatic indexing, Computer science, Ranking (information retrieval), Information retrieval, Weighting, Subject (documents), Index (typography), Thesaurus, Natural language processing, Artificial intelligence, World Wide Web

Publication Info

Year: 1975
Type: article
Volume: 31
Issue: 4
Pages: 246-265
Citations: 33
Access: Closed

Cite This

Barry James Field (1975). TOWARDS AUTOMATIC INDEXING: AUTOMATIC ASSIGNMENT OF CONTROLLED‐LANGUAGE INDEXING AND CLASSIFICATION FROM FREE INDEXING. Journal of Documentation, 31(4), 246-265. https://doi.org/10.1108/eb026605

Identifiers

DOI: 10.1108/eb026605