FIRST: Flexible Information Retrieval System for Text

Abstract

An on‐line document retrieval system is described which combines a data base management system with automatic processing of natural language queries and abstracts. Data consists of an abstract, from which index terms are automatically extracted, along with bibliographic and descriptive information. The data base management system is used to store bibliographic and descriptive information, providing direct access to documents with specified bibliographic or descriptor items. Methods originally developed in the SMART project are used for abstract analysis: stemming algorithm, cosine function for query‐document comparisons, ranked output, and clustered document collection. Searches are entered and performed on‐line, with output consisting of document abstracts ranked in decreasing order of similarity with the query. Additional facilities include off‐line searches, SDI, and display of data base statistics. Future plans and improvements are also discussed.
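
The cosine comparison and ranked output mentioned in the abstract can be illustrated with a minimal sketch. It is not the FIRST or SMART implementation: the tokenizer, the `crude_stem` suffix stripper, the raw term-frequency weighting, and all function names are assumptions made for illustration only, and clustering is omitted.

```python
import math
import re
from collections import Counter


def crude_stem(word):
    # Hypothetical placeholder: strips a few suffixes. NOT the SMART stemmer.
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_vector(text):
    # Lowercase, keep alphabetic runs, stem, and count term frequencies.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(w) for w in words)


def cosine(a, b):
    # Cosine of the angle between two sparse term-frequency vectors.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, abstracts):
    # Score every abstract against the query, most similar first.
    q = term_vector(query)
    scored = [(cosine(q, term_vector(text)), doc_id)
              for doc_id, text in abstracts.items()]
    # Ties are broken by document id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(doc_id, score) for score, doc_id in scored]


if __name__ == "__main__":
    sample = {
        "d1": "clustering documents for retrieval",
        "d2": "protein sequence alignment",
        "d3": "document retrieval with ranked output",
    }
    for doc_id, score in rank("ranked document retrieval", sample):
        print(doc_id, round(score, 3))
```

A production system would typically weight terms (for example by inverse document frequency) and remove common words before comparison; the sketch uses raw counts to keep the cosine step visible.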

Keywords

Computer science, Information retrieval, Cosine similarity, Document retrieval, Index (typography), Base (topology), Knowledge base, Similarity (geometry), Function (biology), Vector space model, Database, Data mining, World Wide Web, Artificial intelligence, Cluster analysis, Mathematics

Affiliated Institutions

Xerox (United States)

Publication Info

Year: 1979
Type: article
Volume: 30
Issue: 1
Pages: 9-14
Citations: 30
Access: Closed

Cite This

APA Style

Robert T. Dattola (1979). FIRST: Flexible Information Retrieval System for Text. Journal of the American Society for Information Science, 30(1), 9-14. https://doi.org/10.1002/asi.4630300103

Identifiers

DOI: 10.1002/asi.4630300103
