Abstract
This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., "subtle nuances") and a negative semantic orientation when it has bad associations (e.g., "very cavalier"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews.
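The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the abstract does not say how co-occurrence statistics are gathered, so the count dictionaries, the corpus size `total`, the `smoothing` constant, and all function names are assumptions made for the example, and the numbers are invented.

```python
import math


def pmi(joint, count_a, count_b, total):
    """Pointwise mutual information (base 2) estimated from raw counts."""
    return math.log2((joint * total) / (count_a * count_b))


def semantic_orientation(phrase_counts, ref_counts, total, smoothing=0.01):
    """PMI(phrase, "excellent") minus PMI(phrase, "poor").

    phrase_counts: hypothetical dict with keys "count", "near_excellent",
    "near_poor". ref_counts: hypothetical dict with corpus counts of
    "excellent" and "poor". The smoothing term is an assumption that
    keeps log2 defined when a co-occurrence count is zero.
    """
    n_phrase = phrase_counts["count"]
    near_excellent = phrase_counts["near_excellent"] + smoothing
    near_poor = phrase_counts["near_poor"] + smoothing
    return (pmi(near_excellent, n_phrase, ref_counts["excellent"], total)
            - pmi(near_poor, n_phrase, ref_counts["poor"], total))


def classify_review(orientations):
    """Label a review from the orientations of its extracted phrases."""
    if not orientations:
        raise ValueError("review has no scored phrases")
    average = sum(orientations) / len(orientations)
    return "recommended" if average > 0 else "not recommended"


# Invented counts, for illustration only.
TOTAL = 1_000_000
REF = {"excellent": 1000, "poor": 1000}
so_a = semantic_orientation(
    {"count": 100, "near_excellent": 20, "near_poor": 5}, REF, TOTAL)
so_b = semantic_orientation(
    {"count": 50, "near_excellent": 2, "near_poor": 16}, REF, TOTAL)
label = classify_review([so_a, so_b])
```

Note that the phrase count and the corpus size cancel in the subtraction, so the score reduces to a log ratio of the two co-occurrence counts, corrected by the relative frequency of the two reference words.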
Related Publications
Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews
This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review ...
Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus
The evaluative character of a word is called its semantic orientation. A positive semantic orientation implies desirability (e.g., "honest", "intrepid") and a negative semantic ...
Technical terminology: some linguistic properties and an algorithm for identification in text
This paper identifies some linguistic properties of technical terminology, and uses them to formulate an algorithm for identifying technical terms in running text. The ...
Seeing stars
We address the rating-inference problem, wherein rather than simply decide whether a review is "thumbs up" or "thumbs down", as in previous sentiment analysis work, one must det...
Composition in Distributional Models of Semantics
Vector-based models of word meaning have become increasingly popular in cognitive science. The appeal of these models lies in their ability to represent meaning simply ...
Publication Info
- Year: 2002
- Type: preprint
- Citations: 1580
- Access: Closed
Identifiers
- DOI: 10.48550/arxiv.cs/0212032