Thumbs up? Sentiment Classification using Machine Learning Techniques

Bo Pang; Lillian Lee; Shivakumar Vaithyanathan

doi:10.48550/arxiv.cs/0205070

Abstract

We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.
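To make the setup concrete, here is a minimal sketch of one of the three methods named above, Naive Bayes, applied to positive/negative classification. It is an illustration only, not the authors' implementation: the function names `train_naive_bayes` and `classify`, the whitespace tokenization, the add-one smoothing, and the four toy reviews are all assumptions made for this example.

```python
import math
from collections import Counter

def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha smoothing."""
    vocab = {w for doc in docs for w in doc.split()}
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())

    model = {"vocab": vocab, "log_prior": {}, "log_like": {}}
    for c in class_counts:
        # Class prior estimated from label frequencies.
        model["log_prior"][c] = math.log(class_counts[c] / len(docs))
        # Smoothed per-class word likelihoods over the shared vocabulary.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_like"][c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return model

def classify(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    scores = {}
    for c, log_prior in model["log_prior"].items():
        scores[c] = log_prior + sum(
            model["log_like"][c][w] for w in doc.split() if w in model["vocab"]
        )
    return max(scores, key=scores.get)

# Toy, made-up training data (hypothetical; not from the paper's corpus).
docs = [
    "a great and moving film",
    "brilliant acting great script",
    "a dull and boring film",
    "boring plot awful acting",
]
labels = ["pos", "pos", "neg", "neg"]

model = train_naive_bayes(docs, labels)
print(classify(model, "great film"))   # pos
print(classify(model, "boring plot"))  # neg
```

A real experiment would train on a full review corpus and evaluate on held-out documents; this sketch only shows how word counts turn into a sentiment decision.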

Keywords

Naive Bayes classifier, Computer science, Categorization, Artificial intelligence, Sentiment analysis, Support vector machine, Machine learning, Principle of maximum entropy, Natural language processing

Related Publications

Thumbs up?

Bo Pang, Lillian Lee, Shivakumar Vaithyanathan

We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data,...

2002 Proceedings of the ACL-02 conference ... 6965 citations

A sentimental education

Bo Pang, Lillian Lee

Sentiment analysis seeks to identify the viewpoint(s) underlying a text span; an example application is classifying a movie review as "thumbs up" or "thumbs down". To determine ...

2004 3318 citations

Seeing stars

Bo Pang, Lillian Lee

We address the rating-inference problem, wherein rather than simply decide whether a review is "thumbs up" or "thumbs down", as in previous sentiment analysis work, one must det...

2005 2121 citations

Advances in kernel methods: support vector learning

Bernhard Schölkopf, Christopher J. C. Burges, Alexander J. Smola

Introduction to support vector learning roadmap. Part 1 Theory: three remarks on the support vector method of function estimation, Vladimir Vapnik generalization performance of ...

1999 International Conference on Neural In... 5814 citations

Seeing stars when there aren't many stars

Andrew B. Goldberg, Xiaojin Zhu

We present a graph-based semi-supervised learning algorithm to address the sentiment analysis task of rating inference. Given a set of documents (e.g., movie reviews) and accomp...

2006 317 citations

Publication Info

Year: 2002
Type: preprint
Citations: 2207
Access: Closed

Cite This

APA Style

Bo Pang, Lillian Lee, Shivakumar Vaithyanathan (2002). Thumbs up? Sentiment Classification using Machine Learning Techniques. arXiv (Cornell University). https://doi.org/10.48550/arxiv.cs/0205070

Identifiers

DOI: 10.48550/arxiv.cs/0205070