Abstract
Vector‐based models of word meaning have become increasingly popular in cognitive science. The appeal of these models lies in their ability to represent meaning simply by using distributional information under the assumption that words occurring within similar contexts are semantically similar. Despite their widespread use, vector‐based models are typically directed at representing words in isolation, and methods for constructing representations for phrases or sentences have received little attention in the literature. This is in marked contrast to experimental evidence (e.g., in sentential priming) suggesting that semantic similarity is more complex than simply a relation between isolated words. This article proposes a framework for representing the meaning of word combinations in vector space. Central to our approach is vector composition, which we operationalize in terms of additive and multiplicative functions. Under this framework, we introduce a wide range of composition models that we evaluate empirically on a phrase similarity task.
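The additive and multiplicative composition functions named in the abstract are simple enough to state directly. Below is a minimal sketch in Python, assuming NumPy and hypothetical toy co-occurrence vectors; the vectors, dimensions, and function names are illustrative and not taken from the article, whose full framework also covers weighted and other variants.

```python
import numpy as np

# A minimal sketch of the two composition functions named in the abstract:
# additive (p = u + v) and multiplicative (p_i = u_i * v_i). The vectors
# below are hypothetical co-occurrence counts over three context words.

def additive(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Compose two word vectors by summing them dimension-wise."""
    return u + v

def multiplicative(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Compose two word vectors by element-wise multiplication,
    emphasising contexts shared by both words."""
    return u * v

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity, a standard measure for comparing phrase vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

u = np.array([2.0, 5.0, 1.0])
v = np.array([1.0, 4.0, 3.0])

print(additive(u, v))        # [3. 9. 4.]
print(multiplicative(u, v))  # [ 2. 20.  3.]
print(cosine(additive(u, v), multiplicative(u, v)))
```

One consequence worth noting: under the multiplicative model a context dimension survives only if both words co-occur with it (a zero in either vector zeroes the product), whereas the additive model preserves every dimension from either word.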
Related Publications
Compositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics
Speakers of a language can construct an unlimited number of new words through morphological derivation. This is a major cause of data sparseness for corpus-based approaches to l...
Dependency-Based Construction of Semantic Space Models
Traditionally, vector-based semantic space models use word co-occurrence counts from large corpora to represent lexical meaning. In this article we present a novel framework for...
Compositional Matrix-Space Models for Sentiment Analysis
We present a general learning-based approach for phrase-level sentiment analysis that adopts an ordinal sentiment scale and is explicitly compositional in nature. Thus, we can m...
Distributional Memory: A General Framework for Corpus-Based Semantics
Research into corpus-based semantics has focused on the development of ad hoc models that treat single tasks, or sets of closely related tasks, as unrelated challenges to be tac...
Distributed Representations of Words and Phrases and their Compositionality
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syn...
Publication Info
- Year: 2010
- Type: article
- Volume: 34
- Issue: 8
- Pages: 1388-1429
- Citations: 967
- Access: Closed
Identifiers
- DOI: 10.1111/j.1551-6709.2010.01106.x