Abstract
Vector-space word representations have been very successful in recent years at improving performance across a variety of NLP tasks. However, most existing work treats words as independent entities and does not explicitly model the relationships among morphologically related words. As a result, rare and complex words are often poorly estimated, and all unknown words are represented in a rather crude way, using only one or a few vectors. This paper addresses this shortcoming by proposing a novel model that builds representations for morphologically complex words from their morphemes. We combine recursive neural networks (RNNs), where each morpheme is a basic unit, with neural language models (NLMs) to consider contextual information in learning morphologically-aware word representations. Our learned models outperform existing word representations by a good margin on word similarity tasks across many datasets, including a new dataset we introduce that focuses on rare words to complement existing ones.
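To make the morpheme-level composition concrete, the sketch below is a minimal, hypothetical illustration (not the authors' released code). It assumes a tanh composition of the form p = f(W[parent; morpheme] + b) applied left to right over a word's morpheme sequence, with toy randomly initialized morpheme vectors; in the actual model, the morpheme embeddings and composition parameters are learned jointly with a neural language model over context.

```python
import numpy as np

# Minimal sketch of composing a word vector from morpheme vectors with a
# recursive composition function. Dimensionality, tanh nonlinearity, and the
# left-to-right folding over (prefixes, stem, suffixes) are assumptions for
# illustration, not a faithful reproduction of the paper's implementation.

d = 50  # morpheme/word embedding dimensionality (assumed)
rng = np.random.default_rng(0)

# Toy morpheme embedding table; in practice these would be learned.
morpheme_vecs = {
    "un": rng.normal(scale=0.1, size=d),
    "fortunate": rng.normal(scale=0.1, size=d),
    "ly": rng.normal(scale=0.1, size=d),
}

# Composition parameters: combine the parent-so-far vector with the next morpheme.
W = rng.normal(scale=0.1, size=(d, 2 * d))
b = np.zeros(d)

def compose(parent, morpheme):
    """p = tanh(W [parent; morpheme] + b) -- assumed composition function."""
    return np.tanh(W @ np.concatenate([parent, morpheme]) + b)

def word_vector(morphemes):
    """Fold the composition left to right over the word's morphemes."""
    vec = morpheme_vecs[morphemes[0]]
    for m in morphemes[1:]:
        vec = compose(vec, morpheme_vecs[m])
    return vec

v = word_vector(["un", "fortunate", "ly"])  # representation for "unfortunately"
print(v.shape)  # (50,)
```

Because unknown or rare words can be segmented into known morphemes, the same composition yields a vector for words never seen as whole units during training, which is the motivation stated in the abstract.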
Publication Info
- Year: 2013
- Type: article
- Pages: 104-113