Abstract

Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. Neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of neural machine translation using two models: the RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network. We show that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network automatically learns the grammatical structure of a sentence.
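The encoder-decoder idea described in the abstract can be sketched in a few lines: an encoder (here a GRU-style recurrent unit, one of the gated units the paper's models build on) reads a variable-length sequence of word embeddings and folds it into a single fixed-length vector, on which a decoder would then condition to generate the translation. This is a minimal illustrative sketch, not the authors' implementation; the toy dimensions and random weights are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB, HID = 8, 16  # toy embedding and hidden sizes (illustrative assumption)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUEncoder:
    """Compresses a (T, EMB) sequence into one fixed-length (HID,) vector."""

    def __init__(self, emb_dim, hid_dim):
        s = 0.1  # small random init; untrained weights, for shape illustration only
        self.Wz = rng.normal(0, s, (hid_dim, emb_dim)); self.Uz = rng.normal(0, s, (hid_dim, hid_dim))
        self.Wr = rng.normal(0, s, (hid_dim, emb_dim)); self.Ur = rng.normal(0, s, (hid_dim, hid_dim))
        self.Wh = rng.normal(0, s, (hid_dim, emb_dim)); self.Uh = rng.normal(0, s, (hid_dim, hid_dim))
        self.hid_dim = hid_dim

    def encode(self, embeddings):
        h = np.zeros(self.hid_dim)
        for x in embeddings:  # one GRU step per word
            z = sigmoid(self.Wz @ x + self.Uz @ h)            # update gate
            r = sigmoid(self.Wr @ x + self.Ur @ h)            # reset gate
            h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h))  # candidate state
            h = (1 - z) * h + z * h_tilde
        return h  # fixed length, regardless of sentence length T

encoder = GRUEncoder(EMB, HID)
short = rng.normal(size=(3, EMB))   # a 3-word "sentence"
long_ = rng.normal(size=(30, EMB))  # a 30-word "sentence"
print(encoder.encode(short).shape, encoder.encode(long_).shape)  # both (16,)
```

The key property this illustrates is also the bottleneck the paper analyzes: every input sentence, whatever its length, is squeezed into the same fixed-size vector, which helps explain why translation quality degrades on long sentences.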

Keywords

Computer science, Machine translation, Encoder, Translation (biology), Artificial intelligence, Speech recognition, Operating system

Related Publications

Skip-Thought Vectors

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tri...

2015 arXiv (Cornell University) 723 citations

Publication Info

Year: 2014
Type: preprint
Pages: 103–111
Citations: 6358
Access: Closed

Citation Metrics

OpenAlex: 6358
Influential: 891
CrossRef: 3978

Cite This

Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau et al. (2014). On the Properties of Neural Machine Translation: Encoder–Decoder Approaches. Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, 103–111. https://doi.org/10.3115/v1/w14-4012

Identifiers

DOI: 10.3115/v1/w14-4012
arXiv: 1409.1259

Data Quality

Data completeness: 84%