Abstract

We present extensions to a continuous-state dependency parsing method that make it applicable to morphologically rich languages. Starting with a high-performance transition-based parser that uses long short-term memory (LSTM) recurrent neural networks to learn representations of the parser state, we replace lookup-based word representations with representations constructed from the orthographic forms of the words, also using LSTMs. This allows statistical sharing across word forms that are similar on the surface. Experiments on morphologically rich languages show that the parsing model benefits from incorporating the character-based encodings of words.
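The central idea in the abstract is to build each word's vector from its character sequence with LSTMs instead of looking it up in a word-embedding table, so that surface-similar word forms share parameters. The following is a minimal illustrative sketch in PyTorch, not the authors' implementation; the module name, vocabulary, and dimensions are assumptions chosen for the example.

import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Compose a word representation from its characters with two LSTMs
    (one reading left-to-right, one right-to-left), replacing a lookup embedding."""
    def __init__(self, num_chars, char_dim=32, word_dim=64):
        super().__init__()
        self.char_embed = nn.Embedding(num_chars, char_dim)
        self.fwd = nn.LSTM(char_dim, word_dim // 2, batch_first=True)
        self.bwd = nn.LSTM(char_dim, word_dim // 2, batch_first=True)
        self.proj = nn.Linear(word_dim, word_dim)

    def forward(self, char_ids):
        # char_ids: (1, word_length) tensor of character indices for one word
        embedded = self.char_embed(char_ids)
        _, (h_fwd, _) = self.fwd(embedded)
        _, (h_bwd, _) = self.bwd(torch.flip(embedded, dims=[1]))
        # Concatenate the final states of both directions and project
        word_vec = torch.cat([h_fwd[-1], h_bwd[-1]], dim=-1)
        return torch.tanh(self.proj(word_vec))

# Usage: encode the word "parsing" from its characters (toy vocabulary)
char_vocab = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}
encoder = CharWordEncoder(num_chars=len(char_vocab))
ids = torch.tensor([[char_vocab[c] for c in "parsing"]])
print(encoder(ids).shape)  # torch.Size([1, 64])

In the paper's parser, a vector produced this way would stand in for the word-level lookup embedding fed to the LSTMs that encode the parser state; that wiring is omitted here.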

Keywords

Computer science, Parsing, Natural language processing, Artificial intelligence, Transition-based parsing, Word representations, Character-based models, Dependency grammar, Recurrent neural networks, Artificial neural networks, Linguistics

Related Publications

Finding Structure in Time

Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implici...

1990, Cognitive Science, 10,427 citations

Publication Info

Year
2015
Type
article
Pages
349-359
Citations
244
Access
Closed

Citation Metrics

244 (OpenAlex)

Cite This

Miguel Ballesteros, Chris Dyer, Noah A. Smith (2015). Improved Transition-based Parsing by Modeling Characters instead of Words with LSTMs. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 349-359. https://doi.org/10.18653/v1/d15-1041

Identifiers

DOI
10.18653/v1/d15-1041