Abstract
Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
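The abstract's constant error carousel (CEC) and multiplicative gates can be made concrete with a short sketch. The following is a minimal NumPy illustration, not the paper's implementation: it uses a single memory cell with input and output gates only (the 1997 architecture has no forget gate), substitutes tanh for the paper's g and h squashing functions, and the names `lstm_step` and `params` are illustrative. The key line is the cell-state update, whose identity self-connection lets error flow back through time without decaying or exploding.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, params):
    """One forward step of a 1997-style LSTM cell (input and output
    gates only, no forget gate). Shapes: x (n_in,), h and c (n_hid,)."""
    z = np.concatenate([x, h])            # current input + recurrent state
    i = sigmoid(params["W_i"] @ z)        # input gate: controls write access
    o = sigmoid(params["W_o"] @ z)        # output gate: controls read access
    g = np.tanh(params["W_g"] @ z)        # squashed candidate cell input
    c = c + i * g                         # constant error carousel:
                                          #   dc_t/dc_{t-1} = 1, so backflowing
                                          #   error neither decays nor explodes
    h = o * np.tanh(c)                    # gated, squashed cell output
    return h, c

# Usage sketch: run a random length-10 sequence through the cell.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
params = {k: rng.normal(scale=0.1, size=(n_hid, n_in + n_hid))
          for k in ("W_i", "W_o", "W_g")}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(10, n_in)):
    h, c = lstm_step(x, h, c, params)
```

Because the gates are multiplicative, a gate near 0 shuts the cell off entirely while a gate near 1 passes the carousel's contents through unchanged, which is what the abstract means by learning to open and close access to the constant error flow.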
Publication Info
- Year: 1997
- Type: article
- Volume: 9
- Issue: 8
- Pages: 1735-1780
- Citations: 90535
- Access: Closed
Identifiers
- DOI: 10.1162/neco.1997.9.8.1735