Abstract

Long short-term memory (LSTM) can solve many tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends. Without resets, the internal state values may grow indefinitely and eventually cause the network to break down. Our remedy is an adaptive gate that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review an illustrative benchmark problem on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve a continual version of that problem. LSTM with forget gates, however, easily solves it in an elegant way.
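
For readers who want the mechanism in concrete terms, the sketch below shows a single LSTM memory-cell update with a forget gate, written in the now-common formulation rather than the paper's original notation. It is an illustrative paraphrase of the idea in the abstract; the variable names, weight layout, and sizes are assumptions, not the authors' code.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, b):
        """One LSTM step with a forget gate (modern formulation, illustrative only).

        x      : input vector at the current time step
        h_prev : previous cell output (hidden state)
        c_prev : previous internal cell state -- the quantity that, without
                 resets, can grow indefinitely on a continual input stream
        W, b   : stacked weights/biases for the input, forget, output and
                 candidate transforms
        """
        z = W @ np.concatenate([x, h_prev]) + b
        H = h_prev.size
        i = sigmoid(z[0*H:1*H])   # input gate
        f = sigmoid(z[1*H:2*H])   # forget gate: learns when to reset the cell
        o = sigmoid(z[2*H:3*H])   # output gate
        g = np.tanh(z[3*H:4*H])   # candidate cell input
        c = f * c_prev + i * g    # f near 0 resets the state, releasing internal resources
        h = o * np.tanh(c)        # cell output
        return h, c

    # Illustrative usage with made-up sizes.
    rng = np.random.default_rng(0)
    n_in, n_hidden = 4, 8
    W = rng.normal(scale=0.1, size=(4 * n_hidden, n_in + n_hidden))
    b = np.zeros(4 * n_hidden)
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    for t in range(5):
        h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)

Fixing the forget gate f at 1 recovers the earlier LSTM update, in which the internal state c can only accumulate over a continual input stream; letting the network learn f allows the cell to drive it toward 0 and reset itself at appropriate times.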

Keywords

Benchmark, Recurrent neural network, Computer science, Artificial intelligence, Reset, Sequence, Long short-term memory, State, Deep learning, Machine learning, Artificial neural network, Algorithm

Publication Info

Year: 1999
Type: Article
Volume: 1999
Pages: 850-855
Citations: 2376
Access: Closed

Citation Metrics

OpenAlex: 2376
CrossRef: 1155

Cite This

Felix A. Gers, Jürgen Schmidhuber, Fred Cummins (1999). Learning to forget: continual prediction with LSTM. 9th International Conference on Artificial Neural Networks: ICANN '99, 1999, 850-855. https://doi.org/10.1049/cp:19991218

Identifiers

DOI
10.1049/cp:19991218
