Related Publications
Greedy function approximation: A gradient boosting machine.
Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewi...
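For squared-error loss, the negative functional gradient at each stage is simply the vector of current residuals, so each boosting stage fits a small learner to those residuals and adds it to the ensemble with a shrinkage factor. A minimal sketch of that stagewise loop, assuming squared-error loss and scikit-learn regression trees as base learners (both illustrative choices; the paper develops the general differentiable-loss case):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, n_stages=100, learning_rate=0.1):
    """Stagewise gradient boosting for squared-error loss.

    Each stage fits a shallow regression tree to the current residuals,
    which equal the negative gradient of L = (y - F)^2 / 2, then adds a
    shrunken version of that tree to the additive model F.
    """
    F = np.full(len(y), y.mean())  # initial constant model F_0
    trees = []
    for _ in range(n_stages):
        residuals = y - F                       # negative functional gradient
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
        F += learning_rate * tree.predict(X)    # stagewise additive update
        trees.append(tree)
    return y.mean(), trees

def gradient_boost_predict(X, base, trees, learning_rate=0.1):
    F = np.full(len(X), base)
    for tree in trees:
        F += learning_rate * tree.predict(X)
    return F
```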
Training Recurrent Networks by Evolino
In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for...
Learning long-term dependencies with gradient descent is difficult
Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties hav...
ADADELTA: An Adaptive Learning Rate Method
We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first order information and has mi...
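The per-dimension rule behind that description keeps two exponential moving averages per parameter, of squared gradients and of squared updates, and scales each step by the ratio of their RMS values, so no global learning rate needs to be hand-tuned. A minimal NumPy sketch of one update step (the function name and the quadratic usage example are illustrative assumptions, not from the paper):

```python
import numpy as np

def adadelta_step(x, grad, acc_g, acc_dx, rho=0.95, eps=1e-6):
    """One ADADELTA update using only first-order information."""
    acc_g = rho * acc_g + (1 - rho) * grad**2            # E[g^2]
    dx = -np.sqrt(acc_dx + eps) / np.sqrt(acc_g + eps) * grad
    acc_dx = rho * acc_dx + (1 - rho) * dx**2            # E[dx^2]
    return x + dx, acc_g, acc_dx

# Illustrative usage: minimize f(x) = sum(x^2), whose gradient is 2x.
x = np.array([3.0, -2.0])
acc_g = np.zeros_like(x)
acc_dx = np.zeros_like(x)
for _ in range(500):
    x, acc_g, acc_dx = adadelta_step(x, 2 * x, acc_g, acc_dx)
```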
Optimization for training neural nets
Various techniques of optimizing criterion functions to train neural-net classifiers are investigated. These techniques include three standard deterministic techniques (variable...
Publication Info
- Year: 1990
- Type: article
- Volume: 6
- Issue: 2
- Pages: 192-198
- Citations: 241
- Access: Closed
Identifiers
- DOI: 10.1016/0885-064x(90)90006-y