A Practical Bayesian Framework for Backpropagation Networks

Abstract
A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained.
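The central quantities in the abstract, the Bayesian evidence and the effective number of well-determined parameters, have compact closed forms under the paper's quadratic (Gaussian) approximation around the most probable weights. The sketch below is illustrative, not code from the paper: it assumes a single Gaussian weight prior with precision `alpha`, Gaussian output noise with precision `beta`, and that the eigenvalues of the (beta-scaled) data-misfit Hessian are available; all names are chosen here for illustration.

```python
import numpy as np

def log_evidence(alpha, beta, E_W, E_D, lam, N):
    """Log evidence ln p(D | alpha, beta, H) under a quadratic
    (Gaussian) approximation around the most probable weights.

    alpha : weight-decay (prior) precision
    beta  : output-noise precision
    E_W   : (1/2) * ||w_MP||^2, weight-decay term at the mode
    E_D   : (1/2) * sum of squared errors at the mode
    lam   : eigenvalues of beta * H_D, the data-term Hessian
    N     : number of training examples
    """
    k = len(lam)  # total number of weights
    log_det_A = np.sum(np.log(lam + alpha))  # A = beta*H_D + alpha*I
    return (-alpha * E_W - beta * E_D - 0.5 * log_det_A
            + 0.5 * k * np.log(alpha) + 0.5 * N * np.log(beta)
            - 0.5 * N * np.log(2.0 * np.pi))

def effective_parameters(alpha, lam):
    """gamma = sum_i lam_i / (lam_i + alpha): the number of weights
    determined by the data rather than by the prior (point 4)."""
    return np.sum(lam / (lam + alpha))

# Toy usage: of 4 weights, roughly 2.2 are well determined here.
lam = np.array([120.0, 45.0, 0.3, 0.01])
print(effective_parameters(alpha=1.0, lam=lam))
```

Comparing `log_evidence` across architectures is what implements the "Occam's razor" behavior: an overflexible model pays through the determinant (Occam factor) term even when it fits the data slightly better. In MacKay's scheme the hyperparameters are re-estimated from gamma (alpha = gamma / (2 E_W), beta = (N - gamma) / (2 E_D)), which underlies the objective choice of weight-decay magnitude in point (3).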
Related Publications
- Multiple Imputation after 18+ Years: Multiple imputation was designed to handle the problem of missing data in public-use data bases where the data-base constructor and the ultimate user are distinct entit...
- Bayesian Regularization and Pruning Using a Laplace Prior: Standard techniques for improved generalization from neural networks include weight decay and pruning. Weight decay has a Bayesian interpretation with the decay function corresp... (see the sketch after this list)
- An Introduction to Computational Learning Theory: Emphasizing issues of computational efficiency, Michael Kearns and Umesh Vazirani introduce a number of central topics in computational learning theory for researchers and stude...
- Training with Noise is Equivalent to Tikhonov Regularization: It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization ... (see the sketch after this list)
- Pruning algorithms-a survey: A rule of thumb for obtaining good generalization in systems trained by examples is that one should use the smallest system that will fit the data. Unfortunately, it usually is ...
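Two of the entries above reduce to compact statements. The Bayesian reading of weight decay is that the decay term is the negative log of the prior: a Gaussian prior yields the usual quadratic penalty, while a Laplace prior yields an L1 penalty whose minima can sit exactly at zero, which is why it pairs naturally with pruning. And the noise-training paper's result is that training on inputs corrupted by small Gaussian noise behaves approximately like adding a Tikhonov-style smoothness penalty. A minimal sketch with illustrative names, not code from any of these papers:

```python
import numpy as np

def gaussian_decay(w, alpha):
    # Negative log of a Gaussian prior p(w) ~ exp(-alpha/2 * ||w||^2),
    # up to a constant: standard quadratic weight decay.
    return 0.5 * alpha * np.sum(w ** 2)

def laplace_decay(w, alpha):
    # Negative log of a Laplace prior p(w) ~ exp(-alpha * sum|w_i|),
    # up to a constant: an L1 penalty that can drive weights exactly
    # to zero, making it a natural companion to pruning.
    return alpha * np.sum(np.abs(w))

def noisy_inputs(x, sigma, rng=None):
    # Corrupt inputs with zero-mean Gaussian noise during training;
    # for small sigma this is approximately equivalent to adding a
    # Tikhonov regularizer on the network's input-output sensitivity.
    rng = rng if rng is not None else np.random.default_rng()
    return x + sigma * rng.standard_normal(x.shape)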
Publication Info
- Year: 1992
- Type: Article
- Volume: 4
- Issue: 3
- Pages: 448-472
- Citations: 2841
- Access: Closed
Identifiers
- DOI: 10.1162/neco.1992.4.3.448