Training with Noise is Equivalent to Tikhonov Regularization

1995 · Neural Computation · 1,239 citations

Abstract

It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization performance. Previous work has shown that such training with noise is equivalent to a form of regularization in which an extra term is added to the error function. However, the regularization term, which involves second derivatives of the error function, is not bounded below, and so can lead to difficulties if used directly in a learning algorithm based on error minimization. In this paper we show that for the purposes of network training, the regularization term can be reduced to a positive semi-definite form that involves only first derivatives of the network mapping. For a sum-of-squares error function, the regularization term belongs to the class of generalized Tikhonov regularizers. Direct minimization of the regularized error function provides a practical alternative to training with noise.
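As a rough illustration of the result summarized above (not code from the paper), the sketch below contrasts the two training objectives for a small network: adding Gaussian noise to the inputs and using the ordinary sum-of-squares error, versus minimizing the same error augmented by a first-derivative penalty weighted by the noise variance. It assumes JAX; the network architecture, data shapes, and the name noise_std are illustrative choices only. Under the paper's analysis, for small noise the expected value of the first objective is approximated by the second.

    import jax
    import jax.numpy as jnp

    def net(params, x):
        # Small single-hidden-layer tanh network (placeholder architecture).
        W1, b1, W2, b2 = params
        return W2 @ jnp.tanh(W1 @ x + b1) + b2

    def noisy_loss(params, xs, ts, key, noise_std):
        # "Training with noise": corrupt each input with zero-mean Gaussian
        # noise, then apply the ordinary sum-of-squares error.
        noise = noise_std * jax.random.normal(key, xs.shape)
        preds = jax.vmap(lambda x: net(params, x))(xs + noise)
        return 0.5 * jnp.mean(jnp.sum((preds - ts) ** 2, axis=-1))

    def tikhonov_loss(params, xs, ts, noise_std):
        # Regularized alternative: sum-of-squares error plus a first-derivative
        # penalty (squared Frobenius norm of dy/dx) scaled by the noise variance.
        preds = jax.vmap(lambda x: net(params, x))(xs)
        sse = 0.5 * jnp.mean(jnp.sum((preds - ts) ** 2, axis=-1))
        jac = jax.vmap(jax.jacfwd(lambda x: net(params, x)))(xs)
        penalty = 0.5 * noise_std ** 2 * jnp.mean(jnp.sum(jac ** 2, axis=(-2, -1)))
        return sse + penalty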

Keywords

Tikhonov regularization, Regularization perspectives on support vector machines, Regularization (linguistics), Artificial neural network, Minification, Backus–Gilbert method, Early stopping, Mathematics, Error function, Bounded function, Computer science, Noise (video), Proximal gradient methods for learning, Applied mathematics, Algorithm, Mathematical optimization, Artificial intelligence, Inverse problem, Mathematical analysis

Publication Info

Year: 1995
Type: article
Volume: 7
Issue: 1
Pages: 108-116
Citations: 1,239
Access: Closed

Citation Metrics

1,239 (source: OpenAlex)

Cite This

Chris Bishop (1995). Training with Noise is Equivalent to Tikhonov Regularization. Neural Computation, 7(1), 108-116. https://doi.org/10.1162/neco.1995.7.1.108

Identifiers

DOI: 10.1162/neco.1995.7.1.108