Abstract
We had previously shown that regularization principles lead to approximation schemes that are equivalent to networks with one layer of hidden units, called regularization networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well-known radial basis function approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends radial basis functions (RBF) to hyper basis functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, some forms of projection pursuit regression, and several types of neural networks. We propose to use the term generalized regularization networks for this broad class of approximation schemes that follow from an extension of regularization. In the probabilistic interpretation of regularization, the different classes of basis functions correspond to different classes of prior probabilities on the approximating function spaces, and therefore to different types of smoothness assumptions. In summary, different multilayer networks with one hidden layer, which we collectively call generalized regularization networks, correspond to different classes of priors and associated smoothness functionals in a classical regularization principle. Three broad classes are (1) radial basis functions that can be generalized to hyper basis functions, (2) some tensor product splines, and (3) additive splines that can be generalized to ridge approximation schemes, hinge functions, and several perceptron-like neural networks with one hidden layer.
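As an illustration of the radial basis function case mentioned in the abstract, the sketch below fits a regularization network of the form f(x) = Σ_i c_i G(‖x − x_i‖), with coefficients obtained by solving the regularized linear system (G + λI)c = y. This is not code from the paper; the Gaussian kernel, the single regularization parameter `lam`, and the function name `gaussian_rbf_regularization_network` are illustrative assumptions.

```python
import numpy as np

def gaussian_rbf_regularization_network(X, y, lam=1e-3, sigma=1.0):
    """Minimal sketch of an RBF regularization network.

    Places one Gaussian basis function on each training point and
    solves (G + lam * I) c = y for the coefficients, where G is the
    kernel (Green's function) matrix evaluated on the training data.
    """
    # Pairwise squared distances between training points
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    G = np.exp(-sq / (2 * sigma ** 2))                 # Gaussian kernel matrix
    c = np.linalg.solve(G + lam * np.eye(len(X)), y)   # regularized coefficients

    def f(X_new):
        # Evaluate the approximating function at new points
        sq_new = np.sum((X_new[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq_new / (2 * sigma ** 2)) @ c

    return f

# Usage: fit noisy samples of a 1-D function and evaluate at two points
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
f = gaussian_rbf_regularization_network(X, y, lam=1e-2, sigma=0.5)
print(f(np.array([[0.0], [1.5]])))
```

The regularization parameter λ trades data fit against the smoothness functional; in the probabilistic reading of the abstract, the choice of kernel corresponds to the choice of prior on the approximating function space.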
Publication Info
- Year: 1995
- Type: article
- Volume: 7
- Issue: 2
- Pages: 219-269
- Citations: 1344
- Access: Closed
Identifiers
- DOI: 10.1162/neco.1995.7.2.219