Abstract
Many neural network classifiers provide outputs which estimate Bayesian a posteriori probabilities. When the estimation is accurate, network outputs can be treated as probabilities and sum to one. Simple proofs show that Bayesian probabilities are estimated when desired network outputs are 1 of M (one output unity, all others zero) and a squared-error or cross-entropy cost function is used. Results of Monte Carlo simulations performed using multilayer perceptron (MLP) networks trained with backpropagation, radial basis function (RBF) networks, and high-order polynomial networks graphically demonstrate that network outputs provide good estimates of Bayesian probabilities. Estimation accuracy depends on network complexity, the amount of training data, and the degree to which training data reflect true likelihood distributions and a priori class probabilities. Interpretation of network outputs as Bayesian probabilities allows outputs from multiple networks to be combined for higher level decision making, simplifies creation of rejection thresholds, makes it possible to compensate for differences between pattern class probabilities in training and test data, allows outputs to be used to minimize alternative risk functions, and suggests alternative measures of network performance.
Keywords
Affiliated Institutions
Related Publications
A probabilistic approach to the understanding and training of neural network classifiers
It is shown that training a neural network using a mean-square-error criterion gives network outputs that approximate posterior class probabilities. Based on this probabilistic ...
Circular backpropagation networks for classification
The class of mapping networks is a general family of tools to perform a wide variety of tasks. This paper presents a standardized, uniform representation for this class of netwo...
Backpropagation training for multilayer conditional random field based phone recognition
Conditional random fields (CRFs) have recently found increased popularity in automatic speech recognition (ASR) applications. CRFs have previously been shown to be effective com...
Links between Markov models and multilayer perceptrons
The statistical use of a particular classic form of a connectionist system, the multilayer perceptron (MLP), is described in the context of the recognition of continuous speech....
A Second-Order Perceptron Algorithm
Kernel-based linear-threshold algorithms, such as support vector machines and Perceptron-like algorithms, are among the best available techniques for solving pattern classificat...
Publication Info
- Year
- 1991
- Type
- article
- Volume
- 3
- Issue
- 4
- Pages
- 461-483
- Citations
- 985
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1162/neco.1991.3.4.461