Abstract

For many types of machine learning algorithms, one can compute the statistically 'optimal' way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.
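The selection criterion the abstract refers to chooses training points that minimize the learner's expected predictive variance. As a minimal illustrative sketch (not the paper's implementation), the idea can be shown for a simple linear-in-parameters regression learner: greedily pick, from a candidate pool, the input whose inclusion most reduces the average predictive variance over a reference grid. All names (`pool`, `grid`, the polynomial feature map, the ridge term) are hypothetical choices for this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool of candidate inputs, and a reference grid over which
# we average the learner's predictive variance.
pool = rng.uniform(-3, 3, size=(200, 1))
grid = np.linspace(-3, 3, 50).reshape(-1, 1)

def features(x):
    # Simple polynomial features for a linear-in-parameters learner.
    return np.hstack([np.ones_like(x), x, x ** 2])

def avg_predictive_variance(X, ref, ridge=1e-3):
    # For least-squares regression, Var[y_hat(x)] is proportional to
    # phi(x)^T (Phi^T Phi)^{-1} phi(x); average this over the reference set.
    # A small ridge term keeps the matrix invertible early on (an assumption
    # of this sketch, not part of the paper's derivation).
    A = np.linalg.inv(X.T @ X + ridge * np.eye(X.shape[1]))
    Phi = features(ref)
    return np.mean(np.sum((Phi @ A) * Phi, axis=1))

# Greedy active selection: repeatedly add the candidate whose inclusion
# minimizes the average predictive variance over the reference grid.
chosen = [pool[0]]
for _ in range(9):
    X = features(np.vstack(chosen))
    scores = [
        avg_predictive_variance(
            np.vstack([X, features(c.reshape(1, -1))]), grid
        )
        for c in pool
    ]
    chosen.append(pool[int(np.argmin(scores))])

active_var = avg_predictive_variance(features(np.vstack(chosen)), grid)
random_var = avg_predictive_variance(features(pool[:10]), grid)
print(float(active_var), float(random_var))
```

With the same number of training points, the actively selected set typically yields a lower average predictive variance than an arbitrary subset of the pool, which is the effect the abstract reports empirically.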

Keywords

Computer science, Machine learning, Artificial intelligence, Artificial neural network, Regression, Selection (genetic algorithm), Feedforward neural network, Training set, Mathematics, Statistics

Publication Info

Year: 1996
Type: article
Volume: 4
Pages: 129-145
Citations: 1241
Access: Closed

Cite This

David Cohn, Zoubin Ghahramani, Michael I. Jordan (1996). Active Learning with Statistical Models. Journal of Artificial Intelligence Research, 4, 129-145. https://doi.org/10.1613/jair.295

Identifiers

DOI
10.1613/jair.295