Abstract
Overfitting is a fundamental issue in supervised machine learning that prevents models from generalizing well: a model fits the observed training data closely but performs poorly on unseen test data. Overfitting arises from the presence of noise, the limited size of the training set, and the complexity of the classifier. This paper discusses overfitting from the perspectives of its causes and its solutions. To reduce the effects of overfitting, various strategies are proposed to address these causes: 1) the "early-stopping" strategy prevents overfitting by halting training before performance stops improving; 2) the "network-reduction" strategy excludes the noise in the training set; 3) the "data-expansion" strategy fine-tunes the hyper-parameter sets of complicated models with large amounts of data; and 4) the "regularization" strategy guarantees model performance to a great extent on real-world problems through feature selection, distinguishing more useful features from less useful ones.
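The "early-stopping" strategy (1) can be sketched as a patience rule on the validation loss: keep training while the loss improves, and halt once it has failed to improve for a fixed number of epochs. The function name, the patience value, and the precomputed loss curve below are illustrative assumptions, not from the paper:

```python
def early_stopping_train(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) under a patience-based early-stopping rule.

    val_losses: per-epoch validation losses (precomputed here for the sketch;
    in practice each value would come from evaluating the model after an epoch).
    Training halts once the loss has not improved for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # Stop: no improvement for `patience` epochs; restore best_epoch weights.
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Typical overfitting curve: validation loss falls, then rises again.
losses = [1.0, 0.7, 0.5, 0.45, 0.47, 0.50, 0.55, 0.60]
stop_epoch, best_epoch = early_stopping_train(losses)
# Training stops at epoch 6; the best checkpoint is epoch 3 (loss 0.45).
```

In practice the model weights from `best_epoch` are kept, so the returned model is the one taken just before the validation loss began to degrade.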
Publication Info
- Year: 2019
- Type: article
- Volume: 1168
- Pages: 022022
- Citations: 2055
- Access: Closed
Identifiers
- DOI: 10.1088/1742-6596/1168/2/022022