Abstract
Resampling methods are commonly used for dealing with the class‐imbalance problem. Their advantage over other methods is that they are external and thus easily transportable. Although such approaches can be very simple to implement, tuning them most effectively is not an easy task. In particular, it is unclear whether oversampling is more effective than undersampling and which oversampling or undersampling rate should be used. This paper presents an experimental study of these questions and concludes that combining different expressions of the resampling approach is an effective solution to the tuning problem. The proposed combination scheme is evaluated on imbalanced subsets of the Reuters‐21578 text collection and is shown to be quite effective for these problems.
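To make the two tuning choices named in the abstract concrete (oversampling versus undersampling, and the resampling rate), here is a minimal sketch of plain random resampling for a two-class dataset. It is an illustration only and not the paper's procedure or its combination scheme, which the abstract does not detail. The function name `resample`, its parameters, and the definition of `rate` (the fraction of the gap between class sizes that gets closed) are assumptions made for this example.

```python
import random
from collections import Counter


def resample(examples, labels, minority, mode, rate, seed=0):
    """Randomly over- or undersample a two-class dataset.

    mode="over":  duplicate minority examples (drawn with replacement).
    mode="under": discard majority examples (drawn without replacement).
    rate:         assumed definition, a value in [0, 1] giving the fraction
                  of the class-size gap to close (0 = unchanged, 1 = balanced).
    Assumes the minority class is non-empty and no larger than the majority.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]

    gap = len(majority_idx) - len(minority_idx)
    n_change = round(rate * gap)  # examples to add (over) or drop (under)

    if mode == "over":
        extra = [rng.choice(minority_idx) for _ in range(n_change)]
        keep = minority_idx + majority_idx + extra
    elif mode == "under":
        kept_majority = rng.sample(majority_idx, len(majority_idx) - n_change)
        keep = minority_idx + kept_majority
    else:
        raise ValueError("mode must be 'over' or 'under'")

    return [examples[i] for i in keep], [labels[i] for i in keep]


# Toy data: 10 minority ("pos") and 90 majority ("neg") examples.
examples = list(range(100))
labels = ["pos"] * 10 + ["neg"] * 90

over_x, over_y = resample(examples, labels, "pos", mode="over", rate=1.0)
under_x, under_y = resample(examples, labels, "pos", mode="under", rate=0.5)

print(Counter(over_y))   # 90 "pos" and 90 "neg"
print(Counter(under_y))  # 10 "pos" and 50 "neg"
```

Because the function is external to any learner, its output can be passed to an arbitrary classifier, which is the portability the abstract refers to. Training one classifier per `(mode, rate)` setting and combining their predictions would be one way to approach the combination idea, but how the paper actually combines them is not stated here.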
Related Publications
A systematic study of the class imbalance problem in convolutional neural networks
In this study, we systematically investigate the impact of class imbalance on classification performance of convolutional neural networks (CNNs) and compare frequently used meth...
Survey on deep learning with class imbalance
The purpose of this study is to examine existing deep learning techniques for addressing class imbalanced data. Effective classification with imbalanced data is an impo...
SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered "de facto" standard in the framework of learning from imbalanced data. This is due to...
Class imbalances versus small disjuncts
It is often assumed that class imbalances are responsible for significant losses of performance in standard classifiers. The purpose of this paper is to question whether cla...
A study of the behavior of several methods for balancing machine learning training data
There are several aspects that might influence the performance achieved by existing learning systems. It has been reported that one of these aspects is related to class imbalanc...
Publication Info
- Year: 2004
- Type: article
- Volume: 20
- Issue: 1
- Pages: 18-36
- Citations: 1006
- Access: Closed
Identifiers
- DOI: 10.1111/j.0824-7935.2004.t01-1-00228.x