Oversampling Algorithms for Gaussian-type Data Based on Generative Adversarial Networks
-
Abstract
To address the loss of classification performance caused by classifiers favoring the majority class on imbalanced data, we propose a Monte Carlo oversampling algorithm based on generative adversarial networks (GANs). First, we use a GAN to model the probability density function of the minority class and determine the oversampling weight of each minority sample from its probability density value. Second, to ensure the diversity of the generated data, we oversample the minority class with a Monte Carlo algorithm. At the same time, to avoid crossover and overlap with the majority class, we introduce the 3σ rule to flip the data of the minority class into the 3σ interval of the majority class, which balances the dataset. Finally, we evaluate the algorithm on seven datasets from the UCI and KEEL repositories, using a decision tree as the base classifier. The experimental results show that the proposed algorithm outperforms the comparison algorithms.
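The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a per-feature Gaussian fit stands in for the GAN-based density model, the Monte Carlo step is approximated by weighted resampling with small Gaussian perturbations, and the 3σ step only flags generated points that fall inside the majority class's 3σ interval rather than flipping them. The function names and the `noise_scale` and `seed` parameters are hypothetical.

```python
import numpy as np


def oversampling_weights(X_min):
    """Normalized density-based weights for the minority samples.

    Assumption: an independent per-feature Gaussian replaces the
    GAN-estimated probability density described in the abstract.
    """
    mean = X_min.mean(axis=0)
    std = X_min.std(axis=0) + 1e-12  # guard against zero variance
    z = (X_min - mean) / std
    density = np.exp(-0.5 * (z ** 2).sum(axis=1)) / np.prod(std * np.sqrt(2 * np.pi))
    return density / density.sum()


def weighted_oversample(X_min, X_maj, n_new, noise_scale=0.1, seed=0):
    """Draw n_new synthetic minority points and flag 3-sigma overlap.

    Returns the synthetic samples and a boolean mask that is True where
    a sample lies within 3 standard deviations of the majority-class
    mean in every feature.
    """
    rng = np.random.default_rng(seed)
    weights = oversampling_weights(X_min)

    # Monte Carlo step (simplified): pick seed points by weight, then
    # perturb them so the generated data are not exact duplicates.
    idx = rng.choice(len(X_min), size=n_new, p=weights)
    std = X_min.std(axis=0) + 1e-12
    noise = rng.normal(0.0, noise_scale * std, size=(n_new, X_min.shape[1]))
    samples = X_min[idx] + noise

    # 3-sigma rule: detect samples inside the majority class's interval.
    maj_mean = X_maj.mean(axis=0)
    maj_std = X_maj.std(axis=0) + 1e-12
    inside = np.all(np.abs(samples - maj_mean) <= 3 * maj_std, axis=1)
    return samples, inside
```

A caller would typically set `n_new` to the difference between the majority and minority class sizes, then decide how to treat the flagged samples before training the base classifier.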
-
-