Abstract:
This study proposes an oversampling algorithm based on a time series model to rebalance imbalanced data. First, a method for converting deterministic data into random data is proposed, through which the minority-class data are converted into a time series. Second, a stationarity test is performed on the time series obtained from the minority class, and the series is processed to make it stationary. Third, a suitable time series model is fitted to the stationary series and used to forecast new minority-class samples, which balances the dataset. Lastly, six datasets are selected from the UCI and KEEL repositories, and the proposed algorithm is compared with other common oversampling algorithms. A decision tree classifier is used in the classification experiments, and the results are assessed with evaluation indicators. The results show that the proposed algorithm is effective.
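The three steps can be illustrated with a minimal sketch. It is not the algorithm of this study, because the abstract does not specify the conversion method, the stationarity test, or the model family. The sketch makes three assumptions: each feature column of the minority class, read in row order, is treated as one series; a crude threshold on the lag-1 coefficient stands in for a formal stationarity test; and an AR(1) model fitted by least squares stands in for the selected time series model. All function names (`fit_ar1`, `simulate_series`, `oversample_minority`) are hypothetical.

```python
import random


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t].

    Returns (c, phi, sigma), where sigma is the residual standard deviation.
    Requires at least two observations.
    """
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var_prev = sum((a - mean_prev) ** 2 for a in x_prev)
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    phi = cov / var_prev if var_prev > 0 else 0.0
    c = mean_next - phi * mean_prev
    resid = [b - (c + phi * a) for a, b in zip(x_prev, x_next)]
    sigma = (sum(r * r for r in resid) / n) ** 0.5
    return c, phi, sigma


def simulate_series(series, n_new, rng, threshold=0.95):
    """Generate n_new values that continue `series`.

    Assumption: |phi| < threshold is used as a crude stationarity check
    in place of a formal unit-root test.
    """
    c, phi, sigma = fit_ar1(series)
    if abs(phi) < threshold:
        out, last = [], series[-1]
        for _ in range(n_new):
            last = c + phi * last + rng.gauss(0.0, sigma)
            out.append(last)
        return out
    # Treated as non-stationary: difference once, model the differences,
    # then integrate back. Only one round of differencing is attempted.
    diffs = [b - a for a, b in zip(series, series[1:])]
    c, phi, sigma = fit_ar1(diffs)
    out, level, last_d = [], series[-1], diffs[-1]
    for _ in range(n_new):
        last_d = c + phi * last_d + rng.gauss(0.0, sigma)
        level += last_d
        out.append(level)
    return out


def oversample_minority(rows, n_new, seed=0):
    """Return n_new synthetic minority rows (each column modeled independently)."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*rows)]
    new_columns = [simulate_series(col, n_new, rng) for col in columns]
    return [list(row) for row in zip(*new_columns)]
```

Calling `oversample_minority(minority_rows, n_new)` with `n_new` set to the gap between the majority and minority class sizes yields a balanced training set. Modeling each feature independently ignores correlations between features, so this is only a starting point for experimentation.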