Abstract:
Time series data in many fields suffer from excessively high dimensionality and severe multicollinearity among variables. To address this, we propose a correlation variable selection partial least squares regression (CVS-PLSR) modeling algorithm. The algorithm uses correlation-based feature selection to obtain an optimal feature subset, which reduces data dimensionality. To mitigate the adverse effects of multicollinearity, we adopt partial least squares regression (PLSR) as the core modeling algorithm. We verify the proposed algorithm on pulp element-grade prediction data. Among the compared models, the one obtained by CVS-PLSR is the simplest: its root mean square error of prediction is only 1.6902, and the correlation between predicted and measured values exceeds 97%. Simulation and comparison of the model evaluation indices show that the proposed algorithm is practical and robust, and that it yields a simpler and more accurate model than the other algorithms compared.
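The two-stage idea (select a low-redundancy feature subset, then fit PLSR on it) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the selection criterion, so the sketch assumes the standard correlation-based feature selection merit with a greedy forward search, a single-response NIPALS PLSR, and synthetic data in place of the pulp-grade data. All function and variable names (`cfs_merit`, `select_features`, `pls1_fit`, `pls1_predict`) are hypothetical.

```python
import numpy as np


def cfs_merit(corr_fc, corr_ff, subset):
    """Assumed CFS merit: rewards relevance to y, penalizes redundancy."""
    k = len(subset)
    r_cf = corr_fc[subset].mean()               # mean |feature-target| correlation
    if k == 1:
        return r_cf
    sub = corr_ff[np.ix_(subset, subset)]
    r_ff = (sub.sum() - k) / (k * (k - 1))      # mean off-diagonal |feature-feature| correlation
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def select_features(X, y):
    """Greedy forward search; stops when the merit no longer improves."""
    n_features = X.shape[1]
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    corr_fc, corr_ff = corr[:-1, -1], corr[:-1, :-1]
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        merits = [cfs_merit(corr_fc, corr_ff, selected + [j]) for j in candidates]
        top = int(np.argmax(merits))
        if merits[top] <= best:
            break
        selected.append(candidates[top])
        best = merits[top]
    return selected


def pls1_fit(X, y, n_components):
    """Single-response PLSR via NIPALS; returns regression coefficients and means."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)                  # weight vector
        t = Xk @ w                              # score vector
        p = Xk.T @ t / (t @ t)                  # X loading
        c = yk @ t / (t @ t)                    # y loading
        Xk = Xk - np.outer(t, p)                # deflate X
        yk = yk - c * t                         # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean


# Synthetic stand-in data (an assumption): two latent factors, collinear copies, noise columns.
rng = np.random.default_rng(0)
n = 200
z1, z2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [z1 + 0.1 * rng.normal(size=n) for _ in range(3)]
    + [z2 + 0.1 * rng.normal(size=n) for _ in range(2)]
    + [rng.normal(size=n) for _ in range(5)]
)
y = z1 - z2 + 0.1 * rng.normal(size=n)
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]

selected = select_features(X_train, y_train)
model = pls1_fit(X_train[:, selected], y_train, n_components=min(2, len(selected)))
y_pred = pls1_predict(model, X_test[:, selected])

rmsep = np.sqrt(np.mean((y_test - y_pred) ** 2))   # root mean square error of prediction
r = np.corrcoef(y_test, y_pred)[0, 1]              # predicted vs. measured correlation
print("selected features:", selected)
print("RMSEP:", rmsep, "correlation:", r)
```

The sketch reports the same two evaluation indices named in the abstract, RMSEP and the predicted-versus-measured correlation. The values it prints come from the synthetic data and are not comparable to the reported results.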