Abstract:
Causality analysis is an important research topic in the field of data mining, but traditional Granger causality models have difficulty accurately identifying nonlinear causality in multivariable systems. We propose a novel Granger causality analysis method based on the Hilbert-Schmidt independence criterion (HSIC) and group Lasso, referred to as the HSIC-GL model. First, we use the HSIC to map the input and output samples into a reproducing kernel Hilbert space, which overcomes the inability of the traditional Granger causality model to handle nonlinear systems. Then, we establish a regression model with group Lasso constraints, which implements a causality analysis between multivariate and group-derived variables. The Bayesian information criterion is used for model selection, which avoids manual setting of the lag order and regularization parameters. Finally, based on the regression coefficients and the significance test results of the HSIC-GL model, a nonlinear causality analysis is performed on the multivariable time series. The effectiveness of the proposed method is verified by simulations of nonlinear and chaotic systems. We also successfully applied the method to air quality index and meteorological time series from Shenyang, China.
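
To make the role of the HSIC concrete, the following is a minimal sketch of the standard biased empirical HSIC estimator, tr(KHLH)/(n-1)^2, computed from Gram matrices in a reproducing kernel Hilbert space. It is not the HSIC-GL model itself: the Gaussian kernel, the bandwidth `sigma`, and the function names `gaussian_gram` and `hsic` are assumptions chosen for illustration, since the abstract does not specify them, and the group Lasso regression and BIC-based model selection are omitted.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    """Gaussian-kernel Gram matrix for n samples (kernel choice is an assumption)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)            # treat each row as one sample
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)                 # clip tiny negatives from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    K = gaussian_gram(x, sigma)
    L = gaussian_gram(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(200)
    y_dep = x ** 2 + 0.1 * rng.standard_normal(200)   # nonlinear dependence on x
    y_ind = rng.standard_normal(200)                  # independent of x
    print("HSIC(x, x^2 + noise):", hsic(x, y_dep))
    print("HSIC(x, independent):", hsic(x, y_ind))
```

In this toy setting, `y_dep` is nearly uncorrelated with `x` in the linear sense yet strongly dependent on it, which is the kind of relationship a linear Granger model tends to miss and a kernel dependence measure is designed to detect.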