Equipment Modeling Method for the Combustion System of a Thermal Power Plant Based on Distributed Random Forest

Abstract: This paper proposes a data-driven modeling method for combustion system equipment in thermal power plants, based on a random forest algorithm improved with distributed frameworks. Stepwise regression, refined with a multicollinearity test, is first used to select the optimal variables from the industrial process. The processed variable data are then loaded onto the Hadoop platform, where the traditional random forest algorithm is parallelized using the MapReduce and Spark distributed frameworks. The results show that the Hadoop-based distributed random forest algorithm effectively improves training efficiency and data processing speed, and that the resulting model has high accuracy, strong generalization ability, and considerable value for industrial research.
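The parallelization idea described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' implementation and uses neither Hadoop nor Spark. It only shows why random forest training fits a map/reduce pattern: each tree is fitted independently on its own bootstrap sample (the map step), and predictions are averaged afterwards (the reduce step). The function names (`fit_stump`, `train_tree`, `train_forest`, `predict`), the use of depth-1 trees, and the choice of half the features per tree are all assumptions made for brevity.

```python
import random
from statistics import mean


def fit_stump(rows, features):
    """Fit a depth-1 regression tree over the given feature indices.

    rows is a list of (x, y) pairs, where x is a tuple of numbers and y a float.
    Returns (feature, threshold, left_mean, right_mean).
    """
    best = None
    for f in features:
        for t in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            # Sum of squared errors of the two leaves for this split.
            sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:
        # No valid split exists: fall back to a constant prediction.
        m = mean([y for _, y in rows])
        return (features[0], float("inf"), m, m)
    return best[1:]


def train_tree(task):
    """Map step: train one tree from its own bootstrap sample.

    Each task is self-contained, so tasks could run on separate workers.
    """
    rows, n_features, seed = task
    rng = random.Random(seed)
    sample = [rng.choice(rows) for _ in rows]      # bootstrap with replacement
    k = max(1, n_features // 2)                    # assumed feature-subset size
    features = rng.sample(range(n_features), k)    # random feature subset
    return fit_stump(sample, features)


def train_forest(rows, n_features, n_trees, seed=0):
    """Build independent tasks and map them; the built-in map stands in for a cluster."""
    tasks = [(rows, n_features, seed + i) for i in range(n_trees)]
    return list(map(train_tree, tasks))


def predict(forest, x):
    """Reduce step: average the predictions of all trees."""
    return mean(lm if x[f] <= t else rm for f, t, lm, rm in forest)
```

Because `train_tree` depends only on its own task tuple, the built-in `map` could in principle be replaced by a distributed map over worker nodes without changing the per-tree logic. That independence is what makes the training step straightforward to parallelize.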