An Online Efficient Smart Home Energy Management Method Integrating Prediction and Learning

Abstract: Residential buildings account for a significant share of global energy consumption, and smart home energy management systems (HEMS) are a key means of reducing it. Existing HEMS methods, however, have notable limitations: model-based approaches rely on accurate system models or parameter predictions, while deep reinforcement learning-based methods avoid those requirements but suffer from low training sample efficiency, which leads to sub-optimal policy performance. To address these issues, this paper proposes an online, efficient smart home energy management method that integrates prediction and learning. First, the minimization of smart home operation cost is formulated as an optimization problem that accounts for multi-source uncertainties and guarantees indoor thermal comfort. Second, the problem is reformulated as a Markov decision process (MDP). Third, a model predictive control (MPC) framework is embedded into the training of the HEMS agent, constructing an implicit world model that represents the system dynamics of smart home operation. The agent interacts with this implicit world model in a rolling manner, which improves sample efficiency. Once trained, the HEMS agent makes online decisions without any parameter prediction information. Simulation results show that, while maintaining indoor thermal comfort, the proposed method reduces operation cost by 20.57% and 7.92% compared with a rule-based energy management method and a deep deterministic policy gradient (DDPG)-based energy management method, respectively.
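The abstract does not spell out the MDP. As a minimal sketch only, the idea of trading off operation cost against thermal comfort can be illustrated with a per-step reward of the kind often used in HEMS reinforcement learning. Every name, unit, default comfort bound, and the penalty form below is an assumption made for illustration, not the paper's formulation.

```python
from dataclasses import dataclass


@dataclass
class StepInputs:
    """Hypothetical quantities observed at one control step (all assumed)."""
    price: float            # electricity price, currency per kWh
    grid_energy: float      # net energy exchanged with the grid, kWh (negative = export)
    indoor_temp: float      # indoor temperature, deg C
    temp_min: float = 19.0  # assumed lower comfort bound, deg C
    temp_max: float = 24.0  # assumed upper comfort bound, deg C


def comfort_violation(temp: float, lo: float, hi: float) -> float:
    """Degrees by which temp lies outside the band [lo, hi]; 0 inside it."""
    return max(lo - temp, 0.0) + max(temp - hi, 0.0)


def step_reward(x: StepInputs, penalty_weight: float = 1.0) -> float:
    """Negative of (energy cost + weighted comfort violation).

    Assumes exported energy is paid at the same price as imported energy.
    """
    energy_cost = x.price * x.grid_energy
    violation = comfort_violation(x.indoor_temp, x.temp_min, x.temp_max)
    return -(energy_cost + penalty_weight * violation)


if __name__ == "__main__":
    inside = StepInputs(price=0.2, grid_energy=5.0, indoor_temp=22.0)
    too_hot = StepInputs(price=0.2, grid_energy=5.0, indoor_temp=26.0)
    print(step_reward(inside))
    print(step_reward(too_hot, penalty_weight=0.5))
```

In this sketch the penalty weight sets how strongly a comfort violation is punished relative to cost; a hard comfort constraint, as the abstract's wording suggests, would need a different treatment.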

     
