Abstract:
Traditional maintenance methods suffer from the curse of dimensionality when dealing with partially observable, multi-state systems. To address this issue, we propose a maintenance strategy for multi-state partially observable systems based on deep reinforcement learning. First, we construct a maintenance strategy model based on a partially observable Markov decision process. We then solve this model within a deep learning framework by introducing an improved Double Deep Q-Network (DDQN) algorithm. The improved algorithm optimizes the experience replay process through prioritized experience replay and estimates the value function with deep neural networks, which resolves the low sample utilization of traditional DDQN during training and thereby improves its learning efficiency and convergence speed. To verify the effectiveness of the model and the efficiency of the improved algorithm, numerical examples are provided based on a real coke-oven-gas-to-methanol synthesis process system. The results show that the improved algorithm significantly outperforms traditional methods in terms of maintenance efficiency and system reliability, which demonstrates the validity of the model and provides decision support for the maintenance of complex systems.
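The two ingredients named in the abstract, prioritized experience replay and the Double DQN target, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the class and function names are hypothetical, and the priority exponent `alpha` and importance-sampling exponent `beta` use commonly cited default values.

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (illustrative sketch).

    Transitions are sampled with probability p_i^alpha / sum_j p_j^alpha,
    where p_i is derived from the TD error, so informative transitions are
    replayed more often than under uniform sampling.
    """

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.buffer = []       # stored transitions
        self.priorities = []   # one priority per transition
        self.pos = 0           # ring-buffer write position

    def add(self, transition, td_error=1.0):
        # Small epsilon keeps every transition sampleable.
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(priority)
        else:
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        probs = np.array(self.priorities)
        probs = probs / probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the non-uniform sampling bias,
        # normalized so the largest weight is 1.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights = weights / weights.max()
        return idx, [self.buffer[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        # Called after a training step with the freshly computed TD errors.
        for i, e in zip(idx, td_errors):
            self.priorities[i] = (abs(e) + 1e-6) ** self.alpha


def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it, reducing overestimation bias."""
    if done:
        return reward
    a_star = int(np.argmax(next_q_online))
    return reward + gamma * next_q_target[a_star]
```

In training, each sampled batch's TD errors are fed back through `update_priorities`, and the importance-sampling weights scale the per-sample loss.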