Abstract:
The large-scale integration of renewable energy sources into distribution systems has significantly increased node voltage fluctuations. Traditional reactive power voltage control methods suffer from high computational complexity, reliance on precise model parameters, and difficulty responding to load disturbances in real time. To address these issues, we propose a two-stage deep reinforcement learning method based on adversarial proximal policy optimization (APPO) to achieve real-time reactive power voltage control in distribution networks. First, we formulate the reactive power voltage control problem as an adversarial Markov decision process (AMDP) with a main agent and an adversarial agent: the main agent optimizes the control policy to cope with the uncertainty introduced by renewable energy integration, while the adversarial agent simulates extreme load disturbances to strengthen the main agent's robustness to disturbances. Then, we propose a two-stage training framework combining adversarial training and real-time adaptation: in the adversarial training stage, adversarial load disturbances improve the generalization of the main agent's policy; in the real-time adaptation stage, the main agent starts from the network parameters obtained in adversarial training and quickly adapts to the live environment. Finally, we verify the effectiveness of the proposed method through simulations on a modified IEEE 33-bus distribution network. The results show that the method significantly reduces network losses and the voltage violation rate, and improves the robustness and adaptability of the system under disturbances.
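The paper's full method runs against power-flow simulation on the modified IEEE 33-bus network; as a rough illustration of the two-stage structure described above, the following Python/PyTorch sketch pairs a main agent with an adversarial agent on a one-step linearized voltage model. Everything here (the `step` surrogate, the sensitivity matrices `S_Q` and `S_D`, the single-step episodes, and all hyperparameters) is an illustrative assumption, not the paper's implementation.

```python
# Minimal sketch of two-stage adversarial PPO training (assumption: a
# hypothetical one-step linearized voltage model stands in for the full
# distribution network; S_Q and S_D are made-up sensitivity matrices).
import torch
import torch.nn as nn

torch.manual_seed(0)
N_BUS = 6           # toy network size (hypothetical)
V_REF = 1.0         # per-unit voltage reference

# Hypothetical linearized sensitivities of bus voltages to reactive
# injections (main agent) and to load perturbations (adversarial agent).
S_Q = 0.05 * torch.rand(N_BUS, N_BUS)
S_D = 0.08 * torch.rand(N_BUS, N_BUS)

def step(load, q, d):
    """One-step surrogate of the grid: LinDistFlow-style voltage response."""
    v = V_REF - S_D @ (load + d) + S_Q @ q
    dev = (v - V_REF).abs()
    # Main agent's reward: penalize voltage deviation plus a loss proxy.
    # The game is zero-sum, so the adversary receives the negated reward.
    r_main = -(dev.sum() + 0.1 * q.pow(2).sum())
    return r_main, -r_main

class GaussianPolicy(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.mu = nn.Sequential(nn.Linear(n_in, 64), nn.Tanh(), nn.Linear(64, n_out))
        self.log_std = nn.Parameter(torch.zeros(n_out) - 0.5)
    def dist(self, s):
        return torch.distributions.Normal(self.mu(s), self.log_std.exp())

def ppo_update(policy, opt, states, actions, old_logp, adv, clip=0.2, iters=10):
    """PPO clipped-surrogate update on a batch of one-step transitions."""
    for _ in range(iters):
        logp = policy.dist(states).log_prob(actions).sum(-1)
        ratio = (logp - old_logp).exp()
        surr = torch.min(ratio * adv, ratio.clamp(1 - clip, 1 + clip) * adv)
        opt.zero_grad(); (-surr.mean()).backward(); opt.step()

main, adversary = GaussianPolicy(N_BUS, N_BUS), GaussianPolicy(N_BUS, N_BUS)
opt_m = torch.optim.Adam(main.parameters(), lr=3e-4)
opt_a = torch.optim.Adam(adversary.parameters(), lr=3e-4)

def rollout(batch, use_adversary):
    loads = 0.5 + 0.2 * torch.rand(batch, N_BUS)   # random load scenarios
    with torch.no_grad():
        pi_m = main.dist(loads); q = pi_m.sample(); lp_m = pi_m.log_prob(q).sum(-1)
        if use_adversary:
            pi_a = adversary.dist(loads); d = pi_a.sample(); lp_a = pi_a.log_prob(d).sum(-1)
        else:
            d = torch.zeros_like(loads); lp_a = torch.zeros(batch)
    r = torch.stack([step(loads[i], q[i], d[i])[0] for i in range(batch)])
    return loads, q, lp_m, d, lp_a, r

# Stage 1: adversarial training -- alternate PPO updates of the two agents.
for it in range(200):
    s, q, lp_m, d, lp_a, r = rollout(64, use_adversary=True)
    adv_m = (r - r.mean()) / (r.std() + 1e-8)      # baseline-normalized advantage
    ppo_update(main, opt_m, s, q, lp_m, adv_m)
    ppo_update(adversary, opt_a, s, d, lp_a, -adv_m)  # adversary minimizes reward

# Stage 2: real-time adaptation -- reuse the stage-1 parameters and
# fine-tune the main agent alone on undisturbed (nominal) load scenarios.
for it in range(50):
    s, q, lp_m, _, _, r = rollout(64, use_adversary=False)
    adv_m = (r - r.mean()) / (r.std() + 1e-8)
    ppo_update(main, opt_m, s, q, lp_m, adv_m)
print("final mean reward:", r.mean().item())
```

The sketch keeps only the structural ideas from the abstract: a zero-sum AMDP in which the adversary shapes worst-case load disturbances during stage one, and a stage-two fine-tuning phase that inherits the stage-one parameters.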