Journal ›› 2025, Vol. 42 ›› Issue (6): 210-218. DOI: 10.11988/ckyyb.20240431

• Special Column: Multi-Objective Optimal Scheduling of Reservoir Groups •

Optimal Scheduling Method for Power Generation of Cascade Reservoirs Based on RLDE Algorithm

CHEN Jia-wen1,2, ZHU Xin1,2, TANG Zheng-yang3, SHEN Ke-yan3, CHEN Xiao-lin1,2, QIN Hui1,2

  1. School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  2. Hubei Key Laboratory of Digital Valley Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
  3. Three Gorges Cascade Dispatch and Communication Center, China Yangtze Power Co., Ltd., Yichang 443000, China
  • Received: 2024-04-29 Revised: 2024-07-03 Published: 2025-06-16 Online: 2025-06-16
  • Corresponding author:
    QIN Hui (born 1983), male, from Yicheng, Hubei; professor, PhD. Research interest: multi-objective optimal scheduling of reservoir groups.
  • About the first author:
    CHEN Jia-wen (born 1998), female, from Longyan, Fujian; master's student. Research interest: optimal scheduling of reservoir groups.
  • Funding:
    National Key Research and Development Program of China (2021YFC3200303); Major Science and Technology Project of the Ministry of Water Resources (SKS-2022120); Key Project of the Joint Fund of the Hubei Provincial Natural Science Foundation (2022CFD027); project funded by China Yangtze Power Co., Ltd. (Z242302044)

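Abstract (translated from Chinese):

Joint operation of cascade reservoirs can fully realize the comprehensive utilization value of a river basin, but optimal scheduling of cascade reservoirs is a complex system-level problem that is difficult to solve. The differential evolution (DE) algorithm is a heuristic parallel search method based on differences among population members. It has strong optimization capability and is often used to solve reservoir optimal scheduling models. However, the parameter settings and evolution strategy of the traditional DE algorithm are usually chosen empirically, which makes it prone to premature convergence or search stagnation. To address these common problems, an intelligent algorithm coupling reinforcement learning with differential evolution (RLDE) is proposed. The algorithm uses chaotic mapping to improve the quality of the initial solutions and uses the Q-learning algorithm to adjust parameters adaptively, which increases individual diversity and avoids premature convergence. In addition, because Q-learning continually interacts with the environment and receives feedback, the risk of search stagnation is greatly reduced. Application to the lower Jinsha River basin shows that, compared with the DE algorithm and the adaptive genetic algorithm (AGA), the RLDE algorithm has superior global optimization capability and robustness, can effectively solve the power generation optimal scheduling model for cascade reservoirs, and has practical engineering value.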
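The abstract does not give the details of the Q-learning parameter adaptation, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a two-state environment (whether the previous generation improved the best fitness), a small hypothetical grid of (F, CR) actions, a reward of +1 or -1, and a minimization test function in place of the power generation objective. The name `rlde_sketch` and all numeric settings are illustrative assumptions.

```python
import numpy as np

def rlde_sketch(objective, lower, upper, pop_size=20, generations=100, seed=0):
    """DE/rand/1/bin whose (F, CR) pair is picked each generation by Q-learning."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    # Assumed action set: candidate (F, CR) pairs; not taken from the paper.
    actions = [(f, cr) for f in (0.4, 0.6, 0.9) for cr in (0.3, 0.9)]
    # Assumed states: 0 = last generation stalled, 1 = last generation improved.
    q_table = np.zeros((2, len(actions)))
    alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration

    pop = rng.uniform(lower, upper, size=(pop_size, dim))
    fit = np.array([objective(x) for x in pop])
    state = 0
    history = [fit.min()]

    for _ in range(generations):
        # Epsilon-greedy choice of the control parameters for this generation.
        if rng.random() < epsilon:
            a = int(rng.integers(len(actions)))
        else:
            a = int(np.argmax(q_table[state]))
        F, CR = actions[a]

        best_before = fit.min()
        new_pop, new_fit = pop.copy(), fit.copy()
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.choice(others, size=3, replace=False)
            mutant = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), lower, upper)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True  # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            f_trial = objective(trial)
            if f_trial <= fit[i]:  # greedy selection (minimization)
                new_pop[i], new_fit[i] = trial, f_trial
        pop, fit = new_pop, new_fit

        # Feedback from the "environment": did the best fitness improve?
        improved = fit.min() < best_before
        reward = 1.0 if improved else -1.0
        next_state = int(improved)
        td_target = reward + gamma * q_table[next_state].max()
        q_table[state, a] += alpha * (td_target - q_table[state, a])
        state = next_state
        history.append(fit.min())

    return pop[np.argmin(fit)], fit.min(), history, q_table
```

For example, `rlde_sketch(lambda x: float(np.sum(x ** 2)), [-5.0] * 3, [5.0] * 3)` runs the sketch on a three-dimensional sphere function. A maximization problem such as power generation could be handled by negating the objective.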

Abstract:

[Objective] To address the shortcomings of the differential evolution (DE) algorithm in cascade reservoir optimization, this study proposes RLDE, an intelligent algorithm that couples reinforcement learning with differential evolution. [Methods] The RLDE algorithm improved the standard DE algorithm through three strategies. (1) Chaotic mapping was used to improve initial solution quality: the logistic map, which performed best in experiments, was selected for population initialization. (2) Parameters were adjusted adaptively by the Q-learning algorithm. (3) A variable step-size strategy was designed for the actions in the Q-table, so that the precision of the action rows gradually increased with the number of iterations. To validate its feasibility and effectiveness, the RLDE algorithm was applied to the power generation scheduling model for the four major cascade reservoirs on the lower Jinsha River (Wudongde, Baihetan, Xiluodu, and Xiangjiaba). [Results] (1) The chaotic initialization strategy effectively improved initial solution quality. The Q-learning-based adaptive parameter adjustment let the algorithm adapt continuously using feedback from the environment, which enhanced population diversity and greatly mitigated the premature convergence and evolutionary stagnation found in the traditional DE algorithm, thereby improving optimization performance. The variable step-size strategy allowed the algorithm to respond better to environmental feedback, further strengthening its optimization capability. (2) Compared with the traditional DE algorithm and the adaptive genetic algorithm (AGA), the RLDE algorithm increased average annual power generation by 2.02% and 2.06%, respectively, under three typical inflow scenarios (wet, normal, and dry). Moreover, its average standard deviation over multiple runs was 729 million kW·h lower than that of the traditional DE algorithm and 844 million kW·h lower than that of the adaptive genetic algorithm. [Conclusions] This study proposes an intelligent algorithm that integrates reinforcement learning with differential evolution and effectively addresses premature convergence and search stagnation in the traditional DE algorithm. The proposed method provides an efficient and reliable solution for the optimal scheduling of cascade reservoirs.
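As an illustration of the logistic-map initialization mentioned under [Methods], here is a minimal sketch. It assumes the common logistic map x ← μx(1 − x) with μ = 4, one chaotic sequence per decision variable, and a short burn-in. The function name `logistic_init`, the burn-in length, and the seeding range are assumptions for illustration and may differ from the paper's settings.

```python
import numpy as np

def logistic_init(pop_size, lower, upper, mu=4.0, burn_in=50, seed=0):
    """Build a DE population from logistic-map sequences scaled into [lower, upper]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    rng = np.random.default_rng(seed)

    # One chaotic state per decision variable, started away from 0 and 1.
    x = rng.uniform(0.05, 0.95, size=lower.shape)
    for _ in range(burn_in):
        x = mu * x * (1.0 - x)  # discard the transient part of the sequence

    pop = np.empty((pop_size, lower.size))
    for i in range(pop_size):
        x = mu * x * (1.0 - x)               # next point of the chaotic sequence
        pop[i] = lower + x * (upper - lower)  # map from [0, 1] to the bounds
    return pop
```

Calling `logistic_init(20, [0.0, 10.0], [1.0, 20.0])` (hypothetical bounds) returns a 20 × 2 array whose rows could replace the uniform random initial population of a DE run.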

Key words: cascade reservoirs, optimal scheduling, differential evolution, reinforcement learning, adaptive parameter adjustment
