Abstract
To address several evident inherent drawbacks of typical deep reinforcement learning algorithms, this paper proposes an improved deep reinforcement learning algorithm and designs a power market monitoring model based on it. An agent mechanism is introduced: the agent performs actions and feeds the current reward and future rewards back to the environment to simulate the policy network, and a multiple Q-network mechanism is introduced into the finite Markov decision process to implement the deep value network. Taking a power company of State Grid as the performance evaluation case, a validation environment was developed on Google's TensorFlow 1.2.1 and OpenAI's Gym 0.9.2, and the model was evaluated empirically. The simulation results show that the proposed model can handle the multi-dimensional, fluctuating, nonlinear power market monitoring and forecasting task in a relatively short time, and that it has clear advantages in stability, monitoring autonomy, prediction accuracy, and model performance in adversarial environments.
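The abstract does not spell out the multiple Q-network mechanism. As a rough illustration only, the sketch below uses tabular double Q-learning, a swapped-in stand-in rather than the authors' deep network, on a hypothetical five-state chain that plays the role of a finite Markov decision process. The environment, the names `step`, `choose`, `train`, and `greedy_policy`, and all hyperparameters are assumptions made for this example; it uses only the Python standard library, not TensorFlow or Gym.

```python
import random

N_STATES = 5          # hypothetical chain of states; state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right (toy actions)
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.3   # assumed hyperparameters


def step(state, action):
    """Toy deterministic environment: reward 1.0 only on reaching the last state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def choose(q_a, q_b, state, rng):
    """Epsilon-greedy on the sum of both tables, breaking ties at random."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    totals = [q_a[state][a] + q_b[state][a] for a in ACTIONS]
    best = max(totals)
    return rng.choice([a for a in ACTIONS if totals[a] == best])


def train(episodes=2000, max_steps=100, seed=0):
    rng = random.Random(seed)
    q_a = [[0.0, 0.0] for _ in range(N_STATES)]
    q_b = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose(q_a, q_b, state, rng)
            nxt, reward, done = step(state, action)
            # Randomly pick which table learns; the other table evaluates
            # the action that the learning table considers best.
            learn, other = (q_a, q_b) if rng.random() < 0.5 else (q_b, q_a)
            if done:
                target = reward
            else:
                best_next = max(ACTIONS, key=lambda a: learn[nxt][a])
                target = reward + GAMMA * other[nxt][best_next]
            learn[state][action] += ALPHA * (target - learn[state][action])
            if done:
                break
            state = nxt
    return q_a, q_b


def greedy_policy(q_a, q_b):
    """Greedy action per non-terminal state, using the sum of both tables."""
    return [max(ACTIONS, key=lambda a: q_a[s][a] + q_b[s][a])
            for s in range(N_STATES - 1)]


q_a, q_b = train()
policy = greedy_policy(q_a, q_b)
print(policy)
```

Separating action selection (one table) from action evaluation (the other table) is the usual motivation for keeping more than one Q estimate: it reduces the overestimation that a single max-based estimate tends to produce. On this toy chain the learned greedy policy is expected to prefer action 1 (move toward the rewarding terminal state) in every non-terminal state.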
Authors
Xu Yangzi (许杨子); Qiang Wen (强文); Liu Jun (刘俊); Sun Hongyan (孙鸿雁); Hu Chenggang (胡成刚)
State Grid Corporation Shaanxi Electric Power Company Electric Power Trading Center, Xi'an 710004, China; Sichuan Province Zhongdian Qimingxing Information Technology Co., Ltd., Chengdu 610041, China
Source
Foreign Electronic Measurement Technology (《国外电子测量技术》)
2020, No. 1, pp. 82-87 (6 pages)
Funding
Supported by the Science and Technology Project of China Southern Power Grid Co., Ltd. (GZHKJXM20160055).