Research on multi-objective optimization of computing power networks based on constrained policy reinforcement learning

Abstract: A computing power network must maximize system performance indicators while meeting users' service requirements. Existing methods mainly convert the problem through multi-objective weighting before solving it, which makes the hyperparameters hard to determine and limits applicability across scenarios. Building on an analysis of the characteristics of computing power network objectives, the proposed approach applies constrained policy reinforcement learning, treating service requirements as policy constraints and system performance indicators as the optimization objective. A multi-level value-policy-hyperparameter iteration guarantees users' service requirements in expectation while optimizing system performance. A multi-scale step length (MSL) method for hyperparameter optimization is also studied, which further improves the stability and accuracy of the system. Simulation results show that the proposed method converges well and remains stable in single-terminal single-edge-server and multi-terminal multi-edge-server scenarios and under changes in system load.
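The contrast with fixed-weight methods can be illustrated with a toy example. The sketch below is not the paper's algorithm: it is a generic Lagrangian primal-dual update on a one-dimensional problem, and every name and number in it (the decision variable `x`, the performance function, the limit 0.5, the step sizes, and the coarse-then-fine schedule standing in for a multi-scale step length) is a hypothetical choice made for illustration only. It shows how a constraint multiplier can be adjusted automatically instead of being fixed by hand as an objective weight.

```python
def primal_dual_sketch(coarse_iters=200, fine_iters=800,
                       policy_step=0.05, coarse_step=0.5, fine_step=0.05):
    """Toy constrained optimization via Lagrangian primal-dual updates.

    Everything here is a hypothetical illustration, not the paper's method.
    """
    limit = 0.5  # hypothetical service requirement: x must not exceed this
    x = 0.0      # hypothetical decision variable (e.g. an offloading ratio)
    lam = 0.0    # Lagrange multiplier, learned rather than hand-tuned

    for it in range(coarse_iters + fine_iters):
        # Primal ascent on L(x, lam) = f(x) - lam * (x - limit),
        # with a toy performance function f(x) = -(x - 0.8)^2.
        grad_f = -2.0 * (x - 0.8)
        x = min(1.0, max(0.0, x + policy_step * (grad_f - lam)))

        # Dual update: lam grows while the constraint is violated and
        # shrinks (down to zero) while it is satisfied. A coarse step is
        # used first, then a fine step (assumed schedule).
        dual_step = coarse_step if it < coarse_iters else fine_step
        lam = max(0.0, lam + dual_step * (x - limit))

    return x, lam


if __name__ == "__main__":
    x, lam = primal_dual_sketch()
    print(f"x = {x:.4f}, lam = {lam:.4f}")
```

In this toy problem the unconstrained optimum (x = 0.8) violates the limit, so the iteration settles at the constraint boundary and the multiplier converges to the value that balances the two terms, with no weight chosen in advance.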

Authors: SHEN Linjiang (沈林江); CAO Chang (曹畅); CUI Chao (崔超); ZHANG Yan (张岩) (Inspur Communication Information System Co., Ltd., Jinan 250100, China; Research Institute of China United Network Communications Co., Ltd., Beijing 100048, China)

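Affiliations: Inspur Communication Information System Co., Ltd.; Research Institute of China United Network Communications Co., Ltd.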

Source: Telecommunications Science (电信科学), 2023, No. 8, pp. 136-148 (13 pages)

Keywords: computing power network; multi-objective optimization; reinforcement learning

Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
