Title | Evolving Constrained Reinforcement Learning Policy |
Authors | |
Corresponding Author | Jialin Liu |
DOI | |
Publication Date | 2023 |
Conference Name | International Joint Conference on Neural Networks (IJCNN) |
ISSN | 2161-4393 |
ISBN | 978-1-6654-8868-6 |
Proceedings Title | |
Pages | 1-8 |
Conference Date | 18-23 June 2023 |
Conference Location | Gold Coast, Australia |
Place of Publication | 345 E 47TH ST, NEW YORK, NY 10017 USA |
Publisher | IEEE |
Abstract | Evolutionary algorithms have been used to evolve a population of actors that generates diverse experiences for training reinforcement learning agents, which helps tackle the temporal credit assignment problem and improves exploration efficiency. However, when adapting this approach to constrained problems, balancing the trade-off between reward and constraint violation is hard. In this paper, we propose a novel evolutionary constrained reinforcement learning (ECRL) algorithm, which adaptively balances reward and constraint violation with stochastic ranking and, at the same time, restricts the policy's behaviour by maintaining a set of Lagrange relaxation coefficients with a constraint buffer. Extensive experiments on robotic control benchmarks show that ECRL achieves outstanding performance compared with state-of-the-art algorithms. Ablation analysis shows the benefits of introducing stochastic ranking and the constraint buffer. (An illustrative sketch of stochastic ranking follows this record.) |
Keywords | |
Institutional Authorship | First; Corresponding |
Language | English |
Related Links | [IEEE Record] |
Indexed By | |
Funding Project | National Natural Science Foundation of China[ |
WOS Research Areas | Computer Science; Engineering |
WOS Categories | Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Engineering, Electrical & Electronic |
WOS Accession Number | WOS:001046198707051 |
Source Database | IEEE |
Full-Text Link | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10191982 |
Citation Statistics | Times Cited [WOS]: 0 |
Item Type | Conference Paper |
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/553192 |
Collections | College of Engineering_Research Institute of Trustworthy Autonomous Systems; College of Engineering_Department of Computer Science and Engineering |
Author Affiliations | 1. Research Institute of Trustworthy Autonomous Systems (RITAS), Southern University of Science and Technology, Shenzhen, China; 2. Department of Computer Science and Engineering, Guangdong Key Laboratory of Brain-inspired Intelligent Computation, Southern University of Science and Technology, Shenzhen, China |
First Author Affiliation | Research Institute of Trustworthy Autonomous Systems |
Corresponding Author Affiliation | Department of Computer Science and Engineering |
First Author's First Affiliation | Research Institute of Trustworthy Autonomous Systems |
Recommended Citation (GB/T 7714) | Chengpeng Hu, Jiyuan Pei, Jialin Liu, et al. Evolving Constrained Reinforcement Learning Policy[C]. 345 E 47TH ST, NEW YORK, NY 10017 USA: IEEE, 2023: 1-8. |
Files in This Item | There are no files associated with this item. |
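The abstract above credits stochastic ranking with adaptively balancing reward and constraint violation when ranking the evolved actors. As a point of reference only, below is a minimal Python sketch of the generic stochastic-ranking procedure of Runarsson and Yao (2000) applied to hypothetical episodic returns and constraint-violation values. The function name stochastic_ranking, the comparison probability p_f, and the example numbers are illustrative assumptions; this is not the authors' ECRL implementation (see the full-text link above for that).

import random

def stochastic_ranking(returns, violations, p_f=0.45, sweeps=None):
    """Illustrative stochastic ranking (Runarsson & Yao, 2000).

    Adjacent candidates are compared by return (maximised) when both are
    feasible or with probability p_f; otherwise they are compared by
    constraint violation (minimised). Returns candidate indices ordered
    from best to worst. Generic sketch, not the ECRL authors' code.
    """
    n = len(returns)
    order = list(range(n))
    sweeps = n if sweeps is None else sweeps
    for _ in range(sweeps):
        swapped = False
        for i in range(n - 1):
            a, b = order[i], order[i + 1]
            both_feasible = violations[a] == 0 and violations[b] == 0
            if both_feasible or random.random() < p_f:
                better_first = returns[a] >= returns[b]        # rank by reward
            else:
                better_first = violations[a] <= violations[b]  # rank by violation
            if not better_first:
                order[i], order[i + 1] = b, a
                swapped = True
        if not swapped:
            break
    return order

# Hypothetical usage: five candidate policies with episodic returns and
# accumulated constraint costs above the allowed budget (0 = feasible).
returns = [120.0, 95.0, 150.0, 80.0, 130.0]
violations = [0.0, 2.5, 5.0, 0.0, 1.0]
print(stochastic_ranking(returns, violations))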