Title

Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics

Authors
Corresponding author: Zhao, Bo
Publication date
2023
DOI
Journal
ISSN
1432-7643
EISSN
1433-7479
Volume: 27, Issue: 9, Pages: 5781-5795
Abstract
A novel policy gradient (PG) adaptive dynamic programming method is developed to deal with nonlinear discrete-time zero-sum games with unknown dynamics. To facilitate the implementation, a policy iteration algorithm is established to approximate the iterative Q-function, as well as the control and disturbance policies, via three neural network (NN) approximators. Then, the iterative Q-function is exploited to update the control and disturbance policies via the PG method. To stabilize the training process and improve data usage efficiency, the experience replay technique is applied to train the weight vectors of the three NNs using mini-batches of empirical data from the replay memory. Furthermore, the convergence of the iterative Q-function is proved. Simulation results of two numerical examples are provided to show the effectiveness of the proposed method.
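As a rough illustration of the update described in the abstract, the following minimal sketch performs policy-gradient steps on mini-batches drawn from a replay memory: the control parameter descends the Q-function while the disturbance parameter ascends it. This is not the authors' implementation. The quadratic Q-function, the constants `A`, `B`, `C`, `GAMMA2`, the linear policies `u = theta_u * x` and `w = theta_w * x`, and the function names are all assumptions chosen so the example runs without neural networks or a learned critic.

```python
import random

# Hypothetical stand-in for a trained Q-function approximator (assumption):
#   Q(x, u, w) = x^2 + u^2 - GAMMA2 * w^2 + (A*x + B*u + C*w)^2
# It is convex in the control u and concave in the disturbance w.
A, B, C, GAMMA2 = 0.9, 1.0, 0.5, 4.0


def dq_du(x, u, w):
    """Partial derivative of the toy Q-function with respect to u."""
    return 2.0 * u + 2.0 * B * (A * x + B * u + C * w)


def dq_dw(x, u, w):
    """Partial derivative of the toy Q-function with respect to w."""
    return -2.0 * GAMMA2 * w + 2.0 * C * (A * x + B * u + C * w)


def train(steps=500, batch_size=32, lr=0.1, seed=0):
    rng = random.Random(seed)
    # Replay memory: here it stores only sampled states, for simplicity.
    replay_memory = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    theta_u, theta_w = 0.0, 0.0  # linear policies: u = theta_u*x, w = theta_w*x
    for _ in range(steps):
        batch = rng.sample(replay_memory, batch_size)  # mini-batch from memory
        grad_u = grad_w = 0.0
        for x in batch:
            u, w = theta_u * x, theta_w * x
            grad_u += dq_du(x, u, w) * x  # chain rule: dQ/dtheta_u
            grad_w += dq_dw(x, u, w) * x  # chain rule: dQ/dtheta_w
        theta_u -= lr * grad_u / batch_size  # control minimizes Q
        theta_w += lr * grad_w / batch_size  # disturbance maximizes Q
    return theta_u, theta_w


theta_u, theta_w = train()
```

In this toy setting the two parameters settle at the saddle point of the quadratic Q-function; in the method summarized above, the Q-function and both policies would instead be neural networks trained from data.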
Keywords
Related links: [Source record]
Indexed in
SCI ; EI
Language
English
University attribution
Other
Funding
Beijing Natural Science Foundation[4212038] ; National Natural Science Foundation of China["61973330","62073085"] ; Open Research Project of the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences[20210108] ; Open Research Project of the Key Laboratory of Industrial Internet of Things & Networked Control, Ministry of Education[2021FF10]
WOS research area
Computer Science
WOS categories
Computer Science, Artificial Intelligence ; Computer Science, Interdisciplinary Applications
WOS accession number
WOS:000915638100001
Publisher
EI accession number
20230313393107
EI subject terms
Dynamic programming ; Game theory ; Iterative methods ; Neural networks ; Numerical methods
EI classification codes
Artificial Intelligence:723.4 ; Optimization Techniques:921.5 ; Numerical Methods:921.6 ; Probability Theory:922.1
ESI subject category
COMPUTER SCIENCE
Source database
Web of Science
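Citation statistics
Times cited [WOS]: 2
Document type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/430992
Collection: College of Engineering_Department of Mechanical and Energy Engineering
Author affiliations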
1.Beijing Normal Univ, Sch Syst Sci, Beijing 100875, Peoples R China
2.Chongqing Univ Posts & Telecommun, Key Lab Ind Internet Things & Networked Control, Minist Educ, Chongqing 400065, Peoples R China
3.Southern Univ Sci & Technol, Dept Mech & Energy Engn, Shenzhen 518055, Peoples R China
4.Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607 USA
Recommended citation
GB/T 7714
Lin, Mingduo,Zhao, Bo,Liu, Derong. Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics[J]. SOFT COMPUTING,2023,27(9):5781-5795.
APA
Lin, Mingduo,Zhao, Bo,&Liu, Derong.(2023).Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics.SOFT COMPUTING,27(9),5781-5795.
MLA
Lin, Mingduo,et al."Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics".SOFT COMPUTING 27.9(2023):5781-5795.
Files in this item
No files are associated with this item.

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.