Title

Theoretical Analysis of Value-Iteration-Based Q-Learning with Approximation Errors

Authors
DOI
Publication date
2022
ISSN
2164-4357
ISBN
978-1-6654-9738-1
Proceedings title
Pages
120-125
Conference date
14-16 Oct. 2022
Conference location
Kaifeng, China
Abstract
In this paper, the value-iteration-based Q-learning algorithm with approximation errors is analyzed theoretically. First, based on an upper bound of the approximation errors caused by the Q-function approximator, we derive lower and upper bound functions of the iterative Q-function, which prove that the limit of the approximate Q-function sequence is bounded. Then, we develop a stability condition for terminating the iterative algorithm, which ensures that the current control policy derived from the resulting approximate Q-function is stabilizing. We also establish an upper bound function of the approximation errors caused by the policy function approximator, to guarantee that the approximate control policy is stabilizing. Finally, numerical results from a simulation example verify the theoretical results.
Keywords
University attribution
Other
Related links: [IEEE record]
Source database
IEEE
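Full-text link: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9926794
Citation statistics
Times cited [WOS]: 0
Output type: Conference paper
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/412122
Collection: Southern University of Science and Technology
Author affiliations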
1.School of Automation, Guangdong University of Technology, Guangzhou, China
2.School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
3.Institute of Control Science and Technology, Southern University of Science and Technology, Shenzhen, China
Recommended citation
GB/T 7714
Zhantao Liang,Mingming Ha,Derong Liu. Theoretical Analysis of Value-Iteration-Based Q-Learning with Approximation Errors[C],2022:120-125.
Files in this item
No files are associated with this item.