Title

Deterministic Policy Gradient: Convergence Analysis

Authors
Corresponding author: Zhang, Wei
Publication date
2022
Proceedings title
Pages
2159-2169
Abstract
The deterministic policy gradient (DPG) method proposed in Silver et al. [2014] has been demonstrated to exhibit superior performance, particularly for applications with multi-dimensional and continuous action spaces. However, it remains unclear whether DPG converges, and if so, how fast it converges and whether it converges as efficiently as other PG methods. In this paper, we provide a theoretical analysis of DPG to answer those questions. We study the single timescale DPG (often the case in practice) in both on-policy and off-policy settings, and show that both algorithms attain an ε-accurate stationary policy up to a system error with a sample complexity of O(ε^{-2}). Moreover, we establish the convergence rate for DPG under Gaussian noise exploration, which is widely adopted in practice to improve the performance of DPG. To the best of our knowledge, this is the first non-asymptotic convergence characterization for DPG methods.
University attribution
Corresponding
Language
English
Related links: [Scopus record]
Funding
Science and Technology Program of Jingdezhen City [JCYJ20200109141601708]
Scopus record ID
2-s2.0-85146148658
Source database
Scopus
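Document type: Conference paper
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/524336
Collection: College of Engineering_Department of Mechanical and Energy Engineering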
Author affiliations
1. Department of Electrical and Computer Engineering, The Ohio State University, Columbus, United States
2. Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
3. Department of Mechanical and Energy Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, Guangdong, China
Corresponding author affiliation: Department of Mechanical and Energy Engineering
Recommended citation
GB/T 7714
Xiong, Huaqing, Xu, Tengyu, Zhao, Lin, et al. Deterministic Policy Gradient: Convergence Analysis[C], 2022: 2159-2169.
Files in this item
No files are associated with this item.