SUSTech KC (南方科技大学知识苑): A Model-Based Exploration Policy in Deep Q-Network

Title	A Model-Based Exploration Policy in Deep Q-Network
Authors	Li, Shuailong 1; Zhang, Wei 1; Leng, Yuquan 2; Zhang, Xin 1
Corresponding authors	Zhang, Wei; Leng, Yuquan
DOI	10.1109/DSInS54396.2021.9670573
Publication date	2021
ISBN	978-1-6654-0631-4
Proceedings title	2021 International Conference on Digital Society and Intelligent Systems, DSInS 2021
Pages	336-343
Conference date	3-4 Dec. 2021
Conference location	Chengdu, China
Abstract	Reinforcement learning (RL) has been applied successfully in many domains, such as video games, where it has achieved prodigious performance, and DQN is a well-known RL algorithm. However, it has some disadvantages in practical applications, and the exploration-exploitation dilemma is one of them. Common exploration strategies such as ϵ-greedy have been introduced to address this problem. Unfortunately, they are sample-inefficient and ineffective because of the uncertainty of later exploration. In this paper, we propose a model-based exploration method that learns the state transition model and uses it to explore. Using the training rules of machine learning, we train the state transition model networks to improve exploration efficiency and sample efficiency. We compare our algorithm with ϵ-greedy on the Deep Q-Network (DQN) algorithm and apply it to Atari 2600 games. Evaluated across 14 Atari games in the Arcade Learning Environment (ALE), our algorithm outperforms the decaying ϵ-greedy strategy.
Keywords	exploration and exploitation dilemma; model-based exploration method; reinforcement learning
University attribution	Corresponding author affiliation
Language	English
Related links	[Scopus record]
Indexed by	EI
EI accession number	20220911706199
EI subject terms	Computer aided instruction; Efficiency
EI classification codes	Artificial Intelligence: 723.4; Computer Applications: 723.5; Education: 901.2; Production Engineering: 913.1
Scopus record ID	2-s2.0-85125104191
Source database	Scopus
Full-text link	https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9670573
Citation statistics	Times cited [WOS]: 0
Output type	Conference paper
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/328083
Collection	College of Engineering_Department of Mechanical and Energy Engineering
Author affiliations	1. University of Chinese Academy of Sciences, Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China; 2. Southern University of Science and Technology, Department of Mechanical and Energy Engineering, Shenzhen, China
Corresponding author affiliation	Department of Mechanical and Energy Engineering
Recommended citation (GB/T 7714)	Li, Shuailong, Zhang, Wei, Leng, Yuquan, et al. A Model-Based Exploration Policy in Deep Q-Network[C], 2021: 336-343.