Title | Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to the H-∞ control |
Authors | Liu, Mingxiang; Cai, Qianqian; Li, Dandan; Meng, Wei; Fu, Minyue |
Corresponding author | Cai, Qianqian |
Date published | 2023-04-07 |
DOI | |
Journal | NEUROCOMPUTING |
ISSN | 0925-2312 |
EISSN | 1872-8286 |
Volume | 529 |
Pages | 48-55 |
Abstract | In this paper, we present a Q-learning framework for solving finite-horizon zero-sum game problems involving the H-∞ control of linear systems without knowledge of the dynamics. Past research mainly focused on infinite-horizon problems with a completely measurable state. However, in practical engineering the system state is not always directly accessible, and the time-varying Riccati equation associated with the finite-horizon setting is also difficult to solve directly. The main contribution of the proposed model-free algorithm is to determine the optimal output feedback policies without state measurements in the finite-horizon setting. To achieve this goal, we first describe the Q-function arising from finite-horizon problems in the context of state feedback, and then parameterize the Q-functions as functions of input-output vectors. Finally, numerical examples on aircraft dynamics demonstrate the algorithm's efficiency. (c) 2023 Published by Elsevier B.V. |
Keywords | |
Related links | [Source record] |
Indexed by | |
Language | English |
University attribution | Other |
Funding | Grants of National Natural Science Foundation of China ["U21A20476","U1911401","U22A20221","62273100","62073090"]; Guangdong Basic and Applied Basic Research Foundation ["2021A1515012554","2020A1515011505"] |
WOS research area | Computer Science |
WOS category | Computer Science, Artificial Intelligence |
WOS accession number | WOS:000935337000001 |
Publisher | Elsevier B.V. |
EI accession number | 20231113707129 |
EI subject terms | Discrete time control systems; Game theory; Linear systems; Reinforcement learning; State feedback |
EI classification codes | Artificial Intelligence: 723.4; Control Systems: 731.1; Calculus: 921.2; Probability Theory: 922.1; Systems Science: 961 |
ESI subject category | COMPUTER SCIENCE |
Scopus record ID | 2-s2.0-85149757845 |
Source database | Web of Science |
Citation statistics | Times cited [WOS]: 4 |
Document type | Journal article |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/501408 |
Collection | College of Engineering_Department of Mechanical and Energy Engineering |
Affiliations | 1. Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China; 2. Guangdong Univ Technol, Guangdong Prov Key Lab Intelligent Decis & Coopera, Guangzhou 510006, Guangdong, Peoples R China; 3. Southern Univ Sci & Technol, Dept Mech & Energy Engn, Shenzhen 518055, Guangdong, Peoples R China |
Recommended citation (GB/T 7714) | Liu, Mingxiang, Cai, Qianqian, Li, Dandan, et al. Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to the H-∞ control[J]. NEUROCOMPUTING, 2023, 529: 48-55. |
APA | Liu, Mingxiang, Cai, Qianqian, Li, Dandan, Meng, Wei, & Fu, Minyue. (2023). Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to the H-∞ control. NEUROCOMPUTING, 529, 48-55. |
MLA | Liu, Mingxiang, et al. "Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to the H-∞ control". NEUROCOMPUTING 529 (2023): 48-55. |
Files in this item | No files are associated with this item. |
Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.