Title | AUTONOMOUS TRADING AGENT WITH REINFORCEMENT LEARNING
Title (Chinese) | 基于强化学习的自动交易代理
Author | Liu Jian (刘健)
Student ID | 11849061
Degree Type | Master's
Degree Discipline | Computer Science and Technology
Supervisor | Georgios Theodoropoulos
Thesis Defense Date | 2020-05-30
Thesis Submission Date | 2020-07-08
Degree-Granting Institution | Harbin Institute of Technology
Degree-Granting Location | Shenzhen
Abstract | This dissertation examines the use of reinforcement learning in autonomous agents that can interact intelligently with financial markets. Stock market trading is used to evaluate and develop a number of machine learning approaches, particularly reinforcement learning, that can handle the challenging characteristics of the financial trading problem. Predicting change in the stock market is very difficult because the underlying patterns that drive market behavior are non-stationary, which means that useful predictive patterns learned in the past may not apply in the future. Reinforcement learning has not been widely applied in this domain, yet the paradigm allows agents to learn trading decision models directly and with more degrees of freedom than many other techniques; for example, there is no requirement to preset thresholds that define buy or sell signals. The change in price can naturally be viewed as a reward, which avoids the threshold-related drawbacks of labeling data that arise when the problem is formulated as supervised learning; reinforcement learning also avoids the cost of labeling examples and constructing a training data set. However, a study of the literature shows that existing research applying reinforcement learning to generate trading decisions does not, in general, account for the environment being non-stationary. The approaches described in the previous literature deploy a single agent that may never be recalibrated, and use learning methodologies that can become stuck in local optima. The methods proposed in this dissertation mitigate these issues by using multiple agents and a multi-stage learning model in which the agents compete to recommend the best decisions. Our approach combines online learning with reinforcement learning: online learning selects a recommendation from a set of agents at each decision point in real time, and the technique can relearn and adapt the set of decision models based on recent data. On the reinforcement learning side, this research produced new methods that modify the training process to give additional focus to recent data. The novel methods are evaluated empirically using data from a range of international and Chinese stock markets. We find that agents based on the proposed methodology outperform other machine learning methods on various metrics, including application-specific measures of risk and return that are accepted in the finance industry. Experiments show that agents combining online learning and reinforcement learning achieve higher returns than the benchmark buy-and-hold strategy, and that online learning substantially improves the performance of a Deep Q-learning agent. Notably, during the financial crisis, the On-Line/Reinforcement Learning (OLR) agents remain profitable in many cases, while the other agents suffer losses in all tests during this period.
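The reward formulation outlined in the abstract (the price change itself serves as the reward, with no preset buy or sell thresholds) can be illustrated with a minimal sketch. The environment interface, the single-feature state, and the short/flat/long position set below are illustrative assumptions, not the implementation used in the thesis:

```python
import numpy as np

class TradingEnv:
    """Minimal trading environment in which the reward is simply the next
    price change, signed by the position the agent chooses to hold.
    No hand-set thresholds define buy or sell signals."""

    POSITIONS = (-1, 0, 1)  # short, flat, long

    def __init__(self, prices):
        self.prices = np.asarray(prices, dtype=float)
        self.t = 0

    def reset(self):
        self.t = 0
        return self._state()

    def _state(self):
        # Illustrative state: the most recent price change (0 at the start).
        change = (self.prices[self.t] - self.prices[self.t - 1]) if self.t else 0.0
        return np.array([change])

    def step(self, action):
        position = self.POSITIONS[action]
        price_change = self.prices[self.t + 1] - self.prices[self.t]
        reward = position * price_change  # the price change acts as the reward
        self.t += 1
        done = self.t >= len(self.prices) - 1
        return self._state(), reward, done
```

A Deep Q-learning agent trained against such an interface maximizes cumulative signed price change, which corresponds directly to trading profit before costs.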
Abstract (Chinese) | 本文使用强化学习构建了与金融市场进行智能交互的自动交易代理。股票市场交易可以用于评估和开发新的机器学习方法,这些方法需要对金融市场交易问题的特征做出调整,尤其是强化学习。预测股市变化是一项非常艰巨的任务,因为驱动市场行为的基本模式是非静态的,这意味着过去学习到的有用的预测模式可能不适合在将来应用。强化学习尚未在该应用领域中广泛应用,相比于其他技术,强化学习的范式可以使代理具有更大自由度地直接学习交易决策模型,例如,无需预设定义用于购买或出售这些决策信号的特定阈值。价格的变化可以自然地被看作是一种奖励,所以强化学习可以避免在监督学习中标注示例和构建训练数据集所需的成本。在对先前文献的研究中,我们发现现有的应用强化学习算法来生成交易决策的研究通常不能解决非静态环境的问题。先前文献中所提出的方法得到的单一代理不会随着时间的变化而重新校准,同时学到的交易策略有时会陷入局部最优。本文提出的方法通过使用多个代理和一个多阶段学习模型来缓解上述提到的问题,多个代理可以竞争性地推荐最佳决策。我们的方法将在线学习与强化学习相结合。在线学习用于在决策点实时从一组代理中选择推荐的交易策略,还可以基于最近的数据重新学习和调整决策模型。为了更好地应用强化学习,实验中对训练强化学习代理的过程做出了调整,使更多的注意力集中在最新数据上。本文使用一系列来自国际和中国股票市场的数据,通过实验分析对所提出的方法进行评估。我们发现,在金融行业中常用于评估风险和收益的各种指标上,基于所提出的方法的代理都能够胜过基于其他机器学习方法的代理。实验表明,使用在线学习和强化学习的代理比基准交易方法购买并持有可获得更高的回报,并且使用在线学习可以大大提高Deep Q-learning代理的性能。值得注意的是,在金融危机期间,在线强化学习(OLR)代理可以在许多情况下保持盈利,而其他代理在所有测试中均有亏损。
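The abstract describes selecting one recommendation from a set of competing agents at each decision point and giving additional weight to recent data, but it does not name a specific online learning algorithm. A Hedge-style multiplicative-weights selector with exponential forgetting is one plausible realization; the sketch below, including the class name and parameters, is an assumption for illustration only:

```python
import numpy as np

class OnlineAgentSelector:
    """Hedge-style multiplicative-weights selection over a pool of trading
    agents. Exponential forgetting discounts old performance so the selector
    can re-adapt when the (non-stationary) market regime shifts."""

    def __init__(self, n_agents, eta=0.1, decay=0.99):
        self.weights = np.full(n_agents, 1.0 / n_agents)
        self.eta = eta      # online learning rate
        self.decay = decay  # < 1 gradually pulls weights back toward uniform

    def select(self):
        # Follow the currently best-weighted agent's recommendation
        # (one could also sample in proportion to the weights).
        return int(np.argmax(self.weights))

    def update(self, rewards):
        # rewards[i]: realised trading reward of agent i at this decision point.
        self.weights = self.weights ** self.decay        # forget old evidence
        self.weights *= np.exp(self.eta * np.asarray(rewards, dtype=float))
        self.weights /= self.weights.sum()               # renormalise
```

Here, `update` would be called once per decision point with the reward each agent's recommendation would have earned, so an agent that has performed well recently dominates the selection even if it was weak in an earlier market regime.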
Keywords |
Keywords (Chinese) |
Language | English
Training Category | Joint training (联合培养)
Document Type | Dissertation
Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/143029
Collection | College of Engineering, Department of Computer Science and Engineering
Affiliation | Southern University of Science and Technology
Recommended Citation (GB/T 7714) | Liu J. AUTONOMOUS TRADING AGENT WITH REINFORCEMENT LEARNING[D]. Shenzhen: Harbin Institute of Technology, 2020.
Files in This Item:
File Name/Size | Document Type | Version | Access | License
AUTONOMOUS TRADING A(2536KB) | -- | -- | Restricted access | --