Title

基于强化学习的复杂环境自动驾驶运动规划

Alternative title
MOTION PLANNING OF AUTONOMOUS VEHICLES IN COMPLEX ENVIRONMENT BASED ON REINFORCEMENT LEARNING
Name
Student ID
11849329
Degree type
Master's
Degree discipline
Computer Technology
Advisor
史玉回
Thesis defense date
2020-05-30
Thesis submission date
2020-07-08
Degree-granting institution
Harbin Institute of Technology
Place of degree conferral
Shenzhen
Abstract
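Autonomous driving has great social value for travel safety and traffic efficiency and could fundamentally change how human society organizes transportation; in recent years it has drawn wide attention from the public and from researchers in China and abroad. The motion planning system is the module that decides how the vehicle moves, and its performance determines the safety, comfort, and efficiency of an autonomous vehicle. In complex environments such as urban roads, however, autonomous vehicles face many sources of uncertainty. Many researchers have studied some of these as independent problems, but how to represent the uncertainty in the environment and how to handle it remain open questions in motion planning research. This thesis defines the complex environment faced by autonomous driving as an uncertainty problem and, following the hierarchical framework of autonomous driving, summarizes the categories of uncertainty faced by the motion planning module. A model based on a partially observable Markov decision process (POMDP) is built to represent motion planning in an uncertain environment, and a rule-based motion planning method controls the behavior of the surrounding traffic participants. Deep reinforcement learning is then introduced into the motion planning system, and a motion planning method with a model checker is proposed. Motion planning experiments under different conditions in a simulated environment are used to analyze how uncertainty affects motion planning and to verify the effectiveness of the method. The specific research contents are as follows:
The problems that motion planning faces in complex environments are analyzed and summarized as four kinds of uncertainty: uncertainty of human intention, uncertainty of the paths of surrounding traffic participants, high traffic density, and occlusion.
A basic framework for a motion planning system in uncertain environments is established, and a POMDP-based model is built to represent the basic driving environment and the motion planning process under uncertainty. Compared with existing autonomous driving simulators, the environment built in this thesis can represent several kinds of environmental uncertainty. The effect of an uncertain environment on planning results is analyzed by applying the same motion planning method under different conditions. The experiments show that occlusion and intention uncertainty affect the safety of motion planning, while path uncertainty and high traffic density affect traffic efficiency.
Deep reinforcement learning is introduced into the motion planning system and combined with the rule-based motion planning method, which serves as a model checker to improve both the learning efficiency of deep reinforcement learning and the safety of the planning results. The experiments show that reinforcement learning mitigates the impact of uncertainty, and that the model checker improves the safety and efficiency of motion planning in uncertain environments.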
Other abstract
Autonomous driving has great potential to improve traffic efficiency and safety and could radically change the form of transportation in human society. In recent years, autonomous vehicles have attracted extensive attention from the public and from researchers around the world. The motion planning system, the module that determines how the vehicle moves, governs the safety, comfort, and efficiency of an autonomous vehicle. However, in complex environments such as urban roads, autonomous vehicles face many uncertainties. Many researchers have studied some of these as separate problems, but how to express the uncertainty in the environment and how to deal with it remain open problems. In this dissertation, the complex environment of autonomous driving is defined as an uncertainty problem, and the categories of uncertainty faced by the motion planning module are summarized according to the hierarchical framework of autonomous driving. A model based on a Partially Observable Markov Decision Process is constructed to represent the uncertainty faced by motion planning, and a rule-based method is used to determine the actions of surrounding traffic participants. Deep reinforcement learning is then applied to the motion planning system, and a motion planning method with a model checker is proposed. The influence of uncertainty on the motion planning process is analyzed, and the effectiveness of the motion planning method is verified, through motion planning experiments under different conditions in a simulation environment. The specific research contents of this dissertation are summarized as follows:
The problems faced by the motion planning module in complex environments are analyzed and summarized as four problems: uncertainty of human intention, uncertainty of surrounding vehicles' paths, high traffic density, and occlusion.
A basic framework for a motion planning system under uncertainty is established, and a model based on a Partially Observable Markov Decision Process is constructed to represent the basic driving environment and the motion planning process under uncertainty. Compared with existing autonomous driving simulators, the environment constructed in this dissertation can represent a variety of environmental uncertainties. By using the same motion planning method under different conditions, the influence of an uncertain environment on motion planning is analyzed. The experiments show that occlusion and intention uncertainty affect the safety of motion planning, and that path uncertainty and high traffic density affect traffic efficiency.
Deep reinforcement learning is introduced into the motion planning system and combined with the rule-based motion planning method, which is used as the model checker to improve the learning efficiency of deep reinforcement learning and the safety of the planning results. The experiments show that reinforcement learning can reduce the impact of uncertainty, and that the model checker can improve the safety and efficiency of motion planning in uncertain environments.
Keywords
Other keywords
Language
Chinese
Training category
Joint training
Document type: Thesis
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/143027
Collection: College of Engineering_Department of Computer Science and Engineering
Author affiliation
Southern University of Science and Technology
Recommended citation
GB/T 7714
陈尧. 基于强化学习的复杂环境自动驾驶运动规划[D]. 深圳. 哈尔滨工业大学,2020.
Files in this item
File name/size: 基于强化学习的复杂环境自动驾驶运动规划. (2332KB)
Access type: Restricted