SUSTech KC (南方科技大学知识苑): Dynamic Obstacle Avoidance of Mobile Robots under Complicated Environments

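Title	移动机器人在复杂环境下的动态避障
Alternative title	Dynamic Obstacle Avoidance of Mobile Robots under Complicated Environments
Author	Chen Zhiming (陈志明)
Student ID	11849041
Degree type	Master's
Degree discipline	Mechanical Manufacturing and Automation
Supervisor	Zhang Wei (张巍)
Thesis defense date	2020-06-04
Thesis submission date	2020-07-20
Degree-granting institution	Harbin Institute of Technology
Degree-granting location	Shenzhen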
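Abstract	In recent years, robots have played an increasingly important role in human society, and mobile robots in particular are gradually becoming part of everyday life. Whether they are robot vacuum cleaners used indoors or food-delivery, parcel-delivery, and patrol robots used outdoors, all of them face the obstacle avoidance problem: their operating environments contain not only static obstacles but often also dynamic ones such as people, animals, and other vehicles. Avoiding both static and dynamic obstacles is a basic requirement for mobile robots and a core problem in robotics research. This thesis first proposes an end-to-end dynamic obstacle avoidance algorithm based on deep reinforcement learning for mobile robots in dynamic environments. The algorithm introduces a new policy network that combines long short-term memory (LSTM) units, typically used for sequential text data, with convolutional layers, typically used for image data, to form a new feature extraction network. On this basis, a new reward function is designed and, together with proximal policy optimization (PPO) augmented with a regularization term, is used to train an optimized obstacle avoidance policy. The algorithm takes lidar scan data, the target position coordinates, and the robot's own velocity as input and directly outputs velocity commands for flexible obstacle avoidance. Across the various complex environments set up in the simulation experiments, the overall average obstacle avoidance success rate exceeds 70%. On the hardware side, a navigation and obstacle avoidance sensing system that combines a 3D lidar with an ultra-wideband (UWB) positioning system is used in a novel way, and the sensor wiring and power supply scheme is optimized so that the sensors can be deployed on a Laikago robot dog. The deep-reinforcement-learning-based algorithm is then validated on this robot in dynamic obstacle avoidance experiments with multiple pedestrians, showing that it avoids dynamic obstacles flexibly and has strong practical value. To address the poor interpretability of deep-reinforcement-learning-based algorithms and the low success rate of the traditional dynamic window approach among dynamic obstacles, the thesis also proposes a new dynamic obstacle avoidance algorithm based on the dynamic window approach. This algorithm adopts new constraints and a new evaluation function that combines the increment in distance from the robot to the target point with the minimum obstacle distance, which improves the ability of the dynamic window approach to select the optimal velocity in velocity space. In the simulation environments used in the thesis, the obstacle avoidance success rate improves by roughly 5% to 20% depending on the number of obstacles, and the time the robot needs to reach the target point is greatly shortened. For this algorithm, the hardware consists of a navigation and obstacle avoidance sensing system that combines a depth camera with a UWB positioning system; the algorithm is tested on the Laikago robot dog in real-world scenes, confirming its feasibility and practicality. Finally, the thesis compares and summarizes the hardware, algorithms, and experiments of the two dynamic obstacle avoidance systems, providing a reference for further research in this field.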
Alternative abstract	In recent years, robots have played an increasingly important role in human society, and mobile robots in particular are gradually integrating into people's daily lives. However, whether it is a sweeping robot for home use or a takeaway, express delivery, or patrol robot for outdoor use, all of them face the obstacle avoidance problem. Their application scenarios contain not only static obstacles but also dynamic obstacles such as people, animals, and other means of transportation. Avoiding obstacles is a basic requirement for mobile robots, as well as one of the core issues in robotics research. In this paper, we first propose an end-to-end dynamic obstacle avoidance algorithm based on deep reinforcement learning for mobile robots in dynamic environments. The algorithm introduces a new policy network, which creatively combines long short-term memory units, used for processing text sequence information, with convolution layers, used for processing image information, to form a new feature extraction network. On this basis, the paper designs a new reward function which, combined with proximal policy optimization with an added regularization term, yields an optimized obstacle avoidance policy model after training. The inputs of the algorithm are lidar scan data, the target position coordinates, and the velocity of the robot body. Based on these inputs, velocity commands for flexible obstacle avoidance are output directly. The overall average obstacle avoidance success rate across the various complex environments set up in the simulation experiments exceeds 70%. In the hardware part, a navigation and obstacle avoidance sensor system that combines a 3D lidar with a UWB positioning system is used innovatively. At the same time, the connection and power supply scheme of the sensors is optimized so that the sensors can be successfully deployed on the Laikago dog robot. The dynamic obstacle avoidance algorithm based on deep reinforcement learning is then tested on the robot, which shows that the algorithm can avoid dynamic obstacles flexibly and has strong practical value. A new dynamic obstacle avoidance algorithm based on the dynamic window approach is also proposed in this paper to address the poor interpretability of algorithms based on deep reinforcement learning and the low success rate of the traditional dynamic window approach in environments with dynamic obstacles. The algorithm adopts new constraint conditions and a new evaluation function that combines the distance increment from the robot body to the target point with the minimum obstacle distance, which improves the ability of the dynamic window approach to select the optimal velocity in velocity space. Depending on the number of obstacles, the obstacle avoidance success rate increases by 5% to 20%, and the time the robot takes to reach the target point is greatly shortened. In the hardware part, a navigation and obstacle avoidance sensor system that combines a depth camera with an ultra-wideband positioning system is used. The algorithm is tested in real scenes on the Laikago dog robot, and its feasibility and practicability are verified.
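Keywords	mobile robots (移动机器人); dynamic obstacle avoidance (动态避障); deep reinforcement learning (深度强化学习); dynamic window approach (动态窗口法)
Alternative keywords	mobile robots; dynamic obstacle avoidance; deep reinforcement learning; dynamic window approach
Language	Chinese
Program category	Joint training program
Output type	Thesis
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/142973
Collection	College of Engineering_Department of Mechanical and Energy Engineering
Author affiliation	Southern University of Science and Technology
Recommended citation (GB/T 7714)	陈志明. 移动机器人在复杂环境下的动态避障[D]. 深圳. 哈尔滨工业大学,2020.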
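
Illustrative note: the abstract describes the second algorithm only in words, as a dynamic window approach whose evaluation function combines the increment in distance to the target with the minimum obstacle distance, subject to additional constraints. The Python sketch below is one possible reading of that description and is not the thesis's implementation. The unicycle motion model, the constant-velocity obstacle prediction, the hard clearance constraint, and every name, weight, and limit (`choose_velocity`, `w_goal`, `w_clear`, `safe_dist`, `clear_cap`, and so on) are assumptions introduced here for illustration.

```python
import math


def rollout(x, y, theta, v, w, horizon, dt):
    """Integrate a unicycle model forward and return the visited (x, y) points."""
    points = []
    for _ in range(int(round(horizon / dt))):
        theta += w * dt
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        points.append((x, y))
    return points


def min_clearance(points, obstacles, dt):
    """Smallest robot-obstacle distance along a rollout.

    Each obstacle is (ox, oy, vx, vy) and is assumed to keep a constant velocity.
    Returns math.inf when there are no obstacles.
    """
    best = math.inf
    for k, (x, y) in enumerate(points, start=1):
        t = k * dt
        for ox, oy, vx, vy in obstacles:
            best = min(best, math.hypot(x - (ox + vx * t), y - (oy + vy * t)))
    return best


def choose_velocity(state, goal, obstacles,
                    v_max=1.0, w_max=1.5, acc=0.5, ang_acc=2.0,
                    dt=0.1, horizon=2.0, safe_dist=0.4,
                    w_goal=1.0, w_clear=0.5, clear_cap=2.0, n=11):
    """Pick (v, w) from the dynamic window; all parameters are hypothetical.

    state is (x, y, theta, v, w). Returns None if every sample is rejected.
    """
    x, y, theta, v0, w0 = state
    # Dynamic window: velocities reachable within one control step.
    v_lo, v_hi = max(0.0, v0 - acc * dt), min(v_max, v0 + acc * dt)
    w_lo, w_hi = max(-w_max, w0 - ang_acc * dt), min(w_max, w0 + ang_acc * dt)
    d_start = math.hypot(goal[0] - x, goal[1] - y)

    best, best_score = None, -math.inf
    for i in range(n):
        v = v_lo + (v_hi - v_lo) * i / (n - 1)
        for j in range(n):
            w = w_lo + (w_hi - w_lo) * j / (n - 1)
            points = rollout(x, y, theta, v, w, horizon, dt)
            clearance = min_clearance(points, obstacles, dt)
            if clearance < safe_dist:
                continue  # assumed hard constraint: predicted path gets too close
            end_x, end_y = points[-1]
            # Distance increment: how much closer to the goal the rollout ends.
            progress = d_start - math.hypot(goal[0] - end_x, goal[1] - end_y)
            score = w_goal * progress + w_clear * min(clearance, clear_cap)
            if score > best_score:
                best, best_score = (v, w), score
    return best
```

A call such as `choose_velocity((0.0, 0.0, 0.0, 0.5, 0.0), (5.0, 0.0), [])` returns the sampled command with the highest score. Returning `None` when every sample violates the clearance constraint leaves the fallback behavior, such as stopping or rotating in place, to the caller.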

Files in this item
File name/size	Document type	Version type	Access type	License	Action
移动机器人在复杂环境下的动态避障.pdf (5650 KB)	--	--	Restricted access	--	Request full text