Title

Cooperative 3D Point Cloud Object Detection in Autonomous Driving Systems

Name
汪俊永
Name (Pinyin)
WANG Junyong
Student ID
11930198
Degree Type
Master
Degree Discipline
0809 Electronic Science and Technology
Subject Category
08 Engineering
Supervisor
贡毅
Supervisor's Affiliation
College of Engineering
Thesis Defense Date
2022-05-11
Thesis Submission Date
2022-06-17
Degree-Granting Institution
Southern University of Science and Technology
Place of Degree Conferral
Shenzhen
Abstract

In recent years, the development of intelligent transportation systems has attracted increasing attention. To ensure that autonomous vehicles operate safely and reliably, extensive research has investigated how to strengthen their perception of the surrounding environment. Accurate 3D object detection is a core function of an autonomous driving system. However, because of the physical limitations of a single vehicle's sensors, an autonomous vehicle often struggles to perceive its surroundings accurately in complex environments, which degrades its driving performance. Cooperative perception integrates information from spatially distributed sensors and is therefore of great significance for improving the perception accuracy of autonomous driving systems. In this work, we consider an autonomous vehicle that combines its local LiDAR point cloud with observations from nearby roadside infrastructure, received over wireless links, to perform cooperative 3D object detection together with the surrounding traffic infrastructure.

In this thesis, we propose three cooperative detection schemes for autonomous driving scenarios, addressing in turn how to improve detection accuracy and how to reduce the consumption of communication resources. The proposed cooperative 3D object detection framework has three main components: a feature learning network that maps LiDAR point clouds to feature maps; a communication module through which the autonomous vehicle obtains the feature maps of the surrounding infrastructure over wireless links and fuses them in different ways; and a region proposal network that outputs the final 3D detection results. Using the CARLA simulator, we model two typical driving scenarios, a roundabout and a T-junction, and build corresponding cooperative perception datasets to evaluate the proposed framework.
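The feature-level fusion step in the communication module can be sketched as follows. This is a minimal illustration, assuming spatially aligned bird's-eye-view (BEV) feature maps of shape (C, H, W) from the ego vehicle and one infrastructure node, merged by element-wise maximum (a common strategy in the feature-fusion literature, e.g. F-Cooper); the shapes, names, and random inputs are assumptions for the sketch, not values from the thesis.

```python
import numpy as np

# Illustrative shapes: each agent encodes its LiDAR point cloud into a
# BEV feature map of shape (C, H, W). These numbers are assumptions.
C, H, W = 64, 200, 200
rng = np.random.default_rng(0)

ego_feat = rng.standard_normal((C, H, W))    # ego vehicle's feature map
infra_feat = rng.standard_normal((C, H, W))  # map received over the wireless link

# Element-wise max fusion: for spatially aligned maps, keep the stronger
# response in every channel/cell; the fused map then goes to the region
# proposal network for final 3D box prediction.
fused = np.maximum(ego_feat, infra_feat)

print(fused.shape)  # (64, 200, 200)
```

Other fusion variants (e.g. element-wise sum or learned attention weights) drop into the same place, replacing only the `np.maximum` line.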
Finally, we analyze the performance of cooperative 3D object detection in terms of both detection accuracy and communication-bandwidth consumption. Experimental results show that the cooperative schemes save communication bandwidth and computational resources while significantly improving the detection performance of autonomous vehicles across scenarios and detection-difficulty levels.
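The bandwidth argument can be made concrete with back-of-envelope arithmetic. The point and object counts below are illustrative assumptions, not measurements from the thesis; they only show why transmitting compact representations instead of raw point clouds saves communication resources.

```python
FLOAT32 = 4  # bytes per value

# Raw-data cooperation: transmit the whole LiDAR sweep,
# four float32 values (x, y, z, intensity) per point.
points = 100_000                   # assumed points per sweep
raw_bytes = points * 4 * FLOAT32   # 1.6 MB per sweep

# Object-level cooperation: transmit only final 3D boxes,
# (x, y, z, l, w, h, yaw) plus a confidence score per detection.
objects = 20                       # assumed detections per frame
box_bytes = objects * 8 * FLOAT32  # 640 bytes per frame

print(raw_bytes // box_bytes)  # 2500x less traffic at the object level
```

Feature-level fusion sits between these two extremes, trading extra bandwidth for the accuracy gained by fusing richer intermediate representations.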

Keywords
Language
Chinese
Training Category
Independent training
Year of Enrollment
2019
Year Degree Conferred
2022-07

Degree Assessment Subcommittee
Department of Electronic and Electrical Engineering
CLC Number
TP391.4
Source Repository
Manual submission
Document Type
Thesis
Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/335899
Collection
College of Engineering_Department of Electronic and Electrical Engineering
Recommended Citation (GB/T 7714)
WANG Junyong. Cooperative 3D Point Cloud Object Detection in Autonomous Driving Systems [D]. Shenzhen: Southern University of Science and Technology, 2022.
