Title

基于上下文聚类融合的单视图整体三维理解算法研究

Alternative Title
Research on Single-Shot Holistic 3D Understanding Algorithm Based on Context Cluster Fusion
Name
梁泽瑞
Name (Pinyin)
LIANG Zerui
Student ID
12032458
Degree Type
Master's
Degree Discipline
0809 Electronic Science and Technology
Subject Category
08 Engineering
Supervisor
张进
Supervisor's Affiliation
Department of Computer Science and Engineering
Thesis Defense Date
2023-05-13
Thesis Submission Date
2023-06-19
Degree-Granting Institution
Southern University of Science and Technology
Place of Degree Conferral
Shenzhen
Abstract

Single-shot holistic 3D understanding, i.e., detecting multiple target objects in a single RGB-D image and inferring their six-degree-of-freedom (6-DoF) poses, 3D shapes, and metric sizes, is of great significance in robotics, autonomous driving, virtual reality, and related fields. Because traditional algorithms that rely on prior information cannot match unseen objects of interest, deep learning-based solutions have become a research focus. Current work on single-shot holistic 3D understanding falls mainly into two categories: multi-stage schemes that first segment target instance regions from the image, and single-stage schemes that directly infer the 3D information of multiple targets. The former are computationally expensive, depend heavily on segmentation quality, and perform poorly in complex multi-object scenes with occlusion; the latter handle intra-class variation poorly and lack an understanding of 3D spatial structure, leading to weak recognition and localization performance.

To address the problems of the single-stage scheme, this thesis designs CoCFusion (Context Cluster Fusion), a single-shot holistic 3D understanding algorithm based on context cluster fusion. The main differences from existing work are as follows: (1) An input processing method with coordinate-system separation is proposed, which mines real spatial geometric information more explicitly and improves the network's understanding of the shape information distribution. (2) A hierarchical feature fusion network composed of context clustering modules and spatial-channel attention modules, together with an improved point cloud clustering autoencoder, is constructed; features with different preferences are aggregated layer by layer through similarity measurement, so that the network attends to overall differences between clusters rather than to detailed matches in appearance and shape. (3) A confidence geometric consistency constraint is introduced to strengthen the network's ability to learn the mapping between the 2D pixel plane and real 3D space.
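To make the similarity-based aggregation in (2) concrete, below is a minimal illustrative sketch in PyTorch. The function name, tensor shapes, hard assignment, and sigmoid weighting are assumptions chosen for illustration in the spirit of context clusters, not the thesis implementation.

```python
import torch
import torch.nn.functional as F

def context_cluster_aggregate(points: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
    """Group point features around centers by cosine similarity and
    aggregate each cluster with similarity weights.

    points:  (N, C) feature vectors treated as an unordered set
    centers: (K, C) initial cluster-center features
    returns: (K, C) aggregated cluster features
    """
    # Pairwise cosine similarity between every point and every center.
    sim = F.normalize(points, dim=-1) @ F.normalize(centers, dim=-1).T  # (N, K)
    # Hard-assign each point to its most similar center.
    assign = sim.argmax(dim=-1)                                         # (N,)
    out = centers.clone()
    for k in range(centers.shape[0]):
        mask = assign == k
        if mask.any():
            w = torch.sigmoid(sim[mask, k]).unsqueeze(-1)   # similarity weights
            out[k] = (w * points[mask]).sum(0) / w.sum()    # weighted mean
    return out

# Example: a set of 512 features with 64 channels, grouped into 8 clusters.
feats, ctrs = torch.randn(512, 64), torch.randn(8, 64)
print(context_cluster_aggregate(feats, ctrs).shape)  # torch.Size([8, 64])
```

Aggregating by cluster-level similarity in this way rewards overall inter-cluster agreement over pointwise appearance matching, which is the intuition behind attending to differences between clusters.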

This thesis is validated on the NOCS dataset. Experimental results show that CoCFusion significantly outperforms existing single-stage holistic 3D understanding algorithms, with up to an 8.7% absolute improvement in class-mean average precision for 6-DoF pose estimation of unseen object instances. With the support of the Huawei 2012 Lab, a robot-arm visual servoing experimental system with 0.1 mm accuracy was built, and the algorithm was successfully tested at 35 FPS in a real industrial production scenario, demonstrating the practical value of this research.

Alternative Abstract

Single-shot holistic 3D understanding detects multiple objects in a single RGB-D observation and determines their six-degree-of-freedom (6-DoF) pose, shape, and size, a capability of great significance in domains such as robotics, autonomous driving, and virtual reality. Traditional algorithms that depend on prior knowledge cannot process unseen objects, which has led to a growing focus on deep learning-based solutions. Existing work falls mainly into two categories: multi-stage schemes, which first segment the image to obtain object instance regions, and single-stage schemes, which infer complete 3D knowledge directly from the input. The former suffer from high computational cost and poor performance in complex multi-object scenes where occlusions may be present, while the latter struggle to accommodate intra-class variation and to comprehend 3D spatial structure, resulting in weak recognition and localization.

We present CoCFusion (Context Cluster Fusion), a novel algorithm based on context cluster fusion that addresses the limitations of the single-stage scheme. The main distinctions from existing methods are: (1) An input processing method using coordinate-system separation, which mines real spatial geometry more explicitly and improves the network's comprehension of the shape information distribution. (2) A hierarchical feature fusion network composed of context clustering modules and spatial-channel attention modules, together with an improved point cloud clustering autoencoder; features with different preferences are aggregated hierarchically by similarity measurement, so the network focuses on overall differences among clusters rather than on subtle matches in appearance and shape. (3) A confidence geometric consistency constraint that reinforces the network's understanding of the mapping between the pixel plane and real 3D space.
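The geometric consistency constraint in (3) builds on the standard pinhole relation between image pixels and camera-frame 3D points. As a point of reference only, here is a minimal NumPy sketch of depth back-projection; the function name and interface are illustrative assumptions, and the thesis's actual constraint formulation is not reproduced here.

```python
import numpy as np

def backproject_depth(depth: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Back-project a depth map into camera-frame 3D points (pinhole model).

    depth: (H, W) depth in meters
    K:     (3, 3) camera intrinsic matrix
    returns: (H, W, 3) XYZ coordinates per pixel
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    v, u = np.indices(depth.shape)   # v: pixel row, u: pixel column
    x = (u - cx) * depth / fx        # inverts u = fx * X / Z + cx
    y = (v - cy) * depth / fy        # inverts v = fy * Y / Z + cy
    return np.stack([x, y, depth], axis=-1)
```

A consistency term of this kind can compare back-projected geometry against the network's predicted 3D quantities, weighted by predicted confidence.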

Our method significantly outperforms existing single-stage approaches on the NOCS dataset, with up to an 8.7% absolute improvement in mean average precision (mAP) for 6-DoF pose estimation of unseen objects. With the support of the Huawei 2012 Lab, we also built an experimental robot-arm visual servoing system with an accuracy of 0.1 mm and successfully tested the algorithm at 35 FPS in an industrial production scene, demonstrating the practical value of this research.

Keywords
Other Keywords
Language
Chinese
Training Type
Independent training
Year of Enrollment
2020
Year of Degree Conferral
2023-06
Degree Assessment Subcommittee
Electronic Science and Technology
Chinese Library Classification
TP391.4
Source
Manual submission
Document Type
Thesis
Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/543983
Collection
College of Engineering, Department of Computer Science and Engineering
Recommended Citation (GB/T 7714)
梁泽瑞. 基于上下文聚类融合的单视图整体三维理解算法研究[D]. 深圳: 南方科技大学, 2023.