Title

Research on a New Self-Referential Generalization Model Method

Alternative Title
RESEARCH ON NEW SELF-REFERENTIAL GENERALIZATION MODEL METHOD
Name
阚哲涵
Name (Pinyin)
KAN Zhehan
Student ID
12132120
Degree Type
Master
Degree Program
0856 Materials and Chemical Engineering
Discipline / Professional Degree Category
0856 Materials and Chemical Engineering
Supervisor
何志海
Supervisor's Affiliation
Department of Electronic and Electrical Engineering
Thesis Defense Date
2023-05-18
Thesis Submission Date
2023-06-30
Degree-Granting Institution
Southern University of Science and Technology
Degree-Granting Location
Shenzhen
Abstract

A central challenge in learning and prediction is generalization: a well-trained prediction network often suffers severe performance degradation on new test samples. A major cause of this problem is that, once the prediction network has been trained on labeled samples, it remains fixed and performs inference in a purely feed-forward manner, without using the specific characteristics of the test sample as guidance to correct its own inference. Because the ground truth of a test sample is unavailable, the network has no way to characterize its prediction errors, derive guidance information from the test sample, and correct its predictions on the fly for each individual test sample. We argue that this capability of prediction-error characterization and feedback-based correction is critical to the generalization performance of prediction networks.

In this work, we propose a self-correcting inference method to address this generalization challenge and demonstrate its effectiveness and performance on human pose estimation. Specifically, we introduce sample-specific constraint errors, including external constraint errors obtained from task-related constraints, and self-referential errors obtained by designing an adaptive feedback network that maps the output of the prediction network back to the input sample.

Using the sample-specific constraint error, which is highly correlated with the actual prediction error, as the objective function, we explore three approaches for correcting predictions at inference time: (1) searching a local neighborhood of the prediction to minimize the sample-specific constraint error; (2) learning a correction network that adaptively corrects prediction errors according to the guidance information; (3) using the sample-specific constraint error as a loss function to refine the prediction network on the current test sample.

We apply the self-correcting inference method to two applications: 2D human pose estimation and 3D hand pose estimation. Extensive experiments on these two applications demonstrate that the proposed method significantly improves the generalization capability and performance of network predictions.
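The first correction strategy above, searching a local neighborhood of the prediction to minimize a sample-specific constraint error, can be illustrated with a minimal sketch. Everything below (the toy `feedback` map, `constraint_error`, and `refine_by_search`) is a hypothetical stand-in for the thesis's actual networks, shown only to make the inference-time search loop concrete.

```python
def feedback(pred):
    # Hypothetical adaptive feedback network: maps the prediction back
    # to the input domain. A toy linear map stands in for a real network.
    return [2.0 * p for p in pred]

def constraint_error(pred, x):
    # Self-referential error: squared distance between the reconstructed
    # input feedback(pred) and the actual test input x. No ground truth
    # for pred is needed, which is what makes it usable at test time.
    return sum((f - xi) ** 2 for f, xi in zip(feedback(pred), x))

def refine_by_search(pred, x, step=0.05, iters=200):
    # Coordinate-wise local search around the initial prediction,
    # keeping any move that lowers the sample-specific constraint error.
    best, best_err = list(pred), constraint_error(pred, x)
    for _ in range(iters):
        improved = False
        for i in range(len(best)):
            for delta in (-step, step):
                cand = list(best)
                cand[i] += delta
                err = constraint_error(cand, x)
                if err < best_err:
                    best, best_err, improved = cand, err, True
        if not improved:  # no neighbor improves: local minimum reached
            break
    return best, best_err

x = [1.0, 2.0]        # test input
init = [0.8, 0.7]     # initial (imperfect) feed-forward prediction
refined, err = refine_by_search(init, x)
print(refined, err)
```

In the thesis the feedback map is itself a learned network and the search runs over pose parameters, but the control flow is the same: the constraint error acts as a test-time surrogate for the unavailable prediction error.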

Keywords
Language
Chinese
Training Category
Independently trained
Year of Enrollment
2021
Year Degree Conferred
2023-06

Degree Evaluation Subcommittee
Materials and Chemical Engineering
Chinese Library Classification
TP391.4
Source Repository
Manual submission
Output Type
Dissertation
Item Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/544714
Collection
College of Engineering, Department of Electronic and Electrical Engineering
Recommended Citation
GB/T 7714
KAN Zhehan. Research on a New Self-Referential Generalization Model Method[D]. Shenzhen: Southern University of Science and Technology, 2023.
Files in This Item
File Name/Size: 12132120-阚哲涵-电子与电气工程 (13793 KB); Access: Restricted (full text available on request)

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.