[1] RUSSELL S, NORVIG P. Artificial Intelligence: A Modern Approach[M]. 4th Global ed. Prentice Hall, 2022.
[2] KORTLI Y, JRIDI M, ALFALOU A, et al. Face Recognition Systems: A Survey[J]. Sensors, 2020, 20(2): 342.
[3] NASSIF A B, SHAHIN I, ATTILI I B, et al. Speech Recognition Using Deep Neural Networks: A Systematic Review[J]. IEEE Access, 2019, 7: 19143-19165.
[4] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
[5] LU Z, RATHOD V, VOTEL R, et al. RetinaTrack: Online Single Stage Joint Detection and Tracking[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020. Computer Vision Foundation / IEEE, 2020: 14656-14666.
[6] LIU D, KONG H, LUO X, et al. Bringing AI to edge: From deep learning’s perspective[J]. Neurocomputing, 2022, 485: 297-320.
[7] YANG H, TATE M. A Descriptive Literature Review and Classification of Cloud Computing Research[J]. Commun. Assoc. Inf. Syst., 2012, 31: 2.
[8] SHUKUR H, ZEEBAREE S, ZEBARI R, et al. Cloud computing virtualization of resources allocation for distributed systems[J]. Journal of Applied Science and Technology Trends, 2020, 1(3): 98-105.
[9] SHI W, DUSTDAR S. The Promise of Edge Computing[J]. Computer, 2016, 49(5): 78-81.
[10] SZE V, CHEN Y, YANG T, et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey[J]. Proc. IEEE, 2017, 105(12): 2295-2329.
[11] CHENG Y, WANG D, ZHOU P, et al. A Survey of Model Compression and Acceleration for Deep Neural Networks[J]. CoRR, 2017, abs/1710.09282.
[12] GUO Y, YAO A, CHEN Y. Dynamic Network Surgery for Efficient DNNs[C]//Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain. 2016: 1379-1387.
[13] LI H, KADAV A, DURDANOVIC I, et al. Pruning Filters for Efficient ConvNets[C]//5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017.
[14] HE Y, LIU P, WANG Z, et al. Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration[C]//IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019. Computer Vision Foundation / IEEE, 2019: 4340-4349.
[15] WANG Q, LI D, HUANG X, et al. Optimizing FFT-Based Convolution on ARMv8 Multi-core CPUs[C]//Lecture Notes in Computer Science: volume 12247 Euro-Par 2020: Parallel Processing - 26th International Conference on Parallel and Distributed Computing, Warsaw, Poland, August 24-28, 2020, Proceedings. Springer, 2020: 248-262.
[16] LI M, LIU Y, LIU X, et al. The Deep Learning Compiler: A Comprehensive Survey[J]. IEEE Trans. Parallel Distributed Syst., 2021, 32(3): 708-727.
[17] CAPRA M, BUSSOLINO B, MARCHISIO A, et al. Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead[J]. IEEE Access, 2020, 8: 225134-225180.
[18] LI S. TensorFlow Lite: On-device machine learning framework[J]. Journal of Computer Research and Development, 2020, 57(9): 1839.
[19] JOUPPI N P, YOUNG C, PATIL N, et al. In-Datacenter Performance Analysis of a Tensor Processing Unit[C]//Proceedings of the 44th Annual International Symposium on Computer Architecture, ISCA 2017, Toronto, ON, Canada, June 24-28, 2017. ACM, 2017: 1-12.
[20] HAN S, KANG J, MAO H, et al. ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA[C]//Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA 2017, Monterey, CA, USA, February 22-24, 2017. ACM, 2017: 75-84.
[21] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Commun. ACM, 2017, 60(6): 84-90.
[22] BENGIO Y, LECUN Y, HINTON G E. Deep learning for AI[J]. Commun. ACM, 2021, 64(7): 58-65.
[23] KIM Y, PARK E, YOO S, et al. Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications[C]//4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. 2016.
[24] LI G. Deep Model Simplification: Storage Compression and Computation Acceleration[D]. University of Science and Technology of China, 2018.
[25] HE Y, LIU P, WANG Z, et al. Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration[C]//IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019. Computer Vision Foundation / IEEE, 2019: 4340-4349.
[26] HE Y, KANG G, DONG X, et al. Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks[C]//Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden. ijcai.org, 2018: 2234-2240.
[27] ZHUANG Z, TAN M, ZHUANG B, et al. Discrimination-aware Channel Pruning for Deep Neural Networks[C]//Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada. 2018: 883-894.
[28] ZHAO R, LUK W. Efficient Structured Pruning and Architecture Searching for Group Convolution[C]//2019 IEEE/CVF International Conference on Computer Vision Workshops, ICCV Workshops 2019, Seoul, Korea (South), October 27-28, 2019. IEEE, 2019: 1961-1970.
[29] VYSOGORETS A, KEMPE J. Connectivity matters: Neural network pruning through the lens of effective sparsity[J]. Journal of Machine Learning Research, 2023, 24(99): 1-23.
[30] LIBERIS E, LANE N D. Differentiable Neural Network Pruning to Enable Smart Applications on Microcontrollers[J]. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2022, 6(4): 171:1-171:19.
[31] WANG X, WANG J, TANG X, et al. Filter Pruning via Filters Similarity in Consecutive Layers [J]. CoRR, 2023, abs/2304.13397.
[32] HAN S, MAO H, DALLY W J. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding[C]//4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. 2016.
[33] HUBARA I, COURBARIAUX M, SOUDRY D, et al. Binarized Neural Networks[C]// Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain. 2016: 4107-4115.
[34] ZHU C, HAN S, MAO H, et al. Trained Ternary Quantization[C]//5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017.
[35] ZHOU S, NI Z, ZHOU X, et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients[J]. CoRR, 2016, abs/1606.06160.
[36] ZHOU A, YAO A, WANG K, et al. Explicit Loss-Error-Aware Quantization for Low-Bit Deep Neural Networks[C]//2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018. Computer Vision Foundation / IEEE Computer Society, 2018: 9426-9435.
[37] RIGAMONTI R, SIRONI A, LEPETIT V, et al. Learning Separable Filters[C]//2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, June 23-28, 2013. IEEE Computer Society, 2013: 2754-2761.
[38] DENTON E L, ZAREMBA W, BRUNA J, et al. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation[C]//Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada. 2014: 1269-1277.
[39] JADERBERG M, VEDALDI A, ZISSERMAN A. Speeding up Convolutional Neural Networks with Low Rank Expansions[C]//British Machine Vision Conference, BMVC 2014, Nottingham, UK, September 1-5, 2014. BMVA Press, 2014.
[40] TAI C, XIAO T, WANG X, et al. Convolutional neural networks with low-rank regularization[C]//4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. 2016.
[41] HINTON G E, VINYALS O, DEAN J. Distilling the Knowledge in a Neural Network[J]. CoRR, 2015, abs/1503.02531.
[42] ZHANG Y, XIANG T, HOSPEDALES T M, et al. Deep Mutual Learning[C]//2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018. Computer Vision Foundation / IEEE Computer Society, 2018: 4320-4328.
[43] MIRZADEH S, FARAJTABAR M, LI A, et al. Improved Knowledge Distillation via Teacher Assistant[C]//The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020. AAAI Press, 2020: 5191-5198.
[44] FU H, ZHOU S, YANG Q, et al. LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding[C]//Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021. AAAI Press, 2021: 12830-12838.
[45] WANG K, LIU Z, LIN Y, et al. HAQ: Hardware-Aware Automated Quantization With Mixed Precision[C]//IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019. Computer Vision Foundation / IEEE, 2019: 8612-8620.
[46] HONG W, LI G, LIU S, et al. Multi-objective evolutionary optimization for hardware-aware neural network pruning[J]. Fundamental Research, 2022.
[47] YANG S, CHEN W, ZHANG X, et al. AUTO-PRUNE: automated DNN pruning and mapping for ReRAM-based accelerator[C]//ICS ’21: 2021 International Conference on Supercomputing, Virtual Event, USA, June 14-17, 2021. ACM, 2021: 304-315.
[48] HE Y, LIN J, LIU Z, et al. AMC: AutoML for Model Compression and Acceleration on Mobile Devices[C]//Lecture Notes in Computer Science: volume 11211 Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part VII. Springer, 2018: 815-832.
[49] YANG T, HOWARD A G, CHEN B, et al. NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications[C]//Lecture Notes in Computer Science: volume 11214 Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part X. Springer, 2018: 289-304.
[50] YU F, HAN C, WANG P, et al. HFP: Hardware-Aware Filter Pruning for Deep Convolutional Neural Networks Acceleration[C]//25th International Conference on Pattern Recognition, ICPR 2020, Virtual Event / Milan, Italy, January 10-15, 2021. IEEE, 2020: 255-262.
[51] SHEN M, YIN H, MOLCHANOV P, et al. HALP: Hardware-Aware Latency Pruning[J]. CoRR, 2021, abs/2110.10811.
[52] LI W, WANG R, QIAN D. CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks[C]//PMAM@PPoPP 2021: Proceedings of the Twelfth International Workshop on Programming Models and Applications for Multicores and Manycores, Virtual Event, Republic of Korea, 27 February 2021. ACM, 2021: 11-20.
[53] ELSKEN T, METZEN J H, HUTTER F. Neural Architecture Search[M]. Cham: Springer International Publishing, 2019: 63-77.
[54] DAI X, ZHANG P, WU B, et al. ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation[C]//IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019. Computer Vision Foundation / IEEE, 2019: 11398-11407.
[55] KURTIC E, FRANTAR E, ALISTARH D. ZipLM: Hardware-Aware Structured Pruning of Language Models[J]. CoRR, 2023, abs/2302.04089.
[56] YANG H, ZHU Y, LIU J. ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model[C]//IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019. Computer Vision Foundation / IEEE, 2019: 11206-11215.
[57] HUANG K, CHEN S, LI B, et al. Acceleration-Aware Fine-Grained Channel Pruning for Deep Neural Networks via Residual Gating[J]. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022, 41(6): 1902-1915.
[58] YANG T, CHEN Y, SZE V. Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21-26, 2017. IEEE Computer Society, 2017: 6071-6079.
[59] XIAO J, ZHANG C, GONG Y, et al. HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks[J]. CoRR, 2023, abs/2301.09422.
[60] ROSENBLATT F. The perceptron, a perceiving and recognizing automaton (Project Para)[R]. Cornell Aeronautical Laboratory, 1957.
[61] AMARI S. Backpropagation and stochastic gradient descent method[J]. Neurocomputing, 1993, 5(3): 185-196.
[62] CUTKOSKY A, MEHTA H. Momentum Improves Normalized SGD[C]//Proceedings of Machine Learning Research: volume 119 Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event. PMLR, 2020: 2260-2268.
[63] DUCHI J C, HAZAN E, SINGER Y. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization[J]. J. Mach. Learn. Res., 2011, 12: 2121-2159.
[64] KINGMA D P, BA J. Adam: A Method for Stochastic Optimization[C]//3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings. 2015.
[65] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proc. IEEE, 1998, 86(11): 2278-2324.
[66] DENG L, LI G, HAN S, et al. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey[J]. Proc. IEEE, 2020, 108(4): 485-532.
[67] HAN S, POOL J, TRAN J, et al. Learning both Weights and Connections for Efficient Neural Network[C]//Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada. 2015: 1135-1143.
[68] SRIVASTAVA N, HINTON G E, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. J. Mach. Learn. Res., 2014, 15(1): 1929-1958.
[69] LECUN Y, DENKER J S, SOLLA S A. Optimal Brain Damage[C]//Advances in Neural Information Processing Systems 2, [NIPS Conference, Denver, Colorado, USA, November 27-30, 1989]. Morgan Kaufmann, 1989: 598-605.
[70] HANSON S J, PRATT L Y. Comparing Biases for Minimal Network Construction with Back Propagation[C]//Advances in Neural Information Processing Systems 1, [NIPS Conference, Denver, Colorado, USA, 1988]. Morgan Kaufmann, 1988: 177-185.
[71] HASSIBI B, STORK D G. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon[C]//Advances in Neural Information Processing Systems 5, [NIPS Conference, Denver, Colorado, USA, November 30 - December 3, 1992]. Morgan Kaufmann, 1992: 164-171.
[72] SRINIVAS S, BABU R V. Data-free Parameter Pruning for Deep Neural Networks[C]// Proceedings of the British Machine Vision Conference 2015, BMVC 2015, Swansea, UK, September 7-10, 2015. BMVA Press, 2015: 31.1-31.12.
[73] CHEN W, WILSON J T, TYREE S, et al. Compressing Neural Networks with the Hashing Trick[C]//JMLR Workshop and Conference Proceedings: volume 37 Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015. JMLR.org, 2015: 2285-2294.
[74] ULLRICH K, MEEDS E, WELLING M. Soft Weight-Sharing for Neural Network Compression [C]//5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017.
[75] LIANG T, GLOSSNER J, WANG L, et al. Pruning and quantization for deep neural network acceleration: A survey[J]. Neurocomputing, 2021, 461: 370-403.
[76] FRANKLE J, CARBIN M. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks[C]//7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019.
[77] HOLLEMANS M. How fast is my model?[EB/OL]. 2018[2018-07-30]. https://machinethink.net/blog/how-fast-is-my-model/.
[78] MARCULESCU D, STAMOULIS D, CAI E. Hardware-aware machine learning: modeling and optimization[C]//Proceedings of the International Conference on Computer-Aided Design, ICCAD 2018, San Diego, CA, USA, November 05-08, 2018. ACM, 2018: 137.
[79] NVIDIA. Jetson Xavier NX[EB/OL]. 2022[2022-08-24]. https://www.nvidia.cn/autonomous-machines/embedded-systems/jetson-xavier-nx/.
[80] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet Classification with Deep Convolutional Neural Networks[C]//Advances in Neural Information Processing Systems 25: Annual Conference on Neural Information Processing Systems 2012, Lake Tahoe, Nevada, USA. 2012: 1106-1114.
[81] LI G, QIAN C, JIANG C, et al. Optimization based Layer-wise Magnitude-based Pruning for DNN Compression[C]//Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden. ijcai.org, 2018: 2383-2389.
[82] QIAN C. Distributed Pareto Optimization for Large-Scale Noisy Subset Selection[J]. IEEE Trans. Evol. Comput., 2020, 24(4): 694-707.
[83] KRIZHEVSKY A, HINTON G E, et al. Learning multiple layers of features from tiny images[R]. Toronto, ON, Canada: University of Toronto, 2009.