Title

面向非完美忆阻阵列的鲁棒神经网络研究 (Research on Robust Neural Networks for Imperfect Memristive Crossbars)

Alternative Title
ROBUST NEURAL NETWORK DESIGN FOR MEMRISTIVE CROSSBARS
Name
Name (Pinyin)
XIAO Yang
Student ID
12032499
Degree Type
Master
Discipline Code
080900
Discipline Category
08 Engineering
Supervisor
袁博
Supervisor's Affiliation
Department of Computer Science and Engineering
External Supervisor's Institution
Southern University of Science and Technology
Thesis Defense Date
2023-05-13
Thesis Submission Date
2023-06-29
Degree-Granting Institution
Southern University of Science and Technology
Degree Granted At
Shenzhen
Abstract
Thanks to their in-memory computing paradigm, memristive crossbars have become highly promising accelerators for neural networks. When a crossbar accelerates a neural network, the network's weights must be programmed into the memristors' conductance values for storage and computation. However, owing to the physical properties of the devices, the programming process introduces various forms of error, such as write errors, quantization errors, drift errors, and stuck-at faults. These errors cause the actually programmed conductances to deviate from their target values, severely degrading the performance of the memristive neural network. This thesis studies how to overcome write errors and quantization errors in memristive neural networks, and proposes corresponding error-tolerance algorithms that let a network retain high performance while being accelerated by a memristive crossbar. The main contributions are as follows:
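As a toy illustration of the write-error model described above (this sketch is not from the thesis; the multiplicative lognormal perturbation and all names below are my own illustrative assumptions, though such models are commonly assumed for memristive devices):

```python
import numpy as np

rng = np.random.default_rng(42)

def program_with_write_error(g_target, theta=0.1):
    """Simulate programming target conductances onto a crossbar.

    The actually written conductance deviates from the target by a
    multiplicative lognormal factor exp(theta * eps); theta controls
    the device's write-noise level. Illustrative model only.
    """
    eps = rng.standard_normal(g_target.shape)
    return g_target * np.exp(theta * eps)

g = np.linspace(0.1, 1.0, 8)          # target conductance levels
g_written = program_with_write_error(g, theta=0.1)
rel_dev = np.abs(g_written - g) / g   # relative programming error
```

Under this model the relative deviation is independent of the target magnitude, which is why even a well-trained deterministic network can lose substantial accuracy once its weights are written to the array.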

(1) A Bayesian-inference-based solution to write errors. Under write errors, the actually programmed conductances carry some uncertainty, so we want the trained network weights to carry uncertainty as well, enabling them to tolerate the perturbations that write errors introduce. This is achieved by incorporating Bayesian inference into the network's offline training.

(2) A clustering-based non-uniform quantization scheme for quantization errors. Because a memristor's resolution is limited, its conductance can occupy only a handful of discrete states, which means the nearly continuous weights must be quantized to a finite set of levels. In the conventional uniform scheme these levels are spread evenly across the conductance range; this preserves performance when the resolution is high, but degrades it severely when the resolution is low. We model the choice of discrete levels as a clustering problem and solve it with the k-means algorithm, preserving the performance of memristive neural networks in low-resolution settings.

(3) The proposed algorithms are validated on three public datasets (MNIST, CIFAR-10, CIFAR-100) with three classic architectures (MLP, AlexNet, ResNet-18). On the MNIST handwritten-digit task, for example, the baseline MLP reaches 98.36% accuracy in an error-free setting, but its average accuracy drops to 71.77% under errors; with the proposed scheme, the average accuracy under errors remains around 94.36%.
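The Bayesian idea in contribution (1) can be sketched as a linear layer whose weights are sampled from a learned Gaussian posterior on every forward pass (the reparameterization trick). This is a minimal sketch under my own assumptions, not the thesis's implementation; all class and parameter names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

class BayesianLinear:
    """Linear layer with a Gaussian weight posterior N(mu, sigma^2).

    During training, each forward pass samples W = mu + sigma * eps
    (reparameterization trick), so the learned mean weights must work
    under exactly the kind of perturbation a crossbar write error
    would apply. Illustrative sketch, not the thesis's code.
    """
    def __init__(self, n_in, n_out):
        self.mu = rng.normal(0.0, 0.1, size=(n_in, n_out))
        # rho parameterizes sigma = softplus(rho) > 0
        self.rho = np.full((n_in, n_out), -3.0)

    def sigma(self):
        return np.log1p(np.exp(self.rho))  # softplus keeps sigma positive

    def forward(self, x, sample=True):
        if sample:
            eps = rng.standard_normal(self.mu.shape)
            w = self.mu + self.sigma() * eps   # stochastic weights
        else:
            w = self.mu                        # deterministic mean weights
        return x @ w

layer = BayesianLinear(4, 2)
x = np.ones((1, 4))
y1 = layer.forward(x)                 # stochastic: varies between calls
y2 = layer.forward(x, sample=False)   # deterministic: uses mu only
```

Training mu and rho against the task loss (plus a KL regularizer in the full Bayesian treatment) drives the network toward weight regions that stay accurate under sampled perturbations.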
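The clustering-based quantization of contribution (2) can likewise be sketched with a plain 1-D Lloyd's k-means: the k conductance levels are placed where the weight distribution is dense instead of uniformly across the range. This is a hypothetical sketch; function names, the synthetic bimodal weights, and the simple Lloyd's solver are my assumptions (the thesis may use an exact 1-D k-means):

```python
import numpy as np

def kmeans_1d(values, k, iters=50, seed=0):
    """Lloyd's k-means on a 1-D array; returns the k sorted level centers."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False)
    for _ in range(iters):
        # assign each weight to its nearest level, then recenter each level
        idx = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            members = values[idx == j]
            if members.size:
                centers[j] = members.mean()
    return np.sort(centers)

def quantize(weights, levels):
    """Snap each weight to its nearest discrete level."""
    idx = np.abs(weights[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

# synthetic bimodal weights: most mass near 0, a small cluster near 0.5
w = np.concatenate([np.random.default_rng(1).normal(0.0, 0.05, 900),
                    np.random.default_rng(2).normal(0.5, 0.02, 100)])
levels = kmeans_1d(w, 4)      # 4 non-uniform conductance states
wq = quantize(w, levels)
```

On such a skewed distribution the k-means levels concentrate near the dense mode, so the quantization error is lower than with four uniformly spaced levels, which is exactly the low-resolution regime the thesis targets.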

Keywords
Language
Chinese
Training Category
Independently trained
Year of Enrollment
2020
Degree Conferral Date
2023-06-25

Degree Evaluation Subcommittee
Electronic Science and Technology
Chinese Library Classification Number
TP183
Source
Manual submission
Item Type
Degree thesis
Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/544528
Collection
College of Engineering, Department of Computer Science and Engineering
Recommended Citation (GB/T 7714)
肖样. 面向非完美忆阻阵列的鲁棒神经网络研究[D]. 深圳: 南方科技大学, 2023.

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.