Advances in information technology have heightened the importance of high-performance computer vision in scientific and technological research. As third-generation neural network models, Spiking Neural Networks (SNNs), which mimic biological neural systems, are poised to advance deep learning: their event-driven nature enables sparse communication and energy-efficient computation. Recent work has explored SNNs for brain-inspired computer vision, making progress in visual information preprocessing, spiking neuron modeling, deep architecture design, SNN training, and system-level support, which lays a strong foundation for further research in brain-inspired computer vision and SNN modeling. Two limitations remain: compared with artificial neural networks, SNNs are under-explored for computer vision tasks, and they run inefficiently on current hardware such as GPUs, which degrades their performance and impedes further development. This research focuses on optimizing SNNs for brain-inspired computer vision. First, we categorize preprocessing methods for two kinds of datasets, compare their effects, and assess SNN performance with extended time steps on brain-inspired tasks. Second, we develop a feature pyramid network with adaptive firing thresholds for automotive event-based detection and classification, and demonstrate its high efficiency on neuromorphic hardware. Third, we introduce a temporal fusion optimization method that accelerates SNN training and inference on single and multiple GPUs at the system level; theoretical and experimental analyses confirm its optimization potential for large-scale applications.
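To make the adaptive-firing-threshold mechanism mentioned above concrete, the sketch below implements a leaky integrate-and-fire (LIF) neuron whose threshold rises after each spike and decays back toward a baseline. This is a minimal illustration under assumed dynamics; the hyperparameters (`tau`, `v_th0`, `beta`, `tau_th`) and class name are illustrative choices, not the implementation used in this work.

```python
import torch


class AdaptiveLIF(torch.nn.Module):
    """LIF neuron with a spike-triggered adaptive firing threshold (illustrative)."""

    def __init__(self, tau=2.0, v_th0=1.0, beta=0.1, tau_th=10.0):
        super().__init__()
        self.tau = tau        # membrane time constant
        self.v_th0 = v_th0    # baseline firing threshold
        self.beta = beta      # threshold increment added per spike
        self.tau_th = tau_th  # time constant of threshold decay

    def forward(self, x):
        # x: input current over time, shape (T, batch, features)
        v = torch.zeros_like(x[0])               # membrane potential
        th = torch.full_like(x[0], self.v_th0)   # per-neuron threshold
        spikes = []
        for t in range(x.shape[0]):
            v = v + (x[t] - v) / self.tau        # leaky integration of input
            s = (v >= th).float()                # spike where v crosses threshold
            v = v * (1.0 - s)                    # hard reset on spike
            th = th + self.beta * s              # threshold jumps after a spike...
            th = th - (th - self.v_th0) / self.tau_th  # ...and decays to baseline
            spikes.append(s)
        return torch.stack(spikes)               # binary spikes, shape (T, batch, features)


# Illustrative usage: 16 time steps, batch of 4, 10 neurons.
neuron = AdaptiveLIF()
out = neuron(torch.rand(16, 4, 10))
```

Raising the threshold after each spike suppresses redundant firing on strong inputs, which is one way adaptive thresholds can reduce spike counts and thereby improve efficiency on event-based workloads.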