Advances in information technology have heightened the importance of high-performance computer vision in scientific and technological research. As third-generation neural network models, Spiking Neural Networks (SNNs), which mimic biological neural systems, are poised to advance deep learning: their event-driven nature enables sparse communication and energy-efficient computation. Recent work has explored SNNs for brain-inspired computer vision, making progress in visual information preprocessing, spiking neuron modeling, deep architecture design, SNN training, and system-level support, which lays a strong foundation for further research in brain-inspired computer vision and SNN modeling. Two limitations remain: compared with artificial neural networks, SNNs are under-explored for computer vision tasks, and they run inefficiently on current hardware such as GPUs, which degrades their performance and impedes further development. This research focuses on optimizing SNNs for brain-inspired computer vision. First, we categorize preprocessing methods for two kinds of datasets, compare their effects, and assess SNN performance with extended time steps on brain-inspired tasks. Second, we develop a feature pyramid network with adaptive firing thresholds for automotive event-based detection and classification, and demonstrate its high efficiency on neuromorphic hardware. Third, we introduce a temporal fusion optimization method that accelerates SNN training and inference on single and multiple GPUs at the system level; theoretical and experimental analyses confirm its optimization potential for large-scale applications.
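To make the adaptive-firing-threshold mechanism mentioned above concrete, the sketch below implements a leaky integrate-and-fire (LIF) neuron whose threshold rises after each spike and decays back toward a baseline. This is a minimal illustration under assumed dynamics; the hyperparameters (`tau`, `v_th0`, `beta`, `tau_th`) and class name are illustrative choices, not the implementation used in this work.

```python
import torch


class AdaptiveLIF(torch.nn.Module):
    """LIF neuron with a spike-triggered adaptive firing threshold (illustrative)."""

    def __init__(self, tau=2.0, v_th0=1.0, beta=0.1, tau_th=10.0):
        super().__init__()
        self.tau = tau        # membrane time constant
        self.v_th0 = v_th0    # baseline firing threshold
        self.beta = beta      # threshold increment added per spike
        self.tau_th = tau_th  # time constant of threshold decay

    def forward(self, x):
        # x: input current over time, shape (T, batch, features)
        v = torch.zeros_like(x[0])               # membrane potential
        th = torch.full_like(x[0], self.v_th0)   # per-neuron threshold
        spikes = []
        for t in range(x.shape[0]):
            v = v + (x[t] - v) / self.tau        # leaky integration of input
            s = (v >= th).float()                # spike where v crosses threshold
            v = v * (1.0 - s)                    # hard reset on spike
            th = th + self.beta * s              # threshold jumps after a spike...
            th = th - (th - self.v_th0) / self.tau_th  # ...and decays to baseline
            spikes.append(s)
        return torch.stack(spikes)               # binary spikes, shape (T, batch, features)


# Illustrative usage: 16 time steps, batch of 4, 10 neurons.
neuron = AdaptiveLIF()
out = neuron(torch.rand(16, 4, 10))
```

Raising the threshold after each spike suppresses redundant firing on strong inputs, which is one way adaptive thresholds can reduce spike counts and thereby improve efficiency on event-based workloads.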