Southern University of Science and Technology Knowledge Repository (SUSTech KC): Multi-objective magnitude-based pruning for latency-aware deep neural network compression

Title	Multi-objective magnitude-based pruning for latency-aware deep neural network compression
Authors	Hong, Wenjing1,2,3 ; Yang, Peng1 ; Wang, Yiwen4 ; Tang, Ke1
Corresponding author	Tang, Ke
DOI	10.1007/978-3-030-58112-1_32
Publication date	2020
ISSN	0302-9743
EISSN	1611-3349
Proceedings title	LECT NOTES ARTIF INT
Volume	12269 LNCS
Pages	470-483
Abstract	Layer-wise magnitude-based pruning (LMP) is a popular method for Deep Neural Network (DNN) compression. By pruning connections in the network, it has the potential to reduce the latency of DNN inference, which promotes the application of DNNs to tasks with real-time operation requirements, such as self-driving vehicles, video detection and tracking. However, previous methods mainly use the compression rate as a proxy for latency, without explicitly accounting for latency in the training of the compressed network. This paper presents a new layer-wise magnitude-based pruning method, namely Multi-objective Magnitude-based Latency-Aware Pruning (MMLAP). MMLAP captures latency directly and incorporates a novel multi-objective evolutionary algorithm to optimize both the accuracy of a DNN and its latency efficiency when designing compressed networks, i.e., when tuning the hyper-parameters of LMP. Empirical studies show the competitiveness of MMLAP compared to well-established LMP methods and show the value of multi-objective optimization in yielding Pareto-optimal compressed networks in terms of accuracy and latency.
Keywords	Compression ; Latency-aware ; Magnitude-based pruning ; Multi-objective evolutionary algorithm ; Multi-objective optimization
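Institutional attribution	First ; Corresponding
Language	English
Related links	[Scopus record]
Indexed by	EI
EI accession number	20203909228657
EI subject terms	Multilayer neural networks ; Pareto principle ; Deep neural networks ; Evolutionary algorithms
EI classification codes	Ergonomics and Human Factors Engineering:461.4 ; Optimization Techniques:921.5
Scopus record ID	2-s2.0-85091285814
Source database	Scopus
Citation statistics	Times cited [WOS]: 6
Output type	Conference paper
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/188044
Collection	College of Engineering_Department of Computer Science and Engineering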
Author affiliations	1. Guangdong Provincial Key Laboratory of Brain-Inspired Intelligent Computation, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China 2. Department of Management Science, University of Science and Technology of China, Hefei, 230027, China 3. Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Guangzhou, 510515, China 4. Department of Electronic and Computer Engineering, Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong
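First author affiliation	Department of Computer Science and Engineering
Corresponding author affiliation	Department of Computer Science and Engineering
First author's primary affiliation	Department of Computer Science and Engineering
Recommended citation (GB/T 7714)	Hong, Wenjing, Yang, Peng, Wang, Yiwen, et al. Multi-objective magnitude-based pruning for latency-aware deep neural network compression[C], 2020: 470-483.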

Files in this item
File name/size	Document type	Version type	Access type	License
muti objective magni (455KB)	Conference paper	--	Restricted access	CC BY-NC-SA
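
Illustrative sketch (not part of the record)
The abstract describes two ideas: pruning each layer's smallest-magnitude weights according to a per-layer hyper-parameter, and keeping only compressed networks that are Pareto-optimal in accuracy and latency. The Python sketch below illustrates those two ideas in isolation. It is not the authors' MMLAP implementation: the record does not describe the evolutionary algorithm or how latency is measured, so both are left out. The function names (`prune_layer`, `prune_network`, `pareto_front`), the use of a per-layer sparsity fraction as the hyper-parameter, and the sample numbers are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def prune_layer(weights: Sequence[float], sparsity: float) -> List[float]:
    """Zero out the smallest-magnitude fraction `sparsity` of one layer's weights."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Sort indices by absolute weight; the first n_prune indices are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def prune_network(
    layers: Sequence[Sequence[float]], sparsities: Sequence[float]
) -> List[List[float]]:
    """Apply a separate (hypothetical) sparsity hyper-parameter to each layer."""
    if len(layers) != len(sparsities):
        raise ValueError("need exactly one sparsity value per layer")
    return [prune_layer(w, s) for w, s in zip(layers, sparsities)]


Candidate = Tuple[float, float]  # (accuracy, latency)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on both objectives and better on one.

    Higher accuracy is better; lower latency is better.
    """
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(candidates: Sequence[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


if __name__ == "__main__":
    # Made-up weights and (accuracy, latency) pairs, for illustration only.
    print(prune_layer([0.5, -0.1, 0.03, -2.0], 0.5))
    print(pareto_front([(0.90, 10.0), (0.92, 12.0), (0.89, 11.0), (0.92, 15.0)]))
```

In a multi-objective search, each candidate would correspond to one choice of per-layer sparsities; evaluating its accuracy and latency and filtering with `pareto_front` leaves the trade-off set from which a user picks a network.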