Title | Multi-objective magnitude-based pruning for latency-aware deep neural network compression |
Authors | |
Corresponding author | Tang, Ke |
DOI | |
Publication date | 2020
|
ISSN | 0302-9743
|
EISSN | 1611-3349
|
Proceedings title | |
Volume | 12269 LNCS
|
Pages | 470-483
|
Abstract | Layer-wise magnitude-based pruning (LMP) is a popular method for Deep Neural Network (DNN) compression. By pruning connections in the network, it can reduce the inference latency of a DNN, which facilitates the application of DNNs to tasks with real-time operation requirements, such as self-driving vehicles and video detection and tracking. However, previous methods mainly use the compression rate as a proxy for latency, without explicitly accounting for latency when training the compressed network. This paper presents a new LMP method, namely Multi-objective Magnitude-based Latency-Aware Pruning (MMLAP). MMLAP captures latency directly and incorporates a novel multi-objective evolutionary algorithm to optimize both the accuracy of a DNN and its latency efficiency when designing compressed networks, i.e., when tuning the hyper-parameters of LMP. Empirical studies show the competitiveness of MMLAP compared to well-established LMP methods and show the value of multi-objective optimization in yielding Pareto-optimal compressed networks in terms of accuracy and latency. |
Keywords | |
Institutional authorship | First
; Corresponding
|
Language | English
|
Related links | [Scopus record] |
Indexed by | |
EI accession number | 20203909228657
|
EI main headings | Multilayer neural networks
; Pareto principle
; Deep neural networks
; Evolutionary algorithms
|
EI classification codes | Ergonomics and Human Factors Engineering:461.4
; Optimization Techniques:921.5
|
Scopus record ID | 2-s2.0-85091285814
|
Source database | Scopus
|
Citation statistics |
Times cited [WOS]: 6
|
Document type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/188044 |
Collection | College of Engineering_Department of Computer Science and Engineering |
Author affiliations | 1. Guangdong Provincial Key Laboratory of Brain-Inspired Intelligent Computation, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China 2. Department of Management Science, University of Science and Technology of China, Hefei, 230027, China 3. Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Guangzhou, 510515, China 4. Department of Electronic and Computer Engineering, Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong |
First author affiliation | Department of Computer Science and Engineering |
Corresponding author affiliation | Department of Computer Science and Engineering |
First author's primary affiliation | Department of Computer Science and Engineering |
Recommended citation (GB/T 7714) |
Hong,Wenjing,Yang,Peng,Wang,Yiwen,et al. Multi-objective magnitude-based pruning for latency-aware deep neural network compression[C],2020:470-483.
|
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | Actions | |
muti objective magni(455KB) | Conference paper | -- | Restricted access | CC BY-NC-SA |
|