Southern University of Science and Technology repository (SUSTech KC): EvalDNN: A toolbox for evaluating deep neural network models

Title	EvalDNN: A toolbox for evaluating deep neural network models
Authors	Tian, Yongqiang 1; Zeng, Zhihua 2; Wen, Ming 3; Liu, Yepang 4; Kuo, Tzu Yang 1; Cheung, Shing Chi 1
Corresponding author	Zeng, Zhihua
DOI	10.1145/3377812.3382133
Publication date	2020-06-27
ISSN	0270-5257
ISBN	978-1-7281-6528-8
Proceedings title	Proceedings - International Conference on Software Engineering
Pages	45-48
Conference dates	5-11 Oct. 2020
Conference location	Seoul, Korea (South)
Abstract	Recent studies have shown that the performance of deep learning models should be evaluated using various important metrics such as robustness and neuron coverage, besides the widely-used prediction accuracy metric. However, major deep learning frameworks currently only provide APIs to evaluate a model's accuracy. In order to comprehensively assess a deep learning model, framework users and researchers often need to implement new metrics by themselves, which is a tedious job. What is worse, due to the large number of hyper-parameters and inadequate documentation, evaluation results of some deep learning models are hard to reproduce, especially when the models and metrics are both new. To ease the model evaluation in deep learning systems, we have developed EvalDNN, a user-friendly and extensible toolbox supporting multiple frameworks and metrics with a set of carefully designed APIs. Using EvalDNN, evaluation of a pre-trained model with respect to different metrics can be done with a few lines of code. We have evaluated EvalDNN on 79 models from TensorFlow, Keras, GluonCV, and PyTorch. As a result of our effort made to reproduce the evaluation results of existing work, we release a performance benchmark of popular models, which can be a useful reference to facilitate future research. The tool and benchmark are available at https://github.com/yqtianust/EvalDNN and https://yqtianust.github.io/EvalDNN-benchmark/, respectively. A demo video of EvalDNN is available at: https://youtu.be/v69bNJN2bJc.
Keywords	Deep Learning Model Evaluation
University attribution	Other
Language	English
Related links	[Scopus record]
Indexing category	EI
EI accession number	20204409420993
EI subject terms	HTTP; Neural network models; Deep neural networks
EI classification codes	Ergonomics and Human Factors Engineering: 461.4; Artificial Intelligence: 723.4
Scopus record ID	2-s2.0-85094112889
Source database	Scopus
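Full-text link	https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9270369
Citation statistics	Times cited [WOS]: 11
Output type	Conference paper
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/209207
Collection	Southern University of Science and Technology, College of Engineering_Department of Computer Science and Engineering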
Author affiliations	1. Hong Kong University of Science and Technology, Hong Kong, Hong Kong 2. Zhejiang University, Hangzhou, China 3. Huazhong University of Science and Technology, Wuhan, China 4. Southern University of Science and Technology, Shenzhen, China
Recommended citation (GB/T 7714)	Tian, Yongqiang, Zeng, Zhihua, Wen, Ming, et al. EvalDNN: A toolbox for evaluating deep neural network models[C], 2020: 45-48.
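
Illustrative note: the abstract names prediction accuracy and neuron coverage as two of the metrics a model should be evaluated on. The sketch below shows, in plain Python, what those two numbers can look like on toy data. It is not EvalDNN's API and does not reproduce the paper's implementation. The function names, the 0.5 activation threshold, the assumption that activations are already scaled to [0, 1], and the toy values are all hypothetical choices made for illustration.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of inputs whose highest-scoring class equals the true label."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and of equal length")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row (first index wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def neuron_coverage(activations: Sequence[Sequence[float]], threshold: float = 0.5) -> float:
    """Fraction of neurons whose activation exceeds `threshold` on at least one input.

    activations[i][j] is the output of neuron j on input i, assumed here to be
    already scaled to [0, 1]. The threshold value is an assumption.
    """
    if not activations or not activations[0]:
        raise ValueError("activations must contain at least one input and one neuron")
    num_neurons = len(activations[0])
    covered = [False] * num_neurons
    for row in activations:
        for j, value in enumerate(row):
            if value > threshold:
                covered[j] = True
    return sum(covered) / num_neurons


# Toy data (hypothetical): 4 inputs, 3 classes; 2 inputs, 4 neurons.
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
labels = [1, 0, 1, 1]
activations = [[0.9, 0.1, 0.0, 0.6], [0.2, 0.4, 0.0, 0.7]]

accuracy = top1_accuracy(scores, labels)   # 3 of 4 predictions match
coverage = neuron_coverage(activations)    # neurons 0 and 3 exceed 0.5
print(f"top-1 accuracy: {accuracy:.2f}, neuron coverage: {coverage:.2f}")
```

In this toy run, three of the four predictions match their labels and two of the four neurons fire above the threshold, so the two metrics measure different things: one describes output correctness, the other how much of the network the test inputs exercise.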