SUSTech KC (南方科技大学知识苑): Towards a Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms

Title	Towards a Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms
Authors	Shen, Jingran 1 ; Tziritas, Nikos 2 ; Theodoropoulos, Georgios 3
Corresponding author	Theodoropoulos, Georgios
DOI	10.1007/978-3-031-57808-3_3
Publication date	2024
Conference name	13th IFIP TC 12 International Conference on Intelligent Information Processing, IIP 2024
ISSN	1868-4238
EISSN	1868-422X
ISBN	9783031578076
Proceedings title	IFIP Advances in Information and Communication Technology
Volume	703 IFIPAICT
Pages	34-47
Conference dates	May 3, 2024 - May 6, 2024
Conference location	Shenzhen, China
Publisher	Springer Science and Business Media Deutschland GmbH
Abstract	With the rapid development of Deep Learning, more and more applications on the cloud and edge tend to utilize large DNN (Deep Neural Network) models for improved task execution efficiency as well as decision-making quality. Due to memory constraints, models are commonly optimized using compression, pruning, and partitioning algorithms to become deployable onto resource-constrained devices. As the conditions in the computational platform change dynamically, the deployed optimization algorithms should accordingly adapt their solutions. To perform frequent evaluations of these solutions in a timely fashion, RMs (Regression Models) are commonly trained to predict the relevant solution quality metrics, such as the resulting DNN module inference latency, which is the focus of this paper. Existing prediction frameworks specify different RM training workflows, but none of them allow flexible configurations of the input parameters (e.g., batch size, device utilization rate) and of the selected RMs for different modules. In this paper, a deep learning module inference latency prediction framework is proposed, which i) hosts a set of customizable input parameters to train multiple different RMs per DNN module (e.g., convolutional layer) with self-generated datasets, and ii) automatically selects a set of trained RMs leading to the highest possible overall prediction accuracy, while keeping the prediction time/space consumption as low as possible. Furthermore, a new RM, namely MEDN (Multi-task Encoder-Decoder Network), is proposed as an alternative solution. Comprehensive experiment results show that MEDN is fast and lightweight, and capable of achieving the highest overall prediction accuracy and R-squared value. The Time/Space-efficient Auto-selection algorithm also manages to improve the overall accuracy by 2.5% and R-squared by 0.39%, compared to the MEDN single-selection scheme. © IFIP International Federation for Information Processing 2024.
University authorship	First ; Corresponding
Language	English
Indexed by	EI
Funding	This research was supported by: Shenzhen Science and Technology Program, China (No. GJHZ20210705141807022); Guangdong Province Innovative and Entrepreneurial Team Programme, China (No. 2017ZT07X386); SUSTech Research Institute for Trustworthy Autonomous Systems, China. Corresponding author: Georgios Theodoropoulos.
EI accession number	20241715951353
EI controlled terms	Constrained optimization ; Decision making ; Forecasting ; Inference engines ; Learning algorithms ; Learning systems ; Neural network models ; Regression analysis
EI classification codes	Ergonomics and Human Factors Engineering:461.4 ; Artificial Intelligence:723.4 ; Expert Systems:723.4.1 ; Machine Learning:723.4.2 ; Management:912.2 ; Mathematical Statistics:922.2 ; Systems Science:961
Source database	EV Compendex
Document type	Conference paper
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/794570
Collection	College of Engineering_Department of Computer Science and Engineering ; Southern University of Science and Technology
Author affiliations	1. Department of Computer Science and Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, China ; 2. Department of Informatics and Telecommunications, University of Thessaly, Lamia, Greece ; 3. Research Institute for Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, China
First author affiliation	Department of Computer Science and Engineering
Corresponding author affiliation	Department of Computer Science and Engineering
First affiliation of first author	Department of Computer Science and Engineering
Recommended citation (GB/T 7714)	Shen, Jingran,Tziritas, Nikos,Theodoropoulos, Georgios. Towards a Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms[C]:Springer Science and Business Media Deutschland GmbH,2024:34-47.
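
Illustrative note (not part of the record): the abstract describes training several regression models (RMs) per DNN module and then automatically selecting, for each module, an RM that gives the highest possible accuracy while keeping prediction time/space consumption low. The sketch below is one possible reading of that selection idea, not the paper's Time/Space-efficient Auto-selection algorithm. The names `Candidate`, `accuracy`, `select`, and `select_per_module`, the tolerance-based accuracy metric, the scalar `cost`, the `slack` parameter, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# One sample: (input parameters such as batch size, measured latency).
Sample = Tuple[Sequence[float], float]


@dataclass
class Candidate:
    """A trained regression model plus a hypothetical time/space cost."""
    name: str
    predict: Callable[[Sequence[float]], float]
    cost: float  # assumed scalar cost; lower is cheaper


def accuracy(cand: Candidate, data: List[Sample], tol: float = 0.1) -> float:
    """Fraction of samples predicted within a relative error of `tol`.

    This metric is an assumption; the paper may define accuracy differently.
    """
    hits = sum(1 for x, y in data if abs(cand.predict(x) - y) <= tol * abs(y))
    return hits / len(data)


def select(cands: List[Candidate], data: List[Sample], slack: float = 0.01) -> Candidate:
    """Pick the cheapest candidate whose accuracy is within `slack` of the best."""
    scored = [(accuracy(c, data), c) for c in cands]
    best = max(acc for acc, _ in scored)
    near_best = [c for acc, c in scored if acc >= best - slack]
    return min(near_best, key=lambda c: c.cost)


def select_per_module(
    cands_by_module: Dict[str, List[Candidate]],
    data_by_module: Dict[str, List[Sample]],
) -> Dict[str, Candidate]:
    """Run the selection independently for every module type."""
    return {m: select(cands_by_module[m], data_by_module[m]) for m in cands_by_module}


# Toy validation data: "conv" latency grows with batch size, "pool" is flat.
data = {
    "conv": [([b], 2.0 * b + 1.0) for b in (1, 2, 4, 8)],
    "pool": [([b], 1.0) for b in (1, 2, 3)],
}
# The same three hypothetical candidates are offered to both modules.
candidates = [
    Candidate("constant", lambda x: 1.0, cost=0.1),
    Candidate("linear", lambda x: 2.0 * x[0] + 1.0, cost=1.0),
    Candidate("heavy", lambda x: 2.0 * x[0] + 1.0, cost=5.0),
]
chosen = select_per_module({"conv": candidates, "pool": candidates}, data)
for module, cand in chosen.items():
    print(module, "->", cand.name)
```

In this toy setup the cheaper of two equally accurate models is preferred for "conv", and the trivial constant model is enough for "pool". A real framework would presumably measure prediction time and memory directly rather than use a single hand-assigned cost.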