Title | Towards a Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms |
Authors | |
Corresponding author | Theodoropoulos, Georgios |
DOI | |
Publication date | 2024
|
Conference name | 13th IFIP TC 12 International Conference on Intelligent Information Processing, IIP 2024
|
ISSN | 1868-4238
|
EISSN | 1868-422X
|
ISBN | 9783031578076
|
Proceedings title | |
Volume | 703 IFIPAICT
|
Pages | 34-47
|
Conference dates | May 3, 2024 - May 6, 2024
|
Conference location | Shenzhen, China
|
Publisher | |
Abstract | With the rapid development of Deep Learning, more and more applications on the cloud and edge tend to utilize large DNN (Deep Neural Network) models for improved task execution efficiency as well as decision-making quality. Due to memory constraints, models are commonly optimized using compression, pruning, and partitioning algorithms to become deployable on resource-constrained devices. As the conditions in the computational platform change dynamically, the deployed optimization algorithms should accordingly adapt their solutions. To perform frequent evaluations of these solutions in a timely fashion, RMs (Regression Models) are commonly trained to predict the relevant solution quality metrics, such as the resulting DNN module inference latency, which is the focus of this paper. Existing prediction frameworks specify different RM training workflows, but none of them allows flexible configurations of the input parameters (e.g., batch size, device utilization rate) and of the selected RMs for different modules. In this paper, a deep learning module inference latency prediction framework is proposed, which i) hosts a set of customizable input parameters to train multiple different RMs per DNN module (e.g., convolutional layer) with self-generated datasets, and ii) automatically selects a set of trained RMs leading to the highest possible overall prediction accuracy, while keeping the prediction time/space consumption as low as possible. Furthermore, a new RM, namely MEDN (Multi-task Encoder-Decoder Network), is proposed as an alternative solution. Comprehensive experiment results show that MEDN is fast and lightweight, and capable of achieving the highest overall prediction accuracy and R-squared value. The Time/Space-efficient Auto-selection algorithm also manages to improve the overall accuracy by 2.5% and R-squared by 0.39%, compared to the MEDN single-selection scheme. © IFIP International Federation for Information Processing 2024. |
University attribution | First
; Corresponding
|
Language | English
|
Indexed in | |
Funding | This research was supported by: Shenzhen Science and Technology Program, China (No. GJHZ20210705141807022); Guangdong Province Innovative and Entrepreneurial Team Programme, China (No. 2017ZT07X386); SUSTech Research Institute for Trustworthy Autonomous Systems, China. Corresponding author: Georgios Theodoropoulos.
|
EI accession number | 20241715951353
|
EI controlled terms | Constrained optimization
; Decision making
; Forecasting
; Inference engines
; Learning algorithms
; Learning systems
; Neural network models
; Regression analysis
|
EI classification codes | Ergonomics and Human Factors Engineering:461.4
; Artificial Intelligence:723.4
; Expert Systems:723.4.1
; Machine Learning:723.4.2
; Management:912.2
; Mathematical Statistics:922.2
; Systems Science:961
|
Source database | EV Compendex
|
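Citation statistics | |
Document type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/794570 |
Collection | College of Engineering_Department of Computer Science and Engineering; Southern University of Science and Technology |
Author affiliations | 1. Department of Computer Science and Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, China 2. Department of Informatics and Telecommunications, University of Thessaly, Lamia, Greece 3. Research Institute for Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, China |
First author affiliation | Department of Computer Science and Engineering |
Corresponding author affiliation | Department of Computer Science and Engineering |
First affiliation of the first author | Department of Computer Science and Engineering |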
Recommended citation (GB/T 7714) |
Shen, Jingran, Tziritas, Nikos, Theodoropoulos, Georgios. Towards a Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms[C]: Springer Science and Business Media Deutschland GmbH, 2024: 34-47.
|
Files in this item | There are no files associated with this item. |
|