Title | CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts |
Authors | |
Corresponding Authors | Li, Yi; Wang, Zhongrui; Shang, Dashan |
Publication Date | 2024-10-01 |
DOI | |
Journal | SCIENCE CHINA-INFORMATION SCIENCES |
ISSN | 1674-733X |
EISSN | 1869-1919 |
Volume | 67 |
Issue | 10 |
Abstract | Artificial intelligence (AI) has experienced substantial advancements recently, notably with the advent of large-scale language models (LLMs) employing mixture-of-experts (MoE) techniques, exhibiting human-like cognitive skills. As a promising hardware solution for edge MoE implementations, the computing-in-memory (CIM) architecture collocates memory and computing within a single device, significantly reducing the data movement and the associated energy consumption. However, due to diverse edge application scenarios and constraints, determining the optimal network structure for MoE, such as the experts' location, quantity, and dimension, on CIM systems remains elusive. To this end, we introduce a software-hardware co-designed neural architecture search (NAS) framework, CIM-based MoE NAS (CMN), focusing on identifying a high-performing MoE structure under specific hardware constraints. The results of NYUD-v2 dataset segmentation on the RRAM (SRAM) CIM system reveal that CMN can discover optimized MoE configurations under energy, latency, and performance constraints, achieving 29.67x (43.10x) energy savings, 175.44x (109.89x) speedup, and 12.24x smaller model size compared to the baseline MoE-enabled Visual Transformer, respectively. This co-design opens up an avenue toward high-performance MoE deployments in edge CIM systems. |
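As an illustration of the constrained search described in the abstract, the sketch below shows a minimal, hypothetical NAS loop that samples MoE configurations (expert count, hidden dimension, and host layer) and keeps only candidates that satisfy energy and latency budgets. All names, cost models, and the random-search strategy are assumptions made for illustration; they are not the CMN framework or its actual hardware models.

```python
# Minimal, hypothetical sketch of a hardware-constrained NAS loop over
# mixture-of-experts (MoE) configurations. The cost models and the random
# search below are illustrative assumptions, not the CMN method itself.
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class MoEConfig:
    num_experts: int   # number of experts in the MoE layer
    expert_dim: int    # hidden dimension of each expert
    layer_index: int   # which transformer block hosts the MoE layer


def estimate_energy(cfg: MoEConfig) -> float:
    # Placeholder analytical model (assumption): energy scales with the
    # number of experts and their width when mapped onto CIM arrays.
    return cfg.num_experts * cfg.expert_dim * 1e-3


def estimate_latency(cfg: MoEConfig) -> float:
    # Placeholder model (assumption): wider experts and deeper placement
    # lengthen the critical path on the CIM macro.
    return cfg.expert_dim * 5e-4 + cfg.layer_index * 1e-2


def proxy_accuracy(cfg: MoEConfig) -> float:
    # Placeholder performance proxy (assumption); a real framework would
    # train or evaluate each candidate network on the target dataset.
    return 1.0 - 1.0 / (cfg.num_experts * cfg.expert_dim)


def search(energy_budget: float, latency_budget: float, trials: int = 200) -> Optional[MoEConfig]:
    best, best_acc = None, -1.0
    for _ in range(trials):
        cfg = MoEConfig(
            num_experts=random.choice([2, 4, 8, 16]),
            expert_dim=random.choice([64, 128, 256, 512]),
            layer_index=random.randint(0, 11),
        )
        # Reject candidates that violate the hardware constraints.
        if estimate_energy(cfg) > energy_budget or estimate_latency(cfg) > latency_budget:
            continue
        acc = proxy_accuracy(cfg)
        if acc > best_acc:
            best, best_acc = cfg, acc
    return best


if __name__ == "__main__":
    print(search(energy_budget=2.0, latency_budget=0.2))
```

In practice, an evolutionary or gradient-based search strategy and measured (or calibrated) hardware cost models would replace these placeholders.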
Keywords | |
Related Links | [Source Record] |
Indexed By | |
Language | English |
University Authorship | Corresponding |
Funding Projects | National Key R&D Program of China [2020AAA0109005]; Beijing Natural Science Foundation [Z210006]; National Natural Science Foundation of China [62122004, 62374181]; Strategic Priority Research Program of the Chinese Academy of Sciences [XDB44000000]; Hong Kong Research Grant Council [27206321, 17205922, 17212923]; Hong Kong Innovation and Technology Fund [ITP/048/22AP] |
WOS Research Areas | Computer Science; Engineering |
WOS Categories | Computer Science, Information Systems; Engineering, Electrical & Electronic |
WOS Accession Number | WOS:001321914100001 |
Publisher | |
Source Database | Web of Science |
Citation Statistics | |
Output Type | Journal Article |
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/834296 |
Collection | School of Engineering_School of Microelectronics |
Author Affiliations | 1. Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong 999077, Peoples R China; 2. ACCESS AI Chip Ctr Emerging Smart Syst, InnoHK Ctr, Hong Kong Sci Pk, Hong Kong 999077, Peoples R China; 3. Chinese Acad Sci, Key Lab Fabricat Technol Integrated Circuits, Beijing 100049, Peoples R China; 4. Chinese Acad Sci, Lab Microelect Devices & Integrated Technol, Inst Microelect, Beijing 100029, Peoples R China; 5. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 6. Southern Univ Sci & Technol, Sch Microelect, Shenzhen 518055, Peoples R China |
Corresponding Author Affiliation | School of Microelectronics |
Recommended Citation (GB/T 7714) | Han, Shihao, Liu, Sishuo, Du, Shucheng, et al. CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts[J]. SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67(10). |
APA | Han, Shihao., Liu, Sishuo., Du, Shucheng., Li, Mingzi., Ye, Zijian., ... & Shang, Dashan. (2024). CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts. SCIENCE CHINA-INFORMATION SCIENCES, 67(10). |
MLA | Han, Shihao, et al. "CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts". SCIENCE CHINA-INFORMATION SCIENCES 67.10 (2024). |
Files in This Item | There are no files associated with this item. |