Title

CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts

Authors
Corresponding authors: Li, Yi; Wang, Zhongrui; Shang, Dashan
Publication date
2024-10-01
DOI
Journal
SCIENCE CHINA-INFORMATION SCIENCES
ISSN
1674-733X
EISSN
1869-1919
Volume: 67, Issue: 10
Abstract
Artificial intelligence (AI) has experienced substantial advancements recently, notably with the advent of large-scale language models (LLMs) employing mixture-of-experts (MoE) techniques, exhibiting human-like cognitive skills. As a promising hardware solution for edge MoE implementations, the computing-in-memory (CIM) architecture collocates memory and computing within a single device, significantly reducing data movement and the associated energy consumption. However, due to diverse edge application scenarios and constraints, determining the optimal network structures for MoE on CIM systems, such as the experts' location, quantity, and dimension, remains elusive. To this end, we introduce a software-hardware co-designed neural architecture search (NAS) framework, CIM-based MoE NAS (CMN), focusing on identifying a high-performing MoE structure under specific hardware constraints. The results of NYUD-v2 dataset segmentation on the RRAM (SRAM) CIM system reveal that CMN can discover optimized MoE configurations under energy, latency, and performance constraints, achieving 29.67x (43.10x) energy savings, 175.44x (109.89x) speedup, and 12.24x smaller model size compared to the baseline MoE-enabled Visual Transformer, respectively. This co-design opens up an avenue toward high-performance MoE deployments in edge CIM systems.
Keywords
Indexed by
Language
English
University attribution
Corresponding
Funding
National Key R&D Program of China[2020AAA0109005] ; Beijing Natural Science Foundation[Z210006] ; National Natural Science Foundation of China["62122004","62374181"] ; Strategic Priority Research Program of the Chinese Academy of Sciences[XDB44000000] ; Hong Kong Research Grant Council["27206321","17205922","17212923"] ; Hong Kong Innovation and Technology Fund[ITP/048/22AP]
WOS research areas
Computer Science ; Engineering
WOS categories
Computer Science, Information Systems ; Engineering, Electrical & Electronic
WOS accession number
WOS:001321914100001
Publisher
Source database
Web of Science
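Document type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/834296
Collection: College of Engineering_School of Microelectronics (深港微电子学院)
Author affiliations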
1.Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong 999077, Peoples R China
2.ACCESS AI Chip Ctr Emerging Smart Syst, InnoHK Ctr, Hong Kong Sci Pk, Hong Kong 999077, Peoples R China
3.Chinese Acad Sci, Key Lab Fabricat Technol Integrated Circuits, Beijing 100049, Peoples R China
4.Chinese Acad Sci, Lab Microelect Devices & Integrated Technol, Inst Microelect, Beijing 100029, Peoples R China
5.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
6.Southern Univ Sci & Technol, Sch Microelect, Shenzhen 518055, Peoples R China
Corresponding author affiliation: School of Microelectronics (深港微电子学院)
Recommended citation
GB/T 7714
Han, Shihao,Liu, Sishuo,Du, Shucheng,et al. CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts[J]. SCIENCE CHINA-INFORMATION SCIENCES,2024,67(10).
APA
Han, Shihao.,Liu, Sishuo.,Du, Shucheng.,Li, Mingzi.,Ye, Zijian.,...&Shang, Dashan.(2024).CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts.SCIENCE CHINA-INFORMATION SCIENCES,67(10).
MLA
Han, Shihao,et al."CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts".SCIENCE CHINA-INFORMATION SCIENCES 67.10(2024).
Files in this item
No files are associated with this item.