Title | When MOE Meets LLMs: Parameter Efficient Fine-tuning for Multi-task Medical Applications |
Authors | |
Corresponding Authors | Wu, Xian; Zhao, Xiangyu; Tian, Feng |
DOI | |
Publication Date | 2024-07-10 |
Conference Name | 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 |
ISBN | 9798400704314 |
Proceedings Title | |
Pages | 1104-1114 |
Conference Dates | July 14, 2024 - July 18, 2024 |
Conference Location | Washington, DC, United States |
Proceedings Editor / Conference Organizer | ACM SIGIR |
Place of Publication | 1601 Broadway, 10th Floor, New York, NY, United States |
Publisher | Association for Computing Machinery, Inc |
Abstract | The recent surge in Large Language Models (LLMs) has garnered significant attention across numerous fields. Fine-tuning is often required to adapt general LLMs to a specific domain, such as web-based healthcare systems. However, two problems arise when fine-tuning LLMs for medical applications. One is the task variety problem: real-world medical scenarios involve many distinct tasks, and this variety often leads to sub-optimal fine-tuning due to data imbalance and the seesaw problem. In addition, the large number of parameters in LLMs makes fine-tuning extremely time- and compute-intensive. To address these two problems, we propose a novel parameter-efficient fine-tuning framework for multi-task medical applications, dubbed MOELoRA. The framework aims to combine the benefits of mixture-of-experts (MOE) for multi-task learning with those of low-rank adaptation (LoRA) for parameter-efficient fine-tuning. To unify MOE and LoRA, we devise multiple experts as the trainable parameters, where each expert consists of a pair of low-rank matrices so as to keep the number of trainable parameters small. We then propose a task-motivated gate function shared across all MOELoRA layers, which controls the contribution of each expert and produces distinct parameters for different tasks. We conduct experiments on a multi-task medical dataset, showing that MOELoRA outperforms existing parameter-efficient fine-tuning methods. The code is available online. © 2024 ACM. |
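The abstract outlines the core mechanism: each adapted layer keeps its pretrained weight frozen and adds several LoRA experts (each a pair of low-rank matrices), mixed by a task-motivated gate. Below is a minimal PyTorch sketch of that idea; the class name MOELoRALinear, the hyperparameters (num_experts, rank, alpha), and the per-layer gate are illustrative assumptions, not the authors' released implementation (per the abstract, the paper shares one gate function across all MOELoRA layers).

```python
import torch
import torch.nn as nn


class MOELoRALinear(nn.Module):
    """A frozen linear layer augmented with a mixture of LoRA experts.

    Each expert i holds a pair of low-rank matrices (A_i, B_i); a
    task-motivated gate turns a task id into softmax weights over the
    experts, so each task receives a distinct effective parameter update
    while the base weight stays frozen.
    """

    def __init__(self, base: nn.Linear, num_tasks: int,
                 num_experts: int = 4, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # only experts and gate are trained
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        self.scaling = alpha / rank
        # One pair of low-rank matrices per expert; B starts at zero so the
        # layer initially behaves exactly like the frozen base layer.
        self.lora_A = nn.ParameterList(
            [nn.Parameter(torch.randn(in_f, rank) * 0.01)
             for _ in range(num_experts)])
        self.lora_B = nn.ParameterList(
            [nn.Parameter(torch.zeros(rank, out_f))
             for _ in range(num_experts)])
        # Task-motivated gate: task id -> mixing weights over experts.
        self.task_gate = nn.Embedding(num_tasks, num_experts)

    def forward(self, x: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        out = self.base(x)  # frozen pretrained path
        gate = torch.softmax(self.task_gate(task_id), dim=-1)  # (batch, E)
        for i, (A, B) in enumerate(zip(self.lora_A, self.lora_B)):
            delta = (x @ A) @ B * self.scaling  # low-rank update of expert i
            # Broadcast the per-sample gate weight over the remaining dims.
            w = gate[:, i].view(-1, *([1] * (x.dim() - 1)))
            out = out + w * delta
        return out


# Usage: one task id per sample selects that task's expert mixture.
layer = MOELoRALinear(nn.Linear(768, 768), num_tasks=8)
x = torch.randn(2, 16, 768)                 # (batch, seq_len, hidden)
y = layer(x, task_id=torch.tensor([0, 3]))
print(y.shape)                              # torch.Size([2, 16, 768])
```

Because only the expert matrices and the gate embedding carry gradients, the trainable parameter count stays a small fraction of the base model's, which is the parameter-efficiency claim the abstract makes.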
Keywords | |
Institutional Attribution | Other |
Language | English |
Related Links | [Source Record] |
Indexing Category | |
Funding | This research was supported by Tencent (CCF-Tencent Open Fund), Research Impact Fund (No. R1015-23), APRC - CityU New Research Initiatives (No. 9610565, Start-up Grant for New Faculty of CityU), CityU - HKIDS Early Career Research Grant (No. 9360163), Hong Kong ITC Innovation and Technology Fund Midstream Research Programme for Universities Project (No. ITS/034/22MS), Hong Kong Environmental and Conservation Fund (No. 88/2022), and SIRG - CityU Strategic Interdisciplinary Research Grant (No. 7020046, No. 7020074). |
WOS Research Area | Computer Science |
WOS Categories | Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Theory & Methods |
WOS Accession Number | WOS:001273410001018 |
EI Accession Number | 20243216839996 |
EI Controlled Terms | Computational linguistics; Learning systems; Medical problems |
EI Classification Code | 721.1 Computer Theory (Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory) |
Source Database | EV Compendex |
Citation Statistics | Times Cited [WOS]: 7 |
Document Type | Conference Paper |
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/807090 |
Collection | Southern University of Science and Technology |
Author Affiliations | 1. Xi'an Jiaotong University, City University of Hong Kong, Xi'an, China; 2. Tencent YouTu Lab, Jarvis Research Center, Shenzhen, China; 3. City University of Hong Kong, Hong Kong, Hong Kong; 4. Southern University of Science and Technology, City University of Hong Kong, Shenzhen, China; 5. University of Science and Technology of China, City University of Hong Kong, Hefei, China; 6. Xi'an Jiaotong University, Xi'an, China |
Recommended Citation (GB/T 7714) | Liu, Qidong, Wu, Xian, Zhao, Xiangyu, et al. When MOE Meets LLMs: Parameter Efficient Fine-tuning for Multi-task Medical Applications[C]//ACM SIGIR. New York, NY, United States: Association for Computing Machinery, Inc, 2024: 1104-1114. |
Files in This Item | No files associated with this item. |