Title | When MOE Meets LLMs: Parameter Efficient Fine-tuning for Multi-task Medical Applications |
Authors | |
Corresponding Authors | Wu, Xian; Zhao, Xiangyu; Tian, Feng |
DOI | |
Publication Date | 2024-07-10 |
Conference Name | 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 |
ISBN | 9798400704314 |
Proceedings Title | |
Pages | 1104-1114 |
Conference Dates | July 14, 2024 - July 18, 2024 |
Conference Location | Washington, DC, United States |
Proceedings Editor / Conference Organizer | ACM SIGIR |
Place of Publication | 1601 Broadway, 10th Floor, New York, NY, United States |
Publisher | Association for Computing Machinery, Inc |
Abstract | The recent surge in Large Language Models (LLMs) has garnered significant attention across numerous fields. Fine-tuning is often required to adapt general LLMs to a specific domain, such as web-based healthcare systems. However, two problems arise when fine-tuning LLMs for medical applications. One is the task variety problem: real-world medical scenarios involve many distinct tasks, and this variety often leads to sub-optimal fine-tuning due to data imbalance and the seesaw problem. In addition, the large number of parameters in LLMs makes fine-tuning extremely time- and compute-intensive. To address these two problems, we propose a novel parameter-efficient fine-tuning framework for multi-task medical applications, dubbed MOELoRA. The framework aims to combine the benefits of mixture-of-experts (MOE) for multi-task learning with those of low-rank adaptation (LoRA) for parameter-efficient fine-tuning. To unify MOE and LoRA, we devise multiple experts as the trainable parameters, where each expert consists of a pair of low-rank matrices so as to keep the number of trainable parameters small. We then propose a task-motivated gate function shared across all MOELoRA layers, which controls the contribution of each expert and produces distinct parameters for different tasks. We conduct experiments on a multi-task medical dataset, showing that MOELoRA outperforms existing parameter-efficient fine-tuning methods. The code is available online. © 2024 ACM. |
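The abstract outlines the core mechanism: each adapted layer keeps its pretrained weight frozen and adds several LoRA experts (each a pair of low-rank matrices), mixed by a task-motivated gate. Below is a minimal PyTorch sketch of that idea; the class name MOELoRALinear, the hyperparameters (num_experts, rank, alpha), and the per-layer gate are illustrative assumptions, not the authors' released implementation (per the abstract, the paper shares one gate function across all MOELoRA layers).

```python
import torch
import torch.nn as nn


class MOELoRALinear(nn.Module):
    """A frozen linear layer augmented with a mixture of LoRA experts.

    Each expert i holds a pair of low-rank matrices (A_i, B_i); a
    task-motivated gate turns a task id into softmax weights over the
    experts, so each task receives a distinct effective parameter update
    while the base weight stays frozen.
    """

    def __init__(self, base: nn.Linear, num_tasks: int,
                 num_experts: int = 4, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # only experts and gate are trained
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        self.scaling = alpha / rank
        # One pair of low-rank matrices per expert; B starts at zero so the
        # layer initially behaves exactly like the frozen base layer.
        self.lora_A = nn.ParameterList(
            [nn.Parameter(torch.randn(in_f, rank) * 0.01)
             for _ in range(num_experts)])
        self.lora_B = nn.ParameterList(
            [nn.Parameter(torch.zeros(rank, out_f))
             for _ in range(num_experts)])
        # Task-motivated gate: task id -> mixing weights over experts.
        self.task_gate = nn.Embedding(num_tasks, num_experts)

    def forward(self, x: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        out = self.base(x)  # frozen pretrained path
        gate = torch.softmax(self.task_gate(task_id), dim=-1)  # (batch, E)
        for i, (A, B) in enumerate(zip(self.lora_A, self.lora_B)):
            delta = (x @ A) @ B * self.scaling  # low-rank update of expert i
            # Broadcast the per-sample gate weight over the remaining dims.
            w = gate[:, i].view(-1, *([1] * (x.dim() - 1)))
            out = out + w * delta
        return out


# Usage: one task id per sample selects that task's expert mixture.
layer = MOELoRALinear(nn.Linear(768, 768), num_tasks=8)
x = torch.randn(2, 16, 768)                 # (batch, seq_len, hidden)
y = layer(x, task_id=torch.tensor([0, 3]))
print(y.shape)                              # torch.Size([2, 16, 768])
```

Because only the expert matrices and the gate embedding carry gradients, the trainable parameter count stays a small fraction of the base model's, which is the parameter-efficiency claim the abstract makes.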
Keywords | |
Institutional Attribution | Other |
Language | English |
Related Links | [Source Record] |
Indexing Category | |
Funding | This research was supported by Tencent (CCF-Tencent Open Fund), Research Impact Fund (No. R1015-23), APRC - CityU New Research Initiatives (No. 9610565, Start-up Grant for New Faculty of CityU), CityU - HKIDS Early Career Research Grant (No. 9360163), Hong Kong ITC Innovation and Technology Fund Midstream Research Programme for Universities Project (No. ITS/034/22MS), Hong Kong Environmental and Conservation Fund (No. 88/2022), and SIRG - CityU Strategic Interdisciplinary Research Grant (No. 7020046, No. 7020074). |
WOS Research Area | Computer Science |
WOS Categories | Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Theory & Methods |
WOS Accession Number | WOS:001273410001018 |
EI Accession Number | 20243216839996 |
EI Controlled Terms | Computational linguistics; Learning systems; Medical problems |
EI Classification Code | 721.1 Computer Theory (Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory) |
Source Database | EV Compendex |
Citation Statistics | Times Cited [WOS]: 7 |
Document Type | Conference Paper |
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/807090 |
Collection | Southern University of Science and Technology |
Author Affiliations | 1. Xi'an Jiaotong University, City University of Hong Kong, Xi'an, China; 2. Tencent YouTu Lab, Jarvis Research Center, Shenzhen, China; 3. City University of Hong Kong, Hong Kong, Hong Kong; 4. Southern University of Science and Technology, City University of Hong Kong, Shenzhen, China; 5. University of Science and Technology of China, City University of Hong Kong, Hefei, China; 6. Xi'an Jiaotong University, Xi'an, China |
Recommended Citation (GB/T 7714) | Liu, Qidong, Wu, Xian, Zhao, Xiangyu, et al. When MOE Meets LLMs: Parameter Efficient Fine-tuning for Multi-task Medical Applications[C]//ACM SIGIR. New York, NY, United States: Association for Computing Machinery, Inc, 2024: 1104-1114. |
Files in This Item | No files associated with this item. |