Title

Research on Personalized Federated Learning Algorithms for Non-I.I.D. Data

Alternative Title
PERSONALIZED FEDERATED LEARNING ALGORITHM ON NON-I.I.D. DATA
Name
杨锐鸿
Name (Pinyin)
YANG Ruihong
Student ID
11930391
Degree Type
Master
Degree Discipline
080900 Electronic Science and Technology
Discipline Category / Professional Degree Category
08 Engineering
Supervisor
张宇
Supervisor's Department
Department of Computer Science and Engineering
Thesis Defense Date
2022-05-08
Thesis Submission Date
2022-06-12
Degree-Granting Institution
Southern University of Science and Technology
Degree-Granting Location
Shenzhen
Abstract

Federated learning is a privacy-preserving machine learning paradigm that allows multiple clients to jointly train a global model under the coordination of a server without leaking their data. In real-world scenarios, however, the data on different clients usually do not satisfy the independent and identically distributed (i.i.d.) assumption widely adopted in machine learning, so the performance of the global model in traditional federated learning may degrade. To handle this situation, a different model can be trained for each client to capture that client's personalization. This thesis proposes a novel personalized federated learning framework, called the Personalized Federated Mutual Learning (PFML) algorithm, which trains personalized models according to the non-i.i.d. characteristics of each client's data. Specifically, it integrates the idea of mutual learning into the training process of each client's local model, which not only improves the performance of both the global and personalized models but also accelerates convergence. Moreover, the PFML algorithm supports heterogeneity among client models and protects the information of the personalized models. Experimental results on four datasets clearly show that the proposed algorithm achieves significantly better performance than classical baselines.
For non-i.i.d. multimodal data in federated learning, this thesis further proposes a multimodal sentiment analysis algorithm based on personalized federated learning, which assumes that the modality encoding layers of each client's model are globally shared while the other parts of the model can be customized by each client. In addition, a multi-task learning method is proposed to improve the expressiveness of the encoding layers. Finally, experiments demonstrate the effectiveness of the proposed algorithm.

Other Abstract

Federated Learning (FL) is a privacy-preserving machine learning paradigm that allows many clients to jointly train a model under the coordination of a server without leaking local data. In real-world scenarios, data on different clients usually cannot satisfy the independent and identically distributed (i.i.d.) assumption adopted widely in machine learning, and training a single global model in the traditional way may cause performance degradation in such non-i.i.d. cases. To handle these cases, different models can be trained for each client to capture its personalization. This thesis proposes a new personalized FL framework called Personalized Federated Mutual Learning (PFML), which uses the non-i.i.d. characteristics to generate specific models for clients. Specifically, PFML integrates mutual learning into the training process of the local model in each client. It not only improves the performance of both the global and personalized models but also speeds up convergence compared with state-of-the-art methods. Moreover, PFML helps maintain the heterogeneity of client models and protects the information of personalized models. Experimental results on four datasets clearly demonstrate that the proposed framework achieves significantly better performance than state-of-the-art baselines.
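The abstract does not spell out how mutual learning enters the local update, so the following is a minimal PyTorch sketch of one plausible reading, in the spirit of deep mutual learning: each client jointly trains its private personalized model and its copy of the global model, coupling them with KL-divergence terms so that each mimics the other's predictions. All identifiers here (local_update, alpha, the SGD optimizer choice) are illustrative assumptions, not details taken from the thesis.

```python
# Hypothetical PFML-style local update (illustrative only, not the thesis code).
import torch
import torch.nn.functional as F

def local_update(global_model, personal_model, loader, epochs=1, lr=0.01, alpha=0.5):
    """Jointly train both models: cross-entropy on the labels plus a
    KL-divergence term that makes each model mimic the other's predictions."""
    opt_g = torch.optim.SGD(global_model.parameters(), lr=lr)
    opt_p = torch.optim.SGD(personal_model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            logits_g = global_model(x)
            logits_p = personal_model(x)
            # Mutual-learning terms; the targets are detached so each model
            # only chases the other's current predictions.
            kl_g = F.kl_div(F.log_softmax(logits_g, dim=1),
                            F.softmax(logits_p, dim=1).detach(),
                            reduction="batchmean")
            kl_p = F.kl_div(F.log_softmax(logits_p, dim=1),
                            F.softmax(logits_g, dim=1).detach(),
                            reduction="batchmean")
            loss = (F.cross_entropy(logits_g, y) + alpha * kl_g
                    + F.cross_entropy(logits_p, y) + alpha * kl_p)
            opt_g.zero_grad()
            opt_p.zero_grad()
            loss.backward()
            opt_g.step()
            opt_p.step()
    # Only the updated global model is returned for server-side aggregation;
    # the personalized model never leaves the client.
    return global_model.state_dict()
```

Under this reading, the server averages only the returned global-model weights while each personalized model stays on its client, which would account for the claim that the information of personalized models is protected.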
To handle extreme non-i.i.d. multimodal data in federated learning, this thesis further proposes an algorithm called Personalized Federated Multimodal Sentiment Analysis (PFMSA), which assumes that the modality encoders of each client's model are globally shared while the other parts of the model can be customized by the client. In addition, multi-task learning is used to improve the expressiveness of the modality encoders. Finally, experiments on benchmark datasets show the effectiveness of the proposed PFMSA algorithm.
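As an illustration of the partial-sharing scheme the abstract describes, here is a minimal sketch of server-side aggregation restricted to the modality encoders. The parameter-name prefixes (text_encoder., audio_encoder., vision_encoder.) and the FedAvg-style weighted average are assumptions made for the sketch, not details from the thesis; fusion and prediction layers are left untouched so each client can personalize them freely.

```python
# Hypothetical PFMSA-style partial aggregation (illustrative only).

# Assumed prefixes marking the globally shared modality encoders.
ENCODER_PREFIXES = ("text_encoder.", "audio_encoder.", "vision_encoder.")

def aggregate_encoders(client_states, client_weights):
    """FedAvg restricted to the shared modality-encoder parameters:
    a weighted average over the clients' state dicts."""
    total = sum(client_weights)
    return {
        name: sum(w * s[name] for w, s in zip(client_weights, client_states)) / total
        for name in client_states[0]
        if name.startswith(ENCODER_PREFIXES)
    }

def load_shared(model, shared_state):
    """Overwrite only the shared encoder weights on a client model;
    strict=False leaves the personalized layers untouched."""
    model.load_state_dict(shared_state, strict=False)
```

Loading with strict=False is what lets client models differ everywhere outside the encoders, consistent with the abstract's claim that the rest of the model can be customized per client.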

Keywords
Other Keywords
Language
Chinese
Training Category
Independent Training
Year of Enrollment
2019
Year Degree Granted
2022-06

Degree Assessment Subcommittee
Department of Computer Science and Engineering
Chinese Library Classification (CLC) Number
TP181
Source Repository
Manual Submission
Document Type
Thesis
Identifier
http://sustech.caswiz.com/handle/2SGJ60CL/335681
Collection
College of Engineering_Department of Computer Science and Engineering
Recommended Citation (GB/T 7714)
杨锐鸿. 面向非独立同分布数据的个性化联邦学习算法研究[D]. 深圳: 南方科技大学, 2022.
Files in This Item
11930391-杨锐鸿-计算机科学与工(2033KB), Restricted Access
