SUSTech Knowledge Commons (SUSTech KC): Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation

Title	Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation
Authors	Zhiyuan Cai 1,2; Li Lin 1,2; Huaqing He 1,2; Pujin Cheng 1,2; Xiaoying Tang 1,2
Publication date	2024
DOI	10.1109/TMI.2024.3422102
Journal	IEEE Transactions on Medical Imaging
ISSN	1558-254X
Volume	PP Issue: 99
Abstract	A large-scale labeled dataset is a key factor for the success of supervised deep learning in most ophthalmic image analysis scenarios. However, limited annotated data is very common in ophthalmic image analysis, since manual annotation is time-consuming and labor-intensive. Self-supervised learning (SSL) methods bring huge opportunities for better utilizing unlabeled data, as they do not require massive annotations. To utilize as many unlabeled ophthalmic images as possible, it is necessary to break the dimension barrier, simultaneously making use of both 2D and 3D images as well as alleviating the issue of catastrophic forgetting. In this paper, we propose a universal self-supervised Transformer framework named Uni4Eye++ to discover the intrinsic image characteristics and capture domain-specific feature embeddings in ophthalmic images. Uni4Eye++ can serve as a global feature extractor, which builds its basis on a Masked Image Modeling task with a Vision Transformer architecture. On the basis of our previous work Uni4Eye, we further employ an image entropy guided masking strategy to reconstruct more informative patches and a dynamic head generator module to alleviate modality confusion. We evaluate the performance of our pre-trained Uni4Eye++ encoder by fine-tuning it on multiple downstream ophthalmic image classification and segmentation tasks. The superiority of Uni4Eye++ is successfully established through comparisons to other state-of-the-art SSL pre-training methods. Our code is available on GitHub.
Indexed by	EI
University attribution	First
ESI subject category	CLINICAL MEDICINE
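Output type	Journal article
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/783767
Collection	College of Engineering_Department of Electronic and Electrical Engineering; Southern University of Science and Technology
Author affiliations	1. Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; 2. Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
First author affiliation	Department of Electronic and Electrical Engineering; Southern University of Science and Technology
First author's primary affiliation	Department of Electronic and Electrical Engineering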
Recommended citation GB/T 7714	Zhiyuan Cai, Li Lin, Huaqing He, et al. Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation[J]. IEEE Transactions on Medical Imaging, 2024, PP(99).
APA	Zhiyuan Cai, Li Lin, Huaqing He, Pujin Cheng, & Xiaoying Tang. (2024). Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation. IEEE Transactions on Medical Imaging, PP(99).
MLA	Zhiyuan Cai, et al. "Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation." IEEE Transactions on Medical Imaging PP.99 (2024).