Title | TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning |
Authors | |
Corresponding author | Jianguo Zhang |
Co-first authors | Haoquan Li; Laoming Zhang |
DOI | |
Publication date | 2022 |
Conference name | 17th European Conference on Computer Vision (ECCV) |
ISSN | 0302-9743 |
EISSN | 1611-3349 |
ISBN | 978-3-031-20043-4 |
Proceedings title | |
Volume | 13680 |
Conference date | October 23-27, 2022 |
Conference location | Tel Aviv, Israel |
Place of publication | GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND |
Publisher | |
Abstract | This paper presents a transformer framework for few-shot learning, termed TransVLAD, with one focus showing the power of locally aggregated descriptors for few-shot learning. Our TransVLAD model is simple: a standard transformer encoder following a NeXtVLAD aggregation module to output the locally aggregated descriptors. In contrast to the prevailing use of CNN as part of the feature extractor, we are the first to prove self-supervised learning like masked autoencoders (MAE) can deal with the overfitting of transformers in few-shot image classification. Besides, few-shot learning can benefit from this general-purpose pre-training. Then, we propose two methods to mitigate few-shot biases, supervision bias and simple-characteristic bias. The first method is introducing masking operation into fine-tuning, by which we accelerate fine-tuning (by more than 3x) and improve accuracy. The second one is adapting focal loss into soft focal loss to focus on hard characteristics learning. Our TransVLAD finally tops 10 benchmarks on five popular few-shot datasets by an average of more than 2%. |
Keywords | |
University affiliation role | First; Co-first; Corresponding |
Language | English |
Related links | [Source record] |
Indexed in | |
Funding | National Key Research and Development Program of China [2021YFF1200800]; Stable Support Plan Program of Shenzhen Natural Science Fund [20200925154942002] |
WOS research areas | Computer Science; Imaging Science & Photographic Technology |
WOS categories | Computer Science, Artificial Intelligence; Imaging Science & Photographic Technology |
WOS accession number | WOS:000904098900030 |
Source database | Web of Science |
Citation statistics | Times cited [WOS]: 5 |
Document type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/412318 |
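Collections | College of Engineering_Sifakis Research Institute of Trustworthy Autonomous Systems; College of Engineering_Department of Computer Science and Engineering; College of Science_Department of Statistics and Data Science |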
Author affiliations | 1. Research Institute of Trustworthy Autonomous Systems, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China; 2. Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China; 3. Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen, China; 4. Peng Cheng Lab, Shenzhen, China |
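First author affiliation | Sifakis Research Institute of Trustworthy Autonomous Systems; Department of Computer Science and Engineering |
Corresponding author affiliation | Sifakis Research Institute of Trustworthy Autonomous Systems; Department of Computer Science and Engineering |
Primary affiliation of the first author | Sifakis Research Institute of Trustworthy Autonomous Systems; Department of Computer Science and Engineering |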
Recommended citation (GB/T 7714) | Haoquan Li, Laoming Zhang, Daoan Zhang, et al. TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning[C]. GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND: SPRINGER INTERNATIONAL PUBLISHING AG, 2022. |
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | Action | |
136800509.pdf (796KB) | -- | -- | Open access | -- | View |
|
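The abstract mentions adapting focal loss into a "soft focal loss" so that training focuses on hard characteristics. This record does not define the soft variant, so the sketch below shows only the standard multi-class focal loss, -(1 - p_t)^gamma * log(p_t), as background. The function names and the default `gamma=2.0` are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Standard focal loss for one example (NOT the paper's soft variant).

    p_t is the predicted probability of the true class. The factor
    (1 - p_t) ** gamma shrinks the loss of confidently correct (easy)
    examples, so hard examples dominate the total.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(logits, target):
    """Cross-entropy is the gamma = 0 special case of focal loss."""
    return focal_loss(logits, target, gamma=0.0)


# Hypothetical 3-way logits: one easy and one hard example for class 0.
easy = [4.0, 0.0, 0.0]
hard = [0.0, 4.0, 0.0]
for name, logits in (("easy", easy), ("hard", hard)):
    print(name, cross_entropy(logits, 0), focal_loss(logits, 0))
```

Relative to cross-entropy, the easy example is down-weighted far more strongly than the hard one, which is the behaviour the abstract appeals to when it speaks of focusing on hard characteristics.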
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.