Title | Enabling Deep Residual Networks for Weakly Supervised Object Detection |
Authors | |
Corresponding author | Ji, Rongrong |
DOI | |
Publication date | 2020
|
ISSN | 0302-9743
|
EISSN | 1611-3349
|
Proceedings title | |
Volume | 12353 LNCS
|
Pages | 118-136
|
Abstract | Weakly supervised object detection (WSOD) has attracted extensive research attention because it can exploit large-scale image-level annotation for detector training. Whilst deep residual networks such as ResNet and DenseNet have become the standard backbones for many computer vision tasks, cutting-edge WSOD methods still rely on plain networks, e.g., VGG, as backbones. Employing deep residual networks for WSOD is indeed not trivial: doing so can even lead to significant deterioration of detection accuracy and to non-convergence. In this paper, we identify the root cause through detailed analysis and propose a sequence of design principles to take full advantage of deep residual learning for WSOD from the perspectives of adding redundancy, improving robustness and aligning features. First, a redundant adaptation neck is key for effective object instance localization and discriminative feature learning. Second, small-kernel convolutions and MaxPool down-sampling help improve the robustness of information flow, which gives finer object boundaries and makes the detector more sensitive to small objects. Third, dilated convolution is essential to align the proposal features and exploit diverse local information by extracting high-resolution feature maps. Extensive experiments show that the proposed principles enable deep residual networks to establish new state-of-the-art results on PASCAL VOC and MS COCO. |
University attribution | Other
|
Language | English
|
Related links | [Scopus record] |
Indexed in | |
EI accession number | 20205009617791
|
EI controlled terms | Arts computing
; Object recognition
; Deep learning
; Deterioration
; Computer vision
; Convolution
|
EI classification codes | Ergonomics and Human Factors Engineering:461.4
; Information Theory and Signal Processing:716.1
; Data Processing and Image Processing:723.2
; Computer Applications:723.5
; Vision:741.2
; Materials Science:951
|
Scopus record ID | 2-s2.0-85097395636
|
Source database | Scopus
|
Citation statistics |
Times cited [WOS]: 0
|
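Publication type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/209826 |
Collection | Southern University of Science and Technology, College of Engineering, Department of Computer Science and Engineering |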
Author affiliations | 1. Media Analytics and Computing Lab, Department of Artificial Intelligence, School of Informatics, Xiamen University, Xiamen, 361005, China 2. Pinterest, San Francisco, United States 3. CSE, Southern University of Science and Technology, Shenzhen, China 4. Tencent Youtu Lab, Tencent Technology (Shanghai) Co., Ltd., Shanghai, China |
Recommended citation (GB/T 7714) |
Shen, Yunhang, Ji, Rongrong, Wang, Yan, et al. Enabling Deep Residual Networks for Weakly Supervised Object Detection[C], 2020: 118-136.
|
Files in this item | No files are associated with this item. |
|