Title

TEA: Temporal Excitation and Aggregation for Action Recognition

Authors
Corresponding authors: Zhang, Jianguo; Kang, Bin; Wang, Limin
DOI
Publication date
2020
Conference name
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2020)
ISSN
1063-6919
ISBN
978-1-7281-7169-2
Proceedings title
Pages
906-915
Conference dates
June 16 - 18, 2020
Conference location
USA
Abstract

Temporal modeling is key for action recognition in videos. It normally considers both short-range motions and long-range aggregations. In this paper, we propose a Temporal Excitation and Aggregation (TEA) block, including a motion excitation (ME) module and a multiple temporal aggregation (MTA) module, specifically designed to capture both short- and long-range temporal evolution. In particular, for short-range motion modeling, the ME module calculates the feature-level temporal differences from spatiotemporal features. It then utilizes the differences to excite the motion-sensitive channels of the features. The long-range temporal aggregations in previous works are typically achieved by stacking a large number of local temporal convolutions. Each convolution processes a local temporal window at a time. In contrast, the MTA module proposes to deform the local convolution to a group of sub-convolutions, forming a hierarchical residual architecture. Without introducing additional parameters, the features will be processed with a series of sub-convolutions, and each frame could complete multiple temporal aggregations with neighborhoods. The final equivalent receptive field of temporal dimension is accordingly enlarged, which is capable of modeling the long-range temporal relationship over distant frames. The two components of the TEA block are complementary in temporal modeling. Finally, our approach achieves impressive results at low FLOPs on several action recognition benchmarks, such as Kinetics, Something-Something, HMDB51, and UCF101, which confirms its effectiveness and efficiency.
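The two ideas in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes features that are already spatially pooled, leaves out all learned layers, and uses fixed all-ones kernel weights. The function names `motion_excitation`, `temporal_conv3`, and `multiple_temporal_aggregation` are hypothetical. The plain sigmoid gate and the residual form `x + x * attn` are also assumptions made for illustration.

```python
import math


def motion_excitation(frames):
    """Excite channels using differences between adjacent frames.

    frames: list of T frames, each a list of C channel values
    (assumed to be spatially pooled already).
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    excited = []
    for t in range(num_frames):
        if t + 1 < num_frames:
            diff = [frames[t + 1][c] - frames[t][c] for c in range(num_channels)]
        else:
            # The last frame has no successor, so its difference is zero.
            diff = [0.0] * num_channels
        # Assumed gating: a sigmoid of the raw difference (no learned layers).
        attn = [1.0 / (1.0 + math.exp(-d)) for d in diff]
        # Residual excitation keeps the original feature and adds a gated copy.
        excited.append([x + x * a for x, a in zip(frames[t], attn)])
    return excited


def temporal_conv3(seq, weights=(1.0, 1.0, 1.0)):
    """Kernel-size-3 temporal convolution with zero padding (illustrative weights)."""
    length = len(seq)
    out = []
    for t in range(length):
        left = seq[t - 1] if t > 0 else 0.0
        right = seq[t + 1] if t + 1 < length else 0.0
        out.append(weights[0] * left + weights[1] * seq[t] + weights[2] * right)
    return out


def multiple_temporal_aggregation(groups):
    """Hierarchical residual sub-convolutions over channel groups.

    groups: list of channel groups, each a 1-D temporal sequence.
    The first group passes through unchanged; the second is convolved;
    each later group is summed with the previous output before its convolution.
    """
    outputs = [list(groups[0])]
    for i in range(1, len(groups)):
        if i == 1:
            inp = list(groups[i])
        else:
            inp = [a + b for a, b in zip(groups[i], outputs[-1])]
        outputs.append(temporal_conv3(inp))
    return outputs


# An impulse placed in the second group spreads further with each later group.
impulse_groups = [[0.0] * 9 for _ in range(4)]
impulse_groups[1][4] = 1.0
aggregated = multiple_temporal_aggregation(impulse_groups)
spans = [sum(1 for v in out if v != 0.0) for out in aggregated]
print(spans)
```

In this sketch the number of frames influenced by a single impulse grows with each later group, which mirrors the abstract's statement that the equivalent temporal receptive field is enlarged without stacking additional layers.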

Keywords
Institutional attribution
Corresponding
Language
English
Related links [Scopus record]
Indexed by
EI accession number
20204409421322
EI main heading
Computer vision
EI classification codes
Information Theory and Signal Processing:716.1 ; Computer Applications:723.5 ; Vision:741.2
Scopus record ID
2-s2.0-85094096959
Source database
Scopus
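Full-text link: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9157646
Citation statistics
Times cited [WOS]: 374
Output type: Conference paper
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/209283
Collection: College of Engineering_Department of Computer Science and Engineering
Author affiliations
1. Platform and Content Group (PCG), Tencent, China
2. State Key Laboratory for Novel Software Technology, Nanjing University, China
3. Department of Computer Science and Engineering, Southern University of Science and Technology, China
Corresponding author affiliation: Department of Computer Science and Engineering
Recommended citation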
GB/T 7714
Li,Yan,Ji,Bin,Shi,Xintian,et al. TEA: Temporal Excitation and Aggregation for Action Recognition[C],2020:906-915.
Files in this item
No files are associated with this item.