Southern University of Science and Technology Knowledge Commons (SUSTech KC): Memory-attended recurrent network for video captioning

Title	Memory-attended recurrent network for video captioning
Authors	Pei, Wenjie 1; Zhang, Jiyuan 1; Wang, Xiangrong 2; Ke, Lei 1; Shen, Xiaoyong 1; Tai, Yu-Wing 1
Corresponding author	Tai, Yu-Wing
DOI	10.1109/CVPR.2019.00854
Publication date	2019
ISSN	1063-6919
ISBN	978-1-7281-3294-5
Proceedings title	32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Volume	2019-June
Pages	8339-8348
Conference date	15-20 June 2019
Conference location	Long Beach, CA, United States
Place of publication	345 E 47TH ST, NEW YORK, NY 10017 USA
Publisher	IEEE Computer Society
Abstract	Typical techniques for video captioning follow the encoder-decoder framework, which can only focus on one source video being processed. A potential disadvantage of such design is that it cannot capture the multiple visual context information of a word appearing in more than one relevant video in training data. To tackle this limitation, we propose the Memory-Attended Recurrent Network (MARN) for video captioning, in which a memory structure is designed to explore the full-spectrum correspondence between a word and its various similar visual contexts across videos in training data. Thus, our model is able to achieve a more comprehensive understanding for each word and yield higher captioning quality. Furthermore, the built memory structure enables our method to model the compatibility between adjacent words explicitly instead of asking the model to learn implicitly, as most existing models do. Extensive validation on two real-world datasets demonstrates that our MARN consistently outperforms state-of-the-art methods. © 2019 IEEE.
Keywords	Vision + Language ; Deep Learning ; Video Analytics
University attribution	Other
Language	English
Related links	[Source record]
Indexed in	EI ; CPCI-S
WOS research area	Computer Science
WOS categories	Computer Science, Artificial Intelligence ; Computer Science, Theory & Methods
WOS accession number	WOS:000542649301097
EI accession number	20200508114468
EI main heading	Recurrent neural networks
EI classification codes	Computer Applications:723.5 ; Vision:741.2
Source database	EV Compendex
Full-text link	https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8953335
Citation statistics	Times cited [WOS]: 150
Output type	Conference paper
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/104891
Collection	Institute of Future Networks, Southern University of Science and Technology
Author affiliations	1. Tencent, China 2. Southern University of Science and Technology, China
Recommended citation (GB/T 7714)	Pei, Wenjie,Zhang, Jiyuan,Wang, Xiangrong,et al. Memory-attended recurrent network for video captioning[C]. 345 E 47TH ST, NEW YORK, NY 10017 USA:IEEE Computer Society,2019:8339-8348.