Southern University of Science and Technology knowledge repository (SUSTech KC): WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations

Title	WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations
Authors	Liu, Peidong 1; He, Zibin 1; Yan, Xiyu 1; Jiang, Yong 1,2; Xia, Shu Tao 1,2; Zheng, Feng 3; Maowei, Hu 1,4
Corresponding author	He, Zibin
DOI	10.1145/3474085.3475217
Publication date	2021-10-17
Proceedings title	MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
Pages	2995-3004
Abstract	Compared with tedious per-pixel mask annotating, it is much easier to annotate data by clicks, which costs only several seconds for an image. However, applying clicks to learn video semantic segmentation model has not been explored before. In this work, we propose an effective weakly-supervised video semantic segmentation pipeline with click annotations, called WeClick, for saving laborious annotating effort by segmenting an instance of the semantic class with only a single click. Since detailed semantic information is not captured by clicks, directly training with click labels leads to poor segmentation predictions. To mitigate this problem, we design a novel memory flow knowledge distillation strategy to exploit temporal information (named memory flow) in abundant unlabeled video frames, by distilling the neighboring predictions to the target frame via estimated motion. Moreover, we adopt vanilla knowledge distillation for model compression. In this case, WeClick learns compact video semantic segmentation models with the low-cost click annotations during the training phase yet achieves real-time and accurate models during the inference period. Experimental results on Cityscapes and Camvid show that WeClick outperforms the state-of-the-art methods, increases performance by 10.24% mIoU than baseline, and achieves real-time execution.
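Keywords	click annotations; knowledge distillation; video semantic segmentation; weakly-supervised learning
Institutional attribution	Other
Language	English
Related links	[Scopus record]
Indexed by	EI
EI accession number	20214711200266
EI controlled terms	Computer vision; Distillation; Semantics
EI classification codes	Artificial Intelligence: 723.4; Computer Applications: 723.5; Vision: 741.2; Chemical Operations: 802.3
Scopus record ID	2-s2.0-85119357398
Source database	Scopus
Citation statistics	Times cited [WOS]: 4
Document type	Conference paper
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/256874
Collection	College of Engineering_Department of Computer Science and Engineering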
Author affiliations	1. Tsinghua Shenzhen International Graduate School, Tsinghua University, China; 2. PCL Research Center of Networks and Communications, Peng Cheng Laboratory, China; 3. Department of Computer Science and Engineering, Southern University of Science and Technology, China; 4. Shenzhen Rejoice Sport Tech. Co., LTD, China
Recommended citation (GB/T 7714)	Liu Peidong, He Zibin, Yan Xiyu, et al. WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations[C], 2021: 2995-3004.