Title | InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring |
Authors | |
Corresponding author | Li, Zhen |
DOI | |
Publication date | 2021 |
ISSN | 1550-5499 |
ISBN | 978-1-6654-2813-2 |
Proceedings title | |
Pages | 1771-1780 |
Conference date | 10-17 Oct. 2021 |
Conference location | Montreal, QC, Canada |
Abstract | Compared with the visual grounding on 2D images, the natural-language-guided 3D object localization on point clouds is more challenging. In this paper, we propose a new model, named InstanceRefer, to achieve a superior 3D visual grounding through the grounding-by-matching strategy. In practice, our model first predicts the target category from the language descriptions using a simple language classification model. Then, based on the category, our model sifts out a small number of instance candidates (usually less than 20) from the panoptic segmentation on point clouds. Thus, the non-trivial 3D visual grounding task has been effectively re-formulated as a simplified instance-matching problem, considering that instance-level candidates are more rational than the redundant 3D object proposals. Subsequently, for each candidate, we perform the multi-level contextual inference, i.e., referring from instance attribute perception, instance-to-instance relation perception, and instance-to-background global localization perception, respectively. Eventually, the most relevant candidate is selected and localized by ranking confidence scores, which are obtained by the cooperative holistic visual-language feature matching. Experiments confirm that our method outperforms previous state-of-the-arts on ScanRefer online benchmark and Nr3D/Sr3D datasets. |
Keywords | |
SUSTech authorship | Other |
Language | English |
Related links | [Scopus record] |
Indexed by | |
Funding | National Key Research and Development Program of China [2018YFB1800800] |
EI accession number | 20221511951300 |
Scopus record ID | 2-s2.0-85120974687 |
Source database | Scopus |
Full-text link | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9711198 |
Citation statistics | Times cited [WOS]: 26 |
Output type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/329678 |
Collection | Southern University of Science and Technology |
Author affiliations | 1. The Chinese University of Hong Kong (Shenzhen), Shenzhen Research Institute of Big Data, Hong Kong 2. CryoEM Center, Southern University of Science and Technology, China |
Recommended citation (GB/T 7714) | Yuan, Zhihao, Yan, Xu, Liao, Yinghong, et al. InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring[C], 2021: 1771-1780. |
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | Action | |
10.1109@ICCV48922.20(7707KB) | -- | -- | Open access | -- | View |
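
The grounding-by-matching strategy summarized in the abstract (predict the target category from the description, keep only instances of that category, then rank the candidates by a matching confidence) can be illustrated with a minimal Python sketch. Everything here is hypothetical and is not the authors' implementation: the `Instance` class, the keyword-based `predict_category` stand-in for the language classifier, the fixed per-module scores, and the plain average used to combine them are all assumptions made for illustration. In the paper, the scores come from learned visual-language feature matching.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """Hypothetical segmented instance with precomputed matching scores."""
    instance_id: int
    category: str
    attribute_score: float     # stand-in for instance attribute perception
    relation_score: float      # stand-in for instance-to-instance relation perception
    localization_score: float  # stand-in for instance-to-background global localization


def predict_category(description: str, categories: List[str]) -> Optional[str]:
    """Toy language classifier (assumption): first known category named in the text."""
    for word in description.lower().split():
        word = word.strip(".,;:!?")
        if word in categories:
            return word
    return None


def ground(description: str, instances: List[Instance]) -> Optional[Instance]:
    """Return the candidate with the highest combined confidence, or None."""
    categories = sorted({inst.category for inst in instances})
    target = predict_category(description, categories)

    # Keep only the instances whose category matches the predicted target.
    candidates = [inst for inst in instances if inst.category == target]
    if not candidates:
        return None

    # Assumption: combine the three module scores with a plain average.
    def confidence(inst: Instance) -> float:
        return (inst.attribute_score + inst.relation_score
                + inst.localization_score) / 3.0

    return max(candidates, key=confidence)


# Hypothetical scene with two chairs and one table.
scene = [
    Instance(0, "chair", 0.9, 0.2, 0.3),
    Instance(1, "chair", 0.6, 0.8, 0.7),
    Instance(2, "table", 0.9, 0.9, 0.9),
]
best = ground("the chair next to the table", scene)
```

Filtering by category first means the table is never scored against the description, even though its individual scores are the highest in the scene. This mirrors the abstract's point that matching over a small set of instance candidates is simpler than ranking redundant object proposals.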