Title | Task-Oriented Grasp Prediction with Visual-Language Inputs |
Authors | |
DOI | |
Publication Date | 2023 |
Conference Name | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
ISSN | 2153-0858 |
ISBN | 978-1-6654-9191-4 |
Proceedings Title | |
Pages | 4881-4888 |
Conference Dates | 1-5 Oct. 2023 |
Conference Location | Detroit, MI, USA |
Place of Publication | 345 E 47TH ST, NEW YORK, NY 10017 USA |
Publisher | |
Abstract | To perform household tasks, assistive robots receive commands in the form of user language instructions for tool manipulation. The initial stage involves selecting the intended tool (i.e., object grounding) and grasping it in a task-oriented manner (i.e., task grounding). Nevertheless, prior research on visual-language grasping (VLG) focuses on object grounding, while disregarding the fine-grained impact of tasks on object grasping. Task-incompatible grasping of a tool will inevitably limit the success of subsequent manipulation steps. Motivated by this problem, this paper proposes GraspCLIP, which addresses the challenge of task grounding in addition to object grounding to enable task-oriented grasp prediction with visual-language inputs. Evaluation on a custom dataset demonstrates that GraspCLIP achieves superior performance over established baselines with object grounding only. The effectiveness of the proposed method is further validated on an assistive robotic arm for grasping previously unseen kitchen tools given the task specification. Our presentation video is available at: https://www.youtube.com/watch?v=e1wfYQPeAXU. |
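Note: the record itself contains no implementation details. As a purely illustrative aid, the sketch below shows one generic way to rank grasp candidates by their compatibility with an embedded language instruction, which is the general idea the abstract describes. It is not the authors' GraspCLIP code; every function name, array shape, and the cosine-similarity scoring rule are assumptions made only for this example.

```python
# Hypothetical sketch only -- NOT the authors' GraspCLIP implementation.
# Assumes precomputed visual features for each grasp candidate and a
# language embedding of the task instruction (e.g. from a CLIP-style
# text encoder); all names and shapes here are illustrative.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def rank_grasps(grasp_features: np.ndarray, instruction_embedding: np.ndarray) -> np.ndarray:
    """Return grasp-candidate indices sorted from most to least task-compatible.

    grasp_features: (N, D) array, one visual feature vector per grasp candidate.
    instruction_embedding: (D,) embedding of the instruction, e.g. "hand me the
        knife so I can cut" (the encoder is assumed, not provided here).
    """
    scores = np.array([cosine_similarity(f, instruction_embedding) for f in grasp_features])
    return np.argsort(-scores)  # highest-scoring grasp first


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grasps = rng.normal(size=(5, 512))   # 5 candidate grasps, 512-D features (made up)
    instruction = rng.normal(size=512)   # stand-in for a real text embedding
    print("Ranked grasp indices:", rank_grasps(grasps, instruction))
```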
Keywords | |
University Authorship | First |
Language | English |
Related Links | [IEEE Record] |
Indexed By | |
Funding Project | Shenzhen Key Laboratory of Robotics and Computer Vision [ZDSYS20220330160557001] |
WOS Research Areas | Computer Science; Robotics |
WOS Categories | Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Theory & Methods; Robotics |
WOS Accession Number | WOS:001133658803106 |
EI Accession Number | 20240315412045 |
EI Controlled Terms | Robotics |
EI Classification Codes | Computer Programming Languages: 723.1.1; Robotics: 731.5 |
Source Database | IEEE |
Full-Text Link | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10342268 |
Citation Statistics | Times Cited [WOS]: 1 |
Output Type | Conference Paper |
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/619947 |
Collection | College of Engineering_Department of Electronic and Electrical Engineering |
Author Affiliations | 1. Shenzhen Key Laboratory of Robotics and Computer Vision, Southern University of Science and Technology, Shenzhen, China; 2. Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; 3. Stanford University, United States |
First Author's Affiliation | Southern University of Science and Technology |
First Author's First Affiliation | Southern University of Science and Technology |
Recommended Citation (GB/T 7714) | Chao Tang, Dehao Huang, Lingxiao Meng, et al. Task-Oriented Grasp Prediction with Visual-Language Inputs[C]. New York, NY, USA: IEEE, 2023: 4881-4888. |
Files in This Item | No files associated with this item. |