Title | Self-Attention Networks for Code Search |
Authors | Fang, Sen; Tan, You Shuai; Zhang, Tao; Liu, Yepang |
Publication Date | 2021-06-01 |
DOI | |
Journal | Information and Software Technology |
ISSN | 0950-5849 |
Volume | 134 |
Abstract | Context: Developers tend to search for and reuse code snippets from a large-scale codebase when they want to implement functions that already exist in previous projects, which can enhance the efficiency of software development. Objective: As the first deep-learning-based code search model, DeepCS outperforms prior models such as Sourcerer and CodeHow. However, it utilizes two separate LSTMs to represent code snippets and natural language descriptions respectively, which ignores the semantic relations between code snippets and their descriptions. Consequently, the performance of DeepCS has hit a bottleneck, and our objective is to break this bottleneck. Method: We propose a self-attention joint representation learning model, named SAN-CS (Self-Attention Network for Code Search). In contrast to DeepCS, we directly utilize self-attention networks to construct our code search model. Through a weighted-average operation, self-attention networks can fully capture the contextual information of code snippets and their descriptions. We first utilize two individual self-attention networks to represent code snippets and their descriptions, respectively, and then utilize a further self-attention network to construct an extra joint representation network for code snippets and their descriptions, which builds semantic relationships between them. Therefore, SAN-CS can break the performance bottleneck of DeepCS. Results: We evaluate SAN-CS on the dataset shared by Gu et al. and choose two baseline models, DeepCS and CARLCS-CNN. Experimental results demonstrate that SAN-CS achieves significantly better performance than DeepCS and CARLCS-CNN. In addition, SAN-CS has better execution efficiency than DeepCS in both the training and testing phases. Conclusion: This paper proposes a code search model, SAN-CS, which utilizes self-attention networks to perform joint attention representations of code snippets and their descriptions. Experimental results verify the effectiveness and efficiency of SAN-CS. |
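The Method paragraph above outlines the architecture: two individual self-attention encoders for code snippets and descriptions, a further self-attention network computing a joint representation over both, and weighted-average pooling. The following is a minimal PyTorch sketch of that idea; the class names (SelfAttentionEncoder, SANCSSketch), dimensions, single-layer encoders, and plain mean pooling are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionEncoder(nn.Module):
    # One self-attention layer; each output position is a weighted
    # average of all input positions, capturing contextual information.
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x)  # x: (batch, seq_len, dim)
        return out

class SANCSSketch(nn.Module):
    # Hypothetical sketch: individual encoders for code and description,
    # plus a joint encoder over their concatenation, as the abstract describes.
    def __init__(self, code_vocab: int, desc_vocab: int, dim: int = 128):
        super().__init__()
        self.code_emb = nn.Embedding(code_vocab, dim)
        self.desc_emb = nn.Embedding(desc_vocab, dim)
        self.code_enc = SelfAttentionEncoder(dim)
        self.desc_enc = SelfAttentionEncoder(dim)
        self.joint_enc = SelfAttentionEncoder(dim)  # joint representation network

    def forward(self, code_ids: torch.Tensor, desc_ids: torch.Tensor) -> torch.Tensor:
        code = self.code_enc(self.code_emb(code_ids))
        desc = self.desc_enc(self.desc_emb(desc_ids))
        # Joint self-attention lets code tokens attend to description
        # tokens and vice versa, building semantic relations between them.
        joint = self.joint_enc(torch.cat([code, desc], dim=1))
        n_code = code.size(1)
        code_vec = joint[:, :n_code].mean(dim=1)  # pool each side to one vector
        desc_vec = joint[:, n_code:].mean(dim=1)
        return F.cosine_similarity(code_vec, desc_vec, dim=-1)

# Usage: score a batch of (code, query) pairs; search ranks snippets by score.
model = SANCSSketch(code_vocab=10000, desc_vocab=8000)
score = model(torch.randint(0, 10000, (2, 20)), torch.randint(0, 8000, (2, 10)))
print(score.shape)  # torch.Size([2])

In an actual retrieval setting, the joint encoder would be run for each candidate snippet paired with the query, and the snippets with the highest cosine similarity returned; training would typically use a ranking loss over matched and mismatched pairs, though the abstract does not specify the loss.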
Keywords | |
Related Links | [Scopus Record] |
Indexed By | |
Language | English |
University Attribution | Other |
WOS Accession Number | WOS:000634797600003 |
EI Accession Number | 20210709926236 |
EI Subject Terms | Automata Theory; Computer Software Reusability; Deep Learning; Efficiency; Learning Systems; Long Short-term Memory; Semantics; Software Design |
EI Classification Codes | Information Theory and Signal Processing: 716.1; Computer Theory, Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory: 721.1; Computer Software, Data Handling and Applications: 723; Production Engineering: 913.1 |
ESI Subject Category | COMPUTER SCIENCE |
Scopus Record ID | 2-s2.0-85100726176 |
Source Database | Scopus |
Citation Statistics | Times Cited [WOS]: 34 |
Publication Type | Journal Article |
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/221476 |
Collection | Southern University of Science and Technology, College of Engineering / Department of Computer Science and Engineering |
Author Affiliations | 1. Macau University of Science and Technology, Macau, China; 2. Southern University of Science and Technology, Shenzhen, China |
Recommended Citation (GB/T 7714) | Fang, Sen, Tan, You Shuai, Zhang, Tao, et al. Self-Attention Networks for Code Search[J]. Information and Software Technology, 2021, 134. |
APA | Fang, Sen, Tan, You Shuai, Zhang, Tao, & Liu, Yepang. (2021). Self-Attention Networks for Code Search. Information and Software Technology, 134. |
MLA | Fang, Sen, et al. "Self-Attention Networks for Code Search". Information and Software Technology 134 (2021). |
Files in This Item | There are no files associated with this item. |