Title | Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study |
Authors | |
Publication date | 2021-07-01 |
DOI | |
Journal | |
ISSN | 1049-331X |
EISSN | 1557-7392 |
Volume | 30 Issue: 4 |
Abstract | As data volume and complexity grow at an unprecedented rate, the performance of data manipulation programs is becoming a major concern for developers. In this article, we study how alternative API choices could improve data manipulation performance while preserving task-specific input/output equivalence. We propose a lightweight approach that leverages the comparative structures in Q&A sites to extract alternative implementations. On a large dataset of Stack Overflow posts, our approach extracts 5,080 pairs of alternative implementations that invoke different data manipulation APIs to solve the same tasks, with an accuracy of 86%. Experiments show that for 15% of the extracted pairs, the faster implementation achieved >10x speedup over its slower alternative. We also characterize 68 recurring alternative API pairs from the extraction results to understand the types of APIs that can be used alternatively. To put these findings into practice, we implement a tool, AlterApi, to automatically optimize real-world data manipulation programs. In the 1,267 optimization attempts on the Kaggle dataset, 76% achieved desirable performance improvements with up to orders-of-magnitude speedup. Finally, we discuss notable challenges of using alternative APIs for optimizing data manipulation programs. We hope that our study offers a new perspective on API recommendation and automatic performance optimization. |
Keywords | |
Related links | [Scopus record] |
Indexed by | |
Language | English |
University attribution | Other |
WOS accession number | WOS:000683039100009 |
EI accession number | 20213210753826 |
EI subject terms | Computer Software; Software Engineering |
EI classification codes | Computer Software, Data Handling and Applications: 723; Computer Programming: 723.1 |
ESI research field | COMPUTER SCIENCE |
Scopus record ID | 2-s2.0-85112068274 |
Source database | Scopus |
Citation statistics | Times cited [WOS]: 1 |
Document type | Journal article |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/242729 |
Collection | College of Engineering_Department of Computer Science and Engineering |
Author affiliations | 1. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 3688 Nanhai Avenue, China 2. Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 1088 Xueyuan Avenue, China 3. Shenzhen University, Shenzhen, 3688 Nanhai Avenue, China |
Recommended citation (GB/T 7714) |
Tao, Yida, Tang, Shan, Liu, Yepang, et al. Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study[J]. ACM Transactions on Software Engineering and Methodology, 2021, 30(4).
|
APA |
Tao, Yida, Tang, Shan, Liu, Yepang, Xu, Zhiwu, & Qin, Shengchao. (2021). Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study. ACM Transactions on Software Engineering and Methodology, 30(4).
|
MLA |
Tao, Yida, et al. "Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study." ACM Transactions on Software Engineering and Methodology 30.4 (2021).
|
Files in this item | There are no files associated with this item. |
|