Title

Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study

Authors
Tao, Yida; Tang, Shan; Liu, Yepang; Xu, Zhiwu; Qin, Shengchao
Publication date
2021-07-01
DOI
Journal
ACM Transactions on Software Engineering and Methodology
ISSN
1049-331X
EISSN
1557-7392
Volume: 30, Issue: 4
Abstract

As data volume and complexity grow at an unprecedented rate, the performance of data manipulation programs is becoming a major concern for developers. In this article, we study how alternative API choices could improve data manipulation performance while preserving task-specific input/output equivalence. We propose a lightweight approach that leverages the comparative structures in Q&A sites to extract alternative implementations. On a large dataset of Stack Overflow posts, our approach extracts 5,080 pairs of alternative implementations that invoke different data manipulation APIs to solve the same tasks, with an accuracy of 86%. Experiments show that for 15% of the extracted pairs, the faster implementation achieved >10x speedup over its slower alternative. We also characterize 68 recurring alternative API pairs from the extraction results to understand the types of APIs that can be used alternatively. To put these findings into practice, we implement a tool, AlterApi, to automatically optimize real-world data manipulation programs. In the 1,267 optimization attempts on the Kaggle dataset, 76% achieved desirable performance improvements with up to orders-of-magnitude speedup. Finally, we discuss notable challenges of using alternative APIs for optimizing data manipulation programs. We hope that our study offers a new perspective on API recommendation and automatic performance optimization.
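To make the idea of "alternative implementations" concrete, the following is a minimal sketch of how one such pair might be compared. It is not taken from the paper or from AlterApi. The two counting functions, the `compare` helper, and the sample data are all hypothetical, and standard-library calls stand in for the data manipulation APIs the study examines. The sketch first checks input/output equivalence on a specific input, then times both versions.

```python
import timeit
from collections import Counter


def count_with_list_count(values):
    """Slower alternative: one full scan of the list per distinct value."""
    return {v: values.count(v) for v in set(values)}


def count_with_counter(values):
    """Faster alternative: a single pass using collections.Counter."""
    return dict(Counter(values))


def compare(slow, fast, data, repeat=3):
    """Check equivalence on `data`, then return the best time of each version.

    Equivalence is only established for this one input, which mirrors the
    task-specific (not general) equivalence described in the abstract.
    """
    if slow(data) != fast(data):
        raise ValueError("implementations are not equivalent on this input")
    t_slow = min(timeit.repeat(lambda: slow(data), number=1, repeat=repeat))
    t_fast = min(timeit.repeat(lambda: fast(data), number=1, repeat=repeat))
    return t_slow, t_fast


if __name__ == "__main__":
    # Hypothetical workload: 20,000 values drawn from 200 distinct keys.
    data = [i % 200 for i in range(20_000)]
    t_slow, t_fast = compare(count_with_list_count, count_with_counter, data)
    print(f"list.count: {t_slow:.4f}s  Counter: {t_fast:.4f}s  "
          f"speedup: {t_slow / t_fast:.1f}x")
```

The measured speedup depends on the input size, the number of distinct keys, and the machine, so any ratio printed here should be read as illustrative rather than as a reproduction of the reported results.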

Keywords
Related links: [Scopus record]
Indexed by
SCI ; EI
Language
English
University attribution
Other
WOS accession number
WOS:000683039100009
EI accession number
20213210753826
EI main headings
Computer Software ; Software Engineering
EI classification codes
Computer Software, Data Handling and Applications:723 ; Computer Programming:723.1
ESI research field
COMPUTER SCIENCE
Scopus record number
2-s2.0-85112068274
Source database
Scopus
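Citation statistics
Times cited [WOS]: 1
Document type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/242729
Collection: College of Engineering_Department of Computer Science and Engineering
Author affiliations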
1.College of Computer Science and Software Engineering,Shenzhen University,Shenzhen,3688 Nanhai Avenue,China
2.Department of Computer Science and Engineering,Southern University of Science and Technology,Shenzhen,1088 Xueyuan Avenue,China
3.Shenzhen University,Shenzhen,3688 Nanhai Avenue,China
Recommended citation
GB/T 7714
Tao, Yida, Tang, Shan, Liu, Yepang, et al. Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study[J]. ACM Transactions on Software Engineering and Methodology, 2021, 30(4).
APA
Tao, Yida, Tang, Shan, Liu, Yepang, Xu, Zhiwu, & Qin, Shengchao. (2021). Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study. ACM Transactions on Software Engineering and Methodology, 30(4).
MLA
Tao, Yida, et al. "Speeding up Data Manipulation Tasks with Alternative Implementations: An Exploratory Study." ACM Transactions on Software Engineering and Methodology 30.4 (2021).