Title | DHive: Query Execution Performance Analysis via Dataflow in Apache Hive |
Authors | |
Corresponding author | Tang, Bo |
Publication date | 2023-08-01 |
DOI | |
Journal | |
ISSN | 2150-8097 |
Volume | 16, Issue: 12, Pages: 3998-4001 |
Abstract | Nowadays, Apache Hive has been widely used for large-scale data analysis applications in many organizations. Various visual analytical tools are developed to help Hive users quickly analyze the query execution process and identify the performance bottleneck of executed queries. However, existing tools mostly focus on showing the time usage of query sub-components (jobs and operators) but fail to provide enough evidence to analyze the root reasons for the slow execution progress. To tackle this problem, we develop a visual analytical system DHive to visualize and analyze the query execution progress via dataflow analysis. DHive shows the dataflow during query execution at multiple levels: query level, job level and task level, which enable users to identify the key jobs/tasks and explain their time usage by linking them to the auxiliary information such as the system configuration and hardware status. We demonstrate the effectiveness of DHive by two cases in a production cluster. DHive is open-source at https://github.com/DBGroupSUSTech/DHive.git. |
Related links | [Source record] |
Indexed by | |
Language | English |
University attribution | First; Corresponding |
Funding | Shenzhen Fundamental Research Program[20220815112848002]; Guangdong Provincial Key Laboratory[2020B121201001] |
WOS research area | Computer Science |
WOS category | Computer Science, Information Systems; Computer Science, Theory & Methods |
WOS accession number | WOS:001067701000066 |
Publisher | |
EI accession number | 20234314943565 |
EI subject terms | Data flow analysis |
EI classification code | Computer Software, Data Handling and Applications:723 |
Source database | Web of Science |
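Citation statistics | |
Document type | Journal article |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/582919 |
Collection | College of Engineering_Department of Computer Science and Engineering |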
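Author affiliations | 1. Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen, Peoples R China 2. Southern Univ Sci & Technol, Res Inst Trustworthy Autonomous Syst, Shenzhen, Peoples R China |
First author affiliation | Department of Computer Science and Engineering |
Corresponding author affiliation | Department of Computer Science and Engineering |
First affiliation of the first author | Department of Computer Science and Engineering |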
Recommended citation (GB/T 7714) | Zhang, Chaozu, Shen, Qiaomu, Tang, Bo. DHive: Query Execution Performance Analysis via Dataflow in Apache Hive[J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16(12): 3998-4001. |
APA | Zhang, Chaozu, Shen, Qiaomu, & Tang, Bo. (2023). DHive: Query Execution Performance Analysis via Dataflow in Apache Hive. PROCEEDINGS OF THE VLDB ENDOWMENT, 16(12), 3998-4001. |
MLA | Zhang, Chaozu, et al. "DHive: Query Execution Performance Analysis via Dataflow in Apache Hive". PROCEEDINGS OF THE VLDB ENDOWMENT 16.12 (2023): 3998-4001. |
Files in this item | There are no files associated with this item. |