Title

面向Spark任务执行过程的多粒度可视分析方法研究

Alternative title
RESEARCH ON MULTI-GRAINED VISUAL ANALYSIS METHOD FOR APACHE SPARK APPLICATION EXECUTION PROCESS
Author
李倩
Author (pinyin)
LI Qian
Student ID
12032465
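Degree type
Master's
Degree discipline
0809 Electronic Science and Technology
Discipline category / professional degree category
08 Engineering
Supervisor
TANG Bo (唐博)
Supervisor's affiliation
Department of Computer Science and Engineering
Thesis defense date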
2023-05-13
Thesis submission date
2023-07-01
Degree-granting institution
Southern University of Science and Technology
Place of degree conferral
Shenzhen
Abstract

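With the rapid development of new-generation information technologies such as the Internet of Things, cloud computing, and artificial intelligence, the storage and computing capacity of a single machine can no longer cope with today's data volumes. Spark relies on big data clusters to accelerate application execution: a job is decomposed into tasks that multiple machines process in parallel, which greatly shortens execution time. However, because of system complexity, network congestion, data distribution, and other factors, operations staff must locate performance problems in large volumes of heterogeneous log data, a process that often takes considerable time and effort. Visualization techniques transform abstract log data into graphics, helping analysts observe an application's execution effectively and, combined with their expert knowledge, discover execution anomalies and analyze their potential causes. Nevertheless, visual analysis of Spark execution still faces challenges including complex dependencies among logical operations, a very large number of atomic tasks, and system behavior that is hard to predict.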

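Based on time-series data extracted from Spark's parallel computing process, this thesis implements a visualization system for detecting and analyzing Spark performance problems. The system provides multi-grained visual analysis methods and rich interactions that help users progressively explore the factors behind anomalous execution. Specifically, to address the complex dependencies among logical operations, this work designs a new layout algorithm that optimizes the directed acyclic graph (DAG) used to visualize application execution, improving the spatial layout of the view and reducing crossings and overlaps, and proposes a class of scoring methods, generally applicable to parallel computing frameworks, for identifying global execution anomalies. To address the large number of atomic tasks, this work maps the distribution of atomic tasks onto a scatterplot so that anomalous task patterns can be identified intuitively, and develops an improved parallel coordinates plot based on clustering and curve bundling to analyze correlations among the attributes of a group of similar atomic tasks and to identify anomalous task groups. To address the unpredictability of system behavior, this work implements a linked visualization of system status and computing task execution. Three case studies illustrate how the system identifies performance problems related to resources, workload, and data, and feedback collected from big data operations experts demonstrates the system's effectiveness.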
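
The abstract does not give the formula behind the scoring method, so the following is only an illustrative sketch of how a framework-agnostic anomaly score could work, not the thesis's method. It assumes a hypothetical score, the longest task duration in a stage divided by the median task duration, so that a stage with one straggler ranks above a balanced stage. The function names and the sample durations are invented for illustration.

```python
from statistics import median


def stage_skew_score(task_durations):
    """Hypothetical score: longest task duration divided by the median duration."""
    if not task_durations:
        raise ValueError("stage has no tasks")
    med = median(task_durations)
    if med <= 0:
        raise ValueError("median duration must be positive")
    return max(task_durations) / med


def rank_stages(stages):
    """Return (stage_id, score) pairs sorted from most to least skewed."""
    scores = {stage_id: stage_skew_score(d) for stage_id, d in stages.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented task durations in seconds; stage-1 contains one straggler task.
stages = {
    "stage-0": [10.0, 11.0, 9.0, 10.0],
    "stage-1": [10.0, 10.0, 10.0, 40.0],
}
ranking = rank_stages(stages)
for stage_id, score in ranking:
    print(f"{stage_id}: {score:.2f}")
```

A ratio against the median is used here because it depends only on task durations, which any parallel framework that splits work into tasks can report; an analyst would still need the finer-grained views to explain why a highly ranked stage is skewed.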

Keywords
Language
Chinese
Training category
Independently trained
Year of enrollment
2020
Degree conferral year
2023-06
Degree evaluation subcommittee
Electronic Science and Technology
Chinese Library Classification number
TP391
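Source repository
Manual submission
Output type
Thesis
Item identifier
http://sustech.caswiz.com/handle/2SGJ60CL/544755
Collection
College of Engineering_Department of Computer Science and Engineering
Recommended citation format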
GB/T 7714
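LI Qian. Research on Multi-Grained Visual Analysis Method for Apache Spark Application Execution Process[D]. Shenzhen: Southern University of Science and Technology, 2023.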
Files in this item
File name / size
12032465-李倩-计算机科学与工程 (11973 KB)
Access type
Restricted access