SUSTech KC (南方科技大学知识苑): Web GUI Testing Based on Reinforcement Learning

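Title	基于强化学习的网页界面测试的研究 (Web GUI Testing Based on Reinforcement Learning)
Alternative Title	WEB GUI TESTING BASED ON REINFORCEMENT LEARNING
Name	樊宇佳
Name (Pinyin)	FAN Yujia
Student ID	12132331
Degree Type	Master's
Degree Discipline	0809 Electronic Science and Technology
Discipline Category / Professional Degree Category	08 Engineering
Supervisor	LIU Yepang (刘烨庞)
Supervisor's Affiliation	Department of Computer Science and Engineering
Thesis Defense Date	2024-05-13
Thesis Submission Date	2024-07-03
Degree-Granting Institution	Southern University of Science and Technology
Place of Degree Conferral	Shenzhen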
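Abstract	In recent years, Web testing techniques based on reinforcement learning (RL) have attracted considerable attention in academia and show good prospects for industrial application. Compared with traditional testing methods, RL-based methods learn autonomously and optimize their policies, which makes them more flexible and adaptable to rapidly changing Web environments. However, existing techniques adopt other strategies in addition to the RL algorithm to improve testing effectiveness, for example using a deterministic finite automaton (DFA) to guide state recovery, or using an input text generator to raise the pass rate of input actions. As a result, it is hard to determine how much the RL algorithm itself contributes to the performance of a testing tool. In view of this, the first piece of work in this thesis surveys the related techniques, summarizes what existing methods have in common, and proposes a single-agent RL-based Web testing framework comprising four components. Combining different designs and configurations of these components yields 216 distinct configuration combinations, whose testing performance is comprehensively evaluated on two commonly used open-source websites and one large commercial website. The experimental results show that the abstraction of Web page states has the greatest impact on testing effectiveness, and that representing a page state as a set of elements has a more general advantage. The choice of the other components is also crucial, and different configuration options perform differently on different types of Web applications; this thesis analyzes the performance and strengths of these options in detail. This work can inspire future research and can also guide practical application in industry. In the comparative experiments on the single-agent RL-based Web testing algorithms above, the thesis finds that testing performance has an upper bound when only a single agent is used. The second piece of work therefore introduces a multi-agent approach to improve testing efficiency and coverage. The key to designing a testing algorithm based on multi-agent reinforcement learning (MARL) lies in designing the information structure of the whole system and the communication mode among agents. In general, information structures fall into three classes: fully decentralized, decentralized with networked agents, and centralized. The thesis designs a MARL-based Web testing system that allows agents to execute testing tasks asynchronously, and proposes two Q-learning-based communication algorithms for the latter two information structures. Evaluating the multi-agent system on six real-world commercial websites of different categories shows that, with five agents configured, the number of explored states rises to 2.95 times on average, with a maximum of 11.2 times; under the remaining evaluation metrics, the average reaches 2.24 to 2.57 times. The experimental results show that MARL has great potential for improving the efficiency and performance of large-scale website testing tasks. These findings provide useful guidance and insight for the further development of the Web testing field.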
Alternative Abstract	In recent years, Web testing techniques based on Reinforcement Learning (RL) have gained considerable academic attention and hold promise for industrial application because of their flexibility and adaptability in dynamic Web environments. However, existing RL-based testing tools incorporate additional modules, such as DFA-guided state recovery or input data generation, which makes the contribution of RL itself unclear. Therefore, the first work of this thesis investigates related work, identifies commonalities, and proposes a single-agent RL-based Web testing framework consisting of four components. By combining different options for these components, 216 configuration combinations are evaluated on two popular open-source websites and one large-scale commercial website. The results indicate that state abstraction has the greatest impact on performance, with “Set of Elements” showing a universal advantage. The other components are also crucial, and different options perform differently across different websites. This thesis offers a detailed analysis of the performance and benefits of these options, providing valuable guidance for researchers in the field. The first work reveals that there is often an upper bound on the testing performance of single-agent RL algorithms. Based on this observation, the second work of this thesis explores the use of multi-agent methods for Web testing to enhance coverage and efficiency. The key to designing a tool based on Multi-Agent RL (MARL) is the system’s information structure and the agent communication mechanism. Generally, MARL can be classified into three types based on information structure: fully decentralized, decentralized with networked agents, and centralized settings. This thesis presents a MARL-based Web testing system that allows agents to perform UI events asynchronously. Specifically, two Q-learning-based communication algorithms are proposed for the latter two types of information structures. An evaluation of the system on six real-world websites shows that, when the system is configured with five agents, the average number of detected states increases to 2.95 times, with a maximum improvement of 11.2 times. The results demonstrate the potential of MARL for improving efficiency and performance in large-scale Web testing tasks.
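The abstracts describe tabular Q-learning over abstracted page states, with "Set of Elements" as one state abstraction. The following is a minimal illustrative sketch of that idea, not the thesis's implementation. Everything in it is an assumption made for illustration: the in-memory `TOY_SITE`, the count-based novelty reward, the epsilon-greedy policy, and all parameter values.

```python
import random
from collections import defaultdict

# Hypothetical toy site (assumption): page -> {clickable element id -> destination page}.
TOY_SITE = {
    "home":   {"nav-login": "login", "nav-list": "list"},
    "login":  {"btn-submit": "home", "link-help": "help"},
    "list":   {"item-1": "detail", "nav-home": "home"},
    "detail": {"btn-back": "list"},
    "help":   {"nav-home": "home"},
}


def abstract_state(page):
    """Set-of-elements abstraction: a state is the set of element ids on the page.

    Two pages exposing the same elements would collapse into one state.
    """
    return frozenset(TOY_SITE[page])


def explore(steps=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Explore the toy site with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = defaultdict(float)     # (state, action) -> Q-value
    visits = defaultdict(int)  # (state, action) -> number of executions
    seen_states = set()
    page = "home"
    for _ in range(steps):
        state = abstract_state(page)
        seen_states.add(state)
        actions = sorted(state)
        if rng.random() < epsilon:
            action = rng.choice(actions)                        # explore
        else:
            action = max(actions, key=lambda a: q[(state, a)])  # exploit
        visits[(state, action)] += 1
        # Assumed novelty reward: repeating the same action pays less each time.
        reward = 1.0 / visits[(state, action)]
        page = TOY_SITE[page][action]  # simulate the click
        next_state = abstract_state(page)
        best_next = max(q[(next_state, a)] for a in next_state)
        # Standard Q-learning update.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return seen_states, q


if __name__ == "__main__":
    seen, _ = explore()
    print(f"distinct abstract states explored: {len(seen)}")
```

In a real tool the dictionary lookup would be replaced by driving a browser and reading the clickable elements from the live page. How states are abstracted then decides which pages count as the same state, which is the component the abstracts report as having the greatest impact.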
Keywords	Web GUI testing; reinforcement learning; Q-learning; multi-agent system
Alternative Keywords	Web GUI Testing; Reinforcement Learning; Q-learning; Multi-agent System
Language	Chinese
Training Category	Independent training
Enrollment Year	2021
Degree Conferral Year	2024-06
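Degree Evaluation Subcommittee	Electronic Science and Technology
Chinese Library Classification Number	TP311.5
Source Repository	Manual submission
Output Type	Thesis
Item Identifier	http://sustech.caswiz.com/handle/2SGJ60CL/778902
Collection	College of Engineering_Department of Computer Science and Engineering
Recommended Citation (GB/T 7714)	FAN Yujia. Web GUI Testing Based on Reinforcement Learning[D]. Shenzhen: Southern University of Science and Technology, 2024.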

Files in This Item
File Name/Size	Document Type	Version Type	Access Type	License
12132331-樊宇佳-计算机科学与工 (7558KB)	--	--	Restricted access	--