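Southern University of Science and Technology repository (SUSTech KC): Research on a QoS-Oriented Co-Scheduling Mechanism for Virtual CPUs and Physical CPUs in Oversubscription Scenarios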

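Title	Research on a QoS-Oriented Co-Scheduling Mechanism for Virtual CPUs and Physical CPUs in Oversubscription Scenarios
Alternative title	QOS-AWARE CO-SCHEDULING MANAGEMENT OF VIRTUAL CPU AND PHYSICAL CPU FOR OVERSUBSCRIBED PUBLIC CLOUD
Name	彭雅娟
Name (pinyin)	PENG Yajuan
Student ID	12132557
Degree type	Master's
Degree discipline	0809 Electronic Science and Technology
Discipline category / professional degree category	08 Engineering
Supervisor	YU Zhibin (喻之斌), Research Professor
Supervisor's affiliation	Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Thesis defense date	2024-05-08
Thesis submission date	2024-07-04
Degree-granting institution	Southern University of Science and Technology
Place of degree conferral	Shenzhen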
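Abstract	With the growing adoption of cloud computing and the prevalence of multi-core virtual machines, scheduling and managing cloud resources effectively while guaranteeing users' Quality of Service (QoS) has become a key problem for both academia and industry. Studies show, however, that most data centers run at very low resource utilization. Public clouds typically use (1) multi-tenancy to raise server utilization, (2) virtualization to isolate tenants from one another, and (3) resource oversubscription to further improve resource efficiency. Prior work has not considered combining QoS-aware multi-tenancy, virtualization, and resource oversubscription. When the three coexist, three challenges arise: (1) because latency-critical (LC) online service applications, which are sensitive to tail latency, consist of large numbers of sub-millisecond tasks, the double-scheduling framework of virtualization degrades LC application performance nearly ten times as much as it degrades traditional batch applications; (2) frequent I/O calls by LC applications cause severe resource contention among thread groups inside a virtual machine; (3) the host cannot obtain application-level performance metrics to guide resource management in the public cloud. To address these challenges, this thesis proposes a QoS-aware CPU co-scheduling mechanism for the co-located management of multiple LC workloads in virtualized, oversubscribed public cloud environments, with the goal of accommodating as many virtual machines as possible without violating users' QoS. Building on the idea of VM-host cooperation, it addresses two key problems that prior work left unsolved. The first is fine-grained dynamic CPU resource scheduling, which adjusts CPU resources on the host in real time according to the performance needs of user applications and covers resource isolation both between and within virtual machines. The second is a black-box QoS prediction mechanism, which builds a model from collected metrics that characterize tail latency and uses it to guide host scheduling. On this basis, the thesis designs UFO, a practical QoS-aware resource manager. UFO requires no user-level tail-latency input; instead it uses the scheduling frequency of the guest operating system as a proxy for QoS, which suits virtualized public cloud settings. UFO dynamically adjusts the number of CPUs and the allocation scheme for the host and virtual machines, effectively reducing virtualization overhead under CPU oversubscription. To validate the effectiveness and scalability of the approach, the study uses MySQL, Nginx, and Memcached, which are widely used in public clouds, as the benchmark set, and extends the validation to the Tailbench suite. The experiments evaluate UFO on co-located applications under continuously varying and dynamically varying loads and compare it with state-of-the-art CPU managers. The results show that, in the same co-location scenario, UFO saves up to 50% (22% on average) of CPU resources; with the same number of CPUs, UFO sustains higher load. Under bursty load, UFO responds faster.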
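The abstract describes a host-side controller that reads a black-box proxy for QoS (the guest OS scheduling frequency) and resizes CPU allocations accordingly. The sketch below illustrates that general feedback idea only; it is not UFO's actual algorithm. The names `VmState` and `adjust_cores`, the baseline and tolerance parameters, and the rule that a proxy value above baseline signals QoS risk are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    """Hypothetical per-VM record kept by a host-side controller."""
    name: str
    cores: int          # physical cores currently granted to the VM
    min_cores: int = 1
    max_cores: int = 8


def adjust_cores(vm: VmState, sched_freq: float, baseline: float,
                 tolerance: float = 0.2) -> int:
    """Return the new core count after one control step.

    sched_freq: observed guest scheduling events per second (proxy metric).
    baseline:   proxy value measured while QoS was known to be met.
    Assumption (not from the thesis): a proxy well above baseline means
    QoS is at risk, and a proxy well below baseline means spare capacity.
    """
    if sched_freq > baseline * (1 + tolerance):
        # Proxy suggests QoS risk: grant one more core, up to the cap.
        vm.cores = min(vm.cores + 1, vm.max_cores)
    elif sched_freq < baseline * (1 - tolerance):
        # Proxy suggests slack: reclaim one core, down to the floor.
        vm.cores = max(vm.cores - 1, vm.min_cores)
    return vm.cores


if __name__ == "__main__":
    vm = VmState("mysql-vm", cores=4)
    for freq in (1000.0, 1500.0, 1600.0, 700.0):
        print(freq, adjust_cores(vm, freq, baseline=1000.0))
```

A one-core step per interval keeps the example simple; a real manager would likely also need a trained prediction model and a way to enforce the allocation on the host, neither of which is shown here.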
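Keywords	Virtualization; QoS; Co-scheduling; CPU resource management; Cloud computing
Language	Chinese
Training category	Independent training
Year of enrollment	2021
Year of degree conferral	2024-05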
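Degree evaluation subcommittee	Electronic Science and Technology
Chinese Library Classification number	TP39
Source database	Manual submission
Item type	Thesis
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/778989
Collection	Joint training with Shenzhen University of Advanced Technology, Chinese Academy of Sciences (in preparation)
Recommended citation (GB/T 7714)	PENG Yajuan. Research on a QoS-Oriented Co-Scheduling Mechanism for Virtual CPUs and Physical CPUs in Oversubscription Scenarios[D]. Shenzhen: Southern University of Science and Technology, 2024.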

Files in this item
File name / size	Document type	Version type	Access type	License
12132557-彭雅娟-中国科学院深圳 (8190 KB)	--	--	Restricted access	--