[1] MISHRA A, SUN X, JAIN A, et al. The great internet TCP congestion control census[J]. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 2019, 3(3): 1-24.
[2] BLUM N, LACHAPELLE S, ALVERSTRAND H. WebRTC: Real-time communication for the open web platform[J]. Communications of the ACM, 2021, 64(8): 50-54.
[3] LANGLEY A, RIDDOCH A, WILK A, et al. The QUIC transport protocol: Design and internet-scale deployment[C]//Proceedings of the Conference of the ACM Special Interest Group on Data Communication. 2017: 183-196.
[4] JANSEN B, GOODWIN T, GUPTA V, et al. Performance evaluation of WebRTC-based video conferencing[J]. ACM SIGMETRICS Performance Evaluation Review, 2018, 45(3): 56-68.
[5] YANG F, WU Q, LI Z, et al. BBRv2+: Towards balancing aggressiveness and fairness with delay-based bandwidth probing[J]. Computer Networks, 2022, 206: 108789.
[6] HA S, RHEE I, XU L. CUBIC: a new TCP-friendly high-speed TCP variant[J]. ACM SIGOPS Operating Systems Review, 2008, 42(5): 64-74.
[7] ABBASLOO S, XU Y, CHAO H J. C2TCP: A flexible cellular TCP to meet stringent delay requirements[J]. IEEE Journal on Selected Areas in Communications, 2019, 37(4): 918-932.
[8] ALIZADEH M, GREENBERG A, MALTZ D A, et al. Data center TCP (DCTCP)[C]//Proceedings of the ACM SIGCOMM 2010 Conference. 2010: 63-74.
[9] ARUN V, BALAKRISHNAN H. Copa: Practical delay-based congestion control for the internet[C]//15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). 2018: 329-342.
[10] MITTAL R, LAM V T, DUKKIPATI N, et al. TIMELY: RTT-based congestion control for the datacenter[J]. ACM SIGCOMM Computer Communication Review, 2015, 45(4): 537-550.
[11] WINSTEIN K, SIVARAMAN A, BALAKRISHNAN H. Stochastic forecasts achieve high throughput and low delay over cellular networks[C]//10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13). 2013: 459-471.
[12] ABBASLOO S, YEN C Y, CHAO H J. Classic meets modern: A pragmatic learning-based congestion control for the internet[C]//Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication. 2020: 632-647.
[13] JAY N, ROTMAN N, GODFREY B, et al. A deep reinforcement learning perspective on internet congestion control[C]//International Conference on Machine Learning. PMLR, 2019: 3050-3059.
[14] MA Y, TIAN H, LIAO X, et al. Multi-objective congestion control[C]//Proceedings of the Seventeenth European Conference on Computer Systems. 2022: 218-235.
[15] LI X, TANG F, LIU J, et al. AUTO: Adaptive congestion control based on multi-objective reinforcement learning for the satellite-ground integrated network[C]//2021 USENIX Annual Technical Conference (USENIX ATC 21). 2021: 611-624.
[16] DONG M, MENG T, ZARCHY D, et al. PCC Vivace: Online-learning congestion control[C]//15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). 2018: 343-356.
[17] ABBASLOO S, YEN C Y, CHAO H J. Wanna make your TCP scheme great for cellular networks? Let machines do it for you![J]. IEEE Journal on Selected Areas in Communications, 2020, 39(1): 265-279.
[18] EMARA S, LI B, CHEN Y. Eagle: Refining congestion control by learning from the experts[C]//IEEE INFOCOM 2020-IEEE Conference on Computer Communications. IEEE, 2020: 676-685.
[19] ZHANG H, ZHOU A, HU Y, et al. Loki: improving long tail performance of learning-based real-time video adaptation by fusing rule-based models[C]//Proceedings of the 27th Annual International Conference on Mobile Computing and Networking. 2021: 775-788.
[20] ZHANG H, ZHOU A, LU J, et al. OnRL: improving mobile video telephony via online reinforcement learning[C]//Proceedings of the 26th Annual International Conference on Mobile Computing and Networking. 2020: 1-14.
[21] LI W, GAO S, LI X, et al. TCP-NeuroC: Neural adaptive TCP congestion control with online changepoint detection[J]. IEEE Journal on Selected Areas in Communications, 2021, 39(8): 2461-2475.
[22] WANG B, ZHANG Y, QIAN S, et al. A hybrid receiver-side congestion control scheme for web real-time communication[C]//Proceedings of the 12th ACM Multimedia Systems Conference. 2021: 332-338.
[23] DU Z, ZHENG J, YU H, et al. A unified congestion control framework for diverse application preferences and network conditions[C]//Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies. 2021: 282-296.
[24] BRAKMO L S, O'MALLEY S W, PETERSON L L. TCP Vegas: New techniques for congestion detection and avoidance[C]//Proceedings of the conference on Communications architectures, protocols and applications. 1994: 24-35.
[25] WINSTEIN K, BALAKRISHNAN H. TCP ex machina: Computer-generated congestion control[J]. ACM SIGCOMM Computer Communication Review, 2013, 43(4): 123-134.
[26] SIVAKUMAR V, DELALLEAU O, ROCKTÄSCHEL T, et al. MVFST-RL: An asynchronous RL framework for congestion control with delayed actions[J]. arXiv preprint arXiv:1910.04054, 2019.
[27] LAI H, LI Q, JIANG Y. Congestion control switching scheme of transmission control protocol based on scenario changes[J]. Journal of Computer Applications, 2022, 42(4): 1225.
[28] ZHENG Y, CHEN H, DUAN Q, et al. Leveraging domain knowledge for robust deep reinforcement learning in networking[C]//IEEE INFOCOM 2021-IEEE Conference on Computer Communications. IEEE, 2021: 1-10.
[29] SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms[C]//International conference on machine learning. PMLR, 2014: 387-395.
[30] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[31] VAN HASSELT H, GUEZ A, SILVER D. Deep reinforcement learning with double q-learning[C]//Proceedings of the AAAI conference on artificial intelligence. 2016, 30(1).
[32] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint arXiv:1509.02971, 2015.
[33] SCHULMAN J, LEVINE S, ABBEEL P, et al. Trust region policy optimization[C]//International conference on machine learning. PMLR, 2015: 1889-1897.
[34] MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]//International conference on machine learning. PMLR, 2016: 1928-1937.
[35] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv preprint arXiv:1707.06347, 2017.
[36] FUJIMOTO S, HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]//International conference on machine learning. PMLR, 2018: 1587-1596.
[37] GARCIA J, FERNÁNDEZ F. A comprehensive survey on safe reinforcement learning[J]. Journal of Machine Learning Research, 2015, 16(1): 1437-1480.
[38] MAO H, SCHWARZKOPF M, HE H, et al. Towards safe online reinforcement learning in computer systems[C]//NeurIPS Machine Learning for Systems Workshop. 2019.
[39] ALSHIEKH M, BLOEM R, EHLERS R, et al. Safe reinforcement learning via shielding[C]//Proceedings of the AAAI conference on artificial intelligence. 2018, 32(1).
[40] THOMAS G, LUO Y, MA T. Safe reinforcement learning by imagining the near future[J]. Advances in Neural Information Processing Systems, 2021, 34: 13859-13869.
[41] FULTON N, PLATZER A. Safe reinforcement learning via formal methods: Toward safe control through proof and learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2018, 32(1).
[42] BECK J, VUORIO R, LIU E Z, et al. A survey of meta-reinforcement learning[J]. arXiv preprint arXiv:2301.08028, 2023.
[43] DUAN Y, SCHULMAN J, CHEN X, et al. RL²: Fast reinforcement learning via slow reinforcement learning[J]. arXiv preprint arXiv:1611.02779, 2016.
[44] FINN C, ABBEEL P, LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[C]//International conference on machine learning. PMLR, 2017: 1126-1135.
[45] GUPTA A, MENDONCA R, LIU Y X, et al. Meta-reinforcement learning of structured exploration strategies[J]. Advances in neural information processing systems, 2018, 31.
[46] RAKELLY K, ZHOU A, FINN C, et al. Efficient off-policy meta-reinforcement learning via probabilistic context variables[C]//International conference on machine learning. PMLR, 2019: 5331-5340.
[47] TIAN H, LIAO X, ZENG C, et al. Spine: an efficient DRL-based congestion control with ultra-low overhead[C]//Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies. 2022: 261-275.
[48] ZHANG J, ZENG C, ZHANG H, et al. LiteFlow: towards high-performance adaptive neural networks for kernel datapath[C]//Proceedings of the ACM SIGCOMM 2022 Conference. 2022: 414-427.
[49] WANG Y, CHEN K, TAN H, et al. Tabi: An Efficient Multi-Level Inference System for Large Language Models[C]//Proceedings of the Eighteenth European Conference on Computer Systems. 2023: 233-248.
[50] AKGUN I U, AYDIN A S, ZADOK E. KMLIB: Towards Machine Learning for Operating Systems[C]//Proceedings of the On-Device Intelligence Workshop, co-located with the MLSys Conference. 2020: 1-6.
[51] BROCKMAN G, CHEUNG V, PETTERSSON L, et al. OpenAI Gym[J]. arXiv preprint arXiv:1606.01540, 2016.
[52] NETRAVALI R, SIVARAMAN A, DAS S, et al. Mahimahi: accurate record-and-replay for HTTP[C]//2015 USENIX Annual Technical Conference (USENIX ATC 15). 2015: 417-429.
[53] ABBASLOO S, YEN C Y, CHAO H J. Make TCP great (again?!) in cellular networks: A deep reinforcement learning approach[J]. arXiv preprint arXiv:1912.11735, 2019.
[54] FLOYD S. Metrics for the evaluation of congestion control mechanisms: RFC 5166[R]. 2008.
[55] GIESSLER A, HAENLE J, KÖNIG A, et al. Free buffer allocation—An investigation by simulation[J]. Computer Networks (1976), 1978, 2(3): 191-208.