SUSTech KC (南方科技大学知识苑): 基于GPU的射频电路稳态仿真加速算法 (GPU-Based RF Circuit Steady-State Simulation Acceleration Algorithm)

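Title	基于GPU的射频电路稳态仿真加速算法
Alternative title	GPU-Based RF Circuit Steady-State Simulation Acceleration Algorithm
Name	王正卓
Name (pinyin)	WANG Zhengzhuo
Student ID	12032653
Degree type	Master's
Degree discipline	080903 Microelectronics and Solid-State Electronics
Discipline category / professional degree category	08 Engineering
Advisor	陈全 (CHEN Quan)
Advisor affiliation	School of Microelectronics (深港微电子学院)
Thesis defense date	2023-05-15
Thesis submission date	2023-06-27
Degree-granting institution	Southern University of Science and Technology (南方科技大学)
Place of degree conferral	Shenzhen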
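Abstract	The growth of the communications and integrated-circuit industries has driven the development of RF circuits and, with it, the demand for large-scale RF circuit simulation, of which steady-state simulation is the foundation for all other RF circuit analyses. However, most open-source circuit simulators and some Chinese commercial simulators lack complete RF steady-state simulation capability, and the mainstream commercial simulators that do provide it suffer from long run times on large RF circuits. Multi-core CPUs and GPUs with the latest architectures offer an ideal parallel computing platform for accelerating the time-consuming analyses in large-scale RF circuit simulation. This thesis first implements RF steady-state simulation methods in an open-source circuit simulator, verifies the accuracy of the results by comparison, and completes and optimizes the simulator's steady-state functionality. Next, building on one of these methods, the harmonic balance method, it exploits the structure of the method and the characteristics of nonlinear circuits to propose a GPU acceleration scheme. Finally, the harmonic balance method is partially parallelized, and an acceleration algorithm for RF steady-state simulation is developed on a hybrid GPU-CPU platform to improve simulation performance and is integrated into the open-source circuit simulator. Numerical experiments on industrial examples show that, for steady-state analysis of nonlinear RF circuits, the proposed GPU-based harmonic balance method is about 3 times faster than the original harmonic balance method and 6.5 times faster than the original version of another steady-state method, the shooting method, while maintaining comparable accuracy.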
Alternative abstract	The prosperity of the communication and integrated circuit industries has not only promoted the development of RF circuits but also driven the demand for large-scale RF circuit simulation, in which steady-state simulation is the basis of all other RF circuit simulation. However, most open-source circuit simulators and some domestic commercial simulators do not have a complete steady-state simulation function for RF circuits. Mainstream commercial simulators that do have this function face long simulation times when the RF circuit is large. Multi-core CPUs and GPUs with the latest architectures provide an ideal parallel computing platform that can speed up time-consuming simulation analysis of large-scale RF circuits. In this thesis, we first implement steady-state simulation methods for RF circuits in an open-source circuit simulator, verify the accuracy of the simulation results by comparison, and improve and optimize the steady-state simulation function of the simulator. Then, based on one of these methods, the harmonic balance method, and utilizing its structure and the characteristics of nonlinear circuits, we propose a GPU acceleration scheme. Finally, the harmonic balance method is partially parallelized, and an acceleration algorithm for steady-state simulation is developed on a hybrid GPU-CPU platform to boost simulation performance and is integrated into the open-source circuit simulator. Numerical results on industrial cases show that, at similar accuracy, the GPU-based harmonic balance method is about 3 times faster than the original harmonic balance method and 6.5 times faster than the original version of the shooting method, another steady-state simulation method.
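To make the harmonic balance idea in the abstract concrete, here is a minimal sketch; it is not the thesis's implementation. It assumes a toy circuit that does not appear in the record: a sinusoidal current source driving a parallel resistor, capacitor and cubic nonlinear conductance, so that C dv/dt + v/R + a v^3 = I0 cos(wt). The function name `harmonic_balance`, all parameter values, and the dense CPU solve with NumPy are illustrative assumptions. The periodic solution is represented by 2H+1 samples over one period, the time derivative is applied through the Fourier basis, and the resulting nonlinear system is solved with Newton's method.

```python
import numpy as np

def harmonic_balance(harmonics=8, freq=1e6, R=1e3, C=100e-12, I0=2e-3, a=2e-4,
                     rtol=1e-9, max_iter=50):
    """Periodic steady state of C*dv/dt + v/R + a*v**3 = I0*cos(w*t).

    Hypothetical toy circuit and values, chosen only for illustration.
    Returns the sample times over one period and the voltage samples.
    """
    n = 2 * harmonics + 1                  # odd sample count: harmonics -H..H
    w = 2.0 * np.pi * freq
    t = np.arange(n) / (n * freq)          # uniform samples over one period
    k = np.fft.fftfreq(n, d=1.0 / n)       # harmonic numbers in FFT order

    # Spectral differentiation matrix: D @ x == ifft(j*w*k * fft(x)).
    dft = np.fft.fft(np.eye(n), axis=0)
    D = np.fft.ifft(1j * w * k[:, None] * dft, axis=0).real

    i_src = I0 * np.cos(w * t)
    v = np.zeros(n)                        # Newton start: zero voltage
    for _ in range(max_iter):
        residual = C * (D @ v) + v / R + a * v**3 - i_src
        if np.max(np.abs(residual)) <= rtol * abs(I0):
            return t, v
        # Jacobian: linear part plus the diagonal of the nonlinear conductance.
        jacobian = C * D + np.diag(1.0 / R + 3.0 * a * v**2)
        v = v - np.linalg.solve(jacobian, residual)
    raise RuntimeError("Newton iteration did not converge")

if __name__ == "__main__":
    t, v = harmonic_balance()
    spectrum = np.abs(np.fft.fft(v)) / len(v)
    print("fundamental and third-harmonic magnitudes:", spectrum[1], spectrum[3])
```

Each Newton step in this sketch ends in a dense linear solve on the CPU. The abstract does not say which parts of the method the thesis moves to the GPU, so the sketch makes no attempt to mirror that split.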
Keywords	RF circuit simulation; harmonic balance method; GPU; steady-state analysis
Other keywords	RF Circuit Simulation; Harmonic Balance Method; GPU; Steady-state Analysis
Language	Chinese
Training category	Independent training
Year of enrollment	2020
Year of degree conferral	2023-06
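Degree evaluation subcommittee	Electronic Science and Technology
Chinese Library Classification number	TN454
Source database	Manual submission
Output type	Degree thesis
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/544121
Collection	SUSTech-HKUST Shenzhen-Hong Kong School of Microelectronics Preparatory Office (南方科技大学-香港科技大学深港微电子学院筹建办公室)
Recommended citation (GB/T 7714)	王正卓. 基于GPU的射频电路稳态仿真加速算法[D]. 深圳: 南方科技大学, 2023.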

Files in this item
File name/size	Document type	Version type	Access type	License
12032653-王正卓-南方科技大学-（2756KB）	Degree thesis	--	Restricted access	CC BY-NC-SA