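Title | 人工智能图像处理的边缘计算硬件优化 (Edge Computing Hardware Optimization for Artificial Intelligence Image Processing) |
Alternative title | EDGE COMPUTING ACCELERATION FOR ARTIFICIAL INTELLIGENCE ENHANCED IMAGE PROCESSING
|
Name | |
Student ID | 11849055
|
Degree type | Master's
|
Degree major | Microelectronics and Solid-State Electronics
|
Supervisor | |
Thesis defense date | 2020-05-27
|
Thesis submission date | 2020-07-17
|
Degree-granting institution | Harbin Institute of Technology
|
Place of degree conferral | Shenzhen
|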
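Abstract | With the continued development of Internet of Things (IoT) technology, human society has entered an intelligent information age in which everything is connected. Smart devices such as intelligent robots, smart access control systems, and smart home appliances have made daily life far more convenient and have changed how people live. Image processing is one of the main requirements of smart devices and is applied in fields such as object detection, image segmentation, and medical imaging analysis. Image processing usually requires complex computation; applying artificial intelligence techniques to it works well, reducing the amount of computation and improving image processing performance. In the cloud computing model, local IoT terminal devices send the collected data over the network to a cloud server, which performs the computation. The cloud computing model compensates well for the poor performance of IoT devices, but as the number of IoT terminal devices keeps growing and huge numbers of devices connect to the network, it becomes constrained by network bandwidth and performance bottlenecks, leading to problems such as high latency, poor real-time behavior, and low security. The edge computing model, which processes the collected data locally and sends only the results back to the cloud server, can effectively solve these problems. The difficulty in implementing the edge computing model is that edge devices have limited hardware resources and low performance. This thesis studies this problem, and the work mainly covers the following. This thesis designs an edge computing system, studies artificial intelligence image processing algorithms, and builds an example edge computing system for crowd information detection. The system consists of Raspberry Pi and ARC development boards and a cloud server; the convolutional neural network is optimized through pruning and parameter quantization to reduce its computation and hardware requirements and adapt it to the edge computing model, and it is then deployed on the edge side. This thesis studies the RISC-V architecture and the RI5CY processor and designs a coprocessor interface for instruction extension. RISC-V is an emerging open-source reduced instruction set architecture, and RI5CY is a low-power open-source processor based on it; this thesis designs custom instructions on RI5CY to accelerate the convolution operations of convolutional neural networks. To simplify subsequent development and make the coprocessor acceleration module more portable, a coprocessor interface for instruction extension is designed. This thesis designs a coprocessor acceleration module based on the Winograd algorithm that reduces memory accesses and accelerates the convolution operations of convolutional neural networks. The module was tested and verified. The results show that, for the convolution of a 4×4 feature map with a 3×3 convolution kernel, it reduces the number of memory accesses by about 77.8% and improves performance by more than 5 times; after memory optimization the speedup rises to 11.7 times, with larger gains for certain operations. For larger feature maps, the coprocessor acceleration module can treat the Winograd-based convolution of a 4×4 feature map with a 3×3 kernel as a Winograd acceleration operator and apply it to the feature map block by block, which likewise yields a large performance improvement. By accelerating and optimizing edge computing for image processing based on convolutional neural network algorithms, this thesis can significantly improve the image processing capability of edge devices, offers a solution to the insufficient edge-side performance in the edge computing model, and contributes to the development of the IoT and edge computing. |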
Alternative abstract | With the continuous development of Internet of Things (IoT) technology, human society has entered an intelligent information era in which everything is interconnected. Smart devices such as intelligent robots, smart access control, and smart home appliances have greatly facilitated people's lives and changed their daily routines. Image processing is one of the main requirements of smart devices and is used in fields such as object detection, image segmentation, and medical imaging analysis. Image processing usually requires complex computation. Applying artificial intelligence to image processing works well: it can reduce the amount of computation and improve image processing performance. In the cloud computing model, local IoT terminal devices transmit the collected data to a cloud server over the network, and the cloud server completes the data processing. The cloud computing model can compensate for the poor performance of IoT devices, but as the number of IoT terminal devices keeps increasing and huge numbers of devices access the network, the model is constrained by network bandwidth and performance bottlenecks, which brings problems such as high latency, poor real-time performance, and low security. The edge computing model processes the collected data locally and returns only the results to the cloud server, which can effectively solve these problems. The difficulty of implementing the edge computing model lies in the limited hardware resources and low performance of edge devices.
This thesis investigates this issue, and the research mainly covers the following aspects. This thesis studies artificial intelligence image processing algorithms, designs an edge computing system, and builds an example edge computing system for crowd information detection. The system consists of Raspberry Pi development boards, an ARC development board, and a cloud server. The convolutional neural network is optimized by pruning and parameter quantization, which reduces its computation and hardware requirements and adapts it to the edge computing model, and the network is then deployed on the edge side. This thesis studies the RISC-V architecture and the RI5CY processor, and designs a coprocessor interface for instruction extension. RISC-V is an emerging open-source reduced instruction set architecture. RI5CY is a low-power open-source processor based on the RISC-V architecture. This thesis designs custom instructions for RI5CY to accelerate the convolution operations of convolutional neural networks. To simplify subsequent development and increase the portability of the coprocessor acceleration module, a coprocessor interface for instruction extension is designed. This thesis designs a coprocessor acceleration module based on the Winograd algorithm to reduce memory accesses and accelerate the convolution operations of convolutional neural networks. The module is tested and verified. The results show that, for the convolution of a 4 × 4 feature map with a 3 × 3 convolution kernel, the acceleration module reduces memory accesses by about 77.8% and improves performance by more than 5 times, rising to 11.7 times after memory optimization. The performance improvement is even greater under certain computing conditions.
For larger feature maps, the coprocessor acceleration module can use the Winograd-based convolution of a 4 × 4 feature map with a 3 × 3 convolution kernel as a Winograd acceleration operator. Applying this operator to the feature map block by block also achieves a large performance improvement. In this thesis, edge computing for image processing is accelerated and optimized, which can significantly improve the image processing capability of edge devices. It puts forward a solution to the problem of low performance of edge devices. |
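Keywords | |
Other keywords | |
Language | Chinese
|
Training category | Joint training
|
Output type | Thesis |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/142833 |
Collection | College of Engineering_Department of Electrical and Electronic Engineering |
Author affiliation | Southern University of Science and Technology |
Recommended citation (GB/T 7714) |
张超 (Zhang Chao). 人工智能图像处理的边缘计算硬件优化[D]. Shenzhen: Harbin Institute of Technology, 2020.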
|
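The abstract describes a Winograd operator that convolves a 4×4 feature-map tile with a 3×3 kernel and is reused block by block on larger feature maps. The record does not include the thesis's hardware design, so the following is only a minimal software sketch of the standard Winograd F(2×2, 3×3) transform under stated assumptions: the function names are hypothetical, the convolution is CNN-style (no kernel flip), and the tiled version assumes even feature-map dimensions of at least 4. It illustrates the arithmetic only and says nothing about the coprocessor's memory layout or custom instructions.

```python
# Standard Winograd F(2x2, 3x3) transform matrices (Y = A^T [(G g G^T) * (B^T d B)] A).
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_2x2_3x3(tile, kernel):
    """One 4x4 tile and one 3x3 kernel -> 2x2 output, using 16 element-wise multiplies."""
    u = matmul(matmul(G, kernel), transpose(G))    # kernel transform, 4x4
    v = matmul(matmul(BT, tile), transpose(BT))    # input transform, 4x4
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))    # output transform, 2x2


def direct_conv(fmap, kernel):
    """Reference: valid 3x3 sliding-window convolution (9 multiplies per output)."""
    h, w = len(fmap), len(fmap[0])
    return [[sum(fmap[i + r][j + c] * kernel[r][c]
                 for r in range(3) for c in range(3))
             for j in range(w - 2)] for i in range(h - 2)]


def winograd_tiled(fmap, kernel):
    """Apply the 4x4 operator to overlapping tiles (stride 2) of a larger map.

    Assumption for this sketch: height and width are even and >= 4.
    The kernel transform is recomputed per tile here for brevity; a real
    implementation would compute it once.
    """
    h, w = len(fmap), len(fmap[0])
    assert h >= 4 and w >= 4 and h % 2 == 0 and w % 2 == 0
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for i in range(0, h - 2, 2):
        for j in range(0, w - 2, 2):
            tile = [row[j:j + 4] for row in fmap[i:i + 4]]
            y = winograd_2x2_3x3(tile, kernel)
            for r in range(2):
                for c in range(2):
                    out[i + r][j + c] = y[r][c]
    return out
```

For a single tile, the direct method needs 36 multiplications (4 outputs × 9) while the Winograd form needs 16 element-wise multiplications, at the cost of extra additions in the transforms.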
Files in this item | ||||||
File name/size | Document type | Version type | Access type | License | Action | |
人工智能图像处理的边缘计算硬件优化.pd(6599KB) | -- | -- | Restricted access | -- | Request full text |