SUSTech KC (南方科技大学知识苑): Implementation and Application of a Recommender System Based on Learning Network Representation

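Title	基于学习网络表征的推荐系统实现及应用
Alternative title	IMPLEMENTATION AND APPLICATION OF RECOMMENDER SYSTEM BASED ON LEARNING NETWORK REPRESENTATION
Author	张大步 (Zhang Dabu)
Student ID	11749272
Degree type	Master's
Degree discipline	Computer Technology
Supervisor	史玉回 (Shi Yuhui)
Thesis defense date	2019-05-30
Thesis submission date	2019-06-28
Degree-granting institution	Harbin Institute of Technology
Degree-granting location	Shenzhen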
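Abstract	Modern society has entered an era of information overload. E-commerce platforms, User Generated Content communities, online education sites, and similar platforms generate massive amounts of content every day, and interactions between users and content generate large volumes of data as well. Helping users filter this information and quickly find the content or products they are most likely to be interested in is therefore key to improving user experience. A recommender system, as an information filtering system, uses users' feedback on items to recommend, in a personalized way, items they may be interested in. However, as the numbers of users and items grow, the data becomes both larger and sparser, and recommender systems face challenges such as lower recommendation quality and degraded real-time performance. To improve recommendation quality and user experience, this thesis focuses on the design of the recommendation algorithm and the system architecture, aiming to improve the recommendation quality and real-time performance of current recommender systems under sparse and massive data. First, the thesis introduces the background, significance, and current state of research on recommender systems in China and abroad. Second, it introduces the theory and system development technologies related to this work. On this basis, it studies recommendation algorithms based on the natural language processing model Skip-Gram, designs the LN-N2V-TW-CF recommendation algorithm based on the Transform-Embedding-Recommender framework, and validates and tests its recommendation performance on online education and movie datasets. Finally, combining big data technologies with front-end and back-end development technologies, it designs a real-time recommendation framework based on an offline model and online recommendation, and implements a complete real-time educational recommender system application based on Apache Spark and the Django framework. The main results are as follows. (1) Recommendation quality: following the Transform-Embedding-Recommender framework, the thesis studies recommendation algorithms based on the natural language processing model Skip-Gram and designs the LN-N2V-TW-CF algorithm, which combines a Learning Networks transformation, Node2Vec representation, and item-based collaborative filtering with time weights. The learning network transformation better captures the topological and progressive relationships between items; Node2Vec better represents the nodes of the learning network and captures the relationships between them; and the item-based collaborative filtering algorithm with time weights captures the effect of time on user interest. Compared with some existing collaborative filtering algorithms, the proposed algorithm improves metrics such as precision and recall in both the education and movie recommendation scenarios. (2) Real-time performance: to keep the recommender system responsive and its recommendations timely, the thesis designs a real-time recommender system architecture that combines an offline model with online recommendation. To accelerate the computation of the network representation, that part is deployed on the Apache Spark distributed computing platform, and a RESTful recommender system back end is implemented with the Django framework, yielding a layered real-time recommender system that combines offline model updates with online real-time recommendation. Finally, a front end demonstrates the interface through which users interact with the recommender system, verifying that the algorithm can be put into practice and resulting in a working MOOC educational recommender system application.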
Alternative abstract	Society has entered an era of information overload. E-commerce platforms, User Generated Content communities, online education platforms, and similar services produce massive amounts of content data every day, and interactions between users and content produce large amounts of data as well. Helping users filter this information and quickly find the content or products they are most likely to be interested in is therefore key to improving user experience. A recommender system, as an information filtering system, can use users' feedback on items to make personalized recommendations of items that may interest them. However, as the numbers of users and items grow, the data becomes larger and sparser, and recommender systems face declining recommendation quality and weaker real-time performance. To improve recommendation quality and user experience, this thesis focuses on the design of the recommendation algorithm and the system architecture, with the goal of improving quality and real-time performance under sparse and massive data. First, the thesis introduces the background, significance, and research status of recommender systems. Second, it introduces the theory and system development techniques related to this work. On this basis, we study recommendation algorithms based on the natural language processing model Skip-Gram, design the LN-N2V-TW-CF algorithm following the Transform-Embedding-Recommender workflow, and verify and test its recommendation performance on online education and movie datasets.
Finally, we combine big data technology with front-end and back-end system development technology, design a real-time recommender framework based on an offline model and online recommendation, and implement a complete real-time recommender system application based on Apache Spark and the Django framework. The main results of this thesis are as follows: (1) In terms of recommendation quality, under the Transform-Embedding-Recommender framework, we study recommendation algorithms based on the natural language processing model Skip-Gram and design the LN-N2V-TW-CF recommendation algorithm, which combines learning network data transformation, Node2Vec embedding, and time-weighted item-based collaborative filtering. The learning network transformation better captures the topological relationships between items; Node2Vec better represents the nodes of the learning network and captures the relationships between them; and the time-weighted item-based collaborative filtering algorithm accounts for the effect of time on user interest. Compared with some existing collaborative filtering recommendation algorithms, the proposed algorithm improves recommendation quality in both the education and movie recommendation scenarios. (2) In terms of real-time recommendation, to keep the recommender system responsive and its recommendations timely, this thesis designs a real-time recommender system architecture that combines an offline model with online recommendation. To accelerate the computation of the network representation, this work deploys that part on the Apache Spark distributed computing platform and uses the Django framework to implement a RESTful recommender system back end, combining offline model updates with online real-time recommendation.
Finally, we demonstrate the interaction between the recommender system and the user through the front-end interface, verify the feasibility of the algorithm, and realize a MOOC recommender system application.
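Keywords	recommender system (推荐系统); collaborative filtering (协同过滤); network representation learning (网络表征学习); big data (大数据); real-time performance (实时性); technology enhanced learning (技术增强学习)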
Alternative keywords	recommender system; collaborative filtering; network representation learning; big data; real-time performance; technology enhanced learning
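Language	Chinese
Training category	Joint training program
Output type	Thesis
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/38909
Collection	School of Innovation and Entrepreneurship
Author affiliation	Southern University of Science and Technology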
Recommended citation (GB/T 7714)	张大步. 基于学习网络表征的推荐系统实现及应用[D]. 深圳: 哈尔滨工业大学, 2019.
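
The abstract describes the final stage of LN-N2V-TW-CF as item-based collaborative filtering with time weights applied to Node2Vec item representations, but this record does not give the formulas. The following is a minimal sketch of that idea only, under stated assumptions: the exponential half-life decay, the cosine similarity, the function names, and the toy two-dimensional vectors (standing in for learned Node2Vec embeddings) are all hypothetical and are not taken from the thesis.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two item vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(
    history: List[Tuple[str, float]],   # (item_id, interaction time in days)
    embeddings: Dict[str, Vector],      # item_id -> vector (assumed precomputed)
    now: float,
    half_life: float = 30.0,            # assumption: weight halves every 30 days
    top_k: int = 3,
) -> List[str]:
    """Rank unseen items by time-weighted similarity to the user's history."""
    seen = {item for item, _ in history}
    scores: Dict[str, float] = {}
    for candidate, cand_vec in embeddings.items():
        if candidate in seen:
            continue
        score = 0.0
        for item, t in history:
            # Assumed decay: older interactions contribute less to the score.
            weight = 0.5 ** ((now - t) / half_life)
            score += weight * cosine(embeddings[item], cand_vec)
        scores[candidate] = score
    return sorted(scores, key=lambda item: scores[item], reverse=True)[:top_k]


# Toy vectors standing in for learned item embeddings (hypothetical data).
toy_embeddings = {
    "python_intro": [1.0, 0.0],
    "python_advanced": [0.9, 0.1],
    "cooking": [0.0, 1.0],
    "baking": [0.1, 0.9],
}

# The user took "cooking" long ago and "python_intro" recently.
toy_history = [("cooking", 0.0), ("python_intro", 90.0)]
print(recommend(toy_history, toy_embeddings, now=100.0))
```

In this toy run the recent interaction dominates, so the item closest to "python_intro" is ranked first; swapping the two timestamps reverses the ranking. In an architecture like the one the abstract outlines, the embeddings would be produced offline and only a scoring step of this kind would run online, though how the thesis actually divides that work is not stated in this record.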

Files in this item
File name/size	Document type	Version type	Access type	License	Action
基于学习网络表征的推荐系统实现及应用.p (6335 KB)	--	--	Restricted access	--	Request full text