SUSTech KC (南方科技大学知识苑): MetaGNN Based Heterogeneous Information Networks Representation Learning

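Title	基于MetaGNN的异质信息网络表征学习
Alternative title	MetaGNN Based Heterogeneous Information Networks Representation Learning
Author	Qu Liang (曲良)
Student ID	11749255
Degree type	Master's
Degree discipline	Computer Science and Technology
Advisor	Shi Yuhui (史玉回)
Thesis defense date	2019-05-30
Thesis submission date	2019-05-31
Degree-granting institution	Harbin Institute of Technology
Degree-granting location	Shenzhen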
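Abstract	Heterogeneous information networks are ubiquitous in real-world scenarios such as social networks, academic citation networks, and movie review networks. Mining valuable information from these networks is important but challenging, and the main challenge is how to construct suitable representations of network nodes. Traditional node representation methods treat the network as a graph of nodes and edges and manually extract node features to serve as node representations. This approach depends heavily on experts' prior knowledge and is time-consuming. To address these problems, many representation learning algorithms for information network nodes have been proposed in recent years that aim to learn node representations automatically. Representative among them are algorithms based on graph neural networks, which obtain a low-dimensional vector representation of a target node by aggregating the feature information of its neighbor nodes. Current algorithms of this kind, however, have two problems. First, they consider only homogeneous information networks, which contain a single type of node and edge, and do not account for heterogeneous information networks, which contain different types of nodes or edges. Second, hand-designing aggregation strategies for different types of neighbor nodes depends heavily on experts' prior knowledge and does not generalize. To address these problems, this thesis proposes MetaGNN, a graph neural network model built on the existing Deep Q-Network model. The model has a two-layer structure: a Deep Q-Network layer and a graph neural network layer. The Deep Q-Network layer takes the target node representation supplied by the graph neural network layer and automatically learns how to select neighbor nodes of different types, passing this selection back to the graph neural network layer as its output. The graph neural network layer learns the target node representation according to the neighbor selection policy output by the Deep Q-Network layer, and feeds the new target node representation and an evaluation metric tied to the downstream task back to the Deep Q-Network layer as its new input and reward. The model has three advantages. First, it is an end-to-end node representation learning model that can directly output evaluation metrics for the downstream task. Second, it applies across heterogeneous information networks from different domains, does not depend on experts' prior knowledge, and can automatically learn how to aggregate different types of neighbor nodes. Third, it generalizes well to nodes newly added to the network and can effectively learn their representations. To evaluate the model, we ran inductive and transductive multi-class node classification experiments on three large-scale real-world heterogeneous information network datasets and tested the model's sensitivity to its hyperparameters. The results show that our model outperforms four classical information network node representation learning algorithms.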
Alternative abstract	Heterogeneous information networks (HINs) are ubiquitous in real-world scenarios such as social networks, academic citation networks, and movie rating networks. Mining valuable information from these networks is important yet challenging, and the main challenge is how to build a proper representation of the network. Classical representation models view an information network as a graph of nodes and edges and manually select node features as the representation. However, these models rely heavily on experts' prior knowledge and are time-consuming. To address these problems, many information network representation learning algorithms have been proposed that aim to learn node representations automatically. Algorithms based on graph neural networks (GNNs) have attracted considerable interest because of their good performance. They assume that adjacent nodes have similar attributes (e.g., node labels) and obtain a low-dimensional vector representation of each target node by aggregating the features of its neighbor nodes. However, most existing GNN-based methods have two problems. First, they focus only on homogeneous networks with a single type of node and edge, and do not generalize to HINs, where different types of nodes influence a target node to different degrees. Second, manually designing node aggregation strategies relies heavily on experts' knowledge. To address these problems, we propose MetaGNN, a hierarchical network representation learning algorithm that consists of a Deep Q-Network (DQN) component and a GNN component. The DQN component takes the target node representations from the GNN component and automatically learns how to aggregate the various types of neighbor nodes. The GNN component learns node vector representations according to the DQN's aggregation strategy and feeds the new target node representations and a downstream-task evaluation metric back to the DQN component as its new input and reward, respectively. The proposed method has three advantages. First, it is an end-to-end model that can directly output the downstream-task evaluation metric. Second, it can automatically learn aggregation strategies for various types of neighbor nodes without experts' prior knowledge. Third, it generalizes effectively to new nodes. The model was validated on three large-scale real-world datasets on both inductive and transductive multi-class node classification tasks and obtained promising results compared with four classical information network representation learning algorithms.
Keywords	heterogeneous information networks; representation learning; graph neural networks; deep reinforcement learning
Alternative keywords	heterogeneous information networks; representation learning; graph neural networks; deep reinforcement learning
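Language	Chinese
Training category	Joint training
Output type	Degree thesis
Item identifier	http://sustech.caswiz.com/handle/2SGJ60CL/38813
Collection	College of Engineering_Department of Computer Science and Engineering
Author affiliation	Southern University of Science and Technology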
Recommended citation (GB/T 7714)	曲良. 基于MetaGNN的异质信息网络表征学习[D]. 深圳: 哈尔滨工业大学, 2019.

Files in this item
File name/size	Document type	Version type	Access type	License
基于MetaGNN的异质信息网络表征学习 (1123KB)	--	--	Restricted access	--