Title

Comparison between Calculation Methods for Semantic Text Similarity based on Siamese Networks

Author
DOI
Publication date
2021-07-23
Conference name
2021 4th International Conference on Data Science and Information Technology
Proceedings title
Pages
389-395
Conference date
July 23 - 25, 2021
Conference location
Shanghai, China
Abstract

In the era of information explosion, people are eager to obtain content that meets their own needs and interests from massive amounts of information. Therefore, understanding the needs of Internet users correctly and effectively is one of the urgent problems to be solved. In this setting, the semantic text similarity task is useful in many application scenarios. To measure semantic text similarity with a text matching model, several Siamese networks are constructed in this paper. Specifically, we first use the STSbenchmark dataset, take GloVe, BERT and DistilBERT as initial models, and add deep neural networks for training and fine-tuning, fully utilizing the advantages of the existing models. Next, we test several similarity calculation methods to quantify the semantic similarity of sentence pairs. Moreover, the Pearson and Spearman correlation coefficients are used as evaluation indicators to compare the sentence embedding quality of the different models. Finally, the experimental results show that the Siamese network based on the BERT model performs best among all models, with the highest accuracy rate reaching 84.5%. Among the similarity calculation methods, cosine similarity usually obtains the best accuracy rate. In the future, this model can be applied to semantic text similarity tasks by matching texts between users' needs and a knowledge base. In this way, we can improve machines' language understanding ability as well as meet the diverse needs of users.
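
The evaluation pipeline described in the abstract (score each sentence pair with a similarity function, then correlate the scores with human ratings) can be sketched as follows. This is a minimal illustration and not the authors' code: the two-dimensional embeddings and the gold scores are made-up toy values, the function names are hypothetical, and only cosine similarity is shown because the abstract does not list the other similarity methods that were compared.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (assumed for illustration): each pair holds the two sentence
# embeddings produced by the shared encoder of a Siamese network.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction
    ([1.0, 0.0], [1.0, 1.0]),  # 45 degrees apart
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal
]
gold_scores = [5.0, 3.0, 0.5]  # hypothetical human ratings on a 0-5 scale

predicted = [cosine_similarity(u, v) for u, v in pairs]
pearson_r = pearson(predicted, gold_scores)
spearman_r = spearman(predicted, gold_scores)
print(predicted, pearson_r, spearman_r)
```

Spearman correlation depends only on the ordering of the predicted scores, while Pearson correlation also rewards a linear relationship with the gold ratings, which is one reason both coefficients are commonly reported together for this task.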

Keywords
University affiliation rank
First
Language
English
Related links: [Scopus record]
Indexed in
EI accession number
20214110999672
EI controlled terms
Deep Neural Networks ; Embeddings ; Knowledge Based Systems ; Semantic Web ; Semantics
EI classification codes
Ergonomics And Human Factors Engineering:461.4 ; Computer Software, Data Handling And Applications:723 ; Artificial Intelligence:723.4 ; Expert Systems:723.4.1 ; Information Science:903 ; Mathematics:921
Scopus record ID
2-s2.0-85116547931
Source database
Scopus
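Citation statistics
Times cited [WOS]: 0
Document type: Conference paper
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/254009
Collection: Southern University of Science and Technology
College of Science_Department of Statistics and Data Science
Author affiliation
Southern University of Science and Technology, China
First author affiliation: Southern University of Science and Technology
First affiliation of the first author: Southern University of Science and Technology
Recommended citation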
GB/T 7714
Wang,Keyang,Zeng,Yiping,Meng,Fanyu,et al. Comparison between Calculation Methods for Semantic Text Similarity based on Siamese Networks[C],2021:389-395.
Files in this item
File name/size | Document type | Version type | Access type | License
ComparisonbetweenCal(1161KB) | -- | -- | Restricted access | --