Title

Cost-effective crowdsourced join queries for entity resolution without prior knowledge

Authors
Corresponding author: Yin, Bo
Publication date
2022-02-01
DOI
Journal
ISSN
0167-739X
Volume: 127, Pages: 240-251
Abstract
The join query, which finds matching pairs from two object sets, is a fundamental operation in computer systems and underpins many practical problems such as entity resolution. In this paper, we address the problem of join queries by leveraging crowdsourcing to obtain matching relationships. The goal is to minimize the monetary cost while maintaining high-quality query results. However, existing approaches focus on finding matching pairs within a single object set and assume that prior knowledge is available, assumptions that do not carry over to real applications. We propose a cost-effective crowdsourced join query framework that minimizes the overall monetary cost by reducing both the cost of labeling a single pair and the number of comparison pairs. Specifically, we first propose a novel two-level confidence-based labeling model that minimizes the cost of labeling a single pair with a confidence guarantee. This model crowdsources easy-to-judge pairs to ordinary workers and reserves skilled workers, who may charge more than ordinary workers, for hard-to-judge pairs only. Statistical estimations are used to aggregate crowdsourcing results with 1−α confidence. Then, we propose a transitivity-based query scheme that minimizes the number of comparison pairs on the basis of transitive relations. Guided by the principle of eagerly identifying matching pairs, especially matching pairs from a single set, our scheme carefully designs the processing order of pairs so as to make full use of transitivity to infer new labels. The results of our extensive experiments demonstrate that the proposed framework achieves substantial monetary savings while preserving the accuracy of results.
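The transitivity idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' scheme: the class `TransitiveLabels`, the function `crowd_join`, and the `ask_crowd` callback are hypothetical names, the processing order is simply the order of the input list, and crowd answers are assumed to be consistent. It only shows the two standard inference rules: if a matches b and b matches c, then a matches c; if a matches b and b does not match c, then a does not match c.

```python
class TransitiveLabels:
    """Stores pair labels and deduces new ones by transitivity (illustrative only)."""

    def __init__(self):
        self.parent = {}        # union-find forest: objects known to match share a root
        self.non_match = set()  # frozensets {root1, root2} of clusters known to differ

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def infer(self, a, b):
        """Return True or False if the label is deducible, otherwise None."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True   # positive transitivity
        if frozenset((ra, rb)) in self.non_match:
            return False  # negative transitivity
        return None

    def record(self, a, b, is_match):
        """Store a label obtained from the crowd (assumed consistent)."""
        ra, rb = self.find(a), self.find(b)
        if not is_match:
            self.non_match.add(frozenset((ra, rb)))
            return
        if ra == rb:
            return
        self.parent[rb] = ra
        # Re-key non-match constraints that referred to the absorbed root rb.
        self.non_match = {
            frozenset(ra if r == rb else r for r in pair)
            for pair in self.non_match
        }


def crowd_join(pairs, ask_crowd):
    """Label every pair, asking the crowd only when transitivity cannot decide."""
    labels = TransitiveLabels()
    result, asked = {}, 0
    for a, b in pairs:
        label = labels.infer(a, b)
        if label is None:
            label = ask_crowd(a, b)  # hypothetical paid crowdsourcing call
            asked += 1
            labels.record(a, b, label)
        result[(a, b)] = label
    return result, asked
```

Because every inferred label is free, the order in which pairs are processed determines how many questions must be paid for, which is why the abstract stresses identifying matching pairs early.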
Keywords
Related links: [Scopus record]
Indexed by
SCI ; EI
Language
English
University authorship
Other
WOS accession number
WOS:000706478900004
EI accession number
20213910954793
EI main headings
Cost effectiveness ; Search engines
EI classification codes
Computer Software, Data Handling and Applications:723 ; Industrial Economics:911.2
Scopus record ID
2-s2.0-85115749173
Source database
Scopus
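Citation statistics
Times cited [WOS]: 1
Document type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/253413
Collection: College of Engineering_Department of Computer Science and Engineering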
Author affiliations
1. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, China
2. Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
Recommended citation
GB/T 7714
Yin, Bo, Zeng, Weilong, Wei, Xuetao. Cost-effective crowdsourced join queries for entity resolution without prior knowledge[J]. Future Generation Computer Systems-The International Journal of eScience, 2022, 127: 240-251.
APA
Yin, Bo, Zeng, Weilong, & Wei, Xuetao. (2022). Cost-effective crowdsourced join queries for entity resolution without prior knowledge. Future Generation Computer Systems-The International Journal of eScience, 127, 240-251.
MLA
Yin, Bo, et al. "Cost-effective crowdsourced join queries for entity resolution without prior knowledge". Future Generation Computer Systems-The International Journal of eScience 127 (2022): 240-251.