Title

On the Use of Neural Text Generation for the Task of Optical Character Recognition

Authors
DOI
Publication date
2019-11
Conference name
16th ACS/IEEE International Conference on Computer Systems and Applications AICCSA 2019
ISSN
2161-5322
ISBN
978-1-7281-5053-6
Proceedings title
Volume
2019-November
Pages
1-8
Conference dates
3-7 Nov. 2019
Conference location
Dubai
Place of publication
345 E 47TH ST, NEW YORK, NY 10017 USA
Publisher
Abstract
Optical Character Recognition (OCR) is the extraction of textual data from scanned text documents to facilitate indexing, searching, and editing, and to reduce storage space. Although OCR systems have improved significantly in recent years, they still fall short in situations where the OCR output does not match the text in the original document. Deep learning models have contributed positively to many problems, but their full potential for many others is yet to be explored. In this paper we propose a post-processing approach based on the application of deep learning to improve the accuracy of OCR systems (minimizing the error rate). We report on the use of neural network language models to correct characters and words incorrectly predicted by OCR systems. We applied our approach to the IAM handwriting database. Our proposed approach delivers significant accuracy improvements of 20.41% in F-score, 10.86% in character-level comparison using Levenshtein distance, and 20.69% in document-level comparison over previously reported context-based OCR empirical results on the IAM handwriting database.
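The abstract reports a character-level comparison based on Levenshtein distance. As a rough illustration of what such a comparison involves, the following is a minimal sketch, not the authors' implementation: the function names `levenshtein` and `character_error_rate` and the sample strings are hypothetical, and normalizing by the reference length is an assumption, since the record does not say how the paper normalizes the distance.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length (assumed normalization)."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical ground-truth text and OCR output, for illustration only.
reference = "optical character recognition"
ocr_output = "optica1 charactcr recognition"
print(levenshtein(reference, ocr_output))
print(character_error_rate(reference, ocr_output))
```

A post-processing step of the kind the abstract describes would be judged by whether the corrected text has a lower error rate against the reference than the raw OCR output does.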
Keywords
Institutional affiliation
Other
Language
English
Related links [source record]
Indexed in
WOS research area
Computer Science
WOS category
Computer Science, Information Systems ; Computer Science, Theory & Methods
WOS accession number
WOS:000563487400108
EI accession number
20201508381020
EI subject terms
Digital storage ; Database systems ; Computational linguistics ; Deep learning ; Learning systems
EI classification codes
Ergonomics and Human Factors Engineering:461.4 ; Computer Theory, Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory:721.1 ; Data Storage, Equipment and Techniques:722.1 ; Database Systems:723.3 ; Light/Optics:741.1
Source database
Web of Science
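Full-text link
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9035333
Citation statistics
Times cited [WOS]: 0
Output type
Conference paper
Item identifier
http://sustech.caswiz.com/handle/2SGJ60CL/124958
Collection
Individual research output produced outside this institution
College of Engineering_Department of Computer Science and Engineering
Recommended citation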
GB/T 7714
Georgios Theodoropoulos. On the Use of Neural Text Generation for the Task of Optical Character Recognition[C]. 345 E 47TH ST, NEW YORK, NY 10017 USA:IEEE,2019:1-8.
Files in this item
No files are associated with this item.