Title | On the Use of Neural Text Generation for the Task of Optical Character Recognition |
Authors | |
DOI | |
Publication date | 2019-11
|
Conference name | 16th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 2019)
|
ISSN | 2161-5322
|
ISBN | 978-1-7281-5053-6
|
Proceedings title | |
Volume | 2019-November
|
Pages | 1-8
|
Conference date | 3-7 Nov. 2019
|
Conference location | Dubai
|
Place of publication | 345 E 47TH ST, NEW YORK, NY 10017 USA
|
Publisher | |
Abstract | Optical Character Recognition (OCR) is the extraction of textual data from scanned text documents to facilitate indexing, searching, and editing and to reduce storage space. Although OCR systems have improved significantly in recent years, they still struggle in situations where the OCR output does not match the text in the original document. Deep learning models have contributed positively to many problems, but their full potential for many other problems is yet to be explored. In this paper, we propose a post-processing approach based on the application of deep learning to improve the accuracy of OCR systems (minimizing the error rate). We report on the use of neural network language models to correct characters and words incorrectly predicted by OCR systems. We applied our approach to the IAM handwriting database. Our proposed approach delivers significant accuracy improvements of 20.41% in F-score, 10.86% in character-level comparison using Levenshtein distance, and 20.69% in document-level comparison over previously reported context-based OCR empirical results on the IAM handwriting database. |
Keywords | |
University affiliation | Other
|
Language | English
|
Related links | [Source record] |
Indexed by | |
WOS research area | Computer Science
|
WOS category | Computer Science, Information Systems
; Computer Science, Theory & Methods
|
WOS accession number | WOS:000563487400108
|
EI accession number | 20201508381020
|
EI subject terms | Digital storage
; Database systems
; Computational linguistics
; Deep learning
; Learning systems
|
EI classification codes | Ergonomics and Human Factors Engineering:461.4
; Computer Theory, Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory:721.1
; Data Storage, Equipment and Techniques:722.1
; Database Systems:723.3
; Light/Optics:741.1
|
Source database | Web of Science
|
Full-text link | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9035333 |
Citation statistics |
Times cited [WOS]: 0
|
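Output type | Conference paper |
Item identifier | http://sustech.caswiz.com/handle/2SGJ60CL/124958 |
Collection | Knowledge output produced by individuals outside this institution; College of Engineering_Department of Computer Science and Engineering |
Recommended citation (GB/T 7714) |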
Georgios Theodoropoulos. On the Use of Neural Text Generation for the Task of Optical Character Recognition[C]. 345 E 47TH ST, NEW YORK, NY 10017 USA:IEEE,2019:1-8.
|
Files in this item | There are no files associated with this item. |
|