OPTICAL CHARACTER RECOGNITION: A STATISTICAL MODEL OF MULTI-ENGINE OPTICAL CHARACTER RECOGNITION SYSTEMS
- Date Issued: 2004
- Abstract/Description: This thesis benchmarks three commercial Optical Character Recognition (OCR) engines. The purpose of the benchmark is to characterize the performance of the engines, with emphasis on the correlation of errors between engines. The benchmark is used to evaluate whether a multi-OCR system employing a voting scheme can increase overall recognition accuracy. This is desirable because OCR systems are still unable to recognize characters with 100% accuracy. The existing error rates of OCR engines pose a major problem for applications where a single error can affect significant outcomes, such as legal applications. The results of this benchmark are the primary factor in deciding whether to implement a voting scheme. The experiment showed a very high accuracy rate for each of the commercial OCR engines: the average accuracy rate for each engine was near 99.5%, based on a document of fewer than 6,000 words. While these error rates are very low, the goal in legal applications is 100% accuracy. Based on the work in this thesis, it has been determined that a simple voting scheme will help to improve the accuracy rate.
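
The abstract's emphasis on error correlation can be illustrated with a minimal sketch of a three-engine majority vote. This is not the thesis's implementation: the function names are hypothetical, the error rate of 0.005 is taken only as a round figure matching the reported 99.5% accuracy, and the independence of engine errors is an assumption (it is exactly the property the benchmark sets out to measure).

```python
from collections import Counter


def majority_vote(readings):
    """Return the token reported by a strict majority of engines, else None.

    `readings` holds one candidate token per engine (hypothetical interface).
    """
    token, count = Counter(readings).most_common(1)[0]
    return token if count > len(readings) / 2 else None


def vote_failure_rate_if_independent(p):
    """Chance that a three-engine vote does not return the correct token.

    Assumes each engine errs independently with probability p. The vote
    fails exactly when at least two of the three engines are wrong.
    """
    return 3 * p**2 * (1 - p) + p**3


# Two engines agree, one misreads the final letter as a digit.
print(majority_vote(["legal", "legal", "lega1"]))

# Per-engine error rate of 0.5%, assumed independent across engines.
print(vote_failure_rate_if_independent(0.005))
```

Under the independence assumption, the vote fails on roughly 7.5 tokens in 100,000, compared with 500 in 100,000 for a single engine. If the engines tend to fail on the same tokens, the gain shrinks, which is why the correlation of errors decides whether a voting scheme is worth implementing.
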
Title: | OPTICAL CHARACTER RECOGNITION: A STATISTICAL MODEL OF MULTI-ENGINE OPTICAL CHARACTER RECOGNITION SYSTEMS | |
---|---|---|
Name(s): | McDonald, Mercedes Terre, Author; M Richie, Samuel, Committee Chair; University of Central Florida, Degree Grantor | |
Type of Resource: | text | |
Date Issued: | 2004 | |
Publisher: | University of Central Florida | |
Language(s): | English | |
Identifier: | CFE0000123 (IID), ucf:46188 (fedora) | |
Note(s): | 2004-08-01; M.S.E.E.; College of Engineering and Computer Science, Department of Electrical and Computer Engineering. This record was generated from author-submitted information. | |
Subject(s): | Optical Character Recognition (OCR); Voting Scheme; Character Recognition Accuracy; Machine Readability | |
Persistent Link to This Record: | http://purl.flvc.org/ucf/fd/CFE0000123 | |
Restrictions on Access: | campus 2006-01-31 | |
Host Institution: | UCF |