This paper presents a novel 2×1D phase correlation based image registration method for verification of printer
emulator output. The method combines the basic phase correlation technique and a modified 2×1D version of
it to achieve both high speed and high accuracy. The proposed method has been implemented and tested using
images generated by printer emulators. Over 97% of the image pairs were registered correctly, accurately dealing
with diverse images with large translations and image cropping.
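
To make the idea concrete, the following is a minimal sketch of translation estimation by phase correlation, written with NumPy. The abstract does not spell out the modified 2×1D formulation, so `phase_correlation_2x1d` below rests on an assumption: it reduces each image to its row and column sums and runs a 1D phase correlation on each profile. All function names are hypothetical, and the sketch handles only integer circular shifts, not cropping.

```python
import numpy as np


def _wrap(peak, size):
    """Map a peak index in [0, size) to a signed shift."""
    peak = int(peak)
    return peak if peak <= size // 2 else peak - size


def phase_correlation_2d(a, b, eps=1e-12):
    """Estimate (dy, dx) such that b ~= np.roll(a, (dy, dx), axis=(0, 1))."""
    fa = np.fft.fft2(a)
    fb = np.fft.fft2(b)
    cross = fb * np.conj(fa)
    cross /= np.abs(cross) + eps          # keep phase only
    corr = np.fft.ifft2(cross).real       # ideally a single sharp peak
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    return tuple(_wrap(p, n) for p, n in zip(peak, corr.shape))


def phase_correlation_1d(p, q, eps=1e-12):
    """Estimate d such that q ~= np.roll(p, d)."""
    cross = np.fft.fft(q) * np.conj(np.fft.fft(p))
    cross /= np.abs(cross) + eps
    corr = np.fft.ifft(cross).real
    return _wrap(np.argmax(corr), len(p))


def phase_correlation_2x1d(a, b):
    """Assumed 2x1D variant: two 1D correlations on axis projections."""
    dy = phase_correlation_1d(a.sum(axis=1), b.sum(axis=1))  # row sums
    dx = phase_correlation_1d(a.sum(axis=0), b.sum(axis=0))  # column sums
    return dy, dx
```

The projection variant replaces one 2D FFT pair with two 1D FFT pairs, which is where a speed advantage would come from. The full 2D correlation uses all of the image content, so it is the natural fallback when the projections are ambiguous.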
KEYWORDS: Point spread functions, Optical character recognition, Detection and tracking algorithms, Data modeling, Printing, Current controlled current source, Scanners, Convolution, Electronic imaging, Visualization
Generally speaking, optical character recognition algorithms tend to perform better when presented with homogeneous data. This paper studies a method that is designed to increase the homogeneity of training data, based on an understanding of the types of degradations that occur during the printing and scanning process, and how these degradations affect the homogeneity of the data. While it has been shown that dividing the degradation space by edge spread improves recognition accuracy over dividing the degradation space by threshold or point spread function width alone, the challenge is in deciding how many partitions to use and at what values of edge spread the divisions should be made. Clustering of different types of character features, fonts, sizes, resolutions and noise levels shows that edge spread is indeed a strong indicator of the homogeneity of character data clusters.
This paper discusses the implementation of an engine for performing optical character recognition of bi-tonal images using the Gamera framework, an existing open-source framework for building document analysis applications. The OCR engine uses features that are based on the Fourier descriptor to distinguish characters, and is designed to be able to handle character images that contain multiple boundaries. The algorithm works by assigning to each character image a signature that encodes the boundary types that are present in the image as well as the positional relationships that exist between them. Under this approach, only images having the same signature are comparable. Effectively, a meta-classifier is used which first computes the signature of an input image and then dispatches the image to an underlying neural-network-based classifier which is trained to distinguish between images having that signature. The performance of the OCR engine is evaluated on a set of sample images taken from the newspaper domain, and it compares well with other OCR engines. The source code for this engine and all supporting modules is currently available upon request, and will eventually be made available through an open-source project on the SourceForge website.
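
The two ideas in this abstract, Fourier-descriptor features and signature-based dispatch, can be sketched as follows. This is an illustration under stated assumptions, not the engine's code and not the Gamera API: `fourier_descriptor` uses a common textbook normalization (drop the DC term, divide by the first harmonic, keep magnitudes), and `SignatureDispatcher`, `signature_fn` and the registered classifiers are hypothetical names.

```python
import numpy as np


def fourier_descriptor(boundary_xy, n_coeffs=8):
    """Magnitude Fourier descriptor of one closed boundary.

    boundary_xy: sequence of (x, y) points, more than n_coeffs of them.
    Assumes the first harmonic is non-zero.
    """
    pts = np.asarray(boundary_xy, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]        # boundary as a complex signal
    spectrum = np.fft.fft(z)
    # Skipping spectrum[0] removes translation; magnitudes remove
    # rotation and starting point; dividing by the first harmonic
    # removes scale.
    mags = np.abs(spectrum[1:n_coeffs + 1])
    return mags / mags[0]


class SignatureDispatcher:
    """Hypothetical meta-classifier: route an image by its signature."""

    def __init__(self, signature_fn):
        self.signature_fn = signature_fn  # image -> hashable signature
        self.classifiers = {}             # signature -> callable

    def register(self, signature, classifier):
        self.classifiers[signature] = classifier

    def classify(self, image):
        classifier = self.classifiers.get(self.signature_fn(image))
        # Images with an unseen signature are not comparable to anything.
        return None if classifier is None else classifier(image)
```

In a full system each registered classifier would be a model trained only on images sharing that signature, so the per-signature feature vectors have a consistent layout.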
The National Library of Medicine has developed a system for the automatic extraction of data from scanned journal articles to populate the MEDLINE database. Although the 5-engine OCR system used in this process exhibits good performance overall, it does make errors in character recognition that must be corrected in order for the process to achieve the requisite accuracy. The correction process works by feeding words that have characters with less than 100% confidence (as determined automatically by the OCR engine) to a human operator who then must manually verify the word or correct the error. The majority of these errors are contained in the affiliate information zone where the characters are in italics or small fonts. Therefore only affiliate information data is used in this research. This paper examines the correlation between OCR errors and various character attributes in the MEDLINE database, such as font size, italics, bold, etc. The motivation for this research is that if a correlation between the types of characters and types of errors exists it should be possible to use this information to improve operator productivity by increasing the probability that the correct word option is presented to the human editor. Using a categorizing program and confusion matrices, we have determined that this correlation exists, in particular for the case of characters with diacritics.
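
The analysis described above can be sketched as one confusion table per character attribute. The record layout `(attribute, true_char, ocr_char)` and both function names are assumptions made for illustration; the abstract does not describe the categorizing program's actual data format.

```python
from collections import Counter, defaultdict


def confusion_by_attribute(records):
    """Build one confusion table per attribute.

    records: iterable of (attribute, true_char, ocr_char) tuples,
    a hypothetical layout chosen for this sketch.
    """
    tables = defaultdict(Counter)
    for attribute, truth, ocr in records:
        tables[attribute][(truth, ocr)] += 1
    return tables


def error_rate(table):
    """Fraction of entries in a table where OCR output differs from truth."""
    total = sum(table.values())
    errors = sum(n for (truth, ocr), n in table.items() if truth != ocr)
    return errors / total if total else 0.0
```

Comparing `error_rate` across attributes (for example italic versus plain, or characters with diacritics versus those without) indicates whether errors correlate with an attribute. The most frequent off-diagonal pairs in each table suggest which correction to offer the operator first.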
Conference Committee Involvement (5)
Document Recognition and Retrieval XVI
21 January 2009 | San Jose, California, United States
Document Recognition and Retrieval XV
30 January 2008 | San Jose, California, United States
Document Recognition and Retrieval XIV
30 January 2007 | San Jose, California, United States
Document Recognition and Retrieval XIII
18 January 2006 | San Jose, California, United States
Document Recognition and Retrieval XII
19 January 2005 | San Jose, California, United States