With the recent advances in mobile technologies, new capabilities are emerging, such as mobile document image analysis.
However, mobile phones are still less powerful than servers and have limited resources. One approach to
overcoming these limitations is to run the resource-intensive processes of the application on remote servers. In mobile
document image analysis, the most resource-consuming process is Optical Character Recognition (OCR),
which is used to extract text from images captured by mobile phones. In this study, our goal is to compare the in-phone and the
remote server processing approaches for mobile document image analysis in order to explore their trade-offs. For the in-phone
approach, all processes required for mobile document image analysis run on the mobile phone. On the other hand,
in the remote-server approach, the core OCR process runs on the remote server and the other processes run on the mobile phone.
Results of the experiments show that the remote server approach is considerably faster than the in-phone approach in terms
of OCR time, but adds extra delays such as network delay. Since compression and downscaling of images significantly
reduce file sizes and extra delays, the remote server approach overall outperforms the in-phone approach in terms of
selected speed and correct recognition metrics, if the gain in OCR time compensates for the extra delays. According to the
results of the experiments, using the most preferable settings, the remote server approach performs better than the in-phone
approach in terms of speed and acceptable correct recognition metrics.
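The break-even condition stated above (the remote approach wins only if the gain in OCR time compensates for the extra delays) can be written as a small check. This is a minimal sketch, not the study's measurement code: the class `RemoteCosts`, the function `remote_is_faster`, and the split of the extra delay into preprocessing, upload, and download are all assumptions made for illustration, and any numbers passed in are hypothetical rather than measured results.

```python
from dataclasses import dataclass


@dataclass
class RemoteCosts:
    """Hypothetical per-image costs, in seconds, for the remote-server path."""
    preprocess_s: float   # on-phone compression / downscaling
    upload_s: float       # network transfer of the image to the server
    server_ocr_s: float   # OCR time on the server
    download_s: float     # network transfer of the recognized text back

    def total(self) -> float:
        """End-to-end time of the remote path."""
        return (self.preprocess_s + self.upload_s
                + self.server_ocr_s + self.download_s)


def remote_is_faster(phone_ocr_s: float, remote: RemoteCosts) -> bool:
    """True when the OCR time gained on the server exceeds the extra delays."""
    ocr_gain = phone_ocr_s - remote.server_ocr_s
    extra_delay = remote.preprocess_s + remote.upload_s + remote.download_s
    return ocr_gain > extra_delay
```

Compression and downscaling enter this model by shrinking `upload_s` at the price of a small `preprocess_s`, which is why they can tip the comparison toward the remote server.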
KEYWORDS: Optical character recognition, Image processing, Cell phones, Mobile devices, Data corrections, Cameras, Data storage, Image segmentation, Detection and tracking algorithms, Scanners
Participatory sensing is an approach which allows mobile devices such as mobile phones to be used for data collection,
analysis and sharing processes by individuals. Data collection is the first and most important part of a participatory
sensing system, but it is time consuming for the participants. In this paper, we discuss automatic data collection
approaches for reducing the time required for collection, and increasing the amount of collected data. In this context, we
explore automated text recognition on images of store receipts which are captured by mobile phone cameras, and the
correction of the recognized text. Accordingly, our first goal is to evaluate the performance of the Optical Character
Recognition (OCR) method with respect to data collection from store receipt images. Images captured by mobile phones
exhibit some typical problems, and common image processing methods cannot handle some of them. Consequently, the
second goal is to address these types of problems through our proposed Knowledge Based Correction (KBC) method
used in support of the OCR, and also to evaluate the KBC method with respect to the improvement on the accurate
recognition rate. Results of the experiments show that the KBC method improves the accurate data recognition rate
noticeably.
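The abstract does not describe how KBC works internally, so the following is only a generic illustration of correcting OCR output against prior knowledge, not the authors' algorithm. It assumes a hypothetical word list `KNOWN_WORDS` of terms expected on receipts and uses the standard library's `difflib.get_close_matches` to snap a misrecognized token to the nearest known word; the function names and the `cutoff` value are likewise assumptions.

```python
import difflib

# Hypothetical knowledge base: words expected to appear on a store receipt.
KNOWN_WORDS = ["TOTAL", "SUBTOTAL", "CASH", "CHANGE", "MILK", "BREAD"]


def correct_token(token, known_words=KNOWN_WORDS, cutoff=0.7):
    """Return the closest known word, or the token unchanged if none is close enough."""
    matches = difflib.get_close_matches(token.upper(), known_words, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_line(line):
    """Correct each whitespace-separated token of one recognized receipt line."""
    return " ".join(correct_token(tok) for tok in line.split())
```

Tokens with no sufficiently similar entry, such as prices, pass through unchanged, so the correction only acts where the knowledge base has something to say.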
KEYWORDS: Optical character recognition, Video, Detection and tracking algorithms, Image processing, Cell phones, Cameras, Light sources and illumination, Video processing, Mobile devices, Image enhancement
In this study, we explore automated text recognition and enhancement using videos of store receipts captured by mobile phones. We propose a method which includes Optical Character Recognition (OCR) enhanced by our proposed Row Based Multiple Frame Integration (RB-MFI) and Knowledge Based Correction (KBC) algorithms. In this method, the trained OCR engine is first used for recognition; then, the RB-MFI algorithm is applied to the output of the OCR. The RB-MFI algorithm determines and combines the most accurate rows of the text outputs extracted by OCR from multiple frames of the video. After RB-MFI, the KBC algorithm is applied to these rows to correct erroneous characters. Results of the experiments show that the proposed video-based approach, which includes the RB-MFI and KBC algorithms, increases the word character recognition rate to 95%, and the character recognition rate to 98%.
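The row-selection idea behind RB-MFI can be sketched as follows. The abstract does not say how the "most accurate" row is determined, so this sketch assumes that each frame's OCR output comes with a per-row confidence score and that rows are already aligned by position across frames; both are assumptions, as are the names `Row` and `integrate_rows`. It should be read as one plausible instance of row-based integration, not as the authors' algorithm.

```python
from typing import List, Tuple

Row = Tuple[str, float]  # (recognized text, hypothetical confidence in [0, 1])


def integrate_rows(frames: List[List[Row]]) -> List[str]:
    """For each row position, keep the text from the frame with the highest confidence.

    Assumes every frame lists its rows in the same order; a frame may be
    shorter than the others if it missed trailing rows.
    """
    n_rows = max((len(frame) for frame in frames), default=0)
    merged = []
    for i in range(n_rows):
        # Collect the i-th row from every frame that has one.
        candidates = [frame[i] for frame in frames if i < len(frame)]
        best_text, _ = max(candidates, key=lambda row: row[1])
        merged.append(best_text)
    return merged
```

For example, given two frames where each one misreads a different row, the merged output takes the better row from each frame, and a correction step such as KBC would then run on the merged rows.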