Touching character segmentation method for Chinese historical documents
18 January 2010
Proceedings Volume 7534, Document Recognition and Retrieval XVII; 75340D (2010) https://doi.org/10.1117/12.840251
Event: IS&T/SPIE Electronic Imaging, 2010, San Jose, California, United States
Abstract
OCR for Chinese historical documents remains an open problem. Because these documents are handwritten or hand-carved in various styles, overlapping and touching characters make character segmentation difficult. This paper presents an over-segmentation-based method for handling overlapping and touching Chinese characters in historical documents. The segmentation process has two parts: over-segmentation and segmentation path optimization. In the first part, touching strokes are located and split by analyzing the geometric information of the white and black connected components. The segmentation cost of each touching stroke is estimated from the shape and location of the connected components and from the width of the touching stroke. The second part uses locally optimized dynamic programming to find the best segmentation path: an HMM represents the alternative segmentation paths, and the Viterbi algorithm searches for a locally optimal solution. Experimental results on real Chinese documents show that the proposed method is effective.
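The path-optimization step described in the abstract can be illustrated with a generic minimum-cost Viterbi search. The sketch below is not the authors' implementation: the abstract does not give the HMM states, the cost terms, or how they are combined, so the function name `viterbi_min_cost`, the per-stage candidate costs, and the `transition_cost` callback are all hypothetical. It assumes each stage offers a few candidate cuts, each with a segmentation cost, and that moving between candidates of adjacent stages carries an additional cost.

```python
from typing import Callable, List, Sequence, Tuple


def viterbi_min_cost(
    node_costs: Sequence[Sequence[float]],
    transition_cost: Callable[[int, int, int], float],
) -> Tuple[float, List[int]]:
    """Pick one candidate per stage so that the summed cost is minimal.

    node_costs[t][j] is the (hypothetical) cost of candidate j at stage t.
    transition_cost(t, i, j) is the (hypothetical) cost of moving from
    candidate i at stage t-1 to candidate j at stage t.
    """
    if not node_costs or any(len(stage) == 0 for stage in node_costs):
        raise ValueError("every stage needs at least one candidate")

    best = list(node_costs[0])      # best cost of a path ending at each candidate
    back: List[List[int]] = []      # back-pointers, one list per stage after the first

    for t in range(1, len(node_costs)):
        new_best: List[float] = []
        pointers: List[int] = []
        for j, cost_j in enumerate(node_costs[t]):
            # Choose the predecessor that gives the cheapest path into candidate j.
            i_best = min(
                range(len(best)),
                key=lambda i: best[i] + transition_cost(t, i, j),
            )
            new_best.append(best[i_best] + transition_cost(t, i_best, j) + cost_j)
            pointers.append(i_best)
        best = new_best
        back.append(pointers)

    # Trace the cheapest final candidate back to the first stage.
    last = min(range(len(best)), key=best.__getitem__)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


# Toy data, invented for illustration: three stages with two candidate cuts each.
toy_costs = [[1.0, 3.0], [2.0, 0.5], [1.5, 1.0]]
switch_penalty = lambda t, i, j: 0.0 if i == j else 1.0
total, chosen = viterbi_min_cost(toy_costs, switch_penalty)
print(total, chosen)  # 3.5 [0, 1, 1]
```

In this toy run the search accepts one switch penalty between the first and second stage because the cheaper candidates afterwards more than pay for it, which is the kind of trade-off a dynamic-programming search over segmentation paths is meant to resolve.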
© (2010) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xiaolu Sun, Liangrui Peng, and Xiaoqing Ding "Touching character segmentation method for Chinese historical documents", Proc. SPIE 7534, Document Recognition and Retrieval XVII, 75340D (18 January 2010); https://doi.org/10.1117/12.840251
KEYWORDS
Image segmentation
Computer programming
Optimization (mathematics)
Optical character recognition
Image processing
Image processing algorithms and systems
Intelligence systems