7 March 1996 Character recognition in a Japanese text recognition system
Proceedings Volume 2660, Document Recognition III; (1996)
Event: Electronic Imaging: Science and Technology, 1996, San Jose, CA, United States
Cherry Blossom is a machine-printed Japanese document recognition system developed at CEDAR in past years. This paper focuses on the character recognition component of the system. For Japanese character classification, the system uses two feature sets: a local stroke direction feature, and a gradient, structural, and concavity feature. Based on each of these features, two different classifiers are designed: a minimum error (ME) subspace classifier and a fast nearest-neighbor (FNN) classifier. The original version of the FNN classifier used Euclidean distance alone; the new version uses both Euclidean distance and the distance calculation defined in the ME subspace method. This integration improved performance significantly. The classifiers handle about 3,300 character classes (alphanumerics, kana, and JIS level-1 kanji). They were trained and tested on 200 ppi character images from the CEDAR Japanese character image CD-ROM.
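The abstract does not say how the two distance measures are integrated, so the following is only a minimal sketch of one plausible arrangement, not the authors' method: Euclidean distance to each class mean shortlists candidate classes, and a subspace residual (the part of the feature vector that the class subspace does not explain) picks among them. The names `euclidean_sq`, `subspace_residual`, `shortlist`, `classify`, and the toy `MODELS` table are all hypothetical, and each class subspace is assumed to be given as an orthonormal basis.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical class model: (mean vector, orthonormal basis of the class subspace).
ClassModel = Tuple[Vector, List[Vector]]


def euclidean_sq(x: Vector, mean: Vector) -> float:
    """Squared Euclidean distance between a feature vector and a class mean."""
    return sum((a - b) ** 2 for a, b in zip(x, mean))


def subspace_residual(x: Vector, model: ClassModel) -> float:
    """Squared distance from x to the class subspace (assumes an orthonormal basis)."""
    mean, basis = model
    diff = [a - b for a, b in zip(x, mean)]
    total = sum(d * d for d in diff)
    # Energy of the centered vector captured by the subspace.
    projected = sum(sum(d * b for d, b in zip(diff, vec)) ** 2 for vec in basis)
    return total - projected


def shortlist(x: Vector, models: Dict[str, ClassModel], k: int) -> List[str]:
    """Stage 1: keep the k classes whose means are nearest in Euclidean distance."""
    return sorted(models, key=lambda label: euclidean_sq(x, models[label][0]))[:k]


def classify(x: Vector, models: Dict[str, ClassModel], k: int = 2) -> str:
    """Stage 2: among the shortlisted classes, pick the smallest subspace residual."""
    candidates = shortlist(x, models, k)
    return min(candidates, key=lambda label: subspace_residual(x, models[label]))


# Toy 2-D models, purely illustrative (real feature vectors are much larger).
MODELS: Dict[str, ClassModel] = {
    "A": ((0.0, 0.0), [(1.0, 0.0)]),    # class A varies mostly along the x axis
    "B": ((3.0, 3.0), [(0.0, 1.0)]),    # class B varies mostly along the y axis
    "C": ((10.0, 10.0), [(1.0, 0.0)]),  # far-away class, dropped by the shortlist
}

if __name__ == "__main__":
    query = (4.0, 0.1)
    print("Euclidean nearest:", shortlist(query, MODELS, 1)[0])
    print("Two-stage result: ", classify(query, MODELS, k=2))
```

In this toy setup the query lies closer to the mean of class B but almost exactly on the subspace of class A, so the two stages disagree. That is the kind of case where combining the measures could matter, and the shortlist keeps the more expensive subspace computation away from distant classes such as C.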
© (1996) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tao Hong, Geetha Srikantan, V. C. Zandy, Chi Fang, and Sargur N. Srihari, "Character recognition in a Japanese text recognition system", Proc. SPIE 2660, Document Recognition III (7 March 1996).
