4 February 2013 Graphic composite segmentation for PDF documents with complex layouts
Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
Proceedings Volume 8658, Document Recognition and Retrieval XX; 86580E (2013) https://doi.org/10.1117/12.2003705
Event: IS&T/SPIE Electronic Imaging, 2013, Burlingame, California, United States
Abstract
Converting PDF books to a re-flowable format has recently attracted considerable interest in the area of e-book reading. Robust graphic segmentation is highly desirable for making PDF converters more practical. To cope with varied layouts, a multi-layer concept is introduced to segment graphic composites, including photographic images and drawings with text insets or surrounding text elements. The multi-layer layout analysis method exploits both image-based analysis and the inherent advantages of born-digital documents. By combining clustering of low-level page elements in the PDF document with connected component analysis on a synthetically generated PNG image of the page, graphic composites can be segmented in PDF documents with complex layouts. Experimental results on graphic composite segmentation of PDF document pages show satisfactory performance.
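The abstract does not give the algorithm's details, so the following is only a generic sketch of the connected component analysis step it mentions, assuming the rendered page image has already been binarized into a grid of 0 (background) and 1 (foreground). The function name `connected_components`, the 4-connectivity choice, and the toy `page` grid are illustrative assumptions, not the authors' implementation.

```python
from collections import deque


def connected_components(grid):
    """Return bounding boxes (top, left, bottom, right) of the
    4-connected foreground regions in a binary grid.

    Assumption: `grid` is a list of equal-length rows holding 0 or 1.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or seen[r][c]:
                continue
            # Start a breadth-first flood fill from this unvisited pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical toy "page": three separate foreground regions.
page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
boxes = connected_components(page)
print(boxes)
```

In a pipeline like the one the abstract outlines, bounding boxes of this kind would presumably be compared against the page elements clustered from the PDF itself; how the paper combines the two layers is not stated here.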
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Canhui Xu, Zhi Tang, Xin Tao, and Cao Shi "Graphic composite segmentation for PDF documents with complex layouts", Proc. SPIE 8658, Document Recognition and Retrieval XX, 86580E (4 February 2013); https://doi.org/10.1117/12.2003705
CITATIONS
Cited by 9 scholarly publications and 2 patents.
KEYWORDS
Visualization
Image segmentation
Composites
Image analysis
Analytical research
Image processing
Detection and tracking algorithms
