Self-compositional data augmentation for scene text detection
25 May 2023
Da Zhu, Linfei Wang, Dapeng Tao
Proceedings Volume 12712, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2023); 1271212 (2023) https://doi.org/10.1117/12.2679227
Event: International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2023), 2023, Huzhou, China
Abstract
Scene text detection aims to locate text regions against complex backgrounds, and deep learning-based methods have become the mainstream approach. These methods are data-hungry: robust performance requires large amounts of training data, and data augmentation is an efficient way to obtain it. However, current data augmentation for scene text detection either 1) transforms the whole image, which ignores instance-level diversity, or 2) generates synthetic data with generative models, which requires extra training data. In this paper, we propose self-compositional data augmentation (SDA) for scene text detection. SDA generates new data by applying four types of variation (translation, scaling, rotation, and curving) to the original text regions of an image and placing the modified text regions back at random positions in the same image. Specifically, SDA is an instance-level augmentation, so it can be combined with image-level augmentation, and it requires no extra training data, so it can be easily adopted by different methods. We conducted extensive experiments with three state-of-the-art scene text detection methods on two public datasets. SDA improves all three methods on both datasets, which demonstrates its effectiveness and generality.
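The copy, vary, and paste-back idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes axis-aligned text boxes, restricts rotation to multiples of 90 degrees, omits the curving variation entirely, and does not check whether a pasted region overlaps existing text. The names `resize_nearest` and `self_compose` and the `scale_range` parameter are hypothetical.

```python
import numpy as np


def resize_nearest(patch, scale):
    """Nearest-neighbour resize of an H x W x C patch by a scale factor."""
    h, w = patch.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map each output row/column back to a source row/column.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return patch[rows][:, cols]


def self_compose(image, boxes, rng, scale_range=(0.5, 1.5)):
    """Copy each text box, vary it, and paste it back into the same image.

    image: H x W x C uint8 array.
    boxes: list of (x0, y0, x1, y1) with exclusive maxima (an assumption).
    Returns the augmented image and the original boxes plus the pasted ones.
    """
    out = image.copy()
    new_boxes = list(boxes)
    img_h, img_w = image.shape[:2]
    for x0, y0, x1, y1 in boxes:
        patch = image[y0:y1, x0:x1]
        if patch.size == 0:
            continue  # skip degenerate boxes
        # Scaling: random factor drawn from scale_range.
        patch = resize_nearest(patch, rng.uniform(*scale_range))
        # Rotation: simplified here to a random multiple of 90 degrees.
        patch = np.rot90(patch, k=int(rng.integers(0, 4)))
        ph, pw = patch.shape[:2]
        if ph > img_h or pw > img_w:
            continue  # the varied patch no longer fits in the image
        # Translation: paste at a random position inside the same image.
        ty = int(rng.integers(0, img_h - ph + 1))
        tx = int(rng.integers(0, img_w - pw + 1))
        out[ty:ty + ph, tx:tx + pw] = patch
        new_boxes.append((tx, ty, tx + pw, ty + ph))
    return out, new_boxes


# Usage on a toy image: one white "text" box on a black background.
rng = np.random.default_rng(0)
image = np.zeros((64, 64, 3), dtype=np.uint8)
image[10:20, 5:30] = 255
aug, aug_boxes = self_compose(image, [(5, 10, 30, 20)], rng)
```

Because the pasted regions come from the image itself, the sketch needs no external data, and the returned box list can serve as the updated annotation. A realistic version would need arbitrary-angle rotation, polygon annotations, and a curving warp, none of which are specified in the abstract.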
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Da Zhu, Linfei Wang, and Dapeng Tao "Self-compositional data augmentation for scene text detection", Proc. SPIE 12712, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2023), 1271212 (25 May 2023); https://doi.org/10.1117/12.2679227
KEYWORDS
Computer vision technology
Deep learning
Image processing
Machine learning
Object detection
