Self-compositional data augmentation for scene text detection

Da Zhu; Linfei Wang; Dapeng Tao

doi:10.1117/12.2679227

25 May 2023

Proceedings Volume 12712, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2023); 1271212 (2023) https://doi.org/10.1117/12.2679227
Event: International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2023), 2023, Huzhou, China

Abstract

Scene text detection aims to locate text regions against complex backgrounds, and deep learning-based methods have become the mainstream approach. To perform robustly, these methods need large amounts of data, and data augmentation is an efficient way to obtain it. However, current data augmentation for scene text detection either 1) changes the whole image, which ignores instance-level diversity, or 2) generates synthetic data using generative models, which requires extra training data. In this paper, we propose a self-compositional data augmentation (SDA) for scene text detection. SDA generates new data by altering the original text regions of an image with four types of variation (translation, scaling, rotation, and curving) and placing the altered text regions back at random locations in the same image. Specifically, SDA is an instance-level augmentation, so it can be combined with image-level augmentation; it also requires no extra training data, so it can be easily adopted by different methods. We conducted extensive experiments with three state-of-the-art scene text detection methods on two public datasets. SDA improves all three methods on both datasets, which demonstrates its effectiveness and generality.
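
The copy-transform-paste idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(x, y, w, h)` box format, the scale range, and the grayscale list-of-rows image are all assumptions made for illustration. Rotation is simplified to an optional 90-degree turn, translation is realised by the random paste position, and curving is omitted.

```python
import random

def crop(image, box):
    """Return the pixels inside box = (x, y, w, h) as a new list of rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]

def scale_nearest(patch, factor):
    """Resize a patch by `factor` using nearest-neighbour sampling."""
    src_h, src_w = len(patch), len(patch[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    return [
        [patch[min(src_h - 1, int(r / factor))][min(src_w - 1, int(c / factor))]
         for c in range(dst_w)]
        for r in range(dst_h)
    ]

def rotate90(patch):
    """Rotate a patch 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]

def paste(image, patch, x, y):
    """Overwrite image pixels with the patch, top-left corner at (x, y)."""
    for dy, row in enumerate(patch):
        image[y + dy][x:x + len(row)] = row

def sda_augment(image, boxes, rng, scale_range=(0.5, 1.5)):
    """Hypothetical sketch: copy each text box, vary it, paste it elsewhere.

    Returns a new image and the original boxes plus one new box per
    successfully pasted copy. The input image is left unmodified.
    """
    img_h, img_w = len(image), len(image[0])
    out = [list(row) for row in image]
    new_boxes = list(boxes)
    for box in boxes:
        patch = crop(image, box)
        patch = scale_nearest(patch, rng.uniform(*scale_range))  # scaling
        if rng.random() < 0.5:
            patch = rotate90(patch)  # simplified rotation (assumption)
        ph, pw = len(patch), len(patch[0])
        if ph > img_h or pw > img_w:
            continue  # transformed patch no longer fits; skip it
        # Translation: a random position where the patch fits entirely.
        x = rng.randint(0, img_w - pw)
        y = rng.randint(0, img_h - ph)
        paste(out, patch, x, y)
        new_boxes.append((x, y, pw, ph))
    return out, new_boxes
```

Calling `sda_augment(image, boxes, random.Random(0))` returns an augmented copy together with an extended box list that could serve as updated labels. The sketch makes no attempt to keep pasted copies from covering existing text; whether and how the paper handles such overlaps is not stated in the abstract.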

Citation

Da Zhu, Linfei Wang, and Dapeng Tao "Self-compositional data augmentation for scene text detection", Proc. SPIE 12712, International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2023), 1271212 (25 May 2023); https://doi.org/10.1117/12.2679227

PROCEEDINGS
6 PAGES

KEYWORDS

Computer vision technology

Deep learning

Image processing

Machine learning

Object detection
