The study of ancient Chinese writing has great cultural and historical value. Text annotation is a time-consuming and laborintensive part of ancient writing research. In this paper, we construct a deep learning model for ancient Chinese text detection which combines Swin Transformer and Mask R-CNN. To solve the overlapping detection problem, we propose Text Non-Maximum Suppression (text-NMS) and text-NMS loss. The former is to weed out redundant subtext bounding boxes in the Non-Maximum Suppression process, and the latter further rectifies the detection failure missed by text-NMS in bounding box regression. Experiments on the Chinese Stone Inscription Dataset show that the proposed algorithm can improve the accuracy of ancient text detection. The text-NMS and text-NMS loss algorithm boost the average precision (AP) of Swin-small and Mask R-CNN from 64.4% to 65.8% with few additional hyper-parameter and computational overhead.
Co-clustering, an extension of one-sided clustering, refers to process of clustering data points and features simultaneously. During text clustering tasks, traditional one-sided clustering algorithms have encountered difficulties dealing with sparse problem. Instead, a co-clustering procedure, where data's common organizing form is a big matrix aggregated by data points, has proved more useful when faced with sparsity. Based on the traditional co-clustering approaches, a new model named SC-DNMF, which takes into account the semantic constraints between words, is proposed in this paper. Experiments on several datasets indicate that our proposal improves the clustering accuracy over traditional co-clustering models.
Moving object detection and background estimation are important steps in numerous computer vision applications. Low rank and sparse representation based methods have attracted wide attention in background modeling field. However, many existing methods ignore the spatio-temporal information of the foreground. In this paper, a new low-rank and sparse representation model for moving object detection is proposed, in which we regard the image sequence as being made up of the sum of a low-rank static background matrix, a sparse foreground matrix and a sparser dynamic background matrix. The 3D total variation regularizer and weighted nonconvex nuclear norm are incorporated to refine our model. Extensive experiments on challenging datasets demonstrate that our method works effectively and outperforms many state-of-the-art approaches.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users─please
sign in
to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on
SPIE.org.
INSTITUTIONAL Select your institution to access the SPIE Digital Library.
PERSONAL Sign in with your SPIE account to access your personal subscriptions or to use specific features such as save to my library, sign up for alerts, save searches, etc.