SSOCNet: a state space model-based small object counting neural network for object counting with cluttered background
19 September 2024
Yu-Tang Wang, Yu-Jen Kao, Yen-Lin Chen, Hsun-Yu Lan, Shyi-Chyi Cheng
Proceedings Volume 13225, Sixth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2024); 132250C (2024) https://doi.org/10.1117/12.3046308
Event: Sixth International Conference on Image, Video Processing and Artificial Intelligence, 2024, Kuala Lumpur, Malaysia
Abstract
Detecting and counting small objects is a central problem in computer vision, particularly in complex, cluttered environments. Established approaches based on Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have made progress, yet they still struggle in such settings. Inspired by the VMamba model, we propose a State Space Model-Based Small Object Counting Neural Network (SSOCNet). The model uses VMamba as its backbone; VMamba is known for its global receptive field and computational efficiency, which makes it well suited to small object detection and counting. SSOCNet also includes a versatile loss function derived from unbalanced optimal transport theory, which optimizes performance across diverse settings. Experimental results on several datasets validate SSOCNet's effectiveness in handling high-density small objects and show detection accuracy that is competitive with or superior to current state-of-the-art methods. This study underscores the applicability of the VMamba architecture to visual counting tasks and presents a novel approach to counting small objects in complex environments.
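The abstract does not specify how the unbalanced optimal transport loss is formulated. As a rough illustration of the general idea only, the sketch below implements generic entropic unbalanced optimal transport with KL marginal penalties, solved by Sinkhorn-style scaling iterations, between a predicted density and a set of annotated points. The function name `unbalanced_sinkhorn`, the parameters `eps` and `rho`, their default values, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
import math


def unbalanced_sinkhorn(a, b, cost, eps=0.5, rho=1.0, n_iters=200):
    """Generic entropic unbalanced OT with KL marginal penalties.

    Hypothetical helper for illustration; not the paper's loss.
    a, b  : strictly positive masses (e.g. predicted density, annotation weights)
    cost  : cost[i][j] = cost of moving mass from source i to target j
    eps   : entropic regularization strength (assumed value)
    rho   : weight of the KL marginal penalties; large rho approaches balanced OT
    Returns the transport plan as a list of rows.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    power = rho / (rho + eps)  # damping exponent; tends to 1 in the balanced limit
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows, then columns, each damped by the exponent `power`.
        for i in range(n):
            kv = sum(kernel[i][j] * v[j] for j in range(m))
            u[i] = (a[i] / kv) ** power
        for j in range(m):
            ktu = sum(kernel[i][j] * u[i] for i in range(n))
            v[j] = (b[j] / ktu) ** power
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy 1-D example (made-up numbers): three density cells, two annotated points.
cells = [0.0, 0.5, 1.0]
points = [0.1, 0.9]
density = [1.2, 0.3, 0.9]  # predicted mass per cell; total differs from the count
dots = [1.0, 1.0]          # one unit of mass per annotated object
cost = [[(x - y) ** 2 for y in points] for x in cells]

plan = unbalanced_sinkhorn(density, dots, cost)
transported = sum(sum(row) for row in plan)  # mass actually matched by the plan
```

Because the marginal constraints are only penalized rather than enforced, the predicted mass and the annotated count do not have to agree exactly, which is the property that makes an unbalanced formulation a natural fit for counting. A training loss built on this idea would typically combine the transport cost with the marginal penalty terms; that combination is not shown here.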
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yu-Tang Wang, Yu-Jen Kao, Yen-Lin Chen, Hsun-Yu Lan, and Shyi-Chyi Cheng "SSOCNet: a state space model-based small object counting neural network for object counting with cluttered background", Proc. SPIE 13225, Sixth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2024), 132250C (19 September 2024); https://doi.org/10.1117/12.3046308
KEYWORDS
Visual process modeling
Performance modeling
RGB color model
Neural networks
Object detection
Data modeling
Systems modeling