Poster + Paper
CU-DETR: a monocular 3D detection enhanced by local-global feature fusion and position embedding perturbation
22 November 2024
Conference Poster
Abstract
Transformer-based monocular 3D object detection methods have recently made significant progress. However, most existing methods struggle with fine-grained objects and complex scenes, particularly when capturing the features of occluded or small objects. To tackle these issues, we propose a monocular 3D object detector, CU-DETR, built on the MonoDETR framework. CU-DETR introduces a local-global fusion encoder to enhance local feature extraction and fusion, and applies an uncertainty-based perturbation strategy to the position encoding to improve the model's robustness in complex scenes. Experimental results on the public KITTI dataset demonstrate that CU-DETR outperforms MonoDETR.
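The abstract does not give the exact formulation of the position embedding perturbation, so the following is only a minimal sketch of the general idea: during training, zero-mean Gaussian noise (optionally scaled by a per-position uncertainty) is added to a standard sinusoidal position embedding, while inference uses the clean embedding. All function names and the noise model here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sinusoidal_pos_embed(n_pos: int, dim: int) -> np.ndarray:
    """Standard sinusoidal position embedding, shape (n_pos, dim)."""
    pos = np.arange(n_pos)[:, None].astype(float)
    i = np.arange(dim)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    # Even channels use sine, odd channels use cosine.
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def perturb_pos_embed(embed: np.ndarray, sigma, training: bool = True,
                      seed=None) -> np.ndarray:
    """Hypothetical perturbation step: add zero-mean Gaussian noise with
    standard deviation `sigma` to the embedding during training only.
    `sigma` may be a scalar or a per-position array of shape (n_pos, 1),
    which would let a predicted uncertainty modulate the noise strength."""
    if not training:
        return embed
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, 1.0, size=embed.shape) * sigma
    return embed + noise

# Example usage: perturb a 16-position, 8-dimensional embedding.
emb = sinusoidal_pos_embed(16, 8)
train_emb = perturb_pos_embed(emb, sigma=0.1, training=True, seed=0)
eval_emb = perturb_pos_embed(emb, sigma=0.1, training=False)
```

The intent of such a perturbation, as the abstract suggests, is to make the detector less sensitive to small positional errors, which matters for occluded or small objects whose projected positions are noisy.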
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Xi Li, Kun Ren, Tianyang Zhang, Yongping Du, and Honggui Han "CU-DETR: a monocular 3D detection enhanced by local-global feature fusion and position embedding perturbation", Proc. SPIE 13239, Optoelectronic Imaging and Multimedia Technology XI, 132391L (22 November 2024); https://doi.org/10.1117/12.3036911
KEYWORDS
Object detection, Convolution, Feature extraction, Feature fusion, Transformers, Visualization, 3D acquisition