Paper
19 October 2023 Unveiling the power of unpaired multi-modal data for RGBT tracking
Shen Qing, Wang Yifan, Guo Yu, Mengmeng Yang
Proceedings Volume 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023); 127092N (2023) https://doi.org/10.1117/12.2685082
Event: Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 2023, Nanjing, China
Abstract
RGBT tracking has received increasing interest due to its flexible application in all-day and all-weather environments. However, training deep RGBT trackers usually relies on large-scale aligned RGBT pairs, which are costly in human labor and time to collect. Considering the strong commonality and specificity of multi-modal data, we propose a novel two-stage learning framework that captures modality-shared and modality-specific features from large-scale unpaired RGBT data and thus achieves state-of-the-art performance in RGBT tracking. Specifically, in the first stage, we aim to learn modality-shared representations and design a generic transformer network that requires only mixed-modal data for training. In the second stage, we aim to learn modality-specific representations and achieve adaptive back-propagation using unpaired data. To this end, we design a modality transformer network in which two modality encoders capture modality-specific features and a modality-adaptive attention module enforces the exchange of information between different modalities in a separate-gather way. Since neither training stage relies on paired or aligned multi-modal data, the power of unpaired multi-modal data is unveiled in the training of deep RGBT trackers. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our method against state-of-the-art RGBT trackers.
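The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `mixed_batches` and `route`, the `"rgb"`/`"tir"` tags, and the additive fusion step are hypothetical placeholders chosen for illustration, and the transformer encoders and the modality-adaptive attention module are replaced by plain Python callables. The sketch only shows that neither stage needs an RGB frame to be paired with a thermal frame.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical sample type: (modality tag, feature vector). Not from the paper.
Sample = Tuple[str, List[float]]
Encoder = Callable[[List[float]], List[float]]


def mixed_batches(rgb: List[List[float]], tir: List[List[float]],
                  batch_size: int, seed: int = 0) -> List[List[Sample]]:
    """Stage-1 style sampling (assumed): pool unpaired RGB and thermal
    samples, shuffle them, and cut the pool into mixed-modal batches."""
    pool: List[Sample] = [("rgb", x) for x in rgb] + [("tir", x) for x in tir]
    random.Random(seed).shuffle(pool)
    return [pool[i:i + batch_size] for i in range(0, len(pool), batch_size)]


def route(batch: List[Sample], shared: Encoder,
          specific: Dict[str, Encoder]) -> List[List[float]]:
    """Stage-2 style routing (assumed): every sample goes through the shared
    encoder, but only through the encoder of its own modality, so only that
    modality branch would receive gradients for the sample."""
    fused = []
    for modality, x in batch:
        shared_feat = shared(x)
        specific_feat = specific[modality](x)
        # Placeholder fusion: element-wise sum stands in for the paper's
        # modality-adaptive attention module, which is not reproduced here.
        fused.append([a + b for a, b in zip(shared_feat, specific_feat)])
    return fused


# Toy usage with stand-in encoders; the sample counts need not match.
rgb_samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
tir_samples = [[0.5, 0.5]]
batches = mixed_batches(rgb_samples, tir_samples, batch_size=2)
encoders = {"rgb": lambda x: [v + 1 for v in x],
            "tir": lambda x: [v - 1 for v in x]}
features = [route(b, lambda x: [v * 2 for v in x], encoders) for b in batches]
```

Because each sample carries its own modality tag, the two pools can have different sizes and come from unrelated sequences, which is the property the abstract attributes to unpaired training.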
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shen Qing, Wang Yifan, Guo Yu, and Mengmeng Yang "Unveiling the power of unpaired multi-modal data for RGBT tracking", Proc. SPIE 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 127092N (19 October 2023); https://doi.org/10.1117/12.2685082
KEYWORDS
Transformers
Education and training
Design and modelling
Detection and tracking algorithms
Feature extraction
Matrices
Visualization