7 March 2019 Two-stream siamese network with contrastive-center losses for RGB-D action recognition
Chunxiao Fan, Zhengyuan Zhai, Yue Ming, Lei Tian
Funded by: Natural Science Foundation of Beijing Municipality
Abstract
Many fusion methods have been developed to improve action recognition with RGB and depth data, but learning a joint representation of these heterogeneous modalities with a single network has received little attention. We present an associated representation method for RGB-D action recognition that uses a siamese network with contrastive-center losses. First, some samples from each class and each modality are selected as references for constructing positive and negative pairs. Each positive pair consists of a training sample and the reference of its class, whereas a negative pair involves only the reference of a different class. These pairs are then fed to a two-stream siamese network to learn a collaborative representation of RGB and depth data. Two ranking losses, the intramodal and cross-modal contrastive-center losses, are developed to impose a similarity/dissimilarity metric on these pairs. Specifically, the intramodal contrastive-center loss measures the relationship between samples and references within the RGB data or within the depth data. The cross-modal contrastive-center loss measures the relationship between visual and depth features in the same low-dimensional space. Finally, the ranking losses and a softmax loss are jointly optimized for action recognition. The proposed method is evaluated on two large action datasets, LAP IsoGD and NTU RGB+D, and a smaller dataset, Sheffield Kinect gesture. The experimental results demonstrate that the proposed method surpasses most state-of-the-art methods.
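The abstract does not give the exact form of the contrastive-center losses, so the following is only an illustrative sketch of the general idea: a standard margin-based contrastive loss applied between sample embeddings and per-class reference embeddings. The function name `contrastive_center_loss`, the `margin` parameter, the one-reference-per-class layout, and the choice to pair each sample with every other-class reference as a negative are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def contrastive_center_loss(features, labels, references, margin=1.0):
    """Margin-based contrastive loss between samples and class references.

    Hypothetical sketch, not the paper's formulation.
    features:   (N, D) array of sample embeddings
    labels:     (N,) integer class ids in [0, C)
    references: (C, D) array, one reference embedding per class (assumed)
    """
    features = np.asarray(features, dtype=float)
    references = np.asarray(references, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Euclidean distance from every sample to every class reference: (N, C)
    diff = features[:, None, :] - references[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Mark the (sample, own-class reference) entries as positive pairs
    same = np.zeros(dist.shape, dtype=bool)
    same[np.arange(len(labels)), labels] = True

    # Positive pairs are pulled together (squared distance)
    pos = (dist[same] ** 2).sum()
    # Negative pairs are pushed apart until they exceed the margin
    neg = (np.maximum(0.0, margin - dist[~same]) ** 2).sum()

    return (pos + neg) / len(labels)
```

Under these assumptions, an intramodal term would pass features and references from the same modality (RGB with RGB, or depth with depth), while a cross-modal term would pass, for example, RGB features together with depth references embedded in the same space. A joint objective would then add these terms to a softmax classification loss, with weighting that the abstract does not specify.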
© 2019 SPIE and IS&T 1017-9909/2019/$25.00
Chunxiao Fan, Zhengyuan Zhai, Yue Ming, and Lei Tian "Two-stream siamese network with contrastive-center losses for RGB-D action recognition," Journal of Electronic Imaging 28(2), 023004 (7 March 2019). https://doi.org/10.1117/1.JEI.28.2.023004
Received: 21 June 2018; Accepted: 11 February 2019; Published: 7 March 2019
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
RGB color model
Data modeling
Convolution
Network architectures
Distance measurement
Neural networks
Data fusion
