A dynamic gesture recognition method based on R(2+1)D-transformer network
14 November 2023
Yupeng Huo, Jie Shen, Xu Chen, Keming Yu
Proceedings Volume 12934, Third International Conference on Computer Graphics, Image, and Virtualization (ICCGIV 2023); 1293417 (2023) https://doi.org/10.1117/12.3008203
Event: 2023 3rd International Conference on Computer Graphics, Image and Virtualization (ICCGIV 2023), 2023, Nanjing, China
Abstract
Efficient spatiotemporal feature extraction from input video streams is crucial for dynamic gesture recognition. In video classification, convolutional neural networks (CNNs) are widely used as feature extractors, while methods based on recurrent neural networks (RNNs) are commonly employed for sequence modeling. However, RNNs lack the ability to model global dependencies and have a limited attention span in the temporal dimension. This becomes a performance bottleneck for dynamic gestures, whose recognition is sensitive to temporal correlations. To address this issue, this paper proposes a dynamic gesture recognition model called R(2+1)D-Transformer, a Transformer-based approach that focuses on global modeling. First, the R(2+1)D network is employed as a feature extractor to capture spatiotemporal information. Then, a self-attention-based Transformer maps the spatiotemporal feature sequence to a semantic representation of gesture movements, considering both temporal and spatial context. Finally, the gesture recognition results are obtained through an MLP classification head. Experimental results on two publicly available dynamic gesture datasets, IPN-Hand and NvGesture, demonstrate the effectiveness and potential of the proposed R(2+1)D-Transformer model. The promising performance of the proposed approach provides a useful reference for further research and applications in dynamic gesture recognition.
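The global-modeling idea in the abstract can be illustrated with a minimal, standard-library-only sketch of scaled dot-product self-attention over a sequence of per-clip feature vectors. This is not the authors' implementation: the abstract does not specify layer counts, heads, projections, or positional encoding, so the sketch assumes identity query/key/value projections, a single head, mean pooling over time, and a single linear layer standing in for the MLP head. The toy `clip_features` are hypothetical stand-ins for R(2+1)D outputs, and all function names are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head scaled dot-product self-attention.

    Assumption: identity Q/K/V projections (no learned weights).
    seq: list of T feature vectors, each a list of d floats.
    Returns (outputs, weights); weights[i][j] is how strongly
    time step i attends to time step j.
    """
    d = len(seq[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in seq:
        # Every step is scored against every other step, near or far.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in seq]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * seq[j][c] for j in range(len(seq))) for c in range(d)]
        )
    return outputs, weights


def classify(seq, head_weights, head_bias):
    """Attend, mean-pool over time, then apply a linear head + softmax.

    Assumption: one linear layer stands in for the MLP head.
    """
    outputs, _ = self_attention(seq)
    t, d = len(outputs), len(outputs[0])
    pooled = [sum(o[c] for o in outputs) / t for c in range(d)]
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(head_weights, head_bias)
    ]
    return softmax(logits)


# Hypothetical features for three clips; steps 0 and 2 are identical.
clip_features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
attended, attn = self_attention(clip_features)
probs = classify(
    clip_features,
    head_weights=[[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    head_bias=[0.0, 0.0],
)
```

In this toy input, step 0 attends to the distant but identical step 2 exactly as strongly as to itself, and more strongly than to its immediate neighbour, step 1. Attention weights depend on feature similarity rather than temporal distance, which is the property a recurrent model lacks.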
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yupeng Huo, Jie Shen, Xu Chen, and Keming Yu "A dynamic gesture recognition method based on R(2+1)D-transformer network", Proc. SPIE 12934, Third International Conference on Computer Graphics, Image, and Virtualization (ICCGIV 2023), 1293417 (14 November 2023); https://doi.org/10.1117/12.3008203
KEYWORDS
Transformers
Gesture recognition
Feature extraction
Convolution
Modeling
Data modeling
3D modeling
