With the rapid progress of artificial intelligence, gesture recognition based on deep learning plays an important role in the development of human-computer interaction. Although most object detection networks now recognise gesture actions with satisfactory accuracy, the training speed and results of many deep learning models are limited by the computational power of the platforms on which they run, such as low-end CPUs and GPUs. In addition, model size and complexity have a non-negligible impact on the subsequent deployment phase of an application. To address these problems, this paper adopts a gesture recognition and detection algorithm based on an improved YOLOv5. Part of the convolution modules are replaced with the Ghost module to reduce the number of parameters: C3 is transformed into C3Ghost, and Conv is given a parallel structure and transformed into GhostConv, which further reduces the amount of computation during training and accelerates the inference speed of the model, thereby improving training efficiency under limited hardware computing power. To counter the accuracy degradation that may accompany lightweighting, the CBAM attention mechanism is added to strengthen the network's ability to extract target features and improve detection accuracy. Finally, the α-IoU loss function is used in place of the CIoU loss function so that the model converges more quickly during training. Experimental comparisons show that, compared with the original algorithm, the improved YOLOv5s PRO has 46.2% fewer parameters, a 42.6% smaller model size, and 48.1% fewer GFLOPs, effectively balancing training speed and accuracy.
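The parameter savings from the Ghost substitution can be illustrated with a rough back-of-the-envelope count, comparing a standard convolution with a Ghost module that produces half of its output channels via cheap depthwise operations. The layer sizes below are hypothetical examples chosen for illustration, not the configuration used in this paper:

```python
# Rough parameter-count comparison: standard convolution vs. a Ghost
# module (GhostNet-style). Layer sizes are hypothetical, not the paper's.

def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias terms ignored)."""
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, dw_k=3, ratio=2):
    """Ghost module: a primary conv produces c_out/ratio intrinsic feature
    maps; cheap dw_k x dw_k depthwise convs generate the remaining maps."""
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k   # ordinary convolution
    cheap = intrinsic * dw_k * dw_k      # depthwise "ghost" feature maps
    return primary + cheap

standard = conv_params(128, 256, 3)   # 294912 parameters
ghost = ghost_params(128, 256, 3)     # 147456 + 1152 = 148608 parameters
print(standard, ghost, round(1 - ghost / standard, 3))
```

With a ratio of 2, roughly half the parameters of each replaced convolution are saved, which is of the same order as the ~46% overall reduction reported for the full model.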
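The α-IoU family generalises IoU-based losses through a power parameter α, which up-weights the gradients of high-overlap boxes and can speed up convergence. A minimal sketch of the core idea for axis-aligned boxes follows; α = 3 is the value commonly used in the α-IoU literature, and the box coordinates are hypothetical (the abstract does not state whether a plain α-IoU or an α-CIoU variant is used):

```python
# Minimal sketch of the alpha-IoU idea for axis-aligned boxes given as
# (x1, y1, x2, y2). alpha = 3 follows common usage; boxes are hypothetical.

def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def alpha_iou_loss(a, b, alpha=3.0):
    """alpha-IoU loss: 1 - IoU**alpha. For alpha > 1, well-overlapping
    boxes receive relatively larger gradients, aiding convergence."""
    return 1.0 - iou(a, b) ** alpha

pred = (0.0, 0.0, 2.0, 2.0)   # hypothetical predicted box
gt = (1.0, 1.0, 3.0, 3.0)     # hypothetical ground-truth box
print(round(iou(pred, gt), 4), round(alpha_iou_loss(pred, gt), 4))
```

Setting α = 1 recovers the ordinary IoU loss, so the change is a drop-in replacement in the training objective.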