Capturing the hidden relationships in 2D pose sequences is crucial for accurate 3D human pose estimation (HPE). Recent studies have shown that frequency-domain information, independent of spatio-temporal information, is highly effective at representing pose sequences. However, few works have explored appropriate ways to fuse these different kinds of information. In this paper, we propose an alternating cyclic approach that fuses spatio-temporal and frequency-domain information to achieve accurate 3D human pose estimation. The proposed alternating cyclic fusion network integrates the different features more comprehensively, leading to improved accuracy. By leveraging feature splitting and time-frequency convolution, the network processes the existing features more appropriately while keeping the model lightweight. Experimental results demonstrate that our approach achieves accuracy comparable to state-of-the-art methods while being significantly more lightweight than mainstream methods. In conclusion, introducing frequency-domain information is of great significance for pose estimation tasks.