Falls are a growing social problem and a pressing topic in healthcare. Recent advances in deep convolutional neural networks have greatly improved the accuracy of video-based fall detection. However, these methods remain sensitive to illumination, complex backgrounds, camera angle, and other factors that reduce their accuracy and generalization ability. In this paper, a video-based human fall detection method is proposed. First, a pose estimator extracts a 2D joint-point sequence from the video; this 2D pose sequence is then lifted to a 3D joint-point pose sequence, which our improved multi-scale unified spatial-temporal graph convolutional network (MS-G3D) classifies as a fall or non-fall action. The method proves effective and robust for action recognition, achieving 99.84% accuracy on the large benchmark action recognition dataset NTU RGB+D and 95.72% accuracy on the LE2I fall dataset.
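The pipeline described above can be sketched as follows. This is a minimal illustrative outline, not the authors' implementation: every function here (the pose estimator, the 2D-to-3D lifter, and the fall classifier) is a hypothetical stand-in, and the toy centroid-drop rule merely marks where the MS-G3D network would sit.

```python
import numpy as np

# Illustrative sketch of the 2D pose -> 3D lift -> classify pipeline.
# All functions are hypothetical stand-ins, not the paper's code.

def estimate_2d_poses(video_frames, num_joints=17):
    """Stand-in for a 2D pose estimator: one (x, y) joint set per frame."""
    rng = np.random.default_rng(0)
    return rng.random((len(video_frames), num_joints, 2))  # (T, J, 2)

def lift_to_3d(poses_2d):
    """Stand-in for a learned 2D-to-3D lifting network; here we simply
    append a dummy depth coordinate to each joint."""
    depth = np.zeros(poses_2d.shape[:-1] + (1,))
    return np.concatenate([poses_2d, depth], axis=-1)  # (T, J, 3)

def classify_fall(poses_3d, threshold=0.5):
    """Stand-in for the MS-G3D classifier: a toy rule that flags a large
    drop in the mean joint height over the sequence."""
    heights = poses_3d[:, :, 1].mean(axis=1)  # mean joint height per frame
    drop = heights[0] - heights[-1]
    return bool(drop > threshold)

frames = [None] * 30  # 30 dummy video frames
poses_2d = estimate_2d_poses(frames)
poses_3d = lift_to_3d(poses_2d)
print(poses_3d.shape)  # (30, 17, 3)
print(classify_fall(poses_3d))
```

In the actual method, `classify_fall` would be a graph convolutional network operating on the skeleton's joint-adjacency graph across multiple temporal scales, rather than a threshold rule.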