Pedestrian detection in video surveillance using fully convolutional YOLO neural network

V. V. Molchanov; B. V. Vishnyakov; Y. V. Vizilter; O. V. Vishnyakova; V. A. Knyaz

doi:10.1117/12.2270326

26 June 2017

Proceedings Volume 10334, Automated Visual Inspection and Machine Vision II; 103340Q (2017) https://doi.org/10.1117/12.2270326
Event: SPIE Optical Metrology, 2017, Munich, Germany

Abstract

More than 80% of video surveillance systems are used for monitoring people. Older human detection algorithms, based on background and foreground modelling, could not handle even a group of people, let alone a crowd. Recent robust and highly effective pedestrian detection algorithms mark a new milestone for video surveillance systems. Built on modern deep learning approaches, these algorithms produce highly discriminative features that support robust inference in real visual scenes. They can distinguish different persons in a group, cope with significant occlusion of human bodies by foreground objects, and detect people in various poses. In our work we use a new approach that combines the detection and classification tasks into a single problem using convolutional neural networks. As a starting point we choose the YOLO CNN, whose authors propose a very efficient way of combining these tasks by training a single neural network. This approach showed results competitive with state-of-the-art models such as Fast R-CNN while significantly outperforming them in speed, which allows us to apply it in real-time video surveillance and other video monitoring systems. Despite these advantages, it suffers from some known drawbacks related to its fully-connected layers, which prevent the CNN from being applied to images of different resolutions. This also limits the ability to distinguish small, closely spaced human figures in groups, which is crucial for our tasks since we work with rather low-quality images that often contain small, dense groups of people. In this work we gradually change the network architecture to overcome these problems, train it on a complex pedestrian dataset, and finally obtain a CNN that detects small pedestrians in real scenes.
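
The resolution constraint mentioned in the abstract can be illustrated with a little shape arithmetic. The sketch below is not the network from the paper: the layer stack, the channel count, and the 448x448 training resolution are hypothetical values chosen only for illustration. It shows that a fully-connected head expects a flattened input of one fixed length, whereas a convolutional head applied per feature-map cell simply yields a larger or smaller prediction grid as the input size changes.

```python
from typing import List, Tuple

# Hypothetical toy backbone: (kernel, stride, padding) for each layer.
# Five "3x3 conv + 2x2 pooling" pairs; NOT the architecture from the paper.
BACKBONE: List[Tuple[int, int, int]] = [(3, 1, 1), (2, 2, 0)] * 5
CHANNELS = 256            # assumed number of output feature channels
TRAINED_HW = (448, 448)   # assumed training resolution


def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard output-size formula for a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_shape(height: int, width: int) -> Tuple[int, int]:
    """Spatial size of the final feature map for a given input size."""
    for kernel, stride, padding in BACKBONE:
        height = out_size(height, kernel, stride, padding)
        width = out_size(width, kernel, stride, padding)
    return height, width


def fc_input_length(height: int, width: int) -> int:
    """Length of the flattened feature vector a fully-connected head would see."""
    h, w = feature_map_shape(height, width)
    return h * w * CHANNELS


def fc_head_accepts(height: int, width: int) -> bool:
    """A fully-connected weight matrix has a fixed number of input columns,
    set by the training resolution, so other flattened lengths do not fit."""
    return fc_input_length(height, width) == fc_input_length(*TRAINED_HW)


def conv_head_grid(height: int, width: int) -> Tuple[int, int]:
    """A 1x1 convolutional head predicts once per feature-map cell,
    so the prediction grid is just the feature-map shape."""
    return feature_map_shape(height, width)


if __name__ == "__main__":
    for hw in [(448, 448), (480, 640), (720, 1280)]:
        print(hw, "fc ok:", fc_head_accepts(*hw), "conv grid:", conv_head_grid(*hw))
```

Under these assumptions, a larger input also produces a denser prediction grid, which is one plausible reason a fully convolutional design can help with small, closely spaced figures.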

Citation

V. V. Molchanov, B. V. Vishnyakov, Y. V. Vizilter, O. V. Vishnyakova, and V. A. Knyaz "Pedestrian detection in video surveillance using fully convolutional YOLO neural network", Proc. SPIE 10334, Automated Visual Inspection and Machine Vision II, 103340Q (26 June 2017); https://doi.org/10.1117/12.2270326
