Pedestrian-related accidents occur more frequently at night, when visible (VI) cameras are ineffective. Compared with VI cameras, thermal cameras work better in this environment. However, thermal images have several drawbacks, such as high noise, low resolution, less detailed information, and susceptibility to ambient temperature. To overcome these shortcomings, an improved algorithm based on you only look once version 3 (YOLOv3) is proposed. First, the number and sizes of the anchors are obtained using k-means++ clustering, which makes the anchor shapes better suited to the targets. Second, an attention module is added to the backbone network, which helps extract better feature maps from low-quality thermal images. Finally, an improved atrous spatial pyramid pooling module is added after the backbone network so that the extracted feature maps contain more multi-scale and context information. Experiments on the Computer Vision Center-09 dataset show that the average precision is 86.1%, which is 3.5% higher than YOLOv3 and 0.8% higher than YOLOv4, and the detection speed reaches 48 FPS. The results show that the improved algorithm has good accuracy and generalization.
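To illustrate the first step, the following is a minimal sketch (not the authors' code) of k-means++ anchor clustering on ground-truth box widths and heights, using the 1 − IoU distance that is common in the YOLO literature; the data loading, the number of anchors k, and the helper names are assumptions for illustration.

```python
import numpy as np

def iou_wh(boxes, centers):
    """IoU between boxes (N, 2) and centers (k, 2), comparing width/height only."""
    w = np.minimum(boxes[:, None, 0], centers[None, :, 0])
    h = np.minimum(boxes[:, None, 1], centers[None, :, 1])
    inter = w * h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
        + (centers[:, 0] * centers[:, 1])[None, :] - inter
    return inter / union

def kmeanspp_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors with k-means++ seeding."""
    rng = np.random.default_rng(seed)
    # k-means++ seeding: first center uniformly at random, then sample each
    # new center with probability proportional to the squared (1 - IoU)
    # distance to the nearest existing center.
    centers = [boxes[rng.integers(len(boxes))]]
    for _ in range(k - 1):
        d = 1.0 - iou_wh(boxes, np.array(centers)).max(axis=1)
        probs = d ** 2 / (d ** 2).sum()
        centers.append(boxes[rng.choice(len(boxes), p=probs)])
    centers = np.array(centers, dtype=float)
    # Standard Lloyd iterations with the 1 - IoU distance.
    for _ in range(iters):
        assign = iou_wh(boxes, centers).argmax(axis=1)
        new_centers = np.array([
            boxes[assign == i].mean(axis=0) if np.any(assign == i) else centers[i]
            for i in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Return anchors sorted by area, as is conventional for YOLO configs.
    return centers[np.argsort(centers.prod(axis=1))]

# Usage (hypothetical): boxes is an (N, 2) array of ground-truth
# (width, height) pairs from the training set, e.g.
# anchors = kmeanspp_anchors(boxes, k=9)
```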
Keywords: detection and tracking algorithms, thermography, target detection, convolution, infrared detectors, feature extraction, cameras