Abstract: Existing methods achieve low pedestrian detection accuracy in dense scenes. To improve detection accuracy, this paper proposes V-YOLO, an improved method based on the YOLOv5 network. The weighted Bi-directional Feature Pyramid Network (BiFPN) is used to improve the Path Aggregation Network (PANet) in the original network, strengthening multi-scale feature fusion and thereby improving pedestrian detection. To retain more feature information and improve the feature extraction capability of the backbone network, a residual structure, VBlock, is added. Selective Kernel Networks (SKNet) are introduced to dynamically fuse feature maps with different receptive fields, improving the utilization of different pedestrian features. The CrowdHuman dataset is used for training and testing. The experimental results show that, compared with the original network, the precision, recall, and average precision of the proposed algorithm increase by 1.8%, 2.3%, and 2.6%, respectively, which verifies that the proposed algorithm effectively improves the accuracy of pedestrian detection in dense scenes.
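The abstract does not state how the weighted fusion in BiFPN combines its inputs. As a minimal sketch, assuming V-YOLO keeps the fast normalized fusion of the original BiFPN design (each input feature map gets a learnable non-negative scalar weight, and the weights are normalized by their sum plus a small epsilon), the fusion of one node could look as follows. The function name, the flat-list representation of a feature map, and the example values are hypothetical and serve illustration only; they are not taken from the paper.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with BiFPN-style fast normalized weights.

    features: list of feature maps, each a flat list of floats of equal length
              (an assumed stand-in for a resized H x W x C tensor).
    weights:  one learnable scalar per input feature map.
    eps:      small constant that avoids division by zero.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    if len({len(f) for f in features}) != 1:
        raise ValueError("feature maps must be non-empty and share one shape")
    # ReLU keeps every weight non-negative so the normalization stays stable.
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps
    norm = [x / total for x in w]
    # Weighted sum over the inputs, position by position.
    size = len(features[0])
    return [sum(n * f[i] for n, f in zip(norm, features)) for i in range(size)]


# Hypothetical example: two pyramid levels already resized to the same shape.
p_high = [1.0, 2.0, 3.0]
p_low = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_high, p_low], weights=[1.0, 3.0])
```

With weights 1.0 and 3.0, the second input contributes roughly three quarters of every fused value, which illustrates how learned weights let the network favor the more informative scale instead of summing all scales equally as PANet does.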