Abstract
As intelligent transportation systems increasingly rely on vehicle detection algorithms, their real-world deployment faces significant challenges. The high computational complexity of deep learning models conflicts with the resource constraints and real-time requirements of edge devices. A detection system must maintain a stable frame rate under low-latency conditions in high-traffic environments to ensure timely traffic monitoring and decision-making, and it must remain adaptable to complex environments: poor weather, changing lighting, and cluttered backgrounds can degrade detection accuracy, leading to false or missed detections. This study therefore proposes a hardware-oriented lightweight vehicle detection framework, MDE-YOLO, that optimizes computational efficiency while preserving detection accuracy. First, the YOLO backbone feature extraction network is rebuilt on MobileNetV3 to reduce computational redundancy. Second, conventional convolutions are replaced with depthwise separable convolutions to decouple spatial and channel feature learning. Third, the original C3 module (a cross-stage partial bottleneck structure with three convolutional layers) is reconfigured with redesigned GhostBottleneck blocks (lightweight residual blocks that stack two Ghost Modules, where the first expands channels via depthwise separable convolution and the second compresses them, while shortcut connections are retained). In addition, a novel dual-stream attention mechanism enhances prediction accuracy while maintaining detection performance. A paired-sample t-test was used to evaluate the effectiveness of the proposed algorithm. The P-values for all three key performance metrics (precision, recall, and mAP@0.5) fall below 0.05, confirming that MDE-YOLO achieves statistically significant improvements over YOLOv5s on these metrics.
Compared with YOLOv5, the proposed method reduces the parameter count, computational complexity, and model weight size by 69.9%, 77.5%, and 67.8%, respectively, while incurring only a 0.5% reduction in average recognition rate. This slight decrease in accuracy is an acceptable trade-off, and the approach offers a methodological framework for overcoming the “accuracy-resource” dilemma in traffic perception systems.
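The parameter savings behind the depthwise separable convolutions described above can be illustrated with a short parameter-count sketch. The layer sizes below are hypothetical examples, not values from the paper:

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias omitted):
    every output channel has one k x k filter per input channel."""
    return c_in * c_out * k * k

def dws_conv_params(c_in, c_out, k):
    """Depthwise separable convolution: one k x k filter per input
    channel (spatial step), then a 1 x 1 pointwise convolution that
    mixes channels (channel step)."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Illustrative layer sizes (hypothetical, not taken from the paper):
c_in, c_out, k = 128, 256, 3
standard = conv_params(c_in, c_out, k)       # 294912
separable = dws_conv_params(c_in, c_out, k)  # 33920
print(f"reduction: {1 - separable / standard:.1%}")  # reduction: 88.5%
```

Decoupling the spatial and channel steps in this way is what makes large per-layer reductions possible, which is consistent in spirit with the model-level parameter reduction reported above.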
