Abstract
To address the challenges of difficult feature extraction and complex background interference in unmanned aerial vehicle (UAV) remote sensing vehicle target detection, YOLOv11 is chosen as the baseline model for its fast detection speed, high detection accuracy, and ability to precisely identify small targets. First, we introduce the FasterNet network, which maintains high detection accuracy while markedly increasing computational speed and significantly reducing the parameter count and computational complexity. Second, we incorporate the Efficient Multi-scale Attention (EMA) mechanism into the FasterNet-based backbone; through multi-scale feature fusion and an efficient computational design, EMA enhances the model's understanding of complex scene information. Finally, we integrate the DySample and Lightweight Adaptive Extraction (LAE) modules into the neck network; these two modules provide dynamic feature alignment and adaptive channel optimization, improving the allocation of computational resources. Building on these improved modules, we propose the YOLOv11-FEDL model, whose optimized feature extraction mechanism forms an efficient, flexible, and robust object detection framework that improves detection performance while lowering computational complexity. In lightweight experiments on the Cardrone and VisDrone2019 datasets, the mAP@0.5 reached 89.7% and 39.1%, respectively; the inference speed increased to 301.80 FPS and 334.10 FPS, respectively; and the average inference time fell to 2.41 ms and 2.43 ms, respectively. The improved model maintains comparable detection accuracy while reducing the parameter count to 2.33 M and the computational complexity to 5.3 GFLOPs, increasing the inference speed by approximately 20%.
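To illustrate the core idea behind FasterNet's efficiency gains, a minimal PyTorch sketch of a partial convolution (PConv) block follows: a standard convolution is applied to only a fraction of the input channels while the remaining channels pass through unchanged, which reduces FLOPs and memory access. The class name, channel split ratio, and tensor shapes here are our own illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """Sketch of FasterNet-style partial convolution (PConv): a 3x3 conv
    touches only a fraction of the channels; the rest are passed through,
    cutting computation relative to a full convolution."""

    def __init__(self, channels: int, partial_ratio: float = 0.25):
        super().__init__()
        # partial_ratio is an assumed hyperparameter for illustration.
        self.conv_channels = int(channels * partial_ratio)   # channels convolved
        self.pass_channels = channels - self.conv_channels   # channels passed through
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split along the channel dimension, convolve one part, keep the other.
        x_conv, x_pass = torch.split(
            x, [self.conv_channels, self.pass_channels], dim=1)
        return torch.cat([self.conv(x_conv), x_pass], dim=1)

# Usage: a 64-channel feature map in which only 16 channels are convolved.
feat = torch.randn(1, 64, 80, 80)
print(PartialConv(64)(feat).shape)  # torch.Size([1, 64, 80, 80])
```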
