Abstract
This paper proposes YOLO-Ball, an improved real-time detection model that addresses three challenges in tennis ball detection: small target size, motion blur, and occlusion by players and rackets. The main contributions are: (1) a multi-branch occlusion-aware attention mechanism for dynamic multi-scale feature fusion; (2) a dual-flow shallow fusion pyramid that combines P2 features with bidirectional fusion to improve the handling of small and blurred targets; and (3) a dynamic balance loss that integrates the Normalized Gaussian Wasserstein Distance (NWD) and Intersection over Union (IoU) with learnable alignment weights. Experiments on a self-constructed tennis dataset show 82.2% precision and 70.9% mean Average Precision at an IoU threshold of 0.5 (mAP@0.5), outperforming YOLOv8, YOLOv10, and YOLOv11 by up to 12.5%. YOLO-Ball also generalizes to volleyball and football detection, providing a robust detection framework for scenarios involving small, high-speed targets.
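To make the third contribution more concrete, the following is a minimal sketch of how an NWD term and an IoU term might be blended into one box loss. It is not the paper's formulation: the abstract does not specify how the learnable alignment weights enter, so the sigmoid-weighted convex combination, the function names (`iou_xywh`, `nwd_xywh`, `balanced_box_loss`), the `weight_logit` parameter, and the scale constant `c` are all assumptions made for illustration. The NWD itself follows the commonly used definition, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and the Wasserstein distance is mapped to (0, 1] by an exponential.

```python
import math


def iou_xywh(a, b):
    """Plain IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd_xywh(a, b, c=12.8):
    """Normalized Gaussian Wasserstein Distance between two boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). `c` is a dataset-dependent scale constant;
    the default here is an arbitrary placeholder, not a value from the paper.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


def balanced_box_loss(pred, target, weight_logit=0.0, c=12.8):
    """Hypothetical blend of IoU loss and NWD loss.

    `weight_logit` stands in for a learnable scalar; a sigmoid keeps the
    mixing weight in (0, 1). This weighting scheme is an assumption.
    """
    alpha = 1.0 / (1.0 + math.exp(-weight_logit))
    iou_loss = 1.0 - iou_xywh(pred, target)
    nwd_loss = 1.0 - nwd_xywh(pred, target, c)
    return alpha * iou_loss + (1.0 - alpha) * nwd_loss


if __name__ == "__main__":
    pred = (10.0, 10.0, 4.0, 4.0)    # small predicted box
    target = (16.0, 10.0, 4.0, 4.0)  # same size, shifted so the boxes do not overlap
    print("IoU :", iou_xywh(pred, target))
    print("NWD :", nwd_xywh(pred, target))
    print("loss:", balanced_box_loss(pred, target))
```

In this example the two boxes do not overlap, so IoU is zero and carries no information about how far apart they are, while NWD still decreases smoothly with distance. That property is the usual motivation for adding an NWD term when targets are only a few pixels wide.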
