Precision visual positioning of cylindrical lenses in automatic loading manufacturing systems via the E-YOLOX algorithm

Abstract

Cylindrical lenses are widely used in optical systems and laser pumping, where strict quality standards apply. This study introduces a visual positioning method for cylindrical lenses based on an improved YOLOX algorithm (E-YOLOX), aimed at raising the loading qualification rate and production efficiency. An integrated automatic sorting and loading manufacturing system for cylindrical lenses was also developed. A normalized cross-correlation (NCC) template matching algorithm served as the baseline for visual positioning. For model training, a dataset of 1263 annotated images covering five categories (R10.543, R11, NG, Rect, and Circle) was constructed by labeling images collected during operation. The original YOLOX model was improved by eliminating the non-maximum suppression step. The loss function was optimized by adding constraints on the Intersection over Union (IoU), area ratio, and aspect ratio between the predicted and ground-truth bounding boxes, and a weight factor α was applied to the IoU loss. Further modifications included decoupling the YOLOX head to allow independent training of the prediction branch and introducing an attention mechanism to filter out irrelevant information. Comparative experiments on the cylindrical lens dataset showed that the E-YOLOX model achieves a mAP50 of 98.12% at 30.45 frames/s, outperforming mainstream object detection algorithms (e.g., YOLOv8 and YOLOv11). In practical operation, the algorithm showed a 9% improvement in stability and reduced the average localization error roughly sixfold (RMSE = 2.49 pixels, versus 15.12 pixels for the baseline NCC template matching algorithm), thereby satisfying the operational requirements for automated sorting and loading tasks.
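The abstract names the ingredients of the modified box-regression loss (an IoU term weighted by a factor α, plus area-ratio and aspect-ratio constraints) but does not give its formula. The following Python sketch is therefore only an illustration under stated assumptions, not the E-YOLOX loss itself: α is assumed to act as an exponent on the IoU, the aspect-ratio penalty is borrowed from the CIoU formulation, and the area-ratio penalty, the function names, and the weights `w_area` and `w_aspect` are hypothetical.

```python
import math


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sketch_box_loss(pred, gt, alpha=3.0, w_area=1.0, w_aspect=1.0):
    """Illustrative loss: power-IoU term + area-ratio + aspect-ratio penalties.

    Assumption: this combination is a guess at the general idea described
    in the abstract, not the formula used by the authors.
    """
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    if min(pw, ph, gw, gh) <= 0:
        raise ValueError("boxes must have positive width and height")

    # Power-weighted IoU term (assumed form: 1 - IoU^alpha).
    iou_term = 1.0 - box_iou(pred, gt) ** alpha

    # Hypothetical area-ratio penalty: 0 when both areas are equal.
    pa, ga = pw * ph, gw * gh
    area_term = 1.0 - min(pa, ga) / max(pa, ga)

    # CIoU-style aspect-ratio penalty: 0 when width/height ratios match.
    aspect_term = (4.0 / math.pi ** 2) * (
        math.atan(gw / gh) - math.atan(pw / ph)
    ) ** 2

    return iou_term + w_area * area_term + w_aspect * aspect_term
```

With these assumptions the loss is zero for a perfect prediction and increases as overlap, area, or shape diverge from the ground truth. A larger α penalizes imperfect overlap more steeply.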

Keywords

visual positioning; machine learning; cylindrical lens; machine vision; end-to-end YOLOX; automatic loading systems
