Abstract
Accurate detection of 3D obstacles is crucial for autonomous vehicles and intelligent traffic systems. Multi-modal fusion of camera and LiDAR data in 3D object detection can fully exploit the complementary advantages of the two sensors, improving the accuracy and robustness of detection and making fusion a core component of the perception system in autonomous vehicles. However, owing to inherent differences between the sensors' data, fusing them for 3D object detection still poses numerous challenges. To address this issue, a 3D object detection algorithm based on multi-scale, feature-weighted, point-wise fusion is proposed. Point-wise correspondences are established between camera images and LiDAR point clouds, and a ResNet50 network is employed to extract multi-scale semantic features from the images. Weights are assigned to the channels of the image features according to their importance, and the resulting semantic information is used to enhance the point features. This approach helps overcome the difficulty of matching images and point clouds, whose data structures are disparate, and fully exploits the complementary nature of multi-modal information. Experimental results on the KITTI object detection benchmark show that the proposed algorithm achieves an average detection accuracy of 80.95%, a 1.34% improvement over previous multi-modal algorithms, demonstrating superior 3D object detection performance.
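To make the fusion idea concrete, the following is a minimal PyTorch sketch of point-wise camera–LiDAR fusion as the abstract describes it: LiDAR points are projected into the image, multi-scale ResNet50 features are sampled at the projected locations, channels are weighted by learned importance, and the weighted image features enhance the point features. The class name `PointwiseFusion`, the choice of ResNet50 stages, the squeeze-and-excitation-style channel gate, and all dimensions are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of multi-scale, feature-weighted, point-wise fusion.
# Assumptions (not from the paper): layer choices, channel widths, and the
# use of a squeeze-and-excitation-style gate for channel weighting.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50


class PointwiseFusion(nn.Module):
    def __init__(self, point_dim=64, img_dim=256):
        super().__init__()
        backbone = resnet50(weights=None)
        # Keep intermediate stages to extract multi-scale semantic features.
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.layer1, self.layer2 = backbone.layer1, backbone.layer2
        # Reduce each scale to a common channel width before fusion.
        self.reduce1 = nn.Conv2d(256, img_dim, 1)
        self.reduce2 = nn.Conv2d(512, img_dim, 1)
        # Channel-wise importance weighting (assumed SE-style gate).
        self.gate = nn.Sequential(nn.Linear(img_dim, img_dim // 4),
                                  nn.ReLU(),
                                  nn.Linear(img_dim // 4, img_dim),
                                  nn.Sigmoid())
        self.out = nn.Linear(point_dim + img_dim, point_dim)

    def forward(self, image, point_feats, uv):
        """image: (B, 3, H, W); point_feats: (B, N, point_dim);
        uv: (B, N, 2) projected point coordinates, normalized to [-1, 1]."""
        x = self.stem(image)
        c1 = self.layer1(x)          # first semantic scale
        c2 = self.layer2(c1)         # second, coarser scale
        f1, f2 = self.reduce1(c1), self.reduce2(c2)
        # Sample per-point image features at the projected locations.
        grid = uv.unsqueeze(1)       # (B, 1, N, 2) for grid_sample
        s1 = F.grid_sample(f1, grid, align_corners=False).squeeze(2).transpose(1, 2)
        s2 = F.grid_sample(f2, grid, align_corners=False).squeeze(2).transpose(1, 2)
        img_feat = s1 + s2           # combine scales (a simple choice here)
        # Weight channels by learned importance, then fuse with point features.
        img_feat = img_feat * self.gate(img_feat)
        return self.out(torch.cat([point_feats, img_feat], dim=-1))
```

The `uv` coordinates would come from projecting each LiDAR point through the camera calibration matrices (available per frame in KITTI); that projection step is omitted here for brevity.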
