Abstract
The efficacy of autonomous driving systems is fundamentally linked to their ability to accurately perceive and interpret their surroundings. This study addresses the challenges posed by the sparse, unordered 3D point clouds produced by LiDAR sensors by introducing a refined PointPillars network architecture. The framework integrates Transformer modules and an ECA-PP (Efficient Channel Attention PointPillars) module to substantially improve the accuracy and efficiency of 3D object detection. The Transformer module uses self-attention to extract and process global contextual information from the point cloud, capturing not only local features but also the overall structure and layout of the scene. The ECA-PP module applies channel attention to sharpen feature discrimination, emphasizing the salient features most relevant to object detection. Extensive experiments on the KITTI and nuScenes datasets show that the proposed algorithm significantly outperforms existing methods in detection accuracy across the car, pedestrian, and cyclist categories.
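To make the channel-attention idea concrete, the sketch below illustrates the general ECA mechanism (global average pooling per channel, local cross-channel interaction, sigmoid gating) on a single feature map. This is a minimal NumPy illustration under stated assumptions, not the paper's ECA-PP implementation: the function name is hypothetical, and a uniform averaging kernel stands in for the learned 1D convolution weights.

```python
import numpy as np

def eca_channel_attention(features, kernel_size=3):
    """Illustrative sketch of Efficient Channel Attention on a (C, H, W) map.

    Hypothetical standalone version; the paper's ECA-PP module applies this
    idea inside the PointPillars backbone with learned convolution weights.
    """
    C, H, W = features.shape
    # 1. Global average pooling: one descriptor per channel.
    pooled = features.mean(axis=(1, 2))                   # shape (C,)
    # 2. Local cross-channel interaction via a 1D convolution over channels
    #    (no dimensionality reduction, unlike SE blocks). A uniform kernel
    #    stands in for learned weights here.
    pad = kernel_size // 2
    padded = np.pad(pooled, pad, mode="edge")
    weights = np.array([padded[i:i + kernel_size].mean() for i in range(C)])
    # 3. Sigmoid gate, then rescale each channel of the input.
    gate = 1.0 / (1.0 + np.exp(-weights))                 # values in (0, 1)
    return features * gate[:, None, None]

x = np.random.rand(8, 4, 4).astype(np.float32)  # toy (C, H, W) feature map
y = eca_channel_attention(x)
print(y.shape)  # same shape as the input: (8, 4, 4)
```

Because the gate is a per-channel scalar in (0, 1), the module reweights channels at negligible cost, which is why ECA-style attention suits a real-time detector such as PointPillars.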
