Three-Dimensional Object Detection in Point Clouds with Multi-Stage Proposal Refinement Network

Authors

Jyothsna Datti, VNR Vignana Jyothi Institute of Engineering and Technology
Ramesh Chandra Gollapudi, VNR Vignana Jyothi Institute of Engineering and Technology

DOI:

https://doi.org/10.18196/jrc.v6i2.25602

Keywords:

Object Detection, LiDAR 3D Point Clouds, Progressive Refinement, Localization

Abstract

Three-dimensional object detection in point clouds plays a vital role in autonomous driving and robotics. Point clouds provide a rich representation of 3D data that enables reliable object detection by capturing the spatial distribution of points in a scene, facilitating the localization and identification of objects in three-dimensional space. Precise localization remains challenging, particularly for moderately visible objects, owing to inconsistent proposal quality. To address this, this paper presents a multi-stage proposal refinement network that generates high-quality predictions. The contributions are twofold: first, to improve proposal quality for partially visible objects, the model integrates a 3D ResNet backbone with refinement modules at multiple stages; second, to improve prediction quality, a confidence-weighted box voting mechanism is incorporated to ensure precise bounding-box detections. Experiments were carried out on the KITTI, nuScenes, and a custom LiDAR dataset. Notably, the proposed method achieves an average precision of 82.45% for the Car class, 44.94% for the Pedestrian class, and 66.12% for the Cyclist class on the moderate category of the KITTI dataset, although performance on the hard category, with high occlusion, still needs improvement. On the nuScenes dataset, the model achieved an mAP of 66.2%. The custom dataset comprised 2739 training frames, 342 validation frames, and 343 test frames, on which the method achieved an average precision of 82.40% for Car, 44.10% for Pedestrian, and 67.90% for Cyclist. The results indicate that the multi-stage refinement network enhances detection precision, which is critical for localizing and detecting targets in autonomous driving and robotics.
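The confidence-weighted box voting mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the paper's implementation: it uses axis-aligned 2D boxes `[x1, y1, x2, y2]` for brevity (the paper's detector operates on 3D boxes), and the `iou_threshold` value is an illustrative assumption. The idea is that each surviving detection is replaced by the score-weighted average of all overlapping candidate boxes, rather than keeping a single winner as in plain NMS.

```python
import numpy as np

def iou_2d(box, boxes):
    # IoU between one box and an array of boxes, all [x1, y1, x2, y2].
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def confidence_weighted_vote(boxes, scores, iou_threshold=0.5):
    """Greedy, NMS-like loop: the highest-scoring unsuppressed box
    gathers all candidates overlapping it above iou_threshold, and the
    output box is their confidence-weighted mean."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]          # process by descending score
    suppressed = np.zeros(len(boxes), dtype=bool)
    voted, voted_scores = [], []
    for i in order:
        if suppressed[i]:
            continue
        group = (iou_2d(boxes[i], boxes) >= iou_threshold) & ~suppressed
        w = scores[group]
        # Replace the winning box by the score-weighted mean of its group.
        voted.append((boxes[group] * w[:, None]).sum(axis=0) / w.sum())
        voted_scores.append(scores[i])
        suppressed |= group                   # the whole group is consumed
    return np.array(voted), np.array(voted_scores)
```

Compared with standard NMS, which discards the geometry of suppressed boxes, voting lets lower-confidence proposals still contribute to the final localization, which is the property the abstract attributes to the mechanism.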

Author Biographies

Jyothsna Datti, VNR Vignana Jyothi Institute of Engineering and Technology

Ms. D. Jyothsna received her B.Tech degree from JNTU Kakinada, Andhra Pradesh, India, and her M.Tech degree from Andhra University, Andhra Pradesh, India. She is currently pursuing her full-time Ph.D. in the Department of CSE at Jawaharlal Nehru Technological University Hyderabad, through its research center at VNR Vignana Jyothi Institute of Engineering and Technology. Her research interests include object detection, 3D point cloud processing, and image processing. She has worked as an Assistant Professor and has 7 years of teaching experience and 2.5 years of research experience.

Ramesh Chandra Gollapudi, VNR Vignana Jyothi Institute of Engineering and Technology

Dr. G. Ramesh Chandra is presently working as a Professor in the Department of Computer Science and Engineering and is the Head of the Research and Development Cell, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, Telangana, India. He received his Ph.D. degree in the area of image processing from JNTU-Hyderabad, with a thesis titled "Detection of Superficial and Volumetric Features in 3-D Digital Images". He has 23 years of teaching and research experience. His areas of interest include image processing, video processing, point cloud processing, and data mining. He has completed 6 research projects and 1 consultancy project, with 2 DRDO research projects ongoing. He has developed various software products, such as the Logical Video Processing System, the Logical 3-D Image Processing System, the Logical Pattern Generation System, and the Logical Image Processing System. He also developed a compiler for the "Xervo Script Engine" for Cyber Motion Technologies Pvt. Ltd. He has published more than 43 research papers in international journals and conferences. He presented papers at IEEE conferences in Seoul, South Korea; San Diego, USA; and Penang, Malaysia in October 2012, October 2014, and November 2017, respectively. He was awarded "Best CSE Teacher" by the ISTE AP&TS Section for the year 2016, and he received a "Distinguished Official Award" from Pentagram Research Centre, Hyderabad, for his extensive contribution to the successful conduct of the ICSCI 2011 conference. He has completed 5 research projects, with 1 project ongoing, sponsored by the Armament Research and Development Establishment (ARDE-DRDO); Drive Lozics Pvt. Ltd., India; Mira Consulting Inc., Malaysia; the Information Technology Research Academy (ITRA), MeitY, India; and TEQIP-III.

Published

2025-03-28

Section

Articles