[WACV 2025 Oral] Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding

ldkong1205/Calib3D

Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding

Lingdong Kong¹,²,*, Xiang Xu³,*, Jun Cen⁴, Wenwei Zhang¹, Liang Pan¹, Kai Chen¹, Ziwei Liu⁵
¹Shanghai AI Laboratory    ²National University of Singapore    ³Nanjing University of Aeronautics and Astronautics    ⁴The Hong Kong University of Science and Technology    ⁵S-Lab, Nanyang Technological University

About

Calib3D is a comprehensive benchmark targeted at probing the uncertainties of 3D scene understanding models in real-world conditions. It encompasses a systematic study of state-of-the-art models across diverse 3D datasets, laying a solid foundation for the future development of reliable 3D scene understanding systems.

  • 🚧 Aleatoric Uncertainty in 3D: We examine how intrinsic factors, such as sensor measurement noise and point cloud density variations, contribute to data uncertainty in 3D scene understanding. Such uncertainty cannot be reduced even with more data or improved models, so it must instead be interpreted and quantified effectively.
  • 🚌 Epistemic Uncertainty in 3D: Unlike the fairly unified network structures in 2D, 3D scene understanding models span a wider array of structures due to the complex nature of 3D data processing. Our investigation extends to the model uncertainty associated with these diverse 3D architectures, highlighting the importance of addressing knowledge gaps in model training and data representation.

Motivation

Well-calibrated 3D scene understanding models are expected to deliver low uncertainty when predictions are accurate and high uncertainty when they are inaccurate. Existing 3D models (UnCal) struggle to provide proper uncertainty estimates. The plots show point-wise expected calibration error (ECE) rates; the colormap runs from dark (low error rate) to light (high error rate).
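
The ECE mentioned above can be sketched as follows. This is a minimal NumPy sketch of the commonly used binned ECE definition, assuming each point comes with a confidence (its maximum softmax probability) and a flag saying whether the prediction was correct. The function name, the 10-bin default, and the toy inputs are illustrative assumptions and are not taken from the Calib3D codebase.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Select points whose confidence falls in the half-open bin (lo, hi].
        # A confidence of exactly 0 is ignored, which cannot occur for an
        # argmax over softmax probabilities.
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        accuracy = correct[in_bin].mean()
        avg_confidence = confidences[in_bin].mean()
        # in_bin.mean() is the fraction of all points that land in this bin.
        ece += in_bin.mean() * abs(accuracy - avg_confidence)
    return ece


# Toy example (hypothetical values): four confident points, two less confident ones.
toy_conf = [0.95, 0.95, 0.95, 0.95, 0.65, 0.65]
toy_correct = [1, 1, 1, 0, 1, 0]
print(f"ECE = {expected_calibration_error(toy_conf, toy_correct):.4f}")
```

A model whose confidence matches its accuracy in every bin scores an ECE of zero; over-confident or under-confident bins increase the score in proportion to how many points they contain.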

Visit our project page to explore more examples. 🚙

Updates

  • [2024.03] - Our paper is available on arXiv. The code has been made publicly accessible. 🚀

⚙️ Installation

For details related to installation and environment setups, kindly refer to INSTALL.md.

♨️ Data Preparation

nuScenes, SemanticKITTI, Waymo Open, SemanticSTF, SemanticPOSS, ScribbleKITTI, Synth4D, S3DIS, nuScenes-C, SemanticKITTI-C

Kindly refer to DATA_PREPARE.md for the details to prepare these datasets.

🚀 Getting Started

To learn more about using this codebase, kindly refer to GET_STARTED.md.

🐉 Model Zoo

 Range View
 Bird's Eye View
 Sparse Voxel
 Multi-View Fusion
 Raw Point
 3D Data Augmentation
 SparseConv Backend

📐 Calib3D Benchmark

Reliability Diagram

Reliability diagrams visualizing the calibration gaps on the val set of SemanticKITTI. UnCal, TempS, MetaC, and DeptS denote the uncalibrated model, temperature scaling, meta calibration, and our depth-aware scaling, respectively.
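
As an illustration of the simplest calibration method named above, temperature scaling divides the logits by a single scalar T before the softmax, with T chosen to minimize the negative log-likelihood on held-out data. The following is a minimal NumPy sketch of that standard technique. The function names, the grid search (used here instead of a gradient-based optimizer), and the search range are assumptions made for illustration, not the implementation in this repository, and the sketch does not cover MetaC or DeptS.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits / temperature, computed stably."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    probs = softmax(logits, temperature)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())


def fit_temperature(logits, labels, grid=None):
    """Pick the temperature on a 1-D grid that minimizes held-out NLL."""
    if grid is None:
        grid = np.linspace(0.5, 5.0, 91)  # hypothetical search range, step 0.05
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])
```

Usage would look like `T = fit_temperature(val_logits, val_labels)` followed by `softmax(test_logits, T)`, where `val_logits`, `val_labels`, and `test_logits` are placeholder arrays. Because a single positive scalar rescales all logits, the argmax prediction, and therefore accuracy and mIoU, is unchanged; only the confidence values move.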

In-Domain 3D Uncertainty

 nuScenes & SemanticKITTI
| Method | Modal | nuScenes: UnCal | nuScenes: TempS | nuScenes: LogiS | nuScenes: DiriS | nuScenes: MetaC | nuScenes: DeptS | SemanticKITTI: UnCal | SemanticKITTI: TempS | SemanticKITTI: LogiS | SemanticKITTI: DiriS | SemanticKITTI: MetaC | SemanticKITTI: DeptS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RangeNet++ | Range | 4.57% | 2.74% | 2.79% | 2.73% | 2.78% | 2.61% | 4.01% | 3.12% | 3.16% | 3.59% | 2.38% | 2.33% |
| SalsaNext | Range | 3.27% | 2.59% | 2.58% | 2.57% | 2.52% | 2.42% | 5.37% | 4.29% | 4.31% | 4.11% | 3.35% | 3.19% |
| FIDNet | Range | 4.89% | 3.35% | 2.89% | 2.61% | 4.55% | 4.33% | 5.89% | 4.04% | 4.15% | 3.82% | 3.25% | 3.14% |
| CENet | Range | 4.44% | 2.47% | 2.53% | 2.58% | 2.70% | 2.44% | 5.95% | 3.93% | 3.79% | 4.28% | 3.31% | 3.09% |
| RangeViT | Range | 2.52% | 2.50% | 2.57% | 2.56% | 2.46% | 2.38% | 5.47% | 3.16% | 4.84% | 8.80% | 3.14% | 3.07% |
| RangeFormer | Range | 2.44% | 2.40% | 2.41% | 2.44% | 2.27% | 2.15% | 3.99% | 3.67% | 3.70% | 3.69% | 3.55% | 3.30% |
| FRNet | Range | 2.27% | 2.24% | 2.22% | 2.28% | 2.22% | 2.17% | 3.46% | 3.53% | 3.54% | 3.49% | 2.83% | 2.75% |
| PolarNet | BEV | 4.21% | 2.47% | 2.54% | 2.59% | 2.56% | 2.45% | 2.78% | 3.54% | 3.71% | 3.70% | 2.67% | 2.59% |
| MinkUNet18 | Voxel | 2.45% | 2.34% | 2.34% | 2.42% | 2.29% | 2.23% | 3.04% | 3.01% | 3.08% | 3.30% | 2.69% | 2.63% |
| MinkUNet34 | Voxel | 2.50% | 2.38% | 2.38% | 2.53% | 2.32% | 2.24% | 4.11% | 3.59% | 3.62% | 3.63% | 2.81% | 2.73% |
| Cylinder3D | Voxel | 3.19% | 2.58% | 2.62% | 2.58% | 2.39% | 2.29% | 5.49% | 4.36% | 4.48% | 4.42% | 3.40% | 3.09% |
| SpUNet18 | Voxel | 2.58% | 2.41% | 2.46% | 2.59% | 2.36% | 2.25% | 3.77% | 3.47% | 3.44% | 3.61% | 3.37% | 3.21% |
| SpUNet34 | Voxel | 2.60% | 2.52% | 2.47% | 2.66% | 2.41% | 2.29% | 4.41% | 4.33% | 4.34% | 4.39% | 4.20% | 4.11% |
| RPVNet | Fusion | 2.81% | 2.70% | 2.73% | 2.79% | 2.68% | 2.60% | 4.67% | 4.12% | 4.23% | 4.26% | 4.02% | 3.75% |
| 2DPASS | Fusion | 2.74% | 2.53% | 2.51% | 2.51% | 2.62% | 2.46% | 2.32% | 2.35% | 2.45% | 2.30% | 2.73% | 2.27% |
| SPVCNN18 | Fusion | 2.57% | 2.44% | 2.49% | 2.54% | 2.40% | 2.31% | 3.46% | 2.90% | 3.07% | 3.41% | 2.36% | 2.32% |
| SPVCNN34 | Fusion | 2.61% | 2.49% | 2.54% | 2.61% | 2.37% | 2.28% | 3.61% | 3.03% | 3.07% | 3.10% | 2.99% | 2.86% |
| CPGNet | Fusion | 3.33% | 3.11% | 3.17% | 3.15% | 3.07% | 2.98% | 3.93% | 3.81% | 3.83% | 3.78% | 3.70% | 3.59% |
| GFNet | Fusion | 2.88% | 2.71% | 2.70% | 2.73% | 2.55% | 2.41% | 3.07% | 3.01% | 2.99% | 3.05% | 2.88% | 2.73% |
| UniSeg | Fusion | 2.76% | 2.61% | 2.63% | 2.65% | 2.45% | 2.37% | 3.93% | 3.73% | 3.78% | 3.67% | 3.51% | 3.43% |
| KPConv | Point | 3.37% | 3.27% | 3.34% | 3.32% | 3.28% | 3.20% | 4.97% | 4.88% | 4.90% | 4.91% | 4.78% | 4.68% |
| PIDS1.25x | Point | 3.46% | 3.40% | 3.43% | 3.41% | 3.37% | 3.28% | 4.77% | 4.65% | 4.66% | 4.64% | 4.57% | 4.49% |
| PIDS2.0x | Point | 3.53% | 3.47% | 3.49% | 3.51% | 3.34% | 3.27% | 4.91% | 4.83% | 4.72% | 4.89% | 4.66% | 4.47% |
| PTv2 | Point | 2.42% | 2.34% | 2.46% | 2.55% | 2.48% | 2.19% | 4.95% | 4.78% | 4.71% | 4.94% | 4.69% | 4.62% |
| WaffleIron | Point | 4.01% | 2.65% | 3.06% | 2.59% | 2.54% | 2.46% | 3.91% | 2.57% | 2.86% | 2.67% | 2.58% | 2.51% |

 Other Datasets
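| Dataset | Type | Method | Modal | UnCal | TempS | LogiS | DiriS | MetaC | DeptS | mIoU |
|---|---|---|---|---|---|---|---|---|---|---|
| Waymo Open | High-Res | PolarNet | BEV | 3.92% | 1.93% | 1.90% | 1.91% | 2.39% | 1.84% | 58.33% |
| Waymo Open | High-Res | MinkUNet | Voxel | 1.70% | 1.70% | 1.74% | 1.76% | 1.69% | 1.59% | 68.67% |
| Waymo Open | High-Res | SPVCNN | Fusion | 1.81% | 1.79% | 1.80% | 1.88% | 1.74% | 1.69% | 68.86% |
| SemanticPOSS | Dynamic | PolarNet | BEV | 4.24% | 8.09% | 7.81% | 8.30% | 5.35% | 4.11% | 52.11% |
| SemanticPOSS | Dynamic | MinkUNet | Voxel | 7.22% | 7.44% | 7.36% | 7.62% | 5.66% | 5.48% | 56.32% |
| SemanticPOSS | Dynamic | SPVCNN | Fusion | 8.80% | 6.53% | 6.91% | 7.41% | 4.61% | 3.98% | 53.51% |
| SemanticSTF | Weather | PolarNet | BEV | 5.76% | 4.94% | 4.49% | 4.53% | 4.17% | 4.12% | 51.26% |
| SemanticSTF | Weather | MinkUNet | Voxel | 5.29% | 5.21% | 4.96% | 5.10% | 4.78% | 4.72% | 50.22% |
| SemanticSTF | Weather | SPVCNN | Fusion | 5.85% | 5.53% | 5.16% | 5.05% | 5.12% | 4.97% | 51.73% |
| ScribbleKITTI | Scribble | PolarNet | BEV | 4.65% | 4.59% | 4.56% | 4.55% | 3.25% | 3.09% | 55.22% |
| ScribbleKITTI | Scribble | MinkUNet | Voxel | 7.97% | 7.13% | 7.29% | 7.21% | 5.93% | 5.74% | 59.87% |
| ScribbleKITTI | Scribble | SPVCNN | Fusion | 7.04% | 6.63% | 6.93% | 6.66% | 5.34% | 5.13% | 60.22% |
| Synth4D | Synthetic | PolarNet | BEV | 1.68% | 0.93% | 0.75% | 0.72% | 1.54% | 0.69% | 85.63% |
| Synth4D | Synthetic | MinkUNet | Voxel | 2.43% | 2.72% | 2.43% | 2.05% | 4.01% | 2.39% | 69.11% |
| Synth4D | Synthetic | SPVCNN | Fusion | 2.21% | 2.35% | 1.86% | 1.70% | 3.44% | 1.67% | 69.68% |
| S3DIS | Indoor | PointNet++ | Point | 9.13% | 8.36% | 7.83% | 8.20% | 6.93% | 6.79% | 56.96% |
| S3DIS | Indoor | DGCNN | Point | 6.00% | 6.23% | 6.35% | 7.12% | 5.47% | 5.39% | 54.50% |
| S3DIS | Indoor | PAConv | Point | 8.38% | 5.87% | 6.03% | 5.98% | 4.67% | 4.57% | 66.60% |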

Domain-Shift 3D Uncertainty

 nuScenes-C & SemanticKITTI-C
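| Type | nuScenes-C: UnCal | nuScenes-C: TempS | nuScenes-C: LogiS | nuScenes-C: DiriS | nuScenes-C: MetaC | nuScenes-C: DeptS | SemanticKITTI-C: UnCal | SemanticKITTI-C: TempS | SemanticKITTI-C: LogiS | SemanticKITTI-C: DiriS | SemanticKITTI-C: MetaC | SemanticKITTI-C: DeptS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Clean | 2.45% | 2.34% | 2.34% | 2.42% | 2.29% | 2.23% | 3.04% | 3.01% | 3.08% | 3.30% | 2.69% | 2.63% |
| Fog | 5.52% | 5.42% | 5.49% | 5.43% | 4.77% | 4.72% | 12.66% | 12.55% | 12.67% | 12.48% | 11.08% | 10.94% |
| Wet Ground | 2.63% | 2.54% | 2.54% | 2.64% | 2.55% | 2.52% | 3.55% | 3.46% | 3.54% | 3.72% | 3.33% | 3.28% |
| Snow | 13.79% | 13.32% | 13.53% | 13.59% | 11.37% | 11.31% | 7.10% | 6.96% | 6.95% | 7.26% | 5.99% | 5.63% |
| Motion Blur | 9.54% | 9.29% | 9.37% | 9.01% | 8.32% | 8.29% | 11.31% | 11.16% | 11.24% | 12.13% | 9.00% | 8.97% |
| Beam Missing | 2.58% | 2.48% | 2.49% | 2.57% | 2.53% | 2.47% | 2.87% | 2.83% | 2.84% | 2.98% | 2.83% | 2.79% |
| Crosstalk | 13.64% | 13.00% | 12.97% | 13.44% | 9.98% | 9.73% | 4.93% | 4.83% | 4.86% | 4.81% | 3.54% | 3.48% |
| Incomplete Echo | 2.44% | 2.33% | 2.33% | 2.42% | 2.32% | 2.21% | 3.21% | 3.19% | 3.25% | 3.48% | 2.84% | 2.19% |
| Cross Sensor | 4.25% | 4.15% | 4.20% | 4.28% | 4.06% | 3.20% | 3.15% | 3.13% | 3.18% | 3.43% | 3.17% | 2.96% |
| Average | 6.78% | 6.57% | 6.62% | 6.67% | 5.74% | 5.56% | 6.10% | 6.01% | 6.07% | 6.29% | 5.22% | 5.03% |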

Model Configurations

[Figure panels: ECE vs. mIoU; SparseConv Backend; LiDAR Modality]

📝 TODO List

  • Initial release. 🚀
  • Add 3D calibration benchmarks.
  • Add 3D calibration algorithms.
  • Add acknowledgments.
  • Add citations.
  • Add more 3D scene understanding models.

Citation

If you find this work helpful for your research, please kindly consider citing our papers:

@article{kong2024calib3d,
    author = {Lingdong Kong and Xiang Xu and Jun Cen and Wenwei Zhang and Liang Pan and Kai Chen and Ziwei Liu},
    title = {Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding},
    journal = {arXiv preprint arXiv:2403.17010},
    year = {2024},
}
@misc{mmdet3d,
    title = {MMDetection3D: OpenMMLab Next-Generation Platform for General 3D Object Detection},
    author = {MMDetection3D Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmdetection3d}},
    year = {2020}
}

License

This work is released under the Apache License, Version 2.0, while some specific implementations in this codebase might be under other licenses. If you are using our code for commercial purposes, kindly refer to LICENSE.md for a more careful check.

Acknowledgements

This work is developed based on the MMDetection3D codebase.


MMDetection3D is an open-source toolbox based on PyTorch, towards the next-generation platform for general 3D perception. It is a part of the OpenMMLab project developed by MMLab.

Some of the benchmarked models are from the OpenPCSeg and Pointcept codebases.

We acknowledge the use of the following public resources during the course of this work: nuScenes, SemanticKITTI, Waymo Open, SemanticPOSS, Synth4D, SemanticSTF, ScribbleKITTI, S3DIS, Robo3D, lidar-bonnetal, MinkowskiEngine, SPConv, TorchSparse, WaffleIron, PolarMix, LaserMix, FRNet, and Open3D-ML.

We thank the exceptional contributions from the above open-source repositories! ❤️
