PCM-NeRF: Probabilistic Camera Modeling for Neural Radiance Fields under Pose Uncertainty

Abstract

Neural surface reconstruction methods typically treat camera poses as fixed values, assuming perfect accuracy from Structure-from-Motion (SfM) systems. This assumption breaks down with imperfect pose estimates, leading to distorted or incomplete reconstructions. We present PCM-NeRF, a probabilistic framework that augments neural surface reconstruction with per-camera learnable uncertainty, built directly on SG-NeRF. Rather than treating all cameras equally throughout optimization, we represent each pose as a distribution with a learnable mean and variance, initialized from SfM correspondence quality. An uncertainty regularization loss couples the learned variance to view confidence — derived from both geometric correspondences and rendering quality — and the resulting uncertainty directly modulates the effective pose learning rate: uncertain cameras receive damped gradient updates, preventing poorly initialized views from corrupting the reconstruction. Experiments on challenging scenes with severe pose outliers demonstrate that PCM-NeRF consistently outperforms state-of-the-art methods in both Chamfer Distance and F-Score, particularly for geometrically complex structures, without requiring foreground masks.

Contributions

We introduce a probabilistic per-camera pose representation for neural surface reconstruction (NSR) in which each pose is modeled as a distribution in SE(3) with a learnable mean and per-axis variance, initialized from SfM correspondence quality and evolved jointly with the scene.
We derive an uncertainty-modulated pose learning rate that inversely scales each camera's effective gradient step by its learned uncertainty, automatically preventing outlier views from destabilizing reconstruction without manual filtering or threshold selection.
We propose a cross-modal consistency loss that aligns learned variance parameters with view confidence derived from both geometric correspondences and rendering quality, anchoring uncertainty to observable evidence rather than rendering gradients alone.
We demonstrate state-of-the-art surface reconstruction on challenging outlier-pose scenes, achieving the lowest Chamfer Distance on 6 of 8 scenes and the highest F-Score on 6 of 8 scenes.

PCM-NeRF: Method

PCM-NeRF builds directly on SG-NeRF, extending gradient-based pose refinement with a lightweight probabilistic pose representation. Each camera pose P_i is modeled as a distribution in SE(3) with learnable mean μ_i = (r_i, t_i) and diagonal covariance Σ_i = diag(σ²_r,i, σ²_t,i), parameterized in log-space for numerical stability. All rendering uses the mean pose directly — adding no inference-time overhead — while the variance parameters are optimized through the uncertainty regularization loss.
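The log-space parameterization described above can be illustrated with a minimal sketch: a per-camera log-variance is stored and σ is recovered by exponentiation, which keeps it positive without an explicit constraint. The class name `ProbabilisticPose`, the axis-angle rotation encoding, the `from_confidence` initializer, and the initial mapping σ = 1 − γ are assumptions made for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class ProbabilisticPose:
    """Hypothetical container for one camera: mean pose plus log-variances."""
    rotation: tuple      # mean rotation r_i (axis-angle here, an assumption)
    translation: tuple   # mean translation t_i
    log_var_r: float     # log of sigma^2 for rotation
    log_var_t: float     # log of sigma^2 for translation

    @classmethod
    def from_confidence(cls, rotation, translation, gamma0, floor=1e-3):
        # Assumed initialization: low SfM confidence gives a large initial sigma.
        sigma = max(1.0 - gamma0, floor)  # the floor avoids log(0)
        log_var = 2.0 * math.log(sigma)
        return cls(tuple(rotation), tuple(translation), log_var, log_var)

    @property
    def sigma_r(self):
        # exp of half the log-variance is positive for any real log-variance
        return math.exp(0.5 * self.log_var_r)

    @property
    def sigma_t(self):
        return math.exp(0.5 * self.log_var_t)

    def render_pose(self):
        # Rendering uses the mean pose only; the variances are never sampled.
        return self.rotation, self.translation


pose = ProbabilisticPose.from_confidence(
    (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), gamma0=0.8
)
```

Because `render_pose` returns only the mean, the variance parameters add no cost at inference time, matching the description above.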

View reliability is estimated from SfM correspondence density and a rolling buffer of per-camera PSNR scores, blended as γ_i^(t) = (1−α)·γ_i^(0) + α·γ̂_i^(t) with α = 0.7. The uncertainty regularization loss encourages σ̄_i ≈ 1−γ_i, and the learned uncertainty modulates the effective pose learning rate as η_i^pose = η₀ / (1 + κ·σ̄_i) with κ = 5.0. Geometric consistency is enforced via the volumetric IoU loss over Mixture-of-Gaussian density distributions along matched rays, inherited from SG-NeRF.
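The confidence blend and the learning-rate rule above can be sketched in a few lines. The two formulas (`blend_confidence`, `pose_learning_rate`) follow the text directly. The linear PSNR-to-confidence mapping with its `lo`/`hi` bounds, the reading of the initial term as the SfM-based confidence and the hatted term as the PSNR-based one, and the squared-error form of the regularizer are assumptions for illustration only.

```python
from collections import deque


def blend_confidence(gamma0, gamma_hat, alpha=0.7):
    """gamma_i^(t) = (1 - alpha) * gamma_i^(0) + alpha * gamma_hat_i^(t)."""
    return (1.0 - alpha) * gamma0 + alpha * gamma_hat


def psnr_confidence(psnr_buffer, lo=15.0, hi=35.0):
    """Assumed mapping: mean buffered PSNR rescaled linearly and clipped to [0, 1]."""
    mean_psnr = sum(psnr_buffer) / len(psnr_buffer)
    return min(max((mean_psnr - lo) / (hi - lo), 0.0), 1.0)


def uncertainty_reg_loss(sigma_bar, gamma):
    """Assumed squared-error form pulling sigma_bar toward (1 - gamma)."""
    return (sigma_bar - (1.0 - gamma)) ** 2


def pose_learning_rate(eta0, sigma_bar, kappa=5.0):
    """eta_i^pose = eta0 / (1 + sigma_bar * kappa)."""
    return eta0 / (1.0 + sigma_bar * kappa)


# Rolling buffer of recent PSNR scores for one camera (values are made up).
buffer = deque([20.0, 24.0, 28.0], maxlen=3)
gamma_hat = psnr_confidence(buffer)                        # (24 - 15) / 20
gamma = blend_confidence(gamma0=0.9, gamma_hat=gamma_hat)  # 0.3 * 0.9 + 0.7 * 0.45
lr_confident = pose_learning_rate(1e-3, sigma_bar=0.0)     # full step eta0
lr_uncertain = pose_learning_rate(1e-3, sigma_bar=1.0)     # eta0 / 6
```

With κ = 5.0, a camera at σ̄_i = 0 keeps the full step η₀, while a camera at σ̄_i = 1 is damped to η₀/6, so the rule attenuates updates smoothly rather than discarding views outright.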

Qualitative Results

Qualitative comparison across the Clock, Deaf, and Farmer scenes. PCM-NeRF recovers finer geometric detail than all baselines, particularly in regions where outlier pose initialization corrupts the reconstruction.

Hyperparameter Sensitivity

Effect of IoU weight (λ_IoU) and uncertainty weight (λ_unc) on Chamfer Distance and F-Score. Best results are achieved at λ_IoU = 0.2 and λ_unc = 0.05.
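To make the role of the two weights concrete, the sketch below assumes they enter the objective as a plain weighted sum on top of the existing reconstruction loss. That form is an assumption, as the text here does not spell out the full objective. The function name `total_loss` and the `base_loss` argument are hypothetical; the defaults are the best-performing values reported above.

```python
def total_loss(base_loss, iou_loss, unc_loss, lambda_iou=0.2, lambda_unc=0.05):
    """Assumed weighted-sum objective: base + lambda_IoU * IoU + lambda_unc * unc."""
    return base_loss + lambda_iou * iou_loss + lambda_unc * unc_loss


# Made-up loss values, only to show the relative scale of each term.
example = total_loss(base_loss=1.0, iou_loss=0.5, unc_loss=0.2)
```

Under this assumption, the uncertainty term is weighted four times more lightly than the IoU term at the reported optimum.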

Ablation Study: Component Contributions

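Results averaged across all 8 scenes. Feature correspondences account for most of the gain on their own, reducing mean CD from 1.18 to 0.46 and raising F-Score from 0.72 to 0.85. The probabilistic uncertainty module alone does not help (CD 1.33, F-Score 0.68), but adding it on top of feature correspondences gives the best performance (CD 0.32, F-Score 0.93).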

Prob. Uncertainty	Feature Corr.	CD ↓	F-Score ↑
✗	✗	1.18	0.72
✓	✗	1.33	0.68
✗	✓	0.46	0.85
✓	✓	0.32	0.93

Quantitative Comparison

Chamfer Distance (↓) across all 8 scenes. PCM-NeRF achieves the lowest CD on 6 of 8 scenes, reducing mean CD by 28.8% relative to SG-NeRF (0.46 to 0.33) and improving mean F-Score from 0.85 to 0.93.

Method	Baby	Bear	Bell	Clock	Deaf	Farmer	Pavilion	Sculpture	Mean CD ↓	Mean F ↑
NeuS	0.69	0.31	3.33	1.16	0.55	2.49	0.29	0.66	1.19	0.73
Neuralangelo	0.70	0.65	—	0.38	0.59	4.89	1.95	0.31	—	—
BARF	1.08	0.28	3.31	0.19	0.46	2.13	0.38	0.57	1.05	0.73
SCNeRF	1.19	0.27	3.74	1.33	0.46	1.45	0.23	0.81	1.19	0.66
GARF	2.04	2.25	3.09	0.50	0.59	1.58	0.96	0.57	1.45	0.70
L2G-NeRF	1.15	0.29	1.26	0.24	0.40	2.18	0.46	0.37	0.79	0.76
Joint-TensoRF	3.11	1.22	2.49	0.36	0.36	2.51	1.35	0.70	1.51	0.59
SG-NeRF	0.56	0.25	0.98	0.15	0.45	0.87	0.20	0.22	0.46	0.85
Ours (PCM-NeRF)	0.25	0.24	0.51	0.16	0.23	0.79	0.24	0.20	0.33	0.93

Chamfer Distance (↓) per scene, followed by mean CD and mean F-Score. PCM-NeRF achieves the lowest CD on 6 of 8 scenes; SG-NeRF is lowest on Clock and Pavilion.
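The summary figures can be re-derived from the per-scene values in the table. The short script below copies those values, recomputes the mean CD for PCM-NeRF and SG-NeRF, and counts the scenes on which PCM-NeRF has the lowest CD. The variable and function names are illustrative, and the missing Neuralangelo entry is represented as `None`.

```python
SCENES = ["Baby", "Bear", "Bell", "Clock", "Deaf", "Farmer", "Pavilion", "Sculpture"]

# Per-scene Chamfer Distance copied from the table; None marks a missing entry.
CD = {
    "NeuS":          [0.69, 0.31, 3.33, 1.16, 0.55, 2.49, 0.29, 0.66],
    "Neuralangelo":  [0.70, 0.65, None, 0.38, 0.59, 4.89, 1.95, 0.31],
    "BARF":          [1.08, 0.28, 3.31, 0.19, 0.46, 2.13, 0.38, 0.57],
    "SCNeRF":        [1.19, 0.27, 3.74, 1.33, 0.46, 1.45, 0.23, 0.81],
    "GARF":          [2.04, 2.25, 3.09, 0.50, 0.59, 1.58, 0.96, 0.57],
    "L2G-NeRF":      [1.15, 0.29, 1.26, 0.24, 0.40, 2.18, 0.46, 0.37],
    "Joint-TensoRF": [3.11, 1.22, 2.49, 0.36, 0.36, 2.51, 1.35, 0.70],
    "SG-NeRF":       [0.56, 0.25, 0.98, 0.15, 0.45, 0.87, 0.20, 0.22],
    "PCM-NeRF":      [0.25, 0.24, 0.51, 0.16, 0.23, 0.79, 0.24, 0.20],
}


def mean(values):
    return sum(values) / len(values)


def best_method(scene_index):
    # Methods with no reported value for this scene are skipped.
    scores = {m: v[scene_index] for m, v in CD.items() if v[scene_index] is not None}
    return min(scores, key=scores.get)


ours = mean(CD["PCM-NeRF"])
base = mean(CD["SG-NeRF"])
reduction_pct = 100.0 * (base - ours) / base
wins = [s for i, s in enumerate(SCENES) if best_method(i) == "PCM-NeRF"]

print(f"mean CD: {ours:.4f} vs {base:.4f}, reduction {reduction_pct:.1f}%")
print(f"PCM-NeRF is best on {len(wins)} of {len(SCENES)} scenes")
```

The unrounded mean of the PCM-NeRF row is 0.3275, which is the value the 28.8% reduction corresponds to; the count of scene wins comes out to 6 of 8.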

BibTeX

@inproceedings{venkatraman2026pcm,
  title={PCM-NeRF: Probabilistic Camera Modeling for Neural Radiance Fields under Pose Uncertainty},
  author={Venkatraman, Shravan and Madavan, Rakesh Raj and Venkatesh, Pavan Kumar Sathya},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4719--4728},
  year={2026}
}