Accurate classification of computed tomography (CT) images is essential for diagnosis and treatment planning, but existing methods often struggle with the subtle and spatially diverse nature of pathological features. Current approaches typically process images uniformly, limiting their ability to detect localized abnormalities that require focused analysis. We introduce UGPL, an uncertainty-guided progressive learning framework that performs a global-to-local analysis by first identifying regions of diagnostic ambiguity and then conducting detailed examination of these critical areas. Our approach employs evidential deep learning to quantify predictive uncertainty, guiding the extraction of informative patches through a non-maximum suppression mechanism that maintains spatial diversity. This progressive refinement strategy, combined with an adaptive fusion mechanism, enables UGPL to integrate both contextual information and fine-grained details. Experiments across three CT datasets demonstrate that UGPL consistently outperforms state-of-the-art methods, achieving improvements of 3.29%, 2.46%, and 8.08% in accuracy for kidney abnormality, lung cancer, and COVID-19 detection, respectively. Our analysis shows that the uncertainty-guided component provides substantial benefits, with performance dramatically increasing when the full progressive learning pipeline is implemented.
We introduce Uncertainty-Guided Progressive Learning (UGPL), a novel framework that mimics diagnostic behavior by performing global analysis followed by focused examination of uncertain regions. UGPL addresses limitations of uniform processing by dynamically allocating computational resources where needed. Our framework first employs a global uncertainty estimator to perform initial classification and generate pixel-wise uncertainty maps, then selects high-uncertainty regions for detailed analysis through a local refinement network. These multi-resolution analyses are combined via an adaptive fusion module that weights predictions based on confidence. Unlike existing methods that treat uncertainty merely as an output signal, UGPL explicitly uses it to guide computational focus, maintaining efficiency while improving performance on diagnostically challenging regions. UGPL processes the input CT image to produce both classification probabilities and an uncertainty map that guides the extraction of high-uncertainty patches using non-maximum suppression. Each patch undergoes high-resolution analysis through a local refinement network, producing patch-specific classification scores and confidence estimates. The adaptive fusion module then integrates global and local predictions using learned weights based on their estimated reliability. Multiple specialized loss functions are jointly optimized, guiding components to work in tandem, adapt according to diagnostic difficulty, and improve performance over uniform processing.
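The pipeline above can be sketched in a few lines. This is our illustrative reconstruction, not the authors' code: the Dirichlet-based uncertainty form, the greedy NMS loop, and the softmax-over-confidences fusion are assumptions standing in for UGPL's learned components.

```python
import numpy as np

def evidential_uncertainty(evidence):
    """Dirichlet-based uncertainty u = K / sum(alpha), with alpha = evidence + 1.
    Zero evidence for all K classes gives maximal uncertainty u = 1."""
    alpha = evidence + 1.0
    k = evidence.shape[-1]
    return k / alpha.sum(axis=-1)

def select_patches_nms(unc_map, num_patches=3, min_dist=32):
    """Greedy non-maximum suppression over a pixel-wise uncertainty map:
    repeatedly take the most uncertain location, then suppress its
    neighbourhood so the selected patch centres stay spatially diverse."""
    u = unc_map.astype(float).copy()
    h, w = u.shape
    centres = []
    for _ in range(num_patches):
        y, x = np.unravel_index(np.argmax(u), u.shape)
        centres.append((int(y), int(x)))
        u[max(0, y - min_dist):y + min_dist + 1,
          max(0, x - min_dist):x + min_dist + 1] = -np.inf  # suppress region
    return centres

def fuse_predictions(global_probs, patch_probs, global_conf, patch_confs):
    """Confidence-weighted fusion of global and per-patch predictions
    (a softmax over confidences stands in for UGPL's learned weights)."""
    preds = np.vstack([global_probs] + list(patch_probs))
    conf = np.array([global_conf] + list(patch_confs))
    w = np.exp(conf - conf.max())
    w /= w.sum()
    return w @ preds
```

Note how the suppression radius in `select_patches_nms` is what enforces the spatial diversity of patches mentioned above: without it, all selected centres would cluster around the single most uncertain pixel.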
The local-only model (LM) performs poorly across all tasks, particularly for kidney abnormalities (40.57%) and lung cancer classification (51.22%), since local patches alone lack sufficient context and, without global guidance, focus on irrelevant regions. The accompanying figures show performance trends across tasks, with COVID-19 detection gaining the most in moving from the LM to the full model (FM).
The following table and ROC curves show that for COVID-19, the global-only model (GM) and FM achieve nearly identical AUC scores (0.901 vs. 0.900). For lung cancer, FM achieves slight per-class improvements, most notably for benign cases (0.991 vs. 0.992), and for kidney cases it improves most classes, including kidney stones (0.984 vs. 0.986).
| Model | Kidney Abnormalities Acc. | Kidney Abnormalities F1 | Lung Cancer Acc. | Lung Cancer F1 | COVID Acc. | COVID F1 |
|---|---|---|---|---|---|---|
| ShuffleNetV2 | 0.96 ± 0.0085 | 0.95 ± 0.0092 | 0.94 ± 0.0127 | 0.91 ± 0.0143 | 0.69 ± 0.0234 | 0.67 ± 0.0251 |
| VGG16 | 0.89 ± 0.0156 | 0.88 ± 0.0173 | 0.95 ± 0.0098 | 0.91 ± 0.0165 | 0.48 ± 0.0287 | 0.47 ± 0.0306 |
| ConvNeXt | 0.81 ± 0.0189 | 0.80 ± 0.0195 | 0.95 ± 0.0076 | 0.95 ± 0.0084 | 0.61 ± 0.0267 | 0.59 ± 0.0278 |
| DenseNet121 | 0.94 ± 0.0102 | 0.93 ± 0.0118 | 0.90 ± 0.0171 | 0.89 ± 0.0176 | 0.78 ± 0.0198 | 0.76 ± 0.0213 |
| DenseNet201 | 0.95 ± 0.0093 | 0.94 ± 0.0106 | 0.84 ± 0.0203 | 0.83 ± 0.0218 | 0.76 ± 0.0206 | 0.74 ± 0.0229 |
| EfficientNetB0 | 0.95 ± 0.0078 | 0.94 ± 0.0089 | 0.95 ± 0.0081 | 0.95 ± 0.0073 | 0.73 ± 0.0221 | 0.71 ± 0.0238 |
| MobileNetV2 | 0.87 ± 0.0179 | 0.85 ± 0.0195 | 0.70 ± 0.0267 | 0.69 ± 0.0283 | 0.70 ± 0.0241 | 0.68 ± 0.0256 |
| ViT | 0.94 ± 0.0154 | 0.92 ± 0.0167 | 0.51 ± 0.0389 | 0.22 ± 0.0456 | 0.56 ± 0.0312 | 0.55 ± 0.0318 |
| Swin | 0.68 ± 0.0298 | 0.40 ± 0.0421 | 0.60 ± 0.0334 | 0.41 ± 0.0398 | 0.53 ± 0.0331 | 0.53 ± 0.0329 |
| DeiT | 0.92 ± 0.0162 | 0.90 ± 0.0178 | 0.66 ± 0.0312 | 0.46 ± 0.0387 | 0.44 ± 0.0356 | 0.35 ± 0.0412 |
| CoaT | 0.98 ± 0.0067 | 0.98 ± 0.0072 | 0.95 ± 0.0089 | 0.93 ± 0.0112 | 0.68 ± 0.0254 | 0.66 ± 0.0267 |
| CrossViT | 0.97 ± 0.0087 | 0.97 ± 0.0094 | 0.58 ± 0.0356 | 0.39 ± 0.0423 | 0.62 ± 0.0289 | 0.48 ± 0.0378 |
| CRNet | - | - | - | - | 0.73 ± 0.0218 | 0.76 ± 0.0203 |
| UGPL (Ours) | 0.99 ± 0.0023 | 0.99 ± 0.0031 | 0.98 ± 0.0047 | 0.97 ± 0.0052 | 0.81 ± 0.0134 | 0.79 ± 0.0147 |
To analyze the contribution of each component in our progressive learning framework, we compare four configurations: (1) a global-only setup that uses the global uncertainty estimator without local refinement; (2) a no-uncertainty-guidance (No UG) variant, in which patches are selected randomly instead of via uncertainty maps; (3) a fixed-patches configuration that uses predefined patch locations rather than adaptive selection; and (4) the full model, which includes all components of the UGPL framework. The full model consistently outperforms all reduced variants by substantial F1 margins. On the COVID dataset, every ablation causes a dramatic performance drop, with the global-only variant achieving only 14.95% F1. For lung cancer detection, the full model obtains 97.64% F1, while the global-only setup drops to 34.19%. The kidney dataset likewise shows a large gap, with the full model reaching 99.6% F1 versus 58.7% for the best ablated configuration (fixed patches). Interestingly, No UG and fixed patches sometimes perform worse than the global-only model, showing that naively adding local components without proper guidance can be detrimental and underscoring the importance of uncertainty-guided patch selection.
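The two patch-selection ablations are easy to state in code. A minimal sketch follows; the function names and the diagonal grid used for the fixed-patch layout are our illustrative assumptions, not the paper's exact placement scheme:

```python
import numpy as np

def random_patch_centres(shape, num_patches, seed=0):
    """'No UG' ablation: patch centres drawn uniformly at random,
    ignoring the uncertainty map entirely."""
    rng = np.random.default_rng(seed)
    h, w = shape
    ys = rng.integers(0, h, num_patches)
    xs = rng.integers(0, w, num_patches)
    return [(int(y), int(x)) for y, x in zip(ys, xs)]

def fixed_patch_centres(shape, num_patches):
    """'Fixed patches' ablation: centres at predefined locations
    (here, evenly spaced along the image diagonal)."""
    h, w = shape
    return [((i + 1) * h // (num_patches + 1), (i + 1) * w // (num_patches + 1))
            for i in range(num_patches)]
```

Both variants bypass the uncertainty map entirely, which is exactly why they can underperform even the global-only model: the local branch then spends its capacity on regions with no diagnostic ambiguity.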
| Configuration | λf | λg | λl | λu | λc | λco | λd | COVID Acc. | COVID F1 | Lung Acc. | Lung F1 | Kidney Acc. | Kidney F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C1: Baseline | 1.0 | 0.5 | 0.5 | 0.3 | 0.2 | 0.1 | 0.1 | 0.8108 | 0.7903 | 0.9817 | 0.9764 | 0.9971 | 0.9945 |
| C2: Local Emphasis | 1.0 | 0.3 | 0.7 | 0.3 | 0.2 | 0.1 | 0.1 | 0.7946 | 0.7758 | 0.9695 | 0.9641 | 0.9928 | 0.9903 |
| C3: Global-Centric | 1.0 | 0.7 | 0.3 | 0.3 | 0.2 | 0.1 | 0.1 | 0.7568 | 0.7402 | 0.9634 | 0.9576 | 0.9876 | 0.9832 |
| C4: Uncertainty Focus | 1.0 | 0.5 | 0.5 | 0.6 | 0.2 | 0.1 | 0.1 | 0.8243 | 0.8057 | 0.9756 | 0.9687 | 0.9953 | 0.9931 |
| C5: Consistency-Driven | 1.0 | 0.5 | 0.5 | 0.3 | 0.5 | 0.1 | 0.1 | 0.7892 | 0.7689 | 0.9786 | 0.9723 | 0.9913 | 0.9889 |
| C6: Balanced High | 1.0 | 0.5 | 0.5 | 0.4 | 0.4 | 0.2 | 0.2 | 0.8051 | 0.7836 | 0.9801 | 0.9739 | 0.9942 | 0.9918 |
| C7: Diversity-Enhanced | 1.0 | 0.5 | 0.5 | 0.3 | 0.2 | 0.1 | 0.4 | 0.7784 | 0.7569 | 0.9667 | 0.9602 | 0.9895 | 0.9856 |
| C8: Confidence-Calibrated | 1.0 | 0.5 | 0.5 | 0.3 | 0.2 | 0.4 | 0.1 | 0.7973 | 0.7798 | 0.9753 | 0.9695 | 0.9923 | 0.9891 |
| C9: Conservative | 0.5 | 0.25 | 0.25 | 0.15 | 0.1 | 0.05 | 0.05 | 0.7486 | 0.7312 | 0.9581 | 0.9524 | 0.9837 | 0.9803 |
| C10: Aggressive | 2.0 | 1.0 | 1.0 | 0.6 | 0.4 | 0.2 | 0.2 | 0.8023 | 0.7827 | 0.9728 | 0.9674 | 0.9932 | 0.9907 |
The table above compares ten loss weight configurations across datasets. The baseline configuration (C1), with balanced weights (fused: 1.0; global/local: 0.5 each; uncertainty: 0.3; consistency: 0.2; confidence/diversity: 0.1 each), performs best overall. Configurations emphasizing either the global or the local branch underperform, confirming the necessity of combining global context with local detail. Increased uncertainty weighting (C4) improves COVID detection (82.43% accuracy, 80.57% F1) but slightly reduces performance on the Lung and Kidney datasets, where target features are more prominent. C5 (Consistency-Driven) performs strongly on the Lung dataset (97.86% accuracy), where structural patterns are clearer, while uniform scaling of all components (C9 and C10) yields no improvement, indicating that the relative balance of the loss terms matters more than their absolute values.
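Assuming the seven terms are combined as a plain weighted sum (the symbols match the table; the helper below is our sketch, not the authors' code), the training objective amounts to:

```python
def total_loss(losses, weights):
    """Weighted sum of UGPL's component losses: fused (f), global (g),
    local (l), uncertainty (u), consistency (c), confidence (co),
    and diversity (d)."""
    return sum(weights[k] * losses[k] for k in ("f", "g", "l", "u", "c", "co", "d"))

# C1 (baseline) weights from the table above
C1 = {"f": 1.0, "g": 0.5, "l": 0.5, "u": 0.3, "c": 0.2, "co": 0.1, "d": 0.1}
```

Under this reading, C9 and C10 rescale every weight by a constant factor, which only rescales the gradient magnitude; the C9/C10 results therefore support the claim that relative balance, not absolute scale, drives performance.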
| Patch Size | Patches | Kidney | Lung | COVID |
|---|---|---|---|---|
| 32 | 2 | 0.9586 | 0.8869 | 0.7161 |
| 32 | 3 | 0.9673 | 0.9195 | 0.7368 |
| 32 | 4 | 0.9541 | 0.8756 | 0.7454 |
| 64 | 2 | 0.9824 | 0.9764 | 0.7521 |
| 64 | 3 | 0.9945 | 0.8671 | 0.7368 |
| 64 | 4 | 0.9765 | 0.9343 | 0.7903 |
| 96 | 2 | 0.9622 | 0.8712 | 0.7372 |
| 96 | 3 | 0.9701 | 0.9099 | 0.7262 |
| 96 | 4 | 0.9418 | 0.8717 | 0.6505 |
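Selecting the operating point from such a sweep is a one-liner; the helper below is purely illustrative, shown here with the COVID column of the table above:

```python
def best_config(results):
    """Return the (patch_size, num_patches) pair with the highest accuracy."""
    return max(results, key=results.get)

# COVID accuracy column of the sweep above
covid = {(32, 2): 0.7161, (32, 3): 0.7368, (32, 4): 0.7454,
         (64, 2): 0.7521, (64, 3): 0.7368, (64, 4): 0.7903,
         (96, 2): 0.7521 - 0.0149, (96, 3): 0.7262, (96, 4): 0.6505}
```

Note that the best setting differs per dataset (64-pixel patches throughout, but a different patch count for each task), so the patch count is a hyperparameter worth tuning per target domain.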
@InProceedings{UGPL2025,
author = {Venkatraman, Shravan and Kumar S, Pavan and Raj, Rakesh and S, Chandrakala},
title = {UGPL: Uncertainty-Guided Progressive Learning for Evidence-Based Classification in Computed Tomography},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {October},
year = {2025}
}