Histopathology image segmentation is essential for delineating tissue structures in skin cancer diagnostics, but modeling spatial context and inter-tissue relationships remains a challenge, especially in regions with overlapping or morphologically similar tissues. Current convolutional neural network (CNN)-based approaches operate primarily on visual texture, often treating tissues as independent regions and failing to encode biological context. To address this, we introduce Neural Tissue Relation Modeling (NTRM), a novel segmentation framework that augments CNNs with a tissue-level graph neural network to model spatial and functional relationships across tissue types. NTRM constructs a graph over predicted regions, propagates contextual information via message passing, and refines segmentation through spatial projection. Unlike prior methods, NTRM explicitly encodes inter-tissue dependencies, enabling structurally coherent predictions in boundary-dense zones. On the benchmark Histopathology Non-Melanoma Skin Cancer Segmentation Dataset, NTRM outperforms state-of-the-art methods, achieving Dice similarity coefficient improvements of 4.9\% to 31.25\% over the evaluated models. Our experiments indicate that relational modeling offers a principled path toward more context-aware and interpretable histological segmentation, compared to local receptive-field architectures that lack tissue-level structural awareness. Our code will be released upon acceptance.
Neural Tissue Relational Modeling (NTRM) explicitly models the biological relationships between tissue types through a GNN integrated with traditional CNN feature extraction. Our approach constructs a tissue-level graph where nodes represent different tissue types and edges encode their spatial and functional relationships, learning tissue-specific embeddings that capture both visual characteristics and biological context. We do this by combining an initial draft segmentation with a tissue relation module (TRM) that refines predictions by incorporating learned tissue dependencies. The TRM constructs a tissue-level graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ from the coarse segmentation map, where each node represents a predicted tissue class and edges indicate contextual or spatial proximity. Softmax-normalized class probabilities are thresholded to generate binary masks for each tissue, which define the spatial extent of each node. Intermediate CNN features $\mathcal{D}_2$ are then masked and globally pooled to produce class-specific node embeddings. Edges are constructed by examining spatial adjacency between tissue masks, allowing the graph to capture biologically relevant neighborhood relationships. This explicit graph representation enables TRM to reason over tissue co-occurrence and context — modeling structured interactions that convolutional layers alone cannot express. A graph neural network propagates messages over $\mathcal{G}$, refining node embeddings before projecting them back into the spatial domain.
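To make this pipeline concrete, the sketch below walks through the same steps: thresholding the softmax probabilities into per-class masks, masked global pooling of the intermediate features $\mathcal{D}_2$ into node embeddings, adjacency-based edge construction, one round of message passing, and spatial projection back onto the masks. It is a minimal PyTorch sketch under assumed details, not the paper's implementation: the module and parameter names, the 0.5 threshold, the dilation-based adjacency test, the GRU-style node update, and the residual fusion with the coarse logits are all illustrative choices.

```python
# Minimal sketch of a tissue relation module (TRM), assuming PyTorch.
# Hypothetical details: 0.5 mask threshold, dilation-based adjacency,
# mean-aggregated messages with a GRU-style node update, residual fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TissueRelationModule(nn.Module):
    def __init__(self, num_classes: int, feat_dim: int, hidden_dim: int = 128, tau: float = 0.5):
        super().__init__()
        self.tau = tau
        self.embed = nn.Linear(feat_dim, hidden_dim)      # pooled features -> node embeddings
        self.message = nn.Linear(hidden_dim, hidden_dim)  # neighbour message transform
        self.update = nn.GRUCell(hidden_dim, hidden_dim)  # node update after aggregation
        self.project = nn.Conv2d(hidden_dim, num_classes, kernel_size=1)  # back to logits

    def forward(self, coarse_logits: torch.Tensor, d2_feats: torch.Tensor) -> torch.Tensor:
        """coarse_logits: (B, C, H, W) draft segmentation; d2_feats: (B, F, H, W) CNN features."""
        B, C, _, _ = coarse_logits.shape
        probs = coarse_logits.softmax(dim=1)
        masks = (probs > self.tau).float()                       # binary mask per tissue class

        # Node features: mask the intermediate features per class, then global-average-pool.
        area = masks.sum(dim=(2, 3)).clamp(min=1.0)              # (B, C)
        pooled = torch.einsum('bchw,bfhw->bcf', masks, d2_feats) / area.unsqueeze(-1)
        nodes = self.embed(pooled)                               # (B, C, hidden)

        # Edges: two tissue classes are connected if their (slightly dilated) masks touch.
        dilated = F.max_pool2d(masks, kernel_size=3, stride=1, padding=1)
        overlap = torch.einsum('bchw,bdhw->bcd', dilated, masks)
        adj = (overlap > 0).float() * (1.0 - torch.eye(C, device=masks.device))

        # One round of message passing: mean over neighbours, then GRU-style update.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        msgs = torch.bmm(adj, self.message(nodes)) / deg
        refined = self.update(msgs.reshape(B * C, -1), nodes.reshape(B * C, -1)).reshape(B, C, -1)

        # Spatial projection: broadcast each refined node embedding over its mask,
        # then fuse the projected refinement with the coarse prediction.
        spatial = torch.einsum('bchw,bcf->bfhw', masks, refined)
        return coarse_logits + self.project(spatial)


if __name__ == "__main__":
    trm = TissueRelationModule(num_classes=6, feat_dim=64)
    logits, feats = torch.randn(2, 6, 32, 32), torch.randn(2, 64, 32, 32)
    print(trm(logits, feats).shape)  # torch.Size([2, 6, 32, 32])
```

In this sketch, broadcasting each refined embedding back through its own mask keeps the graph-level reasoning spatially grounded, so the 1×1 projection only has to translate tissue-level context into per-pixel refinement logits added to the coarse prediction.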
Our method demonstrates superior boundary adherence and suppression of false positives, particularly for background (BKG) and keratin (KER) regions. In basal cell carcinoma (BCC), baseline and state-of-the-art methods frequently misclassify basal compartments as squamous cell carcinoma (SCC, green) or BKG (black), whereas our prediction preserves the epithelial-basal interface more precisely, with fewer false activations. For SCC and intraepidermal carcinoma (IEC), our model differentiates adjacent structures such as inflammation (INF), reticular dermis (RET), and hair follicles (FOL) markedly better, with visibly cleaner delineations.
The figure above shows the operational impact of the TRM on segmentation refinement. The initial predictions, produced by the CNN decoder in isolation, exhibit failure modes near complex boundaries, particularly at BCC-reticular interfaces and at epithelial structures adjacent to keratin deposits. These misclassifications arise from insufficient contextual reasoning across disjoint but functionally correlated tissue types. As shown in the pipeline, spatially contiguous regions are treated as graph nodes and connected via context-aware edges, allowing the network to explicitly reason over inter-tissue dependencies. The refined segmentation output captures granular class boundaries and suppresses spurious activations, as is visually evident in the improvement overlay (right).
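For reference, here is one way such an improvement overlay could be derived; this is a hypothetical NumPy sketch, not the paper's visualization code, and it assumes per-pixel class logits for the coarse and refined predictions plus an integer ground-truth label map.

```python
# Hypothetical overlay: mark pixels the refinement changed, split into
# corrections (now match ground truth) and regressions (no longer match).
import numpy as np


def improvement_overlay(coarse_logits: np.ndarray,
                        refined_logits: np.ndarray,
                        ground_truth: np.ndarray) -> np.ndarray:
    """coarse_logits, refined_logits: (C, H, W); ground_truth: (H, W) integer labels."""
    coarse = coarse_logits.argmax(axis=0)
    refined = refined_logits.argmax(axis=0)
    changed = coarse != refined                        # pixels the TRM actually altered
    corrected = changed & (refined == ground_truth)    # alterations that fixed an error
    regressed = changed & (coarse == ground_truth)     # alterations that broke a correct pixel
    overlay = np.zeros_like(coarse, dtype=np.int8)     # 0 = unchanged
    overlay[corrected] = 1                             # +1 = improved by refinement
    overlay[regressed] = -1                            # -1 = degraded by refinement
    return overlay
```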
@misc{venkatraman2025visualfeaturesneuraltissue,
  title={Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology},
  author={Shravan Venkatraman and Muthu Subash Kavitha and Joe Dhanith P R and V Manikandarajan and Jia Wu},
  year={2025},
  eprint={2512.06949},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.06949},
}