Histopathology Image Normalization via Latent Manifold Compaction

2026-02-27 • Machine Learning

Machine Learning, Computer Vision and Pattern Recognition

AI summary

The authors address the problem that differences in how pathology images are stained and scanned make it hard for computer models to work well across different hospitals. They created a method called Latent Manifold Compaction (LMC) that trains on data from a single site and helps models focus on features shared across images while ignoring technical differences. The method improves how well models work on unseen data from other sites and outperforms existing normalization techniques on cross-batch classification and detection tasks. This can help make computational pathology tools more reliable across different settings.

Batch effects, Histopathology, Image harmonization, Representation learning, Latent manifold, Domain generalization, Normalization methods, Cross-batch classification, Detection tasks

Authors

Xiaolong Zhang, Jianwei Zhang, Selim Sevim, Emek Demir, Ece Eksi, Xubo Song

Abstract

Batch effects arising from technical variations in histopathology staining protocols, scanners, and acquisition pipelines pose a persistent challenge for computational pathology, hindering cross-batch generalization and limiting reliable deployment of models across clinical sites. In this work, we introduce Latent Manifold Compaction (LMC), an unsupervised representation learning framework that performs image harmonization by learning batch-invariant embeddings from a single source dataset through explicit compaction of stain-induced latent manifolds. This allows LMC to generalize to target domain data unseen during training. Evaluated on three challenging public and in-house benchmarks, LMC substantially reduces batch-induced separations across multiple datasets and consistently outperforms state-of-the-art normalization methods in downstream cross-batch classification and detection tasks, enabling superior generalization.
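
The abstract does not state the training objective, so the following is only an illustrative sketch of what "compacting" a stain-induced latent manifold could look like, not the authors' implementation. It assumes that stain variation is simulated by randomly rescaling the color channels of a single source sample, and that compaction means pulling the embeddings of those perturbed views toward their centroid. The names `perturb_stain`, `embed`, and `compaction_loss`, the toy linear encoder, and all numeric values are hypothetical.

```python
import random


def perturb_stain(pixel, rng, strength=0.2):
    """Hypothetical stain perturbation: rescale each RGB channel by a random factor."""
    return [c * (1.0 + rng.uniform(-strength, strength)) for c in pixel]


def embed(pixel, weights):
    """Toy linear encoder (an assumption): one output value per weight row."""
    return [sum(w * c for w, c in zip(row, pixel)) for row in weights]


def compaction_loss(embeddings):
    """Mean squared distance of each view's embedding to the centroid of all views."""
    n = len(embeddings)
    dim = len(embeddings[0])
    centroid = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    total = 0.0
    for e in embeddings:
        total += sum((e[d] - centroid[d]) ** 2 for d in range(dim))
    return total / n


rng = random.Random(0)
pixel = [0.6, 0.3, 0.7]                          # one source-site RGB sample
weights = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]   # toy encoder parameters

# Several stain-perturbed views of the same underlying sample.
views = [perturb_stain(pixel, rng) for _ in range(8)]
loss = compaction_loss([embed(v, weights) for v in views])

# Identical views occupy a single point, so there is nothing left to compact.
identical_loss = compaction_loss([embed(pixel, weights)] * 8)
```

Minimizing a term like `loss` with respect to the encoder parameters would make the embedding less sensitive to the simulated stain changes. A compaction term on its own is also minimized by an encoder that outputs a constant, so in practice it would have to be paired with some objective that keeps the embedding informative; how LMC handles this is not described in the abstract.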
