RANGER: Sparsely-Gated Mixture-of-Experts with Adaptive Retrieval Re-ranking for Pathology Report Generation

2026-03-04, Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence
AI summary

The authors address the challenge of generating pathology reports from gigapixel whole slide images with a system called RANGER. Their method places a mixture of expert components in the decoder, each of which can specialize in different diagnostic patterns. They also add a re-ranking step that refines the medical knowledge retrieved from a knowledge base, reducing noise and aligning it more closely with the image features. Tested on a breast cancer dataset, their approach outperformed existing methods on standard natural language generation metrics.

Pathology report generation, Whole Slide Images (WSIs), Transformer architecture, Mixture-of-Experts (MoE), Sparse gating, Adaptive retrieval, Knowledge base, Natural language generation metrics, BLEU score, Diagnostic patterns
Authors
Yixin Chen, Ziyu Su, Hikmat Khan, Muhammad Khalid Khan Niazi
Abstract
Pathology report generation remains a relatively under-explored downstream task, primarily due to the gigapixel scale and complex morphological heterogeneity of Whole Slide Images (WSIs). Existing pathology report generation frameworks typically employ transformer architectures with a homogeneous decoder and static integration of retrieved knowledge. Such architectures limit generative specialization and may introduce noisy external guidance during report generation. To address these limitations, we propose RANGER, a sparsely-gated Mixture-of-Experts (MoE) framework with adaptive retrieval re-ranking for pathology report generation. Specifically, we integrate a sparsely-gated MoE into the decoder, along with noisy top-$k$ routing and load-balancing regularization, to enable dynamic expert specialization across various diagnostic patterns. Additionally, we introduce an adaptive retrieval re-ranking module that selectively refines retrieved memory from a knowledge base before integration, reducing noise and improving semantic alignment based on visual feature representations. We perform extensive experiments on the PathText-BRCA dataset and demonstrate consistent improvements over existing approaches across standard natural language generation metrics. Our full RANGER model achieves the best performance on the PathText dataset, reaching BLEU-1 to BLEU-4 scores of 0.4598, 0.3044, 0.2036, and 0.1435, respectively, a METEOR of 0.1883, and a ROUGE-L of 0.3038, validating the effectiveness of dynamic expert routing and adaptive knowledge refinement for semantically grounded pathology report generation.
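To make the routing idea in the abstract concrete, the following is a minimal sketch of noisy top-$k$ gating with a load-balancing penalty, written in plain Python. It follows a commonly used formulation (Gaussian noise scaled by a softplus term, a softmax over the $k$ largest logits, and the squared coefficient of variation of per-expert importance as the penalty). The abstract does not give RANGER's exact equations, so the function names, the input logits, and the form of the penalty are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def noisy_top_k_gate(clean_logits, noise_logits, k, rng=None):
    """Return sparse gate weights over experts for a single token.

    clean_logits: per-expert routing scores (hypothetical, e.g. x @ W_g).
    noise_logits: per-expert noise-scale logits (hypothetical, e.g. x @ W_noise).
    rng: a random.Random instance; if None, no noise is added,
         which mimics deterministic inference-time routing.
    """
    noisy = []
    for clean, raw in zip(clean_logits, noise_logits):
        eps = rng.gauss(0.0, 1.0) if rng is not None else 0.0
        noisy.append(clean + eps * softplus(raw))
    # Keep the k largest logits; every other expert gets exactly zero weight.
    top = sorted(range(len(noisy)), key=lambda i: noisy[i], reverse=True)[:k]
    peak = max(noisy[i] for i in top)
    exps = {i: math.exp(noisy[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(noisy))]


def load_balance_loss(gates_per_token):
    """Squared coefficient of variation of per-expert importance.

    Importance of an expert is the sum of its gate weights over all tokens.
    The value is 0 when experts are used equally and grows as routing collapses.
    """
    num_experts = len(gates_per_token[0])
    importance = [sum(g[e] for g in gates_per_token) for e in range(num_experts)]
    mean = sum(importance) / num_experts
    var = sum((v - mean) ** 2 for v in importance) / num_experts
    return var / (mean ** 2 + 1e-10)


if __name__ == "__main__":
    rng = random.Random(0)
    # Four hypothetical experts, two active per token.
    gates = [
        noisy_top_k_gate([2.0, 1.0, 0.5, -1.0], [0.0, 0.0, 0.0, 0.0], k=2, rng=rng)
        for _ in range(8)
    ]
    print(gates[0])
    print(load_balance_loss(gates))
```

In a setup like this, the penalty would typically be added to the report generation loss with a small weight, so that the router is discouraged from sending every token to the same few experts while still being free to specialize.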