A Proper Scoring Rule for Virtual Staining
2026-02-26 • Machine Learning
AI summary
The authors discuss virtual staining models, which predict biological feature values from cell images and so help analyze many samples quickly. They point out that existing evaluation protocols only check how well predictions match the overall distribution of values across the dataset, not whether the distribution predicted for each individual cell is accurate. To address this, they introduce information gain (IG), a cell-wise score for evaluating predicted posteriors. They evaluated diffusion- and GAN-based models on a high-throughput screening dataset and found that IG can reveal substantial performance differences that other metrics miss.
virtual staining, high-throughput screening, posterior distribution, information gain, diffusion models, GANs, scoring rules, biological imaging, model evaluation
Authors
Samuel Tonks, Steve Hood, Ryan Musso, Ceridwen Hopely, Steve Titus, Minh Doan, Iain Styles, Alexander Krull
Abstract
Generative virtual staining (VS) models for high-throughput screening (HTS) can provide an estimated posterior distribution of possible biological feature values for each input and cell. However, when evaluating a VS model, the true posterior is unavailable. Existing evaluation protocols only check the accuracy of the marginal distribution over the dataset rather than the predicted posteriors. We introduce information gain (IG) as a cell-wise evaluation framework that enables direct assessment of predicted posteriors. IG is a strictly proper scoring rule and comes with a sound theoretical motivation allowing for interpretability, and for comparing results across models and features. We evaluate diffusion- and GAN-based models on an extensive HTS dataset using IG and other metrics and show that IG can reveal substantial performance differences other metrics cannot.
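The abstract does not give the formula for IG, so the following is only a minimal sketch under an assumption: that IG for one cell is the difference in logarithmic score (in bits) between the predicted posterior and the dataset marginal, evaluated at the measured feature value. The logarithmic score is a strictly proper scoring rule, and subtracting a fixed baseline keeps it strictly proper. The function name, the histogram density estimate, and the add-one smoothing are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def information_gain_bits(posterior_samples, marginal_samples, true_value, bin_edges):
    """Hypothetical per-cell IG: log2 p_posterior(y) - log2 p_marginal(y)."""
    # Histogram both sample sets over the same bins.
    post_counts, _ = np.histogram(posterior_samples, bins=bin_edges)
    marg_counts, _ = np.histogram(marginal_samples, bins=bin_edges)
    n_bins = len(bin_edges) - 1
    # Add-one smoothing (an assumption) avoids log(0) for empty bins.
    post_p = (post_counts + 1) / (post_counts.sum() + n_bins)
    marg_p = (marg_counts + 1) / (marg_counts.sum() + n_bins)
    # Locate the bin that contains the measured feature value.
    idx = np.clip(np.searchsorted(bin_edges, true_value, side="right") - 1, 0, n_bins - 1)
    return float(np.log2(post_p[idx]) - np.log2(marg_p[idx]))

# Synthetic illustration only; these numbers are not data from the paper.
rng = np.random.default_rng(0)
bin_edges = np.linspace(0.0, 1.0, 11)
marginal = rng.uniform(0.0, 1.0, 10_000)   # feature values over the whole dataset
confident = rng.normal(0.35, 0.02, 500)    # posterior centred on the true value
misplaced = rng.normal(0.85, 0.02, 500)    # posterior centred far from it
true_value = 0.35

ig_confident = information_gain_bits(confident, marginal, true_value, bin_edges)
ig_marginal = information_gain_bits(marginal, marginal, true_value, bin_edges)
ig_misplaced = information_gain_bits(misplaced, marginal, true_value, bin_edges)
```

Under this assumed definition, a posterior that concentrates on the true value scores above zero, a posterior identical to the marginal scores exactly zero, and a confident but wrong posterior scores below zero. Averaging such per-cell scores would give a dataset-level number in bits, which is one way a score of this kind could be compared across models and features.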