New Benchmarking Shows Limited Generalization Power of TCR Antigenic Epitope Prediction Models

2026-06-03 · Machine Learning

AI summary

The authors explain that computationally predicting which antigens a T cell receptor (TCR) will recognize could advance the understanding of T cell biology and improve immune-based treatments. However, current prediction models are not sensitive or specific enough for broad use. They identify a major obstacle: the lack of rigorously defined, unseen benchmark datasets for fairly evaluating how well these models perform and generalize. The authors describe two complementary classes of datasets that can properly evaluate model performance and support the development of better prediction tools.

T cell receptor, antigen specificity, computational prediction, benchmark datasets, model evaluation, immune engineering, sensitivity, specificity, T cell biology
Authors
Yiming Liao, Yiheng Li, Ning Jiang, Bo Li, Keke Chen
Abstract
Accurate computational prediction of T cell receptor (TCR) antigen specificity would transform the study of T cell biology and enable scalable immune engineering, yet existing models lack sufficient sensitivity and specificity for broad applications. A major limitation is the absence of rigorously defined, unseen benchmark datasets that allow unbiased evaluation of model performance and generalizability. Here, we describe two complementary classes of datasets that meet this criterion and argue that they provide both a robust framework for model assessment and a foundation for next-generation TCR-antigen prediction algorithm development.