HonestAffinity: Leak-Aware Evaluation of Protein and Pocket Priors for Binding Affinity Prediction
2026-06-02 • Computational Engineering, Finance, and Science
AI summary
The authors developed HonestAffinity, a compact model that predicts protein-ligand binding affinity from sequence-level (1D) inputs. They tested three variants of the model to see how two input features, frozen protein embeddings from the ESM-2 language model and a learned pocket-position marker, affect performance under a strict evaluation protocol that avoids data leakage. Their results show that both features help on common, canonical data splits but reduce Pearson R on the stricter no-leak splits. They recommend reporting results on both types of splits and choosing the model variant that matches the intended use case. The goal is to better understand when and why particular model features help or hurt predictions.
protein-ligand binding, deep learning, ESM-2 embeddings, pocket marker, data leakage, PDBbind dataset, sequence-based prediction, Pearson correlation, convolutional network, Transformer model
Authors
Junhao Wei, Baili Lu, Zhenhong Peng, Wanyan Li, Zhirong Huang, Yanxiao Li, Yifu Zhao, Dexing Yao, Haochen Li, Xudong Ye, Sio-Kei Im, Yapeng Wang, Xu Yang
Abstract
Sequence-based deep learning offers a scalable alternative to structure-based scoring for protein-ligand binding affinity prediction. However, progress is hard to interpret when architectural priors are evaluated on canonical PDBbind-style splits that leak similarity classes across folds. We present HonestAffinity, a compact 1D-input predictor designed to isolate two priors under a leak-aware protocol: frozen ESM-2 (650M) protein embeddings and a learned binary pocket-position marker. We evaluate a multi-scale convolutional/Transformer template in three variants: HonestAffinity-Pocket, HonestAffinity-NoPocket, and HonestAffinity-Pocket-NoESM. All three train on 11,513 LP-PDBBind complexes in ~3 GPU-hours. We benchmark against five baselines on the LP-PDBBind 3-tier no-leak hold-out, CASF-2016, and a CASF-2016 non-train subset. Our central finding is a split-conditioned reversal rather than a uniformly best prior: HonestAffinity-Pocket achieves the best mean Pearson R on the validation and CASF-2016 splits, whereas HonestAffinity-Pocket-NoESM achieves the best mean Pearson R on every strict LP no-leak tier (test_cl1-cl3). Both the pocket marker and the ESM-2 input improve performance on familiar splits but reduce Pearson R on the strict no-leak tiers. We argue that models should be reported with paired canonical and leak-proof ablations, and that variants matched to the deployment regime describe these reversals better than a single default. Code and scripts are linked in the footnote; checkpoints will be released upon acceptance.
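The paired reporting recommended in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names (`pearson_r`, `paired_report`, `best_per_split`), the variant and split labels, and all numbers are hypothetical toy values chosen only to show how a split-conditioned reversal appears when Pearson R is tabulated per variant and per split.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation between two equal-length sequences of floats."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        return float("nan")  # correlation is undefined for a constant input
    return cov / math.sqrt(var_t * var_p)


def paired_report(labels, predictions):
    """Tabulate Pearson R for every (variant, split) pair.

    labels:      {split_name: [measured affinity, ...]}
    predictions: {variant_name: {split_name: [predicted affinity, ...]}}
    """
    return {
        variant: {split: pearson_r(labels[split], preds[split]) for split in labels}
        for variant, preds in predictions.items()
    }


def best_per_split(report):
    """Name the variant with the highest Pearson R on each split."""
    splits = next(iter(report.values())).keys()
    return {s: max(report, key=lambda v: report[v][s]) for s in splits}


# Toy data only (hypothetical names and values, not results from the paper).
labels = {
    "canonical": [4.0, 5.0, 6.0, 7.0],
    "no_leak": [4.0, 5.0, 6.0, 7.0],
}
predictions = {
    "variant_a": {"canonical": [4.1, 5.0, 6.2, 6.9], "no_leak": [5.0, 4.0, 7.0, 6.0]},
    "variant_b": {"canonical": [5.0, 4.0, 7.0, 6.0], "no_leak": [4.1, 5.0, 6.2, 6.9]},
}

report = paired_report(labels, predictions)
winners = best_per_split(report)
for variant, row in report.items():
    print(variant, {split: round(r, 3) for split, r in row.items()})
print("best per split:", winners)
```

In this toy setup the winner differs between the two splits, which is the pattern a single headline number would hide. A real protocol would presumably average Pearson R over several training seeds before comparing variants; the sketch scores a single run per variant for brevity.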