Explainable AI for Jet Tagging: A Comparative Study of GNNExplainer, GNNShap, and GradCAM in the Lund Jet Plane
2026-04-28 • Machine Learning
AI summary
The authors study how neural networks used to identify particle jets at the Large Hadron Collider make their decisions. They adapt several explanation methods to a specialized graph representation of jets (the Lund plane), in which each node corresponds to a physically meaningful parton splitting. By comparing these explanations to known physics and traditional substructure observables, they show that the models capture real physical features across different energy regimes. They also provide open-source tools so that others can reproduce and extend these explainability analyses.
Graph Neural Networks, ParticleNet, Transformer Networks, Jet Tagging, Lund Plane, Parton Splitting, Shapley Values, N-subjettiness, Energy Correlation Functions, Explainability
Authors
Pahal D. Patel, Sanmay Ganguly
Abstract
Graph neural networks such as ParticleNet and transformer-based networks on point clouds such as ParticleTransformer achieve state-of-the-art performance on jet tagging benchmarks at the Large Hadron Collider, yet the physical reasoning behind their predictions remains opaque. We present three classes of explanation methods -- perturbation-based (GNNExplainer), Shapley-value-based (GNNShap), and gradient-based (GradCAM) -- adapted to operate on LundNet's Lund-plane graph representation. Leveraging the fact that each node in the Lund plane corresponds to a physically meaningful parton splitting, we construct Monte Carlo truth explanation masks and introduce a physics-informed evaluation framework that goes beyond standard fidelity metrics. We perform the analysis in three transverse-momentum bins ($p_\mathrm{T} \in [500,700]$, $[800,1000]$, and the inclusive region $[500,1000]$ GeV), revealing how explanation quality and focus shift between the non-perturbative and perturbative regimes. We further quantify the correlation between explainer-assigned node importance and classical jet substructure observables -- the $N$-subjettiness ratios $\tau_{21}$ and $\tau_{32}$ and the energy correlation functions -- establishing the degree to which the model has learned known QCD features. We find that, overall, the node importances assigned by the explainability methods correlate with these analytic observables, with the expected shifts across phase-space regimes, indicating that a trained neural network indeed learns aspects of the jet-substructure moments. Our open-source implementation enables reproducible explainability studies for graph-based jet taggers.
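To make the kind of pipeline described above concrete, here is a minimal sketch of extracting per-node importances from a Lund-plane graph with GNNExplainer via the PyTorch Geometric explainability API. The `TinyLundTagger` model, the toy graph, and the node features are hypothetical stand-ins for the actual LundNet tagger and jet data, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): per-splitting importances for a
# Lund-plane graph via GNNExplainer in PyTorch Geometric (>= 2.3).
import torch
import torch.nn.functional as F
from torch_geometric.explain import Explainer, GNNExplainer
from torch_geometric.nn import GCNConv, global_mean_pool


class TinyLundTagger(torch.nn.Module):
    """Hypothetical stand-in for a LundNet-style tagger: two graph
    convolutions, global pooling, and a single-logit head."""

    def __init__(self, in_dim: int = 3, hidden: int = 32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, x, edge_index, batch):
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        return self.head(global_mean_pool(h, batch))  # raw logit per jet


model = TinyLundTagger()

explainer = Explainer(
    model=model,
    algorithm=GNNExplainer(epochs=200),
    explanation_type="model",      # explain the model's own prediction
    node_mask_type="attributes",   # learn a mask per node feature
    edge_mask_type="object",       # and one weight per Lund-tree edge
    model_config=dict(
        mode="binary_classification",
        task_level="graph",
        return_type="raw",
    ),
)

# Toy Lund tree: 5 splittings with (ln 1/Delta, ln kT, ln m)-like features;
# edges follow the declustering history of the jet.
x = torch.randn(5, 3)
edge_index = torch.tensor([[0, 0, 1, 1],
                           [1, 2, 3, 4]])
batch = torch.zeros(5, dtype=torch.long)

explanation = explainer(x, edge_index, batch=batch)

# One importance per splitting: collapse the feature-wise node mask.
node_importance = explanation.node_mask.sum(dim=-1)
print(node_importance)
```

For a correlation study of the kind reported in the abstract, one would accumulate a per-jet summary of such importances (for example, the importance share of the hardest splitting) over many jets and compare it against the analytic observables, e.g. with `scipy.stats.spearmanr`. For reference, $N$-subjettiness is defined in the standard way as $\tau_N = \frac{1}{d_0}\sum_k p_{\mathrm{T},k}\,\min\!\left(\Delta R_{1,k},\dots,\Delta R_{N,k}\right)$, so the ratios $\tau_{21}$ and $\tau_{32}$ probe two- and three-prong substructure, respectively.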