Radiomic Feature Selection Using Gradient Loss of Deep Neural Network for Lung Cancer Stage Detection

2026-06-03

Computer Vision and Pattern Recognition, Machine Learning
AI summary

The authors developed a method called Gradient-Loss Recursive Feature Elimination (GL-RFE) to select the most informative radiomic features from lung cancer CT scans for stage detection. They used a deep neural network to measure how strongly each feature affects the prediction loss and removed the least useful features step by step. Using the top 15 of 106 extracted features, the approach classified early-stage versus advanced-stage lung cancer with about 90% accuracy on the test set. The method also reduces redundancy among the selected features and could be useful for other medical datasets with many variables but few samples.

radiomics, feature selection, deep neural network, CT scans, lung cancer staging, gradient sensitivity analysis, recursive feature elimination, classification, high-dimensional data, PyRadiomics
Authors
Hina Shakir, Mohammad Mohatram, Javeed Hussain, Syed Rizwan Ali, Muhammad Irfan Memon
Abstract
Radiomics enables extraction of quantitative imaging biomarkers from medical images and has become an important tool for computer-aided cancer diagnosis. However, radiomics datasets are typically high-dimensional with limited samples, making feature selection a critical step for building reliable predictive models. This study proposes a Gradient-Loss Recursive Feature Elimination (GL-RFE) framework that integrates gradient sensitivity analysis from a deep neural network to identify the most influential radiomic features for lung cancer stage detection. A total of 106 radiomic features were extracted from chest Computed Tomography (CT) scans using the PyRadiomics extension of the 3D Slicer platform. The proposed method evaluates feature importance by computing gradients of the network loss with respect to input features and recursively eliminates features with minimal contribution. The resulting top-15 radiomic features are used to train a deep neural network classifier for distinguishing early-stage and advanced-stage lung cancer. The proposed framework achieves strong classification performance, with accuracy of 90.22%, precision of 90.10%, recall of 90.24%, and F1-score of 90.16% on the test dataset. Visualization analyses, including correlation heat maps and distribution plots, further confirm reduced feature redundancy and improved class separability. Compared to conventional feature selection techniques, GL-RFE effectively captures nonlinear feature interactions and enhances model generalization. The presented protocol provides a reproducible and interpretable methodology for radiomics-based cancer stage detection and is particularly suitable for high-dimensional, small-sample biomedical datasets, with potential applications in other domains such as genomics and multimodal clinical analysis.
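The elimination loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture, how per-sample gradients are aggregated, how many features are dropped per round, or whether the network is retrained between rounds. The sketch assumes a one-hidden-layer NumPy network standing in for the deep network, the mean absolute gradient of the loss with respect to each input as the importance score, one feature dropped per round, and retraining from scratch each round. All function names (`train`, `input_gradient_scores`, `gl_rfe`) and the synthetic data are hypothetical.

```python
import numpy as np

def forward(params, X):
    """One tanh hidden layer, sigmoid output (stand-in for the deep network)."""
    H = np.tanh(X @ params["W1"] + params["b1"])
    logits = (H @ params["W2"] + params["b2"]).ravel()
    return H, 1.0 / (1.0 + np.exp(-logits))

def train(X, y, n_hidden=8, epochs=300, lr=0.5, seed=0):
    """Full-batch gradient descent on mean binary cross-entropy."""
    rng = np.random.default_rng(seed)
    params = {
        "W1": rng.normal(0.0, 0.1, (X.shape[1], n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, 1)),
        "b2": np.zeros(1),
    }
    n = len(y)
    for _ in range(epochs):
        H, P = forward(params, X)
        d_logits = ((P - y) / n)[:, None]               # dL/dlogit
        d_H = d_logits @ params["W2"].T * (1 - H ** 2)  # backprop through tanh
        params["W2"] -= lr * (H.T @ d_logits)
        params["b2"] -= lr * d_logits.sum(axis=0)
        params["W1"] -= lr * (X.T @ d_H)
        params["b1"] -= lr * d_H.sum(axis=0)
    return params

def input_gradient_scores(params, X, y):
    """Mean absolute gradient of the per-sample loss w.r.t. each input feature.

    Assumption: the aggregation rule (mean of absolute values) is chosen
    for illustration; the abstract does not state it.
    """
    H, P = forward(params, X)
    d_logits = (P - y)[:, None]
    d_H = d_logits @ params["W2"].T * (1 - H ** 2)
    d_X = d_H @ params["W1"].T                          # shape (samples, features)
    return np.abs(d_X).mean(axis=0)

def gl_rfe(X, y, n_keep, drop_per_round=1):
    """Recursively drop the features with the smallest gradient scores."""
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        params = train(X[:, active], y)
        scores = input_gradient_scores(params, X[:, active], y)
        k = min(drop_per_round, len(active) - n_keep)
        drop = set(np.argsort(scores)[:k].tolist())     # positions within `active`
        active = [f for i, f in enumerate(active) if i not in drop]
    return active

# Synthetic stand-in data: 6 standardized features, only the first two carry signal.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
selected = gl_rfe(X, y, n_keep=2)
print("retained feature indices:", selected)
```

In the setting the abstract describes, `X` would hold the 106 standardized radiomic features and `n_keep` would be 15. Retraining each round is the costlier choice, but it lets the scores reflect the reduced feature set rather than a model fitted with features that have since been removed.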