G-Loss: Graph-Guided Fine-Tuning of Language Models
2026-04-28 • Computation and Language
Computation and Language • Artificial Intelligence • Machine Learning
AI summary
The authors propose G-Loss, a new way to fine-tune language models that considers the bigger picture by connecting similar documents in a graph. Unlike traditional methods that focus only on small local groups of examples, G-Loss uses these connections to help the model better understand the meaning of documents. The authors tested this idea on several text classification tasks and found that it helps the model learn faster and make more accurate predictions. Overall, they show that considering global relationships between documents improves language model fine-tuning.
Keywords
BERT • loss function • cross-entropy • label propagation • embedding manifold • graph-based learning • semantic structure • document similarity • fine-tuning • text classification
Authors
Sharma Aditya, Agarwal Vinti, Kumar Rajesh
Abstract
Traditional loss functions used for fine-tuning pre-trained language models such as BERT, including cross-entropy, contrastive, triplet, and supervised contrastive losses, operate only within local neighborhoods and fail to account for the global semantic structure of the data. We present G-Loss, a graph-guided loss function that incorporates semi-supervised label propagation to exploit structural relationships within the embedding manifold. G-Loss builds a document-similarity graph that captures global semantic relationships, guiding the model toward more discriminative and robust embeddings. We evaluate G-Loss on five benchmark datasets covering key downstream classification tasks: MR (sentiment analysis), R8 and R52 (topic categorization), Ohsumed (medical document classification), and 20NG (news categorization). In the majority of experimental setups, G-Loss converges faster and produces more semantically coherent embedding spaces, yielding higher classification accuracy than models fine-tuned with traditional loss functions.
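The abstract only sketches the mechanism, so the following is a minimal, hypothetical PyTorch sketch of how a graph-guided loss built on label propagation over a batch similarity graph might look. The function name g_loss and the hyperparameters (temperature tau, mixing coefficient alpha, number of propagation steps) are illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn.functional as F


def g_loss(embeddings, labels, num_classes, num_steps=10, alpha=0.99, tau=0.1):
    """Hypothetical graph-guided loss in the spirit of G-Loss.

    A batch-level sketch, not the paper's exact formulation:
    (1) build a similarity graph over the embeddings,
    (2) propagate one-hot labels along the graph,
    (3) penalize disagreement between the propagated distributions
        and the true labels.
    """
    n = embeddings.size(0)

    # Cosine-similarity graph over the batch, temperature-sharpened;
    # self-loops are removed by masking the diagonal before exp().
    z = F.normalize(embeddings, dim=1)
    logits = z @ z.t() / tau
    eye = torch.eye(n, dtype=torch.bool, device=z.device)
    W = torch.exp(logits.masked_fill(eye, float("-inf")))

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = W.sum(dim=1).clamp_min(1e-12).pow(-0.5)
    S = d_inv_sqrt.unsqueeze(1) * W * d_inv_sqrt.unsqueeze(0)

    # Iterative label propagation (Zhou et al., 2004):
    # F_{t+1} = alpha * S @ F_t + (1 - alpha) * Y.
    Y = F.one_hot(labels, num_classes).float()
    Fmat = Y.clone()
    for _ in range(num_steps):
        Fmat = alpha * (S @ Fmat) + (1 - alpha) * Y

    # Cross-entropy between propagated label distributions and true labels.
    probs = Fmat.clamp_min(1e-12)
    probs = probs / probs.sum(dim=1, keepdim=True)
    return F.nll_loss(probs.log(), labels)
```

In practice such a term would presumably be combined with a standard classification loss on the model's head, e.g. loss = ce_loss + lam * g_loss(cls_embeddings, labels, num_classes), with the weight lam tuned per dataset.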