Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures
2026-03-18 • Computation and Language
AI summary
The authors studied how large language models translate gender, which can be tricky because some languages show gender explicitly while others don't. They introduced a new way to measure a model's default gender guesses, called "Prior Bias." They found that newer decoder-only models aren't always better at handling gender than older encoder-decoder models. However, fine-tuning these models helps them pay more attention to context and reduces the tendency to default to masculine translations.
Large Language Models, Machine Translation, Gender Bias, Prior Bias, Decoder-only Models, Encoder-decoder Models, Instruction Tuning, Contextual Awareness, Bias Evaluation
Authors
Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frédéric Blain, Eva Vanmassenhove
Abstract
While Large Language Models achieve state-of-the-art results across a wide range of NLP tasks, they remain prone to systematic biases. Among these, gender bias is particularly salient in machine translation (MT), due to systematic differences across languages in whether and how gender is marked. As a result, translation often requires disambiguating implicit source signals into explicit gender-marked forms. In this context, standard benchmarks may capture broad disparities but fail to reflect the full complexity of gender bias in modern MT. In this paper, we extend recent frameworks on bias evaluation by: (i) introducing a novel measure coined "Prior Bias", capturing a model's default gender assumptions, and (ii) applying the framework to decoder-only MT models. Our results show that, despite their scale and state-of-the-art status, decoder-only models do not generally outperform encoder-decoder architectures on gender-specific metrics; however, post-training (e.g., instruction tuning) not only improves contextual awareness but also reduces the masculine Prior Bias.
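The paper's exact Prior Bias formulation is not reproduced here, but the underlying idea, measuring which gender a model defaults to when the source gives no disambiguating signal, can be sketched as follows. All function names, marker lists, and example sentences below are hypothetical illustrations, not the authors' implementation: gender-ambiguous English sources are translated into a gender-marking target language (Spanish in this toy example), the target forms are classified, and the masculine-vs-feminine imbalance is scored.

```python
from collections import Counter

# Toy English sources whose subject is gender-ambiguous; a gender-marking
# target language (here, Spanish) must commit to one form when translating.
AMBIGUOUS_SOURCES = [
    "The doctor finished the shift.",
    "The teacher is tired.",
    "The engineer solved the problem.",
]

def classify_gender(translation: str) -> str:
    """Hypothetical surface classifier: label the gender marking of a
    Spanish sentence by scanning for a few feminine forms."""
    fem_markers = ("la ", "cansada", "doctora", "ingeniera")
    text = translation.lower()
    return "feminine" if any(m in text for m in fem_markers) else "masculine"

def prior_bias(translations: list[str]) -> float:
    """Fraction of masculine choices minus fraction of feminine choices:
    +1.0 = always masculine, -1.0 = always feminine, 0.0 = balanced."""
    counts = Counter(classify_gender(t) for t in translations)
    n = sum(counts.values())
    return (counts["masculine"] - counts["feminine"]) / n

# Pretend model output for the three sources above (all masculine forms),
# mimicking the masculine-default behavior the paper measures:
outputs = [
    "El doctor terminó el turno.",
    "El profesor está cansado.",
    "El ingeniero resolvió el problema.",
]
print(prior_bias(outputs))  # 1.0: every translation defaulted to masculine
```

A real evaluation would of course use morphological analysis or annotated test suites rather than substring markers; the sketch only shows the shape of a "default gender assumption" score in the absence of contextual cues.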