Learning Who Disagrees: Demographic Importance Weighting for Modeling Annotator Distributions with DiADEM

2026-04-09 · Artificial Intelligence

Artificial Intelligence, Computation and Language
AI summary

The authors explain that when people label things that involve opinions, they often disagree because of their different backgrounds and experiences. Instead of ignoring these differences by choosing just one main label, the authors created a model called DiADEM that learns which aspects of a person's identity, like race or age, are important for predicting disagreements. Their model looks at both the person and the item being labeled to better understand why disagreements happen. Tests on the DICES and VOICED benchmarks show that DiADEM is better than prompted large language models and other neural models at capturing these differing viewpoints. This work suggests that knowing who the annotators are is key to respecting diverse human opinions in natural language processing.

subjective labeling, annotator disagreement, demographic factors, DiADEM, disagreement prediction, neural architecture, chain-of-thought reasoning, perspectivist metrics, NLP annotation, representation learning
Authors
Samay U. Shetty, Tharindu Cyril Weerasooriya, Deepak Pandita, Christopher M. Homan
Abstract
When humans label subjective content, they disagree, and that disagreement is not noise. It reflects genuine differences in perspective shaped by annotators' social identities and lived experiences. Yet standard practice still flattens these judgments into a single majority label, and recent LLM-based approaches fare no better: we show that prompted large language models, even with chain-of-thought reasoning, fail to recover the structure of human disagreement. We introduce DiADEM, a neural architecture that learns "how much each demographic axis matters" for predicting who will disagree and on what. DiADEM encodes annotators through per-demographic projections governed by a learned importance vector $\boldsymbol{\alpha}$, fuses annotator and item representations via complementary concatenation and Hadamard interactions, and is trained with a novel item-level disagreement loss that directly penalizes mispredicted annotation variance. On the DICES conversational-safety and VOICED political-offense benchmarks, DiADEM substantially outperforms both LLM-as-a-judge and neural model baselines across standard and perspectivist metrics, achieving strong disagreement tracking ($r{=}0.75$ on DICES). The learned $\boldsymbol{\alpha}$ weights reveal that race and age consistently emerge as the most influential demographic factors driving annotator disagreement across both datasets. Our results demonstrate that explicitly modeling who annotators are, not just what they label, is essential for NLP systems that aim to faithfully represent human interpretive diversity.
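The three components named in the abstract (importance-weighted per-demographic projections, concatenation plus Hadamard fusion, and an item-level disagreement loss) can be illustrated with a small pure-Python sketch. The abstract gives no equations, so everything specific here is an assumption made for illustration: the softmax normalization of the importance vector, the squared-error form of the variance penalty, and all function and parameter names are hypothetical and are not taken from the authors' implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def encode_annotator(demo_onehots, projections, alpha_logits):
    """Importance-weighted sum of per-demographic projections.

    demo_onehots: one one-hot vector per demographic axis (e.g. race, age).
    projections:  one matrix per axis, each mapping its one-hot to a shared dim.
    alpha_logits: one learnable scalar per axis.
    ASSUMPTION: alpha is normalized with a softmax; the abstract does not say.
    """
    alpha = softmax(alpha_logits)
    dim = len(projections[0])  # number of rows = shared embedding size
    z = [0.0] * dim
    for a, W, x in zip(alpha, projections, demo_onehots):
        projected = matvec(W, x)
        z = [zi + a * pi for zi, pi in zip(z, projected)]
    return z, alpha


def fuse(annotator_vec, item_vec):
    """Concatenate both vectors with their element-wise (Hadamard) product."""
    hadamard = [a * b for a, b in zip(annotator_vec, item_vec)]
    return annotator_vec + item_vec + hadamard


def variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def disagreement_loss(pred_by_item, true_by_item):
    """Mean squared gap between predicted and observed per-item variance.

    ASSUMPTION: a squared-error penalty on variance; the paper's exact loss
    may differ.
    """
    gaps = [
        (variance(pred_by_item[item]) - variance(true_by_item[item])) ** 2
        for item in true_by_item
    ]
    return sum(gaps) / len(gaps)
```

In this sketch, a model that predicts the same score for every annotator of an item has zero predicted variance, so it is penalized on any item where the human annotators actually split. That is the behavior the abstract attributes to the item-level disagreement loss.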