When Does Demographic Information Help? Data and Modeling Regimes for Perspective-Aware Hate Speech Detection
2026-05-26 • Computation and Language
AI summary
The authors study when demographic information helps in subjective tasks where annotators label the same item differently, such as hate speech detection. They find that demographics help most when the training data has low disagreement, the test data has high disagreement, there is enough training data, and the demographic groups seen in training overlap with those in the test set. Based on this, they build a model that uses demographics only as a small, selective adjustment to text-only predictions, which works well especially on high-disagreement or low-confidence examples. Overall, the authors conclude that demographics do not always help; their value depends on both the data and the model used.
demographic information, annotator disagreement, hate speech detection, data split, modeling framework, training data, test data, residual model, label ambiguity, selective adjustment
Authors
Weibin Cai, Reza Zafarani
Abstract
Demographic information is often used to model annotator perspectives in subjective tasks such as hate speech detection, but its benefit is inconsistent: it improves performance in some settings and behaves as noise in others. This paper asks when demographic features help. We analyze demographic gain as a function of both data-split properties and modeling frameworks. For data splits, we measure annotator disagreement, namely how often annotators assign different labels to the same example, along with training size and train-test demographic coverage. We find that demographic gains concentrate in regimes with low training disagreement, high test disagreement, fine-grained ambiguity measurement, sufficient training data, and greater demographic overlap. Motivated by these regimes, we introduce a gated demographic residual model that treats demographics as a selective adjustment to text-only predictions. Experiments on MHS and POPQUORN show that this design is effective, especially on high-disagreement or low-confidence examples. Overall, our results suggest that demographics should not be assumed useful by default; their value depends jointly on the data regime and the modeling framework.
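The two ideas named in the abstract, a per-example disagreement measure and a gated residual on top of a text-only prediction, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the pairwise definition of disagreement, the sigmoid gate, and the function names `pairwise_disagreement` and `gated_residual_logit` are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pairwise_disagreement(labels):
    """Fraction of annotator pairs that gave different labels to one example.

    ASSUMPTION: this is one plausible reading of "how often annotators assign
    different labels to the same example"; the paper may define it differently.
    """
    pairs = list(combinations(labels, 2))
    if not pairs:  # fewer than two annotators: no disagreement is measurable
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def sigmoid(x):
    """Logistic function, used here to squash the gate into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_residual_logit(text_logit, demo_residual, gate_logit):
    """Text-only logit plus a gate-scaled demographic correction.

    ASSUMPTION: a scalar sigmoid gate and an additive residual are illustrative
    choices; in a real model all three inputs would come from learned networks.
    """
    gate = sigmoid(gate_logit)
    return text_logit + gate * demo_residual


if __name__ == "__main__":
    # Four hypothetical annotators split 2-2 on a single example.
    print(pairwise_disagreement([0, 0, 1, 1]))
    # A nearly closed gate leaves the text-only prediction almost unchanged.
    print(gated_residual_logit(2.0, 1.5, -50.0))
    # A half-open gate applies half of the demographic correction.
    print(gated_residual_logit(2.0, 1.5, 0.0))
```

In this sketch the gate decides how much of the demographic correction reaches the final prediction. When the gate is near zero the output falls back to the text-only logit, which matches the abstract's description of demographics as a selective adjustment rather than a default input.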