Modelling Opinion Dynamics at Scale with Deep MARL

2026-06-05 • Multiagent Systems

Multiagent Systems, Computer Science and Game Theory, Social and Information Networks

AI summary

The authors built a computer simulation in which many virtual agents learn how to share opinions and reach agreement through rewards rather than fixed rules. They validated the model against a subset of the Bluesky social network and found that highly conforming populations match human behaviour most closely. In large online networks, however, this level of conformity reduces collective accuracy and encourages agents to lie in order to fit in. In small, dynamic hunter-gatherer networks the effect is weaker, and conformity can even improve agreement. The authors suggest that evolved human conformity heuristics may be mismatched with today’s social media, which could contribute to the spread of misinformation.

opinion dynamics, multi-agent reinforcement learning, consensus, conformity, social networks, attention layer, graph topology, collective accuracy, dishonesty, misinformation

Authors

Lukas Seier, Brandon Kaplowitz, Sebastian Towers, Richard Bailey, Jakob Foerster

Abstract

Modelling opinion dynamics typically relies on hand-crafted local interaction rules to study emergent macroscopic phenomena such as consensus and polarisation. In contrast, multi-agent reinforcement learning (MARL) enables agents to learn such behaviours directly by optimising simple rewards. To explore the potential of MARL for opinion dynamics, we introduce a GPU-accelerated consensus and truth-finding game that scales to populations of up to 1000 agents, comparable to many real-world social sub-networks. To prevent unrealistic conventions, we extend other-play to general-sum social interactions. We next validate our model on a subset of the Bluesky network by recovering agent importance structures from graph topology alone via a learned attention layer, finding that highly conforming populations most closely match human data. In large social media networks such high levels of conformity significantly reduce collective accuracy and promote dishonest agents that lie to fit in. By contrast, small, dynamic hunter-gatherer networks are less affected; here, conformity can even improve collective agreement. This suggests a mismatch between evolved human conformity heuristics and modern social media environments as a potential contributor to misinformation.
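For readers unfamiliar with the "hand-crafted local interaction rules" that the abstract contrasts with MARL, the sketch below shows one classic example, DeGroot-style neighbour averaging. It is an illustration only and is not the authors' model or code: the function names `degroot_step` and `run_degroot`, the four-agent ring graph, and the step count are all assumptions chosen for brevity.

```python
import numpy as np


def degroot_step(opinions, adjacency):
    """One hand-crafted update: each agent adopts the mean opinion of itself and its neighbours."""
    # Add a self-loop so every agent also weighs its own current opinion.
    weights = adjacency + np.eye(len(opinions))
    # Row-normalise so each agent's weights sum to one.
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ opinions


def run_degroot(opinions, adjacency, steps=200):
    """Apply the fixed averaging rule repeatedly and return the final opinions."""
    x = np.asarray(opinions, dtype=float)
    for _ in range(steps):
        x = degroot_step(x, adjacency)
    return x


# Hypothetical toy network: four agents on a ring, alternating initial opinions.
adjacency = np.array(
    [
        [0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
    ],
    dtype=float,
)
initial = np.array([0.0, 1.0, 0.0, 1.0])
final = run_degroot(initial, adjacency)
print(final)
```

On this symmetric toy graph the rule drives every agent to the population mean, so consensus is built into the update itself. In the MARL setting described in the abstract, by contrast, agents are not given such a rule and any consensus behaviour has to emerge from optimising rewards.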
