Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

2026-06-12 • Multiagent Systems

Multiagent Systems, Artificial Intelligence

AI summary

The authors study how groups of cooperating agents can work together when the team faces several goals that might clash. They introduce a new method called PCMA that helps agents coordinate their individual preferences so that the team balances these goals better. They also give a theory explaining why, under suitable conditions, having diverse preferences among agents can improve the team's overall success. Tests in several simulated cooperative environments and a practical traffic-control scenario showed that PCMA helps teams perform better and manage trade-offs more smoothly.

multi-agent reinforcement learning, multi-objective optimization, team decision making, policy optimization, preference coordination, cooperative agents, trade-off management, game theory, traffic control simulation

Authors

Pengxin Wang, Lihao Guo, Yi Xie, Bo Liu, Siyang Cao, Jingdi Chen

Abstract

Cooperative multi-objective multi-agent reinforcement learning (MOMARL) models team decision making under multiple, potentially conflicting objectives. In this setting, conflicts arise not only across objectives but also across agents with different observations, roles, and contributions. We propose Preference Coordinated Multi-agent Policy Optimization (PCMA), which learns coordinated agent-specific preferences to enable complementary trade-offs among agents. Theoretically, we formulate cooperative MOMARL as a team-optimal game and show that, under suitable conditions, preference diversity can induce team improvement through a first-order improvement decomposition. Experiments on multiple cooperative MOMA environments and a practical traffic-control scenario show that PCMA improves both performance and trade-off coordination.
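
The abstract does not say how agent-specific preferences are parameterized or learned. As a generic illustration of what a preference over objectives means in multi-objective reinforcement learning, the sketch below applies linear scalarization, a standard technique, with a different fixed weight vector per agent. It is not the PCMA algorithm: the function names, agent names, objectives, and numbers are all hypothetical, and the weights here are hand-picked rather than learned or coordinated.

```python
# Illustrative only: linear scalarization with per-agent preference weights.
# This is NOT the PCMA algorithm; all names and numbers are hypothetical.


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def scalarize(reward_vector, preference):
    """Weighted sum of a vector-valued reward under one preference vector."""
    if len(reward_vector) != len(preference):
        raise ValueError("reward and preference must have the same length")
    return sum(r * w for r, w in zip(reward_vector, preference))


# Hypothetical shared team reward with two objectives,
# e.g. (throughput, negative waiting time).
team_reward = [4.0, -2.0]

# Hypothetical agent-specific preferences: each agent weights the
# objectives differently, so the same outcome is valued differently.
preferences = {
    "agent_0": normalize([3.0, 1.0]),  # leans toward objective 0
    "agent_1": normalize([1.0, 3.0]),  # leans toward objective 1
}

scalar_rewards = {
    name: scalarize(team_reward, weights)
    for name, weights in preferences.items()
}

for name, value in scalar_rewards.items():
    print(name, value)
```

With these made-up numbers, the two agents assign different scalar values to the same vector reward, which is the sense in which differing preferences let agents pursue different trade-offs.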
