Multi-agent reinforcement learning

Two rival teams of agents face off in a MARL experiment

Multi-agent reinforcement learning (MARL) is a subfield of reinforcement learning that studies the behavior of multiple learning agents coexisting in a shared environment.^[1] Each agent is motivated by its own rewards and takes actions to advance its own interests; in some environments these interests are opposed to those of other agents, resulting in complex group dynamics.
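The idea that each agent receives its own reward from a shared environment can be illustrated with a minimal sketch. The game (matching pennies), the payoff values, the agent names `agent_0` and `agent_1`, and the `step` function are all assumptions chosen for illustration; they are not taken from the cited sources or from any particular library.

```python
from typing import Dict, Tuple

# Hypothetical payoff table for a two-agent "matching pennies" game.
# Keys are joint actions (action of agent_0, action of agent_1);
# values are (reward for agent_0, reward for agent_1).
# agent_0 wins when the actions match, agent_1 wins when they differ.
PAYOFFS: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("heads", "heads"): (1, -1),
    ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1),
    ("tails", "tails"): (1, -1),
}


def step(joint_action: Dict[str, str]) -> Dict[str, int]:
    """Apply one joint action and return a separate reward for each agent."""
    reward_0, reward_1 = PAYOFFS[(joint_action["agent_0"], joint_action["agent_1"])]
    return {"agent_0": reward_0, "agent_1": reward_1}


rewards = step({"agent_0": "heads", "agent_1": "tails"})
print(rewards)  # {'agent_0': -1, 'agent_1': 1}
```

Because the two rewards always sum to zero in this toy game, any gain for one agent is a loss for the other, which is the fully opposed case; cooperative or mixed settings would use a different payoff table.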

Multi-agent reinforcement learning is closely related to game theory, especially repeated games, and to multi-agent systems. Its study combines the search for algorithms that maximize rewards with a more sociological set of concepts. While research in single-agent reinforcement learning is concerned with finding the algorithm that earns the highest cumulative reward for one agent, research in multi-agent reinforcement learning evaluates and quantifies social metrics, such as cooperation,^[2] reciprocity,^[3] equity,^[4] social influence,^[5] language^[6] and discrimination.^[7]
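As a rough sketch of what quantifying a social metric in a repeated game might look like, the example below plays an iterated prisoner's dilemma and reports a cooperation rate and a return gap alongside the individual returns. Several things here are assumptions: the payoff values are the conventional textbook ones, the two policies are fixed and hand-written rather than learned, and `cooperation_rate` and `return_gap` are simple stand-ins, not the measures used in the cited papers.

```python
from typing import Callable, Dict, List, Tuple

# Prisoner's dilemma payoffs (conventional textbook values, assumed here):
# both cooperate -> 3 each; both defect -> 1 each;
# a lone defector gets 5 and the lone cooperator gets 0.
PAYOFF: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# A policy maps the history of (own action, opponent action) pairs to an action.
History = List[Tuple[str, str]]
Policy = Callable[[History], str]


def tit_for_tat(history: History) -> str:
    """Cooperate first, then copy the opponent's previous action."""
    return "C" if not history else history[-1][1]


def always_defect(history: History) -> str:
    """Defect in every round regardless of history."""
    return "D"


def play(policy_a: Policy, policy_b: Policy, rounds: int) -> dict:
    """Play a repeated game and summarize individual and social outcomes."""
    history_a: History = []  # seen from agent A's perspective
    history_b: History = []  # seen from agent B's perspective
    returns = [0, 0]
    cooperative_moves = 0
    for _ in range(rounds):
        a, b = policy_a(history_a), policy_b(history_b)
        reward_a, reward_b = PAYOFF[(a, b)]
        returns[0] += reward_a
        returns[1] += reward_b
        cooperative_moves += (a == "C") + (b == "C")
        history_a.append((a, b))
        history_b.append((b, a))
    return {
        "returns": tuple(returns),                              # per-agent reward
        "cooperation_rate": cooperative_moves / (2 * rounds),   # share of "C" moves
        "return_gap": abs(returns[0] - returns[1]),             # crude equity proxy
    }


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_defect, 10))
```

The individual returns are what a single-agent view would optimize, while the cooperation rate and return gap describe the group as a whole; in an actual MARL study the fixed policies would be replaced by learning agents.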

[1] Albrecht, Stefano V.; Christianos, Filippos; Schäfer, Lukas (2024). "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches". MIT Press. https://www.marl-book.com/

[2] Lowe, Ryan; Wu, Yi (2020). "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments". arXiv:1706.02275v4 [cs.LG].

[3] Baker, Bowen (2020). "Emergent Reciprocity and Team Formation from Randomized Uncertain Social Preferences". NeurIPS 2020 proceedings. arXiv:2011.05373.

[4] Hughes, Edward; Leibo, Joel Z.; et al. (2018). "Inequity aversion improves cooperation in intertemporal social dilemmas". NeurIPS 2018 proceedings. arXiv:1803.08884.

[5] Jaques, Natasha; Lazaridou, Angeliki; Hughes, Edward; et al. (2019). "Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning". Proceedings of the 36th International Conference on Machine Learning. arXiv:1810.08647.

[6] Lazaridou, Angeliki (2017). "Multi-Agent Cooperation and The Emergence of (Natural) Language". ICLR 2017. arXiv:1612.07182.

[7] Duéñez-Guzmán, Edgar; et al. (2021). "Statistical discrimination in learning agents". arXiv:2110.11404v1 [cs.LG].
