Solving the problem of cooperation is of fundamental importance to the
creation and maintenance of functional societies, with examples of cooperative
dilemmas ranging from navigating busy road junctions to negotiating carbon
reduction treaties. As the use of AI becomes more pervasive throughout society,
the need for socially intelligent agents that are able to navigate these
complex cooperative dilemmas is becoming increasingly evident. In the natural
world, direct punishment is a ubiquitous social mechanism that has been shown
to benefit the emergence of cooperation within populations. However, no prior
work has investigated its impact on the development of cooperation within
populations of artificial learning agents experiencing social dilemmas.
Additionally, within natural populations the use of any form of punishment is
strongly coupled with the related social mechanisms of partner selection and
reputation. However, no previous work has considered the impact of combining
multiple social mechanisms on the emergence of cooperation in multi-agent
systems. Therefore, in this paper we present a comprehensive analysis of the
behaviours and learning dynamics associated with direct punishment in
multi-agent reinforcement learning systems and how it compares to third-party
punishment, when both are combined with the related social mechanisms of
partner selection and reputation. We provide an extensive and systematic
evaluation of the impact of these key mechanisms on the dynamics of the
strategies learned by agents. Finally, we discuss the implications of using
these mechanisms for the design of cooperative AI systems.