Abstract
This paper introduces a novel model for asymmetric multiagent reinforcement learning. The model addresses problems in which the information states of the agents involved in the learning task are not equal: some agents (leaders) have information about how their opponents (followers) will select their actions and, based on this information, encourage followers to select actions that lead to improved payoffs for the leaders. This kind of configuration arises, for example, in semi-centralized multiagent systems with an external global utility associated with the system. We present a brief literature survey of multiagent reinforcement learning based on Markov games and then propose an asymmetric learning model that utilizes the theory of Markov games. Additionally, we construct a practical learning method based on the proposed learning model and study its convergence properties. Finally, we test our model on a simple example problem and a larger two-layer pricing application.
