Abstract
In multi-agent systems, the presence of learning agents can cause the environment to be non-Markovian from an agent's perspective, thus violating the Markov property that traditional single-agent learning methods rely upon. This paper formalizes some known intuitions about concurrently learning agents by providing formal conditions under which the environment is non-Markovian from the perspective of an independent (non-communicative) learner. New concepts are introduced, such as divergent learning paths and the observability of the effects of others' actions. A case study is presented to illustrate the formal concepts. These findings are significant because they help to explain the failures and successes of existing learning algorithms and suggest directions for future work.
