Abstract
This paper investigates the performance of synthetic agents playing and learning a turn-based zero-sum game, and highlights the ability of opponent-based learning models to achieve competitive playing performance in social environments. Synthetic agents are generated from combinations of key parameters, such as the exploitation-vs-exploration trade-off, the learning back-up and discount rates, and the speed of learning, and interact over a very large number of games on a grid infrastructure; the experimental data are then analysed to generate clusters of agents that reveal associations between eventual performance ranking and learning-parameter set-up. The evolution of these clusters indicates that agents with a predisposition to knowledge exploration and slower learning tend to perform better than exploiters, which tend to prefer fast learning. Observing these clusters alongside the agents’ playing behaviours also makes it possible to investigate how best to select opponents from a group; initial results suggest that good progress and stable evolution arise when an agent faces opponents of increasing capacity, and that an agent with a well-configured learning mechanism progresses better when it faces less favourably configured agents.
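As an illustration only (the abstract does not specify the authors' agent implementation), the sketch below shows how the named parameters, the exploitation-vs-exploration trade-off, the learning and back-up rates, and the discount rate, typically appear as knobs in a tabular temporal-difference learner; all names (AgentConfig, TDAgent) are hypothetical.

```python
# Hypothetical sketch of a parameterised learning agent; not the paper's actual agents.
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    epsilon: float = 0.1   # exploitation-vs-exploration trade-off
    alpha: float = 0.05    # learning rate ("speed of learning")
    gamma: float = 0.95    # discount rate
    lam: float = 0.8       # back-up parameter (eligibility-trace decay)


class TDAgent:
    """Epsilon-greedy tabular agent with TD(lambda)-style back-ups."""

    def __init__(self, actions, config: AgentConfig):
        self.actions = list(actions)
        self.cfg = config
        self.q = defaultdict(float)        # (state, action) -> estimated value
        self.traces = defaultdict(float)   # eligibility traces

    def act(self, state):
        # Explore with probability epsilon, otherwise exploit the greedy action.
        if random.random() < self.cfg.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # One-step TD error against the greedy value of the next state.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        delta = reward + self.cfg.gamma * best_next - self.q[(state, action)]
        self.traces[(state, action)] += 1.0
        # Back up the error to recently visited state-action pairs.
        for key in list(self.traces):
            self.q[key] += self.cfg.alpha * delta * self.traces[key]
            self.traces[key] *= self.cfg.gamma * self.cfg.lam
```

Varying epsilon, alpha, gamma and lam across a population of such agents would yield the kind of parameter combinations whose long-run performance the paper clusters and compares.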
