Abstract
This paper investigates the performance of synthetic agents playing and learning a turn-based zero-sum game, and highlights the ability of opponent-based learning models to achieve competitive playing performance in social environments. Synthetic agents are generated from combinations of key parameters, such as the exploitation-vs-exploration trade-off, the learning back-up and discount rates, and the speed of learning, and interact over a very large number of games on a grid infrastructure; the experimental data are then analysed to generate clusters of agents that reveal associations between eventual performance ranking and learning-parameter set-up. The evolution of these clusters indicates that agents with a predisposition to knowledge exploration and slower learning tend to perform better than exploiters, which tend to prefer fast learning. Observing these clusters alongside the agents’ playing behaviours also makes it possible to investigate how best to select opponents from a group; initial results suggest that good progress and stable evolution arise when an agent faces opponents of increasing capacity, and that an agent with a well-configured learning mechanism progresses better when it faces less favourably configured agents.
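As an illustration only (the abstract does not specify the authors' agent implementation), the sketch below shows how the named parameters, the exploitation-vs-exploration trade-off, the learning and back-up rates, and the discount rate, typically appear as knobs in a tabular temporal-difference learner; all names (AgentConfig, TDAgent) are hypothetical.

```python
# Hypothetical sketch of a parameterised learning agent; not the paper's actual agents.
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    epsilon: float = 0.1   # exploitation-vs-exploration trade-off
    alpha: float = 0.05    # learning rate ("speed of learning")
    gamma: float = 0.95    # discount rate
    lam: float = 0.8       # back-up parameter (eligibility-trace decay)


class TDAgent:
    """Epsilon-greedy tabular agent with TD(lambda)-style back-ups."""

    def __init__(self, actions, config: AgentConfig):
        self.actions = list(actions)
        self.cfg = config
        self.q = defaultdict(float)        # (state, action) -> estimated value
        self.traces = defaultdict(float)   # eligibility traces

    def act(self, state):
        # Explore with probability epsilon, otherwise exploit the greedy action.
        if random.random() < self.cfg.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # One-step TD error against the greedy value of the next state.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        delta = reward + self.cfg.gamma * best_next - self.q[(state, action)]
        self.traces[(state, action)] += 1.0
        # Back up the error to recently visited state-action pairs.
        for key in list(self.traces):
            self.q[key] += self.cfg.alpha * delta * self.traces[key]
            self.traces[key] *= self.cfg.gamma * self.cfg.lam
```

Varying epsilon, alpha, gamma and lam across a population of such agents would yield the kind of parameter combinations whose long-run performance the paper clusters and compares.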
