Abstract
Planning under uncertainty faces a scalability problem for multi-robot teams, as the information space grows exponentially with the number of robots. To address this issue, this paper proposes to decentralize multi-robot partially observable Markov decision processes (POMDPs) while maintaining cooperation between robots through POMDP policy auctions. Auctions provide a flexible way of coordinating individual policies modeled by POMDPs and have low communication requirements. In addition, the communication models in the multi-agent POMDP literature differ considerably from real inter-robot communication. We address this issue by exploiting a decentralized data fusion method to efficiently maintain a joint belief state among the robots. The paper presents two applications: environmental monitoring with unmanned aerial vehicles (UAVs), and cooperative tracking, in which several robots must jointly track a moving target of interest. The first serves as a proof of concept and illustrates the proposed ideas in simulation. The second adds real multi-robot experiments, showcasing the flexibility and robust coordination that our techniques can provide.
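The two mechanisms named in the abstract can be illustrated with a minimal sketch. The function names, the single-round auction, and the discrete two-state belief below are illustrative assumptions, not the paper's actual algorithms: bidding here is simply picking the robot with the highest reported expected value for a candidate policy, and the decentralized fusion step combines per-robot beliefs over a discrete state space by a normalized pointwise product, which assumes conditionally independent observations.

```python
def fuse_beliefs(local_beliefs):
    """Decentralized data fusion sketch: combine per-robot discrete
    beliefs via a normalized pointwise product (assumes each robot's
    observations are conditionally independent given the state)."""
    joint = [1.0] * len(local_beliefs[0])
    for belief in local_beliefs:
        joint = [j * p for j, p in zip(joint, belief)]
    total = sum(joint)
    return [j / total for j in joint]

def auction_policy(bids):
    """Single-round, single-item auction sketch: each robot bids its
    expected value for executing the offered POMDP policy from its own
    belief; the highest bidder wins the policy."""
    winner = max(bids, key=bids.get)
    return winner, bids[winner]

# Hypothetical example: three UAVs bidding on one tracking policy.
bids = {"uav_1": 4.2, "uav_2": 6.1, "uav_3": 3.7}
winner, value = auction_policy(bids)

# Hypothetical two-state target belief held by each of the three UAVs.
beliefs = [[0.6, 0.4], [0.5, 0.5], [0.7, 0.3]]
joint = fuse_beliefs(beliefs)
```

Note the low communication cost this structure implies: each robot transmits only a scalar bid per auctioned policy and its local belief vector for fusion, rather than raw observations.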
