Abstract
This paper presents a negotiations-based approach for simultaneous task subdivision and assignment in heterogeneous multi-robot systems. We first propose an abstraction of the concept of a task that allows for the generalizing of a variety of different problems. Based on such an abstraction, we have developed a negotiation protocol based on Rubinstein's alternate offers protocol. This is extended to the multi-dimensional space and employs a heuristic search step for evaluating and generating offers. Furthermore, the issue of how to extend a bilateral negotiations protocol to more than two parties is taken into consideration. The protocol was first tested in numerical simulations with different scenarios and then applied to three real-world missions.
1. Introduction
Multi-robot 1 systems (MRSs) represent a very active field of research. A variety of techniques have been proposed in order to approach the problem of coordination in different kinds of applications [19, 33]. Cooperation applications can be roughly divided into two classes: tight cooperation requires continuous coordination between the robots such as, for instance, in box pushing and formation keeping. Loose cooperation requires coordination at the beginning of the mission for planning the division of labour and at given points in time when re-planning may be needed. Exploration, mapping and surveillance are typical applications. Behaviour-based [18, 32], schemas [1] and virtual vector fields [14, 29, 20] are examples of techniques suitable for the first class of coordination problems, while market-based [9, 8], auction [11] and bargaining [13] techniques are commonly used for the second class of problems. Multi-robot coordination techniques, like virtual potential fields, are also used in sensor networks applications (see, e.g., [16]) for coordinating communication tasks or energy management for teams of mobile sensors.
Here, we focus on loose cooperation for teams of mobile robots. In this class of problems, a given task has to be partitioned into sub-tasks, and the sub-tasks have to be assigned to individual team-members for execution. Most of the coordination techniques assume that the task subdivision step is done at a high level, either on a central station or by a team leader, and as such focus on the sub-task allocation problem.
This approach, although applied with success in many applications, has two principal drawbacks: first, it is centralized (task partitioning is done in one place), and second the partitioning algorithm is usually considered “outside” the coordination protocol. Often, the details of how the original task is partitioned are not given at all.
As pointed out in [34], “the main drawback of this approach is that task decomposition is performed without knowledge of the eventual task allocation; therefore, the cost of the final plan cannot be fully considered. Since there is no backtracking, costly mistakes in the central decompositions cannot be rectified.”
One of the most popular protocols for task assignment is the contract net [26], and many algorithms [12, 27, 5, 9, 11, 24, 15] have been based on this. In all these approaches, each robot has predefined cost and revenue functions that are used to compute the expected gains and losses for performing tasks. However, the robots' preferences and limitations are considered only in the assignment stage, when the robots decide whether to accept (or opt for) a task or not.
Such features do not suit our need for a fully distributed approach that should consider the robots' capabilities early on at the task partitioning stage. A task partitioning algorithm which does not take these into account may produce solutions that are not feasible, leading to incomplete task execution since no agent will bid for certain task(s).
The negotiation protocol we propose performs a simultaneous task subdivision and allocation in a distributed way, taking into account the robots' preferences at all times.
Negotiations have been widely studied in the context of socio-economic studies [7] using, among others, game theory [17], and have been applied, e.g., to electronic commerce using agents [28]. The main problem with game theory approaches is that their theoretical results often refer to simplified models that are not immediately applicable to complex applications. The protocol we propose is based of Rubinstein's alternate-offers protocol [22]. Since such a protocol is based on a uni-dimensional good, we first had to develop an extension to the multi-dimensional space. In this context, a search mechanism for the best (counter)-offer had to be devised for the protocol to be applied. Furthermore, Rubinstein's protocol has been developed for bilateral negotiations. The issue of how to extend it to more than two parties will be discussed below.
It must be pointed out that in all market-based approaches, the task partitioning and assignment is performed taking into account the status and knowledge of the world available at the time of the assignment. An
2. Tasks
In order to design a negotiation algorithm that is general enough to work with different kind of tasks (cf. Fig. 1), an abstract task concept should be defined. A negotiation algorithm based on such an abstraction allows different applications with minimal changes.

Examples of the instantiation of tasks: areas, a set of target points, a communication chain and a frontier. All the tasks can be defined by an array of real numbers: (a) coordinates of the vertices of the polygons (
In the context of loose cooperation, we focus on the object to be divided and not on the activity to be performed on such an object. For example, if the task is surveying a given area, we are mainly interested in partitioning the area and assigning sub-areas to the agents. The activity the agents will have to perform and their preferences (for instance, with regard to their capabilities) will of course play an important role in the negotiation. This role is encapsulated in the cost/reward the agents associate with the task, as explained later on.
We define a task
ℙ
We consider
The problem is to divide a task
In general, a good subdivision is such that there is minimum overlap between sub-tasks (ideally null), and such that the union of the sub-tasks cover the original task. That is:
where the operators ⊓, ⊔ : 𝕋 × 𝕋 → 𝕋 (
Note that there can be exceptions, depending upon the application. For example, in a communication relay application, the overlapping between the communication ranges of two robots must not be null (See Fig. 1(c)).
Let
associates a reward with a set of parameters describing a task. We associate with a subdivision
As such, the problem of task subdivision can be formulated in the following way: given a task
Since each agent looks for its own reward, there is no guarantee that index G corresponds to a solution that is optimal. However, as we will see in the next sections, good sub-optimal values are usually achieved.
2.1 Evaluation of a task
During the negotiation, each robot has to evaluate the cost and reward of a given task. To this aim, it takes into account its internal parameters to evaluate the cost of executing the task, the start-up cost (e.g., to reach the execution site), specific constraints (e.g., forbidden zones, turn angles, battery level) and the general reward associated with the task, expressed as the function
Thus, given the complete task
Example of factors taken into account when evaluating a (sub-)task
2.2 Example
To illustrate the concept of task evaluation, consider the case where a ground vehicle (Robot 1) and an air vehicle (Robot 2) have to explore a given area

Negotiation with a forbidden area (blue area). The green agent (Robot 2) is not allowed to fly over the buildings, and will refuse all proposals, including the no-fly zone. In this example,
where
The constants α = 1, β = 1, γ = 1 and φ = 1000 have been set in such a way that it is equally penalized going outside of the target area and overlapping, and Robot 2 strongly penalizes having to explore region
3. Task negotiations
Given a team of
3.1 Alternate offers
In the alternate-offers protocol proposed by Rubinstein [22], each part of the bilateral negotiation, in turn, proposes the share
The update rule of a target share
Let the discount factors be δ1 for the initiator and δ2 for the responder. Rubinstein's theory guarantees that one perfect equilibrium point exists such that the share that
If the discount factors were known to both parties, each could know
This protocol is not immediately applicable to the multidimensional case. In the uni-dimensional case, when an agent makes a proposal
3.2 Searching for offers
We divide the negotiation into two levels: the protocol level and the proposals' evaluation and generation level. The protocol level is governed by such parameters as impatience to reach an agreement (time pressure as a discount factor δ) and the desired target reward. This level basically implements Rubinstein's alternating offers protocol, also managing the initiation and termination of the negotiation.
The proposal generation level searches the space for a good share given a proposal from the other party, taking into account its own characteristics. This level is also responsible for updating counteroffers.
Both the proposal evaluation and update search steps can be performed by a variety of algorithms. We have experimented with evolutionary algorithms in those cases where a complex space has to be searched (e.g., in the case of areas) and with complete discrete search algorithms (A* [23]) for the simplest cases, such as the negotiation of target points (see the next sections).
Figure 3 shows the communications protocol between two agents. The

Negotiation loop. The loop can be interrupted at any moment due to an abort message of one part or a time-out when waiting for an offer (not shown in the sequence diagram for clarity). The negotiation loop can also be interrupted at any moment if one of the parties decides to unilaterally abort the negotiation, in which case a time-out occurs while waiting for a message, or if the mission is interrupted by a command and control station.
3.2.1 Termination
As mentioned earlier, if the discount factors were known, each agent could compute in advance the maximum share it would receive. Such factors are private information of the robots, but they can be estimated by comparison of the offers received. With these, it can estimate the maximum possible share it can expect, according to Rubinstein's theory.
Supposing that Agent 1 is the first to make a proposal, then it can estimate δ2 and forecast that its share will be:
where
The values
In practice, if the evaluation of the share provides a value sufficiently close to
When an agreement has been reached, the result is a subdivision of the original task and - at the same time - an assignment of the sub-tasks. Note that in this way one agent does not need to possess information about the private characteristics of its team-mates. The only information it needs concerns their offers, in the form of an array of parameters
3.2.2 Learning
In our negotiation strategy, two learning processes take place.
Additionally, each time a new offer is received, the search in the parameters' space for the generation of the counteroffer is not started from scratch but rather takes as its starting point the previous offer and the response received to this. Therefore, the search is directed towards promising regions of the parameters' search space (and away from bad ones). In other words, the search algorithm
The NFZ application described earlier is an example. One agent is constrained to avoid certain parameters' combinations that define a forbidden area. Although the other agent is not aware of such constraints, as they are private information, all offers that include the forbidden configurations will be rejected and so the other agent will be forced to generate offers that avoid them.
3.2.3 Analysis
To illustrate the behaviour of a negotiation, Figure 4 shows an example of the shares of area partitioning between two agents. The left plot shows how the shares the agents obtain is actually close to values predicted by Rubinstein's theory. In the numerical simulation, the negotiation ends when one agent estimates that the share it is receiving is greater than or equal to its prediction, which is made by estimating the opponent's discount factor (Eq. 3).

Negotiation on how to partition an area. Best share and Rubinstein predictions (left). Total area T and global coverage G (right). δ1 = 0.95, δ2 = 0.955 : both agents start claiming the whole area of size 0.74.
The plot to the right shows how the overlapping task is reduced during the negotiation, and how the global coverage G is very close to the optimum value (in this case, the total area T). Moreover, note how good global coverage G is maintained throughout the negotiation. In other words, the whole area is covered by the two robots at all times.
3.3 Extending to more than two negotiators
It is important to notice that, as pointed out in [6], “the difference between a two-party and a three-party game reflects a basic qualitative difference between the types of processes that take place within the negotiation. The involvement of any more than three parties can be seen as an extension of a three-party process.” In other words, there is a shift in complexity when passing from two to three negotiators, but passing from three to any number greater than three does not imply any difference in the negotiation process. Hence, in the following analysis we will focus the discussion on the case where
When there are more than two agents, the proposed negotiation protocol can be extended in several ways.
Some of the heuristic algorithms that we have considered are:
The last two strategies imply that each agent resolves the complete problem. They are difficult to implement as they would take excessive computational power, since the complete state of the system must be taken into account. Moreover, the private information of the other team-mates would be difficult integrate. The coalitions strategy has the drawback of a mechanism for forming the two sub-teams by electing a representative for negotiating. This is far from trivial (see, e.g., [31, 25]). The number of bilateral negotiations would be of the order of
Thus, the rounds strategy seems to be the best, both for simplicity of implementation and in terms of computational requirements. It is fully distributed and decentralized, it does not need complete system status information or teammates' private information, and the number of negotiations to be carried out is of the order of
3.3.1 Estimating the shares when R>2
In the “rounds” negotiation strategy, agent
Even if, from the theoretical point of view, we are violating the requirement of constant discount factors, on the practical side the estimated values of
3.4 Communication and Computational Costs
A final remark must be made regarding the communications topology implied by the algorithm. In this work, we have assumed that all-to-all communications are available in order for the negotiation to take place. All-to-all communications are not strictly needed, since in the negotiation rounds it is sufficient to establish a communication ring. Moreover, communications will be available only during the negotiation itself, and are not required to be maintained continuously during the execution of the tasks that the robots have agreed on.
We also considered no bandwidth restrictions, as the amount of information exchanged is not a critical issue for our protocol. As an example, a polygon of 10 points is described by 20 floating point numbers. Assuming four bytes per number, a total of 80 bytes (plus headers) are sent with each offer. Taking into account the number of agents and the typical number of negotiation rounds needed, a total amount of the order of kilobytes for the whole process can be calculated.
Computational costs are centred on the proposal evaluation and counter-proposal update steps. Such costs depend on the application at hand and on the degree of precision needed. For instance, in an area survey application, in some cases the computation of a route may be required that carefully takes into account the robots' and the environment's characteristics, while in other cases an approximation that simply takes into account the dimension of the areas and the reasons why the polygons' intersections and unions can be employed, simplifying the evaluation. In any case, such costs are independent of the negotiation protocol used, since an evaluation step is always necessary to assess the goodness of a given task partition and assignment. Moreover, we would underline that, in a distributed approach such as the one that we propose, each agent only performs such computations for its sub-task, while in approaches where agents need to compute a complete solution, this cost must be multiplied by the total number of sub-tasks of the mission.
4. Numerical simulations
In this section, we present the simulations we have performed on two examples of tasks: areas and target points. In all the tests, the total size of the task is normalized to one and the discount factors refer to the percentage of the task the robots will reduce at each round. For example, a discount factor 6=0.99 means that for the next round the task claimed will be 1% smaller. Clearly, when the dimension of the task is a discrete quantity, as in the case of the target points, the agents will reduce the claimed share by one quantum every few rounds, when the claimed share reaches the next integer value.
4.1 Areas
A variable number of agents had to negotiate how to partition a given region described by a polygon. The number of points defining the polygon was set to six and included in a square of a normalized size of 1 × 1. As such,
where
Summary of the area coverage experiments,
As expected, more agents need more rounds to find agreement. As far as the quality of the solution is concerned, it must be pointed out that the area was not 100% covered, but the overlap between the sub-tasks is extremely low. Likewise, the part of the sub-task exceeding the target polygon
4.2 Target locations
In this case, the agents had to negotiate how to partition a given set of target locations. The task is then the set of coordinates of the points. The task evaluation function used is:
where
Tables 3 and 4 report the results of the simulations. Each entry in the tables represents the percentage of the distance travelled by the robots in excess with regard to the optimum distance (average over 10 runs), calculated with a complete enumerative
Comparison of the total distance travelled with regard to the optimum solution (percentage of additional distance). Each entry is the average over 10 runs.
Comparison of the maximum individual distances travelled with regard to the optimum solution (percentage of additional distance). Each entry is the average over 10 runs.
The scenario is composed by
As for the
It is interesting to observe that increasing the dimensionality of the task (the number of target points) does not worsen the results, while increasing the number of agents does. This result is common to all the experiments performed.
5. Concrete applications
In this section, we present three concrete instantiations of our task partitioning and assignment.
The first one focuses on exploration, with an application to search and rescue (SAR) 7 [21]. The second one is an area partitioning application for aerial mapping in precision agriculture [4]. Finally, we present an example of how the negotiation strategy can be extended to tight cooperation applications - in this case box bushing - by periodic renegotiations [3].
5.1 Exploration
In this application, the mission is to explore and map a disaster scene in semi-autonomous mode. An operator at the command and control centre receives information concerning the robots' positions and status, partial maps and other sensory information, and has to analyse such information in order to guide the search process. One of his/her duties is to decide as to to which locations of the scenario to send or drive the robots for further exploration, whether it be mapping an unexplored part or revising an interesting feature (e.g., a possible victim).
In this system, the human-robot interface consists of a base station that allows the operator to monitor the progress of the mission, to know the status of the robots, give directives to them and even tele-operate them [30]. In semi-autonomous operation mode, a human operator specifies a number of interesting locations that the robots will visit and the robots autonomously navigate to them. In this mode, the operator has to specify which robots goes to which points. In the case where the number of robots comprising the team grows, it becomes increasingly difficult for the operator to supervise them.
In order to aid the operator involved in such a process and reduce his/her workload, it is useful to consider the team as a whole and assign the set of objective points to the team. Once the team has received the task, it can
The robot coordination mechanism proposed in this work provides the operator with such a view of the team as a whole: the operator decides upon the locations to be visited (the

Allocation of target points in robot search and rescue: initial (left) and final (right) configuration. The operator indicates the locations to be visited by the team (crosses of the left image) while the robots autonomously decide a subdivision. The image on the right shows the robots' final positions and the path travelled. Note how the map is updated as a consequence of the exploration. Note also the curved paths due to the initial robots' orientations.
Thus, in this case the agents had to negotiate how to partition a given set of target locations. The task is then the set of coordinates of the points. The task evaluation function used is:
where
All the other settings were the same as those used for the numerical simulation (Sec. 4.2), with the exception that the offer update removes from the set the one that reduces by the most the total path the robot has to travel, optimizing term
The same experiment has been reproduced deploying the negotiator agents on a fleet of mobile robots (WiFiBots 4G, equipped with an electronic compass and GPS for self-localisation) running suitable navigation software.
The test scenario was composed of three robots and five target points, located in an area of 15 × 15 metres. Figure 6 shows two different allocations, depending on the different parameters' settings (the discount factor of the robots). In this experiment, the optimum solution considering the total length of the travelled path is of 15.58 metres, while the total travelled distance of the negotiated allocation was 17.87 (left) and 18.01 metres (right), approximately 15% longer than the optimum.

Snapshots of the tests with real robots. Different solutions are agreed according to the impatience of the negotiators. In this example, δ1 = 0.887, δ2 = 0.877 and δ3 = 0.867 (left) and δ1 =0.887, δ2 = 0.867 and δ3 = 0.877 (right). The green cones represent target locations and the superimposed lines represent the final allocation.
This experiment also shows how the sub-task allocation depends on the discount factors of the robots. In fact, different discount factors lead to different expected values, as described earlier. Agents accept solutions that are worst for them if they are more impatient. In the example shown in Figure 6, the two rightmost robots (robot
5.2 Aerial survey
Here, we applied our negotiation system to one application of robotics-based precision agriculture that requires the detection and localization of weed (undesired wild plants) in crop fields to be properly treated (e.g., by the localized diffusion of chemical agents) by ground robots.
The mission described here is to take a set of geo-referenced aerial images of a given area. These images will be analysed with image-processing tools in order to detect and localize weeds. From the task subdivision point of view, this application is an area partitioning instance like those described in Sec. 4.1.
In our experiments, a team of three unmanned aerial vehicles (UAVs) is used for this purpose, two Hummingbird quad-rotor platforms 8 and one AR100 platform 9 . Due to processing capabilities restrictions, the negotiating agents and the path planning algorithm run on a base station. Figure 7 (left) shows the architecture of the system and snapshots of the UAVs during tests at the CAR-UPM premises. Figure 7 (centre) illustrates a vineyard area nearby Madrid, Spain, that has been a theatre of the field tests, and a solution of the area partitioning. The individual flight plans of the three agents is shown on the right.

Left: architecture of the aerial robots' team and snapshots of the UAVs in action. Centre: vineyard field objective of the mission. The field has an irregular shape of approximately 330 × 200 metres. Area partition is done in the continuous space. Right: flight routes in the discrete space (red lines). For the purpose of path planning, the area is discretized in cells (10 × 10 cells of size 32.7 × 20 metres), where the centres of the cells are waypoints for the UAVs and the cells correspond to one picture of the global mosaic.
The experiments performed have shown the usefulness of the autonomous negotiation process: the operator simply specifies the area to be surveyed, with a set of geo-referenced points, and the number (and kinds) of vehicles available. The negotiator agents will produce a candidate area partitioning and the flight plans are calculated. The operator can review and validate the solution and, upon validation, flight plans are sent to the UAVs for the mission to start. For more details of the process and experimental results, we refer the reader to [4].
5.3 Box pushing
In order to explore the potential and limitations of the proposed approach, we present a final instantiation of our method. Box-pushing is a classical test bed application for high cooperation strategies, where agents have to coordinate their actions in a continuous manner. Even if the negotiation protocol presented here targets
Figure 8 (left) illustrates how the problem can be expressed in our task/sub-task formalism. Task are vectors representing the direction of the movement of the box if they are executed. Tasks are described by two parameters

Encoding of the box-push task in our framework. Left: the parameters describing the task are push force and push direction. Right: if
The combined execution of the sub-tasks produces a displacement and a rotation of the box towards its target location. Thus, the robots have to negotiate how each of them will push the box to obtain the desired motion
The task evaluation function used in this case is simply:
i.e., how much “useful” motion is obtained by the combination of the two combined pushing actions.
Figure 9 shows a snapshot of the experiments performed, using two modular mobile robots (see [2] for more details on the robots). The robots communicate with a central control station via Bluetooth wireless connections. A fixed camera located on top of the scenario is used to detect the robots' position and orientation, as well as the position of the box, providing all the necessary information for the negotiation to take place. The negotiator agents run in the central control station. We refer the reader to [3] for a detailed description of the experimental results.

A snapshot of one box-pushing application. The green and yellow circles represent the agreed pushing points and the black arrow represents the desired motion vector. This experiment has been carried out in collaboration with Mr. J. Baca and Prof. M. Ferre, Group of Robots and Intelligent Machines, CAR UPM-CSIC.
In this example, we can see how dynamic allocation using renegotiation is achieved. After the agreed actuation of the robots, the target task
Fig. 10 illustrates an example run. On the left, a graphical representation of the bar position over time is reported, while on the right the evolution of the bar's orientation compared to the desired orientation is reported.

Example run. Left: the red dotted line represents the bar and the big circles represent the positions of the robots. Right: box direction with regard to the desired direction.
6. Conclusions
The main contributions of this paper are twofold. First, we propose a formal definition of the concept of a task, general enough that allows for the expression of different problems. A task partitioning algorithm implemented using such a formulation can then be applied to a wide variety of multi-robot tasks. Second, a negotiation protocol that takes advantage of theoretical results coming from game theory which provide some important properties, such as termination criteria and the prediction of the outcome, that can be used to drive the negotiation. Moreover, the algorithm is able to be extended to incorporate mission-specific features such as exit options and the renegotiation of a solution to adapt to a task change.
Experiments conducted both in computer simulations and real applications demonstrate the effectiveness of the proposed approach, which allows for the quick instantiation of the general framework to the particular cases, i.e., an easy step from theory to practice. Here, we have presented three examples with different kinds of missions and robots, with the purpose of illustrating the versatility of our approach. Indeed, our aim was to provide an abstraction of the concepts of task, reward and negotiation, general enough to embrace a large set of possible multi-robot missions, and at the same time aimed at being actually implemented and deployed in real MRSs.
This is also possible because vehicle-specific features are hidden in the task evaluation function of the robots, and thus there is no need for the system or its team-mates to be aware of them. This makes the system more general and decouples the robots' particular characteristics from the negotiation protocol, allowing for the use of teams of heterogeneous robots.
In conclusion, the framework we have presented provides a fully distributed, simultaneous task subdivision and allocation system for teams of heterogeneous robots that can be applied to a wide variety of different robot cooperation problems.
Footnotes
7. Acknowledgements
This work was partially funded by the projects ROTOS (Fleet of autonomous aerial and ground robots - DPI2006-03444) of the Ministry of Education and Science of Spain, RHEA (NMP-CP-IP 245986-2) of the European Commission, and ROBOCITY2030 of the Community of Madrid, Spain.
1
In this paper, we will use the terms ‘robot’, ‘vehicle’, ‘negotiator’, ‘agent’, ‘party’ and ‘team-member’ synonymously, as this makes little difference for the scope of the discussion. To be precise, a vehicle is a mobile robot that runs a negotiator agent, besides other control software, that is a party in a negotiation process within the team of robots of which it is a member.
3
In most practical cases, x will be an array of real numbers: x∈ℝk.
4
Higher orders which overlap (i.e., the overlapping of three or more tasks) are not considered since their size decreases very quickly during the negotiation process, and thus their contribution is not significant.
5
In this example and in the following ones, a linear combination of factors is considered. However, any kind of function can be designed, e.g., taking into account non-linearities of the robot's performance or of the reward associated with a task (e.g., exponential penalties).
6
As suggested by an anonymous reviewer, an optimal ordering may optimize the number of rounds needed and/or the rewards. Modifications of the protocol and of the information exchanged will be needed for this purpose.
