Dynamic Task Allocation in Cooperative Robot Teams

Abstract

In this paper a dynamic task allocation and controller design methodology for cooperative robot teams is presented. Fuzzy logic based utility functions are derived to quantify each robot's ability to perform a task. These utility functions are used to allocate tasks in real-time through a limited lookahead control methodology partially based on the basic principles of discrete event supervisory control theory. The proposed controller design methodology accommodates flexibility in task assignments, robot coordination, and tolerance to robot failures and repairs. Implementation details of the proposed methodology are demonstrated through a warehouse patrolling case study.

Keywords

Supervisory control cooperative robot teams task allocation limited lookahead policy

1. Introduction

Many applications in industrial, civilian and military fields benefit from mobile robot utilization. Application domains vary from warehouse patrolling to service robotics and to space exploration. Mobile robots have been reported to explore, map or inspect friendly or hostile territories (Kumar & Sahin, 2003), (Zhang et al., 2001), (Jennings et al., 1997), or dispense medications in medical facilities (Krishnamurthy & Evans, 1992). Specifically in industrial applications, such as manufacturing, underground mining, toxic waste cleanup and material storage/handling, where many processes take place in hazardous environments harmful to human health, the choice of robotics-based solutions is justifiable. Furthermore, as the complexity and requirements of an application increases, significant advantages may be drawn from the use of multi robot systems.

A major challenge when working with multi robot systems is that of task allocation and coordination. The overall mission is decomposed into multiple tasks to which one or more robots are assigned. The task allocation problem is further complicated considering the dynamic characteristics of the robot team such as robot failures and repairs that may lead to incomplete tasks. The robot team should be able to complete the mission even if some team members are no longer operational and/or available.

This paper describes a general dynamic task allocation and controller design methodology for cooperative robot teams. The robot team is modeled as a Discrete Event System (DES). Each robot is modularly represented by a finite state automaton model. The mission requirements model is synthesized from individual finite state automata representing task completion requirements.

The proposed control methodology is partially based on the Ramadge & Wonham (RW) supervisory control theory (Ramadge & Wonham, 1987). However, instead of synthesizing a complete supervisor, as the traditional RW theory suggests, a limited lookahead policy is adopted that enables/disables events in the system in real-time based on the evaluation of a utility function and robot availability. The utility function uses fuzzy logic to quantify the ability of a robot to perform a task. The robot modules appear or disappear overtime depending on robot failures and repairs. When a failure event occurs, the control methodology re-allocates tasks to the operational robots of the team to ensure mission completion.

In cooperative robot teams, several characteristics of the team members, such as endurance, reliability, efficiency etc, must be considered in task allocation decisions. We describe these characteristics as fuzzy variables and develop a fuzzy controller to determine the utility function value for each task allocation event. These values are then used to determine a preferred task allocation in real time as a part of the proposed controller design methodology. Our work is motivated first by the fact that in traditional supervisory control theory, the acceptable sequences of event execution determined apriori are computationally intractable for realistic size problems. Furthermore, in applications where there is a significant degree of uncertainty associated with resource reliability and the environment, these sequences may not be executable. Instead, we propose a control approach based on a limited lookahead control policy for task allocation in real time.

Secondly, in supervisory control theory the criteria used in restricting the state-to-state transition of the system are the marked states, which denote the acceptable states, and the legal language, which denotes acceptable system behavior. These criteria fail to describe a preferred behavior within the acceptable behaviors.

This work combines the modeling strengths of DES and supervisory control theory to model comprehensive inter task dependences and robot interactions applicable to most task allocation problems. Specifically, this works demonstrates flexibility in task assignments, task sequencing, robot cooperation and coordination, and tolerance in robot failures and repairs. In addition, given that there is diversity between the robots' capabilities (heterogeneous robot team), the proposed methodology can be applied to missions where each task presents distinct requirements.

The rest of this paper is organized as follows: Section 2 discusses the related literature. Section 3 presents the DES models of the robot team and mission requirements models for the task allocation problem. Section 4 describes the utility function concept, the fuzzy controller used to determine the utility function values for task allocation events and the proposed limited lookahead policy. Section 5 introduces failures and repairs in the proposed model. Section 6 discusses scalability issues for the proposed methodology. Section 7 presents simulation results using for a case study and finally, Section 8 concludes this paper.

2. Related literature

The task allocation problem has been addressed in literature by utility based approaches and auction based approaches for both cooperative robot teams and robot swarms. Utility based approaches have been used for task allocation in many control architectures as in (Parker, 1998), (Zlot et al., 2002) and (Timofeev et al., 1999). Each task is assigned to a robot based on various utility estimates: In (Parker, 1998) each robot is assigned a task based on utility estimates of acquiescence and impatience. In (Zlot et al., 2002) utilities are computed as a function of relevant sensors; the robot having the most relevant sensors for a task is assigned the particular task. Utility has also been used in robot team cooperation to estimate the cost of executing an action (Botelho & Alami, 1999) and for sensor-based metrics (Gerkey & Mataric, 2002). Auction based approaches as in (Lagoudakis et al., 2004), (Botelho & Alami, 1999) and (Gerkey & Mataric, 2002) achieve task allocation based on the Artificial Intelligence concept of Contract Net Protocol (Davis & Smith, 1983). Each robot bids for an available task and the robot with the higher bid is assigned to that task. In the proposed control methodology, the dynamic task allocation problem is addressed using utility and fuzzy logic. Utility function values are computed based on the ability of each robot to perform a task considering several factors.

Limited lookahead policies for supervisory control have been first studied in (Chung et al., 1992) where a limited lookahead window is used to control the online behavior of the uncontrolled system model. The notion of pending traces is introduced to describe the legality of a trace in the lookahead window based on a conservative or an optimistic attitude. The notion of pending traces was later raised in (Kumar et al., 1998) by extending the uncontrolled system model behavior by arbitrary traces beyond the limited lookahead window. In (Chung et al., 1993), the authors present a methodology that recursively computes the future control actions based on previously computed control actions. Later, in (Chung et al., 1994) and in (Hadj-Alouane et al., 1994) the authors present an extension to the lookahead policies to cope with the computational complexity problem by making a control decision without exploring the whole lookahead window. Further enhancements in limited lookahead policies for supervisory control have been proposed. In (Heymann & Lin, 1994) a lookahead policy is presented for systems with partial observability. Also, in (Kumar & Garg, 2001) system's uncertainty is considered by assigning probabilities to event occurrences and in (Lin, 1993) by modeling all possible variations of the system. To our knowledge there are no limited lookahead policies in the literature designed to control cooperative robot teams.

As noted in (Grigorov & Rudie, 2006), only few approaches, as described in (Yi-Liang et al., 1997) and (Gordon & Kiriakidis, 2000), concentrate in time varying systems where system modules appear or disappear in time. In these approaches resource modules disappear only after the completion of assigned tasks. In this work, we relax this assumption by considering failures during task execution. In coordinated robot teams the concept of robot failures and repairs is important since a robot failure while executing a task will lead to an incomplete mission unless the control model reassigns the task elsewhere. The lookahead policy presented in this paper considers robot failures and repairs to ensure mission completion.

Supervisory control based approaches on discrete event system have been used by a number of researches to control mobile robot teams. However, although a limited amount of work considers robot failures, not much effort is found in the area of control decisions concerning robot rejoining the robot team after repairs. Specifically, the automata based approaches presented in (Gordon-Spears & Kiriakidis, 2004), (Kiriakidis & Gordon, 2001) and (Xi et al., 2003) consider situations where some robots go offline but do not take into account situations where robots come back online. Similarly, the Petri Net controller in (Jamie et al., 2003) disregards robot repairs. Finally, the control architecture presented in (Kimura et al., 1998) handles only robot failures.

The contributions of this work can be summarized as follows:

Modular design that allows for consideration of robot failures and repairs in the system's behavior. It should be noted that most work in the literature of task allocation addresses only cases of robot failures.

Modeling of comprehensive inter task dependencies and robot interactions.

Development of a fuzzy logic based utility function to quantify the ability of a robot to perform a set of tasks. The fuzzy logic controller takes into account several practical aspects of robot operation to suggest which robot is appropriate for which task.

Task allocation that relies not only on the concept of acceptable behavior rather the concept of preferred system behavior.

The weakness of this work is the increasing state size due to finite automata representation. The proposed methodology is applicable to medium size robot teams (2–7 robots) in realistic size missions (2–9 tasks).

3. Robot team models

In RW supervisory control theory, the uncontrolled system's model (UCSM) and the mission requirements are separately modeled using finite automata (FA) models. The FA model G_j = (∑_j,Q_j,q_j0,δ_j,Q_jm) represents the uncontrollable behavior of Robot j. ∑_j is the set of events, Q_j is the set of states and δ_j : ∑_j × Q_j → Q_j is the transition function. q_j0 and Q_jm are the initial and final states respectively. The set of events ∑_j consists of the controllable and the uncontrollable events ∑_j = ∑_jc∪∑_ju and ∑_jc∪∑_ju = Ø. Fig. 1 depicts the transition graph for the automaton describing Robot j. An arrow marks the initial state and the marked state is shown as a dark circle. Regarding a Task k, the states and the events of Robot j are defined as shown in Table 1.

Fig. 1.

Transition graph for Robot j.

Table 1.

List of States, Events and Their Description

State/Event	Type of Event	Description
I_j	−	Robot j in idle state
B_jk	−	Robot j in busy state with Task k
F_jk	−	Robot j in failed state while executing Task k
start_jk	controllable	initiation of Task k by Robot j
complete_jk	uncontrollable	completion of Task k by Robot j
failure_jk	uncontrollable	failure of Robot j while executing Task k
repair_jk	uncontrollable	repair of Robot j
drop_jk	uncontrollable	repair of Robot j and re-initialization of Task k

The FA model design incorporates two different cases of robot failures and repairs. A robot failure may be considered as (i) temporary failure or (ii) failure with task re-initialization. In the first case, the failed robot will continue executing its task as soon as is repaired while in the second case the task of the failed robot needs to be reinitialized.

The UCSM that represents the uncontrolled behavior of the robot team is composed of the synchronous product (Cassandras & Lafortune, 1999) of the individual robot modules as follows:

G = (Σ, Q, q_{0}, δ, Q_{m}) = G_{1} | | G_{2} | | G_{3}

(1)

The specification's model is the finite automaton that models the mission requirements which is synthesized using individual task completion requirement models. Four alternative task completion requirements are modeled:

Alternative 1: Task k can be performed only by Robot j

Alternative 2: Flexibility in task assignments: Task k can be performed by Robot j or by Robot i, i ≠ j

Alternative 3: Task sequencing and robot coordination: Task k must be performed first by Robot j and subsequently by Robot i, i ≠ j

Alternative 4: Robot cooperation: Task k must be performed by Robots i and j simultaneously.

Through the task completion requirements it is possible to model inter task dependences and robot interactions. Fig. 2 depicts the transition graphs for these four alternative task completion requirements. Fig. 2a describes the requirement where Task k can be performed only by Robot j. If Robot j fails during the task, the task may be re-initialized as shown by the drop_jk event. Fig. 2b describes flexibility in task assignment where Task k can be performed by Robot j or Robot i. Fig. 2c describes task sequencing and robot coordination where Task j must be completed first by Robot j and then by Robot i. Consider the scenario where two Robots, e.g. Robot 1 and 2, must perform one Task, e.g Task 1. To demonstrate, assume that Task 1 is to inspect an area for fire and chemical spills and that Robot 1 carries a fire detection sensor and Robot 2 a chemical sensor. To avoid damaging any of the robots, it would be logical to first inspect for fire using Robot 1 and then for chemical spills with Robot 2. Finally, Fig. 2d describes a robot cooperation procedure where two robots have to perform a task simultaneously.

Fig. 2.

Transition graphs for the task completion requirements

Formally, task completion requirements are modeled as finite automata of the form:

R_{n} = (Σ_{n}, X_{n}, ξ_{n}, x_{n 0}, X_{n m})

(2)

where n is the number of the individual task completion requirements.

The specification's model, S, is a finite automaton synthesized by the synchronous product of all mission requirements. Formally, the specifications model is synthesized as follows:

S = (Σ, X, ξ, x_{0}, X_{m}) = R_{1} ∥ \dots ∥ R_{n}

(3)

The supervisory control model for the team of robots consists of the coupled system model S/G and the control pattern Ψ. The coupled model is defined as the product of the UCSM and the specifications model, which includes all the events that are allowed by both models:

S / G = (Σ_{S} \cap Σ_{G}, X \times Q, γ = ξ \times δ, (q_{0 S}, q_{0 G}), X_{m} \times Q_{m})

(4)

The control pattern Ψ is a function Ψ: ∑ × X → (0, 1) based on the supremal-controllable language of the coupled model that enables (1) or disables (0) the controllable events in the UCSM so that desirable system behavior is guaranteed. The synthesis of the control pattern and consequently the solution to the supervisory control problem is a computationally prohibitive procedure for larger systems. Furthermore, while a task allocation and desired control pattern determined a priori may be executable with reliable resources in controlled environments, such a sequence of events is very unlikely to be executed to completion in applications associated with cooperative robot teams due to unreliable resources and uncontrollable, and possibly hostile, environments which robot teams typically operate in.

In this paper, instead of following the traditional supervisory control approach and synthesizing the complete supervisor for the system, a limited lookahead control policy is adopted. Limited lookahead control approaches are suitable for highly dynamic systems since only a portion of the system corresponding to the system's behavior in the near future is considered for evaluation. A limited lookahead window of finite depth is used to direct the behavior of the system. Every time an event is executed in the UCSM, the lookahead window is reconstructed and all possible sequences of events in the lookahead window are evaluated. The event leading to the highest evaluated string is enabled while the rest of the controllable events are disabled. The evaluation criteria based on the utility concept are described in the next section.

4. Utility function definition

In heterogeneous cooperative robot teams, each robot possesses unique characteristics including but not limited to sensory capabilities, cost, efficiency and endurance. Each robot presents a different level of ability to perform a certain task. A function can capture the robot's ability to perform a task in terms of utility. In addition the same utility function may be used to capture qualitative concepts such as the system designer's choice/knowledge to assign certain tasks to specific robots. For example, consider the case where the system's designer knows that between two robots that can perform the same task, one of the robots will perform better than the other. Obviously, the designer will wish to assign the particular task to the robot that performs better. These aspects complicate the task allocation problem.

An evaluation method that maximizes the overall performance of the robot team is required. In supervisory control theory, traditional system evaluation criteria are the marked states, which denote the acceptable states, and the legal language, which denotes acceptable system behavior. However, these criteria fail to describe undesirable yet acceptable behavior.

The proposed control methodology employs a utility function to evaluate strings in the lookahead window. We define a utility function u : ∑ → [0,1], which associates an event σ ∈ ∑ with a utility value between 0 and 1 and we define the utility of a string s as:

U (s) = \sum_{σ \in s} u (σ) .

(5)

The attributes mentioned that could be used to compute the robot's ability to perform a task, such as endurance, efficiency and designer's choice, represent vague concepts hard to describe mathematically. However, these concepts can be described in terms of fuzzy logic as fuzzy variables with linguistic membership functions. A Mamdani type fuzzy logic controller (Mamdani, 1976) is proposed that receives as inputs the membership of each fuzzy variable and computes the ability of a robot to perform a task.

Depending on the mission's functional requirements and the operational characteristics of each robot in the team, the design of the fuzzy controller can include multiple attributes. Such attributes are defined by the system's designer and are limited only by the level of specificity the designer wishes to apply in the task allocation. Examples of these attributes are the robot's endurance and efficiency, the designer's preference in task allocation, the cost of the robot, the number and the type of sensors a robot carries, the distance a robot has to travel to perform a task, the cost of assigning a robot to a task.

To demonstrate the operation of the fuzzy controller, three fuzzy variables are considered in this study: (i) robot's endurance, (ii) designer's choice and (iii) robot's efficiency. The first fuzzy variable has three membership functions {short, fair, long} and denotes how long a robot can remain functional. The second fuzzy variable with three membership functions {low, medium, high} denotes the system designer's choice to assign certain tasks to specific robots. Finally, the third fuzzy variable with three membership functions {low, medium, high} denotes the robot's efficiency level. The output of the fuzzy logic controller is also a fuzzy variable with membership functions {low, medium, high} denoting a robots ability to perform a task.

Considering a Task k and a Robot j, the ability of Robot j to perform Task k, denoted by ability_jk, is computed based on a set of rules such as:

If the robot's endurance is long , the designer's choice is high and the robot's efficiency is high , then the ability _jk of Robot j to perform Task k is high

If the robot's endurance is fair , the designer's choice is medium and the robot's efficiency is medium , then the ability_jk of Robot j to perform Task k is medium

If the robot's endurance is short , the designer's choice is low and the robot's efficiency is low, then the ability_jk of Robot j to perform the task k is low

The ability of a robot to perform a task is closely related to task allocation and consequently to task initiation events {start_jk}. Thus, the utility function value of the events {start_jk} is equal to the ability of Robot j to perform Task k. In other words,

u (σ) = a b i l i t y_{j k} where σ \in {s t a r t_{_} j k} .

(6)

The events {drop_jk} corresponding to task re-initialization due to robot failure represent cancelation of task assignments. The utility function values of the {drop_jk} events are

u (σ) = - a b i l i t y_{j k} where σ \in {d r o p_{_} j k} .

(7)

The higher the utility function value for an event, the more desired this event is. Since uncontrollable events {complete_jk, failure_jk, repair_jk} cannot be enabled/ disabled, their utility function values are:

\begin{aligned} u (σ) = 0, \\ σ \in {c o m p l e t e_{_} j k, f a i l u r e_{_} j k, r e p a i r_{_} j k} . \end{aligned}

(8)

Even though the uncontrollable events failure_jk and repair_jk have utility function values equal to zero, occurrences of such events influence indirectly the string utility. The impacts of such events to the controller design are discussed in the next section.

Each time an event σ occurs in the UCSM and the limited lookahead window is reconstructed, the utility function values for all the strings s ∈L(S/G) (the language of the coupled model) in the lookahead window are computed. The string with the highest utility function value corresponds to the most desirable system behavior. The maximization procedure is implemented as a dynamic programming problem (Bellman, 1952) with a forward sweep and a backtracking pass.

A string that includes many task initiation events is evaluated higher than a string with fewer task initiation events. However, the highest evaluated string may not always correspond to the most desirable task allocation. Consider the case where 3 robots (Robot 1, Robot 2 and Robot 3) are assigned to perform 3 tasks (Task1, Task 2 and Task 3). If u(start_11) = u(start_22) = u(start_34) = 1 and u(start_31) = 2 indicating that assigning Task 1 to Robot 3 is the desired action since u(start_31) > u(start_11). Suppose that the event start_11 (initiation of Task 1 from Robot 1) is a part of the string start_11start_22start_34 and the event start_31 (initiation of Task 1 from Robot 3) is a part of the string start_31complete_31complete_jk in a limited lookahead window with depth 3 where only 3 future events are considered. The utility function value for the string start_11start_22start_34 is higher than the utility of the string start_31complete_31complete_jk (3 and 2 respectively). Thus, the control methodology will disable the event start_31, which is a more desirable task allocation. To eliminate this bias arising from the limited depth of the lookahead window, a normalized utility function

U_{N} (s) = \frac{1}{N} \sum_{σ \in s} u (σ)

(9)

where N is the number of task initiation events in the string s is used. The normalized utility values of the two strings would be 1 and 2 respectively leading to the preferred choice of task allocation. It should be noted that the normalization of the utility function is also a design consideration and is customizable depending on the characteristics of the mission.

5. Robot failures and limited lookahead control policy

Frequently, during a mission, a robot may go offline due to a sensor failure or communication loss resulting in an incomplete task. The control methodology should be able to compensate for robot failures by reallocating tasks to the operational robots of the team. Two kinds of failures are considered in this work: temporary failures and failures with task re-initialization.

Temporary failures, such as communication loss are failures that can be repaired in a short period of time and the robot may continue executing its task after the repair. Fig. 3, demonstrates a temporary failure in a mission with 2 robots (Robot 1, Robot 2) and 2 tasks (Task 1 and Task 2), where each robot can perform both tasks. Robot 1 is failed in State 2 and repaired in State 4 to continue executing its task assignment.

Fig. 3.

Events executed in the UCSM after a temporary failure

Failures with task re-initialization are failures that require re-initialization of the task assigned to the failed robot. For example, consider a sensor failure that it is not immediately recognized. However, during task execution the sensor failure is realized and the task needs to be reinitialized. The task, assigned to the failed robot, is re-evaluated for allocation. Fig. 4 demonstrates a temporary failure with task re-initialization. Robot 1 has failed in State 2 while executing Task 2 and the failure is considered as failure with task re-initialization. In State 4, Task 2 is dropped and assigned to Robot 2.

Fig. 4.

Events executed in the UCSM after a failure with task re-initialization

As mentioned in the previous section, the uncontrollable events failure_jk and repair_jk have utility function values equal to zero. However occurrences of these events influence the utilities of the controllable events allowing for an intelligent task allocation design. Consider the case where two robot candidates are to perform a task. Based on the evaluation methodology proposed in this work, the robot with the highest utility will be assigned to the task. If the robot fails while executing the task two things may happen: (1) task re-initialization and (2) the robot will be repaired and complete the task. Based on the utility assignment method discussed so far, the task will be re-assigned to the same robot after its repair, since this is the robot with the highest utility. In a real world application, such decision will not be ideal since the same failure might be repeated. For example, even though a robot can perform the specific task better than any other robot of the team, assume that its communications capabilities are not as strong as the capabilities of other robots. If the robot fails while performing a task due to a communication error, then repeating the task will potentially result to the same failure. In addition, even in the case of temporary failure without task re-initialization, it might be desired to re-assign the task to another robot. Consider that a mission is decomposed into multiple tasks and that Task k is the only task left to complete the mission. If Robot j fails while performing Task k, it might be preferable to drop the task and assign it to a different robot instead of waiting for Robot j to be repaired. Such observations lead to an intelligent controller design for task allocation.

To accommodate these observations in the control design, uncontrollable events influence the utility values of the controllable events related to the failed robots as follows: If a failure occurs in the UCSM, the utility of the failed robot for the specific task is reduced to the minimum of the utilities other robots in the team have for the same task. The evaluation process is repeated. This will allow the controller design to decide in re-allocating the task to a different robot.

The block diagram of the control methodology algorithm consists of 5 main modules as shown in Fig. 5:

Fig. 5.

Block diagram of the control algorithm

System Initialization: The UCSM and the specifications are generated as described in Section 3. Based on the fuzzy logic controller, each event is assigned a utility function value using Equations (7), (8) and (9).

Failure Detection: When a robot failure is detected, the information to be used in the calculation of the new limited lookahead window is forwarded to the next module. This information includes the set of events to be masked, the type of failure, and the expected time to repair.

Lookahead window formation: Using as root state the initial state of the UCSM and the specifications model, a lookahead window is formed that includes all the transitions starting from the initial state up to a certain predefined depth in the coupled model. The transitions in the lookahead window form a tree of strings that can be executed in the UCSM.

String Evaluation: Each string in the lookahead window is evaluated using the normalized utility function shown in Equation (9).

Control Decision: The event exiting the root state leading the string with the highest utility function value is enabled. All the other controllable events exiting the root state are disabled. In the case that two or more strings present the same utility function value all events leading these strings are enabled. At the next state, the new system state becomes the root state for the lookahead window and the procedure is repeated until the system reaches a marked state.

6. Scalability

The computational complexity of the traditional supervisory control problem is PSPACE hard. However, the proposed methodology does not require generation of a complete supervisor. The evaluation process is based on the events appear in the system's coupled model. The generation of the coupled model has linear computational complexity.

In addition, even though the evaluation problem is a general search problem with an NP-hard computational complexity, using the lookahead control policy presented in this paper, not all the states of the system need to be evaluated.

The complexity of the fuzzy logic design does not depend on the number of robots in the team or the number of the tasks. It depends only on the number of membership functions and the number of fuzzy variables the designer wishes to allow. Thus, the fuzzy logic design is scalable to any number of robots or tasks.

The bottleneck of the proposed methodology is the generation of the coupled model. Even though the computational complexity is linear, the state space increases exponentially. For this reason, the proposed methodology is applicable to medium size robot teams. In our experiments we were able to formulate problems with up to 7 robots and 9 tasks. The number of robots and tasks are comparable and in most cases greater than similar work in task allocation for robot teams reported in literature. An alternative controller design, based on control pattern generation on the fly, as in (Yalcin, 2005) would disregard the coupled model generation and allow the proposed methodology to be applicable to bigger size robot teams. In future research such methodologies will be considered.

7. Case study

This section discusses a case study based on computer simulation results. The controller design methodology for dynamic task allocation is applied to a warehouse patrolling application scenario. As depicted in Fig. 6, a team of mobile robots is assigned to patrol a warehouse containing hazardous and security sensitive materials. Three robots with different sensory capabilities are considered. Table 2 summarizes the sensor suite for each robot.

Fig. 6.

Warehouse partition for the patrolling scenario

Table 2.

Robot sensors

Robot	1	2	3
Sensors	Fire detector	Chemical	Fire detector
	Chemical	detector	Vision system
	detector	Geiger
	Geiger	Counter
	Counter	Vision System

Based on the partition based patrolling strategy described in (Chevaleyre, 2004) the warehouse is partitioned into five regions as follows: Region 1 contains flammable materials, Region 2 chemical materials, Region 3 radioactive materials, Region 4 security sensitive materials and Region 5 security sensitive and flammable materials. This configuration allows us to demonstrate features such as flexibility in task assignment (Region 1 can be assigned to Robot 1 or Robot 3), and cooperation between the robots (Region 5 must be assigned either to Robot 3 or first to Robot 1 and subsequently to Robot 2). The team's mission is to inspect all five warehouse regions.

The overall mission is divided into 5 tasks where task k = {1,2,3,4,5} corresponds to the patrolling/inspection of the warehouse Region k.

Based on the robot sensory capabilities described in Tables 2 and 3, possible robot-task allocations for Task k and Robot j are defined as:

k = {\begin{cases} {1, 2, 3, 5}, if j = 1 \\ {2, 3, 4, 5}, if j = 2 \\ {1, 4, 5}, if j = 3 \end{cases}

(10)

Table 3.

Region/Robot allocation

Region 1	Region 2	Region 3	Region 4	Region 5
Flammable material	Chemical material	Radioactive material	Security Sensitive material	Flammable & Sec. Sensitive material

Robot 1 / 3	Robot 1 / 2	Robot 1 / 3	Robot 2 / 3	Robot 3 / 1 & 2

Each robot is modeled as a finite automaton as described in Fig. 1. The UCSM is described by the Equation 1. Task completion requirements are also modeled as finite automata, Fig. 2, and the overall mission requirements are described by Equation 3. Tasks 1–4 are modeled based on Alternative 2 and Task 5 is modeled based on Alternative 3.

The ability of each robot to perform a task is computed by the fuzzy logic controller described in Section 4. The membership functions for the fuzzy variables endurance, efficiency and designer's choice, are shown in Table 4.

Table 4.

Fuzzy logic membership functions for the patrolling scenario

Rob.	Fuzzy variable	Task 1	Task 2	Task 3	Task 4	Task 5
1	Endurance	Long	Fair	Long	–	Long
	Efficiency	Low	Med.	Med.		Med.
	Des. Choice	Low	High	Low		Low

2	Endurance	–	Fair	Fair	Fair	Short
	Efficiency		High	Med.	Med.	Med.
	Des. Choice		Low	High	High	Low

3	Endurance	Short	–	–	Fair	Fair
	Efficiency	Med.			Med.	Med.
	Des. Choice	High			Low	High

Fig. 7 depicts the fuzzy variables and their membership functions we have adopted for the patrolling application.

Fig. 7.

Membership functions of the fuzzy variables: robot's endurance, and designer's choice, robot's efficiency and robot's ability

There are 27 such rules in the fuzzy controller, which cover all combinations among the membership functions of the input variables. The corresponding event utility function values based on Equation (7) are:

\begin{aligned} u (s t a r t_{_} 11) = 0.16, u (s t a r t_{_} 12) = 0.81, u (s t a r t_{_} 13) = 0.33, \\ u (s t a r t_{_} 15) = 0.16, u (s t a r t_{_} 22) = 0.33, u (s t a r t_{_} 23) = 0.83, \\ u (s t a r t_{_} 24) = 0.81, u (s t a r t_{_} 25) = 0.18, u (s t a r t_{_} 31) = 0.81, \\ u (s t a r t_{_} 34) = 0.17, u (s t a r t_{_} 35) = 0.83 . \end{aligned}

(11)

The maximum profit due to control actions is equal to 4.09 and is achieved when the following task assignments are made in any order:

{s t a r t_{_} 31, s t a r t_{_} 12, s t a r t_{_} 23, s t a r t_{_} 24, s t a r t_{_} 35} .

(12)

Robot failures are generated using a uniformly distributed random variable p_j ∈ [0,1] denoting the probability of Robot j to fail.

Three performance measures of interest are reported. Total utility refers to the sum of the utility of the events executed in the experimental run. This is an indicator of the quality of task allocation decisions where the higher values indicate better task allocation decisions. Average state space refers to the average number of states explored each time a lookahead window is created during an experimental run. This metric refers to the computational requirement for real time decision making. Finally, total state space metric refers to the total number of new states explored during the entire experimental run.

Table 5.

Summary of simulation results

Depth		Average	% compared with maximum lookahead depth
2	Total Utility	3.38
	Avg. State Space	3.13	1%
	Total state space	39.00	1%

3	Total Utility	4.03
	Avg. State Space	13.20	4%
	Total state space	139.85	5%

6	Total Utility	3.93
	Avg. State Space	132.37	37%
	Total state space	1168.45	43%

14	Total Utility	3.87
	Avg. State Space	356.87	100%
	Total state space	2723.00	100%

Table 6 summarizes the results of the 20 experimental runs. In this scenario, the uncertain environment arising from robot failures produced a more mixed set of results. In all cases including the maximum lookahead depth of 14 the average total utility was less than the maximum total utility. Note that the percentages of computational requirements for the LLD of 2, 3, and 6 were less since the maximum lookahead depth in this scenario was higher.

Table 6.

Summary of simulation results, experiment 2

Depth		Average	% compared with maximum lookahead depth
3	Total Utility	3.96
	Avg. State Space	28.29	Less than 1%
	Total state space	339.00	2%

9	Total Utility	4.07
	Avg. State Space	1462.00	40%
	Total state space	11667.00	59%

	Total Utility	4.10
14	Avg. State Space	3299.00	92%
	Total state space	19382.00	99%
	Total Utility	4.12
18	Avg. State Space	3586.00	100%
18	Total state space	19488.00	100%

In order to demonstrate the impact of the LLD on task allocation decisions based on the total utility of the executed events, we compared the null hypothesis

H_{0} : μ_{2} = μ_{3} = μ_{6} = μ_{14}

(13)

against the alternative hypothesis that the means are different using a single-factor analysis of variance (ANOVA) with four levels of LLD and 20 repetitions. The results indicate that there is a significant difference between the means and LLD is a factor that impacts the total utility level. Subsequently we conducted similar experiment but this time using 3 levels of LLD 3, 6 and 14. The results of this analysis show no significant difference between the means.

In summary, these preliminary experiments indicate that a limited lookahead control policy provides a computational efficient approach to the problem of task allocation. However, the depth of the limited lookahead window must be carefully chosen based on the characteristics of the mission as well as the desired level of optimality as the quality of the task allocations is dependent on this parameter. Furthermore, these preliminary results indicate that LLD of less than ¼ of the maximum lookahead depth provides compounded reductions in the computational requirements where on average less than 10% of the computational requirements are sufficient to make control decisions. However, it must be pointed out that this statement must be further verified with larger experimental designs.

Additional simulations with 5 robots and 6 tasks have been performed. Table 6 summarizes the results. The maximum utility in this case is 5.09.

8. Conclusions

In this paper we describe a novel control methodology for task allocation in cooperative robot teams. Finite automata formalism is used to model the robot team and the mission requirements as discrete event systems. In developing the system model, we considered flexibility in task assignment, robot coordination for task completion and robot failures and repairs. These characteristics are commonly encountered in mission planning and execution of cooperative robot teams. We also describe a utility function for task allocation that uses fuzzy logic to describe various robot capabilities which are difficult to quantify. Subsequently, a limited lookahead control policy coupled with a fuzzy controller is developed for task allocation in real time.

The use of limited lookahead policies presents significant advantages in terms of computational complexity. The computational complexity of the traditional supervisory control problem is polynomial (Ramadge & Wonham, 1987) if the UCSM and the specification's model describe perfectly the behavior of the robot team and the mission requirements. However, if there is imperfect information about the system then the complexity becomes PSPACE hard (Blondel & Tsitsiklis, 2000). The limited lookahead policy proposed in this paper is based on the coupled model of the system where the computational complexity of the coupled model generation is linear. The computational results show that only a fraction of the coupled model needs to be explored in the limited lookahead window to make task allocation decisions. The preliminary results show that task allocation decisions based on the limited lookahead window control policy produces comparable results when compared with exploring the complete coupled model. Further completely randomized experimentation is required to generalize these findings. The limited lookahead depth is a critical parameter affecting the quality of task allocation as well as computational complexity. The results in Section 7 indicate that larger limited lookahead depths lead to higher number of states visited by the control algorithm. This is an expected result, however, the associated increase in the utility of task allocation is not as clear cut. In this work, as in most of the referenced literature, the depth of the lookahead window is arbitrarily chosen. In (Chung et al., 1992), the depth window is computed based on the number of uncontrollable events in the system. A future research direction involves determining the characteristic associated with cooperative robot teams and their missions, which may be used to develop a methodology to calculate a dynamic limited lookahead depth in real time. Such an approach will result in a controller that is adaptable to the changing needs of missions and cooperative robot teams.

References

Bellman

, Year, On the Theory of Dynamic Programming. Proceedings of the National Academy of Sciences

Blondel

V.D.

and Tsitsiklis

J.N.

, 2000, A survey of computational complexity results in systems and control. Automatica, 36: 1249–1274.

Botelho

S.C.

and Alami

, Year, M+: a scheme for multi-robot cooperation through negotiated task allocation and achievement. Proceedings of IEEE International Conference on Robotics and Automation., 1234–1239 vol.1232.

Cassandras

C.G.

and Lafortune

, 1999, Introduction to Discrete Event Systems. Kluwer Academic Publishers, Norwell, Massachesetts, USA

Chevaleyre

, Year, Theoretical analysis of the multi-agent patrolling problem. IEEE/WIC/ACM International Conference on Intelligent Agent Technology. IAT, 302–308.

Chung

S.-L.

Lafortune

and Lin

, 1993, Recursive computation of limited lookahead supervisory controls for discrete event systems. Discrete Event Dynamic Systems, 3: 71–100.

Chung

S.-L.

Lafortune

and Lin

, 1994, Supervisory control using variable lookahead policies. Discrete Event Dynamic Systems, 4: 237–268.

Chung

S.L.

Lafortune

and Lin

, 1992, Limited lookahead policies in supervisory control of discrete event systems. IEEE Transactions on Automatic Control, 37: 1921–1935.

Davis

and Smith

R.G.

, 1983, Negotiation as a metaphor for distributed problem solving. Artificial Intelligence, 20: 63–109.

10.

Gerkey

B.P.

and Mataric

M.J.

, 2002, Sold!: auction methods for multirobot coordination. IEEE Transactions on Robotics and Automation, 18: 758–768.

11.

Gordon

and Kiriakidis

, Year, Adaptive supervisory control of interconnected discrete event systemsYear, Adaptive supervisory control of interconnected discrete event systems. Proceedings of the 2000 IEEE International Conference on Control Applications., 935–940.

12.

Gordon-Spears

and Kiriakidis

, 2004, Reconfigurable robot teams: modeling and supervisory control. IEEE Transactions on Control Systems Technology, 12: 763–769.

13.

Grigorov

and Rudie

, 2006, Near-Optimal Online Control of Dynamic Discrete-Event Systems. Discrete Event Dynamic Systems, 16: 419–449.

14.

Hadj-Alouane

N.B.

Lafortune

and Feng

, 1994, Variable lookahead supervisory control with state information. IEEE Transactions on Automatic Control, 39: 2398–2410.

15.

Heymann

and Lin

, 1994, On-line control of partially observed discrete event systems. Discrete Event Dynamic Systems, 4: 221–236.

16.

Jamie

Pretty

R.K.

and Gosine

R.G.

, 2003, Coordinated execution of tasks in a multiagent environment. IEEE Transactions on Systems, Man and Cybernetics, Part A, 33: 615–619.

17.

Jennings

J.S.

Whelan

and Evans

W.F.

, Year, Cooperative search and rescue with a team of mobile robots. Proceedings of 8th International Conference on Advanced Robotics, ICAR '97., 193–200.

18.

Kimura

Takahashi

Okuyama

Tsuchiya

S.A.T.S.

and Suzuki

Y.A.S.Y.

, 1998, A fault-tolerant control algorithm having a decentralized autonomous architecture for space hyper-redundant manipulators. IEEE Transactions on Systems, Man and Cybernetics, Part A, 28: 521–527.

19.

Kiriakidis

and Gordon

, Year, Supervision of multiple-robot systems. Proceedings of the 2001 American Control Conference., 2117–2120 vol.2113.

20.

Krishnamurthy

and Evans

, Year, HelpMate: A robotic courier for hospital use. IEEE International Conference on Systems, Man and Cybernetics, 1630–1634 vol.1632.

21.

Kumar

Cheung

H.M.

and Marcus

S.I.

, 1998, Extension based Limited Lookahead Supervision of Discrete Event Systems. Automatica, 34: 1327–1344.

22.

Kumar

and Garg

V.K.

, 2001, Control of stochastic discrete event systems modeled by probabilistic languages. IEEE Transactions on Automatic Control, 46: 593–606.

23.

Kumar

and Sahin

, Year, Cognitive maps in swarm robots for the mine detection application. IEEE International Conference on Systems, Man and Cybernetics., 3364–3369 vol.3364.

24.

Lagoudakis

M.G.

Berhault

Koenig

Keskinocak

P.A.K.P.

and Kleywegt

A.J.A.K.A.J.

, Year, Simple auctions with performance guarantees for multi-robot task allocation. Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. (IROS 2004). 698–705 vol.691.

25.

Lin

, 1993, Robust and adaptive supervisory control of discrete event systems. IEEE Transactions on Automatic Control, 38: 1848–1852.

26.

Mamdani

E.H.

, 1976, Advances in the linguistic synthesis of fuzzy controllers. International Journal of Man-Machine Studies, 8: 245–254.

27.

Parker

L.E.

, 1998, ALLIANCE: an architecture for fault tolerant multirobot cooperation. Robotics and Automation, IEEE Transactions on, 14: 220–240.

28.

Ramadge

P.J.

and Wonham

W.M.

, 1987, Supervisory control of a class of discrete event processes. SIAM Journal on Control and Optimization, 18: 452–462.

29.

Timofeev

A.V.

Kolushev

F.A.

and Bogdanov

A.A.

, Year, Hybrid algorithms of multi-agent control of mobile robots. International Joint Conference on Neural Networks. IJCNN, 4115–4118 vol.4116.

30.

Lee

Ray

and Phopa

S.A.P.S.

, Year, A behavior-based collaborative multi-agent system. IEEE International Conference on Systems, Man and Cybernetics, 4242–4248 vol.4245.

31.

Yalcin

Khemuka

Deshpande

, 2005, Modelling inter-task dependencies and control of workflow managements systems based on supervisory control theory. International Journal of Production Research, 43: 4359–4379.

32.

Yi-Liang

Laortune

and Feng

, Year, How to reuse supervisors when discrete event system models evolve. Proceedings of the 36th IEEE Conference on Decision and Control., 2964–2969 vol.2963.

33.

Zhang

Schervish

Acar

E.U.

and Choset

H.A.C.H.

, Year, Probabilistic methods for robotic landmine search. Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems., 1525–1532 vol.1523.

34.

Zlot

Stentz

Dias

M.B.

and Thayer

S.A.T.S.

, Year, Multi-robot exploration controlled by a market economy. Proceedings of IEEE ICRA., 3016–3023.