Abstract

In order to prove classical planning instances unsolvable, state-of-the-art planners resort to a state space search. However, we show here that an incomplete, yet computationally efficient criterion is sometimes sufficient to immediately identify a wide range of planning instances as unsolvable. This criterion is based on linear and integer programming, and we show in this paper how it can still be leveraged should it fail at first. It is the keystone of various techniques we propose to rewrite and enhance the STRIPS model, so as to gather new information about it. If the newly found pieces of information are not sufficient to identify the instance as unsolvable, they still constitute human-readable knowledge that can provide additional insight into the planning instance.

Keywords

classical planning, linear programming, unsatisfiability

1. Introduction

Current classical planners go through a search phase, with the aim of finding a sequence of actions that satisfies a given goal. They often start with the assumption that such a plan exists, and for the past few decades, significant work has been done on designing more and more efficient techniques to find solution plans. However, various reasons may lead to an instance not admitting any solution. Search-based planners will then explore the state space in its entirety, potentially cutting branches of the search tree, until they realize no plan can be found. The detection of states that cannot lead to any solution is often a byproduct of the heuristics used during search: an infinite heuristic value for an admissible heuristic is a synonym of a dead-end state.

This is why in recent years, there has been a renewed interest in detecting unsolvable planning instances, as illustrated by the 2016 Unsolvability International Planning Competition (Unsolvability IPC). Various techniques have been developed in the last couple of decades, such as dead-end formulas (Cserna et al., 2018) and traps (Lipovetzky et al., 2016; Steinmetz et al., 2022). However, all of these methods are based on the exploration of the state space.

In this article, we propose to leverage a linear programming (LP)- and integer programming (IP)-based criterion to iteratively refine a planning model, to show its unsolvability. The criterion we use is fast to compute and allows us to quickly recognize a wide range of unsolvable planning instances. However, it is not complete, in the sense that it may not recognize some unsolvable instances as such. Nevertheless, we show how to use it to iteratively refine the planning model, and keep gathering additional information about the instance with the aim that our procedure can detect that it is unsolvable.

Most of our techniques to gather information are based on a simple schema: after testing the solvability of a planning instance $Π^{'}$ that is derived from the initial planning problem $Π$ given as input, we deduce additional information about the problem $Π$ if $Π^{'}$ is unsolvable. For instance, if the instance $Π^{'}$ , which is $Π$ where operator $a$ was removed, is proven to be unsolvable, then it means that $a$ appears in all solution plans of $Π$ . In the case where one can efficiently detect some unsolvable planning instances (but not all), then many such derived instances $Π^{'}$ can be tested successfully. As the criterion we use is incomplete but fast, even though it often fails to detect unsolvable instances, it still manages to help gather new information, as many tests can be made in a reasonable time. As more and more information is known about the planning instance, the mathematical program on which the criterion is based can also be enriched with the new knowledge, so that it can detect additional unsolvable instances.

More generally, being able to detect planning instances that have no solution can have various applications in itself. For instance, consider the case where an instance models the attacks that a malicious user may perform on a system, with the goal of accessing restricted data. Finding that no sequence of actions may achieve this shows that the system is secure.

The paper is organized as follows. In Section 2, we introduce our formalism and notations for classical planning. In Section 3, we present the mathematical-programming-based criterion we use throughout this paper. In Section 4, we show how to design tests to gather new information about a planning model. In Section 5, we report our experimental trials on standard sets of benchmarks. Section 6 reviews related work, while Section 7 is devoted to a discussion and perspectives on our findings.

2. Background

2.1. STRIPS Planning Instance

A STRIPS planning instance is a tuple $Π = ⟨ F, I, O, G ⟩$ such that $F$ is a set of propositional variables called fluents, and $I$ is a set of fluents of $F$ , called the initial state. $G$ is a set of literals of $F$ , such that no literal appears at the same time as its negation, and is called the goal. We will denote $G^{+}$ the set of positive literals of $G$ , and $G^{-}$ the set of fluents that appear negated in $G$ . Finally, $O$ is a set of operators: operators $a \in O$ are of the form $a = ⟨ pre (a), eff (a) ⟩$ . $pre (a) \subseteq F$ is the precondition of $a$ , which is a set of fluents. $eff (a)$ is the effect of $a$ , and is a set of literals of $F$ . We will denote ${eff}^{+} (a) = {f \in F ∣ f \in eff (a)}$ the set of positive effects of $a$ , and ${eff}^{-} (a) = {f \in F ∣ \neg f \in eff (a)}$ its negative effects.

Note that we define a version of STRIPS with negative goals. The original STRIPS formulation only specified positive goals, and is not any less expressive: any instance with negative goals can be translated, in polynomial time, into an equivalent instance without negative goals. We nonetheless allow negative goals in our formulation of STRIPS, since later in this paper we investigate the possibility of adding negative goals in order to strengthen a STRIPS instance. But one should keep in mind that most planning instances (and in particular, the ones used in our set of benchmarks) come with positive goals only: this is why we assume $G^{-}$ is empty, except in the section where we consider adding negative goals.

Something similar could be said for the preconditions of operators: standard STRIPS only defines positive preconditions, since any STRIPS instance with negative preconditions can be translated into an equivalent instance without negative preconditions in linear time (Geffner & Bonet, 2013). Some versions of STRIPS allow negative preconditions; however, in this case, disallowing negative preconditions makes the formulation of some of our criteria simpler, hence our choice to use this version of STRIPS.

Without loss of generality, we assume that for all operators $a$ , ${eff}^{+} (a) \cap {eff}^{-} (a) = \emptyset$ . In addition, we will also suppose that ${eff}^{+} (a) \cap pre (a) = \emptyset$ , otherwise the redundant fluents from the effects can be removed. Any planning instance that does not satisfy these criteria can be transformed, in polynomial time, into an equivalent instance that complies with them.

2.2. States and Plans

A state $s$ is an assignment of truth values to all fluents in $F$ . For notational convenience, we associate $s$ with the set of fluents of $F$ which are true in $s$ . An operator $a$ can be applied to states of $Π$ that verify its preconditions. More formally, for any state $s$ , if $pre (a) \subseteq s$ then we define the result of the application of $a$ to $s$ as $s [a] = (s ∖ {eff}^{-} (a)) \cup {eff}^{+} (a)$ .

Given an instance $Π = ⟨ F, I, O, G ⟩$ , a plan is a sequence of operators $π = a_{1}, \dots, a_{k}$ from $O$ such that there exists a sequence of states $s_{0}, \dots, s_{k}$ , such that, for all $i \in {1, \dots, k}$ , the operator $a_{i}$ is applicable in $s_{i - 1}$ , and $s_{i} = s_{i - 1} [a_{i}]$ . A plan is a solution plan if we have, in addition, $s_{0} = I$ , $G^{+} \subseteq s_{k}$ and $G^{-} \cap s_{k} = \emptyset$ . We say that a fluent $f$ is established (respectively, deleted) by some occurrence of an operator $a \in O$ in $π$ if $f$ is false (respectively, true) in the state $s_{i}$ to which this occurrence is applied, but true (respectively, false) after the application of $a$ , in state $s_{i + 1} = s_{i} [a]$ . In the rest of this paper, we will refer to solution plans as simply plans.
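The definitions of Section 2 can be made concrete with a minimal Python sketch. The names `Operator`, `apply_operator`, and `is_solution_plan`, as well as the one-operator instance, are illustrative assumptions and not part of the original formalism; states are represented, as above, by the set of fluents that are true.

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional


class Operator(NamedTuple):
    """Hypothetical encoding of a STRIPS operator a = <pre(a), eff(a)>."""
    name: str
    pre: FrozenSet[str]     # pre(a): positive preconditions
    add: FrozenSet[str]     # eff+(a): positive effects
    delete: FrozenSet[str]  # eff-(a): negative effects


def apply_operator(state: FrozenSet[str], op: Operator) -> Optional[FrozenSet[str]]:
    """Return s[a] = (s \\ eff-(a)) U eff+(a), or None if pre(a) is not included in s."""
    if not op.pre <= state:
        return None
    return (state - op.delete) | op.add


def is_solution_plan(init: Iterable[str], goal_pos: Iterable[str],
                     goal_neg: Iterable[str], plan: Iterable[Operator]) -> bool:
    """Check that the plan is applicable from I and ends in a state satisfying G+ and G-."""
    state: Optional[FrozenSet[str]] = frozenset(init)
    for op in plan:
        state = apply_operator(state, op)
        if state is None:  # some precondition does not hold
            return False
    return frozenset(goal_pos) <= state and not (frozenset(goal_neg) & state)


# Illustrative instance: a single operator moving a token from at_a to at_b.
move = Operator("move", frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"}))
initial_state = frozenset({"at_a"})
```

For example, `is_solution_plan(initial_state, {"at_b"}, {"at_a"}, [move])` holds, whereas applying `move` twice fails because its precondition no longer holds after the first application.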

3. Detecting Unsolvable Instances by LP

This section introduces two equivalent criteria that we use, and extend, to detect a planning instance’s unsolvability. These criteria are incomplete, in the sense that they cannot detect all unsolvable planning instances by themselves. However, they require very limited computational resources, and are fast to run, as they are based on LP or IP. We will show later how to leverage those properties in order to make the most of these criteria when they are not able to detect an instance’s unsolvability by themselves.

3.1. Potential-Based Argument

The first LP formulation that we worked with is based on the following argument. Suppose that we have a numerical function $Φ : F \to R^{+}$ , that associates a potential to each fluent. We can then naturally define the potential of a state $s \subseteq F$ as $Φ (s) = \sum_{f \in s} Φ (f)$ . If one can prove that all goal states have a higher potential than the initial state, but the application of any operator $a$ to any state $s$ leads to a state $s^{'}$ of lesser (or equal) potential, then the planning instance has no solution plan.

Such a function $Φ$ can be found thanks to the following observation. In any plan, the potential of a state $s^{'}$ solely depends on the previous state $s$ , and on the operator $a$ that was applied such that $s [a] = s^{'}$ . In this case, we will say that $a$ induced an increase in potential of $Δ Φ_{a} (s) = Φ (s^{'}) - Φ (s)$ . One can remark that there exists an upper bound for $Δ Φ_{a} (s)$ , which does not depend on $s$ but only on $a$ . Indeed, in the limit case, all fluents $f \in {eff}^{+} (a)$ are effectively established by $a$ , but no fluent $f^{'} \in {eff}^{-} (a)$ is destroyed, except when $f^{'} \in {eff}^{-} (a) \cap pre (a)$ . Recall that we assume, without loss of generality, that ${eff}^{+} (a) \cap pre (a) = \emptyset$ .

More formally, let us consider two sets of operators, with regard to some fluent $f$ : on the one hand, the operators that will surely delete $f$ when applied, that we denote ${S D}_{f}$ ; on the other hand, the operators that could possibly add $f$ when applied, denoted ${P A}_{f}$ . The latter are operators that may establish $f$ in the resulting state $s^{'}$ depending on whether $f$ is false in the previous state $s$ or not. More formally, the sets are defined as follows:

$∙$ ${S D}_{f} = {a ∣ f \in {eff}^{-} (a) \cap pre (a)}$ .
$∙$ ${P A}_{f} = {a ∣ f \in {eff}^{+} (a)}$ .

This leads to the following inequality, which models the limit case previously presented. This effectively gives us an upper bound on the change of potential induced by $a$ from any state $s$ , which we denote $Δ Φ_{a} (s)$ . Observe that the right-hand side is independent of $s$ .
$Δ Φ_{a} (s) \leq \sum_{f \in {eff}^{+} (a)} Φ (f) - \sum_{f \in {eff}^{-} (a) \cap pre (a)} Φ (f) .$
Now suppose that, for all operators $a$ , the right-hand side of the previous inequality is nonpositive. This means that applying any operator cannot increase the potential of the state. As a consequence, states that have a higher potential than the initial state cannot be reached. Note that, as the potential of a state is only determined by the potential of the fluents that are true in this state, and all potentials are nonnegative, $Φ (G^{+})$ is a lower bound for the potential of any goal state. Thus, if we also have that $Φ (G^{+}) > Φ (I)$ , then the planning instance has no solution.

The only remaining issue is to check whether such a potential function $Φ$ exists. As $Φ$ is only determined by its values on the various fluents, this can be done with the following set of equations, with the set of variables $V = {x_{f} ∣ f \in F}$ . Intuitively, $x_{f}$ corresponds to the potential $Φ (f)$ of $f$ .
Linear Program 1
Let $Π = ⟨ F, I, O, G ⟩$ be a planning instance. Define $L_{Π}^{p o t}$ to have the following variables and constraints:

Variables: ${x_{f} ∣ f \in F}$ .

Constraints:
$\begin{aligned} \sum_{f \in G^{+}} x_{f} - \sum_{f \in I} x_{f} & > 0, \end{aligned}$
(1)

$\begin{aligned} \sum_{f \in {eff}^{+} (a)} x_{f} - \sum_{f \in {eff}^{-} (a) \cap pre (a)} x_{f} & \leq 0 \quad (a \in O), \end{aligned}$
(2)

$\begin{aligned} x_{f} & \geq 0 \quad (f \in F) . \end{aligned}$
(3)

Note that the set of constraints above is not a linear program per se, since it contains the strict inequality (1). However, it can be easily reduced to an actual linear program (for instance by replacing $> 0$ by $\geq 1$ in (1)), and we present it as such for the sake of simplicity.
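To illustrate constraints (1) to (3), the following minimal sketch checks whether a candidate assignment of potentials is a solution of Linear Program 1. It only verifies a given assignment and does not search for one; the function name `is_potential_certificate` and the "single token" instance are assumptions made for the example.

```python
from typing import Dict, FrozenSet, Iterable, NamedTuple, Set


class Operator(NamedTuple):
    """Hypothetical encoding of a STRIPS operator."""
    name: str
    pre: FrozenSet[str]     # pre(a)
    add: FrozenSet[str]     # eff+(a)
    delete: FrozenSet[str]  # eff-(a)


def is_potential_certificate(x: Dict[str, float], init: Set[str],
                             goal_pos: Set[str], ops: Iterable[Operator]) -> bool:
    """Return True iff the potentials x satisfy constraints (1), (2) and (3)."""
    # (3): every potential is nonnegative.
    if any(value < 0 for value in x.values()):
        return False
    # (1): the positive goal fluents weigh strictly more than the initial state.
    if sum(x[f] for f in goal_pos) - sum(x[f] for f in init) <= 0:
        return False
    # (2): no operator can increase the potential, even in the limit case.
    for a in ops:
        gain = sum(x[f] for f in a.add) - sum(x[f] for f in a.delete & a.pre)
        if gain > 0:
            return False
    return True


# Illustrative instance: a single token moves between a and b,
# but the goal requires both fluents to hold at once.
swap_ab = Operator("swap_ab", frozenset({"a"}), frozenset({"b"}), frozenset({"a"}))
swap_ba = Operator("swap_ba", frozenset({"b"}), frozenset({"a"}), frozenset({"b"}))
ops = [swap_ab, swap_ba]
init, goal_pos = {"a"}, {"a", "b"}

print(is_potential_certificate({"a": 1, "b": 1}, init, goal_pos, ops))  # True
print(is_potential_certificate({"a": 1, "b": 0}, init, goal_pos, ops))  # False
```

In this instance, giving both fluents a potential of 1 makes every operator potential-neutral while the goal weighs 2 against 1 for the initial state, so the assignment is a certificate of unsolvability.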

The following proposition follows from the discussion further above.
Proposition 1
Let $Π$ be a STRIPS instance. Suppose that there exists a solution to Linear Program 1. Then $Π$ has no solution.
Proof.
Suppose that the hypothesis is true, and let $x_{f}$ ( $f \in F$ ) be a solution to Linear Program 1. As outlined in the discussion above, for each state $s$ , we associate the potential:
$Φ (s) = \sum_{f \in s} x_{f} .$
A solution plan must be valid for any initial state satisfying the initial condition and, in particular, for the initial state $s_{0} = I$ in which all fluents not in $I$ are false. Constraints (1) and (3) tell us that $Φ (s_{0}) < Φ (G^{+}) \leq Φ (s_{k})$ at (any) goal state $s_{k}$ . If an operator $a \in O$ transforms state $s_{i - 1}$ into state $s_{i}$ , then the maximum of $Δ Φ_{a} (s_{i - 1}) = Φ (s_{i}) - Φ (s_{i - 1})$ , that is the net gain in potential, is attained when each positive effect $f \in {eff}^{+} (a)$ is established (i.e., $f$ was not already true in $s_{i - 1}$ ) and only those negative effects $f \in {eff}^{-} (a) \cap pre (a)$ actually occur (i.e., negative effects $f \in {eff}^{-} (a) ∖ pre (a)$ were already false in $s_{i - 1}$ ). Constraints (2) and (3) tell us that $Φ (s_{i - 1}) \geq Φ (s_{i})$ for any consecutive states in a solution plan $s_{0}, \dots, s_{k}$ . Clearly a goal state $s_{k}$ can never be reached since potentials of all states reachable from $s_{0}$ remain less than or equal to $Φ (s_{0})$ .

Note that the converse of Proposition 1 is not true: not all unsolvable planning instances are detected by the criterion we propose.

Our definition of potential is reminiscent of potential heuristics, as introduced by Pommerening et al. (2015): facts are assigned numerical values, also called potentials, which uniquely and compactly define the potentials of all states of the planning instance. In their paper, the authors do so in order to synthesize a heuristic $h^{pot}$ , which is a linear combination of these potentials. They show that the consistency, the goal awareness, and thus the admissibility of the heuristic can all be expressed in terms of linear constraints, and potential heuristics are thus easy to synthesize through a linear program.

The main differences with our work lie in two points. First, potential heuristics are defined for planning tasks in Finite Domain Representation (FDR), where variables describing states are not binary, but can have domains of arbitrary size. They are then, in theory, slightly more general than our formulation. Second, while potentials in Pommerening et al. (2015) range over $R$ , we only allow nonnegative potentials for fluents. These two points, combined, allow us to have a simple formulation for our linear program, without requiring the operators to be in transition normal form (TNF). TNF requires every fluent found in the effects of an action $a$ to also appear in its precondition. In FDR, TNF can be achieved through a transformation of any noncompliant action into a set of equivalent actions. When the planning instance cannot be transformed into an equivalent instance in TNF of reasonable size, the synthesis of potential heuristics, as well as their use, becomes slightly more tedious. In our work, we propose a criterion based on potentials that does not require a planning instance to be in TNF, and we rely on STRIPS and the nonnegativity of the fluents’ potentials for this.

3.2. Dual Linear Program

The linear program presented in the previous section is hard to interpret, as the concept of potential we introduced has no reality outside of the criterion. However, we show in this section how to transform it into another program that can equivalently allow us to detect some unsolvable instances, but whose result is easier to interpret.

To this effect, we resort to Farkas’s lemma, a theorem of the alternative that is closely related to duality in LP. One version of this lemma states that exactly one of the following systems of inequalities has a solution: either (i) $Ay \geq d$ where $y \geq 0$ , or (ii) $A^{t} x \leq 0$ and $d^{t} x > 0$ where $x \geq 0$ , where $A$ is a matrix and $x, y$ , and $d$ are vectors of the appropriate sizes. Let us consider the set of constraints of Linear Program 1. By Farkas’s lemma, it has a solution iff the following system has no solution:

Linear Program 2 ( $L_{Π}^{op}$ )

Let $Π = ⟨ F, I, O, G ⟩$ be a planning instance. We define $L_{Π}^{op}$ as follows:

Variables: $V =$ ${y_{a} ∣ a \in O} .$

Constraints $C$ :

$\begin{aligned} \sum_{a \in {P A}_{f}} y_{a} - \sum_{a \in {S D}_{f}} y_{a} & \geq δ_{f}^{-} \quad (f \in F), \end{aligned}$

(4)

$\begin{aligned} y_{a} & \geq 0 \quad (a \in O), \end{aligned}$

(5)

where $δ_{f}^{-} = 1_{G^{+}} (f) - 1_{I} (f)$ ( $1_{S} (x)$ being the indicator function of set $S$ : $1_{S} (x) = 1$ if $x \in S$ , and 0 otherwise). In this context, the variable $y_{a}$ corresponds to the number of times operator $a$ is executed in some sequence of actions. Note that $y_{a}$ is nonnegative, but not necessarily integral: this allows us to obtain a polynomial time relaxation of the STRIPS instance. Inequality (4) states that the number of possible establishments of $f$ minus the number of sure destructions of $f$ must be greater than or equal to $δ_{f}^{-}$ . For instance, any fluent that appears positively in the goal but not in the initial state must be established at least once more than it is surely deleted in any plan. This dual version of our original linear program provides an alternative insight into the meaning of Proposition 1.
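As an illustration, the sketch below builds constraints (4) and (5) for a small instance and decides their feasibility over the rationals. The exact Fourier-Motzkin elimination used here is only a stand-in for an LP solver, chosen so that the example has no external dependency; it is exponential in general and only suited to toy instances. The names `lp2_rows` and `feasible`, and the instance itself, are assumptions made for the example.

```python
from fractions import Fraction
from typing import FrozenSet, List, NamedTuple, Sequence, Set, Tuple


class Operator(NamedTuple):
    """Hypothetical encoding of a STRIPS operator."""
    name: str
    pre: FrozenSet[str]     # pre(a)
    add: FrozenSet[str]     # eff+(a)
    delete: FrozenSet[str]  # eff-(a)


# A row (coeffs, bound) encodes: sum_j coeffs[j] * y_j >= bound.
Row = Tuple[List[Fraction], Fraction]


def lp2_rows(fluents: Set[str], init: Set[str], goal_pos: Set[str],
             ops: Sequence[Operator]) -> List[Row]:
    """Constraints (4) and (5), with one column per operator of ops."""
    n = len(ops)
    rows: List[Row] = []
    for f in sorted(fluents):
        # +1 if a is in PA_f, -1 if a is in SD_f, 0 otherwise.
        coeffs = [Fraction(int(f in a.add) - int(f in a.pre and f in a.delete))
                  for a in ops]
        rows.append((coeffs, Fraction(int(f in goal_pos) - int(f in init))))
    for j in range(n):  # (5): y_a >= 0
        rows.append(([Fraction(int(i == j)) for i in range(n)], Fraction(0)))
    return rows


def feasible(rows: Sequence[Row], n: int) -> bool:
    """Exact rational feasibility test by Fourier-Motzkin elimination."""
    for j in range(n):
        pos = [r for r in rows if r[0][j] > 0]
        neg = [r for r in rows if r[0][j] < 0]
        rows = [r for r in rows if r[0][j] == 0]
        for cp, bp in pos:
            for cn, bn in neg:
                lp, ln = -cn[j], cp[j]  # positive multipliers cancelling y_j
                rows.append(([lp * p + ln * q for p, q in zip(cp, cn)],
                             lp * bp + ln * bn))
    # Only rows of the form 0 >= bound remain.
    return all(bound <= 0 for _, bound in rows)


swap_ab = Operator("swap_ab", frozenset({"a"}), frozenset({"b"}), frozenset({"a"}))
swap_ba = Operator("swap_ba", frozenset({"b"}), frozenset({"a"}), frozenset({"b"}))
ops = [swap_ab, swap_ba]

# Goal {a, b} with a single token: the two constraints (4) sum to 0 >= 1.
print(feasible(lp2_rows({"a", "b"}, {"a"}, {"a", "b"}, ops), len(ops)))  # False
# Goal {b} only: y_swap_ab = 1, y_swap_ba = 0 is a solution.
print(feasible(lp2_rows({"a", "b"}, {"a"}, {"b"}, ops), len(ops)))       # True
```

In the first call the system has no solution, so the instance is detected as unsolvable; in the second, the system is feasible and the criterion is inconclusive.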

Lemma 1

Let $Π = ⟨ F, I, O, G ⟩$ be a planning instance, $L_{Π}^{op}$ as defined in Linear Program 2, and $π$ a solution plan for $Π$ . Let $c_{π} : O \to N$ be the function that maps each operator of $O$ to its number of occurrences in $π$ . Then the assignment where, for all $a \in O$ , $y_{a} = c_{π} (a)$ , is a solution for $L_{Π}^{op}$ .

Proof.

Let $y_{a}$ ( $a \in O$ ) be as defined above. We will show that this is a solution for $L_{Π}^{op}$ . For each fluent $f$ , let us denote $e_{f}$ the number of times a fluent is established during the execution of $π$ , and $d_{f}$ the number of times it is destroyed. Recall that a fluent $f$ is established (respectively, deleted) by some occurrence of an operator $a \in O$ in $π$ if $f$ is false (respectively, true) before the application of the operator, but true (respectively, false) after. As $π$ is a solution plan, we have that:

$1_{G^{+}} (f) - 1_{I} (f) \leq e_{f} - d_{f},$

which can be shown by case disjunction on whether $f$ is in $I$ , $G^{+}$ , both or neither. We denote the inequalities above in a more concise way:

$δ_{f}^{-} \leq e_{f} - d_{f} .$

In addition, in the extreme case, $f$ is established in $π$ at most as many times as there are occurrences of operators $a$ with $f \in {eff}^{+} (a)$ . Hence,

$e_{f} \leq \sum_{a \in {P A}_{f}} y_{a} .$

Similarly, the only operators $a \in O$ whose applications are guaranteed to destroy $f$ are such that $f \in pre (a) \cap {eff}^{-} (a)$ . Thus,

$d_{f} \geq \sum_{a \in {S D}_{f}} y_{a} .$

By combining both inequalities above, we have

$\begin{aligned} δ_{f}^{-} & \leq e_{f} - d_{f} \\ & \leq \sum_{a \in {P A}_{f}} y_{a} - \sum_{a \in {S D}_{f}} y_{a}, \end{aligned}$

(6)

which means that $y_{a}$ ( $a \in O$ ) satisfies the constraints of the form of inequality (4) of $L_{Π}^{op}$ . As a consequence, as each $y_{a}$ is also positive, this is a solution to $L_{Π}^{op}$ .

The contrapositive of Lemma 1 is an alternative proof that, if $L_{Π}^{op}$ has no solution, then neither has $Π$ . But it allows us to show more than that, as we have the following corollaries, that we use later on:

Corollary 1

If $L_{Π}^{op}$ has no integral solution, then the associated planning instance $Π$ has no solution.

Proof.

The proof is immediate, as each operator appears an integral number of times in any solution plan $π$ .
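Lemma 1 can be illustrated on a small example: the sketch below counts the occurrences of each operator in a solution plan and checks that these counts satisfy inequalities (4) and (5). The function name `operator_counts_satisfy_lp2` and the three-operator instance are assumptions made for the example.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Mapping, NamedTuple, Set


class Operator(NamedTuple):
    """Hypothetical encoding of a STRIPS operator."""
    name: str
    pre: FrozenSet[str]     # pre(a)
    add: FrozenSet[str]     # eff+(a)
    delete: FrozenSet[str]  # eff-(a)


def operator_counts_satisfy_lp2(counts: Mapping[str, int], fluents: Set[str],
                                init: Set[str], goal_pos: Set[str],
                                ops: Iterable[Operator]) -> bool:
    """Check inequalities (4) and (5) for the assignment y_a = counts[a]."""
    ops = list(ops)
    if any(counts[a.name] < 0 for a in ops):  # (5)
        return False
    for f in fluents:  # (4), one inequality per fluent
        possible_adds = sum(counts[a.name] for a in ops if f in a.add)
        sure_deletes = sum(counts[a.name] for a in ops
                           if f in a.pre and f in a.delete)
        delta = int(f in goal_pos) - int(f in init)
        if possible_adds - sure_deletes < delta:
            return False
    return True


# Illustrative instance: a token moves a -> b -> c, and may go back from b to a.
ab = Operator("ab", frozenset({"a"}), frozenset({"b"}), frozenset({"a"}))
bc = Operator("bc", frozenset({"b"}), frozenset({"c"}), frozenset({"b"}))
ba = Operator("ba", frozenset({"b"}), frozenset({"a"}), frozenset({"b"}))
ops = [ab, bc, ba]
fluents, init, goal_pos = {"a", "b", "c"}, {"a"}, {"c"}

# A (non-optimal) solution plan, checked by direct simulation.
plan = [ab, ba, ab, bc]
state = set(init)
for op in plan:
    assert op.pre <= state  # the operator is applicable
    state = (state - op.delete) | op.add
assert goal_pos <= state

counts = Counter(op.name for op in plan)  # y_ab = 2, y_ba = 1, y_bc = 1
print(operator_counts_satisfy_lp2(counts, fluents, init, goal_pos, ops))  # True
# Counts that correspond to no plan may violate (4): here b is deleted but never added.
print(operator_counts_satisfy_lp2(Counter({"bc": 1}), fluents, init, goal_pos, ops))  # False
```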

Linear Program 2 is, in fact, an LP formulation of the state equation heuristic (Bonet, 2013), as previously shown in Pommerening et al. (2014). Its efficiency for detecting unsolvable planning instances has been shown before, as it is part of the Aidos planner, which won the Unsolvability IPC in 2016 (Seipp et al., 2016). The planner uses the LP formulation of the operator-counting heuristic to detect dead ends during search, working on an FDR of the instance. As a consequence, Aidos can detect every instance on which our criterion succeeds: when Linear Program 2 detects some instance $Π$ as unsolvable, Aidos detects it too when exploring the initial state, since it is detected as a dead-end. When Linear Program 2 fails to detect an unsolvable instance, Aidos carries on with a search phase, aided by other components too, and has another chance at finding its unsolvability. In contrast, we do not resort to search in this case, but show how to rewrite the model directly, and how to adapt the linear program accordingly.

Even though we introduced $L_{Π}^{op}$ as a linear program, we showed with Lemma 1 that one can also see it as an integer program. Solving an integer program is notoriously harder and slower than solving a linear program. However, as the integral solutions of the set of constraints form a subset of its set of rational solutions, testing the solvability of the program over integral solutions is more likely to prove that the associated planning instance has no solution. Note that Farkas’s lemma does not apply in the integral case, hence the need for Lemma 1.

Since Linear Program 2 is the dual of Linear Program 1, and both programs can thus be used equivalently, we will only use Linear Program 2 in the rest of this paper. This is motivated by the fact that its formulation is more intuitive, since counting the occurrences of each operator in some plan is more tangible than associating an abstract potential to fluents. This allows us to reuse Linear Program 2 in a variety of ways that our initial formulation does not permit, as will be shown later.

In the next section, we show that, in the case where the criterion introduced here fails, it can still be leveraged to gather additional information about $Π$ .

4. Enhancing the Planning Problem

This section is dedicated to extending and adding information to the initial planning instance, mainly with the goal of proving it unsolvable. Through various methods, we either add or remove elements from the input model $Π$ , or add information about $Π$ that is not directly encodable into the model, but that can nevertheless be included in the linear program or used to make deductions. In order to do so, we will resort to two kinds of methods. In the first one, we build variations of $Π$ so that, if one of these variations is deemed unsolvable through the previous linear program, then some additional information about $Π$ can be deduced. In the second method, we do not consider per se a variation $Π^{'}$ of $Π$ , but we directly modify the linear program $L_{Π}^{op}$ associated with $Π$ , so that if it is unsolvable, we can deduce new specific information about $Π$ .

In the following, we call operation any such method. In the specific case where the operation answers a boolean question (e.g., Is an action removable?), we call it a test.

Note that ultimately, our goal is to detect unsolvable instances as such. As the set of solution plans of an unsolvable instance is empty, it trivially satisfies various properties that are otherwise uncommon. For instance, any operator can be removed without altering the set of solutions. This is why we search for properties on elements of $Π$ that are unlikely to hold in solvable instances, but that are reasonable in our setting.

In the rest of this section, we illustrate the previous general principles through various operations, that allow us to find new information about the planning instance given as input. As our goal is to detect unsolvable instances, in the following, we assume that the criterion could not detect, at first, that the instance is unsolvable and that we have to gather additional information in order to do so. This section concludes with a very simple example showing that, were the criterion to fail at first, it can still be used to prove an instance unsolvable.

4.1. Operator Counts and Landmarks

4.1.1. Landmark Detection

An operator $a \in O$ is a landmark for $Π$ if $a$ occurs at least once in every solution plan. We maintain throughout our procedure a set $L \subseteq O$ of landmarks. With regard to our framework, we can test if an operator is a landmark by removing it from the model and testing if the instance is deemed unsolvable. More formally,

Lemma 2
Let $Π = ⟨ F, I, O, G ⟩$ and $a \in O$ . If $Π_{∣ a} = ⟨ F, I, O ∖ {a}, G ⟩$ is unsolvable, then $a$ is a landmark.

This allows us to define the landmark detection test below, where $Π_{∣ a}$ is as defined in Lemma 2.
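A minimal sketch of such a test, following Lemma 2, is given below: each operator is removed in turn and Linear Program 2 is tested on the reduced instance. The exact Fourier-Motzkin elimination is only a dependency-free stand-in for an LP solver, suited to toy instances; the function names and the instance are assumptions made for the example.

```python
from fractions import Fraction
from typing import FrozenSet, List, NamedTuple, Sequence, Set, Tuple


class Operator(NamedTuple):
    """Hypothetical encoding of a STRIPS operator."""
    name: str
    pre: FrozenSet[str]     # pre(a)
    add: FrozenSet[str]     # eff+(a)
    delete: FrozenSet[str]  # eff-(a)


Row = Tuple[List[Fraction], Fraction]  # sum_j coeffs[j] * y_j >= bound


def lp2_rows(fluents: Set[str], init: Set[str], goal_pos: Set[str],
             ops: Sequence[Operator]) -> List[Row]:
    """Constraints (4) and (5) of Linear Program 2."""
    n = len(ops)
    rows: List[Row] = []
    for f in sorted(fluents):
        coeffs = [Fraction(int(f in a.add) - int(f in a.pre and f in a.delete))
                  for a in ops]
        rows.append((coeffs, Fraction(int(f in goal_pos) - int(f in init))))
    for j in range(n):
        rows.append(([Fraction(int(i == j)) for i in range(n)], Fraction(0)))
    return rows


def feasible(rows: Sequence[Row], n: int) -> bool:
    """Exact rational feasibility test by Fourier-Motzkin elimination."""
    for j in range(n):
        pos = [r for r in rows if r[0][j] > 0]
        neg = [r for r in rows if r[0][j] < 0]
        rows = [r for r in rows if r[0][j] == 0]
        for cp, bp in pos:
            for cn, bn in neg:
                lp, ln = -cn[j], cp[j]
                rows.append(([lp * p + ln * q for p, q in zip(cp, cn)],
                             lp * bp + ln * bn))
    return all(bound <= 0 for _, bound in rows)


def detect_landmarks(fluents: Set[str], init: Set[str], goal_pos: Set[str],
                     ops: Sequence[Operator]) -> List[str]:
    """Names of the operators a such that the LP of the instance without a is infeasible."""
    landmarks = []
    for a in ops:
        remaining = [b for b in ops if b is not a]
        if not feasible(lp2_rows(fluents, init, goal_pos, remaining), len(remaining)):
            landmarks.append(a.name)  # the reduced instance is unsolvable (Lemma 2)
    return landmarks


# Illustrative instance: a token moves a -> b -> c, and may go back from b to a.
ab = Operator("ab", frozenset({"a"}), frozenset({"b"}), frozenset({"a"}))
bc = Operator("bc", frozenset({"b"}), frozenset({"c"}), frozenset({"b"}))
ba = Operator("ba", frozenset({"b"}), frozenset({"a"}), frozenset({"b"}))

print(detect_landmarks({"a", "b", "c"}, {"a"}, {"c"}, [ab, bc, ba]))  # ['ab', 'bc']
```

Since the criterion is incomplete, an operator that is not reported by this sketch may still be a landmark; only the positive answers are conclusive.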

4.1.2. Operator Count

One can generalize the notion of landmark, by counting the least number of times an operator appears in any solution plan. This is why we define the function $n^{-} : O \to N$ , such that $n^{-} (a)$ is the least number of occurrences of action $a$ in any plan. Likewise, we define $n^{+} (a)$ as the maximum number of times $a$ appears in any plan. With this notation, $a \in O$ is a landmark iff $n^{-} (a) \geq 1$ .

Reasoning on the number of occurrences of some operator $a \in O$ can be done through Linear Program 2. Indeed, as the variables correspond to the number of occurrences of each operator in some sequence of actions, one only has to find lower and upper bounds for each variable $y_{a}$ in a solution of LP 2. This is why one can compute approximate values for $n^{+} (a)$ and $n^{-} (a)$ through an integral and optimization variation of LP 2, that we present below:

Integer Program 1 ( $L_{Π}^{opt} (a)$ )

Let $Π = ⟨ F, I, O, G ⟩$ be a planning instance, with $O = {a_{1}, \dots, a_{m}}$ , and $L_{Π}^{op} (V, C)$ the associated Linear Program 2. For $a \in O$ , define $L_{Π}^{opt} (a)$ with the following variables and constraints:

Variables: ${y_{a} ∣ a \in O}$ .

Constraints: Same as $L_{Π}^{op}$ except that the $y_{a}$ are integral.

Objective function $g : N^{m} ⟶ N$ :

$g : y_{a_{1}}, \dots, y_{a_{m}} ⟼ y_{a} .$

Lemma 3
Let $Π$ be a planning instance, $a \in O$ an operator, and consider the integer program $L_{Π}^{opt} (a)$ with objective function $g$ . Then minimizing (respectively, maximizing) $g$ yields a lower (respectively, an upper) bound on the value of $n^{-} (a)$ (respectively, $n^{+} (a)$ ).
Proof.
The proof is a consequence of Lemma 1. We consider only the case where $g$ is minimized, as the proof for the other case is mostly identical. We denote $n_{L}^{-}$ the value obtained by minimizing $g$ in $L_{Π}^{opt} (a)$ , where $a \in O$ is fixed. Suppose for a contradiction that $n^{-} (a) < n_{L}^{-}$ . Then there exists a plan $π_{a}$ where $a$ occurs exactly $n^{-} (a)$ times, by definition. By Lemma 1, there exists a solution $Y_{π_{a}}$ for $L_{Π}^{op}$ where $Y_{π_{a}} (a) = n^{-} (a) < n_{L}^{-}$ , which contradicts the optimality of $n_{L}^{-}$ . Consequently, we have $n_{L}^{-} \leq n^{-} (a)$ .
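
As a worked illustration of Lemma 3, consider a small hypothetical instance (not taken from our benchmarks) with $F = {p, q, r}$ , $I = {p}$ , $G = {r}$ and three operators $a_{1} = ⟨ {p}, {q, \neg p} ⟩$ , $a_{2} = ⟨ {q}, {r, \neg q} ⟩$ and $a_{3} = ⟨ {q}, {p, \neg q} ⟩$ . Writing $y_{i}$ for $y_{a_{i}}$ , the constraints of the form (4) read:

$\begin{aligned} y_{3} - y_{1} & \geq - 1 \quad (f = p), \\ y_{1} - y_{2} - y_{3} & \geq 0 \quad (f = q), \\ y_{2} & \geq 1 \quad (f = r) . \end{aligned}$

Summing the first two inequalities gives $y_{2} \leq 1$ , so that every solution has $y_{2} = 1$ : minimizing and maximizing $g$ for $a_{2}$ both yield 1, hence $n^{-} (a_{2}) \geq 1$ and $n^{+} (a_{2}) \leq 1$ . For $a_{1}$ , the second inequality gives $y_{1} \geq y_{2} + y_{3} \geq 1$ , so the minimization yields 1, whereas the maximization is unbounded (take $y_{1} = t + 1$ , $y_{2} = 1$ and $y_{3} = t$ for any $t \in N$ ) and provides no finite upper bound on $n^{+} (a_{1})$ .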

This gives the following two tests.

In the rest of this paper, we will often use the notation $OpCount (a)$ to refer to the successive application of ${OpCount}^{-} (a)$ and ${OpCount}^{+} (a)$ .

4.1.3. Using Operator Counts

We use the above tests to maintain estimates of the values of $n^{+} (a)$ and $n^{-} (a)$ . Once a nontrivial value for some $n^{+} (a)$ or some $n^{-} (a)$ has been found (i.e., a finite or nonzero value, respectively), one can reintroduce it into the linear program in the form of additional constraints. These constraints can be introduced in either $L_{Π}^{op}$ or $L_{Π}^{opt}$ , as both programs use the same sets of variables and constraints. As the variables of the linear programs correspond to the number of occurrences of operators in some plan, adding these constraints is straightforward for every $a \in O$ :

$\begin{aligned} y_{a} & \leq n^{+} (a), \\ y_{a} & \geq n^{-} (a) . \end{aligned}$

4.2. Operator Mutexes

Operator mutexes are unordered pairs of operators ${a_{1}, a_{2}}$ that cannot both appear in the same solution plan for $Π$ . Such operators can still be part of some solution plan, on their own. As an illustration, suppose that $a_{1}$ and $a_{2}$ both include fluent $f$ in their respective preconditions and in their delete effects. If $f$ does not belong to the add effects of any operator of $Π$ , then $a_{1}$ and $a_{2}$ are operator mutexes, as they are in competition for the nonrenewable resource $f$ . We maintain, throughout the execution of our procedure, a set $M_{O}$ of operator mutexes. One may think that operator mutexes do not occur often in planning models, as it is a property that concerns the set of solution plans as a whole. However, our aim is to detect unsolvable planning instances, which have by definition no solution plan: all pairs of operators are thus operator mutexes.

4.2.1. Finding Operator Mutexes Through LP

In order to check if two operators $a_{1}, a_{2} \in O$ are operator mutexes, it suffices to build the linear program $L_{Π}^{opm} (a_{1}, a_{2})$ and check if it is feasible or not. In essence, $L_{Π}^{opm} (a_{1}, a_{2})$ consists of the same set of constraints as $L_{Π}^{op}$ , except that it has the following additional constraints:

$\begin{aligned} y_{a_{1}} & \geq 1, \end{aligned}$

(7)

$\begin{aligned} y_{a_{2}} & \geq 1. \end{aligned}$

(8)

If $L_{Π}^{opm} (a_{1}, a_{2})$ does not admit a solution, then it means that both operators cannot appear simultaneously in a plan, and thus are operator mutexes. Note that adding only one of the two inequalities above could also make the linear system unsolvable. In that case, this would mean that either $a_{1}$ or $a_{2}$ does not appear in any solution plan. This is something that we test before resorting to operator mutex tests. More generally, we check if some operators can be removed from the planning model. We present various techniques for doing so in the next section. Below, we introduce the notation for the test corresponding to the linear program we presented above:
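
As a hedged illustration of this test, the sketch below adds inequalities (7) and (8) to the constraints of $L_{Π}^{op}$ and checks feasibility. The exact Fourier-Motzkin elimination is a dependency-free stand-in for an LP solver, only suited to toy instances; the function names and the instance, modeled on the nonrenewable resource example of Section 4.2, are assumptions made for the example.

```python
from fractions import Fraction
from typing import FrozenSet, List, NamedTuple, Sequence, Set, Tuple


class Operator(NamedTuple):
    """Hypothetical encoding of a STRIPS operator."""
    name: str
    pre: FrozenSet[str]     # pre(a)
    add: FrozenSet[str]     # eff+(a)
    delete: FrozenSet[str]  # eff-(a)


Row = Tuple[List[Fraction], Fraction]  # sum_j coeffs[j] * y_j >= bound


def lp2_rows(fluents: Set[str], init: Set[str], goal_pos: Set[str],
             ops: Sequence[Operator]) -> List[Row]:
    """Constraints (4) and (5) of Linear Program 2."""
    n = len(ops)
    rows: List[Row] = []
    for f in sorted(fluents):
        coeffs = [Fraction(int(f in a.add) - int(f in a.pre and f in a.delete))
                  for a in ops]
        rows.append((coeffs, Fraction(int(f in goal_pos) - int(f in init))))
    for j in range(n):
        rows.append(([Fraction(int(i == j)) for i in range(n)], Fraction(0)))
    return rows


def feasible(rows: Sequence[Row], n: int) -> bool:
    """Exact rational feasibility test by Fourier-Motzkin elimination."""
    for j in range(n):
        pos = [r for r in rows if r[0][j] > 0]
        neg = [r for r in rows if r[0][j] < 0]
        rows = [r for r in rows if r[0][j] == 0]
        for cp, bp in pos:
            for cn, bn in neg:
                lp, ln = -cn[j], cp[j]
                rows.append(([lp * p + ln * q for p, q in zip(cp, cn)],
                             lp * bp + ln * bn))
    return all(bound <= 0 for _, bound in rows)


def is_operator_mutex(fluents: Set[str], init: Set[str], goal_pos: Set[str],
                      ops: Sequence[Operator], i: int, j: int) -> bool:
    """True iff the LP extended with y_i >= 1 (7) and y_j >= 1 (8) is infeasible."""
    n = len(ops)

    def at_least_one(k: int) -> Row:
        return ([Fraction(int(m == k)) for m in range(n)], Fraction(1))

    rows = lp2_rows(fluents, init, goal_pos, ops) + [at_least_one(i), at_least_one(j)]
    return not feasible(rows, n)


# a1 and a2 both consume f, which no operator can add back.
a1 = Operator("a1", frozenset({"f"}), frozenset({"g"}), frozenset({"f"}))
a2 = Operator("a2", frozenset({"f"}), frozenset({"g", "h"}), frozenset({"f"}))
a3 = Operator("a3", frozenset({"g"}), frozenset({"h"}), frozenset())
ops = [a1, a2, a3]
fluents, init, goal_pos = {"f", "g", "h"}, {"f"}, {"g"}

print(is_operator_mutex(fluents, init, goal_pos, ops, 0, 1))  # True
print(is_operator_mutex(fluents, init, goal_pos, ops, 0, 2))  # False
```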

4.2.2. Using Operator Mutexes

Including the information we have about operator mutexes in the linear program is straightforward. Knowing that ${a_{1}, a_{2}}$ is an operator mutex translates into barring certain solutions of $L_{Π}^{op}$ from appearing. Concretely, we wish to ensure that, when $y_{a_{1}} \geq 1$ , then $y_{a_{2}} = 0$ , and conversely.

To achieve this, we introduce binary variables of the form $u_{a} \in {0, 1}$ , for each operator $a \in O$ . We wish to enforce that $u_{a} = 1$ whenever $y_{a} \geq 1$ . This can be done through the following set of constraints, which also bars operator mutexes from appearing.

\begin{aligned} y_{a} - n^{+} (a) \cdot u_{a} & \leq 0 (a \in O), \end{aligned}

(9)

\begin{aligned} u_{a_{1}} + u_{a_{2}} & \leq 1 ({a_{1}, a_{2}} \in M_{O}) . \end{aligned}

(10)
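To make the role of the indicator variables concrete, the sketch below checks whether a given vector of operator counts is compatible with constraints (9) and (10). It assumes non-negative integer counts and finite bounds $n^{+} (a)$ ; the function name and the dictionaries are hypothetical. Since (9) forces $u_{a} = 1$ exactly when $y_{a} \geq 1$ and (10) only gets harder as indicators grow, it is enough to test the smallest admissible indicator vector.

```python
def respects_mutexes(y, n_plus, mutexes):
    """Check whether some binary u satisfies (9) and (10) for the counts y.

    y:       dict operator -> non-negative integer count
    n_plus:  dict operator -> finite upper bound n^+(a)
    mutexes: iterable of operator pairs (the set M_O)
    """
    # Smallest indicator vector allowed by (9).
    u = {a: 1 if y[a] >= 1 else 0 for a in y}
    c9 = all(y[a] - n_plus[a] * u[a] <= 0 for a in y)
    c10 = all(u[a1] + u[a2] <= 1 for a1, a2 in mutexes)
    return c9 and c10

n_plus = {"a1": 3, "a2": 2, "a3": 5}
mutexes = [("a1", "a2")]

ok = respects_mutexes({"a1": 2, "a2": 0, "a3": 1}, n_plus, mutexes)
both_used = respects_mutexes({"a1": 1, "a2": 1, "a3": 0}, n_plus, mutexes)
over_bound = respects_mutexes({"a1": 4, "a2": 0, "a3": 0}, n_plus, mutexes)
```

The second call fails because both members of a mutex pair are used, and the third because the count of `a1` exceeds its bound, which is why a finite $n^{+} (a)$ is needed for (9) to be usable.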

4.3. Detection of Removable Actions

This section is concerned with finding operators $a \in O$ that never appear in any solution plan. Even though some such operators can be detected statically by the parser that we use, some others require additional computation. We present various techniques that allow us to detect if an operator can be immediately removed from the planning instance, without altering its set of solutions.

4.3.1. Through a Modification of the Linear Program

We start by extending $L_{Π}^{op}$ into $L_{Π}^{ro} (a)$ through the addition of the constraint $y_{a} \geq 1$ . If $L_{Π}^{ro} (a)$ has no solution, then $Π$ has no solution where $a$ occurs at least once and $a$ can thus be removed from the model.

We do not elaborate on this argument further, as it is a special case of the technique seen in Section 4.1. Indeed, it is equivalent to showing that $n^{+} (a) = 0$ , which ensures that $a$ does not occur in any solution plan. However, this argument allows us to find removable operators that are not detected by a test proposed later in this subsection.

4.3.2. Unreachable Preconditions

A simple way to prove that some operator $a$ will never be part of any plan is to prove that no reachable state satisfies its precondition. This can be done by testing that the planning instance $Π_{a}^{pre} = ⟨ F, I, O, pre (a) ⟩$ is unsolvable.

Removing some operators shrinks the linear program $L_{Π}^{op}$ , by the deletion of the associated variables and constraints. Since deleting a variable amounts to fixing it to 0, the set of feasible solutions can only get smaller. As a consequence, it can help prove some instances unsolvable. We introduce below the notation for the associated test:

4.3.3. Dead-End Operators

As it is possible to test whether or not there exists a reachable state where $a$ can be applied, it is natural to ask the opposite: does $a$ always lead to a dead-end, where no goal state can be reached?

This subsection is dedicated to finding such operators, called dead-end operators. In order to do so, we need to restrict ourselves to the few fluents whose truth value is known in all states resulting from the application of $a$ , either because of the effects of $a$ or by inertia. Indeed, these fluents are the only ones for which we have enough information about their truth value to reason about. Let $F_{a} = pre (a) \cup {eff}^{+} (a) \cup {eff}^{-} (a)$ . For any set $S$ of literals of $F$ , and $E \subseteq F$ , we denote by $S_{∣ E}$ the projection of $S$ over the fluents $E$ . Likewise, we denote by $a_{∣ E} = ⟨ pre (a)_{∣ E}, eff (a)_{∣ E} ⟩$ the projection of operator $a$ over $E$ . For any $O^{'} \subseteq O$ , we also denote $O_{∣ E}^{'} = \{a_{∣ E} ∣ a \in O^{'}\}$ . This leads us to the following lemma:

Lemma 4
Let $Π = ⟨ F, I, O, G ⟩$ be a planning instance and $Π_{a}^{post} = ⟨ F_{a}, I_{a}^{post}, O_{∣ F_{a}}, G_{∣ F_{a}} ⟩$ , where $I_{a}^{post} = (pre (a) ∖ {eff}^{-} (a)) \cup {eff}^{+} (a)$ . If $Π_{a}^{post}$ is unsolvable, then $a$ is a dead-end operator in $Π$ .
Proof.
We prove the contrapositive: suppose that $a$ is not a dead-end operator in $Π$ , and let us show that $Π_{a}^{post}$ is solvable. Since $a$ is not a dead-end operator in $Π$ , there exists a solution plan for $Π$ where $a$ occurs at least once. Let $π = a_{1} \dots a_{k}$ be such a plan, and let $i \in \{1, \dots, k\}$ be the greatest index such that $a_{i} = a$ .

Then we show in what follows that $π^{post} = a_{i + 1 ∣ F_{a}} \dots a_{k ∣ F_{a}}$ is a solution plan for $Π_{a}^{post}$ , where $π^{post}$ is the empty plan when $i = k$ .

Let $s_{0} s_{1} \dots s_{k}$ be the sequence of states associated with $π$ , and let us show by induction that the sequence of states associated with the plan $π^{post}$ in $Π_{a}^{post}$ is $s_{i ∣ F_{a}} \dots s_{k ∣ F_{a}}$ . First, we prove that $I_{a}^{post} = s_{i} \cap F_{a}$ . As $s_{i}$ results from the application of $a_{i} = a$ , we have that $I_{a}^{post} \subseteq s_{i}$ , and thus $I_{a}^{post} \subseteq s_{i} \cap F_{a}$ . Conversely, let us show that $s_{i} \cap F_{a} \subseteq I_{a}^{post}$ . Suppose that $f \in s_{i} \cap F_{a}$ , and let us proceed by case analysis on $f \in F_{a} = pre (a) \cup {eff}^{+} (a) \cup {eff}^{-} (a)$ . If $f \in {eff}^{+} (a)$ , then $f \in I_{a}^{post}$ . Otherwise, we necessarily have $f \notin {eff}^{-} (a)$ , because a fluent that is deleted and not added by $a$ cannot belong to $s_{i}$ , which results from the application of $a$ . It follows that $f \in pre (a) ∖ {eff}^{-} (a)$ , and thus $f \in I_{a}^{post}$ . Hence, $I_{a}^{post} = s_{i} \cap F_{a}$ .

We have shown that the initial state of the sequence of states associated with $π^{post}$ is $s_{i ∣ F_{a}}$ . To show the property for the remaining states, it suffices to check that, for any two states $s, s^{'}$ of $Π$ , and for any operator $a^{'} \in O$ , if we have $s^{'} = s [a^{'}]$ , then $s_{∣ F_{a}}^{'} = s_{∣ F_{a}} [a_{∣ F_{a}}^{'}]$ . To see this, observe that $s_{∣ F_{a}}^{'}$ can be obtained in two different ways: 1) by applying $a^{'}$ to $s$ , and then projecting the resulting state $s^{'}$ over $F_{a}$ , or 2) by projecting $a^{'}$ and $s$ over $F_{a}$ , resulting in $a_{∣ F_{a}}^{'}$ and $s_{∣ F_{a}}$ , respectively, and then applying $a_{∣ F_{a}}^{'}$ to $s_{∣ F_{a}}$ .

Formally, this can be shown by successively applying the definition of the projection of $a^{'}$ over $F_{a}$ , denoted $a_{∣ F_{a}}^{'}$ , and the definition of $s_{∣ F_{a}}$ . We end up with the following:
$\begin{aligned} s_{∣ F_{a}} [a_{∣ F_{a}}^{'}] & = (s_{∣ F_{a}} ∖ {eff}^{-} (a_{∣ F_{a}}^{'})) \cup {eff}^{+} (a_{∣ F_{a}}^{'}) \\ & = (s_{∣ F_{a}} ∖ ({eff}^{-} (a^{'}) \cap F_{a})) \cup ({eff}^{+} (a^{'}) \cap F_{a}) \\ & = ((s \cap F_{a}) ∖ ({eff}^{-} (a^{'}) \cap F_{a})) \cup ({eff}^{+} (a^{'}) \cap F_{a}) \\ & = ((s ∖ {eff}^{-} (a^{'})) \cup {eff}^{+} (a^{'})) \cap F_{a} \\ & = s_{∣ F_{a}}^{'} . \end{aligned}$
As a consequence, since $s_{k}$ is a goal state of $Π$ , we immediately have that $s_{k ∣ F_{a}} ⊨ G_{∣ F_{a}}$ , and $π^{post}$ is a solution plan for $Π_{a}^{post}$ , which is what we sought to prove.
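The key identity of the proof, namely that projecting after applying an operator gives the same state as applying the projected operator to the projected state, can be checked on a small example. The sketch below assumes states are sets of true fluents and operators have positive preconditions only; the `Operator` type, `project`, and `apply` are hypothetical helpers, not part of the paper's implementation.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical representation: frozensets of fluent names.
Operator = namedtuple("Operator", ["pre", "add", "delete"])

def project(op, E):
    """Projection of an operator over the fluents E."""
    return Operator(op.pre & E, op.add & E, op.delete & E)

def apply(state, op):
    """STRIPS successor: delete effects first, then add effects."""
    return (state - op.delete) | op.add

a = Operator(frozenset({"p"}), frozenset({"q"}), frozenset({"p"}))
a_prime = Operator(frozenset({"q", "t"}), frozenset({"p", "v"}),
                   frozenset({"q", "t"}))
F_a = a.pre | a.add | a.delete  # the fluents p and q

fluents = ["p", "q", "t", "v"]
all_states = [frozenset(c) for r in range(len(fluents) + 1)
              for c in combinations(fluents, r)]

# Compare both routes on every state in which a_prime is applicable.
commutes = all(
    apply(s, a_prime) & F_a == apply(s & F_a, project(a_prime, F_a))
    for s in all_states if a_prime.pre <= s
)
```

This only illustrates the identity on one pair of operators; the general statement is the chain of equalities given in the proof.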

Experimental trials showed that no operator could be proved to be a dead-end operator. This is why, in Section 5, no results are reported about the above test. However, we still included this test to show that some very small problems derived from the input instance can be of interest.
4.4. Extended Goals

In this section, we propose various methods to find more precise goal states. More specifically, we try to add new literals to the goal, be they positive or negative. Suppose, for instance, that some fluent $f \in G^{+}$ can only be true if some other fluent $f^{'}$ is true. Then one can immediately add $f^{'}$ to $G^{+}$ . These more precise goals make the program richer and hence more likely to detect unsolvable instances. $G^{-}$ can also be extended in a similar way.

This can be done in our framework through the following simple observation. Let $f \in F$ , and $Π_{+ f}^{G} = ⟨ F, I, O, G \cup \{f\} ⟩$ . If $Π_{+ f}^{G}$ is unsolvable, then $f$ can be added to the negative goals of $Π$ . Indeed, no goal state $s_{G}$ such that $s_{G} ⊨ f$ is reachable: necessarily, in any reachable goal state $s_{G}$ , we have $s_{G} ⊨ \neg f$ . Conversely, let $Π_{- f}^{G} = ⟨ F, I, O, G \cup \{\neg f\} ⟩$ . If $Π_{- f}^{G}$ is unsolvable, then $f$ can be safely added to the positive goals of $Π$ without changing the set of solutions.

There is, of course, a symmetrically equivalent test FPosGoal that we could have defined. However, in order to detect positive goals, we would need to test $Π_{- f}^{G}$ , which has negative goals. These negative goals are not used in the formulation of Linear Program 2: as a consequence, our criterion would be powerless at detecting that $Π_{- f}^{G}$ is unsolvable.
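The negative-goal test can be sketched as a loop over candidate fluents, each one calling an unsolvability oracle on $Π_{+ f}^{G}$ . In the sketch below, the oracle is an exhaustive breadth-first search, used purely as a stand-in so that the example is self-contained; the paper's procedure relies on its incomplete LP-based criterion instead. The `Operator` type and both function names are hypothetical, and only positive preconditions and goals are handled.

```python
from collections import deque, namedtuple

Operator = namedtuple("Operator", ["pre", "add", "delete"])

def is_unsolvable(init, operators, goal):
    """Stand-in oracle: exhaustive search over reachable states."""
    start = frozenset(init)
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if goal <= s:
            return False
        for op in operators:
            if op.pre <= s:
                t = frozenset((s - op.delete) | op.add)
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
    return True

def negative_goal_candidates(fluents, init, operators, goal, oracle):
    """Fluents f for which <F, I, O, G + {f}> is reported unsolvable."""
    return {f for f in fluents - goal
            if oracle(init, operators, goal | {f})}

ops = [
    Operator({"p"}, {"q"}, {"p"}),
    Operator({"q"}, {"p"}, {"q"}),
    Operator({"r"}, {"p", "q"}, set()),
]
# Small solvable instance: initial state {p}, goal {p}.
new_negative_goals = negative_goal_candidates(
    {"p", "q", "r"}, {"p"}, ops, {"p"}, is_unsolvable)
```

Here both `q` and `r` can be added as negative goals, since no reachable state contains `p` together with either of them. With an incomplete oracle, fewer fluents would be found, but every fluent reported would still be sound.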

4.5. Fluent Mutexes and Unreachable Fluents

A fluent mutex is a set of fluents $M \subseteq F$ for which $s ⊭ M$ holds in all states $s$ reachable from the initial state $I$ . Some tests presented previously can be seen as testing whether some subset $M \subseteq F$ is a fluent mutex. Let us consider for instance the $PreImp$ test presented in Section 4.3: for some operator $a \in O$ , checking that $Π_{a}^{pre} = ⟨ F, I, O, pre (a) ⟩$ is unsolvable (and thus that operator $a$ can be removed from the instance) is equivalent to checking that $pre (a)$ is a mutex. However, our criterion allows us to check if any set of fluents $F^{'} \subseteq F$ is a mutex, by testing the unsolvability of $Π_{F^{'}}^{mut} = ⟨ F, I, O, F^{'} ⟩$ .

The criterion does not detect all fluent mutexes, and each candidate set of fluents has to be tested individually. Thus, since there exists an exponential number of candidates to test, it is not possible to detect in reasonable time the ones that are within the reach of our method. Finding which sets are interesting to test is a problem in itself; even more so since one has to know how to make use of the newly found information that some $M \subseteq F$ is a mutex.

In the general case, we could not find a way to reinvest into the linear program the knowledge that a set of fluents is a mutex. Indeed, Linear Program 2 reasons over the number of times operators (have to) occur in a plan. As a consequence, we do not have any obvious way to reason about properties concerning states, which is precisely what fluent mutexes are. For that reason, we do not include the computation of fluent mutexes in our routine, even though our linear program can detect a range of them.

However, some fluents are always false, in the sense that no plan will ever establish them. We call these fluents unreachable fluents, and they can be detected with the same argument as above:

Even though these fluents appear very rarely, as will be shown in the experimental trials, testing whether each fluent is unreachable only requires a linear number of tests: thus, the computational burden is significantly lower than for other fluent “mutexes.” When an unreachable fluent $f$ is detected, one can project the whole instance on fluents $F ∖ \{f\}$ , and remove the operators that have $f$ in their positive preconditions. Note, however, that any such operator $a$ would also be detected by test PreImp( $a$ ), which is more likely to succeed.
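The simplification that follows the detection of an unreachable fluent can be sketched as below. The function assumes that $f$ has already been shown unreachable by some test and that $f$ is not a goal fluent (otherwise the instance is unsolvable outright); the `Operator` type and the function name are hypothetical.

```python
from collections import namedtuple

Operator = namedtuple("Operator", ["name", "pre", "add", "delete"])

def drop_unreachable_fluent(f, fluents, init, operators, goal):
    """Project the instance on F minus {f} and drop operators needing f."""
    if f in goal:
        raise ValueError("unreachable goal fluent: instance is unsolvable")
    kept = [
        Operator(op.name, op.pre, op.add - {f}, op.delete - {f})
        for op in operators
        if f not in op.pre  # operators requiring f can never be applied
    ]
    return fluents - {f}, init - {f}, kept, goal

ops = [
    Operator("a1", {"p"}, {"q"}, {"p"}),
    Operator("a2", {"q"}, {"p"}, {"q"}),
    Operator("a3", {"r"}, {"p", "q"}, set()),
]
new_fluents, new_init, new_ops, new_goal = drop_unreachable_fluent(
    "r", {"p", "q", "r"}, {"p"}, ops, {"p", "q"})
kept_names = [op.name for op in new_ops]
```

On this instance, dropping `r` removes `a3`, the only operator that requires it.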

4.6. Example Instance Solved Through Model Refinement

In the rest of this section, we show that, when an instance is unsolvable but could not be detected as such by our criterion, then modifying the model using the above operations can still be enough to show the instance unsolvable.

Consider the instance $Π = ⟨ F, I, O, G ⟩$ , with:

$F = \{p, q, r\}$ ,

$I = \{p\}$ ,

$G = \{p, q\}$ ,

$O = \{a_{1}, a_{2}, a_{3}\}$ such that

– $a_{1} = ⟨ \{p\}, \{q, \neg p\} ⟩$ ,
– $a_{2} = ⟨ \{q\}, \{p, \neg q\} ⟩$ ,
– $a_{3} = ⟨ \{r\}, \{p, q\} ⟩$ .

$Π$ is clearly unsolvable, since $p$ and $q$ cannot be both true after the application of $a_{1}$ or $a_{2}$ , and $a_{3}$ can never be applied since its precondition cannot be established and is not true in $I$ .

The associated linear program $L_{Π}^{op}$ is the following, where $y_{i}$ is the variable associated with $a_{i}$ for any $i \in \{1, 2, 3\}$ , and the constraints of the form $y_{i} \geq 0$ are not explicitly written for the sake of conciseness:
$\begin{aligned} - y_{1} + y_{2} + y_{3} & \geq 0, \end{aligned}$
(11)

$\begin{aligned} y_{1} - y_{2} + y_{3} & \geq 1, \end{aligned}$
(12)

$\begin{aligned} 0 & \geq 0, \end{aligned}$
(13)
where inequalities (11), (12), and (13), respectively, correspond to fluents $p$ , $q$ , and $r$ .

The above linear program is solvable (consider for instance the solution where $y_{1} = y_{2} = 0$ , and $y_{3} = 1$ ), and as a consequence, our criterion fails to detect the instance as unsolvable.

However, let us apply the test $PreImp (a_{3})$ . This amounts to testing the unsolvability of $Π_{a_{3}}^{pre} = ⟨ F, I, O, \{r\} ⟩$ , and the associated linear program is almost the same as the above, except that the right-hand sides of inequalities are adapted to the new goal:
$\begin{aligned} - y_{1} + y_{2} + y_{3} & \geq 0, \\ y_{1} - y_{2} + y_{3} & \geq 0, \\ 0 & \geq 1. \end{aligned}$
Since the last inequality is an immediate contradiction, $Π_{a_{3}}^{pre}$ is unsolvable, and the test $PreImp (a_{3})$ succeeds. As a consequence, $a_{3}$ can be removed from $Π$ , and we denote by $Π^{'}$ this new but equivalent instance. The linear program $L_{Π^{'}}^{op}$ associated with it is the following:
$\begin{aligned} - y_{1} + y_{2} & \geq 0, \\ y_{1} - y_{2} & \geq 1, \\ 0 & \geq 0, \end{aligned}$
which does not admit any solution (summing the first two inequalities leads to the contradiction $0 \geq 1$ ). This shows in turn that $Π^{'}$ is unsolvable, and thus that the original instance $Π$ does not admit any solution plan either.
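The feasibility status of the three small systems of this example can be double-checked mechanically. The sketch below uses a plain Fourier-Motzkin elimination over rationals rather than an LP solver, so that it runs with the standard library only; this is a swapped-in technique for illustration, not the solver setup used in the paper. The coefficient rows are copied from the inequalities above, and the constraints $y_{i} \geq 0$ are added by the function.

```python
from fractions import Fraction

def feasible(rows, n_vars):
    """Fourier-Motzkin elimination over the rationals.

    rows: list of (coeffs, rhs), each meaning sum(coeffs[j] * y_j) >= rhs.
    The constraints y_j >= 0 are added automatically.
    """
    system = [([Fraction(c) for c in cs], Fraction(b)) for cs, b in rows]
    for j in range(n_vars):
        unit = [Fraction(int(k == j)) for k in range(n_vars)]
        system.append((unit, Fraction(0)))
    for j in range(n_vars):
        pos = [r for r in system if r[0][j] > 0]
        neg = [r for r in system if r[0][j] < 0]
        system = [r for r in system if r[0][j] == 0]
        for cp, bp in pos:
            for cn, bn in neg:
                wp, wn = -cn[j], cp[j]  # positive weights cancelling y_j
                combo = [wp * x + wn * z for x, z in zip(cp, cn)]
                system.append((combo, wp * bp + wn * bn))
    # Only constraints of the form 0 >= rhs remain.
    return all(rhs <= 0 for _, rhs in system)

original = [([-1, 1, 1], 0), ([1, -1, 1], 1), ([0, 0, 0], 0)]
pre_a3 = [([-1, 1, 1], 0), ([1, -1, 1], 0), ([0, 0, 0], 1)]
reduced = [([-1, 1], 0), ([1, -1], 1), ([0, 0], 0)]

results = (feasible(original, 3), feasible(pre_a3, 3), feasible(reduced, 2))
```

The first system is feasible while the other two are not, matching the discussion above. Fourier-Motzkin elimination can generate many inequalities on larger systems, so this approach is only suitable for toy instances.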

Note that the above sequence of operations could have been done equivalently with Linear Program 1 instead.
5. Experimental Evaluation

Our implementation was done in Python 3.10, building on the Fast Downward parser (Helmert, 2006). We did not need the entire translator component, but only the functions concerned with the conversion of the instance into STRIPS. For linear programs, we resorted to the GLOP solver (Perron & Furnon, 2019), while integer programs were solved with Gurobi (Gurobi Optimization, LLC, 2023). We also used Google OR-Tools (Perron & Furnon, 2019) to interface between our program and the solvers. We ran our experiments on a machine running Rocky Linux 8.5, powered by an Intel Xeon E5-2667 v3 processor, with a 30-minute cutoff and using at most 16 GB of memory per instance. Our code is available online at https://github.com/arnaudlequen/MPRefinement.

In addition to the evaluation of the linear program, we also implemented a procedure based on the observations of Section 4. The main loop of this procedure consists of executing sequentially a predetermined list of operations and tests, until the instance is detected as unsolvable or the list is depleted. We elaborate further on this in Section 5.2.

We wished to evaluate our program on two different aspects: first, its ability to detect unsolvable instances, and second, its ability to find additional information when it could not conclude.

Our set of benchmarks consists of the unsolvable instances from the unplannability track of the International Planning Competition 2016 (Unsat IPC), for which we report our results on unsolvable instances. The Unsat IPC also includes solvable instances, which we tested our program on, as a sanity check, with success.

5.1. LP-Based Criteria

In this section, we show that our LP-based criterion suffices to detect a wide range of unsolvable planning instances. Our results are reported in detail in Table 1.

Table 1.
Summary of the Results Returned by the LP-Based Criterion, run on the Unsat Planning Competition Benchmark Set.

Set	Unsat	Total
bag-transport	19	29
bottleneck	25	25
cave-diving	1	25
chessboard-pebbling	23	23
over-tpp	2	30
pegsol-row5	14	15
Tetris	20	20
Remaining	0	180
Total	104	347

Note. Each line corresponds to a domain, which is a set of instances modeling similar problems. The first column reports instances on which our criterion succeeds, while the second column reports the total number of instances in the benchmark set. Domains for which no instance could be solved are summed up in the last line labeled Remaining.

In essence, about 30% of all instances of the Unsat IPC are almost immediately found to be unsolvable by the sole use of the criterion. These results, however, vary greatly from one domain to the other, in a very dichotomous fashion: either the domain is (almost) entirely solved through the criterion, or few to no instances are deemed unsolvable. In the case of domain bag-transport, which seems to be in between, all instances the criterion has been tried on are actually found to be unsolvable: however, as the last 10 instances are too big to be parsed, we could not run the test on them. We can also note that both LP- and IP-based criteria yield the same results and that solving the IP-based program did not allow us to detect more unsolvable instances than through the LP formulation.
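As a quick arithmetic check of the proportion quoted above, the figures of Table 1 can be summed directly. The dictionary below simply transcribes the table; its name is arbitrary.

```python
# (instances detected unsolvable, instances in the domain), from Table 1.
table1 = {
    "bag-transport": (19, 29),
    "bottleneck": (25, 25),
    "cave-diving": (1, 25),
    "chessboard-pebbling": (23, 23),
    "over-tpp": (2, 30),
    "pegsol-row5": (14, 15),
    "Tetris": (20, 20),
    "Remaining": (0, 180),
}

detected = sum(unsat for unsat, _ in table1.values())
total = sum(size for _, size in table1.values())
share = detected / total  # fraction of the benchmark set detected
```

The sums agree with the last row of the table, and the resulting share is just under 30%.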

Both programs are, however, very lightweight: for most domains, building and solving the program required less than a few seconds. In addition, for the vast majority of instances, the criteria required little more than a few tenths of a second to complete. This further justifies our use of the program in the iterative procedure that we present in the next section.

Our program fails entirely on some domains, where no instance can be solved. While this is often because our criterion simply fails to detect the instance’s unsolvability, this can also be due to the size of the model. This is the case of the bag-gripper domain, where the first instance has 5,681 fluents and 60,602 operators, which prevents us from building the associated linear program. In our assessments of the performance of the criteria, the limitation always came from memory. In this kind of situation, criteria that work on the FDR of the task might prove more efficient, since for the same problem, 317 facts resulting from 10 variables suffice to describe the first instance of bag-gripper (although 60,602 operators are still needed). This may partly explain the success of other FDR-based criteria (Christen et al., 2022; Seipp et al., 2016) in some instances in which our criterion (and thus our procedure) entirely fails because of memory issues.

5.2. Iterative Refinement of the Model

In the case where the criterion could not immediately detect that an instance $Π$ is unsolvable, one can resort to the several operations previously introduced. In addition, the order in which operations are executed is also critical. Consider for instance an operator $a$ that is both recognized as a landmark and as a removable operator by our operations. In the case where the operator is first removed, then it cannot be detected as a landmark, and we thus missed an opportunity to return that the instance is unsolvable. In the case where $a$ is first detected as a landmark, then our routine terminates successfully by detecting that the instance is unsolvable.
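The sensitivity to the order of operations can be illustrated with a toy version of the main loop. In the sketch below, the two steps are hypothetical stand-ins for a landmark test and a removal test on a single operator `a`; in particular, the way the contradiction is surfaced (the removal step noticing that `a` is a known landmark) is an assumption made for the example, not a description of the actual routine.

```python
UNSOLVABLE = "unsolvable"

def run_sequence(model, steps):
    """Apply steps in order; stop as soon as one reports unsolvability."""
    for step in steps:
        if step(model) == UNSOLVABLE:
            return UNSOLVABLE
    return model

def landmark_test(model):
    # Toy stand-in: "a" is recognized as a landmark only if still present.
    if "a" in model["operators"]:
        model["landmarks"].add("a")

def removal_test(model):
    # Toy stand-in: "a" can never be applied. If it is also known to be
    # a landmark, no plan can exist; otherwise it is simply removed.
    if "a" in model["landmarks"]:
        return UNSOLVABLE
    model["operators"].discard("a")

def fresh_model():
    return {"operators": {"a", "b"}, "landmarks": set()}

landmark_first = run_sequence(fresh_model(), [landmark_test, removal_test])
removal_first = run_sequence(fresh_model(), [removal_test, landmark_test])
```

With the landmark test first, the loop stops with a proof of unsolvability; with the removal test first, the operator disappears before it can be recognized as a landmark and the sequence ends inconclusively.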

5.2.1. Sequences of Operations

We present below the different lists of operations that we chose. Note that all sequences start and end with a simple test of solvability with the criterion: initially with only the information contained in the STRIPS model, and then with the information that was incrementally gathered after each series of operations. The exact sequences of operations can be found in Appendix A.

Linear

This sequence comprises all tests and operations that are linear in the size of the instance, that is, that only require one argument. We tried to put first the tests that were the most likely to succeed, so that the tests and operations that come after have more information to work with. We successively apply the following tests on all relevant elements, in that order: LMDet, PreImp, $OpCount$ , FReach, and FNegGoal. By that, we mean that we run $LMDet (a)$ for all $a \in O$ , then $PreImp (a)$ for all $a \in O$ , etc.

Quadratic

In addition to the tests found in the linear sequence, the quadratic sequence has a sequence of operator mutexes tests OpMut, which comes right after the linear sequence’s operations. There exists a quadratic (in the size of the instance) number of such tests, as one has to consider every pair of operators successively. Even though the linear sequence is itself costly in terms of computation, it still builds fewer linear programs than the operator mutexes tests alone.

OperatorPreImpossible

As will be reported later, the PreImp tests that check an operator’s reachability are our most successful ones. We wished to gauge the time it requires and its possible impact on the model by itself.

OperatorCount

This sequence consists of finding lower bounds on the number of times each operator has to appear in any plan, and then upper bounds on the number of times each operator can appear in any plan. It aims to show that a linear number of integer programs to optimize can be done in a reasonable time, while also providing interesting information.

5.2.2. Results

We present our results below. As we prune out instances that can be immediately identified as unsolvable, domains that are immediately found unsolvable by the criterion are not reported.

Linear sequence

Table 2 shows statistics for the linear sequence. The main goal of our routine was to extract additional information from the model, either to directly prove the instance unsolvable, or so that another procedure that comes after can show the instance unsolvable more easily. We could indeed notice that our algorithm was sometimes enough to detect unsolvable instances that are otherwise not detected as such by the criterion. There are few examples of such instances (about 9.5% of the entire benchmark set), and they are grouped in only two domains (cave-diving and pegsol). Nonetheless, they suffice to show that a well-chosen sequence of operations can sometimes replace a search and that our work paves the way for further research in that regard.

Table 2.
Statistics for the Linear Sequence.

		Operators			Others
Set	Diff.	PreImp	OpCount	Rem.	LMDet	FReach	FNegGoal
cave-diving (13/24)	+11	10.0%	14.1%	10.4%	1.1%	4.8%	3.0%
diagnosis (19/20)	0	0%	57.0%	11.6%	18.3%	4.6%	17.6%
doc-transfer (5/20)	0	13.0%	26.4%	27.9%	1.7%	0.0%	39.8%
over-nomystery (2/24)	0	33.4%	25.7%	34.8%	2.1%	0%	7.4%
over-rovers (8/20)	0	27.9%	17.2%	29.3%	0%	<0.1%	0%
over-tpp (6/29)	0	7.4%	54.8%	24.7%	0.3%	0.3%	0%
pegsol (24/30)	+24	13.6%	N/A	13.6%	0.8%	N/A	N/A
sliding-tiles (20/20)	0	0%	0%	0%	0%	0%	69.2%

Note. The first column with the name of the domain also reports the total number of instances for which the procedure terminated entirely within the time and memory limits, out of the total number of instances that constitute the benchmark set that were not solved by the criterion alone. The “Difference” (Diff.) column shows the number of instances that could be found unsolvable during the execution of the procedure, compared to the single use of the criterion reported in Table 1. For example, out of the 25 cave-diving instances in the benchmark set, 24 were not solved by the criterion alone: out of these 24, the Linear procedure terminated on 13 instances, ran out of time on the others, and found 11 instances to be unsolvable. The next set of columns shows statistics for operations related to the deletion of operators. The first pair of columns shows the percentage of success of each test ( $OpCount$ comprising both ${OpCount}^{+}$ and ${OpCount}^{-}$ ), that is, the percentage of such tests that brought new information on the problem. The last column of the set shows the average total percentage of operators removed at the end of the sequence of tests. The last three columns show the percentage of success of three other tests. N/A values indicate that no such test was performed as the program terminated before.

In the cases where our procedure could not conclude, it still manages to gather valuable information about the planning instance. For example, on some domains, almost a third of all operators are pruned on average, among instances on which our procedure terminates.

The termination of our procedure is, however, the main issue of this sequence of operations, which is too computationally costly, and often stops early because of the time and memory limits imposed. In some domains, very few instances could be run through the entire sequence of operations: such domains include over-nomystery, where this sequence terminated on only two instances out of the 24 that could be parsed.

Quadratic sequence

Table 3 shows statistics for the quadratic sequence. As the first part of the quadratic sequence consists of the exact same sequence of operations as the linear sequence, there are at least as many instances solved by the quadratic sequence as by the linear sequence. This is why domains cave-diving and pegsol only exhibit instances that are entirely solved: our procedure detects these instances as unsolvable before reaching the $OpMut$ tests (and then running out of time, in the case of cave-diving).

Table 3.

Statistics for the Quadratic Sequence.

		Operator mutexes
Set	Diff.	Av. mut/op	Success %	Av. $\| O \|$
cave-diving (11/24)	+11	N/A	N/A	1113
diagnosis (15/20)	0	4.8	3.6%	136
over-rovers (3/20)	0	8.2	6.2%	313
over-tpp (1/29)	0	0.8	<0.1%	1675
pegsol (24/30)	+24	N/A	N/A	77
sliding-tiles (10/20)	0	0	0%	193

Note. The number in parentheses in each row indicates the instances the sequence terminated on, out of the total number of instances that constitute the benchmark set that were not solved by the criterion alone. As in Table 2, the Diff. column reports the instances found unsolvable by our routine, but on which the criterion failed to conclude by itself. The second column reports the average number of mutexes an operator finds itself in, and the third column reports the average percentage of success of $OpMut$ tests, that is, the average percentage of tests that led to the finding of a new mutex. This latter column could also be interpreted as the average proportion of operators that are found to be mutex with another fixed operator. The last column reports the average number of operators per instance. Note that our reports concern far fewer instances than for the linear sequence, as the computational cost is significantly greater for the sequence reported here. Domains document-transfer and over-nomystery are not reported as no instance finished before the cutoff.

In general, the use of a quadratic number of operations quickly becomes problematic: compared to the linear sequence, the quadratic sequence can be run in its entirety on significantly fewer instances. In addition, very few tests are actually successful, as even with the most receptive instances, only a few percent of all tests succeed. However, the main issue with our sequence of operations is that it considers all pairs of operators, and checks if they can be marked as mutexes. The quadratic number of such tests is then responsible for the computational cost of the sequence, but the test in itself remains an integer program that can be solved almost instantly.
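To give an order of magnitude for the number of pairwise tests, the sketch below computes the number of unordered operator pairs for the average operator counts listed in the last column of Table 3. Because these are averages over instances, the figures are indicative only and are not the exact number of tests run in the experiments; the dictionary name is arbitrary.

```python
from math import comb

# Average number of operators per instance, transcribed from Table 3.
avg_operators = {
    "cave-diving": 1113,
    "diagnosis": 136,
    "over-rovers": 313,
    "over-tpp": 1675,
    "pegsol": 77,
    "sliding-tiles": 193,
}

# Number of unordered pairs {a1, a2} for an instance of that size.
pairs = {domain: comb(n, 2) for domain, n in avg_operators.items()}
```

An instance with as many operators as the over-tpp average already yields more than a million candidate pairs, each requiring its own program to be solved.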

Yet, even if we were to find operator mutexes, they would be hard to reinvest in the linear program. Indeed, to be included in the linear program, an operator mutex $\{a_{1}, a_{2}\}$ requires that $n^{+} (a_{1})$ and $n^{+} (a_{2})$ are finite (see equation (9)), which is common but not always the case, as reported in Table 5. As very few pairs of operators are detected as mutexes, the next logical step to make the most of operator mutexes is to more efficiently choose which pairs to test, to avoid performing unpromising tests.

Individual Tests

Tables 4 and 5 summarize the statistics for the other sequences, which mostly consist of a series of one or two of the same operations. However, we do not report comprehensive results for all remaining sequences: indeed, in the case of the OperatorDeadLock sequence, no test answered positively, and no dead-end operator could be found.

Table 4.

Performance of the OperatorPreImpossible Sequence.

		OperatorPreImpossible
Set	Diff.	Removed	Time (s)
bag-barman (4/20)	0	77.2%	1177.8
cave-diving (16/24)	+4	6.5%	147.8
diagnosis (20/20)	0	0%	6.4
document-transfer (13/20)	0	0%	475.7
over-nomystery (10/24)	0	18.8%	587.8
over-rovers (11/20)	0	21.9%	370.2
over-tpp (12/29)	0	<0.1%	268.1
pegsol (24/30)	+6	16.4%	0.6
sliding-tiles (20/20)	0	0%	5.6

Table 5.

Performance of the OperatorCount Sequence.

		OperatorCount
Set	Diff.	${OpCount}^{-}$	${OpCount}^{+}$	Removed	Time (s)
bag-barman (0/20)	–	–	–	–	–
cave-diving (16/24)	0	0.9%	28.0%	7.0%	329.3
diagnosis (20/20)	0	16.9%	96.3%	19.5%	91.6
document-transfer (8/20)	0	1.7%	50.7%	29.8%	643.2
over-nomystery (3/24)	0	1.4%	87.2%	3.9%	746.2
over-rovers (9/20)	0	0%	62.4%	5.2%	455.1
over-tpp (7/29)	0	0.3%	65.4%	20.2%	428.8
pegsol (24/30)	+22	0%	8.2%	3.0%	0.51
sliding-tiles (20/20)	0	0%	0%	0%	19.4

Note. The number in parentheses in each row indicates the instances the sequence terminated on, out of the total number of instances that constitute the benchmark set that were not solved by the criterion alone. The Diff. column shows the number of instances solved thanks to the iterative refinement. ${OpCount}^{-}$ and ${OpCount}^{+}$ columns report the average percentage of success of their respective operations, that is, the average percentage of tests that discovered some new, nontrivial information. The “Removed” column shows the average percentage of operators that could be removed thanks to the ${OpCount}^{+}$ operations. The “Time” column shows, in seconds, the average time per instance.

Nonetheless, the results for the other sequences of operations are encouraging. Be it for the sequence centered on PreImp or the one focused on OpCount operations, a significant proportion of operators could be removed. In some cases, it suffices to show that the instance was not solvable, as is the case for the cave-diving or pegsol domains. However, the time required for the computation is significant, which is discussed in the next section.

An interesting property of our $OpCount$ tests is that they do not always return $0$ , $1$ , or $+ \infty$ : they sometimes find that some operator $a$ can be applied at most $c$ times, where $c$ is an integer such that $c \geq 2$ .

Note that these sequences of tests are not as powerful as the linear sequence when it comes to detecting unsolvable instances. This seems to indicate that combining different kinds of operations is crucial to draw conclusions, and that studying their interactions is key to designing more powerful sequences.

6. Related Work

The surge in interest for unsolvability detection, in the last decade, has been embodied by the first Unsolvability Planning Competition in 2016. The competition saw various adaptations of techniques that have shown themselves efficient for finding plans, in a state space search. Such methods include heuristics specifically tailored for unsolvability detection, such as a Merge & Shrink-based heuristic (Helmert et al., 2014) (which precedes the competition). Such heuristics rely on abstractions that do not preserve distance, but merely solvability.

Another heuristic that was successfully adapted was the operator-counting heuristic (Bonet, 2013; Pommerening et al., 2014; Van Den Briel et al., 2007). The heuristic is based on a relaxation of the orderings of the operators. Previous works showed that it admits an LP formulation, similar to the Linear Program 2 or the Integer Program 1 that we propose. However, while we only optimize the variable associated with the count of a single operator, the objective function that they minimize is the total cost of the plan. The adaptation of the linear program to the case of unsolvability detection was carried out by the Fast Downward-based unsolvability planner Aidos (Seipp et al., 2016). It consists of checking the existence of a solution, in the same way as for Linear Program 2. However, Aidos uses this component in a state space search, in order to detect dead ends.

More generally, be it in unsolvable or in solvable planning tasks, the early detection of states that cannot lead to a goal can help prune out whole branches of the search space. In the case of dead-end detection (Cserna et al., 2018), various works have focused on the elaboration of formulas that can be efficiently evaluated, and whose only models are states that cannot lead to a goal state. The notion of dead-end formula has been generalized with the notion of traps (Lipovetzky et al., 2016): a formula $ϕ$ such that, once it is verified in a state $s$ , all states reachable from $s$ will satisfy it too. A formula $ϕ$ that is inconsistent with the goal then shows that the current branch is not worth exploring.

In the case where our algorithm does not manage to find that the task is unsolvable, it still manages to remove unnecessary elements from the planning model, to make the task easier for the next algorithm. Various other methods prune the model in a preprocessing step: in Alcázar and Torralba (2015), the authors show that invariants in the form of mutexes can be leveraged to remove operators that will never be part of a plan. In Fišer et al. (2019), it is shown how to combine symmetries of the planning task and operator mutexes to find operators that are redundant, in the sense that removing them preserves at least one solution plan.

Our algorithm also learns information that is not explicitly expressible in a STRIPS planning instance. In Steinmetz and Hoffmann (2017b), the authors draw inspiration from a well-known technique in SAT solving to learn clauses that recognize dead ends, through a conflict-driven approach during search. They also show how to learn traps online (Steinmetz & Hoffmann, 2017a). Learning is ubiquitous in generalized planning, a field concerned with the synthesis of generalized plans, which are procedures that solve multiple instances. For instance, previous work (Ståhlberg et al., 2021) proposed to learn heuristics in the form of logical formulas from a set of small example instances, so as to recognize unsolvable planning instances.

In Christen et al. (2022), another polynomial criterion is proposed to immediately detect a class of unsolvable instances without resorting to search. The authors synthesize a function that separates the initial state from all goal states, through a linear combination of features valued in a finite field, or at least in a ring. Akin to our criterion, their technique is incomplete, but it is very efficient at detecting parity arguments. In practice, they define a system of linear equations that they solve through Gaussian elimination, since the variables have domain $\mathbb{F}_{2} = \{0, 1\}$. More generally, in the same paper, the authors formalize the theory of what they call separating functions, which encompasses both their criterion and the one found in Aidos (Seipp et al., 2016). Their theory proposes linear programs that are very close, if not identical, to Linear Program 1. However, they require a planning instance in FDR and TNF, and their results do rely on these hypotheses. In this paper, we showed that, in the special case where variables range over $\mathbb{R}^{+}$ and the planning task is expressed in STRIPS, TNF is not a requirement for the criterion to hold.
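As an illustration of the algebraic step mentioned above, the sketch below solves a linear system over $\mathbb{F}_{2}$ by Gaussian elimination. It only shows the elimination itself: how the equations are derived from a planning task is specified in Christen et al. (2022) and is not reproduced here. The function name and the encoding of each equation as a Python integer bitmask are assumptions of this sketch.

```python
def solve_gf2(rows, rhs, n_vars):
    """Solve A x = b over F_2 by Gaussian elimination.

    rows[i] is an int whose bit j is the coefficient of x_j in equation i,
    and rhs[i] is 0 or 1. Returns one solution as a list of bits, or None
    if the system is inconsistent.
    """
    # Augment each row with its right-hand side, stored at bit position n_vars.
    aug = [r | (b << n_vars) for r, b in zip(rows, rhs)]
    pivots = []  # (column, row index) pairs
    rank = 0
    for col in range(n_vars):
        # Find a row at or below `rank` with a 1 in this column.
        pivot = next(
            (i for i in range(rank, len(aug)) if (aug[i] >> col) & 1), None
        )
        if pivot is None:
            continue
        aug[rank], aug[pivot] = aug[pivot], aug[rank]
        # Eliminate this column from every other row (XOR is addition in F_2).
        for i in range(len(aug)):
            if i != rank and (aug[i] >> col) & 1:
                aug[i] ^= aug[rank]
        pivots.append((col, rank))
        rank += 1
    # A remaining row that reads 0 = 1 means the system has no solution.
    if any(row == (1 << n_vars) for row in aug[rank:]):
        return None
    solution = [0] * n_vars  # free variables are set to 0
    for col, r in pivots:
        solution[col] = (aug[r] >> n_vars) & 1
    return solution
```

For example, the system $x_0 + x_1 = 1$, $x_1 + x_2 = 1$, $x_0 + x_2 = 1$ has no solution over $\mathbb{F}_{2}$, since the left-hand sides sum to 0 while the right-hand sides sum to 1. The call `solve_gf2([0b011, 0b110, 0b101], [1, 1, 1], 3)` accordingly returns `None`.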

7. Discussion

7.1. Additional Operations

We designed more operations than presented in this paper, but we only report those for which a nontrivial number of tests succeeded. Operations that never succeeded include operator ordering tests: given two operators $a_{1}$ and $a_{2}$, does $a_{1}$ appear before $a_{2}$ in all solution plans? This is the case, for instance, if $a_{1}$ has some initial fluent $f$ in its precondition, $a_{2}$ destroys $f$, and no other operator can establish $f$. To test with our framework that $a_{1}$ cannot occur after $a_{2}$, one can check that the preconditions of $a_{1}$ cannot be reached once $a_{2}$ has been applied. But to do so, the instance has to be projected onto the only fluents whose values are known after the application of $a_{2}$. As a result, the linear program corresponding to the newly formed instance is loosely constrained, and our experiments were inconclusive.
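The syntactic special case given above (an initial fluent required by $a_{1}$, destroyed by $a_{2}$, and established by no operator) can be checked directly on the model. The sketch below is only that sufficient condition, not the LP-based ordering test discussed in this section. The (pre, add, delete) encoding, the function name, and the toy operators are assumptions of this sketch, and it conservatively requires that no operator at all adds $f$.

```python
def cannot_follow(a1, a2, initial, operators):
    """Sufficient (not necessary) condition for: a1 never occurs after a2.

    operators maps a name to a (pre, add, delete) triple of frozensets.
    Returns True if some initial fluent f is required by a1, deleted by a2,
    and added by no operator, so f can never be restored once a2 is applied.
    """
    pre1, _, _ = operators[a1]
    _, _, del2 = operators[a2]
    for f in pre1 & del2 & initial:
        if not any(f in add for _, add, _ in operators.values()):
            return True
    return False

# Hypothetical toy instance: a match can be used or discarded, never regained.
INITIAL = frozenset({"match"})
OPERATORS = {
    "use_match": (frozenset({"match"}), frozenset({"fire"}), frozenset()),
    "discard_match": (frozenset({"match"}), frozenset(), frozenset({"match"})),
}
```

Here `cannot_follow("use_match", "discard_match", INITIAL, OPERATORS)` holds, so any plan containing both operators applies `use_match` first. The converse call returns `False`, since `use_match` deletes nothing.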

Other such tests include checking if some fluent $f$ can be added to the negative preconditions of an operator $a$, when the STRIPS formalism used allows negative preconditions. This can be done by testing if $\mathrm{pre}(a) \cup \{f\}$ is reachable from $I$. If not, then $\neg f$ can be added to the preconditions of $a$. Preliminary experiments showed that such tests sometimes succeed, but the proportion of tests that do is often negligible compared to the cost of testing each pair in $F \times O$, hence our choice of STRIPS with positive preconditions only throughout this paper.
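The loop over $F \times O$ described above can be sketched as follows. Note that this sketch swaps in an exhaustive breadth-first enumeration of reachable states as the reachability oracle, which is only viable on toy instances and is not the LP-based criterion of this paper. The function names, the (pre, add, delete) encoding, and the toy instance are likewise assumptions.

```python
from collections import deque

def reachable_states(initial, operators):
    """Breadth-first enumeration of all states reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for pre, add, delete in operators.values():
            if pre <= state:
                nxt = (state - delete) | add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

def negative_precondition_candidates(fluents, initial, operators):
    """Return the pairs (f, a) such that pre(a) ∪ {f} holds in no reachable state."""
    states = reachable_states(initial, operators)
    found = []
    for name, (pre, _, _) in sorted(operators.items()):
        for f in sorted(fluents - pre):
            target = pre | {f}
            if not any(target <= s for s in states):
                found.append((f, name))
    return found

# Hypothetical toy instance: a truck moves from A to B, then loads a package.
FLUENTS = {"at_a", "at_b", "loaded"}
INITIAL = frozenset({"at_a"})
OPERATORS = {
    "move_ab": (frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"})),
    "load": (frozenset({"at_b"}), frozenset({"loaded"}), frozenset()),
}
```

Each returned pair $(f, a)$ indicates that $\neg f$ could be added to the preconditions of $a$ without changing the set of applicable plans. The nested loop makes the cost of one test per pair in $F \times O$ explicit.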

7.2. Perspectives

Section 5 showed that, when our criterion failed to show an instance unsolvable, it was still possible to extract additional information from the model by leveraging the criterion. Even more so, in some cases, otherwise undetected unsolvable instances could be identified as such by this means. Yet, there is still a lot of room for improvement: a more in-depth study of our operations, as well as their interactions, could help us fine-tune the algorithm, and tailor more effective sequences of operations. Indeed, not all sequences of tests are equal in all aspects, and finding a sequence that avoids unnecessary computations is a way to optimize our algorithm and boost its detection power.

In our tests, we chose to simply run predetermined sequences of operations and tests. This means that, regardless of how tests succeed or fail, the algorithm goes linearly through the same sequence of operations, unless it can show early that the instance is unsolvable. However, the outcome of some tests may help in choosing which step to take next. For instance, after finding that an operator is a landmark, it might be interesting to check right away if it can be removed. This can be done through a stack of operations, onto which are pushed the operations made relevant by the result of a previous operation.
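A minimal sketch of such a stack-driven scheduler is given below, under the assumption that each operation is a hashable value and that two caller-supplied functions decide, respectively, whether an operation succeeds and which operations a success makes relevant. The names `run_with_stack`, `run_op` and `follow_ups`, as well as the operation labels used in the example, are hypothetical and do not describe our implementation.

```python
def run_with_stack(initial_ops, run_op, follow_ups):
    """Run operations from a stack, pushing those that a success makes relevant.

    run_op(op) -> bool performs one test. follow_ups(op) -> list returns the
    operations worth trying right after `op` succeeds. Each operation is run
    at most once. Returns the list of (operation, outcome) pairs in run order.
    """
    stack = list(reversed(initial_ops))  # so that initial_ops[0] is run first
    done = set()
    trace = []
    while stack:
        op = stack.pop()
        if op in done:
            continue
        done.add(op)
        success = run_op(op)
        trace.append((op, success))
        if success:
            # Follow-ups go on top of the stack: they run before pending operations.
            stack.extend(reversed(follow_ups(op)))
    return trace

# Hypothetical example: a successful landmark test triggers a removal test.
LANDMARKS = {"a1"}

def example_run_op(op):
    kind, operator = op
    return kind == "LMDet" and operator in LANDMARKS

def example_follow_ups(op):
    kind, operator = op
    return [("Rem", operator)] if kind == "LMDet" else []
```

With the initial operations `[("LMDet", "a1"), ("LMDet", "a2")]`, the removal test on `a1` is run immediately after its landmark test succeeds, before the landmark test on `a2`.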

One of the main weaknesses of our iterative refinement algorithm is its computational cost. Even the most lightweight sequences, such as the OperatorPreImpossible sequence, take significant time to complete. Our program builds each linear program from scratch each time a test is performed. However, very few constraints differ from one linear program to the other; thus, one could modify only these constraints from one test to the next, in order to save significant time. In addition to that, the operations are mostly independent of one another: as a consequence, one could perform multiple operations in parallel with minimum loss.

Finally, as future work, we wish to find a method to reinvest the newly found information about the planning model directly into an off-the-shelf planner, in order to assist it during search. This could either be done by refining the planning model as a preprocessing phase, or by allowing the planner to use our procedure as an oracle during search, to ask very specific questions about some particular aspect of the problem at hand.

8. Conclusion

In this article, we showed that a simple criterion is sometimes enough to prove that a planning instance is unsolvable. Even though our program is not optimized, we showed that resorting to search is not always necessary, as reasoning directly on the model can suffice. Even when our procedure fails, it still gathers valuable information that provides insight into the planning instance itself.

Other operations and tests can be devised and included in our framework. The most important point is to ensure that the information they bring is related to the other operations (e.g., checking if the preconditions of an operator $a$ are removable, and checking if $n^{+}(a) \leq 0$), or at least that the new information can be reinvested in the linear program.

Acknowledgments

The authors would like to thank the reviewers of 17èmes Journées d’Intelligence Artificielle Fondamentale for their insightful comments.

ORCID iDs

Arnaud Lequen

Martin C. Cooper

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the AI Interdisciplinary Institute ANITI, funded by the French program “Investing for the Future – PIA3” under grant agreement no. ANR-19-PI3A-0004.

Appendix A. Sequences of Operations and Tests

Below are shown the exact sequences of operations and tests of each sequence tested on the benchmark sets. Tests that require an argument are run on all possible arguments (e.g., PreImp tests are run on all operators), and those that require two arguments are run on all possible pairs of such arguments (e.g., OpMut tests are run on all pairs of operators).

References

Alcázar, & Torralba (2015). A reminder about the importance of computing and exploiting invariants in planning. In Proceedings of the international conference on automated planning and scheduling (vol. 25, pp. 2–6). AAAI Press.

Bonet (2013). An admissible heuristic for SAS+ planning obtained from the state equation. In F. Rossi (Ed.), IJCAI 2013, proceedings of the 23rd international joint conference on artificial intelligence (pp. 2268–2274). AAAI Press.

Christen, Eriksson, Pommerening, & Helmert (2022). Detecting unsolvability based on separating functions. In A. Kumar, S. Thiébaux, P. Varakantham and W. Yeoh (Eds.), Proceedings of the thirty-second international conference on automated planning and scheduling, ICAPS 2022 (pp. 44–52). AAAI Press.

Cserna, Doyle, W. J., Ramsdell, J. S., & Ruml (2018). Avoiding dead ends in real-time heuristic search. In S. A. McIlraith and K. Q. Weinberger (Eds.), Proceedings of the thirty-second AAAI conference on artificial intelligence (AAAI-18) (pp. 1306–1313). AAAI Press.

Fišer, Torralba, & Shleyfman (2019). Operator mutexes and symmetries for simplifying planning tasks. In Proceedings of the AAAI conference on artificial intelligence (vol. 33, pp. 7586–7593). AAAI Press.

Geffner, & Bonet (2013). A concise introduction to models and methods for automated planning. Morgan & Claypool Publishers.

Gurobi Optimization, LLC (2023). Gurobi optimizer reference manual.

Helmert (2006). The Fast Downward planning system. JAIR, 26, 191–246.

Helmert, Haslum, Hoffmann, & Nissim (2014). Merge-and-shrink abstraction: A method for generating lower bounds in factored state spaces. Journal of the ACM, 61(3), 16:1–16:63.

Lipovetzky, Muise, C. J., & Geffner (2016). Traps, invariants, and dead-ends. In A. J. Coles, A. Coles, S. Edelkamp, D. Magazzeni and S. Sanner (Eds.), Proceedings of the twenty-sixth international conference on automated planning and scheduling, ICAPS (pp. 211–215). AAAI Press.

Perron, & Furnon (2019). OR-Tools.

Pommerening, Helmert, Röger, & Seipp (2015). From non-negative to general operator cost partitioning. In B. Bonet and S. Koenig (Eds.), Proceedings of the twenty-ninth AAAI conference on artificial intelligence, January 25–30, Austin, Texas, USA (pp. 3335–3341). AAAI Press.

Pommerening, Röger, Helmert, & Bonet (2014). LP-based heuristics for cost-optimal planning. In S. A. Chien, M. B. Do, A. Fern and W. Ruml (Eds.), Proceedings of the twenty-fourth international conference on automated planning and scheduling, ICAPS. AAAI Press.

Seipp, Pommerening, Sievers, Wehrle, Fawcett, & Alkhazraji (2016). Fast Downward Aidos. In Unsolvability international planning competition: Planner abstracts (pp. 28–38). AAAI Press.

Ståhlberg, Francès, & Seipp (2021). Learning generalized unsolvability heuristics for classical planning. In Z.-H. Zhou (Ed.), Proceedings of the thirtieth international joint conference on artificial intelligence, IJCAI-21 (pp. 4175–4181). International Joint Conferences on Artificial Intelligence.

Steinmetz, & Hoffmann (2017a). Search and learn: On dead-end detectors, the traps they set, and trap learning. In Proceedings of the twenty-sixth international joint conference on artificial intelligence, IJCAI-17 (pp. 4398–4404). International Joint Conferences on Artificial Intelligence.

Steinmetz, & Hoffmann (2017b). State space search nogood learning: Online refinement of critical-path dead-end detectors in planning. Artificial Intelligence, 245, 1–37.

Steinmetz, Hoffmann, Kovtunova, & Borgwardt (2022). Classical planning with avoid conditions. In Thirty-sixth AAAI conference on artificial intelligence, AAAI 2022 (pp. 9944–9952). AAAI Press.

Van Den Briel, Benton, Kambhampati, & Vossen (2007). An LP-based heuristic for optimal planning. In Principles and practice of constraint programming–CP 2007: 13th international conference (pp. 651–665). Springer.

		Operators			Others
Set	Diff.	PreImp	OpCount	Rem.	LMDet	FReach	FNegGoal
cave-diving (13/24)	+11	10.0%	14.1%	10.4%	1.1%	4.8%	3.0%
diagnosis (19/20)	0	0%	57.0%	11.6%	18.3%	4.6%	17.6%
doc-transfer (5/20)	0	13.0%	26.4%	27.9%	1.7%	0.0%	39.8%
over-nomystery (2/24)	0	33.4%	25.7%	34.8%	2.1%	0%	7.4%
over-rovers (8/20)	0	27.9%	17.2%	29.3%	0%	<0.1%	0%
over-tpp (6/29)	0	7.4%	54.8%	24.7%	0.3%	0.3%	0%
pegsol (24/30)	+24	13.6%	N/A	13.6%	0.8%	N/A	N/A
sliding-tiles (20/20)	0	0%	0%	0%	0%	0%	69.2%

Analysis of Planning Instances Without Search


Table 1. Summary of the Results Returned by the LP-Based Criterion, Run on the Unsat Planning Competition Benchmark Set.

Set	Unsat	Total
bag-transport	19	29
bottleneck	25	25
cave-diving	1	25
chessboard-pebbling	23	23
over-tpp	2	30
pegsol-row5	14	15
Tetris	20	20
Remaining	0	180
Total	104	347


Linear Program 2 ( $L_{Π}^{op}$ )

Lemma 2
Let $Π = ⟨F, I, O, G⟩$ and $a \in O$. If $Π_{∣a} = ⟨F, I, O ∖ \{a\}, G⟩$ is unsolvable, then $a$ is a landmark.

This allows us to define the landmark detection test below, where $Π_{∣ a}$ is as defined in Lemma 2.

4.1.2. Operator Count

Integer Program 1 ( $L_{Π}^{opt} (a)$ )