Abstract
RNA design, also known as the inverse problem of RNA folding, aims to find a sequence that folds into a designated target structure under a specific RNA folding model. While numerous RNA design methods have been invented to search for sequences capable of folding into a target structure under the default (Turner) RNA folding model, little attention has been given to the identification of undesignable structures. This work bridges the gap between RNA design and undesignability by introducing a series of theorems and algorithms aimed at identifying both undesignable structures and their causative local structural components, which we define as minimal undesignable motifs. We first present theorems that provide sufficient conditions for recognizing undesignable structures and propose efficient, theorem-guided algorithms to verify whether an RNA structure is undesignable. While such global undesignability sheds light on the limits of RNA design, identifying the specific motifs responsible for undesignability is critical for understanding RNA folding models and advancing design methodologies. To this end, we develop a new theoretical framework for motif undesignability and propose scalable and interpretable algorithms to identify minimal undesignable motifs within a given RNA secondary structure. Our approach establishes motif undesignability by searching for rival motifs, rather than exhaustively enumerating all (partial) sequences that could potentially fold into the motif. Furthermore, we exploit rotational invariance in RNA structures to detect, group, and reuse equivalent motifs and to construct a database of unique minimal undesignable motifs. To do so, we propose a loop-pair graph representation for motifs and a recursive graph isomorphism algorithm for motif equivalence. Our algorithms successfully identified 24 unique minimal undesignable motifs among 18 undesignable puzzles from the Eterna100 benchmark.
Surprisingly, we also find over 350 unique minimal undesignable motifs and 663 undesignable native structures in the ArchiveII dataset, drawn from a diverse set of RNA families. Last but not least, we demonstrate that our theory and algorithms can handle motifs with external loops—a critical advancement given the substantial impact of external loops on the quantity, diversity, and designability of RNA structure motifs.
