Abstract
Bacteria and archaea acquire resistance to genetic parasites by preferentially integrating short fragments of foreign DNA at one end of a Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR). “Leader” DNA upstream of CRISPR loci regulates transcription and foreign DNA integration into the CRISPR. Here, we analyze 37,477 CRISPRs from 39,277 bacterial and 556 archaeal genomes to identify conserved sequence motifs in CRISPR leaders. A global analysis of all leader sequences fails to identify universally conserved motifs. However, an analysis of leader sequences that have been grouped by 16S rRNA-based taxonomy and CRISPR subtype reveals 87 specific motifs in type I, II, III, and V CRISPR leaders. Fourteen of these leader motifs have biochemically demonstrated roles in CRISPR biology including integration, transcription, and CRISPR RNA processing. Another 28 motifs are related to DNA binding sites for proteins with functions that are consistent with regulating CRISPR activity. In addition, we show that these leader motifs can be used to improve existing CRISPR detection methods and enhance the accuracy of CRISPR classification.
