Abstract
Fast and exact comparison of large genomic sequences remains a challenging task in biosequence analysis. We consider the problem of finding all ∊-matches between two sequences, i.e., all local alignments over a given length with an error rate of at most ∊. We study this problem theoretically, giving an efficient q-gram filter for solving it. Two applications of the filter are also discussed, in particular genomic sequence assembly and BLAST-like sequence comparison. Our results show that the method is 25 times faster than BLAST, while not being heuristic.
Get full access to this article
View all access options for this article.
