Abstract

We present a fast method for computing the Hamming distance vector, based on a simple preprocessing of the target text. For applications to protein sequences, with an alphabet of 20 or more symbols, the proposed method is an order of magnitude faster than the brute-force approach, while being much simpler than previously published methods.
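
To make the idea of preprocessing the target text concrete, the following is a minimal illustrative sketch, not necessarily the authors' exact method, since the abstract does not spell out the details. It assumes the preprocessing is a table of occurrence positions for each symbol of the text, and that the distance vector is obtained by counting matches per alignment. The names `hamming_distance_vector`, `positions`, and `matches` are hypothetical.

```python
from collections import defaultdict


def hamming_distance_vector(text, pattern):
    """Return d[s] = Hamming distance between pattern and text[s:s+m]
    for every alignment s in 0..n-m (sketch under stated assumptions)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Assumed preprocessing: for each symbol, the positions where it
    # occurs in the target text.
    positions = defaultdict(list)
    for i, symbol in enumerate(text):
        positions[symbol].append(i)

    # Count matches per alignment: pattern[j] matches text[i] exactly
    # when the pattern is placed at alignment s = i - j.
    matches = [0] * (n - m + 1)
    for j, symbol in enumerate(pattern):
        for i in positions.get(symbol, ()):
            s = i - j
            if 0 <= s <= n - m:
                matches[s] += 1

    # Hamming distance = pattern length minus number of matches.
    return [m - k for k in matches]


def brute_force(text, pattern):
    """Reference implementation: compare every alignment symbol by symbol."""
    n, m = len(text), len(pattern)
    return [
        sum(1 for j in range(m) if text[s + j] != pattern[j])
        for s in range(n - m + 1)
    ]
```

In this sketch the inner loop touches only positions where symbols actually agree, so on a text whose symbols are spread roughly evenly over an alphabet of size σ it performs on the order of nm/σ counter increments rather than the nm comparisons of the brute-force loop. That is why a larger alphabet, such as the 20 amino acids, would favor this kind of approach.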

A Fast and Simple Pattern Matching with Hamming Distance on Large Alphabets
