Abstract
This work revisits the classic problem of coverage in genomic shotgun assembly (the “Lander-Waterman statistics”). A novel formulation, based on the analysis of an autonomous Markov automaton, is presented, and two main conclusions are derived. The first is an evaluation of the minimum multiplicity (“coverage”) required to achieve uninterrupted covering (one single contig) with a prescribed confidence level. The second is a detailed analysis of the effect of replacing the hypothesis of fixed-length genomic fragments with that of an arbitrary distribution of lengths over a finite interval.
