Abstract
Consider a DNA mapping project in which overlap of clones is inferred from multiple complete restriction enzyme digests. Each enzyme cuts each clone randomly into fragments whose lengths are measured with some error. Clones that share fragments with matching lengths could contain a region of overlap. However, matching fragment lengths may arise by random coincidence, leading to a false declaration of overlap. Although the probability of a false fragment match is small, a mapping project involves a large number of clone comparisons. Consequently, erroneous fragment matches can be a serious problem. We use a geometrical probability approach to develop exact integral formulas and first-order approximations for the expected value and variance of the number of fragment pairs, in various classes, that will be falsely identified as matching. We also find exact formulas for the expected value and variance of the number of true fragment matches. These formulas are useful in comparing different mapping strategies.
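To make the notion of a false fragment match concrete, the following is a minimal Monte Carlo sketch, not the exact integral formulas or approximations developed in the paper. It rests on simplifying assumptions that are not taken from the abstract: cut sites fall uniformly at random along a clone, two fragments "match" when their lengths differ by at most a fixed absolute tolerance `tol`, and the two clones compared are unrelated, so every match is false. The function names and all parameter values are hypothetical.

```python
import random


def digest(clone_length, n_cuts, rng):
    """Cut a clone at n_cuts uniformly random sites; return the fragment lengths.

    Assumption: uniform, independent cut sites (an illustrative model only).
    """
    cuts = sorted(rng.uniform(0, clone_length) for _ in range(n_cuts))
    edges = [0.0] + cuts + [float(clone_length)]
    return [b - a for a, b in zip(edges, edges[1:])]


def count_false_matches(frags_a, frags_b, tol):
    """Count fragment pairs (one from each clone) whose lengths differ by at most tol."""
    return sum(1 for x in frags_a for y in frags_b if abs(x - y) <= tol)


def mean_false_matches(clone_length, n_cuts, tol, trials, seed=0):
    """Estimate the mean number of chance matches between two unrelated clones."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = digest(clone_length, n_cuts, rng)
        b = digest(clone_length, n_cuts, rng)
        total += count_false_matches(a, b, tol)
    return total / trials


if __name__ == "__main__":
    # Hypothetical values: 40 kb clones, 9 cut sites each, 50 bp tolerance.
    estimate = mean_false_matches(40_000, 9, 50, trials=2_000)
    print(f"estimated mean false matches per clone pair: {estimate:.3f}")
```

Multiplying such a per-pair estimate by the number of clone comparisons in a project gives a rough sense of why rare coincidences can accumulate, which is the concern the formulas above address analytically.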
