Abstract
In this paper, we consider the problem of calibrating diagnostic rules based on high-resolution mass spectrometry data subject to a limit of detection. The limit of detection arises from the limited ability of instruments to measure low-concentration proteins. As a consequence, peak intensities below the limit of detection are often reported as missing during the quantification step of proteomic analysis. We propose the use of censored data methodology to handle spectral measurements in the presence of a limit of detection, recognizing that these measurements are left-censored for low-abundance proteins. We replace the set of incomplete spectral measurements with estimates of the expected intensity and use those as input to a prediction model. To correct for lack of information and measurement uncertainty, we combine this approach with borrowing of information through the addition of an individual-specific random effect formulation. We present different modalities of using the above formulation for prediction purposes and show how it may also allow for variable selection. We evaluate the proposed methods by comparing their predictive performance with that achieved using the complete information, as well as with alternative methods for dealing with the limit of detection.
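As an illustration of the replacement step described in the abstract, the sketch below fills left-censored intensities with a conditional expectation. It is not the authors' implementation: it assumes, purely for illustration, that (log-)intensities follow a normal distribution with known mean `mu` and standard deviation `sigma`, and that censored peaks are stored as `None`. The function names and these parameters are hypothetical; in practice the distribution parameters would themselves have to be estimated from the censored data.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_below_lod(mu, sigma, lod):
    """E[X | X < lod] for X ~ Normal(mu, sigma^2) (assumed model).

    Uses the truncated-normal mean: mu - sigma * pdf(a) / cdf(a),
    with a = (lod - mu) / sigma. For a limit of detection far below
    mu, cdf(a) can underflow to zero; this sketch does not guard
    against that case.
    """
    a = (lod - mu) / sigma
    return mu - sigma * normal_pdf(a) / normal_cdf(a)


def impute_censored(intensities, mu, sigma, lod):
    """Replace censored entries (None) with the expected intensity below lod."""
    fill = expected_below_lod(mu, sigma, lod)
    return [fill if x is None else x for x in intensities]


# Hypothetical example: two peaks fell below a limit of detection of 0.0.
completed = impute_censored([None, 1.3, None, 0.4], mu=0.0, sigma=1.0, lod=0.0)
```

Every imputed value lies below the limit of detection by construction, which is what distinguishes this approach from substituting the limit itself or a fixed fraction of it. The completed vector can then be passed to a prediction model as input.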
