BACKGROUND AND OBJECTIVES: Kinship coefficients measure the relatedness between two individuals and are widely used in genetic applications. In this study, we repurpose the kinship coefficient to facilitate sample tracking and identify potential sample swaps. Such sample integrity metrics are particularly important in two scenarios that arise in large-scale clinical studies. In the first, multiple biological samples from the same individual are routinely processed as unique samples or as technical replicates; querying the relatedness of the genomic data from two samples can identify swaps before the samples are inappropriately included in data analysis. In the second, different biological analytes from the same sample are run on multiple platforms, and it is critical to establish the correct mapping for each sample so that genomic information derived from the different platforms is linked to the same sample. In both cases, all downstream inferences rely on this mapping being correct. Kinship coefficients can directly measure the mapping accuracy and help ensure the required sample integrity.
MATERIALS AND METHODS:
We first describe the general concept of kinship coefficients and then focus on a novel adaptation of feature selection (i.e., the choice of variants and/or SNPs) that uses expressed variants, making the method suitable for the clinical setting.
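As a rough illustration of the general concept, the sketch below computes a pairwise kinship coefficient with the KING-robust estimator, one commonly used choice. This abstract does not state which estimator the study uses, so the estimator, the function names, the 0/1/2 genotype coding, and the 0.354 duplicate cutoff (the conventional KING boundary for duplicates or monozygotic twins) are all assumptions made for illustration only.

```python
def king_robust_kinship(g_i, g_j):
    """Illustrative KING-robust kinship estimate for two samples.

    g_i, g_j: equal-length sequences of genotypes coded as the number of
    alternate alleles (0, 1, or 2), with None for a missing call.
    This coding and the choice of estimator are assumptions.
    """
    het_i = het_j = 0
    sq_diff = 0
    for a, b in zip(g_i, g_j):
        if a is None or b is None:
            continue  # use only variants called in both samples
        het_i += (a == 1)
        het_j += (b == 1)
        sq_diff += (a - b) ** 2
    min_het = min(het_i, het_j)
    if min_het == 0:
        raise ValueError("no heterozygous calls shared; estimate undefined")
    # Algebraically equivalent to the count-based KING-robust formula.
    return 0.5 - sq_diff / (4 * min_het)


def looks_like_same_individual(phi, threshold=0.354):
    """Hypothetical helper: flag a pair as the same individual.

    The default threshold is an assumed cutoff, not a value from the study.
    """
    return phi > threshold


sample_a = [0, 1, 2, 1, 1, 0, 2, 1]
sample_b = [0, 1, 2, 1, 1, 0, 2, 1]   # same individual, second sample
sample_c = [2, 0, 0, 1, 2, 2, 0, 1]   # a different individual

phi_same = king_robust_kinship(sample_a, sample_b)
phi_diff = king_robust_kinship(sample_a, sample_c)
```

Two samples from the same individual should give an estimate near 0.5, while a swapped pair from unrelated individuals should give a value near or below zero, so a pair whose recorded identity disagrees with its estimate is a candidate swap.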
RESULTS:
We illustrate the adapted kinship coefficient estimate in two studies: one of lung fibrosis, in which multiple samples were routinely collected from each patient, and one of thyroid cancers, in which a cohort of samples was run on different platforms.
CONCLUSION:
We demonstrate the effectiveness of using kinship coefficients to improve sample integrity and discuss potential improvements in the methodology.