Abstract
Intratumor heterogeneity (ITH) impacts cancer progression, and its characterization is crucial. Clustering algorithms applied to the variant allele frequency (VAF) of mutations can facilitate the exploratory analysis of ITH. This study comparatively evaluated six clustering algorithms to characterize ITH by clustering mutations based on their VAFs. We utilized data from The Cancer Genome Atlas to analyze three cancer types by examining the distribution of clusters in the results from various methods and four internal validation metrics. The results indicated that the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Variational Bayesian Gaussian Mixture Model methods identified an insufficient number of clusters in most tumor samples. The Hierarchical DBSCAN (HDBSCAN) and Ordering Points to Identify the Clustering Structure (OPTICS) algorithms exhibited greater variability in the number of clusters, while Affinity Propagation (AP) showed controlled behavior, and Mean-Shift demonstrated greater consistency. The Mean-Shift and AP methods were consistently superior in the validation metrics, in contrast to HDBSCAN and OPTICS, which had inferior performance. We conclude that Mean-Shift and AP are promising and accessible alternatives for the initial exploratory analysis of ITH by VAFs. A computational pipeline is provided on the Google Colab platform to facilitate future studies.
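As a rough illustration of the approach described above, the following sketch clusters one-dimensional VAFs with a simplified flat-kernel Mean-Shift procedure and scores the result with the silhouette coefficient. It is not the study's pipeline. The VAFs are simulated, the clone centers, the noise level, and the bandwidth of 0.1 are assumptions chosen for illustration, and the silhouette coefficient is used only as one common internal validation metric, since the abstract does not name the four metrics used. All function names are hypothetical.

```python
import random
from statistics import mean


def simulate_vafs(seed=0, per_clone=40):
    """Simulate VAFs for three hypothetical clones (illustrative values only)."""
    rng = random.Random(seed)
    clone_centers = [0.15, 0.45, 0.80]  # assumed clone VAFs, not from the study
    return [
        min(1.0, max(0.0, rng.gauss(c, 0.02)))  # keep VAFs inside [0, 1]
        for c in clone_centers
        for _ in range(per_clone)
    ]


def mean_shift_1d(values, bandwidth=0.1, tol=1e-6, max_iter=200):
    """Simplified flat-kernel Mean-Shift for 1D data.

    Each point is shifted to the mean of the points within `bandwidth`
    until it stops moving; converged modes closer than `bandwidth`
    are merged into one cluster.
    """
    modes = []
    for x in values:
        m = x
        for _ in range(max_iter):
            window = [v for v in values if abs(v - m) <= bandwidth]
            new_m = mean(window)
            converged = abs(new_m - m) < tol
            m = new_m
            if converged:
                break
        modes.append(m)

    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) < bandwidth:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers


def silhouette_1d(values, labels):
    """Mean silhouette coefficient using absolute difference as distance."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return float("nan")  # undefined for a single cluster

    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # Mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            mean(abs(v - u) for u in other)
            for key, other in clusters.items()
            if key != lab
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


if __name__ == "__main__":
    vafs = simulate_vafs()
    labels, centers = mean_shift_1d(vafs)
    print("clusters found:", len(centers))
    print("silhouette:", round(silhouette_1d(vafs, labels), 3))
```

On real tumor samples the bandwidth would need to be chosen or estimated per sample, and a library implementation of Mean-Shift or Affinity Propagation would usually be preferable to this hand-written version.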
