Abstract
Accurate breast cancer subtype prediction is critical for precise diagnosis, treatment planning, and prognosis evaluation. Recent studies highlight the important role of epigenetic modifications in breast tumors, especially the potential of abnormal DNA methylation patterns as markers for distinct subtypes. However, developing a reliable model for subtype prediction based on DNA methylation profiles is challenging because annotated datasets are scarce. This work proposes BCtypeFinder, a breast cancer subtype prediction framework that uses a domain adaptation network combined with semi-supervised learning to address batch effects. Our model leverages both labeled and unlabeled DNA methylation data to extract domain-invariant features while aligning subtype distributions across datasets. BCtypeFinder outperforms current methods, achieving superior classification performance across multiple test cases. Furthermore, we explored the effects of batch correction in BCtypeFinder, demonstrating its ability to remove batch-specific variation among patients of the same subtype and thus improve the robustness of the classifier. BCtypeFinder is publicly available at https://github.com/joungmin-choi/BCtypeFinder.
