An integrative sparse boosting analysis of cancer genomic commonality and difference

Abstract

In cancer research, high-throughput profiling has been extensively conducted. Recent studies have carried out integrative analysis of data on multiple cancer patient groups/subgroups. Such analysis has the potential to reveal genomic commonality as well as difference across groups/subgroups. However, in the existing literature, methods that pay special attention to genomic commonality and difference are very limited. In this study, a novel estimation and marker selection method based on the sparse boosting technique is developed to address the commonality/difference problem. In terms of technical innovation, a new penalty and a new computation of increments are introduced. The proposed method can also effectively accommodate the grouping structure of covariates. Simulation shows that it can outperform direct competitors under a wide spectrum of settings. Two TCGA (The Cancer Genome Atlas) datasets are analyzed, showing that the proposed method can identify markers with important biological implications and has satisfactory prediction performance and stability.

Keywords

Integrative analysis, commonality and difference, sparse boosting, cancer genomics


