Abstract

Single-cell multi-omics technologies enable comprehensive interrogation of cellular regulation, yet most single-cell assays measure only one type of activity—such as transcription, chromatin accessibility, DNA methylation, or 3D chromatin architecture—for each cell. To enable a multimodal view for individual cells, we propose Polarbear, a semi-supervised machine learning framework that facilitates missing modality profile prediction and single-cell cross-modality alignment. Polarbear learns to translate between modalities by using data from co-assay measurements coupled with the large quantity of single-assay data available in public databases. This semi-supervised scheme mitigates issues related to low cell quantities and high sparsity in co-assay data. Polarbear first pre-trains a beta-variational autoencoder for each modality using both co-assay and single-assay profiles to learn robust representations of individual cells, and it then uses the co-assay labels to train a translator between these cell representations. This semi-supervised framework enables us to predict missing modality profiles and match single cells across modalities with improved accuracy compared with fully supervised methods, thus facilitating multimodal data integration.
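As an illustration of the two ingredients named above, the following minimal sketch shows a standard beta-weighted variational autoencoder objective and a simple way to match cells across modalities once one modality's representation has been translated into the other's latent space. The abstract does not specify the loss terms or the matching procedure, so everything here is an assumption: the function names (`gaussian_kl`, `beta_vae_loss`, `match_cells`), the standard-normal prior, the Euclidean nearest-neighbor matching rule, and the toy numbers are all hypothetical and are not taken from the Polarbear implementation.

```python
import math

def gaussian_kl(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I), summed over latent dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

def beta_vae_loss(recon_nll, mu, logvar, beta=1.0):
    """Beta-VAE objective: reconstruction term plus a beta-weighted KL term.

    recon_nll is the negative log-likelihood of the reconstructed profile,
    computed elsewhere (assumed; the likelihood model is not given in the abstract).
    """
    return recon_nll + beta * gaussian_kl(mu, logvar)

def match_cells(translated, targets):
    """For each translated embedding, return the index of the nearest target embedding.

    Uses squared Euclidean distance; this matching rule is an illustrative assumption.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(targets)), key=lambda j: sqdist(z, targets[j]))
            for z in translated]

# Toy values (hypothetical): a 2-D latent space and two cells per modality.
kl_zero = gaussian_kl([0.0, 0.0], [0.0, 0.0])        # posterior equals prior
loss = beta_vae_loss(recon_nll=3.0, mu=[1.0, 0.0], logvar=[0.0, 0.0], beta=2.0)
matches = match_cells([[0.1, 0.0], [0.9, 1.1]],      # translated embeddings
                      [[1.0, 1.0], [0.0, 0.0]])      # target-modality embeddings
print(kl_zero, loss, matches)
```

In a semi-supervised scheme of the kind described, an objective like `beta_vae_loss` could be minimized over both co-assay and single-assay profiles during pre-training, while only the paired co-assay cells would supervise the translator whose outputs are passed to a matching step such as `match_cells`.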



Multimodal Single-Cell Translation and Alignment with Semi-Supervised Learning
