Abstract
Background
Expanded GGGGCC hexanucleotide repeats, ranging from hundreds to thousands in number, located in the noncoding region of the chromosome 9 open reading frame 72 (C9orf72) gene represent the most common genetic abnormality for familial and sporadic amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) (abbreviated as C9ALS). Currently, three pathological mechanisms are proposed for C9ALS: haploinsufficiency of C9orf72, formation of nuclear RNA foci composed of sense and antisense repeats, and accumulation of unconventionally translated dipeptide-repeat (DPR) proteins. However, at present, the central mechanism underlying neurodegeneration in C9ALS remains largely unknown.
Methods
Using three distinct bioinformatics pathway analysis tools, we studied molecular networks involved in C9ALS pathology by focusing on three C9orf72 omics datasets: the proteome of C9orf72 repeat RNA-binding proteins, the transcriptome of induced pluripotent stem cell (iPSC)-derived motor neurons of patients with C9ALS, and the transcriptome of purified motor neurons of patients with C9ALS.
Results
We found that C9orf72 repeat RNA-binding proteins play a crucial role in the regulation of post-transcriptional RNA processing. The expression of a wide range of extracellular matrix proteins and matrix metalloproteinases was reduced in iPSC-derived motor neurons of patients with C9ALS. The regulation of RNA processing and cytoskeletal dynamics is disturbed in motor neurons of patients with C9ALS in vivo.
Conclusions
A bioinformatics data-mining approach suggests a logical hypothesis that C9orf72 repeat expansions, by deregulating post-transcriptional RNA processing, disturb the homeostasis of cytoskeletal dynamics and remodeling of extracellular matrix, leading to degeneration of stress-vulnerable neurons in the brain and spinal cord of patients with C9ALS.
Introduction
Recent evidence indicates that expanded GGGGCC hexanucleotide repeats, ranging from hundreds to thousands in number, located in the noncoding region of the chromosome 9 open reading frame 72 (C9orf72) gene represent the most common genetic abnormality for familial and sporadic amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD).1,2 Both constitute an overlapping continuum of a multisystem disorder affecting the central nervous system (CNS), abbreviated as C9ALS. C9orf72 is an evolutionarily conserved protein expressed most abundantly in neurons in the CNS, distantly related to the differentially expressed in normal and neoplastic cells (DENN) family of GDP–GTP exchange factors that activate Rab GTPases.3,4 Although the precise biological function of C9orf72 remains mostly unknown, C9orf72 regulates Rab GTPase-mediated endosome trafficking. 5 The genetic mutations of C9orf72 are inherited in an autosomal dominant manner with incomplete penetrance. The patients with C9orf72 repeat expansions exhibit a clinical phenotype characterized by the earlier disease onset with bulbar involvement, cognitive and behavioral impairment, psychosis, and parkinsonism.6,7 The brains of patients with C9ALS not only show the TAR DNA-binding protein-43 (TDP-43) pathology but also exhibit an accumulation of TDP-43-negative p62-positive neuronal cytoplasmic and nuclear inclusions in the cerebellar granular cell layer and the dentate gyrus of the hippocampus. 8
Currently, three pathological mechanisms are proposed for C9ALS.9,10 First, noncoding C9orf72 repeat expansions cause transcriptional silencing of the C9orf72 gene by epigenetic mechanisms involving hypermethylation of CpG islands and histone trimethylation, leading to haploinsufficiency and loss of the normal function of C9orf72.11,12 Expanded hexanucleotide repeats form thermodynamically stable, multimolecular G-quadruplex structures that directly interfere with transcription. 13 Indeed, C9orf72 protein levels are greatly reduced in brain tissues of patients with C9ALS. 14 Furthermore, deletion of the C9orf72 ortholog causes degeneration of motor neurons in zebrafish and Caenorhabditis elegans, suggesting that loss of the normal function of C9orf72 is detrimental to motor neuron survival.15,16 Second, RNAs containing expanded C9orf72 repeats, transcribed from the sense and antisense strands of the mutated allele, are concentrated in nuclear RNA foci that sequester a panel of RNA-binding proteins, leading to aberrant mRNA splicing and processing of genes pivotal for neuronal function.17,18 Third, bidirectional transcripts of expanded C9orf72 repeats are unconventionally translated in all three reading frames into aggregation-prone dipeptide-repeat (DPR) proteins via repeat-associated non-ATG (RAN) translation.19,20 DPR proteins, which accumulate in TDP-43-negative p62-positive neuronal cytoplasmic and nuclear inclusions at an early stage before the onset of TDP-43 pathology, could affect neuronal survival. However, at present, the central molecular mechanism underlying neurodegeneration in patients with C9ALS remains unknown.
Owing to recent advances in microarray and next-generation sequencing technologies following the completion of the Human Genome Project, the global analysis of the genome, transcriptome, proteome, and metabolome, collectively termed omics, helps us to characterize the genome-wide molecular basis of diseases and to identify disease-specific molecular signatures and biomarkers. Because omics approaches produce large volumes of high-throughput experimental data at one time, it is often difficult to extract biologically meaningful implications from these datasets. Recent progress in bioinformatics and systems biology enables us to illustrate the cell-wide map of complex molecular interactions with the aid of literature-based knowledge bases of molecular pathways. 21 Logically arranged molecular networks form a whole system characterized by robustness, which maintains the proper function of the system in the face of genetic and environmental perturbations. 22 In a scale-free molecular network, targeted disruption of a limited number of critical components designated hubs, on which biologically important molecular connections concentrate, could disturb overall cellular function by destabilizing the network. Therefore, integrating omics data derived from disease-affected cells and tissues with the underlying molecular networks provides an efficient and rational approach for characterizing disease-relevant pathways and for building a reasonable working hypothesis for disease mechanisms.
To establish a logically supported hypothesis accounting for the pathological role of C9orf72, we characterized molecular networks involved in C9ALS pathology by using three distinct bioinformatics pathway analysis tools. We selected three C9orf72 omics datasets suitable for molecular network analysis: the proteome of C9orf72 hexanucleotide repeat RNA-binding proteins, the transcriptome of induced pluripotent stem cell (iPSC)-derived motor neurons of patients with C9ALS, and the transcriptome of purified motor neurons of patients with C9ALS.
Methods
RNA Pull-Down Assay Dataset of C9orf72 Hexanucleotide Repeat RNA-Binding Proteins
We studied two comprehensive datasets of C9orf72 hexanucleotide repeat RNA-binding proteins in human cells identified by recent studies.13,23 One dataset comprises 288 interactors for biotinylated (GGGGCC)4 RNAs in G-quadruplex and hairpin conformations in HEK293T cells labeled by stable isotope labeling by amino acids in cell culture (SILAC). 13 They were identified by RNA pull-down assay, followed by mass spectrometry. The other dataset contains 103 interactors for biotinylated (GGGGCC)5 RNA in human cerebellum homogenates and SH-SY5Y extracts. 23 They were identified by UV crosslinking and RNA pull-down assay, followed by mass spectrometry. We combined both datasets, and searched for the RNA recognition motif (RRM; SM00360) on target proteins by using the Simple Modular Architecture Research Tool (SMART) v7 (smart.embl.de) and for prion-like domain (PrLD)-containing proteins by referring to the established dataset. 24
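The merging step can be illustrated with a minimal Python sketch. It assumes each dataset is available as a collection of protein identifiers. The identifiers below are hypothetical placeholders, sized to match the dataset sizes stated above (288 and 103) and the overlap of 38 proteins reported in the Results; the function name `combine_interactors` is likewise an illustrative choice, not part of any tool used in the study.

```python
def combine_interactors(dataset_a, dataset_b):
    """Return (union, overlap) of two collections of protein identifiers."""
    set_a, set_b = set(dataset_a), set(dataset_b)
    return set_a | set_b, set_a & set_b


# Hypothetical placeholder identifiers, NOT the real protein lists.
shared = {f"SHARED_{i}" for i in range(38)}
dataset_ref13 = shared | {f"REF13_ONLY_{i}" for i in range(288 - 38)}
dataset_ref23 = shared | {f"REF23_ONLY_{i}" for i in range(103 - 38)}

union, overlap = combine_interactors(dataset_ref13, dataset_ref23)
print(len(union), len(overlap))  # 353 38
```

The count follows from inclusion-exclusion: 288 + 103 - 38 = 353 non-redundant proteins.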
RNA-Seq Dataset of iPSC-Derived Motor Neurons of Patients with C9ALS
We retrieved an RNA deep sequencing (RNA-Seq) dataset from Gene Expression Omnibus under the accession number GSE52202 (SRP032798). It contains eight transcriptome profiles of iPSC-derived choline acetyltransferase-positive motor neurons established from four normal controls (Group 1: 00i, 03i, 14i, and 83i) and four patients with C9ALS (Group 2: 28i, 29i, 30i, and 52i). 25 The patients numbered 28i, 29i, and 52i expressed ∼800 repeats, while the patient 30i showed ∼70 repeats. Transcriptome profiling was performed by single-end sequencing on HiSeq 2000 (Illumina). After removing poly-A tails and low-quality reads from the original data, we mapped short reads to the human genome reference sequence hg19 by TopHat 2.0.9 (ccb.jhu.edu/software/tophat/index.shtml), and identified differentially expressed genes that satisfy the significance threshold of q-value (FDR-adjusted P value) < 0.05 by Cufflinks 2.1.1 (cufflinks.cbcb.umd.edu). The processed data were visualized with CummeRbund (compbio.mit.edu/cummerbund).
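The q-value filter described above can be sketched as follows. This is an illustrative post-processing step, not the authors' actual script. It assumes a tab-separated table that uses a subset of the column names found in Cuffdiff's gene-level output (`gene`, `status`, `log2(fold_change)`, `q_value`), and it assumes that controls were given as the first condition, so that a positive log2 fold change means higher expression in patients. The gene names and values in `EXAMPLE` are made up.

```python
import csv
import io

Q_CUTOFF = 0.05  # significance threshold stated in the text


def significant_genes(diff_text, q_cutoff=Q_CUTOFF):
    """Split significant rows of a Cuffdiff-style table into up/down lists."""
    up, down = [], []
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    for row in reader:
        if row["status"] != "OK":  # skip genes that were not tested
            continue
        if float(row["q_value"]) >= q_cutoff:
            continue
        log2_fc = float(row["log2(fold_change)"])
        (up if log2_fc > 0 else down).append(row["gene"])
    return up, down


# Made-up rows for illustration only.
EXAMPLE = (
    "gene\tstatus\tlog2(fold_change)\tq_value\n"
    "GENE_A\tOK\t2.10\t0.003\n"
    "GENE_B\tOK\t0.37\t0.99\n"
    "GENE_C\tOK\t-1.80\t0.01\n"
    "GENE_D\tNOTEST\t-3.00\t0.001\n"
)

up, down = significant_genes(EXAMPLE)
print(up, down)  # ['GENE_A'] ['GENE_C']
```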
Exon Array Dataset of Purified Motor Neurons of Patients with C9ALS
We retrieved an exon array dataset of postmortem frozen tissues from a recent publication. 26 It contains 12 transcriptome profiles of cervical cord motor neurons isolated by laser capture microdissection (LCM). They are derived from six normal controls, three patients with sporadic ALS, and three patients with C9ALS. Transcriptome profiling was performed on the Human Exon 1.0 ST Array (Affymetrix), followed by gene- and exon-level analysis using the Partek Genomics Suite (www.partek.com/pgs). We extracted the set of genes and exons that satisfy the significance of P < 0.01 and fold change either >1.5 or <–1.5 from the original dataset comprising 742 genes and 6449 exons differentially expressed between the total number of patients with ALS (n = 6) and controls (n = 6). 26
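The two-part threshold (P < 0.01 and fold change >1.5 or <–1.5) can be written out as a short Python sketch. It assumes the signed fold-change convention in which a downregulated feature is reported as the negative reciprocal of the patient/control ratio (for example, a ratio of 0.5 becomes -2.0). That convention, the helper names, and the feature records are assumptions made for illustration.

```python
from dataclasses import dataclass

P_CUTOFF = 0.01   # significance threshold stated in the text
FC_CUTOFF = 1.5   # fold-change threshold stated in the text


def signed_fold_change(ratio):
    """Convert a positive patient/control ratio to a signed fold change."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1.0 else -1.0 / ratio


@dataclass
class Result:
    name: str
    p_value: float
    fold_change: float  # signed: +2.0 is twofold up, -2.0 is twofold down


def is_differential(result, p_cutoff=P_CUTOFF, fc_cutoff=FC_CUTOFF):
    """True when both the P value and the fold-change criteria are met."""
    return result.p_value < p_cutoff and abs(result.fold_change) > fc_cutoff


# Made-up records for illustration only.
results = [
    Result("FEATURE_1", 0.004, signed_fold_change(0.5)),  # -2.0, passes
    Result("FEATURE_2", 0.020, signed_fold_change(3.0)),  # fails P cutoff
    Result("FEATURE_3", 0.001, signed_fold_change(1.2)),  # fails fold change
    Result("FEATURE_4", 0.009, signed_fold_change(1.6)),  # passes
]
selected = [r.name for r in results if is_differential(r)]
print(selected)  # ['FEATURE_1', 'FEATURE_4']
```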
Molecular Network Analysis
To identify biologically relevant molecular networks and pathways, we utilized three distinct bioinformatics pathway analysis tools, each built on a comprehensive knowledge base: Kyoto Encyclopedia of Genes and Genomes (KEGG) (www.kegg.jp), Ingenuity Pathways Analysis (IPA) (Ingenuity Systems; www.ingenuity.com), and KeyMolnet (KeyMolnet Data; www.km-data.jp/keymolnet). First, we imported Entrez Gene IDs of corresponding genes into the Functional Annotation tool of Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 (david.abcc.ncifcrf.gov). 27 DAVID automatically identifies KEGG pathways and Gene Ontology (GO) categories composed of the genes enriched in the given set, with statistical significance evaluated by the modified Fisher's exact test with Bonferroni correction for multiple comparisons. KEGG includes manually curated reference pathways that cover a wide range of metabolic, genetic, environmental, and cellular processes, and human diseases. 28 Currently, KEGG contains 302,965 distinct pathways generated from 463 reference pathways.
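The enrichment statistic can be sketched in plain Python. The sketch assumes that the "modified Fisher's exact test" refers to DAVID's EASE score, which is a one-sided Fisher's exact test computed after removing one gene from the list hits. The Bonferroni step is the standard multiplication of the P value by the number of tests. The counts in the example are hypothetical and the function names are illustrative. This is not DAVID's own code.

```python
from math import comb


def hypergeom_upper_tail(k, list_total, pop_hits, pop_total):
    """P(X >= k) for X annotated genes in a random list of size list_total."""
    denom = comb(pop_total, list_total)
    upper = min(list_total, pop_hits)
    numer = sum(
        comb(pop_hits, i) * comb(pop_total - pop_hits, list_total - i)
        for i in range(max(k, 0), upper + 1)
    )
    return numer / denom


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """One-sided Fisher's exact P value with one list hit removed (assumed EASE)."""
    return hypergeom_upper_tail(list_hits - 1, list_total, pop_hits, pop_total)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: 3 of 300 listed genes fall in a category that
# contains 40 of 30,000 background genes.
fisher_p = hypergeom_upper_tail(3, 300, 40, 30000)
ease_p = ease_score(3, 300, 40, 30000)
print(f"Fisher: {fisher_p:.4f}  EASE: {ease_p:.4f}")
print(f"Bonferroni over 100 tests: {bonferroni(ease_p, 100):.4f}")
```

Because the EASE variant drops one hit, it is always at least as conservative as the unmodified one-sided test, which penalizes categories supported by very few genes.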
IPA is a knowledge base that contains approximately 3,000,000 biological and chemical interactions and functional annotations with definite scientific evidence. By uploading the list of Gene IDs, the network-generation algorithm identifies focused genes integrated in a global molecular network. IPA calculates the score P value that reflects the statistical significance of association between the genes and the networks by the Fisher's exact test.
KeyMolnet contains knowledge-based contents on 164,000 relationships among human genes and proteins, small molecules, diseases, pathways and drugs. 21 They include the core contents collected from selected review articles with the highest reliability. When a list of Gene IDs is imported, KeyMolnet automatically provides the corresponding molecules as nodes on the network. The neighboring network-search algorithm selects one or more molecules as starting points to generate the network of all kinds of molecular interactions around the starting molecules, including direct activation/inactivation, transcriptional activation/repression, and complex formation within one path from the starting points. The generated network was compared side by side with 501 human canonical pathways of the KeyMolnet library. The algorithm, which counts the number of overlapping molecular relations between the extracted network and each canonical pathway, makes it possible to identify the canonical pathway showing the most significant contribution to the extracted network. 21
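The idea behind the neighboring network search and the pathway comparison can be conveyed with a toy Python sketch. It is not KeyMolnet's implementation. It assumes that molecular relations are undirected pairs of molecule names, that "within one path" means relations directly touching a starting molecule, and that canonical pathways are ranked by the raw count of shared relations. The molecule and pathway names are hypothetical, and the P values reported in the Results come from KeyMolnet's own scoring, which this sketch does not reproduce.

```python
def neighboring_network(relations, start_molecules):
    """Keep every relation that touches at least one starting molecule."""
    starts = set(start_molecules)
    return {rel for rel in relations if starts & rel}


def rank_canonical_pathways(network, canonical_pathways):
    """Rank pathways by the number of relations shared with the network."""
    overlaps = {
        name: len(network & rels) for name, rels in canonical_pathways.items()
    }
    return sorted(overlaps.items(), key=lambda item: item[1], reverse=True)


def rel(a, b):
    """Undirected molecular relation between two molecules."""
    return frozenset((a, b))


# Hypothetical toy knowledge base and canonical pathway library.
knowledge_base = {rel("A", "B"), rel("B", "C"), rel("C", "D"), rel("A", "E")}
library = {
    "pathway_1": {rel("A", "B"), rel("A", "E"), rel("C", "D")},
    "pathway_2": {rel("B", "C"), rel("A", "E")},
}

network = neighboring_network(knowledge_base, ["A"])
ranking = rank_canonical_pathways(network, library)
print(ranking)  # [('pathway_1', 2), ('pathway_2', 1)]
```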
Results
Molecular Network of C9orf72 Hexanucleotide Repeat RNA-Binding Proteins
First, we extracted the set of 353 human C9orf72 hexanucleotide repeat RNA-binding proteins from the two proteome datasets (Supplementary Table 1). They included 38 proteins shared between the two datasets. SMART revealed that 65 proteins (18.4%) have one or more RRM domains, of which 17 also have a PrLD (Table 1). Importantly, the list contains a wide variety of pre-mRNA splicing factors, along with FUS, EWSR1, HNRNPA1, HNRNPA2B1, and TAF15, all having RRM and PrLD, whose genetic mutations are causative of ALS.29–32
The set of 65 C9orf72 hexanucleotide repeat RNA-binding proteins with RRM motifs.
DAVID showed that 144 of 353 proteins (40.8%) belonged to the GO category termed as “ribonucleoprotein complex” (GO:0030529; P = 9.74E–46 corrected by Bonferroni). Molecular network analysis of 353 proteins by KEGG indicated a significant relationship with the pathways termed as “Ribosome” (hsa03010; P = 4.20E–81 corrected by Bonferroni) and “Spliceosome” (hsa03040; P = 2.36E–33 corrected by Bonferroni) (Fig. 1). IPA identified a significant relationship of these with functional networks defined by “RNA Post-Transcriptional Modification, Infectious Disease, Organismal Injury and Abnormalities” (P = 1.00E–128) (Supplementary Fig. 1) and “Gene Expression, Protein Synthesis, RNA Post-Transcriptional Modification” (P = 1.00E–102). KeyMolnet extracted the complex network comprising 1097 molecules and 1399 molecular relations (Supplementary Fig. 2). They showed a significant relationship with “HSP90 signaling pathway” (P = 3.85E–145), “Spliceosome assembly” (P = 2.23E–97), and “Intermediate filament signaling pathway” (P = 6.02E–88). Thus, three distinct pathway analysis tools commonly suggest that C9orf72 hexanucleotide repeat RNA-binding proteins play a pivotal role in the regulation of post-transcriptional RNA processing, particularly of spliceosome assembly.

KEGG molecular network of C9orf72 hexanucleotide repeat RNA-binding proteins. When Entrez Gene IDs of 353 C9orf72 hexanucleotide repeat RNA-binding proteins were imported into the Functional Annotation tool of DAVID, it identified “Spliceosome” (hsa03040) as the second most relevant KEGG pathway. Orange nodes represent C9orf72 hexanucleotide repeat RNA-binding proteins.
Molecular Network of Differentially Expressed Genes in iPSC-Derived Motor Neurons of Patients with C9ALS and Controls
Next, we reanalyzed an RNA-Seq dataset numbered GSE52202 and identified the set of 282 differentially expressed genes in iPSC-derived motor neurons of patients with C9ALS and normal controls (q value < 0.05) (Supplementary Table 2). C9orf72 mRNA expression levels were not significantly different in these cells between patients with C9ALS and controls (log2_fold change = 0.368586, q value = 0.999956). Unexpectedly, we did not find any significant differences in splicing patterns between patients with C9ALS and controls, which is in part attributable to the lower coverage of single-end RNA-Seq data. Among 282 differentially expressed genes, 11 were upregulated and 271 were downregulated in patients with C9ALS. Thus, the great majority of them were underexpressed in patients with C9ALS. The set of upregulated genes includes CBLN4, CBLN2, CBLN1, DPP6, and SLITRK2 as reported previously, 25 supporting the validity of our analysis.
DAVID showed that 66 of the 282 genes (23.4%) belonged to the GO category termed as “extracellular matrix” (GO:0031012; P = 9.74E–46 corrected by Bonferroni). Molecular network analysis of 282 genes by KEGG indicated a significant relationship with the pathways defined as “ECM-receptor interaction” (hsa04512; P = 1.95E–14 corrected by Bonferroni) (Fig. 2), “Focal adhesion” (hsa04510; P = 6.23E–12), and “TGF-beta signaling pathway” (hsa04350; P = 0.0156). Consistent with the results of KEGG, IPA identified a significant relationship of these with functional network defined by “Connective Tissue Disorders, Dermatological Diseases and Conditions, Organismal Injury and Abnormalities” (P = 1.00E–76) where ECM proteins are enriched substantially (Supplementary Fig. 3). KeyMolnet extracted the complex network composed of 1291 molecules and 1735 molecular relations (Supplementary Fig. 4). They showed a significant relationship with “MMP signaling pathway” (P = 1.26E–232), “Transcriptional regulation by Ets-1/2” (P = 8.31E–164), and “Calpain signaling pathway” (P = 1.29E–122). Thus, three distinct tools commonly suggest that the expression of extracellular matrix proteins and matrix metalloproteinases is reduced in iPSC-derived motor neurons of patients with C9ALS.

KEGG molecular network of differentially expressed genes in iPSC-derived motor neurons of C9ALS patients and controls. When Entrez Gene IDs of 282 differentially expressed genes in iPSC-derived motor neurons of C9ALS patients and controls were imported into the Functional Annotation tool of DAVID, it identified “ECM-receptor interaction” (hsa04512) as the most relevant KEGG pathway. Light blue nodes indicate downregulated genes in C9ALS.
Molecular Network of Differentially Expressed Genes and Exons in Purified Motor Neurons of Patients with C9ALS and Controls
Finally, we studied an exon array dataset and identified the set of 353 genes and 3579 exons differentially expressed in purified motor neurons between patients with C9ALS and normal controls in the setting of the significance that satisfies P < 0.01 and fold change either >1.5 or <–1.5 (Supplementary Table 3 for genes and Supplementary Table 4 for exons). Among 353 differentially expressed genes, 134 were upregulated, whereas 219 were downregulated in patients with C9ALS. Among 3579 exons, 1574 were upregulated, while 2005 were downregulated in patients with C9ALS. Thus, downregulated genes and exons outnumbered upregulated classes in patients with C9ALS. Importantly, the list includes C9orf72 as one of the downregulated genes and exons, in addition to TARDBP (TDP-43) as one of the upregulated genes and exons in patients with C9ALS.
DAVID showed that 23 of the 353 differentially expressed genes belonged to the GO category termed as “RNA splicing” (GO:0008380; P = 6.45E–6 corrected by Bonferroni) and 460 of 3579 genes corresponding to differentially expressed exons belonged to the GO category termed as “nucleotide binding” (GO:0000166; P = 1.59E–11 corrected by Bonferroni). Molecular network analysis by KEGG indicated a discernible relationship with the pathways defined as “Spliceosome” (hsa03040; P = 0.0042, uncorrected) for differentially expressed genes and defined as “Regulation of actin cytoskeleton” (hsa04810; P = 0.0116 corrected by Bonferroni) for differentially expressed exons. IPA identified the most significant relationship with functional networks defined by “RNA Post-Transcriptional Modification, RNA Damage and Repair, Protein Synthesis” (P = 1.00E–93) for differentially expressed genes (Supplementary Fig. 5) and “RNA Post-Transcriptional Modification, Amino Acid Metabolism, Post-Translational Modification” (P = 1.00E–134) for differentially expressed exons (Fig. 3). KeyMolnet extracted the highly complex network composed of 1673 molecules and 2183 molecular relations for differentially expressed genes, exhibiting the most significant relationship with “Calpain signaling pathway” (P = 1.58E–141) (data not shown). KeyMolnet identified the extremely complex network composed of 4525 molecules and 8799 molecular relations for differentially expressed exons, showing the most significant relationship with “Kinesin family signaling pathway” (P = 1.82E–157) (data not shown). Thus, all these results suggest that the regulation of post-transcriptional RNA processing, along with cytoskeletal dynamics and intracellular molecular transport, is disturbed in motor neurons of patients with C9ALS in vivo.

IPA molecular network of differentially expressed exons in purified motor neurons of C9ALS patients and controls. When Entrez Gene IDs of the genes encoding 3579 differentially expressed exons in LCM-isolated purified motor neurons of C9ALS patients and controls were imported into the core analysis tool of IPA, it identified “RNA Post-Transcriptional Modification, Amino Acid Metabolism, Post-Translational Modification” as the most relevant functional network. Red nodes indicate genes encoding upregulated exons, while green nodes represent genes encoding downregulated exons in C9ALS.
Discussion
By using three distinct pathway analysis tools of bioinformatics, we studied molecular networks involved in C9ALS pathology by focusing on three different C9orf72 omics datasets. They include proteome of C9orf72 hexanucleotide repeat RNA-binding proteins providing the most valuable biochemical information on molecular basis of C9ALS, transcriptome of iPSC-derived motor neurons of patients with C9ALS serving as the most representative cell culture model, and transcriptome of purified motor neurons of patients with C9ALS acting as the most clinically relevant in vivo source.
Molecular network analysis indicated that the proteome of human C9orf72 hexanucleotide repeat RNA-binding proteins plays an active role in the function of the ribosome, the spliceosome, and RNA post-transcriptional modification. Importantly, it is enriched in RNA-binding proteins having RRM and PrLD. They have the capacity to self-assemble through the PrLD, characterized by a cluster of uncharged polar amino acids and glycine, resulting in the generation of self-propagating amyloids.32,33 Previous studies showed that C9orf72 hexanucleotide repeats interact with HNRNPA1, HNRNPA2B1, HNRNPA3, HNRNPH, PURA, SRSF1, and ADARB2.18,34–37 In this study, the proteome included all these except for ADARB2. Notably, the proteome list contains FUS, EWSR1, HNRNPA1, HNRNPA2B1, and TAF15, all of which have RRM and PrLD and are causative of ALS and/or FTD.29–32 Furthermore, a recent study of familial ALS patients identified missense mutations in the MATR3 gene encoding an RNA- and DNA-binding nuclear matrix protein interacting with TDP-43 and DEAH (Asp–Glu–Ala–His) box helicase 9 (DHX9). 38 We identified both MATR3 and DHX9 as C9orf72 repeat RNA-binding proteins. Poly-A-binding protein, cytoplasmic 1 (PABPC1), which has four RRM domains, shuttles between the nucleus and cytoplasm, binds to the 3'-poly-A tail of mRNAs, and promotes ribosome recruitment and translation initiation. PABPC1 protein accumulates in inclusions of ALS spinal cord motor neurons. 39 Enhanced phosphorylation of eIF2α (EIF2S1) serves as a marker for stress granules and an indicator for translational suppression in a Drosophila model of TDP-43-associated neurodegeneration. 39 We identified both PABPC1 and EIF2S1 as C9orf72 repeat RNA-binding proteins.
A previous study showed that the amount of collagen proteins is greatly reduced in the cervical spinal cord of patients with ALS. 40 Supporting this, we identified downregulation of various collagen genes in iPSC-derived C9ALS motor neurons. They include COL1A2, COL2A1, COL4A1, COL4A2, COL4A5, COL5A1, COL5A2, COL5A3, COL6A2, COL6A3, COL8A2, COL11A1, COL12A1, COL13A1, COL14A1, COL15A1, and COL16A1, along with reduction of BGN, DCN, LUM, PLOD1, and P4H2 involved in synthesis, assembly, and crosslinking of collagen fibrils. Importantly, the levels of integrin (ITGB4, ITGA5, ITGA11), laminin (LAMB1, LAMA2, LAMC1), tenascin (TNC), fibronectin (FN1), fibrillin (FBN1, FBN2), thrombospondin (THBS1), and proteoglycan (HSPG2, GPC3, GPC4, CSPG4), all of which play a role in cell-to-cell and cell-to-matrix interactions essential for neuronal migration and axonal guidance during development, synaptic plasticity, and neuronal regeneration, are also decreased in patients with C9ALS. Furthermore, the genes involved in regulation of actin cytoskeleton dynamics (ACTA1, ACTA2, ACTC1, ACTG2, FLNB, GSN, and TAGLN) are downregulated in iPSC-derived motor neurons of patients with C9ALS. Previous studies showed that matrix metalloproteinases, such as MMP9 and MMP2, are overexpressed in the spinal cord of patients with ALS and of SOD1G93A transgenic mice.41,42 In contrast, we found that the levels of MMP2, MMP11, and MMP14, along with ADAMTS1 and ADAMTS9, are decreased in iPSC-derived motor neurons of patients with C9ALS, suggesting that remodeling of extracellular matrix proteins is downregulated in patients with C9ALS. C9ALS patient-derived iPSCs express intranuclear RNA foci containing hexanucleotide expansions that potentially affect processing of various RNAs by sequestering RNA-binding proteins.25,34 Importantly, aberrantly spliced genes in spinal cord motor neurons of sporadic ALS patients are enriched in the category of cell–matrix adhesion. 43 We also found downregulation of the genes regulated by Ets transcription factors and those involved in calpain signaling in iPSC-derived motor neurons of patients with C9ALS. However, these results seemingly contradict the findings that Ets-2 immunoreactivity is enhanced in astrocytes in the spinal cord of patients with ALS and that activation of calpain promotes the carboxyl-terminal cleavage of TDP-43, enhancing mislocalization of TDP-43 in affected neurons.44,45
Finally, we identified the set of 353 genes and 3579 exons differentially expressed in LCM-isolated purified motor neurons of patients with C9ALS and controls. Importantly, they constitute the molecular networks showing the most significant relationship with RNA post-transcriptional modification, consistent with the network of C9orf72 repeat RNA-binding proteins. These results suggest that a battery of genes encoding components of the post-transcriptional RNA processing machinery are aberrantly expressed and/or spliced in C9ALS motor neurons in vivo owing to the formation of nuclear RNA foci that sequester principal RNA-binding proteins. This constitutes a vicious circle that further disrupts proper RNA metabolism. A previous study found that the pre-mRNA splicing factor SRSF2, sequestered in RNA foci, acts as a key molecule in aberrant regulation of splicing. 23
Conclusions
By using three distinct pathway analysis tools, we studied molecular networks involved in C9ALS pathology by focusing on C9orf72 omics datasets. We found that C9orf72 repeat RNA-binding proteins play a crucial role in the regulation of post-transcriptional RNA processing. The expression of a wide range of extracellular matrix proteins and matrix metalloproteinases is reduced in iPSC-derived motor neurons of patients with C9ALS. The regulation of RNA processing and cytoskeletal dynamics is disturbed in motor neurons of patients with C9ALS in vivo. Taken together, our bioinformatics data-mining approach suggests a logical hypothesis that C9orf72 repeat expansions, by deregulating post-transcriptional RNA processing, disturb the homeostasis of cytoskeletal dynamics and remodeling of extracellular matrix, leading to degeneration of stress-vulnerable neurons in the brain and spinal cord of patients with C9ALS/FTD.
Author Contributions
JS designed data analysis and drafted the manuscript. YY, SK, MT, NA and YK performed molecular network analysis. All authors have read and approved the final manuscript.
Supplementary Material
Supplementary Figure 1
IPA molecular network of C9orf72 hexanucleotide repeat RNA-binding proteins.
Supplementary Figure 2
KeyMolnet molecular network of C9orf72 hexanucleotide repeat RNA-binding proteins.
Supplementary Figure 3
IPA molecular network of differentially expressed genes in iPSC-derived motor neurons of C9ALS patients and controls.
Supplementary Figure 4
KeyMolnet molecular network of differentially expressed genes in iPSC-derived motor neurons of C9ALS patients and controls.
Supplementary Figure 5
IPA molecular network of differentially expressed genes in purified motor neurons of C9ALS patients and controls.
Supplementary Table 1
The set of 353 C9orf72 hexanucleotide repeat RNA-binding proteins.
Supplementary Table 2
The set of 282 differentially expressed genes in iPSC-derived motor neurons established from C9ALS patients and controls.
Supplementary Table 3
The set of 353 differentially expressed genes in LCM-dissected motor neurons of C9ALS patients and controls.
Supplementary Table 4
The set of 3,579 differentially expressed exons in LCM-dissected motor neurons of C9ALS patients and controls.
