Abstract
Stem cells have the remarkable potential to develop into many different cell types. Major research efforts have therefore focused on transplanting stem cells and/or their derived progenitors to replace depleted or diseased cells in degenerative disorders. Understanding the molecular controls, including alternative splicing, that operate during lineage differentiation of stem cells is crucial for developing stem cell therapeutic approaches in regenerative medicine. Alternative splicing allows a single gene to encode multiple transcripts with different protein coding sequences and RNA regulatory elements, thereby increasing genomic complexity. Differences in alternative splicing may be a more sensitive molecular marker of the degree of stem cell differentiation than gene expression alone. Moreover, alternative splicing may provide a new approach for obtaining induced pluripotent stem cells or promoting cell–cell transdifferentiation for restorative therapies and basic medical research. In this review, we highlight recent advances in the regulation of alternative splicing in stem cells and their progenitors. We hope it will provide much-needed knowledge for understanding stem cell biology and its applications.
Introduction
Stem cells are currently regarded by many researchers as having virtually unlimited application in the treatment and cure of many human diseases and disorders, including stroke (12, 26, 50–56, 58), Parkinson's disease (19), Alzheimer's disease (36), diabetes (13, 43), and cancer (24). Stem cells are undifferentiated cells with an extended self-renewal capacity and the ability to develop into multiple cell types. There are two general types. The first, embryonic stem cells (ESCs), derive from the inner cell mass of a blastocyst and can differentiate into all kinds of cells in the animal. The second, adult stem cells, are found in the mature animal body and have limited potency. For instance, hematopoietic stem cells (HSCs) are found in the bone marrow and give rise to all the blood cell types. Mesenchymal stem cells (MSCs) exist in many tissues and can differentiate into a variety of cell types. Neural stem cells (NSCs) have been isolated from various areas of the adult brain and can differentiate along neuronal and glial lineages. In addition, induced pluripotent stem cells (iPSs), which are artificially derived from nonpluripotent cells, are a novel type of pluripotent stem cell.
Understanding the molecular control of gene expression during differentiation of stem cells and their progenitors, through chromatin modification, transcription factors, posttranscriptional regulation by alternative splicing and microRNAs, and posttranslational modification, is important for developing stem cell therapeutic approaches in regenerative medicine. Splicing is a modification of eukaryotic pre-mRNA that takes place after transcription, in which introns are removed and exons are joined. The resulting mRNA can then be translated into a correct protein. Splicing is mainly catalyzed by the spliceosome, a complex of five small nuclear ribonucleoprotein particles (snRNPs), each containing a small nuclear RNA (snRNA), and numerous additional protein factors, but there are also self-splicing introns. The major spliceosome assembles on the pre-mRNA in a sequential manner, in the order of U1, U2, and then the U4/U6.U5 tri-snRNP. After the tri-snRNP binds, the spliceosome undergoes an extensive structural rearrangement, including the release of U1 and U4 and the addition of the Prp19-associated complex (NTC), and becomes a catalytic spliceosome. The catalytic spliceosome promotes two sequential transesterification reactions. First, the 2′OH of a specific branch-point nucleotide within the intron performs a nucleophilic attack on the first nucleotide of the intron at the 5′ splice site to form the lariat intermediate. Second, the 3′OH of the released 5′ exon carries out a nucleophilic attack at the last nucleotide of the intron at the 3′ splice site, thus joining the exons and releasing the intron lariat. After the two transesterification reactions are complete, the postcatalytic spliceosome first releases the mature mRNA and then recruits the NTR complex, which disassembles all components for a new round of splicing (60, 61).
Alternative splicing (AS) is the mechanism by which splicing of a single pre-mRNA varies. Through AS, the exons of a pre-mRNA are joined in different combinations, producing a large number of transcripts with different protein coding sequences and RNA regulatory elements. Several common types of AS, including cassette exons, mutually exclusive exons, competing 5′ splice sites, competing 3′ splice sites, and retained introns, are illustrated in Figure 1 (5, 28, 37). In addition, the use of different promoters or different polyadenylation sites can also result in AS (Fig. 1). After AS, the mature mRNA often carries a frameshift, uses a different start or stop codon, or has modified 5′ and 3′ untranslated regions containing different regulatory elements that affect mRNA translation, stability, and localization (5, 28, 37). It has been reported that one third of annotated AS transcripts contain premature termination codons (PTCs) and are subject to nonsense-mediated decay (AS-NMD) (39).

Major types of alternative splicing (AS). A cassette exon is either included in or skipped from the transcript. Mutually exclusive exons are never joined together. Competing 5′ or 3′ splice sites and retained intron sequences are also observed in AS. In addition, alternative promoters or termination sites are sometimes used during AS.
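The link between cassette exon skipping, frameshifts, and premature termination codons can be illustrated with a short sketch. It is not taken from the cited studies: the exon sequences are invented toy examples, the function names are hypothetical, and the sketch assumes that the first exon begins at the start codon.

```python
from typing import List, Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def skipping_preserves_frame(exon_length: int) -> bool:
    """Skipping an exon keeps the downstream frame only if its length is a multiple of 3."""
    return exon_length % 3 == 0


def splice(exons: List[str], skip: Optional[int] = None) -> str:
    """Join exons in order, optionally leaving out the exon at index `skip`."""
    return "".join(e for i, e in enumerate(exons) if i != skip)


def first_stop_index(cds: str) -> Optional[int]:
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    seq = cds.upper().replace("T", "U")
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy three-exon transcript (hypothetical sequences); the middle exon is 4 nt long.
toy_exons = ["AUGGCC", "AAGC", "UGACCGAAUAA"]
included = splice(toy_exons)          # cassette exon included
skipped = splice(toy_exons, skip=1)   # cassette exon skipped

print(skipping_preserves_frame(len(toy_exons[1])))
print(first_stop_index(included), first_stop_index(skipped))
```

In this toy case the skipped isoform reaches a stop codon earlier than the included isoform, which is the kind of event that could mark a transcript for AS-NMD. Real NMD prediction also depends on the position of the stop codon relative to exon junctions, which the sketch does not model.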
Many studies have shown that up to 74% of human genes undergo AS, with noticeable variation across tissue types and developmental stages (25). AS of genes involved in angiogenesis, adhesion, apoptosis, invasion, metastasis, proliferation, and hormone signaling is now well documented (5, 28, 37). For example, an antiangiogenic family of VEGF AS isoforms has been identified and named VEGFXXXb. VEGFXXXb isoforms originate from an alternative 3′ splice site in exon 8 of VEGF and differ by only six amino acids at the C-terminus. This alternatively spliced sequence radically changes the functional properties of VEGF (28). AS isoforms of the cell adhesion molecule and metastatic effector CD44 have been found in prostate and breast cancer cells. CD44 isoforms that include the cassette exon v5 are associated with enhanced malignancy and invasiveness (30, 31). The chemokine CXCL12α promotes the oriented migration and tissue homing of many cell types through interactions with the CXCR4 receptor and heparan sulfate (HS). CXCL12γ, an AS isoform of CXCL12α, has a high content of basic residues, which may confer specific regulation by HS, and is optimized to ensure its strong retention at the cell surface (29). Because AS events are widespread in eukaryotic cells, it is not surprising that AS is important in both development and related disease. Many human diseases arise from abnormal splicing of crucial transcripts or from the appearance of deficient splice isoforms in affected tissues (63). Examples of disease genes include cystic fibrosis transmembrane conductance regulator (CFTR), microtubule-associated protein tau (MAPT), and survival of motor neuron 1 (SMN1) (5). Therapeutic approaches that make use of AS have also been described (16).
Most genetic and biochemical studies of the control of AS have been carried out in yeast, fly, nematode, mouse, rat, and human models (3). Candidate splice sites are easy to predict by scanning a pre-mRNA sequence for consensus splice sites. Interestingly, exons flanked by strong consensus splice sites are not always spliced. Conversely, exons flanked by weak consensus splice sites are sometimes spliced well. This phenomenon can be explained by auxiliary sequences that help to define real exons. These auxiliary sequences are cis-regulatory elements located in the exonic and intronic regions of the gene. Collectively they are also called the splicing code, and they comprise exonic and intronic splice enhancers (ESEs and ISEs, respectively) and, conversely, exonic and intronic splice silencers (ESSs and ISSs, respectively), as illustrated in Figure 2 (5, 41, 64).

A diagram of cis-regulatory element regulation. Besides the splicing consensus sequences, a number of auxiliary elements can control AS, such as exon splicing enhancers and silencers (ESEs and ESSs) and intron splicing enhancers and silencers (ISEs and ISSs). Inclusion or skipping of an exon is determined by the balance of these elements' competing effects, which in turn might be determined by the relative concentrations of specific trans-acting splicing factors.
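Why consensus sequences alone over-predict splicing can be seen in a deliberately naive sketch. It relies only on the well-known rule that most introns begin with GU and end with AG; the function names, the toy sequence, and the minimum intron length are illustrative assumptions, and the sketch ignores the branch point, the polypyrimidine tract, and all enhancer and silencer elements.

```python
from typing import List, Tuple


def candidate_splice_sites(pre_mrna: str) -> Tuple[List[int], List[int]]:
    """Return start positions of every GU (possible 5' site) and AG (possible 3' site)."""
    seq = pre_mrna.upper().replace("T", "U")
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GU"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


def candidate_introns(pre_mrna: str, min_len: int = 4) -> List[Tuple[int, int]]:
    """Pair every GU with every downstream AG; intervals are half-open (start, end).

    min_len is an arbitrary illustrative threshold, not a biological constant.
    """
    donors, acceptors = candidate_splice_sites(pre_mrna)
    return [(d, a + 2) for d in donors for a in acceptors if a + 2 - d >= min_len]


# Even an 8-nt toy sequence yields several mutually incompatible "introns".
print(candidate_introns("GUAGGUAG"))
```

Because GU and AG dinucleotides occur by chance throughout any transcript, the number of candidate pairs grows rapidly with sequence length, which is consistent with the need for additional cis-regulatory elements to define real exons.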
Splice enhancers and silencers are recognized by trans-acting splicing factors. These factors are RNA binding proteins (RNABPs) that include the SR proteins and the hnRNP proteins. They recognize splice enhancers and silencers by combinatorial binding and thereby determine whether splice sites are used or skipped. The binding sequences of some RNABPs have been characterized. The FOX proteins bind UGCAUG; PTB proteins interact with UCUCU; hnRNP A/B bind GGGG; TIA-1 associates with U-rich sequences; muscleblind (MBNL) proteins bind UGCU; SUP-12 (RBM38) and the CELF proteins interact with UGUGU; and the neuronal Nova RNA binding proteins bind either YCAY or YCATY (Y, pyrimidine) (10). The relative ratio of various trans-acting splicing factors can affect AS. For example, the SR protein SF2/ASF and the protein hnRNPA1 compete for binding to pre-mRNA. When SF2/ASF is less abundant than hnRNPA1, exon skipping predominates, but when SF2/ASF is more abundant than hnRNPA1, exon inclusion is favored (28). Many posttranscriptional and posttranslational mechanisms, such as AS-NMD, miRNA, ubiquitination, SUMOylation, and phosphorylation, control the expression and cellular localization of trans-acting splicing factors, which indicates possible autoregulatory organization at the level of splicing (41) (Fig. 3).

A diagram of trans-acting splicing factor regulation. There are various upstream regulatory mechanisms that regulate the balance of nuclear trans-acting splicing factors to control splicing decisions. These mechanisms are responsive to signaling pathways.
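The binding sequences listed for the RNABPs lend themselves to a simple motif scan. The following is a minimal sketch, assuming plain string matching is an acceptable stand-in for binding; the dictionary keys and function name are hypothetical, only the YCAY form of the Nova motif is encoded, and TIA-1 is omitted because "U-rich" does not define an exact pattern.

```python
import re
from typing import Dict, List

# Motifs as given in the text; Y (pyrimidine) is written as the class [CU].
MOTIFS: Dict[str, str] = {
    "FOX": "UGCAUG",
    "PTB": "UCUCU",
    "hnRNP A/B": "GGGG",
    "MBNL": "UGCU",
    "CELF/SUP-12": "UGUGU",
    "Nova": "[CU]CA[CU]",
}


def find_motifs(rna: str) -> Dict[str, List[int]]:
    """Return 0-based start positions of each motif, including overlapping hits."""
    seq = rna.upper().replace("T", "U")
    hits: Dict[str, List[int]] = {}
    for factor, pattern in MOTIFS.items():
        # A zero-width lookahead lets overlapping occurrences be reported.
        positions = [m.start() for m in re.finditer("(?=" + pattern + ")", seq)]
        if positions:
            hits[factor] = positions
    return hits


print(find_motifs("AAUGCAUGAAUCUCUAA"))
```

A hit only marks a candidate site. Whether a factor actually binds and whether binding enhances or silences a nearby exon depends on context and on the relative abundance of competing factors, as described above.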
Although many studies have addressed the functions and mechanisms of AS for specific transcripts, systematic approaches to elucidating the roles of AS events are only now beginning to be used. Expressed sequence tags (ESTs) and cDNA sequences can be aligned to genomic sequences, and transcripts with or without internal exon alignments can then be identified systematically. However, EST coverage of AS events is typically biased toward the 3′ and 5′ ends of transcripts, and too few transcripts have been sequenced to deduce how frequently specific AS exons are included or skipped. Custom microarrays and computational tools based on differential hybridization have now overcome some of the limitations of EST/cDNA analysis and permit large-scale profiling of AS (5). AS arrays carry multiple probes that hybridize to exon–exon junctions. They also contain probes within constitutive exons, so that the presence or absence of each exon junction in the transcripts of a sample can be assessed after hybridization (41) (Fig. 4). Moreover, combining microarray analysis with molecular techniques such as chromatin and RNA immunoprecipitation can identify populations of genes and transcripts regulated by specific trans-acting splicing factors (41).

Microarray-based analysis of AS. For example, a set of six probes, three targeted to exons (E1, E2, and E3 probes) and three to exon junction sites (E1–E2, E2–E3, and E1–E3 probes), permits detection of exon inclusion or exclusion in various transcripts from different tissues or developmental stages.
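How junction probe signals might be turned into an inclusion estimate for the middle exon E2 can be sketched as follows. This is an illustrative calculation, not the analysis used in the cited studies: it assumes background-corrected, non-negative intensities on a linear scale, treats E1–E2 and E2–E3 as evidence for inclusion and E1–E3 as evidence for skipping, and the function name is hypothetical.

```python
from typing import Optional


def exon_inclusion_ratio(e1_e2: float, e2_e3: float, e1_e3: float) -> Optional[float]:
    """Estimate the fraction of transcripts that include exon E2.

    The two inclusion junctions are averaged so that they weigh the same
    as the single skipping junction. Returns None if no junction signal
    is present.
    """
    inclusion = (e1_e2 + e2_e3) / 2.0
    total = inclusion + e1_e3
    if total <= 0:
        return None
    return inclusion / total


# Hypothetical intensities for one sample.
print(exon_inclusion_ratio(e1_e2=80.0, e2_e3=100.0, e1_e3=10.0))
```

Comparing this ratio between samples, for example between undifferentiated and differentiated cells, gives a per-exon readout that is independent of overall gene expression, provided the E1 and E3 exon probes confirm that the gene is expressed at all.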
Microarrays and computational tools have provided novel insights into how AS is modulated. First, for different cell types and differentiation stages, the patterns of AS and the relative ratios of specific trans-acting splicing factors are reproducible markers. Second, the patterns of AS and the relative ratios of specific trans-acting splicing factors are dynamic and respond to intra- and extracellular signals. Third, functionally related transcripts can be coregulated in splicing networks to promote specific biological functions (41). In this review, we focus on the mechanism of AS in stem cell differentiation. The regulation of AS in several types of stem cells is discussed below.
Transcriptome Diversity and Alternative Splicing of Stem Cells
Using AS microarrays, researchers have compared transcriptional profiles among different stem cell lines and studied expression changes during the differentiation of stem cells into various lineage-committed cells (2, 7, 8, 15, 27, 40, 66). These studies showed that many genes are upregulated or downregulated through AS during stem cell programming and differentiation. At different stages of differentiation, AS generates different transcripts and often contributes to the regulation of gene expression by generating tissue-specific mRNA and protein isoforms (3, 9, 22, 68). AS therefore plays important roles in regulating lineage-specific gene expression and function (4, 21). The next step for the splicing field is to identify and catalog the extensive AS isoforms present at various stages of stem cell differentiation.
Embryonic Stem Cells
Histone deacetylase 7 (HDAC7) has an essential role in regulating gene expression during the differentiation of ESCs into smooth muscle cells (SMCs). Platelet-derived growth factor enhanced ESC differentiation into SMCs by increasing HDAC7 splicing. The data revealed that HDAC7 splicing induced SMC differentiation through regulation of the SRF-myocardin complex (35).
Fibroblast growth factor 4 (FGF4) is a key candidate autocrine signal and is expressed by undifferentiated human ESC (hESC) lines. Adding recombinant FGF4 to hESCs promotes cell proliferation. FGF4si is a novel FGF4 AS isoform that encodes the amino-terminal half of FGF4. FGF4si is an antagonist of FGF4 and blocks FGF4-induced Erk1/2 phosphorylation, effectively countering the effect of FGF4 in undifferentiated hESCs. Expression analysis shows that both isoforms are expressed in hESCs and early differentiated cells. FGF4si continues to be expressed after cell differentiation, whereas FGF4 does not. siRNA knockdown of FGF4 increased differentiation of hESCs (38).
CoAA is a splicing coactivator that regulates pre-mRNA splicing. The CoAA gene is expressed in hESCs and is alternatively spliced in different tissues into three isoforms: CoAA, CoAM, and CoAR. During retinoic acid-induced differentiation of P19 stem cells, expression of CoAA switches rapidly to its dominant negative splice variant CoAM in the embryoid body cavity. CoAM inhibits CoAA function and upregulates the differentiation marker Sox6. Experiments with a CoAA minigene cassette showed that the AS switch between CoAA and CoAM is controlled by a cis-regulating element upstream of the CoAA promoter. Interestingly, the CoAA gene often loses this cis-regulating element in human cancer cells. This selective loss potentially deregulates CoAA AS and alters stem cell differentiation (65).
The role of POU domain, class 5, transcription factor 1 (POU5F1) in maintaining the totipotency of hESCs has been demonstrated. In humans, there are two AS isoforms of POU5F1, POU5F1_iA and POU5F1_iB, which show different temporal and spatial expression patterns. During human preimplantation development, prominent POU5F1_iA expression was detected in all nuclei of compacted embryos and blastocysts, whereas POU5F1_iB expression was detected from the four-cell stage onward in the cytoplasm of all cells (11).
The two AS isoforms of the hepatocyte nuclear factor 1 (HNF1) transcription factor family, HNF1 and variant HNF1 (vHNF1), are highly homologous in their atypical POU homeodomain and dimerization domain but differ in their transactivation domains. vHnf1-deficient mouse embryos die soon after implantation because they develop a defective visceral endoderm. In contrast, Hnf1 is induced at later developmental stages than vHnf1, and its absence does not result in embryonic lethality or developmental defects. vHNF1 behaves differently depending on the particular target gene and assists in the organization of a functional visceral endoderm (23).
Tonicity-responsive enhancer binding protein (TonEBP), also known as nuclear factor of activated T cells 5 (NFAT5), is a DNA binding protein that plays an important role in the response of cells to hypertonicity. TonEBP is present in ESCs and throughout the stages of fetal development. Extensive AS in exons 2–4 was detected during development and in different adult tissues. Four AS isoforms are produced, with different lengths at the N-terminus. Two of the isoforms differ in their ability to stimulate transcription (34).
Protein kinase Cδ (PKCδ) plays an essential role as a regulator of cellular apoptosis in response to various stimuli. PKCδI is proteolytically cleaved at its hinge region (V3) by caspase 3, and the resulting fragment is sufficient to stimulate apoptosis in various cell types. Interestingly, the mouse AS isoform PKCδII resists caspase cleavage because of a 78-bp insertion within the caspase recognition site in its V3 domain. Overexpression of PKCδI promotes apoptosis, whereas overexpression of PKCδII protects cells from apoptosis. In NT2 cells, retinoic acid regulates the expression of PKCδ AS variants (46).
Hematopoietic Stem Cells
Acute myeloid leukemia 1 (AML1, also known as runt-related transcription factor 1, RUNX1) plays a fundamental role in definitive hematopoiesis and encodes the DNA binding subunit of the heterodimeric transcription factor complex PEBP2 (CBF). Transcription of AML1 is controlled by two distinct promoters, which give rise to the AML1b and AML1c isoforms, respectively. AML1b is present in undifferentiated ESCs and is upregulated at an early developmental stage, whereas AML1c is upregulated slowly and then maintained steadily during embryogenesis. These two AS isoforms, driven by their own promoters, have different expression patterns and are likely to have different functions in early hematopoietic development (20).
The red cell membrane skeleton protein 4.1R (4.1R) is a major component of the red cell membrane skeleton. It stabilizes the spectrin-actin network and interacts with different skeletal and transmembrane proteins. The 4.1R gene contains over 25 exons, most of which are subject to various AS mechanisms that produce many isoforms. For instance, exon 2′ includes the translation initiation site AUG1, and its inclusion in or exclusion from mature 4.1R mRNA controls expression of longer or shorter isoforms of 4.1R protein. Exon 1A-type transcripts skip exon 2′ and use the downstream AUG2 for translation of the 80-kDa 4.1R protein, whereas exon 1B transcripts contain exon 2′ and initiate at AUG1 to produce 135-kDa isoforms (45). In addition, the C-terminal sequence of 4.1R encoded by exons 20 and 21 contains a binding region for nuclear mitotic apparatus protein (NuMA) and also gives rise to two AS isoforms. CO.1 lacks most of the exon 20-encoded sequence and has a missense C-terminal sequence. CO.2, in contrast, has the normal exon 21-encoded C-terminal sequence but lacks the exon 20-encoded sequence; it assembles at spindle poles and colocalizes with NuMA in erythroid and lymphoid mutated cells (18). Moreover, during erythropoiesis, activation of 4.1R exon 16 (E16) inclusion is a physiologically important AS event that enhances binding of 4.1R to spectrin and actin during red blood cell membrane biogenesis. E16 splicing can be upregulated by Fox-2 or Fox-1, two related splicing factors that share the same RNA recognition motif and act through UGCAUG elements in the proximal intron downstream of E16; either factor can induce E16 splicing. E16 splicing is downregulated by the binding of hnRNP A/B proteins to an ESS, and downregulation of hnRNP A/B proteins in erythroblasts results in activation of E16 inclusion (17, 47).
Chronic myelogenous leukemia (CML) is a malignancy of HSCs induced by the p210BCR/ABL protein. BCR/ABL promotes the expression of multiple genes involved in pre-mRNA splicing. β-Integrin signaling is essential for HSC maintenance, proliferation, and differentiation, and is abnormal in CML. AS of the β1-integrin-responsive nonreceptor tyrosine kinase gene Pyk2 enhanced expression of full-length Pyk2 in BCR/ABL-containing cells, which may contribute to CML pathogenesis (48).
CD133 is a novel cell surface protein. Its function is not clear, but its expression in the hematopoietic system is limited to CD34+ stem cells. The human CD133 gene has at least nine exons in its 5′-untranslated region (UTR), which give rise to at least seven 5′-UTR AS transcripts of CD133 that are expressed in a tissue-specific manner (49). This suggests different roles for these transcripts in fetal development and mature organ homeostasis (67).
The c-Myb transcription factor controls the proliferation and differentiation of hematopoietic cells, and activation of c-myb promotes leukemias and lymphomas in animals. Relatively minor changes in c-Myb protein structure introduced by AS can change the expression of the genes it regulates and can unleash its latent transforming activities. The c-Myb isoforms show differences in transcriptional activity and specificity. AS of c-myb may therefore be a mechanism that alters its transforming activity in human leukemias (44).
Neural Stem Cells
Sam68 (Src-associated in mitosis, 68 kDa) is a KH domain RNABP involved in a variety of cellular functions, including AS. Using RNAi against Sam68 together with AS microarrays, researchers identified a set of alternative exons whose splicing depends on Sam68. Detailed analysis of one newly identified target exon, in epsilon sarcoglycan (Sgce), showed that both RNA elements distributed across the adjacent introns and the RNA binding activity of Sam68 are required to suppress the Sgce exon. Sam68 protein increases upon neuronal differentiation of P19 cells, and many RNA targets of Sam68 change in expression and splicing during this process. When Sam68 is knocked down, many Sam68-dependent splicing changes do not occur and P19 cells fail to differentiate. The differentiation of primary neuronal progenitor cells from embryonic mouse neocortex is repressed by Sam68 depletion and induced by Sam68 overexpression. Thus, Sam68 controls neurogenesis through its influence on the AS of many specific RNA targets (14).
Tenascin C (Tnc) is a multimodular extracellular matrix glycoprotein that is present in the ventricular zone of the developing brain. More than 25 different Tnc AS isoforms have been identified in NSCs. After overexpression of the homeodomain protein Pax6, the larger Tnc AS isoforms containing additional fibronectin type III domains were upregulated, whereas the smaller Tnc AS isoforms with no or only one additional fibronectin type III domain were downregulated (62). In addition, Sam68 is a target of Tnc signaling in NSCs, and its overexpression also selectively increased the larger Tnc isoforms (42).
The polypyrimidine tract-binding protein (PTB/PTBP1) is important for maintaining the identity of nonneuronal cells and prevents them from differentiating into neurons (6, 33, 57). PTB is a splicing repressor of neuron-specific exons (1). Knockdown of PTB protein was sufficient to trigger neuron-specific AS in nonneuronal cells. Interestingly, neurons express an antagonist of PTB, nPTB (PTBP2), which acts as a weaker splicing repressor than PTB. PTB and nPTB are expressed in a mutually exclusive fashion. PTB directly blocks nPTB expression in nonneuronal cells by preventing inclusion of nPTB exon 10, which introduces a PTC and leads to degradation of nPTB mRNA via AS-NMD (33). In neurons, however, a neuronal-enriched microRNA (mir-124) directly binds the 3′-untranslated region of PTB, represses PTB expression, and promotes neuronal differentiation of mouse P19 cells. These data identified a “posttranscriptional control” that reprograms AS during neuronal development.
Mammalian Numb (mNumb) has multiple functions and plays key roles in the central nervous system (CNS), including maintenance of neural progenitor cells and promotion of neuronal differentiation. mNumb has two types of AS isoforms, distinguished by the presence or absence of specific amino acids in the proline-rich region (PRR) of the C-terminus. The human Numb isoform with the long PRR domain (hNumb-PRRL) is mainly expressed during early neurogenesis in the CNS and promotes proliferation of both neuroepithelial cells and postembryonic neuroblasts without affecting neuronal differentiation. The human Numb isoform with the short PRR domain (hNumb-PRRS) is expressed throughout neurogenesis in the embryonic CNS, inhibits proliferation of the stem cells, and induces neuronal differentiation. Some studies showed that hNumb-PRRS downregulated the amount of nuclear Notch more strongly than hNumb-PRRL and could antagonize Notch functions, probably through endocytic degradation. The two types of hNumb isoforms could thus support different phases of neurogenesis in the embryonic CNS (59).
Conclusion
Understanding the mechanisms that control stem cell differentiation will help us to develop effective stem cell-based therapies. Over the past several years, microarray studies have mainly compared differences in gene expression among various stem cell and somatic cell populations. However, the posttranscriptional regulation of the transcriptome during stem cell differentiation has remained unclear. Future work to characterize the AS mechanisms related to stem cell differentiation will proceed in several directions. The first is to produce an extensive data pool of AS variants in stem cells and their lineages. The second is to identify the cis-regulatory elements and trans-acting splicing factors of AS involved in stem cell differentiation. The third is to characterize the signaling pathways and regulatory mechanisms that govern the expression and activity of trans-acting splicing factors in stem cell differentiation. In addition, using different AS patterns and the relative ratios of specific trans-acting splicing factors as molecular markers of stem cell differentiation would be a promising idea. At present, reprogramming of a somatic cell to an ESC-like state (iPS) by overexpression of specific factors that are highly expressed in ESCs, such as Oct4, Sox2, c-Myc, and Klf4, is an area of intense interest (32). Expression or inhibition of specific trans-acting splicing factors may be a new method for obtaining iPS cells or promoting cell–cell transdifferentiation, with applications in stem cell biology and regenerative medicine (Fig. 5).

Two strategies for generating iPS cells. (A) Yamanaka method (32). (B) The concept of retroviral transduction of a specific trans-acting splicing factor, or transient expression of RNAi or miRNA against a specific trans-acting splicing factor.
