Abstract
Induced pluripotent stem cell (iPSC) reprogramming requires sustained expression of multiple reprogramming factors for a limited period of time (10–30 days). Conventional iPSC reprogramming was achieved using lentiviral or simple retroviral vectors. Retroviral reprogramming suffers from insertional mutagenesis, uncontrolled silencing, residual expression and reactivation of transgenes, and immunogenicity. To overcome these issues, various technologies have been explored, including adenoviral vectors, protein transduction, RNA transfection, minicircle DNA, the excisable PiggyBac (PB) transposon, the Cre-lox excision system, negative-sense RNA replicons, positive-sense RNA replicons, Epstein-Barr virus-based episomal plasmids, and repeated transfections of plasmids. This review summarizes the main vectorologies and factor delivery systems used in current reprogramming protocols.
Introduction
Retroviral Vectors
The retroviral vector (RV) widely used in reprogramming and gene transfer/therapy is based on the simple gamma retrovirus of murine origin, largely the Moloney murine leukemia virus (M-MuLV) [1,18–20]. The gamma RV (γ-RV) played a critical role in the development of iPSC technology due to its ability to provide relatively long-term transgene expression [1]. A retrovirus has an RNA genome that is converted into double-stranded DNA by its own reverse transcriptase. The DNA is subsequently integrated into the host genome to generate a heritable DNA provirus. The retroviral life cycle includes the production of RNA genomes via transcription of the provirus DNA, packaging of RNA genomes into viral particles, infection via interaction between the viral envelope proteins and viral receptors on host cells, reverse transcription, generation of a double-stranded DNA, and finally its integration back into the host genome as a provirus [21]. The simple gamma retrovirus encodes only three genes: gag, pol, and env. During virion maturation, Gag protein is cleaved into matrix (MA), p12, capsid (CA), and nucleocapsid (NC) by the viral protease (PR). The Pol moiety of Gag-Pol is also cleaved to release free PR, reverse transcriptase (RT), and integrase (IN) [21,22]. An advanced replication-incompetent RV can be constructed with only the essential cis-acting elements, including the packaging signal, primer binding sequence (PBS), polypurine tract (PPT), and long terminal repeats (LTRs), but devoid of all of the viral protein-coding genes. Transgenes are inserted in place of the deleted viral structural genes. However, the viral structural proteins have to be provided in trans for efficient encapsidation, infection/transduction, reverse transcription, and integration. The transgenes integrated into the host genome are then transcribed and translated by the cellular machinery.
The infectivity of viral particles depends largely on the interaction of the viral envelope protein and the viral receptor on host cells [23,24]. Vector tropism can be changed by pseudotyping the same RV system with different envelope proteins, and murine virus-based RVs can be ecotropic, amphotropic, or pantropic (or polytropic) depending on the envelope proteins used. Ecotropic vectors transduce only rodent cells due to the strict requirement for the rodent ecotropic viral receptor mCAT1 (cationic amino acid transporter); they cannot transduce human cells, although human cells do express an mCAT1 homolog that is 87% identical to the mouse counterpart [23,24]. With the sodium-dependent phosphate symporter (Ram1) as a viral receptor, amphotropic vectors transduce cells of rodent, human, chicken, dog, cat, and mink. Pantropic vectors transduce an even wider range of cells. The widely used pantropic vectors are pseudotyped with the vesicular stomatitis virus G protein (VSV-G). The VSV-G-pseudotyped virus enters cells through interaction with ubiquitous cell membrane components [25], and thus is able to infect/transduce a wide variety of cell types, including cells of insects, fish, frogs, and humans. Recently, this ubiquitous VSV-G receptor was identified as the low-density lipoprotein receptor (LDLR) [26]. VSV-G pseudotyping not only broadens the tropism, but also increases the stability of viral particles and makes it possible to concentrate viral particles by ultracentrifugation [25]. Ecotropic and amphotropic retroviral particles are usually prepared using a packaging cell line that stably expresses the viral structural proteins Gag, Pol, and Env [27], but for pantropic virus, packaging is realized by transient cotransfection of the env gene and a transfer plasmid because of the cytotoxicity of VSV-G [25,28].
Like the wild-type retrovirus, M-MuLV-based RVs transduce only dividing cells [29,30], limiting their use in delivering reprogramming factors into nondividing and slow-dividing cells. Transgenes delivered by RVs are permanently integrated into host genomes, and thus provide stable expression of transgenes. Nevertheless, transgenes can be silenced depending on the location of integration (position effect), cell type, promoter used, and viral cis-acting sequences. In embryonic stem cells (ESCs) and iPSCs, the TRIM28/ZFP809 complex silences RVs by binding to the viral PBS, but does not silence HIV1-based lentiviral vectors (LVs) [31,32]. Positional vector silencing in RVs and LVs may be alleviated by the incorporation of two classes of transcriptional regulatory elements: elements with boundary function, such as insulators and scaffold/matrix attachment regions, and elements that possess a dominant chromatin remodeling and transcriptional activating capacity, such as locus control regions and ubiquitous chromatin opening elements [33]. However, addition of these sequences impedes virus production because these elements are usually long. In addition, timely silencing might be beneficial to reprogramming [34], and transgene silencing provides a useful marker for complete reprogramming [35], although premature silencing is detrimental to reprogramming. Therefore, the best vector design to facilitate complete reprogramming should provide relatively long-term expression while also allowing timely silencing of the reprogramming factors.
HIV1-Based Lentiviral Vector
Lentiviral vectors were utilized to establish the first human iPSC line [2]. Lentiviral vectors have been developed from various viruses, including the equine infectious anemia virus (EIAV), bovine leukemia virus, simian immunodeficiency virus (SIV), feline immunodeficiency virus (FIV), foamy virus, and HIV2, but the most well-developed and widely used lentiviral vectors are based on HIV1 [36–39]. Like other lentiviruses, HIV1 is a complex retrovirus because it encodes regulatory proteins (Tat and Rev) and accessory proteins (Vpr, Vpu, Vif, and Nef) in addition to the common Gag, Pol, and Env proteins shared with simple retroviruses [40]. Development of lentiviral vectors is facilitated by knowledge of simple RVs. HIV1 vectors are constructed with HIV1 cis-acting sequences, including LTR sequences, packaging signal, PBS, central PPT, and Rev response element (Fig. 1). A non-HIV1 sequence, WPRE (woodchuck hepatitis virus post-transcriptional regulatory element), is usually included to increase transgene expression (Fig. 1). One major goal in LV development is to increase safety since HIV1 causes human disease, and this is achieved by the following approaches: (1) deleting all of the viral protein-coding genes from the transfer vector; (2) deleting some viral regulatory sequences in the transfer vector, especially the enhancer and promoter in the U3 region of the 3′ LTR, to make the vector replication incompetent (self-inactivating vector or SIN vector); (3) providing the essential viral proteins in trans from separate plasmids (split-genome approach); (4) removing the packaging signal from the packaging plasmids; and (5) including nonviral heterologous promoters in the packaging plasmids [41]. Unlike RVs, HIV1 vectors require at least one regulatory viral protein, that is, Rev, in addition to Gag, Pol, and Env.
HIV1 infects only a few types of cells (T cells, macrophages, and monocytes), and therefore the HIV1-based vectors with native envelopes have a very narrow tropism. Vector tropism can be broadened by pseudotyping with envelope proteins from various viruses, such as M-MuLV ecotropic and amphotropic Env glycoproteins, but the titer is generally low with these Env [37]. VSV-G can also pseudotype HIV1-based vectors, and gives the highest titer. The HIV1-based lentiviral vectors are usually packaged by transient cotransfection of all plasmids (transfer plasmids, packaging plasmids, and envelope plasmids) due to the toxic nature of some of the HIV1 proteins and VSV-G [42]. Inducible controlled expression of the viral packaging proteins makes it possible to package lentiviral vectors with a packaging cell line, but this method is not in wide use due to low titer and leaky expression of the toxic VSV-G and Rev [37,43,44]. In contrast to simple RVs, HIV1-based vectors can transduce nondividing and slow-dividing cells [45]. This is advantageous considering that somatic stem cells are better starting cells for reprogramming [3], but are generally slow growing or in quiescent states [46].

Fig. 1. Schematic map of the transfer plasmid of a lentiviral vector. cPPT, central polypurine tract (dark purple box); LTR, long terminal repeat (large open arrows); PBS, primer binding sequence (black dot); RRE, Rev response element (blue box); WPRE, woodchuck hepatitis virus post-transcriptional regulatory element (cyan box). Other major components are annotated in the figure. Figure is not drawn to scale.
Minicircle DNA
Minicircle vectors were used to establish footprint-free iPSC lines because of their episomal nature and low degree of vector silencing [12]. A minicircle is an episomal supercoiled circular DNA that contains only a mammalian expression cassette and is devoid of the bacterial backbone (Fig. 2). Bacterial DNA sequences (origin of replication, resistance gene, and others) in a mammalian expression construct have been found to be largely responsible for the dramatic silencing of delivered transgenes [47]. Systems have therefore been developed to generate minicircles using controlled, inducible intramolecular site-specific recombination in bacterial hosts prior to harvest of the minicircles from bacterial cultures [48]. Central to these technologies is intramolecular site-specific recombination, which can be realized via different recombinases, including Streptomyces phage integrase ΦC31 [48,49], the Cre recombinase from bacteriophage P1 [50], FLP recombinase from the yeast plasmid 2 μm circle [51], and the ParA resolvase from the multimer resolution system of the broad host-range plasmid RK2 or RP4 [52]. The resulting minicircles can be purified by cesium gradient centrifugation after the selective destruction of bacterial backbone plasmids (miniplasmids) and of the residual parental plasmids by enzymatic restriction. This enzymatic removal of miniplasmids and residual parental plasmids is costly and labor intensive. Recently, a simple system was developed in which the miniplasmids and parental plasmids are destroyed in vivo shortly after recombination by incorporating tandem repeats of the recognition sequence for yeast I-SceI outside of the recombination sites, in a genetically modified Escherichia coli strain that harbors inducible I-SceI and ΦC31 integrase genes (Fig. 2).
This leaves only minicircles in the host cells, and allows for their simple purification using standard maxipreps or minipreps [49] (Fig. 2). Minicircles can also be purified using protein–DNA interaction chromatography [52]. Minicircles are generally nonreplicating, but a replicating minicircle was reported [51].

Fig. 2. Minicircle generation system. Producer plasmid undergoes intramolecular recombination in an Escherichia coli strain that harbors the inducible ΦC31 and the rare-cutting I-SceI genes. The resulting miniplasmid of bacterial backbone is degraded by I-SceI-initiated digestion of DNA since this plasmid contains 32 copies of I-SceI recognition sequences. The second recombination product, the minicircle DNA, remains intact in the bacterial host, and can simply be purified using standard maxiprep or miniprep. attB, bacterial attachment (green star); attP, phage attachment (ochre red star).
Protein Transduction
Protein transduction is a technology to deliver exogenous proteins into cells in culture or directly into tissues of living organisms [53–57]. Owing to their apparent lack of genome integration, these technologies were used to generate footprint-free iPSCs [13,58]. Delivery of proteins into cells encounters a great barrier, the hydrophobic cellular membrane. The most widely used method is transduction mediated by a protein transduction domain (PTD; also called cell-penetrating peptide, CPP). PTDs occur naturally in many proteins, such as the HIV1 transcriptional activator TAT [59–61], the structural protein VP22 of herpes simplex virus type 1 [62], and the Drosophila transcription factor antennapedia (AntP) [63,64]. These PTDs are fused onto proteins of interest for transduction. The mechanism of PTD-mediated transduction is poorly understood, but it might involve interaction between the PTD and the cell membrane, endocytosis, and retrograde transport into the cytoplasm and nucleus [54]. A PTD is a highly basic domain containing a large proportion of arginines, so the positively charged PTD can interact with the negatively charged components of the cell membrane via electrostatic interaction. Further internalization of the PTD involves different endocytic pathways (macropinocytosis, clathrin-mediated endocytosis, and caveolin-mediated endocytosis), although earlier data suggested that this process is independent of endocytosis, energy, and specific receptors [65,66]. PTD-mediated protein transduction can occur in many different cell types, including hard-to-transfect cells such as neural cells [63], and it delivers large proteins into all tissues in live animals, including the brain, indicating an ability to cross the blood-brain barrier [67]. One drawback is the short half-life of the transduced protein in cells. Poly-arginine (around eight arginines) was found to transduce tagged proteins more efficiently than the naturally occurring PTDs [68].
PTD-mediated transduction is fast. In contrast to other delivery systems such as virus-mediated gene delivery, PTD-mediated protein transduction is not toxic to cells [61,68]. PTD not only delivers protein, it can also deliver DNA, RNA, nanoparticles, liposomes, fluorescence probes, and drugs [57]. PTD has a large cargo capacity and can deliver protein >100 kDa [67].
PTD-mediated protein transduction is an active process. Proteins can also be delivered passively into living cells by diffusion through a cell membrane temporarily opened using mechanical approaches or membrane-permeabilizing agents. One example of a mechanical approach is scrape loading [59]. The commonly used membrane-permeabilizing agent is streptolysin O, a cholesterol-binding, thiol-activated, calcium-sensitive bacterial exotoxin that forms pores of up to 35 nm in the plasma membrane of mammalian cells, large enough for large proteins to traverse but too small for organelles to escape [69–71]. When permeabilization and resealing conditions are optimized, the process is reversible, and the treated cells remain viable and normal [70]. In one protein reprogramming experiment, the reprogramming factors in ESC extracts were delivered into cells temporarily permeabilized using streptolysin O [71].
Sendai Viral Vectors
Sendai virus is an enveloped, nonsegmented, single-stranded, negative-sense RNA virus of the Paramyxoviridae family [72]. The RNA viral genome is 15,384 nucleotides, which consists of six independent cistrons (NP, P/C/V, M, F, HN, and L) encoding six essential proteins—nucleoprotein (NP), phosphoprotein (P), matrix protein (M), fusion protein (F), hemagglutinin-neuraminidase (HN), and large protein (L, the catalytic subunit of the RNA-dependent RNA polymerase)—and two accessory proteins, C and V proteins. L is believed to possess all of the enzymatic activities necessary for viral transcription and replication. Genomic RNA, NP, P, and L form the ribonucleoprotein (RNP) that is a minimal functional unit in terms of genome replication and viral mRNA transcription after entering host cells. Unlike retroviruses, Sendai virus has no DNA phase, and therefore generally does not integrate into host genomes. This property has been employed to produce iPSC lines free of transgene footprints [15]. Replication and transcription are completed in the host cytoplasm and need no host nuclear factors. The only cis-element required for gene transcription is a short gene-start signal (UCCCNNUUUC) before each cistron, and a short gene-end signal (AUUCUUUUU) at the end of each cistron. Like their parent viruses, SeV vectors (SeVVs) in the form of RNP replicate and transcribe in the cytoplasm of host cells, without going through a DNA phase. This ability of RNP replication in the transduced cells ensures lasting expression of the reprogramming factors for complete reprogramming. The first-generation SeVV contains the entire viral genome with foreign genes inserted into seven possible locations, and is therefore an infectious vector [73,74]. The second generation of vectors deletes one of the three envelope-related genes to make the vector incompetent and less immunogenic [73]. 
Most advanced vectors delete all of the three envelope genes and modify the remaining genes and sequences [75,76]. The F-deficient nontransmissible vector (SeVV/ΔF) was developed by deleting the envelope fusion gene (F gene) from the RNA genome [77]. The F protein is provided in trans from a packaging cell line. Such an F-deficient vector cannot form intact infectious viral particles, yet can replicate and transcribe in the cytoplasm of host cells, so as to provide persistent transgene expression. But F-deficient vectors can generate significant amounts of virus-like particles, and such virus-like particles have residual infection activity [76]. For enhanced safety, formation of virus-like particles from the transfected host cells was significantly reduced by introducing temperature-sensitive mutations into the two remaining envelope-related genes M and HN in the context of ΔF (SeVV/MtsHNtsΔF vector) [78]. The most advanced SeVV incapable of forming any viral particles was developed by deleting all three of the structural genes (M, F, and HN) from the viral genome [75]. The efficiency of this triple-deficient vector is enhanced by incorporating missense mutations in the P and L genes, and by introduction of gene-end signal before NP gene, which enable a stable gene expression and avoid IFNβ induction (SeVV/ΔMΔFΔHN vector) [76]. SeVV/ΔMΔFΔHN virion particles are packaged in a cell line expressing M, F, and HN. Like Sendai virus, SeVV/ΔMΔFΔHN vectors transduce a wide range of cells from avian to human because SeVV recognizes ubiquitous sialic acids as its primary receptor. Three exogenous genes can be installed in place of M, F, and HN. Additional transgenes can be incorporated by the introduction of gene-start and gene-end signals. Therefore, in the SeVV/ΔMΔFΔHN vector, all of the four reprogramming factors can be installed on a single vector [76], while with the SeV/ΔF vector, four individual viruses are needed to deliver the four reprogramming factors [79]. 
Viral replication relies on an RNA-dependent RNA polymerase, and therefore siRNA against the L gene can actively clear viral replicons from transduced cells when there is no longer a need for transgene expression following the completion of reprogramming [76]. After reprogramming, viral replicons can also be cleared by a short temperature shift if ts mutations are introduced into the viral RNA polymerase gene of the vector [79]. Viral replicons become diluted over cell divisions, and some iPSCs eventually become devoid of viral replicons after sequential passaging, and so cells free of viral replicons can be passively obtained by negative selection against virus-containing iPSCs using an antibody against the spike protein HN when the HN gene is included in the vector [80].
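The gene-start and gene-end signals quoted above are short enough to be located by simple pattern matching when planning where an additional transgene cistron could be placed. The following is a minimal sketch, assuming the signals are used exactly as written in this review, that N stands for any ribonucleotide, and that the input is a plain RNA string; the function name and the toy sequence are hypothetical and are not taken from any published SeV vector.

```python
import re

# Signals as quoted in the text; N is interpreted here as any of A, C, G, U (an assumption).
GENE_START = re.compile(r"UCCC[ACGU]{2}UUUC")
GENE_END = re.compile(r"AUUCUUUUU")


def find_signals(rna):
    """Return 0-based start positions of gene-start and gene-end signals in an RNA string."""
    rna = rna.upper()
    starts = [m.start() for m in GENE_START.finditer(rna)]
    ends = [m.start() for m in GENE_END.finditer(rna)]
    return starts, ends


# Hypothetical toy cistron: gene-start signal, placeholder body, gene-end signal.
toy = "GG" + "UCCCAGUUUC" + "ACGACGACG" + "AUUCUUUUU" + "GG"
starts, ends = find_signals(toy)
print("gene-start at:", starts)
print("gene-end at:", ends)
```

A real vector design would of course work from the annotated genome sequence rather than a toy string; the sketch only shows that each cistron is bracketed by one start and one end signal.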
RNA Transfection
RNA can be directly delivered into cultured cells or animals as an alternative to DNA and protein [81,82]. In contrast to DNA delivery, the destination of mRNA is the cytoplasm rather than the nucleus, and more genetic information can reach the target sites quicker due to reduced barriers during transfection because the inefficient nuclear internalization of transgenes is eliminated, and a shorter transport in cytoplasm will be needed [83]. mRNA can be translated almost instantly while DNA needs an additional transcription step. Expression of genes delivered in the form of DNA is frequently silenced by epigenetic modification while that of RNA is not. RNA does not modify the host genome while DNA frequently does, and therefore RNA-based therapy is classified as a nongene therapy approach [84]. It is this feature of nongenome modification that makes synthetic mRNA appealing in the establishment of footprint-free iPSC lines. However, mRNA transfection has two major drawbacks. First, the half-life of RNA is short and the expression of the delivered mRNA is more transient compared to that of the delivered DNA, limiting its use in a long process like reprogramming. Second, RNAs are more immunogenic than DNA, and cause strong innate immune responses through the antiviral mechanisms of the host cells, thus introducing complications in reprogramming. Modifications of synthetic mRNA can substantially reduce immune responses from transfected cells [14,81], and are essential for efficient transgene expression and reprogramming [85]. A combination of modifications to mRNAs allows for robust expression of the transfected mRNAs. Essential or beneficial modifications include inclusion of 5′- and 3′-UTRs of β-globin, addition of a poly(A) tail, capping with the antireverse cap analog, dephosphorylation of the uncapped population of the synthesized mRNAs, and incorporation of modified ribonucleoside bases such as pseudouridine [81,82]. 
Even with these modifications, reprogramming with synthetic mRNAs is not an easy task.
Epstein-Barr Virus Episomal Vectors
The Epstein-Barr virus (EBV) episomal vector widely used in reprogramming is an EBV-based plasmid that can replicate and partition extrachromosomally in primate cells [86–88]. The ability to replicate in cells allows its use in a long process such as reprogramming [4]. EBV vectors require only two components of viral origin, the cis-element OriP and the trans-acting factor Epstein-Barr virus nuclear antigen 1 (EBNA1) [89,90]. OriP is a 1.7-kb DNA region containing two essential cis-acting sequences for plasmid replication and retention, namely, the family of repeats (FR) and the region of dyad symmetry (DS; also named origin of bidirectional DNA replication, OBR) [91]. FR functions to maintain the plasmids, and consists of 20 tandem imperfect copies of a 30-bp repeating unit, each of which contains an EBNA1 binding site. FR is also an EBNA1-dependent transcription enhancer. DS is a plasmid replicator, and contains four EBNA1 binding sites [92]. EBNA1 binds to both FR and DS for proper plasmid replication and retention. EBV plasmids rely on the host for replication and retention because EBNA1 has DNA-binding capacity but no enzymatic activity. EBV plasmids replicate once per cell cycle in synchrony with the host chromosomes, and have a low mutation rate compared with BPV- or SV40-based plasmids. Partitioning of EBV plasmids does not seem to be random, but the retention mechanism of the EBV vector is imperfect, resulting in a slow loss of plasmids at around 5% per cell cycle if the selection pressure is removed [93]. This feature was employed to generate footprint-free iPSCs [4–6]. Unlike other episomal circular DNA viruses, EBV has a large genome of 172 kb, and therefore EBV vectors can accommodate a large DNA fragment.
EBV-based vectors work mainly in primate cells, and are reported not to persist in rodent cells although one group reported that some rodent cell lines supported the long-term maintenance of EBV vectors [94].
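To give a feel for what a loss rate of around 5% per cell cycle means in practice, the arithmetic can be sketched as simple exponential decay. This is only a rough model, assuming that the 5% figure applies to the fraction of plasmid-bearing cells, that it is constant from one division to the next, and that no selection is applied; the function names are illustrative.

```python
import math


def fraction_retaining(generations, loss_per_generation=0.05):
    """Fraction of cells still carrying the episome after a number of cell cycles,
    assuming a constant, independent loss rate per cycle (a simplification)."""
    return (1.0 - loss_per_generation) ** generations


def generations_until(fraction, loss_per_generation=0.05):
    """Smallest whole number of cell cycles after which at most `fraction`
    of the cells are expected to retain the episome."""
    return math.ceil(math.log(fraction) / math.log(1.0 - loss_per_generation))


for n in (10, 20, 40):
    print(n, "cycles:", round(fraction_retaining(n), 3))
print("cycles until half the cells are plasmid-free:", generations_until(0.5))
```

Under these assumptions about half of the cells would be plasmid-free after roughly 14 divisions, which is consistent with the observation that episome-free iPSC clones can be recovered after continued passaging without selection.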
2A Peptides and IRES for Construction of Polycistron Vectors
IRES and 2A peptides are two popular means of achieving the coexpression of multiple genes (Fig. 3), among other approaches such as the use of multiple promoters and fusion proteins [95]. 2A is a short peptide (18–22 amino acids) that mediates the coexpression of two proteins from a single open reading frame (ORF) [95–97]. The upstream protein terminates at the 2A terminal glycine in a fashion independent of the traditional stop codon, and the downstream protein initiates at a proline independent of the initiation codon. The 2A-peptide-mediated coexpression of transgenes has gained popularity recently due to its small size, efficiency, stoichiometry, and availability of various functional variants. Its small size is an invaluable feature in constructing reprogramming vectors because four genes need to be assembled into a single vector. The use of different 2A variants can avoid potential intramolecular homologous recombination within polycistrons, and allows for the expression of multiple genes (up to five genes). There are four widely used 2A sequences, which are derived from the foot-and-mouth disease virus (F2A), porcine teschovirus-1 (P2A), Thosea asigna virus (T2A), and equine rhinitis A virus (E2A), with P2A being the most effective [98]. 2A is active in all eukaryotic cells [99], but not in prokaryotes [100]. 2A sequences (54–66 bp) are short compared to IRES and internal promoters, and therefore provide an advantage in vector design because of the limited packaging capacity of the widely used lentiviral vectors and RVs. The following two points should be kept in mind when 2A peptides are used for the construction of polycistronic vectors.
First, there is still an imbalance of expression between proteins upstream and downstream of 2A, with the upstream protein being translated at a higher level, although this imbalance is not as severe as with IRES; second, the resulting N-terminal protein retains a 2A peptide and the spacer GSG if included during vector construction, while the C-terminal protein retains an extra proline at its very N terminus.
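The second point, the extra residues left on each product, can be made concrete with a small sketch that predicts the separate proteins expected from one 2A-linked ORF. It assumes the commonly cited 19-residue P2A peptide sequence (not given in this review), an optional GSG spacer as mentioned above, and purely hypothetical placeholder proteins; it models only where the polypeptide is split, not expression levels.

```python
# Commonly cited P2A peptide (19 aa); not stated in the text, included here as an assumption.
P2A = "ATNFSLLKQAGDVEENPGP"
SPACER = "GSG"  # optional spacer mentioned in the text


def predicted_products(proteins, peptide=P2A, spacer=SPACER):
    """Given the proteins encoded, in order, in a single 2A-linked ORF, return the
    separate products expected after the split at each 2A site."""
    # The split falls between the last glycine and the final proline of 2A.
    upstream_tag = spacer + peptide[:-1]  # remains on the C terminus of the upstream protein
    downstream_tag = peptide[-1]          # proline left on the N terminus of the downstream protein
    products = []
    for i, protein in enumerate(proteins):
        seq = protein
        if i > 0:
            seq = downstream_tag + seq
        if i < len(proteins) - 1:
            seq = seq + upstream_tag
        products.append(seq)
    return products


# Hypothetical placeholder proteins standing in for three factors.
toy_proteins = ["MAAA", "MCCC", "MDDD"]
for product in predicted_products(toy_proteins):
    print(product)
```

Every protein except the last carries the spacer plus 2A remnant at its C terminus, and every protein except the first starts with an extra proline, which is worth checking against any known requirement for a free terminus in the factors being expressed.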

Fig. 3. Coexpression mediated by 2A peptide and IRES. Upper panel: 2A-mediated coexpression of four genes into four individual proteins from a single ORF. Upper part is the eukaryotic expression cassette; middle is the resulting mRNA, and the bottom is the resulting four individual proteins. Lower panel: IRES-mediated coexpression of two genes into two individual proteins from two separate ORFs of the same mRNA. ORF, open reading frame.
IRES is a stretch of highly structured RNA of about 450 nt that directs cap-independent initiation of protein synthesis from a downstream coding RNA [101–104]. This property is employed to design dicistronic expression cassettes under the control of a single promoter, in which the first cistron is translated by canonical cap-dependent initiation, but the second cistron is translated via IRES-mediated internal initiation of translation. In dicistronic vectors, the cistron downstream of IRES usually has a lower level of expression (6%–100% that of the upstream gene) [105]. Utilization of IRES and 2A for coexpression requires different vector design considerations. In a 2A vector, the stop codon of the first gene should be removed and the two cistrons linked by 2A should be in-frame to form a single ORF, whereas in an IRES vector the stop codon of the first cistron should be retained, and the IRES connects two distinct ORFs. There are two limitations to IRES-mediated coexpression: (1) IRES is long, which is a negative attribute when vector capacity is limited, as in RVs; and (2) the expression of the second gene is compromised. Thus, IRES is less popular in reprogramming constructs.
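The different design rules for the upstream cistron can be written down as a small check. This is a minimal sketch, assuming coding sequences are supplied as plain DNA strings starting at the initiation codon; the function names and the toy sequences are hypothetical, and the check covers only the stop-codon and reading-frame rules stated above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def ends_with_stop(cds):
    """True if the coding sequence ends with a stop codon."""
    return cds[-3:].upper() in STOP_CODONS


def check_upstream_cistron(cds, linker):
    """Return a list of design problems for the cistron placed upstream of a
    '2A' or 'IRES' linker; an empty list means no problem was found."""
    if linker not in ("2A", "IRES"):
        raise ValueError("linker must be '2A' or 'IRES'")
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3, so the reading frame is broken")
    if linker == "2A" and ends_with_stop(cds):
        problems.append("stop codon must be removed so that 2A stays in the same ORF")
    if linker == "IRES" and not ends_with_stop(cds):
        problems.append("stop codon must be retained so that the first ORF terminates")
    return problems


# Hypothetical toy coding sequences, with and without a stop codon.
print(check_upstream_cistron("ATGGCC", "2A"))
print(check_upstream_cistron("ATGGCCTAA", "2A"))
print(check_upstream_cistron("ATGGCCTAA", "IRES"))
```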
PiggyBac Transposon and Vectors
PB is a nonviral 2,472-bp DNA transposon isolated from the cabbage looper moth. PB transposes in a wide range of host genomes from yeast to human [106–108]. The PB transposon consists of 3′ and 5′ terminal repeat domains (TRDs), which delimit the transposon cassette for integration, and an ORF encoding the PB transposase. Both the 3′ and 5′ TRDs contain a 13-bp terminal inverted repeat and a 19-bp internal inverted repeat. The 3′ TRD has a 31-bp spacer while the 5′ TRD spacer is only 3 bp long. In the PB vector system, the PB transposase can be provided in trans, and a transgene expression cassette is sandwiched between the two TRDs in place of the native PB transposase [109–111]. A typical PB gene delivery system therefore includes a helper plasmid to express the PB transposase, and a transposon donor plasmid in which the two cis-acting TRDs enclose a transgene expression cassette. PB transposition follows a cut-and-paste mechanism, and the transgene cassette is integrated into genomes at TTAA sites. The PB transposition process initiates with nicking at the transposon 3′ end, and includes hairpin formation between the transposon 5′-end-TTAA overhang and its 3′ OH, hairpin resolution, target joining, and target repair [110]. The PB transposase is solely responsible for catalyzing all of these processes. Unlike transpositions with other DDE family recombinases, PB transposition does not require DNA synthesis, and results in the precise excision of the transposon from the host DNA due to the tetranucleotide cohesive overhang (TTAA) left behind on both the flanking host DNA and the transposon ends, which can undergo a footprint-free repair through a simple process of TTAA base pairing and ligation [110]. Although TRDs are sufficient for interplasmid transposition, efficient genome transposition requires internal sequences adjacent to both the 5′ and 3′ internal inverted repeats [112].
The integrated expression cassette together with the transposon terminal elements can be precisely excised by transient expression of the reintroduced PB transposase. This excision process leaves no footprint behind. PB vectors constitute a nonviral, genome-integrating gene delivery system representing a cheaper and simpler approach than viral systems. They also have a much larger cargo capacity than retroviral systems. PB transposons have the highest transposition efficiency among DNA transposons, but still give a significantly lower rate of stable transposition (integration rate) compared to viral vectors [111]. These features allow the use of PB vectors to generate footprint-free iPSCs [8,113].
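The TTAA target preference and the footprint-free excision described above can be illustrated with a toy string model. This is a deliberately simplified sketch: it assumes uppercase DNA strings, represents integration as insertion of a cassette flanked by a duplicated TTAA, and represents excision as restoration of a single TTAA; the function names and sequences are hypothetical and no transposase biochemistry is modeled.

```python
def find_ttaa_sites(dna):
    """Return 0-based positions of every TTAA in a DNA string.
    TTAA is its own reverse complement, so scanning one strand is sufficient."""
    dna = dna.upper()
    sites = []
    pos = dna.find("TTAA")
    while pos != -1:
        sites.append(pos)
        pos = dna.find("TTAA", pos + 1)
    return sites


def integrate(genome, site, cassette):
    """Insert a cassette at a TTAA site, leaving a TTAA on each side of it."""
    if genome[site:site + 4] != "TTAA":
        raise ValueError("integration site must be a TTAA")
    return genome[:site + 4] + cassette + genome[site:]


def excise(genome, cassette):
    """Remove the cassette and restore a single TTAA, leaving no footprint."""
    flanked = "TTAA" + cassette + "TTAA"
    idx = genome.find(flanked)
    if idx == -1:
        raise ValueError("cassette flanked by TTAA not found")
    return genome[:idx] + "TTAA" + genome[idx + len(flanked):]


toy_genome = "GGTTAACCATTAAG"   # hypothetical host sequence
toy_cassette = "GCGCGC"         # hypothetical stand-in for TRD-flanked cargo
sites = find_ttaa_sites(toy_genome)
modified = integrate(toy_genome, sites[0], toy_cassette)
restored = excise(modified, toy_cassette)
print(sites, modified, restored == toy_genome)
```

The round trip returning the original string is the string-level counterpart of the seamless excision that makes PB attractive for generating footprint-free iPSCs.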
Adenoviral Vectors
Adenoviral vectors are the second most widely used vectors in clinical trials after RVs owing to their high capacity for transgene insertion, high virus yield, efficient transduction into a wide range of cells, and better safety profile than integrating vectors [114]. Adenovirus is a nonenveloped double-stranded DNA virus with an icosahedral capsid. The genome is a nonsegmented linear DNA with a viral terminal protein covalently attached to each 5′ terminus at the two inverted terminal repeats (ITRs). The genome encodes two groups of overlapping genes on both strands, the early (E1 to E4) and the late genes (L1 to L5), which are expressed before and after replication of the viral genome, respectively [114,115]. Each gene gives rise to multiple protein products through alternative splicing. Adenoviral vectors are mostly based on the two well-characterized human adenoviruses: Ad2 and Ad5 [115–117]. The development of adenoviral vectors has gone through roughly three generations [118]. The first generation of vectors has E1 deleted to make the virus replication incompetent, because E1 is responsible for the activation of both the early and the late viral genes, and thus for viral replication [117] (Fig. 4). The E1 function in the first-generation vector is provided in trans from a complementary packaging cell line, which is usually HEK293 transformed with the adenovirus E1 region. Some first-generation vectors are also devoid of the E3 gene to make more room for transgene insertion. The first-generation vector can accommodate up to 8.2 kb of exogenous sequence. Two issues with the first-generation vectors are the presence of some replication-competent adenoviruses (RCAs), and immune responses from the transduced cells and/or tissues/animals. The second-generation vectors have part or all of their E2 and E4 genes deleted in addition to the E1 and E3 deletions (Fig. 4).
These additional deletions have three benefits: fewer RCAs, a reduced immune response, and additional room (up to 14 kb) for transgene insertion. The third generation of adenoviral vectors (also known as gutless vectors, gutted vectors, or helper-dependent Ad vectors) is devoid of all viral genes, leaving just the two ITRs and the packaging signal, so as to reduce vector toxicity and immunogenicity, and to make more room for transgene insertion [118] (Fig. 4). With most of the 36-kb sequence of the genome deleted, the gutless vector can accommodate up to 37 kb of foreign sequences without compromising viral growth and titer. A helper virus is needed for the gutless vector because a complementing cell line cannot be established, owing to the toxicity of some viral proteins and the high level of viral proteins required for packaging. Like lentiviral vectors, adenoviral vectors transduce both dividing and nondividing cells. Relevant to reprogramming, adenoviral vectors are largely episomal because adenovirus exists as an extrachromosomal entity. A shortcoming associated with this feature is the short-term expression of transgenes compared with the integrating LVs and RVs. This transient nature of transgene expression makes the efficiency of adenoviral reprogramming extremely low [7]. Adenoviral vectors also have a high level of background integration [119], and this constitutes a concern when these vectors are employed to generate footprint-free iPSCs for clinical application. Adenoviral constructs can be difficult to clone due to their large size [117].
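The insert capacities quoted above for the three vector generations can be collected into a small lookup. The following is a minimal sketch only: the dictionary keys and the function name `smallest_fitting_generation` are hypothetical names chosen for illustration, and the capacity values are simply the approximate figures given in the text (8.2, 14, and 37 kb), not measured limits for any particular vector backbone.

```python
from typing import Optional

# Approximate maximum insert sizes (kb), taken from the figures quoted in the text.
# The labels are illustrative, not standard nomenclature.
AD_VECTOR_CAPACITY_KB = {
    "first-generation (E1 deleted, E3 optionally deleted)": 8.2,
    "second-generation (E1/E3 plus E2/E4 deleted)": 14.0,
    "gutless (helper-dependent)": 37.0,
}


def smallest_fitting_generation(insert_kb: float) -> Optional[str]:
    """Return the least-deleted generation whose quoted capacity fits the insert.

    Returns None when the insert exceeds every quoted capacity.
    """
    if insert_kb <= 0:
        raise ValueError("insert size must be positive")
    # Walk the generations from smallest to largest quoted capacity.
    for name, capacity in sorted(AD_VECTOR_CAPACITY_KB.items(), key=lambda kv: kv[1]):
        if insert_kb <= capacity:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical cassette sizes, for illustration only.
    for size in (5.0, 10.0, 20.0, 40.0):
        print(f"{size:5.1f} kb -> {smallest_fitting_generation(size)}")
```

Capacity is only one criterion; the helper-virus requirement of gutless vectors and the cloning difficulty noted above would also weigh on a real choice.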

Fig. 4. Genome structure of the adenoviral vectors of the three generations along with that of the wild-type Ad genome. ψ, packaging signal (tandem filled triangles); Δ indicates deletion of genes; ITR, inverted terminal repeat; MLP, major late promoter (large arrows); thick lines represent viral sequences; open boxes denote deletion of viral sequences; thin arrows indicate viral genes and their direction of transcription; open pentagons are ITRs. Numbers are insertion capacity of transgenes.
Alphaviral Vectors and RNA Replicon
Alphavirus is an enveloped, single-stranded, positive-sense, capped, and polyadenylated RNA virus with a genome of about 11,700 nucleotides [120]. Well-known alphaviruses for vector design include the Sindbis virus (SIN), Semliki forest virus (SFV), and Venezuelan equine encephalitis virus (VEE). The alphaviral genome includes two ORFs (a nonstructural ORF and a structural ORF), both of which encode polyproteins (Fig. 5). The 5′ nonstructural ORF encodes a polyprotein that generally gives rise to four nonstructural proteins (or replicase proteins, nsP1 to nsP4) and some cleavage intermediates. These proteins function mainly as an RNA-dependent RNA polymerase, and therefore this ORF is considered an RNA replicase/transcriptase gene (Rep). The structural ORF (26S ORF) encodes three main structural proteins [C (capsid), E1, and E2] and two minor structural proteins (E3 and 6K). E2 and E3 are products of a precursor protein, PE2. Genomic RNA serves as the direct template for translation of the nonstructural proteins, and as the template for synthesis of the negative-strand RNA, (−)RNA. (−)RNA is an exact complement of the genomic RNA except for the presence of an unpaired G at its 3′ end. (−)RNA serves as the template for synthesis of new genomic RNAs, and also as the template for transcription of the 26S mRNA under the direction of the subgenomic 26S promoter. The 26S mRNA is capped and polyadenylated, and serves as the template for translation of the structural proteins. Structural proteins are not required for synthesis of the (−)RNA, replication of the (+)RNA, or the cytoplasmic transcription of the 26S mRNA. Therefore, an RNA genome lacking the structural ORF can replicate in cells (RNA replicon), but is not infectious. Alphaviral vectors can be either infectious or noninfectious.
An infectious vector is made by inserting transgenes into the complete genome before or after the structural ORF, together with an additional subgenomic promoter [121]. A noninfectious vector is made by replacing the structural ORF with transgene-coding sequences [122]. Noninfectious vectors can be packaged into virions when the structural proteins are provided in trans. Noninfectious RNA replicons can be delivered directly into cells as RNAs synthesized in vitro using SP6 or T7 promoters (Fig. 5). Such synthetic, noninfectious RNA replicons encoding the reprogramming factors were recently used to generate footprint-free iPSCs [16]. DNA-based alphaviral vectors have also been reported [123–125]. In this case, RNA replicons are transcribed from conventional plasmids in the nuclei of the transfected cells using a eukaryotic promoter, such as the CMV or RSV promoter (Fig. 5). The RNA replicons are then exported from the nucleus into the cytoplasm, where they can undergo translation of replicase, (−)RNA synthesis, and transgene transcription and translation. Alphaviral vectors pose no risk of integration into the host genome, a desired feature for the generation of footprint-free iPSC lines. However, alphaviral vectors are highly cytopathic, and host cells die within a few days postinfection. Point mutations in the 5′ UTR and in the nsP2 protein can alleviate the cytopathic effect of alphaviral vectors by slowing down the replication of RNA replicons [126,127].
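The two vector designs described above (infectious versus noninfectious) can be illustrated with a toy model of the genome as an ordered list of elements. This is a sketch under simplifying assumptions, not a sequence-level tool: the class `AlphaviralRNA`, the element labels, and both helper functions are hypothetical names introduced here, and the infectious design shown places the transgene after the structural ORF, which is only one of the two placements mentioned in the text.

```python
from dataclasses import dataclass
from typing import Tuple

# Illustrative element labels (not standard identifiers).
NONSTRUCTURAL = "nonstructural_ORF"      # encodes the replicase (nsP1 to nsP4)
STRUCTURAL = "structural_ORF"            # encodes C, E1, E2, E3, and 6K
SUBGENOMIC_PROMOTER = "26S_promoter"


@dataclass(frozen=True)
class AlphaviralRNA:
    """Toy model: genome elements listed in 5' to 3' order."""
    elements: Tuple[str, ...]

    @property
    def can_self_replicate(self) -> bool:
        # Replication needs only the replicase ORF in this simplified model.
        return NONSTRUCTURAL in self.elements

    @property
    def is_infectious(self) -> bool:
        # Producing virions additionally requires the structural ORF in cis.
        return self.can_self_replicate and STRUCTURAL in self.elements


WILD_TYPE = AlphaviralRNA(
    ("cap", NONSTRUCTURAL, SUBGENOMIC_PROMOTER, STRUCTURAL, "polyA")
)


def noninfectious_replicon(genome: AlphaviralRNA, transgene: str) -> AlphaviralRNA:
    """Replace the structural ORF with the transgene (noninfectious design)."""
    return AlphaviralRNA(
        tuple(transgene if e == STRUCTURAL else e for e in genome.elements)
    )


def infectious_vector(genome: AlphaviralRNA, transgene: str) -> AlphaviralRNA:
    """Add an extra subgenomic promoter plus transgene after the structural ORF."""
    i = genome.elements.index(STRUCTURAL)
    return AlphaviralRNA(
        genome.elements[: i + 1]
        + (SUBGENOMIC_PROMOTER, transgene)
        + genome.elements[i + 1 :]
    )


if __name__ == "__main__":
    replicon = noninfectious_replicon(WILD_TYPE, "transgene")
    print(replicon.elements, replicon.can_self_replicate, replicon.is_infectious)
```

In this model the replicon still carries the replicase ORF and so replicates, but without the structural ORF it cannot form virions unless the structural proteins are supplied in trans, which matches the distinction drawn in the text.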

Fig. 5. Alphaviral vectors. Upper: structure of an alphaviral genome. Middle: alphaviral vector design for in vitro preparation of RNA replicon (noninfectious) using SP6 promoter. Lower: alphaviral plasmid design for in vivo generation of RNA replicons via a eukaryotic promoter. Open box, nonstructural ORF of the alphaviral genome; gray box, structural ORF of an alphaviral genome; arrow in a box, the alphaviral subgenomic promoter; arrows, SP6 or CMV promoters; filled rounded square, RNA cap.
Concluding Remarks
Factor reprogramming requires sustained expression of the four reprogramming factors for ∼10–30 days. After reprogramming, transgene expression and retention are detrimental or pose potential risks [128]. Lentiviral vectors and simple RVs provide sustained expression of transgenes, and the transgenes are silenced in the reprogrammed pluripotent cells. However, silencing can occur at undesirable points in time, either too early or too late. In addition, RVs integrate into the reprogrammed genomes, posing risks of insertional mutagenesis as well as residual expression and reactivation of the transgenes. Many nonintegrating approaches (protein transduction, RNA transfection, transient transfection, minicircle, and episomal plasmids) generally cannot support sustained expression of transgenes and suffer from low reprogramming efficiencies. Alphaviral vectors are theoretically ideal reprogramming vehicles, but their strong cytopathic effects compromise their value. Sendai viral vectors might meet the complex requirements for factor reprogramming owing to their nonintegrating nature, self-replicating property, controllable removal of viral replicons, relatively low toxicity, and ability to accommodate multiple exogenous genes. Despite their relatively low reprogramming efficiency, EBV-based episomal plasmids are currently a feasible vehicle choice due to their simplicity, self-replicating nature, nonintegrating property, low immunogenicity, and low toxicity. Some stones may remain unturned, and a more efficient, safer, nonintegrating reprogramming approach could be developed in the future using novel transgene vectors or factor delivery systems.
Glossary
Transduction: a gene transfer process in which replication-incompetent viral particles “infect” cells or tissues in vivo or in vitro. This term distinguishes itself from a normal viral infection by any replication-competent virus in which viruses invade cells and/or tissues, and subsequently replicate inside the host cells and spread to neighboring cells or tissues. Transduction also refers to the process of protein delivery into cells.
Packaging signal: a region in a viral chromosome that is essential for the efficient packaging of viral genomes.
Acknowledgments
Author Disclosure Statement
The author declares no competing financial interests.
