Overview of Recent Progress in Protein-Expression Technologies for Small-Molecule Screening

Abstract

Production of novel soluble and membrane-localized protein targets for functional and affinity-based screening has often been limited by the inability of traditional protein-expression systems to generate recombinant proteins that have properties similar to those of their endogenous counterparts. Such targets have often been labeled as challenging. Although biological validation of these challenging targets for specific disease areas may be strong, discovery of small-molecule modulators can be greatly delayed or completely halted due to target-expression issues. In this article, the limitations of traditional protein-expression systems will be discussed along with new systems designed to overcome these challenges. Recent work in this field has focused on two major areas for both soluble and membrane targets: construct-design strategies to improve expression levels and new hosts that can carry out the posttranslational modifications necessary for proper target folding and function. Another area of active research has been on the reconstitution of solubilized membrane targets for both structural analysis and screening. Finally, the potential impact of these new systems on the output of small-molecule screening campaigns will be discussed.

Keywords

protein expression, screening, small molecule, drug targets, review

Introduction

The discovery of small-molecule therapeutics during the past few decades has relied heavily on the production of recombinant proteins in sufficient quantity and quality for use in functional and affinity-based screening campaigns.^1,2 Once small-molecule screening hits are selected by chemistry teams for further optimization, the focus shifts to activity in cellular assays, animal models, and eventually human testing, often with little regard to the properties of the target used to initially identify the molecules. Lack of efficacy in these downstream assays can occur for many reasons, including limited understanding of disease biology,³ but any potentially significant differences between the endogenous target and the protein used for screening must also be considered, especially because the properties of the same protein expressed in different systems may not be similar.^4,5 With the rapid proliferation of targets resulting from an explosion of genomic data combined with recent advances in affinity screening technologies that access significantly larger regions of chemical space than traditional screening decks,^6,7 it is critical at the outset of any new screening campaign to consider expression technologies designed to produce proteins that structurally and functionally resemble endogenous targets. Such systems should also be more suitable for production of what have unfortunately been labeled as “challenging” targets, such as those that require complex posttranslational modifications or the presence of cell-type-specific protein partners to fold into their native conformation and multispanning membrane targets.⁸ Production of these targets has been defined as challenging based only on limitations of the expression system chosen. 
Current estimates of the total number of human gene products that have been successfully targeted with small-molecule drugs range from 1% to 2%.^9–11 Putting aside the debate of how much of the human genome is actually “druggable,” because this may dramatically change as new chemical space is explored,^12,13 one can conclude that a significant number of disease-relevant proteins have yet to be successfully targeted using a small-molecule approach.¹⁴ To date, crystal structures have been obtained for only approximately 15–20% of human proteins, and only slightly more than 2000 targets, largely protein kinases and other soluble cytoplasmic targets, have been co-crystallized with druglike molecules.¹⁵ There is thus a significant need for new expression and purification systems to drive both structural biology and production of novel screening targets. In this article, recent progress in expression-construct design, expression hosts, and reconstitution of membrane proteins will be discussed with an emphasis on how these new systems help maintain a close match between recombinant and endogenous targets and thus open up new challenging target classes for small-molecule screening efforts.

Expression-Construct Design

Many small-molecule drug-discovery projects begin by designing expression constructs of the screening target. Construct design is usually guided by the goals of the small-molecule screen. If the goals include identifying molecules that bind to novel sites on the protein, full-length constructs will be preferred, and the screen should be set up to detect all such binders. For example, the phosphoinositide-dependent protein kinase 1 (PDK1) contains three known small-molecule binding sites: the adenosine triphosphate (ATP) pocket, the phospho-transfer substrate pocket, and the PDK1 interacting fragment (PIF) pocket, a site for binding to hydrophobic sequences in the C-terminus of kinase substrates.¹⁶ A screen using the full-length kinase expressed in the baculovirus system and a substrate designed to be sensitive to binders at all three small-molecule binding sites identified two new series of selective, non-ATP-competitive compounds that may bind to the PIF pocket.¹⁷ These novel binders would not have been discovered if only the PDK1 kinase domain had been used or if the assay had been designed with a substrate that did not interact with all three sites.

If the goal of the screen is to target a particular binding site, then expression efforts should focus on producing the subdomain of the protein containing that site, as long as it is stable and functional in the absence of the remaining sequence. Such approaches have been used with a wide variety of targets, especially in cases when expression of the full-length protein has been challenging, and have recently been successfully used to identify small-molecule inhibitors of protein–protein interactions.²⁵ Many proteins involved in therapeutically relevant protein–protein interactions contain multiple domains, interact with multiple proteins in the cell, and can also form large multimers, all of which make expression and purification challenging. An example of such a target is Keap1, a substrate adaptor of the Cullin3 E3 ligase. Keap1 binds to Nrf2, a key regulator of cytoprotective genes, and targets Nrf2 for degradation by the proteasome.¹⁸ Small-molecule inhibition of the Keap1–Nrf2 interaction may be beneficial in both oncology¹⁹ and inflammatory disease.²⁰ Overexpression of full-length Keap1 in mammalian cells, however, results in the production of dimers and higher-order multimers of the protein,²¹ which bodes poorly for expression of this construct in nonmammalian systems. A domain-based expression strategy, using Escherichia coli expression of the Keap1 Kelch domain only, produced high-quality monomeric protein.²² The Kelch domain alone binds to Nrf2, and a recent screen conducted using a fluorescence polarization assay with the Kelch domain and a fluorescent Nrf2 peptide has led to the discovery of novel small-molecule inhibitors of this interaction that prevent Nrf2 degradation.²³

One of the most successful paradigms for expressing, purifying, and crystallizing proteins has been established by the Structural Genomics Consortium (SGC). To rapidly identify constructs of human genes that could be successfully expressed in E. coli and crystallized, they prepared full-length constructs and a series of truncated fragments of each gene.²⁴ They relied on the use of multiple bioinformatic approaches, such as PFAM-based²⁵ domain prediction and PSIPRED²⁶ secondary structure prediction, to define structured domains, but the primary approach was to use alignments with protein domains that had previously been crystallized. On average, a total of 9–15 constructs were designed for each domain by varying the boundaries of the bioinformatically defined domains by 2–10 amino acids. Successful expression was defined as the production of soluble protein in E. coli at expression levels greater than 2 mg protein/L of culture and up to as high as 50 mg protein/L. Interestingly, they found that only 11% (31 out of 282) of the constructs they tested could be expressed and purified as full-length proteins, some (6%) required the removal of nine or fewer amino acids, whereas the vast majority (80%) could only be successfully expressed with significantly larger truncations.²⁴ The obvious conclusion is that all expression projects should include truncated versions of the target as well as the full-length version to improve the likelihood of successful expression in E. coli. Other expression systems, however, should also be considered (discussed below), because another report on large-scale expression testing in E. coli has shown that the success rate of purifying soluble targets is only approximately 30% for both full-length native sequences and truncated synthetic sequences.²⁷

Furthermore, the SGC also found that the number of disordered residues at the ends of successfully expressed and crystallized constructs was on average very low, and for about 50% of the constructs all of the amino acids were in ordered structures.²⁴ The predicted number of disordered residues based on both PSIPRED- and PFAM-defined domain boundaries was much higher for both N- and C-terminal ends. The observation strongly suggests that basing new constructs on alignments with previously crystallized homologous domains should be much more successful than using secondary structure prediction. Obviously, this can be helpful only if related domains have already been successfully expressed and crystallized, and the disordered regions outside of those domains have been defined. If no prior related crystal structures exist, then beginning with the PFAM and PSIPRED domain definitions is a good start, but other constructs with varying N- and C-terminal extensions should be designed in parallel.
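The boundary-variation strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not the SGC pipeline: the function name `boundary_variants`, the default offsets, and the naming scheme are assumptions made here, and the domain boundaries are expected to come from an alignment or from a PFAM/PSIPRED prediction performed elsewhere.

```python
from itertools import product


def boundary_variants(sequence, start, end,
                      n_offsets=(-10, 0, 10), c_offsets=(-10, 0, 10)):
    """Enumerate truncation constructs around a predicted domain.

    start and end are 1-based, inclusive residue positions of the domain.
    A negative N-terminal offset extends the construct toward the N-terminus;
    a positive C-terminal offset extends it toward the C-terminus.
    The default offsets are illustrative assumptions (the text cites shifts
    of 2-10 residues and 9-15 constructs per domain).
    """
    constructs = {}
    for d_start, d_end in product(n_offsets, c_offsets):
        # Clamp the shifted boundaries to the ends of the full-length protein.
        s = max(1, start + d_start)
        e = min(len(sequence), end + d_end)
        if s > e:
            continue  # skip empty constructs
        constructs[f"{s}-{e}"] = sequence[s - 1:e]
    return constructs


if __name__ == "__main__":
    # Hypothetical 100-residue protein with a predicted domain at 20-80.
    protein = "".join("ACDEFGHIKL"[i % 10] for i in range(100))
    for name, seq in boundary_variants(protein, 20, 80).items():
        print(name, len(seq))
```

With three offsets at each end the sketch yields nine constructs per domain, the lower end of the range cited above. The full-length sequence would be added to the list separately.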

Gene Synthesis and Codon Optimization

Once constructs have been selected for expression, the appropriate clones must be generated. During the past few years, many labs have taken advantage of custom gene synthesis to generate expression constructs rather than lengthy subcloning procedures from cDNA libraries or from commercially available vector collections.²⁸ Producing constructs using gene synthesis provides a number of advantages over traditional cloning. Along with being less labor intensive, gene synthesis eliminates concerns over accumulating mutations that can occur during PCR-based amplification steps involved in the subcloning process. Gene synthesis also allows complete control over the selection of restriction sites used for subcloning the construct into the required expression vectors and the incorporation of affinity tags directly into the construct sequence at any location.

Another option that custom gene synthesis provides is the choice to optimize sequences based on the codon preferences of the expression host. Codon usage tables have been generated for many host organisms, including E. coli, based on various criteria, one of which is the presence or absence of certain codons in highly expressed proteins.²⁹ These tables are then used to alter the codon sequences of heterologous genes that are expressed in the host organism with the goal of maximizing expression levels. Simply replacing rare codons with more frequently represented codons does not, however, always increase expression levels. Analysis of more than 250 codon-optimized synthetic genes showed that for greater than 20% of the constructs, optimization had no effect on expression yields.³⁰ Other factors such as clustering of rare codons and the position of rare codons at the beginning of an open reading frame may be part of the reason why such codons have more deleterious effects than others on protein expression.³¹ Preferences for an order of certain codons within constructs, observed as a bias toward one particular tRNA after it has been used for the first time in a sequence, may also be important for optimal expression.³²
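The simplest form of table-based codon optimization discussed above can be sketched as follows. This is an illustrative sketch only: instead of a published usage table, codon counts are derived from whatever reference open reading frames the user supplies (for example, highly expressed host genes), and every function name here is hypothetical. It implements the naive "most frequent synonymous codon" substitution, which, as noted above, does not always increase expression.

```python
from collections import Counter

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(orf):
    """Split a DNA open reading frame into codons."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def codon_usage(reference_orfs):
    """Count codon occurrences across a set of reference ORFs."""
    counts = Counter()
    for orf in reference_orfs:
        counts.update(split_codons(orf))
    return counts


def preferred_codons(usage):
    """Pick the most frequently used codon for each amino acid (and stop)."""
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or usage[codon] > usage[best[aa]]:
            best[aa] = codon
    return best


def translate(orf):
    """Translate an ORF with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in split_codons(orf))


def recode(orf, usage):
    """Replace every codon with the preferred synonymous codon."""
    best = preferred_codons(usage)
    return "".join(best[CODON_TABLE[c]] for c in split_codons(orf))


if __name__ == "__main__":
    reference = ["ATGCTGCTGAAAGAATAA"]  # toy reference ORF, not real data
    original = "ATGTTAAAGGAGTGA"
    optimized = recode(original, codon_usage(reference))
    assert translate(optimized) == translate(original)
    print(original, "->", optimized)
```

Checking that the translation is unchanged is the minimum safeguard. The sketch deliberately ignores codon clustering, codon order, and position within the open reading frame, all of which the studies cited above suggest can matter.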

In an effort to more fully understand the effects of rare codons in E. coli expression, a recent study assembled more than 14,000 synthetic sequences and tested the effects of rare codons in the N-terminus of expression constructs.³³ Surprisingly, the inclusion of rare codons in the N-terminus of a GFP (green fluorescent protein) construct substantially increased the expression of most constructs, in some cases as much as 100-fold. These results correlate well with the preference for rare codons in the N-terminus of highly expressed endogenous E. coli genes. The authors conclude that one significant reason for the rare codon preference in endogenous genes is that rare codons, which on the whole have lower GC content than more frequently used codons, produced more relaxed 5′ RNA structures that are translated faster by the ribosome.³³ Other studies have also clearly demonstrated the undesirable effects of 5′ mRNA secondary structures on protein production,^31,32 but this is the first to do so by using such a massive array of synthetic sequences. A similar relationship between mRNA structure and codon bias in highly expressed yeast genes has led to the conclusion that the tRNA pool has adapted to the genomic codon profile rather than the other way around.³⁴ Further work needs to be done to determine the effects of such codons in other parts of constructs as well as to more fully understand the reasons for the N-terminus rare codon effect.
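Because the study above links N-terminal rare codons to lower GC content and more relaxed 5′ mRNA structure, a quick first-pass check on a candidate construct is the GC fraction of its first few codons. The sketch below is a crude proxy only: GC content is not a structure prediction, the 10-codon window is an assumption chosen here for illustration, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def n_terminal_gc(orf, n_codons=10):
    """GC fraction of the first n_codons codons of an ORF.

    The default window of 10 codons is an illustrative assumption.
    """
    return gc_fraction(orf[:3 * n_codons])


def rank_by_n_terminal_gc(variants, n_codons=10):
    """Sort named synonymous variants from lowest to highest 5' GC fraction."""
    return sorted(variants, key=lambda name: n_terminal_gc(variants[name], n_codons))


if __name__ == "__main__":
    # Two hypothetical synonymous openings encoding Met-Lys-Gly-Pro.
    variants = {
        "gc_rich": "ATGAAGGGCCCG",
        "at_rich": "ATGAAAGGTCCA",
    }
    for name in rank_by_n_terminal_gc(variants, n_codons=4):
        print(name, round(n_terminal_gc(variants[name], 4), 2))
```

A ranking like this could help shortlist synonymous N-terminal variants for expression testing, but a dedicated RNA folding tool would be needed to estimate secondary structure itself.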

Finally, one must also consider recent reports on the effects of synonymous mutations before deciding to alter the native codons of a target. Because the presence and positioning of rare codons can slow down the rate of translation,³⁵ and because translation rate can have a significant effect on protein folding in E. coli,³⁶ replacing rare codons, especially at transition points in secondary structure,³⁷ may result in protein misfolding. Translation in E. coli is also slowed by reducing induction temperature to lower than the normal growth temperature of 37°C, and thus temperature reduction may give proteins more time to fold into their native structures. Synonymous mutations have also been shown to have effects on the specific activity of an enzyme, ADAMTS13,³⁸ and on the behavior of circadian clock proteins,^39,40 and can even alter substrate specificity and small-molecule interactions with a membrane protein.⁴¹ The potential for synonymous mutations to alter small-molecule binding is extremely relevant to both affinity- and activity-based screening, and it will be of interest to see how many other target classes are sensitive to codon alterations. Thus, although codon optimization is beneficial for specific targets, these benefits must be weighed against the small, but still possible, chance that synonymous mutations may cause the behavior of the construct to differ from wild type, and the more likely result that the “optimized” construct may not improve expression levels over the wild-type construct.

Fusion Tag Selection

Once the genetic material for the target protein is selected, the next major choice in expression construct design is the selection of a fusion tag. Fusion tags come in many different varieties and have been designed for a wide array of purposes.^42,43 Two major purposes are to improve the solubility of recombinant proteins in E. coli and to provide a convenient handle for affinity purification.

The first class of tags includes TrxA,⁴⁴ based on the very soluble E. coli protein thioredoxin A; SUMO,⁴⁵ a small-ubiquitin-related modifier from yeast; and NusA,⁴⁶ a very hydrophilic E. coli protein that also promotes pauses in the coupled transcription and translation machinery in E. coli to allow more time for protein folding.⁴⁷ Because these tags are based on very soluble proteins, they are able to confer solubility onto less soluble target proteins. Recently, larger screens of abundant endogenous E. coli proteins have turned up more potential fusion tags for insoluble targets. In one such screen, 88 highly expressed E. coli proteins were fused to three challenging model targets: human β-defensin 2, human epidermal growth factor, and human erythropoietin.⁴⁸ Twelve of the 88 were found to successfully increase soluble expression and were then retested using a panel of cytokines. Certain tags were able to increase soluble expression up to 29-fold.

Other fusion tags, such as TrpE,⁴⁹ ketosteroid isomerase,⁵⁰ and PagP,⁵¹ are extremely insoluble and decrease the solubility of the fused recombinant protein, resulting in the deposition of the fusion product into E. coli inclusion bodies. These approaches have been used to produce peptides that are toxic to E. coli.⁵² Targeting these peptides to inclusion bodies prevents the peptides from killing their expression host and simplifies purification because they can be readily extracted from the insoluble portion of the E. coli lysate. This approach may also be used for other very toxic proteins or for proteins that are subject to degradation in E. coli. Typically, these tags are removed prior to working with the target protein by using very harsh conditions such as cyanogen bromide in acid, but recent work using metal-ion-catalyzed peptide bond cleavage⁵³ may provide a gentler route to the desired product. Interestingly, this approach has also been used to produce a very challenging target, the N-terminal sequence of human cardiac troponin I, an intrinsically disordered protein, as a PagP fusion construct.⁵¹ Ketosteroid isomerase fusions have also been used to produce fragments of G-protein-coupled receptor (GPCR) transmembrane domains⁵⁴ that could be very interesting targets for affinity-based screening approaches.

The second major class of fusion tags is referred to as affinity tags because they are added to proteins so that they can be purified by using the affinity of the tag for a specific matrix (Table 1). Affinity tags are also often used in protein–protein interaction screens as handles for the attachment of reagents to detect the interaction between two differentially tagged constructs using homogeneous time-resolved fluorescence-based assays.⁵⁵ Affinity tags are also critical for affinity-based screening methodologies that rely on capturing protein–small-molecule complexes to identify hits.⁷

Table 1.

Protein Affinity Tags. Abbreviations: EC, Escherichia coli; I, insect cells; M, mammalian cells; NA, not available.

Tag	Host(s)	Size (kDa)	Affinity Matrix	Capacity (mg/ml)	Elution Condition	Comments
1D4	M	0.9	Rho-1D4 antibody	NA	1D4 peptide	Used to purify a membrane transporter
ABDz1 tag	EC	6.3	Albumin and protein A	NA	Low pH	Bispecific affinity tag
Avitag^TM	EC, I	1.8	Streptavidin	10	None	Site-specific biotinylation
Calmodulin-binding peptide (CBP)	EC	5.3	Calmodulin	2	EGTA	High affinity; no endogenous binders in EC
Cellulose-binding domain	EC, I, M	20	Cellulose	3	Denaturing or ethylene glycol	High affinity
Chitin-binding domain (CBD)	EC	27	Chitin	2	Intein self-cleavage	Combined with self-cleaving inteins
c-Myc epitope	EC, I, M	1.2	mAb9E10	≥0.5	c-Myc peptide	Used more for detection than purification
Covalent yet dissociable (CYD)	EC	0.56	InaD PDZ domain	>0.2	DTT	Disulfide bond formed to affinity matrix
Fc tag	EC, I, M	25.6	Protein A	10–20	Low pH	Used for extracellular and secreted proteins
FC-III tag	EC	1.53	IgG–Fc	NA	Low pH	Peptide version of protein A tag
FLAG	EC, I, M	1.01	Anti-FLAG antibody	≥0.6	FLAG peptide	No endogenous binders, but low capacity
Glutathione S-transferase (GST)	EC, I, M	26	Glutathione	5–10	10 mM reduced glutathione	Can improve solubility; dimer
Halo tag	EC, M	34	HaloLink^TM Resin	7	None (covalent)	Can improve solubility and attach functional groups
Heavy chain of protein C (HPC)	EC, M	1.38	Monoclonal antibody	0.5	EDTA, HPC peptide	Low capacity; high cost
Hemagglutinin antigen (HA)	EC, I, M	1.1	Monoclonal antibody	≥0.6	0.1 M glycine (pH 2.0–2.8)	Detection and immunoprecipitation
Maltose-binding protein (MBP)	EC	42	Amylose	7–10	10 mM maltose	Can improve solubility
polyHis	EC, I, M	0.8–1.4	Metal (Ni, Co)–based resins	15–60	Imidazole	Most frequently used tag; high-capacity resin
Protein A	EC	42	IgG	2	Low pH	Large tag; rarely used
Softag1	M	1.2	Monoclonal antibody	0.3	Salt, polyol	Mammalian expression
Softag3	EC	0.86	Monoclonal antibody	0.3	Salt, polyol	Eukaryotic expression
S-tag	EC, I	4.8	S-protein resin	0.5	Low pH	Allows colorimetric detection
Strep-tag II	EC, I, M	1.05	Strep-Tactin®	2.5–25	Desthiobiotin	Engineered matrix allows elution
Streptavidin-binding peptide	EC, I, M	4.3	Streptavidin	0.5	2 mM biotin	High-affinity tag; excellent purity
Thioredoxin	EC	11.7	Anti-thioredoxin	0.4	Low pH	Can improve solubility
V5	M	1.42	Anti-V5 antibody	0.3	Low pH	Detection and immunoprecipitation
VSV-G	M	1.33	Anti-VSV-G antibody	0.75	VSV-G peptide	Detection and immunoprecipitation

One of the most commonly used affinity tags on constructs today is the polyHis tag. Designed more than 25 years ago, the polyHis tag binds to metal-containing matrices and can be eluted with high concentrations of imidazole.⁵⁶ The polyHis tag, which can vary in length from 6 to 12 histidine residues, can be placed in almost any position in the protein sequence and, in most cases, will still be able to serve as an affinity handle for purification. Tag placement can, however, dramatically alter both the function of the tag and the behavior of the protein if the tag is required for the screening assay.⁴² Just as multiple constructs may need to be tested to optimize protein expression, tagging constructs at both N- and C-terminal ends is also warranted, especially if the small-molecule site is proximal to one of the termini. It also may be necessary to test various linker peptide lengths between the tag and the beginning or end of the target gene.⁵⁷ Testing tags at both termini may also be important if the construct is going to be used in protein–protein interaction screens, and there is concern that the presence of the tag may sterically interfere with the interaction. PolyHis tags have been successfully used on a wide variety of targets, but they are not recommended for use with metalloenzymes because the polyHis capture matrix may extract metals from the active site and reduce specific activity.⁴²
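The recommendation to test tags at both termini with several linker lengths amounts to a small combinatorial panel, which can be enumerated as below. This is a sketch under stated assumptions: the six-histidine tag matches the shortest polyHis length mentioned above, but the glycine-serine linker sequences, the construct naming, and the function name are hypothetical choices for illustration, and `target` is assumed to be the protein sequence without its initiator methionine.

```python
from itertools import product


def tagged_constructs(target, tag="HHHHHH", linkers=("", "GS", "GSGSGS")):
    """Enumerate N- and C-terminally tagged constructs with varying linkers.

    target  : protein sequence without its initiator methionine (assumption)
    tag     : affinity tag sequence; six histidines by default
    linkers : hypothetical linker peptides placed between tag and target
    """
    constructs = {}
    for terminus, linker in product(("N", "C"), linkers):
        name = f"{terminus}-term_linker{len(linker)}"
        if terminus == "N":
            # Initiator Met, then tag, linker, and target.
            constructs[name] = "M" + tag + linker + target
        else:
            # Initiator Met, then target, linker, and tag.
            constructs[name] = "M" + target + linker + tag
    return constructs


if __name__ == "__main__":
    panel = tagged_constructs("ACDEFGHIKLMNPQRSTVWY")  # toy target sequence
    for name, seq in panel.items():
        print(f"{name}\t{seq}")
```

A protease cleavage site could be added to each construct in the same way if tag removal is planned.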

Other popular tags include GST, MBP, FLAG, and Strep tags. GST refers to the 26 kDa glutathione S-transferase enzyme from Schistosoma japonicum that binds to immobilized glutathione and can be eluted by excess glutathione.⁵⁸ Maltose-binding protein, or MBP, is a hydrophilic 42 kDa E. coli protein that binds to immobilized amylose and can be eluted with maltose.⁵⁹ GST and MBP tags can often increase the solubility of a target, but because GST is an obligate dimer,⁴³ it may present the protein to the small-molecule screening deck in a nonnative conformation. The FLAG tag is an 8 amino acid hydrophilic tag that binds to several antibodies and can be eluted using excess FLAG peptide.⁶⁰ A side-by-side comparison of the FLAG tag with other protein tags showed that it produced the highest level of purity from an E. coli lysate in a single-step purification.⁶¹ Strep tags are based on peptides that have been selected from large libraries based on their ability to bind to streptavidin or streptavidin derivatives.⁶² The Strep-tag II is a short 8 amino acid sequence that binds to Strep-Tactin, an engineered version of streptavidin, and is eluted with desthiobiotin.⁶³ The longer streptavidin-binding peptide, or SBP tag, is a 38 amino acid tag that binds to streptavidin and can be eluted with biotin.⁶⁴ One disadvantage of using the SBP tag has been that the matrix was not reusable after the biotin treatment, because biotin binds essentially irreversibly to streptavidin.
The recent discovery of a new streptavidin mutant that still maintains high-affinity binding to the SBP tag, but has significantly lower affinity for biotin so that the streptavidin matrix can be reused, may result in more frequent use of this novel tag on other proteins.⁶⁵ All of the tags discussed above can be engineered to be cleaved by site-specific proteases to remove the tags once purification is complete.⁶⁶ Tag removal is often desired for structural studies to prevent any tag-dependent artifacts, but it may also be important for screening campaigns that make use of reagents that bind to specific tags on targets.

Expression Hosts for Soluble Targets

The final major decision in the expression of a screening target is the choice of expression host. Due largely to its ease of use, potential for high protein yields, and commercial availability, E. coli, in particular the BL21 strain housing the lambda DE3 phage expressing T7 polymerase, is the most widely used expression host for both structural analysis and screening of protein targets.⁸ Of the 85,577 structures deposited in the crystal structure database (http://www.pdb.org) that have annotations describing the expression host used, 75,592 or 88% were produced in E. coli. Only 3% of the crystallized proteins were produced in insect cells using the baculovirus system, and a mere 1.4% were purified directly from human tissue or produced using a human cell expression system. Analysis of protein sources used for co-crystal structures with druglike molecules also shows that the targets used to identify these molecules are also largely produced in E. coli, with a growing representation of targets produced in baculovirus.¹⁵ Thus, to date, most proteins have been visualized and screened as represented by the E. coli expression system.

The caveats of E. coli expression for human targets have been described in many publications. Many target proteins, as many as 77% by some estimates,⁶⁷ are insoluble when expressed in E. coli, thus creating the need for the solubility-promoting tags described above. A specific example is interleukin-6 (IL-6), a secreted human protein that contains multiple disulfide bonds and naturally forms higher-order multimers.⁶⁸ When IL-6 is expressed in E. coli, it is deposited in inclusion bodies, the destination of aggregated and often misfolded proteins, and purification of IL-6 from these inclusion bodies results in inactive protein.⁶⁹ Efforts to increase the solubility of IL-6 by fusing it to GST, MBP, and NusA led to only minor improvements in solubility and significantly increased the labor required to purify the final product. Targeting IL-6 to the periplasmic space did produce soluble protein, but the yields were dramatically reduced. Only co-expression of multiple cytosolic chaperones (DnaK, DnaJ, GrpE, GroES, and GroEL) combined with expression at room temperature and lower concentrations of the inducer, IPTG (isopropyl beta-D-thiogalactoside), resulted in reasonable levels of soluble cytosolic expression and bioactivity comparable to native IL-6.⁶⁹ Even though there have been some successful attempts to produce disulfide-bonded proteins in E. coli using various strategies⁷⁰ other than the multiple chaperone approach required for IL-6, some of these attempts have unexpectedly resulted in the production of periplasmic inclusion bodies⁷¹ and expression of proteins with nonnative disulfide bonds,⁷² as was most likely the case for IL-6.⁶⁹ Thus, the limitations of E. coli for the expression of human proteins can be overcome but only by testing multiple expression conditions, including localization and co-expression of chaperones.

Many limitations of the E. coli system for producing human protein targets can be overcome by switching to the baculovirus system, a eukaryotic system based on lepidopteran cell lines from organisms such as the fall armyworm, Spodoptera frugiperda.⁷³ The baculovirus system is the most widely used eukaryotic expression system for both screening and structural studies.⁷⁴ S. frugiperda cells contain posttranslational modification systems and chaperones similar to those observed in mammalian cells, and they have been used to produce soluble and functional forms of many recombinant human targets, including secreted proteins, such as interleukins, and transmembrane proteins.⁷⁵ The production of proteins using the baculovirus system is a two-step process. First, the target gene is packaged into the baculovirus particle by co-transfecting insect cells with the plasmids required for target and virus production. Several rounds of virus production are often required to attain the titer required for high protein expression. The virus is then added to the insect cells in suspension cultures to produce the cell mass needed for high protein yields. Although baculovirus protein production is more labor intensive than the E. coli system, the resulting yields of soluble and functional targets are often well worth the extra effort. It is not, however, the ideal system for all targets, because some posttranslational modifications, such as N-linked glycosylation, are slightly different from those of mammalian cells,⁷³ and these observations have driven recent efforts to reengineer the insect cells to produce fully mammalian sugar chains.⁷⁶

An alternate expression system for human targets is the methylotrophic yeast, Pichia pastoris. A recent publication comparing the expression of human adiponectin in both E. coli and P. pastoris systems demonstrated that the P. pastoris system produced more soluble protein per liter of cells than E. coli.⁷⁷ Only the P. pastoris protein contained posttranslational modifications, and it had a more pronounced effect on mouse blood glucose levels than the E. coli material.⁷⁷ Another recent publication has demonstrated the superiority of the P. pastoris system over E. coli for the production of PDGF-AA.⁷⁸ Production of the N-terminus of urokinase plasminogen activator in P. pastoris also yielded biologically active material that contained posttranslational modifications not seen in E. coli–produced protein.⁷⁹ Overall, P. pastoris is superior to E. coli for the production of bioactive proteins that require posttranslational modifications, particularly glycosylation. Strain engineering with P. pastoris has identified combinations of mutations and overexpression of metabolic enzymes that can result in significantly higher production of recombinant proteins than wild-type strains.⁸⁰ The P. pastoris expression system has been recently used to screen for inhibitors of human matrix metalloproteases (MMPs) that are very difficult to express as active enzymes in prokaryotic systems due to their mechanism of activation.⁸¹ MMP2 and MMP9 were both expressed on the cell surface of P. pastoris by fusing the target constructs to a cell wall anchor sequence, and the intact cells were used as the enzyme source for a screen with a fluorescent substrate that resulted in the discovery of potent inhibitors.⁸¹

Interestingly, descriptions of expression of human proteins in mammalian systems, and human cell–based platforms, are somewhat rare in the screening literature and in the online database of crystallized proteins. Even though such systems have been commercially available for some time, their use has often been limited to the production of antibodies.⁸² Because of the limitations of E. coli expression for many full-length human proteins and because eukaryote-specific posttranslational modifications are often required for proper folding, new systems that readily allow side-by-side expression of protein targets in several eukaryotic systems have been developed. One example is the pFlp-Bac-to-Mam system, a single-vector system designed for expression in three host systems: baculovirus, transient expression in suspension culture–adapted HEK293 human cells, and stable expression in Chinese hamster ovary (CHO) cells.⁸³ The vector contains an Epstein–Barr virus oriP for the transient expression of the target in mammalian cell lines such as HEK293E, sites for Tn7 transposition into a bacmid for insect cell expression, and Flp–recombinase sites so that the insert can be rapidly integrated into certain CHO cell lines for stable expression. The authors compared the expression of three different proteins using this system. They found that the intracellular fluorescent protein, mCherry, was produced at very high levels of more than 50 mg/L in both the HEK293 and insect cells, with about fivefold lower expression in the CHO line.⁸³ The extracellular domain of the TL2 receptor was produced at nearly 1 mg/L in the CHO line, and scFc–hIgGFc was produced at 90 mg/L in HEK293 cells but only at 2 mg/L in the insect cells.⁸³ The advantage of this system was that all three expression systems could be explored in parallel in less than 12 weeks without the need for further subcloning or construct design.
The varying results with three different proteins clearly show the utility of such a system for quickly selecting the optimum host for expression of a screening target.
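The host-selection reasoning above can be sketched as a small lookup. This is only an illustration: the yields are the values quoted in the text (in mg/L), host and protein combinations that the text does not report are left as None, "more than 50 mg/L" is stored as the lower bound 50, and the CHO value for mCherry is an approximation derived from "about fivefold lower." The names YIELDS_MG_PER_L and best_hosts are hypothetical and are not part of the published system.

```python
from typing import Dict, List, Optional

# Yields in mg/L as quoted in the text; None means no value is given there.
# Assumptions: ">50 mg/L" is stored as 50.0, and "about fivefold lower"
# in CHO is stored as 50.0 / 5.
YIELDS_MG_PER_L: Dict[str, Dict[str, Optional[float]]] = {
    "mCherry": {"HEK293": 50.0, "insect": 50.0, "CHO": 50.0 / 5},
    "TL2 receptor ECD": {"HEK293": None, "insect": None, "CHO": 1.0},
    "scFc-hIgGFc": {"HEK293": 90.0, "insect": 2.0, "CHO": None},
}


def best_hosts(yields: Dict[str, Optional[float]]) -> List[str]:
    """Return the host(s) with the highest reported yield.

    Hosts without a reported yield are ignored; ties return every top host.
    """
    reported = {host: y for host, y in yields.items() if y is not None}
    if not reported:
        return []
    top = max(reported.values())
    return sorted(host for host, y in reported.items() if y == top)


if __name__ == "__main__":
    for protein, yields in YIELDS_MG_PER_L.items():
        print(protein, "->", best_hosts(yields))
```

Keeping unreported combinations as None, rather than zero, avoids treating a missing measurement as a failed expression.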

One of the challenges of expressing proteins in mammalian cells, such as HEK293 cells and their derivatives, is that they grow much more slowly than E. coli, and the assessment of multiple constructs using large-scale cultures can be quite laborious, especially if stable cell lines are generated for expression. To overcome these challenges, higher-throughput methods for optimizing expression in mammalian cells have recently been developed. One system uses PCR to rapidly generate tagged expression constructs that are then tested for expression using small (2 mL) culture volumes in 24-well blocks, a dot blot apparatus, and a fluorescent antibody to detect the tagged protein after it is blotted on a membrane.⁸⁴ Expression in small culture volumes was found to be representative of expression in larger culture volumes, which greatly accelerated the process of identifying highly expressed clones. A fully automated system for transient transfection of mammalian cells to assess protein production in a high-throughput manner has also been developed.⁸⁵

One example of a target for which the HEK293-based expression system has been successfully used for both structural and screening purposes is the secreted glycoprotein autotaxin. Autotaxin is a target for both new oncology and antifibrotic drugs,⁸⁶ but progress in building detailed structure–activity relationships had been limited due to the inability to produce the protein in enough quantity for structural studies.⁸⁷ Expression of autotaxin in HEK293 cells and purification of the secreted protein from the media have since been used to determine the crystal structure of the murine enzyme⁸⁸ and to successfully identify potent inhibitors by high-throughput screening.⁸⁹

Novel host organisms for the expression of human proteins have also begun to appear in the literature. Leishmania tarentolae, a nonpathogenic protozoan parasite of the white-spotted wall gecko, has been used to successfully express both soluble cytosolic proteins⁹⁰ and secreted proteins, such as erythropoietin, that require mammalian-type high-mannose N-glycosylation.⁹¹ Interestingly, this system makes use of the high transcription rate of RNA polymerase I by using vectors that integrate the recombinant expression cassette into a rRNA locus in the genome.⁹⁰ Insertion of 3′UTRs from highly expressed L. tarentolae genes can also greatly increase expression levels. L. tarentolae can also perform other posttranslational modifications such as the incorporation of disulfide bonds and the processing of signal sequences, and it can handle more complex protein-folding challenges such as proteins with epidermal growth factor–like domains.⁹² Human SOD1, without any additional fusion tag, was recently crystallized from protein expressed in L. tarentolae, demonstrating the ability of this system to produce large amounts (30 mg/L) of high-quality protein.⁹³ Other complex proteins that have been challenging to express in E. coli, such as the hepatitis E virus capsid protein,⁹⁴ have also been successfully expressed in L. tarentolae. Because this system has recently begun to be sold commercially, it will be of interest to see how widely it is adopted for producing challenging screening targets.

In vitro expression systems, based on E. coli transcription and translation machinery, have been used for many years to screen expression constructs for appropriate folding and function.⁹⁵ In vitro expression systems have not, however, been commonly used to produce protein targets for screening campaigns, mostly because it is much easier to scale up expression in E. coli than by using in vitro transcription and translation (IVTT) systems. A new cell-free mammalian IVTT expression system based on lysates from CHO cells may provide some significant advantages for producing challenging mammalian proteins without the need to grow large volumes of mammalian cells in a sterile environment.⁹⁶ This system links transcription and translation, so the user can start with a DNA plasmid. Expression levels as high as 50 µg/mL have been attained.⁹⁶ Because this system uses CHO microsomes, both soluble and membrane proteins can be expressed. As with the L. tarentolae system, it will be interesting to see if the CHO IVTT system will be successfully used to produce and screen recombinant versions of challenging new targets.
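To put the quoted upper yield of 50 µg/mL in perspective, the reaction volume needed for a given mass of protein follows from a simple division. The sketch below is a back-of-the-envelope helper, not a description of the published protocol: the function name ivtt_volume_ml and the recovery parameter (the fraction of expressed protein retained after purification) are assumptions introduced here for illustration.

```python
def ivtt_volume_ml(target_mg: float,
                   yield_ug_per_ml: float = 50.0,
                   recovery: float = 1.0) -> float:
    """Estimate the cell-free reaction volume (mL) needed for target_mg of protein.

    yield_ug_per_ml defaults to the upper value quoted in the text.
    recovery is a hypothetical purification recovery between 0 and 1.
    """
    if target_mg < 0:
        raise ValueError("target_mg must be non-negative")
    if yield_ug_per_ml <= 0:
        raise ValueError("yield_ug_per_ml must be positive")
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in the interval (0, 1]")
    # Convert mg to µg, then divide by the recoverable yield per mL.
    return target_mg * 1000.0 / (yield_ug_per_ml * recovery)


if __name__ == "__main__":
    print(ivtt_volume_ml(1.0))                # volume for 1 mg at full recovery
    print(ivtt_volume_ml(1.0, recovery=0.5))  # same target, half lost in purification
```

At the quoted yield, 1 mg of protein corresponds to 20 mL of reaction at full recovery, which illustrates why scale-up remains the main practical constraint for cell-free systems.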

Expression Systems for Membrane Targets

Membrane proteins, including the seven-transmembrane G-protein–coupled receptors (GPCRs), represent the largest class of drug targets.^11,97,98 Even though approximately 25% of all prokaryotic and eukaryotic genes encode membrane proteins, most are still considered proteins of unknown function, and only slightly more than 300 such proteins have high-resolution crystal structures.⁹⁹ It is clear from these observations that the overexpression of integral membrane proteins to sufficient levels to enable downstream purification and use in screening presents a unique set of challenges. These proteins are highly hydrophobic and prefer to exist in a lipid bilayer environment. Extraction of the membrane protein from the lipid bilayer environment of the cell membrane using detergents often leads to loss of stability and activity.¹⁰⁰ Large numbers of experiments are often required to identify detergents that both efficiently extract the membrane protein from the cell membrane and maintain its stability.¹⁰¹ Experiments performed on the diacylglycerol kinase (DGK) enzyme revealed a dramatic effect on thermal stability and activity when the protein was solubilized in different detergents. DGK solubilized in dodecyl maltoside (DDM) retained >90% activity even when heated to 75°C but had <50% activity at 55°C when solubilized in lauryldimethylamine-N-oxide (LDAO).¹⁰²

Current high-throughput screening approaches for membrane protein targets often rely on cell-based assays or assays using isolated membranes. Both of these formats can be prone to artifacts because these systems contain not only the target of interest but also many other proteins that are present in the cell membrane or peripherally associated with it.^103,104 Reporter-based cellular assays can reduce false-positive rates; however, it is important to select an assay that generates a biologically relevant readout, which is nontrivial.^105,106 The ideal membrane protein-screening reagent would be one in which the target is pure, soluble, and active, which would necessitate the overexpression of a recombinant target protein to enable downstream purification. Mammalian membrane proteins are often subjected to posttranslational modifications, such as glycosylation, phosphorylation, lipidation, and ubiquitination, which would preclude their expression in a bacterial host system (e.g., E. coli).^107–110 Baculovirus-infected insect cells, as discussed above, are capable of generating most, but not all, of the posttranslational modifications produced in mammalian cells, so the selection of the expression host will be highly dependent on how the protein is modified and what effect those modifications have on activity.¹¹¹ Expression of α1B-adrenergic receptor (AR) in different host cell lines demonstrated that the N-linked glycosylation and phosphorylation modifications were critical in maintaining structural heterogeneity of the receptor, whereas palmitoylation or O-linked glycosylation modifications were not.¹¹² Therefore, it may be suitable to express this receptor in baculovirus-infected insect cells because these cells produce N-linked glycosylations similar to those produced in mammalian cells. For other receptors in which O-linked glycosylation is essential for structural and functional fidelity, however, use of a mammalian host may be required.
The baculovirus expression system has been used successfully for the generation of human GPCRs for structural studies,¹¹³ but it is far from a panacea.^114,115 A recent comparison of yeast and baculovirus expression systems for the production of human GPCRs demonstrated multiple advantages of yeast expression systems, including higher yields (in both the total protein and the fraction of active protein), ease of use, and shorter production times.¹¹⁶ Use of the Bac-to-Mam system described previously for the infection of mammalian cells can produce transiently transfected cells displaying high levels of membrane protein expression, but the proprietary nature of this system limits its accessibility to small companies and academic research labs.^117,118 Mammalian cells that have been adapted for serum-free growth in suspension at high density, such as Freestyle293 (Life Technologies) and Expi293 (Life Technologies), reduce the cell handling required for large-scale fermentations and transient transfections. Although these systems are advertised as being capable of producing milligram quantities of membrane proteins per liter of culture, in the authors’ experience the typical yields for a GPCR are often <1 mg/L of purified receptor.

Recent advances in determining the three-dimensional crystal structures of GPCRs have depended on approaches that require the modification of the protein, either by the insertion of a large, soluble fusion domain or by introducing multiple point mutations, both of which can alter the ligand-binding properties of the expressed receptor.^119–121 The recently published crystal structure of glucagon receptor (GCGR) contains a thermally stabilized E. coli apo-cytochrome b562RIL (BRIL) domain in place of the native N-terminal domain of the receptor.¹²² The N-terminal domain of class B GPCRs plays a critical role in ligand binding and receptor activation, so it is unlikely that the GCGR–BRIL fusion construct would be capable of adopting a conformation competent for peptide binding.¹²³ Screening with highly modified constructs of membrane proteins would be more likely to produce false positives and less likely to find molecules with the desired biological activity. The structural determination of the first complex of a GPCR with its cognate G-protein was facilitated by the presence of a stabilizing immunoglobulin molecule (nanobody).^124,125 It has been recently demonstrated that nanobodies can induce the formation of specific conformational states of the protein (i.e., agonist/activated).¹²⁴ The nanobody approach has the appeal of not requiring multiple mutations to the receptor itself, maintaining the conformation-specific ligand binding of the receptor and allowing for screening against a population of receptors locked into a defined conformational state.¹²⁶ The limitation of this approach is the need to generate the nanobodies, which requires either isolation from the serum of inoculated llamas or screening against recombinant nanobody libraries.¹²⁷

Reconstitution of Membrane Targets

New approaches to reconstituting solubilized membrane protein targets in more native environments have focused on embedding membrane proteins in encapsulated lipid bilayers (nanodiscs) or solubilizing proteins in a mixed micelle, including lipids, detergents, and sphingolipids (Table 2). The use of nanodiscs for this purpose is more broadly applicable, having been successfully applied to multiple membrane protein target classes, including GPCRs, cytochrome P450s, chemoreceptors, and many others,¹²⁸ and this approach does not require the mutation of the membrane protein itself. The nanodisc technique allows the membrane protein to be solubilized in the absence of detergent, which makes it appealing for applications that are sensitive to the presence of detergents.^129,130 The nanodisc-embedded membrane protein is in a native, lipid-bilayer environment and is, therefore, more likely to adopt a biologically relevant conformation. The presence of lipids and other lipid-bound components can be critical for membrane protein function, as has been observed for gamma-secretase, a membrane-embedded protease involved in the proteolytic processing of the amyloid precursor protein (APP). Reconstitution of gamma-secretase into proteoliposomes of varying compositions demonstrated that enzyme reconstituted into proteoliposomes containing sphingolipids, cholesterol, and phospholipids had significantly increased activity relative to other preparations.¹³¹

Table 2. Overview of Advantages and Disadvantages of Various Approaches Used to Prepare Membrane Targets for Screening.

Approach	Advantages	Disadvantages
Receptor in Cell Membrane	Target in bilayer; overexpression not required	Heterogeneous surface; prone to false positives
Wild-Type Receptor in Detergent Micelle	Purified target	Overexpression needed; detergent could interfere with screening; stability concerns
Stabilized Receptor in Detergent Micelle	Purified target; receptor stable for screening	Overexpression needed; detergent could interfere with screening; highly modified target
Nanobody and G-Protein Fusion Receptor in Detergent Micelle	Purified target; homogeneous receptor conformation	Overexpression needed; detergent could interfere with screening; generation of fusion protein
Wild-Type Receptor in Encapsulated Bilayer	Target in bilayer; purified target; no detergent present	Overexpression needed
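For readers who want to filter these approaches against the constraints of a particular assay, the contents of Table 2 can be encoded as a small data structure. The sketch below is one possible encoding; the names TABLE_2 and approaches_without are hypothetical, and the entries are copied from the table rather than derived from any additional data.

```python
from typing import Dict, List

# Table 2, encoded as approach -> advantages/disadvantages (text as in the table).
TABLE_2: Dict[str, Dict[str, List[str]]] = {
    "Receptor in cell membrane": {
        "advantages": ["Target in bilayer", "Overexpression not required"],
        "disadvantages": ["Heterogeneous surface", "Prone to false positives"],
    },
    "Wild-type receptor in detergent micelle": {
        "advantages": ["Purified target"],
        "disadvantages": ["Overexpression needed",
                          "Detergent could interfere with screening",
                          "Stability concerns"],
    },
    "Stabilized receptor in detergent micelle": {
        "advantages": ["Purified target", "Receptor stable for screening"],
        "disadvantages": ["Overexpression needed",
                          "Detergent could interfere with screening",
                          "Highly modified target"],
    },
    "Nanobody and G-protein fusion receptor in detergent micelle": {
        "advantages": ["Purified target", "Homogeneous receptor conformation"],
        "disadvantages": ["Overexpression needed",
                          "Detergent could interfere with screening",
                          "Generation of fusion protein"],
    },
    "Wild-type receptor in encapsulated bilayer": {
        "advantages": ["Target in bilayer", "Purified target", "No detergent present"],
        "disadvantages": ["Overexpression needed"],
    },
}


def approaches_without(concern: str) -> List[str]:
    """List approaches whose disadvantages in Table 2 do not include `concern`."""
    return [name for name, row in TABLE_2.items()
            if concern not in row["disadvantages"]]


if __name__ == "__main__":
    # Example: which formats avoid detergent interference in the assay?
    print(approaches_without("Detergent could interfere with screening"))
```

A detergent-sensitive assay, for example, is left with the native cell membrane or the encapsulated bilayer, which matches the argument made for nanodiscs in the text.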

Discussion

So what is the best path forward to express a challenging new screening target? We have laid out a very simple flowchart to guide this process ( Fig. 1 ). Overall, the simplest path forward for a new target is to identify a homologous target that has already been successfully expressed and to follow the same method. It is mainly for this reason that there is such an extreme bias toward E. coli–expressed protein in both the structural and screening worlds. It has been estimated that only 15% of eukaryotic proteins can be expressed by E. coli in a soluble and active form,¹³² and thus new expression systems are absolutely needed for novel targets. There have been many attempts to improve the E. coli system, with the addition of solubility-inducing fusion tags or the co-expression of multiple chaperones as described above, but sampling all of these methods with a new target is a tedious endeavor. Much more promising expression systems for the proper folding and posttranslational modification of target proteins, such as the multihost systems including mammalian cells, novel host organisms, and cell-free mammalian systems, have been developed in the past few years. Even production of some of the most challenging membrane targets in an environment very close to that of cellular membranes is becoming approachable with novel expression, purification, and reconstitution techniques. It will be of great interest to compare the results of screens conducted with protein produced from these novel systems to screening results and small-molecule drug-development campaigns with E. coli–expressed targets. It will also be of interest to assess how many novel targets will become accessible to screening campaigns as a result of these innovative technologies. 
The ultimate question is whether these new expression methods will produce more efficacious small molecules either by accessing novel targets for disease treatments or by producing screening reagents that more faithfully represent the endogenous targets. Perhaps the former is more likely than the latter, but in both cases, the production of challenging new targets by novel expression systems significantly increases the likelihood of identifying the exciting new small-molecule therapies of the future.
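The reasoning in this discussion can be written as a short decision function. This is a loose sketch based only on the prose above and is not a reproduction of the flowchart in Figure 1: the Target fields, the ordering of the checks, and the wording of the suggestions are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Target:
    # Hypothetical inputs; a real workflow would derive these from database
    # searches (e.g., PDB, PFAM) and from what is known about the protein.
    homolog_method_known: bool    # a homolog has already been expressed successfully
    needs_eukaryotic_ptms: bool   # folding or function depends on eukaryotic PTMs
    is_membrane_protein: bool     # integral membrane target


def suggest_route(target: Target) -> str:
    """Suggest a starting point for expression, following the discussion loosely."""
    if target.homolog_method_known:
        # Simplest path: follow the method that worked for the homolog.
        return "Follow the method already used for the homologous target"
    if target.is_membrane_protein:
        return ("Eukaryotic host (yeast, insect, or mammalian cells), followed by "
                "reconstitution in detergent micelles or nanodiscs")
    if target.needs_eukaryotic_ptms:
        return ("Multihost eukaryotic system, novel host such as L. tarentolae, "
                "or cell-free mammalian system")
    return "E. coli, adding solubility tags or chaperone co-expression if needed"


if __name__ == "__main__":
    print(suggest_route(Target(False, True, False)))
```

Checking for a previously expressed homolog first mirrors the observation that reusing a proven method is the simplest path, with the newer systems reserved for targets where no such precedent exists.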

Figure 1. Simplified flowchart for protein expression. PDB, Protein Data Bank (http://www.pdb.org); PFAM, Protein Families (http://pfam.sanger.ac.uk); PTM, posttranslational modification.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

References

1. Pereira; Williams. Origin and Evolution of High Throughput Screening. Br. J. Pharmacol. 2007, 152 (1), 53–61.

2. Kodadek. Rethinking Screening. Nat. Chem. Biol. 2010, 6 (3), 162–165.

3. Bamborough. System-Based Drug Discovery within the Human Kinome. Expert Opin. Drug Discov. 2012, 7 (11), 1053–1070.

4. Engel; Vuissoz, J.-M.; Eggimann; et al. Bacterial Expression of Mutant Argininosuccinate Lyase Reveals Imperfect Correlation of in-Vitro Enzyme Activity with Clinical Phenotype in Argininosuccinic Aciduria. J. Inher. Metab. Dis. 2012, 35 (1), 133–140.

5. Zhuang; Durrani; Hodges; et al. Expression of Recombinant Ara h 6 in Pichia pastoris but Not in Escherichia coli Preserves Allergic Effector Function and Allows Assessment of Specific Mutations. Molec. Nutr. Food Res. 2012, 56 (6), 986–995.

6. Clark, M. A.; Acharya, R. A.; Arico-Muendel, C. C.; et al. Design, Synthesis and Selection of DNA-Encoded Small-Molecule Libraries. Nat. Chem. Biol. 2009, 5 (9), 647–654.

7. Zhu; Cuozzo. Review Article: High-Throughput Affinity-Based Technologies for Small-Molecule Drug Discovery. J. Biomol. Screen. 2009, 14 (10), 1157–1164.

8. Nettleship; Assenberg; Diprose; et al. Recent Advances in the Production of Proteins in Insect and Mammalian Cells for Structural Biology. J. Struct. Biol. 2010, 172 (1), 55–65.

9. Rask-Andersen; Almén; Schiöth. Trends in the Exploitation of Novel Drug Targets. Nat. Rev. Drug Discov. 2011, 10 (8), 579–590.

10. Dixon; Stockwell. Identifying Druggable Disease-Modifying Gene Products. Curr. Opin. Chem. Biol. 2009, 13 (5–6), 549–555.

11. Overington; Al-Lazikani; Hopkins. How Many Drug Targets Are There? Nat. Rev. Drug Discov. 2006, 5 (12), 993–996.

12. Clark, M. A. Selecting Chemicals: The Emerging Utility of DNA-Encoded Libraries. Curr. Opin. Chem. Biol. 2010, 14 (3), 396–403.

13. Verdine; Walensky. The Challenge of Drugging Undruggable Targets in Cancer: Lessons Learned from Targeting BCL-2 Family Members. Clin. Cancer Res. 2007, 13 (24), 7264–7270.

14. Crews. Targeting the Undruggable Proteome: The Small Molecules of My Dreams. Chem. Biol. 2010, 17 (6), 551–555.

15. Kufareva; Ilatovskiy; Abagyan. Pocketome: An Encyclopedia of Small-Molecule Binding Sites in 4D. Nucleic Acids Res. 2012, 40 (Database issue), 40.

16. Biondi; Kieloch; Currie; et al. The PIF-Binding Pocket in PDK1 Is Essential for Activation of S6K and SGK, but Not PKB. EMBO J. 2001, 20 (16), 4380–4390.

17. Bobkova; Weber; et al. Discovery of PDK1 Kinase Inhibitors with a Novel Mechanism of Action by Ultrahigh Throughput Screening. J. Biol. Chem. 2010, 285 (24), 18838–18846.

18. Magesh; Chen. Small Molecule Modulators of Keap1-Nrf2-ARE Pathway as Potential Preventive and Therapeutic Agents. Med. Res. Rev. 2012, 32 (4), 687–726.

19. Kim, T.-H.; Hur, E.-G.; Kang, S.-J.; et al. NRF2 Blockade Suppresses Colon Tumor Angiogenesis by Inhibiting Hypoxia-Induced Activation of HIF-1α. Cancer Res. 2011, 71 (6), 2260–2275.

20. Steel; Cowan; Payerne; et al. Anti-Inflammatory Effect of a Cell-Penetrating Peptide Targeting the Nrf2/Keap1 Interaction. ACS Med. Chem. Lett. 2012, 3 (5), 407–410.

21. Choo; Hagen. Mechanism of Cullin3 E3 Ubiquitin Ligase Dimerization. PLoS One 2012, 7 (7), e41350.

22. Zhang; Hannink; et al. Crystal Structure of the Kelch Domain of Human Keap1. J. Biol. Chem. 2004, 279 (52), 54750–54758.

23. Magesh; Chen; et al. Discovery of a Small-Molecule Inhibitor and Cellular Probe of Keap1-Nrf2 Protein-Protein Interaction. Bioorg. Med. Chem. Lett. 2013, 23 (10), 3039–3043.

24. Savitsky; Bray; Cooper; et al. High-Throughput Production of Human Proteins for Crystallization: The SGC Experience. J. Struct. Biol. 2010, 172 (1), 3–13.

25. Punta; Coggill, P. C.; Eberhardt, R. Y.; et al. The Pfam Protein Families Database. Nucleic Acids Res. 2012, 40 (Database issue), D290–301.

26. Jones, D. T. Protein Secondary Structure Prediction Based on Position-Specific Scoring Matrices. J. Mol. Biol. 1999, 292 (2), 195–202.

27. Raymond; Haffner; et al. Gene Design, Cloning and Protein-Expression Methods for High-Value Targets at the Seattle Structural Genomics Center for Infectious Disease. Acta Crystallograph. F 2011, 67 (Pt 9), 992–997.

28. Gustafsson; Govindarajan; Minshull. Codon Bias and Heterologous Protein Expression. Trends Biotechnol. 2004, 22 (7), 346–353.

29. Ermolaeva. Synonymous Codon Usage in Bacteria. Curr. Issues Molec. Biol. 2001, 3, 91–97.

30. Zheng; Qureshi; et al. SGDB: A Database of Synthetic Genes Re-designed for Optimizing Protein Over-expression. Nucleic Acids Res. 2007, 35 (Database issue), 9.

31. Gustafsson; Minshull; Govindarajan. Engineering Genes for Predictable Protein Expression. Protein Expression and … 2012, 83 (1), 37–46.

32. Cannarozzi; Cannarrozzi; Schraudolph; et al. A Role for Codon Order in Translation Dynamics. Cell 2010, 141 (2), 355–367.

33. Goodman; Church; Kosuri. Causes and Effects of N-Terminal Codon Bias in Bacterial Genes. Science (New York, N.Y.) 2013, 342 (6157), 475–479.

34. Trotta. Selection on Codon Bias in Yeast: A Transcriptional Hypothesis. Nucleic Acids Res. 2013, 41 (20), 9382–9395.

35. Novoa; Ribas de Pouplana. Speeding with Control: Codon Usage, tRNAs, and Ribosomes. Trends Genet. 2012, 28 (11), 574–581.

36. Zhang; Hubalewska; Ignatova. Transient Ribosomal Attenuation Coordinates Protein Synthesis and Co-translational Folding. Nat. Struct. Molec. Biol. 2009, 16 (3), 274–280.

37. Saunders; Deane, C. M. Synonymous Codon Usage Influences the Local Protein Structure Observed. Nucleic Acids Res. 2010, 38 (19), 6719–6728.

38. Edwards; Hing; Perry; et al. Characterization of Coding Synonymous and Non-synonymous Variants in ADAMTS13 Using Ex Vivo and In Silico Approaches. PLoS One 2012, 7 (6), e38864.

39. Shah; et al. Non-optimal Codon Usage Is a Mechanism to Achieve Circadian Clock Conditionality. Nature 2013, 495 (7439), 116–120.

40. Zhou; Guo; Cha; et al. Non-optimal Codon Usage Affects Expression, Structure and Function of Clock Protein FRQ. Nature 2013, 495 (7439), 111–115.

41. Kimchi-Sarfaty; Kim, I.-W.; et al. A “Silent” Polymorphism in the MDR1 Gene Changes Substrate Specificity. Science (New York, N.Y.) 2007, 315 (5811), 525–528.

42. Young; Britton; Robinson. Recombinant Protein Expression and Purification: A Comprehensive Review of Affinity Tags and Microbial Applications. Biotechnology J. 2012, 7 (5), 620–634.

43. Bell, M. R.; Engleka, M. J.; Malik; et al. To Fuse or Not to Fuse: What Is Your Purpose? Protein Sci. 2013, 22 (11), 1466–1477.

44. LaVallie, E. R.; Diblasio-Smith, E. A.; et al. Thioredoxin as a Fusion Partner for Production of Soluble Recombinant Proteins in Escherichia coli. Methods Enzymol. 2000, 326, 322–340.

45. Malakhov, M. P.; Mattern, M. R.; Malakhova, O. A.; et al. SUMO Fusions and SUMO-Specific Protease for Efficient Expression and Purification of Proteins. J. Struct. Funct. Genomics 2004, 5 (1–2), 75–86.

46. Davis, G. D.; Elisee; Newham, D. M.; et al. New Fusion Protein Systems Designed to Give Soluble Expression in Escherichia coli. Biotechnol. Bioeng. 1999, 65 (4), 382–388.

47. Gusarov; Nudler. Control of Intrinsic Transcription Termination by N and NusA: The Basic Mechanisms. Cell 2001, 107 (4), 437–449.

48. Ahn, J.-H.; Keum, J.-W.; Kim, D.-M. Expression Screening of Fusion Partners from an E. coli Genome for Soluble Expression of Recombinant Proteins in a Cell-Free Protein Synthesis System. PLoS One 2011, 6 (11), e26875.

49. Yansura, D. G. Expression as trpE Fusion. Methods Enzymol. 1990, 185, 161–166.

50. Athan; Christopher, T. W. Production, Purification, and Cleavage of Tandem Repeats of Recombinant Peptides. JACS 1994, 116, 4599–4607.

51. Hwang, P. M.; Pan, J. S.; Sykes, B. D. A PagP Fusion Protein System for the Expression of Intrinsically Disordered Proteins in Escherichia coli. Protein Expr. Purif. 2012, 85 (1), 148–151.

52. Park, T.-J.; Kim, J.-S.; Choi, S.-S.; et al. Cloning, Expression, Isotope Labeling, Purification, and Characterization of Bovine Antimicrobial Peptide, Lactophoricin in Escherichia coli. Protein Expr. Purif. 2009, 65 (1), 23–29.

53. Hwang; Pan; Sykes. Targeted Expression, Purification, and Cleavage of Fusion Proteins from Inclusion Bodies in Escherichia coli. FEBS Lett. 2013.

54. Britton, Z. T.; Hanle, E. I.; Robinson, A. S. An Expression and Purification System for the Biosynthesis of Adenosine Receptor Peptides for Biophysical and Structural Characterization. Protein Expr. Purif. 2012, 84 (2), 224–235.

55. Degorce; Card; Soh; et al. HTRF: A Technology Tailored for Drug Discovery—a Review of Theoretical Aspects and Recent Applications. Curr. Chem. Genom. 2009, 3, 22–32.

56. Chaga, G. S. Twenty-Five Years of Immobilized Metal Ion Affinity Chromatography: Past, Present and Future. J. Biochem. Biophys. Meth. 2001, 49 (1–3), 313–334.

57. Lee; Kim, S.-H. High-Throughput T7 LIC Vector for Introducing C-Terminal Poly-Histidine Tags with Variable Lengths without Extra Sequences. Protein Expr. Purif. 2009, 63 (1), 58–61.

58. Smith; Johnson. Single-Step Purification of Polypeptides Expressed in Escherichia coli as Fusions with Glutathione S-Transferase. Gene 1988, 67 (1), 31–40.

59. di Guan; Riggs; et al. Vectors That Facilitate the Expression and Purification of Foreign Peptides in Escherichia coli by Fusion to Maltose-Binding Protein. Gene 1988, 67 (1), 21–30.

60. Einhauer; Jungbauer. The FLAG Peptide, a Versatile Fusion Tag for the Purification of Recombinant Proteins. J. Biochem. Biophys. Meth. 2001, 49 (1–3), 455–465.

61. Lichty; Malecki; Agnew; et al. Comparison of Affinity Tags for Protein Purification. Protein Expr. Purif. 2005, 41 (1), 98–105.

62. Schmidt; Skerra. The Random Peptide Library-Assisted Engineering of a C-Terminal Affinity Peptide, Useful for the Detection and Purification of a Functional Ig Fv Fragment. Prot. Engin. 1993, 6 (1), 109–122.

63. Korndorfer, I. P. Improved Affinity of Engineered Streptavidin for the Strep-Tag II Peptide Is Due to a Fixed Open Conformation of the Lid-Like Loop at the Binding Site. Protein Sci. 2002, 11, 883–893.

64. Keefe; Wilson; Seelig; et al. One-Step Purification of Recombinant Proteins Using a Nanomolar-Affinity Streptavidin-Binding Peptide, the SBP-Tag. Protein Expr. Purif. 2001, 23 (3), 440–446.

65. S.-C.; Wong, S.-L. Structure-Guided Design of an Engineered Streptavidin with Reusability to Purify Streptavidin-Binding Peptide Tagged Proteins or Biotinylated Proteins. PLoS One 2013, 8 (7), e69530.

66. Charlton; Zachariou. Tag Removal by Site-Specific Cleavage of Recombinant Fusion Proteins. Meth. Molec. Biol. 2011, 681, 349–367.

67. Marblestone; Edavettal; Lim; et al. Comparison of SUMO Fusion Technology with Traditional Gene Fusion Systems: Enhanced Expression and Solubility with SUMO. Protein Sci. 2006, 15 (1), 182–189.

68. Simpson; Hammacher; Smith; et al. Interleukin-6: Structure-Function Relationships. Protein Sci. 1997, 6 (5), 929–955.

69. Nausch; Huckauf; Koslowski; et al. Recombinant Production of Human Interleukin 6 in Escherichia coli. PLoS One 2013, 8 (1).

70. de Marco. Strategies for Successful Recombinant Expression of Disulfide Bond-Dependent Proteins in Escherichia coli. Microb. Cell Fact. 2009, 8, 26.

71. Arié, J.-P.; Miot; Sassoon; et al. Formation of Active Inclusion Bodies in the Periplasm of Escherichia coli. Molec. Microbiol. 2006, 62 (2), 427–437.

72. Martin; Rojas; Mitchell; et al. A Simple Vector System to Improve Performance and Utilisation of Recombinant Antibodies. BMC Biotech. 2006, 6, 46.

73. Contreras-Gómez; Sánchez-Mirón; García-Camacho; et al. Protein Production Using the Baculovirus-Insect Cell Expression System. Biotechnol. Prog. 2014, in press.

74. Assenberg; Wan; Geisse; et al. Advances in Recombinant Protein Expression for Use in Pharmaceutical Research. Curr. Opin. Struct. Biol. 2013, 23 (3), 393–402.

75. Kato; Kajikawa; Maenaka; et al. Silkworm Expression System as a Platform Technology in Life Science. Appl. Microbiol. Biotechnol. 2010, 85 (3), 459–470.

76. Harrison, R. L.; Jarvis, D. L. Protein N-Glycosylation in the Baculovirus-Insect Cell Expression System and Engineering of Insect Cells to Produce “Mammalianized” Recombinant Glycoproteins. Adv. Virus Res. 2006, 68, 159–191.

77. Rothan; Teh; Haron; et al. A Comparative Study on the Expression, Purification and Functional Characterization of Human Adiponectin in Pichia pastoris and Escherichia coli. Intl. J. Molec. Sci. 2012, 13 (3), 3549–3562.

78. Hui; Yang; et al. High Level Expression, Efficient Purification and Bioactivity Assay of Recombinant Human Platelet-Derived Growth Factor AA Dimer (PDGF-AA) from Methylotrophic Yeast Pichia pastoris. Protein Expr. Purif. 2013, 91 (2), 221–227.

79. Lin; Zhuang; et al. Expression, Purification, and Biological Characterization of the Amino-Terminal Fragment of Urokinase in Pichia pastoris. J. Microbiol. Biotech. 2013, 23 (9), 1197–1205.

80. Krainer; Dietzsch; Hajek; et al. Recombinant Protein Expression in Pichia pastoris Strains with an Engineered Methanol Utilization Pathway. Microb. Cell Fact. 2012, 11, 22.

81. Diehl; Hoffmann, T. M.; Mueller, N. C.; et al. Novel Yeast Bioassay for High-Throughput Screening of Matrix Metalloproteinase Inhibitors. Appl. Environ. Microbiol. 2011, 77, 8573–8577.

82. Jäger; Büssow; Wagner; et al. High Level Transient Production of Recombinant Antibodies and Antibody Fusion Proteins in HEK293 Cells. BMC Biotech. 2013, 13 (1), 52.

83. Meyer; Lorenz; Baser; et al. Multi-Host Expression System for Recombinant Production of Challenging Proteins. PLoS One 2013, 8 (7), e68674.

84. Chapple; Crofts; Shadbolt; et al. Multiplexed Expression and Screening for Recombinant Protein Production in Mammalian Cells. BMC Biotech. 2006, 6, 49.

85. Zhao; Bishop; Clay; et al. Automation of Large Scale Transient Protein Expression in Mammalian Cells. J. Struct. Biol. 2011, 175 (2), 209–215.

86. Parrill, A. L.; Baker, D. L. Autotaxin Inhibition: Challenges and Progress toward Novel Anti-Cancer Agents. Anticancer Agents Med. Chem. 2008, 8 (8), 917–923.

87. Song; Dilger; Bell; et al. Large Scale Purification and Characterization of Recombinant Human Autotaxin/Lysophospholipase D from Mammalian Cells. BMB Rep. 2010, 43 (8), 541–546.

88. Nishimasu; Okudaira; Hama; et al. Crystal Structure of Autotaxin and Insight into GPCR Activation by Lipid Mediators. Nat. Struct. Molec. Biol. 2011, 18 (2), 205–212.

89. Gierse; Thorarensen; Beltey; et al. A Novel Autotaxin Inhibitor Reduces Lysophosphatidic Acid Levels in Plasma and the Site of Inflammation. J. Pharm. Exper. Therap. 2010, 334 (1), 310–317.

90. Niimi. Recombinant Protein Production in the Eukaryotic Protozoan Parasite Leishmania tarentolae: A Review. Meth. Molec. Biol. 2012, 824, 307–315.

91.

Breitling

Klingner

Callewaert

. Non-Pathogenic Trypanosomatid Protozoa as a Platform for Protein Research and Production. Protein Expr. Purif. 2002, 25 (2), 209–218.

92.

Phan

H. P.

Sugino

Niimi

The Production of Recombinant Human Laminin-332 in a Leishmania tarentolae Expression System. Protein Expr. Purif. 2009, 68 (1), 79–84.

93.

Gazdag

Cirstea

Breitling

. Purification and Crystallization of Human Cu/Zn Superoxide Dismutase Recombinantly Produced in the Protozoan Leishmania tarentolae. Acta Crystallograph. F 2010, 66 (Pt. 8), 871–877.

94.

Baechlein

Meemken

Pezzoni

. Expression of a Truncated Hepatitis E Virus Capsid Protein in the Protozoan Organism Leishmania tarentolae and Its Application in a Serological Assay. J. Virol. Meth. 2013, 193 (1), 238–243.

95.

Murray

C. J.

Baliga

Cell-Free Translation of Peptides and Proteins: From High Throughput Screening to Clinical Production. Curr. Opin. Chem. Biol. 2013, 17 (3), 420–426.

96.

Brödel

Sonnabend

Kubick

Cell-Free Protein Expression Based on Extracts from CHO Cells. Biotechnol. Bioengin. 2013, 111, 25–36.

97. Garland, S. L. Are GPCRs Still a Source of New Targets? J. Biomol. Screen. 2013, 18 (9), 947–966.

98. Hopkins, A. L.; Groom, C. R. The Druggable Genome. Nat. Rev. Drug Discov. 2002, 1 (9), 727–730.

99. Bernaudat; Frelet-Barrand; Pochon; et al. Heterologous Expression of Membrane Proteins: Choosing the Appropriate Host. PLoS One 2011, 6 (12).

100. Silvius, J. R. Solubilization and Functional Reconstitution of Biomembrane Components. Ann. Rev. Biophys. Biomol. Struct. 1992, 21, 323–348.

101. le Maire; Champeil; Moller, J. V. Interaction of Membrane Proteins and Lipids with Solubilizing Detergents. Biochim. Biophys. Acta 2000, 1508 (1–2), 86–111.

102. Zhou; Bowie, J. U. Building a Thermostable Membrane Protein. J. Biol. Chem. 2000, 275 (10), 6975–6979.

103. Sarver; Black, R. J.; Bridges; et al. Frontiers in HIV-1 Therapy: Fourth Conference of the NIAID National Cooperative Drug Discovery Groups-HIV. AIDS Res. Hum. Retroviruses 1992, 8 (5), 659–667.

104. Witvrouw; Pauwels; Vandamme, A. M.; et al. Cell Type-Specific Anti-Human Immunodeficiency Virus Type 1 Activity of the Transactivation Inhibitor Ro5-3335. Antimicrob. Agents Chemother. 1992, 36 (12), 2628–2633.

105. Hendricson; Matchett; Ferrante; et al. Design of an Automated Enhanced-Throughput Platform for Functional Characterization of Positive Allosteric Modulator-Induced Leftward Shifts in Apparent Agonist Potency in Vitro. J. Lab. Autom. 2012, 17 (2), 104–115.

106. Noblin, D. J.; Bertekap, R. L., Jr.; Burford, N. T.; et al. Development of a High-Throughput Calcium Flux Assay for Identification of All Ligand Types including Positive, Negative, and Silent Allosteric Modulators for G Protein-Coupled Receptors. Assay Drug Dev. Technol. 2012, 10 (5), 457–467.

107. Claing; Laporte; Caron; et al. Endocytosis of G Protein-Coupled Receptors: Roles of G Protein-Coupled Receptor Kinases and Beta-Arrestin Proteins. Prog. Neurobiol. 2002, 66 (2), 61–79.

108. Dores, M. R.; Trejo. Ubiquitination of G Protein-Coupled Receptors: Functional Implications and Drug Discovery. Molec. Pharmacol. 2012, 82 (4), 563–570.

109. Morello, J. P.; Bouvier. Palmitoylation: A Post-Translational Modification That Regulates Signalling from G-Protein Coupled Receptors. Biochem. Cell Biol. 1996, 74 (4), 449–457.

110. Wheatley; Hawtin, S. R. Glycosylation of G-Protein-Coupled Receptors for Hormones Central to Normal Reproductive Functioning: Its Occurrence and Role. Hum. Reprod. Update 1999, 5 (4), 356–364.

111. Klenk. Post-Translational Modifications in Insect Cells. Cytotechnology 1996, 20 (1–3), 139–144.

112. Bjorklof; Lundstrom; Abuin; et al. Co- and Posttranslational Modification of the Alpha(1B)-Adrenergic Receptor: Effects on Receptor Expression and Function. Biochemistry 2002, 41 (13), 4281–4291.

113. Hanson, M. A.; Brooun; Baker, K. A.; et al. Profiling of Membrane Protein Variants in a Baculovirus System by Coupling Cell-Surface Detection with Small-Scale Parallel Expression. Protein Expr. Purif. 2007, 56 (1), 85–92.

114. Aloia, A. L.; Glatz, R. V.; McMurchie, E. J.; et al. GPCR Expression Using Baculovirus-Infected Sf9 Cells. Meth. Molec. Biol. 2009, 552, 115–129.

115. McCusker, E. C.; Bane, S. E.; O’Malley, M. A.; et al. Heterologous GPCR Expression: A Bottleneck to Obtaining Crystal Structures. Biotechnol. Prog. 2007, 23 (3), 540–547.

116. Asada; Uemura; Yurugi-Kobayashi; et al. Evaluation of the Pichia pastoris Expression System for the Production of GPCRs for Structural Analysis. Microb. Cell Fact. 2011, 10, 24.

117. Davenport, E. A.; Nuthulaganti; Ames, R. S. BacMam: Versatile Gene Delivery Technology for GPCR Assays. Meth. Molec. Biol. 2009, 552, 199–211.

118. Kost, T. A.; Condreay, J. P.; Ames, R. S.; et al. Implementation of BacMam Virus Gene Delivery Technology in a Drug Discovery Setting. Drug Discov. Today 2007, 12 (9–10), 396–403.

119. Scott; Kummer; Tremmel; et al. Stabilizing Membrane Proteins through Protein Engineering. Curr. Opin. Chem. Biol. 2013, 17 (3), 427–435.

120. Serrano-Vega; Tate. Transferability of Thermostabilizing Mutations between Beta-Adrenergic Receptors. Molec. Memb. Biol. 2009, 26 (8), 385–396.

121. Stevens, R. C.; Cherezov; Katritch; et al. The GPCR Network: A Large-Scale Collaboration to Determine Human GPCR Structure and Function. Nat. Rev. Drug Discov. 2013, 12 (1), 25–34.

122. Siu, F. Y.; de Graaf; et al. Structure of the Human Glucagon Class B G-Protein-Coupled Receptor. Nature 2013, 499 (7459), 444–449.

123. Koth, C. M.; Murray, J. M.; Mukund; et al. Molecular Basis for Negative Regulation of the Glucagon Receptor. Proc. Natl. Acad. Sci. USA 2012, 109 (36), 14393–14398.

124. Rasmussen, S. G.; Choi, H. J.; Fung, J. J.; et al. Structure of a Nanobody-Stabilized Active State of the Beta(2) Adrenoceptor. Nature 2011, 469 (7329), 175–180.

125. Rasmussen, S. G.; DeVree, B. T.; Zou; et al. Crystal Structure of the Beta2 Adrenergic Receptor-Gs Protein Complex. Nature 2011, 477 (7366), 549–555.

126. Steyaert; Kobilka, B. K. Nanobody Stabilization of G Protein-Coupled Receptor Conformational States. Curr. Opin. Struct. Biol. 2011, 21 (4), 567–572.

127. Muyldermans; Baral, T. N.; Retamozzo, V. C.; et al. Camelid Immunoglobulins and Nanobody Technology. Vet. Immunol. Immunopathol. 2009, 128 (1–3), 178–183.

128. Nath; Atkins, W. M.; Sligar, S. G. Applications of Phospholipid Bilayer Nanodiscs in the Study of Membranes and Membrane Proteins. Biochemistry 2007, 46 (8), 2059–2069.

129. Denisov, I. G.; Grinkova, Y. V.; Lazarides, A. A.; et al. Directed Self-Assembly of Monodisperse Phospholipid Bilayer Nanodiscs with Controlled Size. J. Am. Chem. Soc. 2004, 126 (11), 3477–3487.

130. Bayburt, T. H.; Grinkova, Y. V.; Sligar, S. G. Self-Assembly of Discoidal Phospholipid Bilayer Nanoparticles with Membrane Scaffold Proteins. Nano Lett. 2002, 2, 853–856.

131. Osenkowski; Wang; et al. Direct and Potent Regulation of Gamma-Secretase by Its Lipid Microenvironment. J. Biol. Chem. 2008, 283 (33), 22529–22540.

132. Service, R. F. Structural Genomics. Tapping DNA for Structures Produces a Trickle. Science 2002, 298 (5595), 948–950.