Abstract
In recent years, a number of peptides containing a cyclic structural fold have been described. Among them, the cyclotide family has been widely reported in different plant tissues. Cyclotides are small cyclic peptides containing 6 conserved cysteine residues connected by disulfide bonds, forming a knotted cyclic structure known as the cyclic cystine knot. This structural scaffold confers enhanced stability against chemical, thermal, and proteolytic degradation. Because of this stability and their multifunctionality, including insecticidal, antimicrobial, and anti-HIV (human immunodeficiency virus) action, much effort has gone into elucidating the structure-function relationships of cyclotides. This review focuses on recent findings on the gene structure, precursor formation and processing, and protein folding of the cyclotide family, shedding some light on the molecular mechanisms of cyclotide production. Because cyclotides are clear targets for drug development and biotechnological applications, their chemical synthesis, production in heterologous systems, and protein grafting are also addressed.
The cyclotides are a family of cyclic peptides of 28 to 37 amino acid residues that are commonly found in plant species from the Violaceae, Rubiaceae, Cucurbitaceae, Poaceae, and Fabaceae families. 1 –4 This protein group is characterized by a peculiar cyclic backbone, with no free N- or C-terminus. In this family, 6 conserved cysteine residues form 3 disulfide bonds. This unusual combination of structural features forms the motif known as the cyclic cystine knot, or simply CCK. 5 –8 This motif provides cyclotides with structural rigidity and a remarkable stability against chemical, thermal, and enzymatic degradation. 9
The cyclotides are divided into 3 subfamilies: Möbius, bracelets, and trypsin inhibitors. The main difference between the Möbius and bracelets is the presence of a conserved proline residue in the cis conformation in loop 5 of the Möbius subfamily, causing a 180° twist in the loop that is not seen in the bracelet subfamily. 1,10 –12 This conformation occurs because of the presence of a tryptophan residue (Trp19) that precedes the proline (Pro20). A hydrophobic interaction between the pyrrolidine ring of the proline and the aromatic side chain of the tryptophan favors the cis-proline conformation (cis-Pro). 1,11 –14 The trypsin inhibitor subfamily is currently composed of only 2 members, isolated from Momordica cochinchinensis seeds and named MCoTI-I and MCoTI-II (Figure 1 ). 12,15 Their classification generates discussion because these 2 peptides do not show significant primary structure homology to other cyclotides, despite the presence of the CCK structural motif. 11,16 In addition to the 3 subfamilies mentioned, hybrid cyclotides have been reported, such as kalata B8, extracted from Oldenlandia affinis, whose loops 2 and 3 show a striking resemblance to the same loops of the Möbius kalata B1 and B2. In contrast, its loop 5 is similar to that of the bracelets circulin A and B. Finally, loop 6 shows the same isomerization observed in MCoTI-II. Besides being a structural hybrid with properties of all subfamilies, kalata B8 shows a higher hydrophobic character than other cyclotides synthesized by O affinis. 12,17 Moreover, as will be further described below, gene construction also seems to be variable for several cyclotides, and all these properties could lead to functional promiscuity, which is defined as a single structure having multiple functions. 18 Since the discovery of kalata B1 with uterotonic activity, cyclotides have been characterized by multiple activities.
Among these actions, insecticidal, 2 nematicidal, 19 molluscicidal, 20 bactericidal, 21 antiviral, 22 antitumoral, 23 hemolytic, 3 and antifungal 21 activities have been highlighted. These activities are seen in different species (Table 1 ). Finally, because of the high stability and multifunctionality, the cyclotides framework and its use as a potential scaffold for drug design will also be discussed.

Three-dimensional structures of (A) bracelet cycloviolacin O1 (pdb:1DF6), featuring a small 3-10 helix in loop 3; (B) Möbius kalata B1 (pdb:1NB1); and (C) MCoTI (pdb:1HA9). The 6 conserved Cys residues involved in disulfide bond formation are marked in a light shade
Multiple Examples of the Biological Activity of Cyclotides
Modulation of Cyclotide Gene Expression
Despite the well-known functional promiscuity of cyclotides, their distribution and molecular variety in the plant kingdom have just begun to be explored during the last 2 decades. The extraordinary diversity of this peptide family within a single species was previously demonstrated by Trabi and Craik 24 who showed that the Violaceae Viola hederacea was able to synthesize at least 50 different cyclotides. This surprising number suggests that the widespread diversity of cyclotides in a single organism could be related to a wide range of different biological functions. Studying the relations between structural diversity and broad bioactivities, multiple cyclotides were isolated from V odorata 3 —7 Möbius and 6 bracelet members—revealing huge differences among them in their ability to cause hemolysis (cycloviolacin O24 caused 75% hemolysis, whereas cycloviolacin O14 caused only 11% of the same activity). However, no proteolytic degradation by pepsin, trypsin, and thermolysin was observed for any cyclotide tested (cycloviolacin O13, O14, O16, and O24), in keeping with the framework stability. This work revealed that the sequence variation can modulate a differential activity between these cyclotides, even with a single residue replacement, by variations in the peptide epitopes, but the overall structure of the CCK motif remains the same. 3,24
Aiming to understand cyclotide genic construction and to explain why individual plants typically express multiple cyclotides, several studies have been performed. In a pioneering work, Jennings et al 25 studied cyclotide gene expression in O affinis by Northern blot hybridization analysis and a greater expression level of kalata B1 mRNA was observed in young leaves than in mature leaves. Otherwise, kalata B1 expression was relatively low in roots, as was the expression of several different cyclotides. 25 The diversity of cyclotides and their expression in different plant tissues was also investigated by Trabi and Craik 24 studying leaves, petioles, flowers, pedicels, roots, and bulbs from V hederacea. In addition, Simonsen et al 7 revealed the variation in cyclotide expression profiles caused by climatic changes and seasonal fluctuations, where the same plant species can express different cyclotides in different locations; also, they observed a higher production of some cyclotides in the warm season, whereas others could only be found during winter. 7,24 Plan et al 26 also observed different cyclotide synthesis patterns in O affinis, in which kalata B1 and kalata B2 showed a constitutive expression, whereas kalata B5 was produced only under specific conditions, providing evidence of a possible expression modulation caused by differential physiological functions of each cyclotide in the plant’s needs. 26
Possibly, multiple cyclotide expression occurs as a result of the presence of numerous cyclotide genes, combining single genes encoding different cyclotides with several genes encoding cyclotides in tandem repeats, where cyclotide expression can be modulated by several mechanisms. 25,27 Regulation factors of cyclotide synthesis could comprise differences in mRNA splicing and translation or differences in the precursor protein processing to produce several and different mature cyclotides. 23,28
Thus, differential cyclotide expression and its multiple activities, as previously described, suggest that the production of some cyclotides could be modulated by biotic and/or abiotic stresses, with possible recognition of infective agents and protection against pathogens, implying an important role in the plant’s defense system. 14,24,26,29
General Structure of Cyclotide Precursors
Cyclotides are gene-encoded peptides produced ribosomally as multidomain precursor proteins, and they have a unique genetic construction. 30,31 Their expression begins with the synthesis of a linear precursor protein containing, on average, 100 to 200 amino acid residues, according to the Cyclic Proteins Database (CyBase). 32 The precursor proteins usually share the same basic and highly conserved organization, which is essentially formed by (1) an initial endoplasmic reticulum signal domain, with 18 to 30 residues; (2) an N-terminal prodomain with 22 to 55 residues, which is the most variable region among precursors; (3) an N-terminal repeat domain with 16 to 20 residues; (4) a mature cyclotide domain with 28 to 37 residues; and (5) a hydrophobic C-terminal tail region with 3 to 11 amino acids (Figure 2A). 29 –31,33,34

A. Basic general construction of a linear cyclotide precursor protein. Precursors can be separated into 2 distinct parts: nonrepeated region, involving the ER-Signal and the NTPD, and repeated region, comprising the NTR, the MCD, and the CTR, which might be present 1 to 3 times, containing equal or different cyclotide domains. In the MCD, there is a highly conserved pattern of 6 cysteine residues, found in every cyclotide sequence known until now, which is responsible for the disulfide bond formation and represents the key to the CCK motif structuration. B. Schematic comparison among the diversity on genic construction of full-length precursors elucidated from Rubiaceae, Violaceae, and Fabaceae species. The representation shows the comparative real size of each precursor in the amino acid range, where the longest is Oak4 with 210 residues and the shortest is Mra13 with just 100 residues. Commonly, Rubiaceae and Violaceae cyclotide genes express mRNAs coding for precursor protein with the same domain structuration (A), but not all precursors display the same construction. Whereas most have single- or multicyclotide domains (with possible NTR/MCD/CTR repetitions), some demonstrate unique variations, such as the absence of specific regions (like Vov1 and Mra13, without the CTR because of a mutation in the C-termini generating a stop codon) or the presence of particular regions (like Tricyclon and Mra4, with large CTRs after the first cyclotide domain). Recently discovered Fabaceae precursors also showed discrepant divergence from the previously known ones, presenting a chimeric construction of the cyclotide gene, fusing a cyclotide and an albumin domain. These novel precursors have an ER-Signal directly followed by the cyclotide domain, linked to an albumin domain (albumin1 a-chain) and a CTR, devoid of the typical NTPD and NTR, sharing the same arrangement of the albumin1 genes in the legume family, replacing the PA1b-chain
In general, the precursors have 2 distinct regions: one that does not repeat itself, formed by the endoplasmic reticulum signal domain and N-terminal prodomain, which appear only once in the precursors, and another that can be repeated up to 3 times, covering the N-terminal repeat domain, mature cyclotide domain, and C-terminal tail region, where each repetition will result in 1 mature cyclotide that could be identical or different. 32,35
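The domain sizes quoted above can be combined into a rough length estimate for a full precursor. The sketch below is only an illustration: it assumes that the domains are simply concatenated with no linker residues and that every repeat carries all 3 repeated domains, neither of which holds for every precursor. The names DOMAIN_RANGES and precursor_length_bounds are hypothetical and chosen for this example.

```python
# Domain length ranges (in residues) as quoted in the text for a typical precursor.
DOMAIN_RANGES = {
    "ER_signal": (18, 30),   # endoplasmic reticulum signal domain
    "NTPD": (22, 55),        # N-terminal prodomain
    "NTR": (16, 20),         # N-terminal repeat domain
    "MCD": (28, 37),         # mature cyclotide domain
    "CTR": (3, 11),          # C-terminal tail region
}
NON_REPEATED = ("ER_signal", "NTPD")   # appear once per precursor
REPEATED = ("NTR", "MCD", "CTR")       # appear 1 to 3 times


def precursor_length_bounds(n_repeats):
    """Return (min, max) precursor length for n_repeats NTR/MCD/CTR units.

    Assumption: plain concatenation of domains, no linkers, no missing domains.
    """
    if not 1 <= n_repeats <= 3:
        raise ValueError("the text describes 1 to 3 repeated units")
    shortest = sum(DOMAIN_RANGES[d][0] for d in NON_REPEATED)
    longest = sum(DOMAIN_RANGES[d][1] for d in NON_REPEATED)
    shortest += n_repeats * sum(DOMAIN_RANGES[d][0] for d in REPEATED)
    longest += n_repeats * sum(DOMAIN_RANGES[d][1] for d in REPEATED)
    return shortest, longest


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, precursor_length_bounds(n))
```

Under these assumptions, a single-domain precursor falls between 87 and 153 residues and a 3-domain precursor between 181 and 289 residues, which brackets the 210 residues reported for the 3-domain Oak4.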
All the cyclotide precursors start with an endoplasmic reticulum signal domain, located at the N-terminal, which comprises a conserved sequence that directs the protein for a secretory pathway to the endoplasmic reticulum where the precursor splicing starts. 25 The second part of the precursor comprises the N-terminal prodomain, which presents poor sequence or length conservation and large variability among the cyclotides in terms of size and amino acid identity. Until now, the N-terminal prodomain has no known function in precursor biosynthesis. 25,27 The next domain is the N-terminal repeat domain, with high sequence conservation and similarity among molecules from the same species, while presenting low sequence identity in comparison with cyclotides from different species. N-terminal repeat domain function, as well as that of the C-terminal tail region (a small hydrophobic tail that comes after the mature cyclotide domain), probably involves the excision and cyclization of the mature cyclotide domain, flanking and signaling the cleavage site for enzymes involved in the precursor processing through conserved amino acids. 29 –31 The mature cyclotide domain presents a great variability in sequence but has a highly conserved pattern of cysteine residues (Figure 2A), which exists within all cyclotides described up to now. 32,33,36
Searching for Cyclotide Precursors
Primarily, extraction, purification, and sequencing of cyclotides provide the first data about peptide amino acid content, and then, based on known cyclotide sequences, transcript analysis can be performed, where primers can be designed for the isolation of cyclotide cDNA by reverse transcriptase-polymerase chain reaction, allowing cDNA library screening. 25,27,29 Knowledge about this cyclic family has grown exponentially through RACE experiments (Rapid Amplification of cDNA Ends), empowering the scan for novel and full-length precursors. Consequently, new approaches based on the knowledge of full-length precursors are being applied, where conserved elements are identified in the endoplasmic reticulum signal domain (ie, the amino acid sequence AAFALPA, present in various Violaceae species) and recognized as targets for primer design, which allow the search for semifull precursors without cDNA library screening. 7,25,27,29,37
Besides, through bioinformatics, analyses have been carried out, increasing the search for new cyclotides, precursors, or cyclotide-containing plants. Through regular expression and similarity patterns, mature cyclotide domains have been exhaustively searched in public genomic databases, where several cyclotide-like precursor proteins were found in Poaceae species. 38 Indeed, this technique is very helpful in the postgenomic era because several genes of unknown function are deposited in public databases. To date, there are 135 440 924 nucleotide sequences deposited in GenBank, many of them defined as hypothetical sequences, as is the case of Poaceae cyclotide-like precursor sequences.
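A regular expression search of this kind can be sketched as follows. This is not the pattern used in the cited study; the loop sizes are a permissive union of the ranges given in this review (3 to 5 residues in loop 1, 4 to 5 in loop 2, 4 to 8 in loop 3, 1 in loop 4, and 4 to 5 in loop 5), and the function name find_cck_candidates is hypothetical. The kalata B1 sequence used for the check is the commonly reported one and is not given in this text, so it should be treated as an assumption.

```python
import re

# Cysteine spacing of a putative mature cyclotide domain in its linear (gene-order) form.
# Loop 6 spans the cyclization point, so it is not part of the pattern.
# [^C] keeps additional cysteines out of the loops (a simplifying assumption).
CCK_SPACING = re.compile(
    r"C([^C]{3,5})"   # loop 1
    r"C([^C]{4,5})"   # loop 2
    r"C([^C]{4,8})"   # loop 3
    r"C([^C]{1})"     # loop 4
    r"C([^C]{4,5})"   # loop 5
    r"C"
)


def find_cck_candidates(protein):
    """Return (start, end, loop_lengths) for each non-overlapping pattern hit."""
    hits = []
    for match in CCK_SPACING.finditer(protein.upper()):
        loop_lengths = [len(loop) for loop in match.groups()]
        hits.append((match.start(), match.end(), loop_lengths))
    return hits


if __name__ == "__main__":
    # Commonly reported kalata B1 sequence (assumption, not taken from this text).
    kalata_b1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"
    print(find_cck_candidates(kalata_b1))
```

A hit only indicates a compatible cysteine spacing. As noted for the Poaceae sequences, such candidates remain hypothetical until they are confirmed at the protein level.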
In addition, expressed sequence tags from O affinis leaves were recently sequenced, representing the first significant sequencing effort to discover novel precursors (eg, O affinis kalata 5 [Oak5], which produces the kalata B19 cyclotide) and to find new proteins potentially involved in cyclotide precursor processing. 35 This approach represents a step forward in the knowledge about cyclotides, helping us to understand the basis of their molecular diversity, genetic construction, and biosynthesis.
Currently, there are about 100 precursor sequences deposited in CyBase, including complete precursors (which have the entire translated sequence, from the initial methionine to the last amino acid of the protein), partial precursors (which have incomplete amino acid sequences), and hypothetical precursors (which were discovered through bioinformatics analysis), representing a huge advance over the past decade. 32 Thus, in the coming years, with the development of biotechnological approaches, it is expected that other unknown molecules will be revealed as well as new proteins or enzymes that could be involved in processing the precursor, which represents a significant effort toward improving the understanding of cyclotide biosynthesis.
Diversity in Precursor Structures
Cyclotide precursors might contain 1 to 3 mature cyclotide domain copies, containing the same mature peptide or different peptides, with the possibility of observing several types of precursors in the same organism. O affinis (Rubiaceae) was the first source of cyclotide isolation and also had the first cyclotide precursor elucidated, expanding research to the genic level. This member is currently known to have an enormous variability of cyclotide precursor proteins, presenting 3 kinds of precursors: (1) some with only 1 cyclotide domain, such as Oak1, Oak3, and Oak5, which produce kalata B1, kalata B7, and kalata B19, respectively; (2) a precursor with 2 different cyclotide domains, such as Oak2, which has 1 kalata B6 domain followed by 1 kalata B3 domain; and (3) the precursor with 3 equal domains, Oak4, containing 3 domains of the cyclotide kalata B2 (Figure 2B). 25,27,28,35,39
After studies with O affinis cyclotide precursors, other species began to be screened. Precursors from the Violaceae family were also elucidated. Some of them are single-domain precursors, such as the V odorata cycloviolacin precursors (Voc1, Voc2, and Voc3, which produce cycloviolacin O8, cycloviolacin O11, and cycloviolacin O13, respectively). Multidomain precursors have also been observed, such as the V odorata kalata 1 precursor (Vok1), which has 3 cyclotide domains, with 1 kalata B1 domain between 2 kalata S domains. 27,35 Moreover, Viola, Melicytus, and Gloeospermum species also show several monodomain and multidomain precursors, all of them having sequence similarities to previously described cyclotide precursors. 27,29,40 However, some uncommon variations can be observed in these precursors, as in the case of the V odorata violacin A precursor (Vov1) and the M ramiflorus 13 precursor (Mra13), which do not have the C-terminal tail region because of a mutation that generates a stop codon at the end of the sequence, producing linear peptides. 3,40 Another interesting structural deviation (Figure 2B) from the usual precursors involves the extremely large C-terminal tail region that follows the first mature cyclotide domain in the multidomain V tricolor tricyclon precursor (Tricyclon) and the M ramiflorus 4 precursor (Mra4). 37,40 In addition, an important difference between the gene structure of Rubiaceae and Violaceae precursors is the presence of an intron in the signal peptide of Rubiaceae genes that is not present in the Violaceae genes. 41
In 2005, the first cyclotide-like precursor from a Poaceae species was observed in Zea mays at the transcript level, 42 and subsequently, Mulvenna et al 38 identified other precursors in Poaceae species by bioinformatics and expression analysis, including in crop plants such as rice, wheat, sorghum, and sugarcane. These precursors show some differences from precursors observed in eudicot species. They are smaller, containing 80 to 90 amino acid residues, and have no described N-terminal repeat domain, being composed of only the endoplasmic reticulum signal domain, the N-terminal prodomain, the mature cyclotide domain, and the hydrophobic tail. These precursors are split into 2 groups, differing in the size of the hydrophobic C-terminal tail region and the spacing between the first 2 cysteine residues. 37 The first group is characterized by a short C-terminal tail region and the typical spacing of 3 amino acid residues between the first 2 cysteines. The second group has 4 or 5 amino acid residues in loop 1 and a long hydrophobic tail. The putative mature sequence from these precursors might not be cyclic because of the absence of an asparagine or aspartate residue at the C-terminus, which seems essential for cyclization. 27,30 However, there are no reports of cyclotides from Poaceae at the protein level, which prevents a better understanding of these putative cyclotides. 38
Until very recently, single- and multidomain precursors had been identified from Rubiaceae, Violaceae, and Poaceae families. However, cyclotides isolated from Clitoria ternatea made this the first member of the Fabaceae family reported to have these peptides and led to its becoming a new target for further studies to understand the gene structure of Fabaceae cyclotide precursors. 43 Therefore, through transcript sequencing, a novel structure of the precursor protein was identified in C ternatea, called the C ternatea M precursor (CterM), 9 and subsequently, a suite of cyclotide precursors was also characterized in the same species (Ctc1 to Ctc14). 4 This Fabaceae precursor was completely different from others studied previously, showing a cyclotide domain fused into an albumin precursor. This domain could be derived from a modified double cyclotide domain precursor or a modified albumin-1 (A1, a Fabaceae Albumin) precursor, where the replacement of one of the domains of any precursor is extremely intriguing. This new chimeric precursor differs from the others found so far, in that it presents an unusual structure, neither having the N-terminal prodomain and N-terminal repeat domain before the mature cyclotide domain, nor having the C-terminal tail region after the mature cyclotide domain, displaying a configuration with just the endoplasmic reticulum signal domain followed directly by the mature cyclotide domain (replacing the A1b chain domain) and, after a short link region, the A1a chain domain, and ending the sequence with C-terminal tail region (Figure 2B). 4,9
This novel precursor structure reveals the complexity of cyclotide biosynthesis and could present multiple synthetic pathways for its production, expanding our understanding of genic architecture and evolution of cyclotides within flowering plants even more. Full-length precursors elucidated from Rubiaceae, Violaceae, and Fabaceae families are given in Table 2 as well as a comparison of the amount of amino acids and the number of cyclotide domains.
Full-length Precursors From Rubiaceae, Violaceae, and Fabaceae Species, Showing Wide Distribution of Single- and Multidomain Precursors in the Plant Kingdom, and a Comparison of the Amino Acid Amounts
Abbreviations: Oak, Oldenlandia affinis kalata; Vok, Viola odorata kalata; Voc, Viola odorata cycloviolacin; Vov, Viola odorata violacin; Mra, Melicytus ramiflorus; VbCP, Viola baoshanensis cyclotide precursor; Cter, Clitoria ternatea; ctc, Clitoria ternatea cyclotide; AA, amino acids; REF, references; #, the precursor number.
From Precursor Biosynthesis to Cyclotide Maturation
After synthesis, the linear precursor protein undergoes posttranslational processing steps, including the splicing and folding of the mature cyclotide domain. The molecular folding occurs by oxidation of cysteine residues and formation of disulfide bridges in the MCD. It is believed that this process is facilitated by a protein disulfide isomerase. 29,36,41 Previous studies have shown that protein disulfide isomerases increase the in vitro folding yield of cyclotides by interacting with the cyclotide domain of the precursor protein; they were also able to take up misfolded cyclotides and refold them appropriately. 6,41 In addition, transcriptional analyses in the same study showed coexpression of protein disulfide isomerase and the kalata B1 precursor, which provides strong evidence for a molecular interaction between them and lends support to the idea that protein disulfide isomerase plays an essential catalytic role in the oxidative folding of cyclic cystine knot proteins. 41
Protein disulfide isomerases are the most plentiful proteins in the endoplasmic reticulum, 44 representing enzymes with oxidoreductase functions and belonging to the thioredoxin protein family. They perform peptide folding in the cellular compartment of eukaryotes through oxidation and reduction reactions, capable of forming, breaking, or shuffling disulfide bonds between the cysteine residues. 41 This oxidative folding mechanism for cyclotides was initially identified in the Rubiaceae family (mainly described for various kalata precursors processed from O affinis) but has recently also been elucidated in the Violaceae family. 29,41
Moreover, in addition to the folding process, precursor processing also passes through a step of splicing with cleavage and separation of the MCD from the rest of the precursor, possibly involving an asparaginylendopeptidase. The asparaginylendopeptidases are plant vacuolar enzymes known to function in secretory pathways that play a crucial role in proteolytic cleavage of asparagine or aspartic acid residues and are also capable of forming peptide bonds. 31,45
The literature describes the action of an asparaginyl-endopeptidase that recognizes a conserved sequence (XXNGLP) and cleaves at the asparagine or aspartic acid residue between the cyclotide domain and the C-terminal tail region, 45 releasing the C-terminal tail region, with simultaneous cleavage at the last residue of the N-terminal repeat domain (lysine, glycine, or asparagine). The enzyme thus mediates MCD excision from the precursor and, furthermore, brings about backbone cyclization of the cyclotide domain through a peptide bond between its C-terminus and N-terminus. 29,36
Indeed, the presence of a highly conserved Asn/Asp residue on the C-terminal seems to be crucial for cyclization across all known precursors. 27 Until 2006, the understanding of the cyclization mechanism was extremely limited because of the absence of a natural linear version of cyclotides. However, Ireland et al 3 found the first linear cyclotide: violacin A. This cyclotide has a mutation on the C-terminus that stops translation prematurely. Therefore, the translation of an Asn residue and the C-terminal hydrophobic tail does not occur, producing a linear mature cyclotide. 3 Moreover, previous work had shown that plant asparaginylendopeptidase inhibition by gene knocking down with virus-induced gene silencing causes a significant reduction in this enzyme’s activity, leading to linear cyclotide species accumulation. 30
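The dependence of cyclization on a C-terminal Asn or Asp can be expressed as a simple screening rule for putative mature domains. The sketch below is a heuristic only: the residue is described above as necessary, not sufficient, so a "cyclic" result means "compatible with cyclization" and nothing more. The function name predicted_topology is hypothetical, the kalata B1 sequence is the commonly reported one (an assumption, not given in this text), and the truncated variant is an invented example, not the violacin A sequence.

```python
# Residues that, according to the text, must end the mature domain for cyclization.
CYCLIZATION_RESIDUES = {"N", "D"}  # Asn, Asp


def predicted_topology(mature_domain):
    """Return 'cyclic' if the domain ends in Asn/Asp, otherwise 'linear'.

    Heuristic: a C-terminal Asn/Asp is treated as required for backbone
    cyclization; it does not guarantee that cyclization occurs.
    """
    seq = mature_domain.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    return "cyclic" if seq[-1] in CYCLIZATION_RESIDUES else "linear"


if __name__ == "__main__":
    kalata_b1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"   # commonly reported sequence (assumption)
    truncated = kalata_b1[:-1]                     # hypothetical variant lacking the final Asn
    print(predicted_topology(kalata_b1))
    print(predicted_topology(truncated))
```

The truncated example mimics the situation described for violacin A, in which a premature stop codon removes the Asn residue and the hydrophobic tail and a linear peptide results.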
After cleavage of the mature cyclotide domain, backbone cyclization, and oxidative folding, it is easy to note the conserved molecular patterns of peptide tridimensional structure, like the CCK motif, with β-sheet secondary structures and several loops. 29,36,41,45 In summary, the current model for natural processing of cyclotide precursors suggests that oxidative folding occurs prior to cyclization in the endoplasmic reticulum, where the oxidation of cysteine residues and formation of disulfide bonds within the endoplasmic reticulum happens primarily, with the excision and cyclization of the cyclotide domain from the precursor in a secretory pathway taking place later (Figure 3 ). 41 However, the details of precursor processing, including the order of events, are not fully understood. It is possible that, in planta, the oxidation and folding of the precursor protein may occur first, with the mature peptide being subsequently excised and cyclized. 29,33

Linear Oak1 cyclotide precursor processing in O affinis: proposed molecular model for in vivo formation of kalata B1 cyclotide comprises oxidative folding with disulfide bond formation, through a redox system mediated by a protein disulfide isomerase, approximating the N- and C-terminals. Later, excision of the cyclotide domain from precursor protein and its cyclization occurs through an asparaginylendopeptidase (AEP) catalysis, generating the native peptide cyclic backbone and constituting the CCK motif
Structural Features of Cyclotides
So far, information on the 3-dimensional structure of cyclotides has been obtained by nuclear magnetic resonance. 17 Most cyclotides have a triple-stranded β-sheet and several turns. The first vertex is located between residues Thr15, Cys16, Asn17, and Thr18, and the second vertex is found between residues Thr21, Cys22, Ser23, and Trp25, with both vertices connected by β-turns (as seen in Figure 1). 17,46,47 Furthermore, some bracelets, such as cycloviolacin O2, can also exhibit a small 3-10 helix in loop 3, involving residues Trp12, Ile13, Pro14, Cys15, and Ile. 18,47,48 These structural elements, inside the CCK motif, are stabilized by a hydrogen bond network and by hydrophobic contacts, in addition to the conserved disulfide bonds, thus providing exceptional peptide stability. 46,47
Among Möbius cyclotides, there is virtually no variation in loop length. 39 On the other hand, bracelets often show loop size variation. 7 The sequence alignment analyses (Figure 4 ) of the Möbius and bracelets show a series of conserved residues, including the cysteines. Two loops, loops 1 and 4, are highly conserved in both the number and type of amino acid residues in bracelets and the Möbius, because these loops seem to be extremely important for functional properties. Loop 1 has only 3 amino acid residues, including the conserved Glu8 that seems to be involved in cyclic structure stabilization by the formation of hydrogen bonds with residues Asn16 and Thr17 in loop 3. 46,47 For the bracelet cycloviolacin O2, the Glu8 present in loop 1 forms hydrogen bonds with the Thr16, Val18, and Thr19 residues located in loop 3, stabilizing the small 3-10 helix. 47 Furthermore, loop 4 consists of a single amino acid residue, normally a Ser25 or Thr25, which could be directly related to disulfide bond linking. 34,39 In contrast, loops 2, 3, 5, and 6 are not conserved in terms of amino acid residue number. Loop 2 has around 4 to 5 residues, with position 11 forming part of a hydrophobic region in both bracelets and the Möbius (Val, Tyr, Phe, Thr, and Ile). 17 It is known that the conserved Gly12 in some Möbius cyclotides interacts with Asp16 and Thr17 through hydrogen bonds, helping stabilize the structure. 47 However, in bracelets, this position shows hydrophobic residues (examples being Trp, Tyr, and Phe). 47 Gly13 is much more common in the Möbius. Moreover, Gly2, Gly13, Gly19, and Gly23 adopt a positive φ angle, which is crucial for the establishment of the disulfide bond between CysI and CysIV in loop 3. These glycines help form the smallest ring of the molecule, and the resulting compaction allows the connecting disulfide to penetrate the cyclic core of the molecule, forming the CCK. 2,47

Primary structures of multiple alignment cyclotide subfamilies: A. Bracelets, B. Möbius, and C. Trypsin inhibitors. Conserved Cys residues are shown in gray and represented by Roman numerals I to VI. The black lines show disulfide bond connectivity
In loop 3 of the Möbius, around 4 amino acid residues are observed, but in bracelets, there are around 7 to 8 residues. At positions 21 and 22 of this loop in some bracelet cyclotides, residues seem to be involved in a small 3-10 helix (Lys, Val, Ile, Ala, and Phe). 47 Loop 5 of the Möbius is composed of 4 to 5 amino acid residues, whereas only a 4-residue length was observed for bracelets. In loop 6, around 2 to 8 amino acid residues are observed, and the Arg29 present in most Möbius cyclotides and some bracelets is directly related to the activities of these peptides because of its intrinsic positive charge. 47 This loop also carries a conserved Asn or Asp residue at position 30, because these residues are the cleavage sites of the mature cyclotides during biosynthesis. 25 In addition to these residues, there is also Pro4 in the Möbius and Pro6 in bracelets; these induce twisting of the structure, which could help in backbone cyclization. 46
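Because the backbone is circular, loop 6 spans the point where the linear precursor sequence was joined, so its length has to be counted with wrap-around indexing. The following sketch computes all 6 loop lengths from the positions of the cysteines. It assumes exactly 6 cysteines in the sequence; the function name cyclic_loop_lengths is hypothetical, and the kalata B1 sequence is the commonly reported one rather than a sequence given in this text.

```python
def cyclic_loop_lengths(seq):
    """Return the lengths of loops 1 to 6 for a head-to-tail cyclic peptide.

    Loop k is the stretch between consecutive cysteines; loop 6 runs from
    Cys VI back to Cys I across the cyclization point, hence the modulo.
    Assumes the sequence contains exactly 6 cysteines.
    """
    seq = seq.upper()
    cys = [i for i, residue in enumerate(seq) if residue == "C"]
    if len(cys) != 6:
        raise ValueError("expected exactly 6 cysteine residues")
    n = len(seq)
    lengths = []
    for k in range(6):
        start = cys[k]
        end = cys[(k + 1) % 6]             # wraps from Cys VI to Cys I
        lengths.append((end - start - 1) % n)
    return lengths


if __name__ == "__main__":
    kalata_b1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"   # commonly reported sequence (assumption)
    print(cyclic_loop_lengths(kalata_b1))
```

For this Möbius sequence, the computed lengths are 3, 4, 4, 1, 4, and 7, which fall within the loop sizes described above. The result does not depend on where the circular sequence is opened, so a sequence written from Cys I gives the same values.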
Insecticidal, Molluscicidal, and Anthelmintic Activities
One of the main biocidal activities of cyclotides is related to the potency with which these peptides can control insect pests, making them part of the vast arsenal of plant defense compounds. Initially, it was believed that insecticidal activity could be directly related to inhibition of the insect’s digestive enzymes such as trypsin, chymotrypsin, and α-amylase. 25 However, kalata B1 studies showed that this insecticidal cyclotide was unable to affect the digestive enzymes of pests. On the other hand, kalata B1 is capable of changing the morphology of the cells lining the insect’s intestinal tract, causing edema formation and cell lysis, suggesting a direct effect against insect cells. 49 Moreover, kalata B1 seems to disrupt the plasma membrane of insect epithelial cells by binding to specific receptors, probably forming holes or pores in the lipid bilayer and leading to cell lysis and damaged columnar and goblet cells. 49 However, specific receptors for cyclotides with insecticidal activity have not yet been reported, and their activity appears to be related to the interaction of exposed hydrophobic residues with nonpolar membrane lipids commonly found in the insect’s intestinal tract cells. 49 –51 Studies with kalata B1 have shown that it binds to dodecylphosphocholine (DPC) micelles through hydrophobic residues, including Trp19, Pro20, and Val21 in loop 5; Leu27, Pro28, and Val29 in loop 6; and Val6 in loop 2. The Arg24 and Glu3 residues are responsible for electrostatic interactions by binding to phosphatidylethanolamine exposed on the surface of DPC. 51 Surface plasmon resonance studies were also performed to evaluate the kalata B1–lipid interactions. These data show that cyclotides can bind to phosphatidylethanolamine, which is commonly found on the surface of insect–pest membranes.
Based on studies that show the trend of cyclotides to form tetramers and octamers, 49 it is likely that the mechanism of action involves an accumulation of molecules on the membrane surface, where it forms multimeric structures that generate pores, resulting in disruption of the membrane and change in osmotic balance. These effects in turn lead to lower absorption of food, low larval development, and increased insect mortality. 52,53 In the morphological study previously described by Barbeta et al, 49 cyclotide intake affected Helicoverpa armigera and H punctigera midguts, causing severe disruption of the microvilli, swelling, and ultimately rupture of the gut epithelium cells.
Molluscicidal activity was also observed for several cyclotides such as cycloviolacin O2 and kalata B1, B2, B7, and B8. All of them caused deleterious effects on golden apple snail (Pomacea canaliculata), which is a predator in rice (Oryza sativa) crops. 20 The mechanism of action against the molluscicidal cyclotides has not been elucidated yet. Nevertheless, excessive mucus secretion and an improvement in the mode of retraction of snails into their shells in the presence of cyclotides indicate a process similar to molluscicide metaldehyde toxicity. 20 Metaldehydes and cyclotides cause damage to the skin, walls, and mucocytes of the shellfish’s digestive tract, also leading initially to excessive mucus secretion, followed by changes in energy metabolism. 54 It is unclear how metaldehyde causes cell disintegration and improvement of mucus secretion, but it seems that cyclotides apparently adopt similar mechanisms.
In recent studies, it was found that kalata B1, B2, B6, B7, and cycloviolacin O2 also showed anthelmintic activities against the nematodes Haemonchus contortus and Trichostrongylus colubriformis, which attack the gastrointestinal system of ruminants. It was suggested that the cyclotide activity may occur as a result of interaction with the outer lipid-rich epicuticle of the nematode membrane. 55 –57 When this suggestion was investigated, it was found that modified forms of kalata B1, produced through insertion of lysine residues in several loops, showed enhanced nematicidal activity, especially with a triple insertion of lysine (G1/K1, T20/K20, and N29/K29) in loops 1, 3, and 6. 56 These results indicated the structural regions involved in nematicidal activity and suggested that the inclusion of epitopes in loops helps in interaction with the membrane. 56,57
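The triple-lysine analogue described above can be written down programmatically. The sketch below applies point substitutions in the G1/K1 style to a sequence and checks that the stated wild-type residue is actually present at each position. The kalata B1 sequence is an assumption (it is not given in this review) and is written so that Gly is residue 1, the numbering implied by the G1, T20, and N29 labels. `apply_substitutions` is a hypothetical helper, not a function from any library.

```python
# Assumed kalata B1 sequence (not quoted in the text), numbered from Gly1.
KALATA_B1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"


def apply_substitutions(seq, substitutions):
    """Apply substitutions written like 'G1/K1' (wild type/mutant, 1-based position)."""
    residues = list(seq)
    for sub in substitutions:
        wild, mutant = sub.split("/")
        wt_aa, pos = wild[0], int(wild[1:])
        mut_aa, mut_pos = mutant[0], int(mutant[1:])
        if pos != mut_pos:
            raise ValueError(f"{sub}: positions differ")
        # Guard against applying a label to the wrong numbering scheme.
        if residues[pos - 1] != wt_aa:
            raise ValueError(
                f"{sub}: expected {wt_aa} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = mut_aa
    return "".join(residues)


triple_lys = apply_substitutions(KALATA_B1, ["G1/K1", "T20/K20", "N29/K29"])
print(triple_lys)
print("lysines added:", triple_lys.count("K") - KALATA_B1.count("K"))
```

The wild-type check matters because more than one residue numbering is in use for kalata B1; a label applied under the wrong numbering raises an error instead of silently mutating the wrong position.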
Antimicrobial Activity
Bactericidal activity is an important function related to the cyclotide family. The cyclotides circulin B and cyclopsychotride A show antimicrobial activities against Escherichia coli and Pseudomonas aeruginosa. 21,58 Moreover, kalata B1 showed activity against Proteus vulgaris and Klebsiella oxytoca and against the fungi Candida albicans, C tropicalis, and C kefyr, 21 but the addition of salt (100 mM NaCl) caused a significant reduction in this activity, which suggests that antimicrobial function may depend on electrostatic interactions with the bacterial membrane. 59 Studies showed that kalata B1 and similar analogues interact with synthetic membranes containing dodecylphosphocholine, possibly through affinity between the hydrophobic residues of loops 5 and 6 and hydrophobic lipids. 56 Cycloviolacin O2 shows activity against Salmonella enterica, E coli, P aeruginosa, and multidrug-resistant Klebsiella pneumoniae strains. 10
Anti-HIV (Human Immunodeficiency Virus) Activity
Several cyclotides, such as VHL-1 found in V hederacea, the cycloviolins A to D isolated from Leonia cymosa, palicourein isolated from Palicourea condensata, kalata B1 and B8 isolated from O affinis, and circulins A and B isolated from Chassalia parvifolia, showed deleterious activities against HIV. 60 –62 Anti-HIV activity has been related to hydrophobic characteristics of the cyclotide surface in the Möbius and bracelet subfamilies. 60 It is not clear how these peptides act. 60,63 However, the increase in the cytoprotective effect on cells infected with HIV, in addition to a decrease in infectious viral particles, suggests that the protective effect of cyclotides could occur before the entry of the virus into the host cell. 64,65 It seems that the activity of the Möbius cyclotides could occur through binding of the 2 hydrophobic loops 5 and 6 to the dodecylphosphocholine present in the micelle. The bracelet loops that have been connected with this binding to the micelle are loops 2 and 3. 3,51 However, it is too early to suggest how cyclotides interact with the virus, the cell membrane target, or both, which makes it difficult to use such peptides for the development of therapies against HIV. 65,66
Cancer Cell Toxicity
Cyclotides also show cytotoxic activities against several human cancer cell lines. 23,65,67 The cycloviolacins and vitri-A demonstrated activity against lymphoma and myeloma cell lines at concentrations ranging between 0.96 and 5.0 μM, with potency similar to that of chemotherapy drugs used in cancer treatments. 23,67 Studies were performed using kalata B1 analogues in which the sequence Arg-Arg-Lys-Arg-Arg-Arg was introduced in loops 2, 3, 5, and 6. However, only the substitution of the sequence Asn16-Thr17-Pro18-Gly19 in loop 3 by Arg-Arg-Lys-Arg-Arg-Arg was more efficient in anticarcinogenic activity, showing lower hemolytic effects. 48 The best-studied cyclotide that shows antitumor activities is cycloviolacin O2, which presented cytotoxic activities against 10 different cancer cell lines in a dose-dependent manner. 29,56 Small changes in amino acid sequence altered cycloviolacin activity, which shows that it can be manipulated to produce powerful and safer drugs. 68 For example, modifications in cycloviolacin O2, including the acetylation of Lys residues as well as the binding of 1,2-cyclohexanedione to the guanidine group of Arg residues, caused a 7-fold decrease in cytotoxic potential. Moreover, the esterification of the Glu3 side chain, found in loop 1, with acetyl chloride resulted in a 48-fold decrease in the same activity. 68
Chemical Synthesis and Heterologous Expression
Plant cyclotides are a promising class of biomolecules for both agriculture and the pharmaceutical industry, as widely described. However, a major problem with this class is the low yield of protein recovered from natural extracts. 33,69 Moreover, studies of structure–activity relationships require analogues and mutants, which can be obtained mainly through chemical synthesis, semisynthesis, or heterologous expression. 69 –71
The initial approach used in cyclotide chemical synthesis is a thioester-mediated strategy in which linear cyclotide precursors are synthesized with a cysteine residue at the N-terminus and a thioester linker at the C-terminus. These modifications enable the cyclization process. The first step of the thia zip reaction forms a thiolactone at the C-terminus by nucleophilic attack: the thiol of CysVI attacks the carbonyl group of the C-terminal thioester, releasing the linker and forming a thiolactone. In the next step, the thiol of CysV attacks the same carbonyl group, releasing CysVI and forming a new thiolactone with CysV. The process is repeated until a thiolactone is formed with CysI (the N-terminal residue). Finally, the thiolactone is opened by attack of the N-terminal amino group on the carbonyl group, forming a peptide bond and cyclizing the molecule. 71,72 After the thia zip reaction, the cyclic peptide undergoes oxidative folding. Through this method, about 50% of synthesized molecules adopt the native conformation, whereas the others adopt misfolded conformations. However, the yield of the synthesis can be improved to more than 75% by reducing and refolding the nonnative products. 71
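The improvement from about 50% to more than 75% is what one would expect if the misfolded fraction is recycled. As a rough sketch, assume (this is not stated in the source) that each reduction and refolding round converts the same fraction of the remaining nonnative material to the native fold as the first pass did. The function name and the per-round fraction are illustrative only.

```python
def cumulative_native_yield(per_round_fraction, rounds):
    """Fraction of peptide in the native fold after `rounds` folding passes.

    Assumes each pass folds the same fraction of the remaining nonnative
    material; this is a simplifying assumption, not a measured property.
    """
    if not 0.0 <= per_round_fraction <= 1.0:
        raise ValueError("per_round_fraction must be between 0 and 1")
    nonnative = 1.0
    for _ in range(rounds):
        # Only the still-misfolded material is reduced and refolded again.
        nonnative *= 1.0 - per_round_fraction
    return 1.0 - nonnative


for rounds in (1, 2, 3):
    print(rounds, cumulative_native_yield(0.5, rounds))
```

With a 50% per-pass fraction, the initial fold plus one refolding round gives 75%, and a further round gives 87.5% under this assumption, which is in line with the reported figure of more than 75%.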
Cyclotides can also be synthesized by conventional peptide bond chemistry by guiding the oxidation prior to cyclization, as used in the synthesis of kalata B1, 72 and can be readily synthesized in vitro by using an adaptation of native chemical ligation technology. 73 In this procedure, the backbone is initially cyclized and after that, the oxidative folding reaction may occur. 70
The bacterial heterologous expression method also requires some sequence modifications to permit cyclization. The sequence must be changed to ensure that the first residue is a cysteine. Then, this modified sequence is fused to a methionine at the N-terminus and to a modified intein at the C-terminus. An intein is an internal protein domain 74 whose inducible self-cleavage, when the intein is fused to an affinity domain such as a chitin-binding domain or a polyhistidine tag, permits effective protein purification. Intein-mediated protein splicing is an autocatalytic process and needs no auxiliary enzymes. The chimeric protein can then be retained on the column. 75 The intein system eliminates the need for the exogenous proteases or chemicals that other fusion systems require to remove the carrier. 75 The modified intein provides the thioester at the C-terminus. In this method, 2 steps are required before the thia zip reaction: the first is methionine removal by a Met aminopeptidase, and the second is the N-S acyl shift at the C-terminus. After these steps, the thia zip reaction occurs, leading to cyclization. Subsequently, the cyclized product folds spontaneously in the cytoplasm when the E coli Origami 2 (DE3) strain, an engineered cell line able to promote disulfide bridge formation, is used. This approach was successfully applied to the expression of MCoTI and MCoTII. 70
A third method to obtain mature cyclic products is chemoenzymatic synthesis, in which, unlike the previous methods, the C-terminal thioester is unnecessary. In this strategy, the oxidative folding occurs first and then the cyclization. However, a sequence modification is needed: a lysine or phenylalanine is required at the C-terminus for cyclization to take place. After peptide synthesis and folding, cyclization is mediated by a serine protease (trypsin or chymotrypsin), mimicking the biosynthetic mechanism, because proteases can be induced to synthesize a peptide bond if the reaction conditions are controlled. This is the reason for the need for a specific residue at the C-terminus. 76
Other than these methods, efforts have been made to construct transgenic plants for cyclotide production. 30,31 The production of recombinant proteins in plant cell cultures has advantages over production in traditional microbial and/or mammalian host systems: for example, intrinsic safety, cost-effective bioprocessing, and the capacity for posttranslational modifications. Gillon et al 31 have shown that non–cyclotide-producing plants can express cyclotide genes when transformed. The cyclotide kalata B1 was produced from a precursor of about 11 kDa. The Oak1 cDNA was introduced into Arabidopsis thaliana and Nicotiana tabacum to prove that circular peptides could be produced by plants that are not from the Rubiaceae or Violaceae families and that are not known to produce cyclotides. This work shows that transgenic plants have the biochemical machinery necessary to produce circular proteins such as kalata B1, but they are not as efficient as natural cyclotide producers because they also produce linear variants of kalata B1. 31
Cyclotides: A New Perspective on Therapeutic Treatments
As described above, cyclotides are characterized by a wide variety of activities and high structural stability. 51,59,77 These combined properties have attracted enormous attention from both the scientific community and pharmaceutical industry for new drug development. 25,59,77 Thus, several approaches have been implemented to improve the stability of pharmaceutically interesting peptides, including protein engineering and chemical synthesis of analogues. 13,25,59 Together with these processes, chemical synthesis of peptides, carried out by solid phase synthesis, has led to great advances in the construction and manipulation of new molecules, as previously described.
Various approaches have been applied to improve the stability and activity of cyclotides. One of them is the protein engineering technique commonly known as “protein grafting,” that is, the transfer of a bioactive amino acid sequence onto the surface of a stable folded cyclotide. 78,79 This transfer is successful mainly because the scaffold is structurally stable and rich in disulfide bonds, which makes grafting a promising technique for drug design without loss of activity. A good example of protein grafting is the introduction of a poly-Arg sequence with anticarcinogenic properties into loop 3 of kalata B1; this sequence antagonizes vascular endothelial growth factor (VEGF-A), a physiological regulator of angiogenesis involved in tumor cell proliferation, such as in pancreatic carcinomas, melanoma, and mesothelioma. 65,78 Another example consists of the introduction of the sequence Arg-Lys-Gln in loop 1 of the trypsin inhibitory cyclotide MCoTI. This modification was performed to evaluate the inhibitory activity over other proteases, such as the 3C protease produced by the virus responsible for foot and mouth disease (Aphthovirus epizooticae), which attacks hoofed animals and causes severe losses in the agricultural sector. 16 In the native MCoTI form, no activity against this protease was obtained. Nevertheless, the modified MCoTI form showed moderate activity against 3C protease. 16 Although the structural change yielded only a modest result, this was the first of many studies that could be proposed involving changes in the structures of cyclotides to make them active against other proteases. 65,80
Furthermore, another study involved modifications in MCoTI-II loop 1. The replacement of lysine, a residue involved in trypsin inhibition, by valine or alanine conferred additional activity against human leukocyte elastase at concentrations of 21 and 32 nM, respectively, and the truncation of the Ser-Asp-Gly-Gly residues present in loop 6, “closing the loop,” was responsible for activity against β-tryptase at a concentration of 9 nM. 16 Human leukocyte elastase and β-tryptase are serine proteases that play significant roles in a range of pathological conditions, such as the tissue destruction associated with pulmonary emphysema, rheumatoid arthritis, and cystic fibrosis, among others. 16,81 These results suggest that the structure of cyclotide MCoTI-II is extremely versatile for the development of protease inhibitors against inflammatory diseases.
However, cyclotide engineering has been used not only to add activities. Clark et al 79 synthesized kalata B1 analogues in which the hydrophobic residues (Trp19, Pro20, and Val21) present in loop 5 of kalata B1 were replaced by cationic and polar residues (Lys19, Asn20, and Lys21). In another experiment, the residues Pro20 and Val21 present in loop 5 of kalata B1 were replaced by Asp20 and Lys21. It was discovered that these changes almost abolished the residual hemolytic activity of kalata B1. 79 This could be a vital step in cyclotide-based drug development, reducing side effects on the mammalian host organism. 79
In summary, research into using cyclotides in the development of therapeutic drugs is only beginning, and many more studies regarding their applicability are required. To reduce the time taken, it is necessary to minimize some adverse cyclotide effects such as hemolytic activity and cardiotoxicity. The fact that they are stable and accept changes in their structures can help speed up the process for using them as biotechnological resources. Thus, these peptides may soon be widely used in drug design, and their application in agriculture could begin in the not-so-distant future. 59
Footnotes
Michelle Flaviane Soares Pinto and Renato Goulart de Almeida contributed equally to this work.
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
This work was financially supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Fundação de Apoio à Pesquisa do Distrito Federal (FAPDF); Universidade Católica de Brasília (UCB).
