Abstract
Type III polyketide synthases have a substantial role in the biosynthesis of various polyketides in plants and microorganisms. Comparative proteomic analysis of type III polyketide synthases showed evolutionarily and structurally related positions in a compilation of amino acid sequences from different families. Bacterial and fungal type III polyketide synthase proteins showed <50% similarity, whereas in higher plants similarity was >80% among chalcone synthases and >70% among non-chalcone synthases. In a consensus phylogenetic tree based on 1000 replicates, bacterial, fungal and plant proteins clustered in separate groups. Proteins from bryophytes and pteridophytes grouped immediately next to the fungal cluster, indicating how the evolutionary lineage of type III polyketide synthase proteins has diverged. Physicochemical analysis indicated that the proteins are localized in the cytoplasm and are hydrophobic in nature. Molecular structural analysis revealed a comparatively stable structure with alpha helices and random coils as the major structural components. In silico mutation studies predicted a decline in structural stability upon mutation of the active site residues.
Introduction
Polyketide synthases (PKSs) are a family of enzyme complexes that produce a large class of secondary metabolites known as polyketides. These metabolites exhibit many pharmacologically important properties, including antibiotic, antifungal, antitumor and immunosuppressive activities. Three types of PKSs are known to date: type I, II and III. They differ considerably from each other in structure and function and can be easily distinguished by their physical composition. 1
Type III PKSs are responsible for the synthesis of polyketides such as chalcones and stilbenes. Among the proteins listed in Table 1, the well studied and widely distributed family of chalcone synthases (EC 2.3.1.74) plays a significant role in the biosynthesis of various polyketides.2,3 They exist as multigene families in many plants4,5 and are involved in flower pigmentation, defense response and plant fertility. Type III PKS proteins identified from Zingiber officinale (ZoPKS) show significant variations from the previously identified type III PKSs. 6 Extensive gene duplication followed by functional divergence is believed to have played an important role in generating the biochemical diversity of the PKS superfamily. 3 This diversity provides a rich source of pharmaceutically valuable antibiotics, anti-cancer drugs, biologically active pigments and allelochemicals. For example, resveratrol, a stilbene synthase product from grapes, showed cancer chemopreventive activity in murine models.7,8 Stilbene from red wine is also reported to have a protective role against heart diseases. 2
Table 1. Functionally divergent members of the type III PKS superfamily.
Type III PKSs are structurally simple, small homodimeric proteins of approximately 40–45 kDa, with a catalytic triad of Cys164-His303-Asn336 at the active site. Structural insight into the reaction mechanism elucidated for M. sativa CHS indicates the presence of three interconnected cavities at the active site: the CoA-binding tunnel, a coumaroyl-binding pocket and a cyclization pocket. 9 Each type III PKS monomer utilizes the triad within an internal active site cavity, which is connected to the surrounding aqueous phase by a narrow CoA-binding tunnel. 10 Enzymes in this family have broad substrate specificities, and they generate a wide diversity of polyketide derivatives by varying the starter substrate, 11 the number of acetyl additions and the cyclization reactions. The reaction (Fig. 1) involves the formation of chalcone or stilbene by the stepwise decarboxylative condensation of coumaroyl-CoA with three malonyl-CoA units, followed by intramolecular cyclization of the tetraketide product, known as Claisen-type condensation.12,13 Fatty acid biosynthesis utilizes the same mechanism of carbon-carbon bond formation, and the two biosynthetic pathways share many common features. 12

CHS reaction mechanism. Chalcone synthase condenses three acetyl anions derived from malonyl-CoA decarboxylation with coumaroyl-CoA to form the tetraketide intermediate, which then cyclizes and aromatizes to form the chalcone.
The aim of the present study is to investigate the evolutionary relationship of Type III PKSs among different plant, bacterial and fungal species. The study also aims to find out the sequence conservation, structural features and physicochemical properties across the different type III PKS sequences.
Materials and Methods
Sequence data retrieval
Type III PKS amino acid sequences were retrieved in FASTA format from GenBank (http://www.ncbi.nlm.nih.gov/Genbank/) and Swiss-Prot (http://www.expasy.ch/sprot/). Subsequently, a search was performed using the BLASTp program 14 at NCBI to identify and retrieve potentially related sequences. An E-value threshold of 0.01 was set so that only significant hits were included in the alignment comprising the subfamilies. M. sativa CHS (Swiss-Prot Acc. No. P30074) was used as the query against the non-redundant protein sequence database. From the retrieved PKS sequences, 56 type III PKSs were selected to represent different families of plants, fungi and bacteria. Partial sequence entries were excluded. The species names and database accession numbers of the selected type III PKSs are given in Supplementary material 1.
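The E-value filtering step can be illustrated with a minimal sketch. It assumes the BLASTp hits were exported in the standard 12-column tabular layout (E-value in the eleventh column). The hit names and scores below are invented for illustration and are not data from this study, and treating a hit exactly at the cutoff as significant is also an assumption.

```python
# Hypothetical BLAST tabular output (12-column layout); names and scores are made up.
SAMPLE_HITS = (
    "query_chs\thit_A\t92.1\t389\t30\t0\t1\t389\t1\t389\t0.0\t750\n"
    "query_chs\thit_B\t41.5\t350\t190\t5\t10\t355\t12\t360\t3e-70\t230\n"
    "query_chs\thit_C\t24.0\t120\t85\t3\t40\t155\t60\t178\t0.5\t32.1\n"
)

E_VALUE_CUTOFF = 0.01  # threshold stated in the text


def significant_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return subject IDs whose E-value is at or below the cutoff, in input order."""
    kept = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        e_value = float(fields[10])  # eleventh column holds the E-value
        if e_value <= cutoff and subject_id not in kept:
            kept.append(subject_id)
    return kept


print(significant_hits(SAMPLE_HITS))
```

In this toy input only the first two hits pass the 0.01 cutoff; partial entries would still need to be excluded separately, as described above.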
Alignment process
Type III PKS amino acid sequences representing 56 families were aligned using the multiple sequence alignment program ClustalW2. 15,16 Default gap opening and extension penalties were used. The alignment was examined manually to avoid misalignment. Gaps in the alignment were treated as missing data. In the alignment, proteins were named using an abbreviation followed by the family name.
Phylogenetic analysis
Phylogenetic analysis was carried out by using MEGA4. 17 Neighbor-joining, Maximum Parsimony and Minimum Evolution methods were used to infer the phylogeny across the data.18–20 Bootstrap analysis was performed to check the reliability of the evolutionary tree. 21 Branches corresponding to partitions reproduced in less than 50% bootstrap replicates were collapsed.
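As an illustration of the bootstrap procedure, the sketch below resamples alignment columns with replacement and applies the 50% collapse rule. It is not MEGA4's implementation: the toy alignment and all function names are hypothetical, and the tree-building step between resampling and counting is omitted.

```python
import random

# Toy alignment; sequences are invented for illustration only.
TOY_ALIGNMENT = {"a": "MKVLTA", "b": "MRVLTA", "c": "MKVLSA"}


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def support_percent(clade_found):
    """Percentage of replicates (a list of booleans) in which a clade was recovered."""
    return 100.0 * sum(clade_found) / len(clade_found)


def is_collapsed(support, threshold=50.0):
    """Branches reproduced in less than 50% of replicates are collapsed."""
    return support < threshold


replicate = bootstrap_columns(TOY_ALIGNMENT, random.Random(0))
print(replicate)
print(support_percent([True, True, False, True]), is_collapsed(49.0))
```

Each replicate has the same length as the original alignment and keeps whole columns intact, so every sequence is resampled at the same positions.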
Cavity volume analysis
Substrate binding pockets and their respective cavity volumes were predicted using CASTp, 22 a fast and precise algorithm based on computational geometry methods, including alpha shape and discrete flow theory. The default probe radius (1.4 Å) was used. In the present work, we compared the cavity volumes of eight different type III PKS proteins, viz. chalcone synthase (M. sativa), stilbenecarboxylate synthase 2 (M. polymorpha), type III PKS (M. tuberculosis), tetrahydroxynaphthalene synthase (S. coelicolor), stilbene synthase (A. hypogaea), 2-pyrone synthase (G. hybrida), type III PKS (N. crassa) and pentaketide chromone synthase (A. arborescens). The atomic coordinates of these type III PKS proteins were obtained from the RCSB PDB database.
Physicochemical and structural analysis
The amino acid composition of selected type III PKS proteins was analyzed by ProtParam, an online tool available on the ExPASy proteomics server. 23 Physicochemical properties and subcellular localization were predicted by Protein Calculator v3.3 (http://www.scripps.edu/~cdputnam/protcalc.html) and the SubLoc v1.0 server, 24 respectively. For subcellular localization prediction of bacterial type III PKSs, TargetP 1.1 25 and PSORTb v.3.0 26 were used. The disordered regions in the above mentioned eight type III PKS proteins were studied by PreDisorder 27 (http://casp.rnet.missouri.edu/predisorder.html). Single site mutation studies were performed on the MUpro server, 28 which is available at www.ics.uci.edu/~baldig/mutation.html. Cellular function and hydrophobicity of the proteins were analyzed by the ProtFun 2.2 29,30 and pepstats 31 servers, respectively. Secondary structures were predicted by GARNIER 32 on the EMBOSS server.
Results and Discussion
Sequence alignment
Multiple alignments (Supplementary data 2) showed high sequence similarity across various type III PKS proteins. Around 80% similarity was observed among the chalcone synthases of plant type III PKSs, suggesting high conservation among CHSs from plant species. Fungal and bacterial type III PKSs shared less than 50% similarity, while the non-CHS proteins showed more than 70% similarity. Highly conserved amino acid residues are depicted in Figure 2.
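For readers who wish to reproduce pairwise comparisons from the alignment, a minimal sketch of a percent identity calculation follows. It computes identity only; similarity additionally requires a substitution matrix or residue grouping, which is not specified here. Gapped positions are skipped, in line with treating gaps as missing data, and the two aligned fragments are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gaps are treated as missing data
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned fragments: 7 comparable columns, 6 of them identical.
print(round(percent_identity("MVSV-EIRK", "MVTVEEI-K"), 1))
```

Note that the choice of denominator (ungapped columns here) is a convention; programs differ, so values from this sketch need not match those reported by alignment software.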

Alignment of type III PKS proteins. The amino acid residues that form the catalytic triad (▼), the substrate binding pocket (•) and the cyclization pocket (†) are shown in the alignment. The symbol “↨” represents the conserved non-active arginine residue. The asterisk (*) represents the ‘GFGPG’ loop, which serves as part of the active site scaffold. The residues shaping the geometry of the active site are indicated by the symbol “◊”. The symbol “□” represents conserved cysteine residues.
Residues in the catalytic site
Earlier studies on CHS from M. sativa indicated that four amino acid residues of the catalytic region (Cys164, Phe215, His303 and Asn336) are conserved in all members of the CHS family. 9 Among these residues, ‘Cys-His-Asn’ is known as the catalytic triad in CHSs. 3
In our studies, all type III PKSs, including CHSs and non-CHSs, showed absolute conservation of the catalytic triad. Proteins from bacteria and fungi also retain the conserved catalytic residues (Cys, His and Asn). The ‘Phe215’ residue was replaced with ‘Leu’ in BAS, a non-CHS protein from R. palmatum (Polygonaceae family), which indicates that this substitution is favoured. The conserved asparagine residue (Asn336) was found to be substituted with ‘Ala’ in P. americana CHS (Lauraceae; Q9ZU06), even though this protein showed 75.3% similarity to M. sativa CHS.
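Conservation at a numbered reference position (for example Cys164 of M. sativa CHS) can be checked by mapping the reference residue number to its alignment column. The sketch below assumes a gapped alignment held in a dictionary; the toy sequences and helper names are hypothetical, and a gap in any sequence is counted as non-conserved.

```python
def column_of_residue(aligned_ref, residue_number, gap="-"):
    """Return the 0-based alignment column of a 1-based reference residue number."""
    seen = 0
    for col, ch in enumerate(aligned_ref):
        if ch != gap:
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError("residue number exceeds reference length")


def residues_at(alignment, ref_name, residue_number):
    """Residue found in every sequence at the column of the reference residue."""
    col = column_of_residue(alignment[ref_name], residue_number)
    return {name: seq[col] for name, seq in alignment.items()}


def is_absolutely_conserved(alignment, ref_name, residue_number):
    """True when all sequences carry the same residue at that column."""
    return len(set(residues_at(alignment, ref_name, residue_number).values())) == 1


# Toy alignment: the reference reads M1 C2 F3 H4 N5 once gaps are removed.
TOY = {"ref": "MC-FHN", "seq2": "MCALHN", "seq3": "-C-FHN"}
print(is_absolutely_conserved(TOY, "ref", 2))  # Cys column
print(residues_at(TOY, "ref", 3))              # Phe column, with a Leu substitution
```

In the toy data the Cys, His and Asn columns are invariant, whereas the Phe column carries a Leu in one sequence, mirroring the kind of substitution discussed above.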
Previous studies have reported that the active site residues (Cys, His and Asn) are located at the lower side of hydrophobic pockets or crevices and that these pockets vary according to the substrate. 12 The residue ‘Cys164’ is located at the amino terminal end of the α-helix and ‘Phe215’ at the bottom of the binding pocket. ‘Phe215’ is reported to be involved in the decarboxylation of malonyl-CoA and may also help orient substrates and reaction intermediates at the active site. 33
Residues in the substrate binding pocket
The amino acid residues located at the substrate binding pocket (Ser133, Glu192, Thr194, Thr197 and Ser338) showed a high degree of conservation, with a number of exceptions (Supplementary data 3, Table 1). Sequence alignment clearly showed that all fungal and bacterial type III PKSs substituted ‘Ser133’ with ‘Thr’; both ‘Ser’ and ‘Thr’ belong to the same neutral polar amino acid group. ‘Ser133’ was also substituted in most of the non-CHS members. Residue ‘Glu192’ was absolutely conserved across the alignment, while Thr194, Thr197 and Ser338 showed variations. In most of the fungal and bacterial type III PKSs, ‘Thr194’ was substituted with ‘Cys’. ‘Ser338’ was less conserved among the non-CHSs of angiosperms and the bacterial type III PKSs. In M. sativa CHS (PDB: 1BT5), structural analysis showed that the residue ‘Ser133’ was located at the end of the 4th strand and ‘Glu192’ at the end of the 6th strand. The threonine residues of the substrate binding pocket were located in the 10th alpha helix, with ‘Thr194’ at the beginning and ‘Thr197’ towards the end. ‘Ser338’ was positioned at the start of the 15th alpha helix. The sequence conservation pattern also indicated that a single amino acid change can convert a CHS protein into a non-CHS protein. Substrate preference can therefore be altered, which may be useful in drug development studies.
Residues in the cyclization pocket
The residues that form the cyclization pocket (Thr132, Met137, Phe215, Ile254, Gly256, Phe265 and Pro375) showed a high degree of conservation, with a number of exceptions (Supplementary data 3, Table 2). Thr132 was substituted with ‘Cys’ in most of the fungal and bacterial type III PKSs. CHSs from all plants exhibited ‘Thr132’ conservation, with the exception of Z. officinale, where isoleucine (I) was found instead of threonine. Type III PKSs from the pteridophyte P. nudum (whisk fern) displayed variations in the cyclization pocket residues, i.e., VPS of P. nudum showed three amino acid substitutions (Ser139, Val261 and Leu272 instead of Thr132, Ile254 and Phe265, respectively), while STS displayed a single substitution (‘Met137’ with ‘Leu’). The residues corresponding to ‘Pro375’ exhibited absolute conservation across the various type III PKSs. In the angiosperm CHSs, the residues Met137, Ile254 and Phe265 were highly conserved, whereas the non-CHSs and the fungal and bacterial type III PKSs exhibited variations at these residues. It has been reported that, among these residues, Phe215 and Phe265 are situated at the active site entrance (Jez et al, 2002). In this regard, we infer that these residues might have major catalytic roles.
Cysteine conservation
Investigations on RS3 from A. hypogaea and CHS from S. alba showed cysteine conservation at six positions (Cys65, Cys89, Cys135, Cys169, Cys195 and Cys347). These residues were also found to be conserved in all other chalcone synthases known so far.34,35 In our studies, ‘Cys65’ was replaced by other residues in bacterial and fungal PKSs. Fungal type III PKSs carried ‘Ala’ instead of ‘Cys65’, while bacterial PKSs presented ‘Ala’, ‘Phe’, ‘Ile’ or ‘Tyr’ at this position. ‘Cys89’ was substituted with ‘Thr’, ‘Gly’, ‘Val’, ‘Ala’, ‘Arg’, ‘Ser’ or ‘Glu’ residues in some type III PKSs. In most of the bacterial and fungal PKSs, ‘Val’ was found in place of ‘Cys135’. In THBS of H. androsaemum (Clusiaceae), CHS of P. patens (Funariaceae) and STCS2 of M. polymorpha (Marchantiaceae), ‘Cys135’ was substituted with ‘Ala’. ‘Cys195’ was conserved in most cases, except in Zingiberaceae, Haemodoraceae and Funariaceae. In fungal PKSs, the non-polar neutral residue ‘Ala’ was found in place of ‘Cys195’, with the exception of Pleosporaceae, where the polar residue ‘Thr’ was found. Fungal, bacterial, bryophytic and pteridophytic type III PKSs showed little conservation of the ‘Cys347’ residue. In addition, some angiosperm CHSs and non-CHSs also showed variations in cysteine conservation (i.e., P. americana (Lauraceae) and Z. officinale (Zingiberaceae)).
GFGPG Loop
A highly conserved ‘GFGPG’ loop was detectable in all plant CHSs and non-CHSs. In this loop, the amino acid ‘P’ (proline) constitutes the central part of the cyclization scaffold. 36 Phenylalanine (F) in the ‘GFGPG’ loop was found to be replaced with leucine (L) in THBS of H. androsaemum (Clusiaceae) and STS of P. nudum (Psilotaceae). In silico single site amino acid mutation predictions indicated that the stability of the protein structure increases when leucine replaces phenylalanine in the loop. Consistent with this, MUpro predicted a decline in the structural stability of THBS when leucine was mutated to phenylalanine. In fungal proteins, the ‘GPG’ residues of the loop exhibited absolute conservation, while the first glycine residue was substituted with alanine (A) or serine (S). In P. nodorum (Phaeosphaeriaceae), phenylalanine (the 2nd residue of the loop) was replaced with valine. In bacteria, this loop was conserved in most cases, with very few exceptions. The ‘FGPG’ residues of the loop were highly conserved, but the first glycine residue was substituted with serine in B. subtilis (Bacillaceae) and with alanine in N. farcinica (Nocardiaceae).
Other residues
The amino acid residue ‘Gly238’ is reported to occur in CHSs of Brassicaceae and only rarely in other plant PKSs. 34 Our analysis showed that THBS of Clusiaceae also possesses this glycine residue. In addition, the amino acid residues Pro138, Gly163, Gly167, Leu214, Asp217, Gly262, Pro304, Gly305, Gly306, Gly335, Gly374, Pro375 and Gly376 showed a high degree of sequence conservation and may play a crucial role in shaping the correct geometry of the active site. The residues surrounding the conserved active site residue ‘Cys164’ (‘Gly163’ and ‘Phe165’) also appeared to be highly conserved. In CHSs from Z. officinale, ‘Ala’ took the place of ‘Gly’, but this is unlikely to change the geometry of the active site much, as ‘Ala’ belongs to the same category of non-polar amino acids. In most of the non-CHSs, the position of ‘Phe165’ was occupied by aromatic amino acids such as ‘Tyr’ or ‘His’. In Z. officinale, ‘Phe165’ was replaced with the polar amino acid ‘Thr’. PKS from Haemodoraceae presented the basic polar amino acid ‘His’ at the position corresponding to ‘Phe165’. 2PS from Asteraceae and type III PKSs from fungi and bacteria also substituted ‘Phe165’ with ‘Ala’. These changes might have occurred during gene duplication events in the course of evolution.
Phylogenetic distribution of type III PKS proteins
To determine the evolutionary history of type III PKSs, it is necessary to examine how these proteins are distributed across species. Comparative amino acid sequence analysis was performed to investigate the phylogenetic relationships among plants (monocots, dicots, bryophytes, pteridophytes and gymnosperms), fungi and bacteria. Separate clustering of the plant CHSs and non-CHSs has already been reported, 37 suggesting repeated gene birth, death and reinvention of non-CHS functions throughout the evolution of angiosperms.
In our investigation, a total of 56 Type III PKSs were selected to represent various families in an attempt to study the phylogenetic relationship that exists between them. The evolutionary history was inferred using the Neighbor-Joining method. 18 The bootstrap consensus tree developed from 1000 replicates (Fig. 3) represents the evolutionary history of the taxa analyzed. 25 Branches corresponding to partitions reproduced in less than 50% bootstrap replicates were collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test is shown next to the branches. 21 The evolutionary distances were computed using the Poisson correction method 38 and are in the units of the number of amino acid substitutions per site. All positions containing gaps and missing data were eliminated from the dataset (Complete deletion option). There were a total of 276 positions in the final dataset. The number on each branch indicated the bootstrap probability (%) by re-samplings. The length of each branch is proportional to the estimated evolutionary distance.
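The Poisson correction converts the proportion of differing sites p between two aligned sequences into an estimated number of substitutions per site, d = -ln(1 - p). The sketch below combines this with the complete deletion option. The toy alignment and function names are hypothetical, and treating only '-' and '?' as gap or missing-data symbols is an assumption.

```python
import math


def complete_deletion(alignment, missing="-?"):
    """Drop every column in which any sequence has a gap or missing-data symbol."""
    names = list(alignment)
    length = len(alignment[names[0]])
    keep = [i for i in range(length)
            if all(alignment[n][i] not in missing for n in names)]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p) in substitutions per site."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


# Toy alignment: the fourth column contains a gap and is removed for all sequences.
TOY_SEQS = {"a": "MKV-LTA", "b": "MRVALTA", "c": "MKVALSA"}
clean = complete_deletion(TOY_SEQS)
print(len(clean["a"]), round(poisson_distance(clean["a"], clean["b"]), 4))
```

With one difference in six retained sites, p = 1/6 and d = -ln(5/6), about 0.18, slightly larger than p because the correction accounts for multiple substitutions at the same site.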

Neighbor Joining (NJ) tree showing the separate clustering of plant, fungal and bacterial proteins. Numbers on the branches indicate the bootstrap probability obtained (values lower than 50% are not shown). Groups ‘I’ and ‘II’ in the figure represent bacterial and fungal type III PKSs, respectively, and group ‘III’ indicates plant PKSs.
The phylogenetic tree based on Maximum Parsimony (MP) was obtained using the Close-Neighbor-Interchange algorithm with search level 3, 21,39 in which the initial trees were obtained by the random addition of sequences (10 replicates). There were a total of 276 positions in the final dataset, of which 253 were parsimony informative. The MP based phylogenetic tree for the PKS superfamily diverged into different clusters, while the type III PKS from Nocardiaceae formed an out-group (Supplementary data 4).
The phylogenetic tree of the type III PKS superfamily diverged into three distinct groups (plant, fungal and bacterial proteins). Clustering of plant and fungal proteins occurred with 99% bootstrap support (Fig. 3), while bacterial proteins clustered with 63% support. Non-CHS proteins showed sequential clustering, followed by bryophytic and pteridophytic proteins. From this we infer that plant type III PKS proteins might have evolved successively from simple bryophytes to angiosperms (Fig. 4). When the plant type III PKSs were considered alone, they showed a clear distinction between CHSs and non-CHSs, with a few exceptions (Fig. 5). CHSs from Z. officinale (Zingiberaceae), I. purpurea (Convolvulaceae) and P. americana (Lauraceae) were grouped along with the non-CHSs (75%, 78% and 75.3% similarity to the M. sativa CHS, respectively). CHSs from E. arvense (Equisetaceae) and P. patens (Funariaceae) were also found to cluster with the non-CHSs.

A Consensus tree derived from 1000 bootstrap replicates by Neighbor Joining (NJ) method. Numbering in the branch indicates bootstrap probability obtained.

Phylogenetic tree derived from plant type III PKS alone, showing separate clustering of CHSs and non-CHSs. 1st group indicates non-CHS proteins and the IInd group indicates CHS.
The bacterial PKSs clustered in two distinct clades with a bootstrap support of 63%, and the fungal PKSs were grouped (99% bootstrap support) immediately next to the bacterial clade. Type III PKS proteins from pteridophytes (VPS and STS from P. nudum) and bryophytes (CHS from P. patens and STCS2 from M. polymorpha) clustered well within the tree and were nested by the non-CHSs from monocot flowering plants (A. arborescens). The CHS from the gymnosperm P. strobus (Pinaceae) grouped along with the angiosperms (83% identity to the typical M. sativa CHS). The putative type III PKS (Swiss-Prot ID: Q5YPW0) of N. farcinica (Nocardiaceae) showed 22.4% identity and 41.6% similarity to the typical M. sativa CHS (Swiss-Prot ID: P30073). Further, the clustering of bryophytic CHSs between the fungal and plant proteins suggested that CHS proteins evolved from lower to higher plants, and hence the bryophytic CHS can be regarded as the ancestor of the plant type III PKS superfamily.
Cavity volume and structural analysis on type III PKSs
Cavity volume plays an important role in the strength of the substrate binding interaction and hence any change in it alters the product formation profile 40 of type III PKSs. The potential substrate binding pockets and their corresponding cavity volumes can be predicted from its three dimensional structure using the program CASTp. The program analytically measures the area and the volume of each pocket and cavity, both in solvent accessible surface (SA, Richards’ surface) and molecular surface (MS, Connolly's surface). Studies on protein structure, stability, design, and substrate binding interaction rely on the accurate prediction of cavities in it.
CASTp analyses of eight different type III PKS structures (Figs. 6 and 7) were carried out after removing the water molecules. The number of cavities per protein ranged from 40 to 65. The surface area and the volume of all the pockets in the proteins were measured and the best cavity was selected (as shown in Fig. 6). Usually the largest pocket or cavity is the active site of the protein. The pockets or clefts, which are important for molecular recognition and protein function, 41 possess a well developed mouth opening that provides a direct connection to the exterior of the protein. The mouth leads to a tunnel-shaped cavity. In our analysis, the flexible active site pocket volume ranged from 737 Å³ to 1683 Å³ and the tunnel length from 310 Å to 685 Å. The cavity volume of 2-pyrone synthase at the active site was found to be significantly smaller than that of chalcone synthase. The cavity volumes (Fig. 7) of the eight proteins in increasing order are as follows: N. crassa Type III PKS < A. arborescens PCS < S. coelicolor THNS < A. hypogaea STS < G. hybrida 2PS < M. sativa CHS < M. polymorpha STCS2 < M. tuberculosis Type III PKS. The values obtained by CASTp analysis and the changes in residues of the substrate binding pockets are given in Supplementary data 3, Tables 3 and 4.

Molecular structures of type III PKS showing substrate binding pocket.

Variation of cavity parameters (Å) in different type III PKSs. MtCHS shows the highest values for area, length and volume, while NcPKS shows the lowest volume and AaPCS the lowest area. (MsCHS- M. sativa CHS, AhSTS- A. hypogaea STS, Gh2PS- G. hybrida 2PS, AaPCS- A. arborescens PCS, MpSTCS2- M. polymorpha STCS2, NcPKS- N. crassa Type III PKS, MtCHS- M. tuberculosis Type III PKS, ScPKS- S. coelicolor THNS)
To assess the stability of the protein structures, we analyzed the disordered regions of the eight proteins using PreDisorder. The probability of disorder along each protein sequence is depicted in Figure 8. Evaluation of the disordered regions showed that the type III PKS proteins from plants adopt a comparatively stable structure. In N. crassa, an increase in disorder was observed towards the C-terminal end. According to the predictions from GARNIER, M. sativa CHS has a structure composed of coils, turns, alpha helices and beta strands in proportions of 18%, 14.2%, 46.4% and 25.7%, respectively (Supplementary data 3, Table 6). Coils and turns together constitute about 32%. The alpha helix was also found to be the major structural component in type III PKS proteins.

Probability of disorder in type III PKS proteins. The red curve shows the predicted probability of disorder for each residue in the protein sequence.
Single site amino acid mutation predictions on the active site residues (Cys-His-Asn) were performed with MUpro. The results showed that both ‘His’ and ‘Asn’ mutations decreased the stability of the protein structure (Supplementary data 3, Table 5). A general decline in protein stability was also noticed upon mutation of the ‘Cys’ residue. Even though mutations to Leu, Ala, Ile, Met, Arg, Glu and Asp resulted in enhanced stability, they might alter the substrate specificity and are hence not preferred. This indicates that these residues play major roles in the catalytic functions and specificities of type III PKSs. These predictions can guide in vitro mutation studies.
Physicochemical characteristics
Physicochemical analysis suggested that type III PKSs are hydrophobic in nature and are localized in the cytoplasmic matrix. The disparity in physicochemical properties and molecular structures is modest among type III PKSs from angiosperms, whereas the bacterial and fungal type III PKSs exhibited numerous variations (Supplementary data 3, Table 7). Analysis of amino acid composition (Supplementary data 3, Table 8) revealed that the residues ‘Ala’, ‘Glu’, ‘Gly’, ‘Leu’, ‘Lys’, ‘Val’ and ‘Ile’ were abundant in the proteins (together totalling above 50% of all the amino acids in the examined cases), while ‘Cys’, ‘His’ and ‘Trp’ occurred at very low percentages (<3%).
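The composition percentages reported by tools such as ProtParam amount to a residue count divided by the sequence length. A minimal sketch follows; the short sequence is invented for illustration and is not one of the analyzed proteins.

```python
from collections import Counter


def composition_percent(sequence):
    """Percentage of each amino acid (one-letter code) in a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def combined_percent(sequence, residues):
    """Summed percentage of a set of residues, e.g. the abundant group named above."""
    comp = composition_percent(sequence)
    return sum(comp.get(aa, 0.0) for aa in residues)


toy = "AAGLKVEC"  # hypothetical fragment
print(composition_percent(toy)["A"])     # share of alanine
print(combined_percent(toy, "AEGLKVI"))  # Ala, Glu, Gly, Leu, Lys, Val, Ile together
```

Applied to full-length sequences, the second function gives the combined share of the abundant residues, which can be compared with the 50% figure stated above.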
The subcellular localization predictions by SubLoc v1.0 and PSORTb v.3.0 suggested that all the examined type III PKSs from plants and fungi are localized in the cytoplasmic matrix and lack transmembrane peptides. This indicates that all plant and fungal type III PKSs are functional in the cytoplasmic matrix, whereas in the bacterium M. tuberculosis, CHS is located in the cytoplasmic membrane. PSORTb and TargetP predictions identified that the sequence contained mTP, a mitochondrial targeting peptide, 42 80 amino acids in length. In S. coelicolor, CHS was found to be localized in the bacterial cytoplasmic matrix.
Summary
In silico sequence and structural analysis of type III PKSs showed evolutionarily and structurally related regions in a compilation of amino acid sequences from different families. CHSs from higher plants displayed greater similarity to each other than did type III PKSs from bacteria and fungi. Besides the conserved cysteine residues, the amino acid residues in the active site, the substrate binding pocket, the cyclization pocket and the ‘GFGPG’ loop that forms the cyclization scaffold also exhibited a high degree of conservation across the various type III PKSs. Molecular phylogenetic analysis showed that the type III PKS proteins from plants, bacteria and fungi clustered separately. Proteins from the primitive bryophytes and pteridophytes grouped immediately next to the fungi, reflecting the evolutionary divergence that occurred in the type III PKS superfamily. Structural analysis showed that the secondary structures of the proteins are mainly composed of alpha helices and random coils. In silico sequence analysis revealed that type III PKS proteins are highly hydrophobic in nature and mostly localized in the cytoplasmic matrix.
Abbreviations
polyketide synthase;
2-pyrone synthase;
Phloroisovalerophenone synthase;
Trihydroxybenzophenone synthase
Acridone synthase;
Stilbene synthase 1;
Pentaketide chromone synthase;
Octaketide synthase;
Resveratrol synthase 3;
Benzalacetone synthase;
Aloesone synthase;
Stilbenecarboxylate synthase 2;
Putative 1, 3, 6, 8-tetrahydroxynaphthalene synthase;
Chalcone synthase.
Disclosures
This manuscript has been read and approved by all authors. This paper is unique and not under consideration by any other publication and has not been published elsewhere. The authors and peer reviewers report no conflicts of interest. The authors confirm that they have permission to reproduce any copyrighted material.
Acknowledgements
This study was supported by the Department of Information Technology, Government of India. The authors acknowledge BTISNet, Department of Biotechnology, Government of India for the Bioinformatics facility.
