Abstract
We investigated the relationship between the thermostability of Klentaq1 and factors stabilizing interdomain interactions. When thermal adaptation of Klentaq1 was analyzed at the atomic level, the protein was stable at 300 and 350 K. It gradually unfolded at 373 K and almost spontaneously unfolded at 400 K. Domain separation was induced by disrupting electrostatic interactions in two salt bridges formed by Lys354-Glu445 and Asp371-Arg435 on the interface domain. The role of these interactions in protein stability was evaluated by comparing free energy solvation (ΔΔGsolv) between wild type and mutants. Substitution of Asp371 by Glu or Asn, and also Glu445 by Asn resulted in a positive value of ΔΔGsolv, suggesting that mutations destabilized the protein structure. Nevertheless, substitution of Glu445 by Asp gave a negative value to ΔΔGsolv reflecting increasing protein stability. Our results demonstrate that interactions at the interface domains of Klentaq1 are essential factors correlated with the Klentaq1 thermostability.
Introduction
Based on amino acid sequence comparisons and crystal structure analyses, deoxyribonucleic acid (DNA) polymerase (Pol) can be divided into seven groups including A, B, C, D, X, Y, and reverse transcriptase (RT).1–3 The best studied of these groups is DNA Pol A (or Pol I), including the DNA Pol I of Escherichia coli (E.coli), Bacillus DNA Pol I and Thermus aquaticus (Taq) DNA Pol. These enzymes and their derivatives have been used extensively in biotechnological research, such as for polymerase chain reaction (PCR) research and DNA sequencing. 4
Taq Pol is an important enzyme that plays a role in DNA repair and recombination in vivo. 5 This enzyme was first isolated and characterized by Chien and colleagues. 6 The gene sequence of Taq Pol is 38% identical to E.coli DNA Pol I, 7 and the gene was cloned and overexpressed in E.coli by Kaledin and colleagues as well as Lawyer and colleagues.8,9 The heterologous enzyme still shows functional activity and its active form is a single monomeric enzyme with a molecular mass of about 93.9 kDa. 9 Biochemical and three-dimensional (3D) structure analysis showed that Taq Pol consists of three domains. These domains include 5′→3′ exonuclease (residue number 1–290), 3′→5′ exonuclease (residue number 291–419) and 5′→3′ polymerase (residue number 420–832).10–12 A 3D structure comparison between Taq Pol I and E.coli Pol I showed that the C-terminal polymerase domain in both proteins is almost identical, consisting of palm, fingers and thumb subdomains.13,14 Taq Pol I could be digested by a proteolytic enzyme, resulting in a large C-terminal fragment (3′→5′ exonuclease and 5′→3′ polymerase domains) and N-terminal fragment (5′→3′ exonuclease region). The C-terminal fragment is known as the Stoffel Fragment or Klenow Taq Pol I (Klentaq1).4,14 It has been shown that Taq Pol I and Klentaq1 have higher thermostability compared to E.coli DNA Pol I. As a result, Taq Pol I is widely used for amplification of DNA in vitro using PCR. However, this enzyme has lower fidelity compared to E.coli DNA Pol I due to lack of a proofreading function motif. 4
Extensive research on protein thermostability has led to the development of a theoretical foundation and practical applications, especially in the field of protein engineering. Design of mutants with higher thermostability that retain catalytic activity is widely practiced in industry; eg, food, animal feeding, medicine and toiletries. 15 Constructing such mutants is a challenging task because we need to know precisely which key residues/interactions in a protein are responsible for its thermostability and activity. 16 This task can be achieved if we explore the effect of thermal perturbation at the atomic level, which is currently not directly accessible, even after the recent advancement of available instruments. Molecular dynamics (MD) simulation is an alternative approach that provides atomic level information about the effect of thermal perturbation on protein conformation. 17 Key interactions responsible for the maintenance of protein conformation against thermal perturbation can be revealed through examination of the MD simulation trajectory, since we are able to observe gradual thermal unfolding of a protein at the atomic level.
In previous work, we have studied the unfolding process of thermolabile and moderate thermostable elements of Klenow-like DNA Pol I in order to elucidate factors that induce thermal unfolding of both proteins by MD simulation.18,19 We found similarity in the early stage of the unfolding process in both proteins. The unfolding was initiated by the breaking of electrostatic interactions at the interface between 3′→5′ exonuclease and 5′→3′ polymerase domains. In this study, we successfully distinguished thermostability level based on the number of electrostatic interactions at the domain interface. The number of interdomain electrostatic interactions in moderately thermostable DNA Pol I is higher than that of its thermolabile counterpart. Comparison of several thermophilic proteins indicate that there is a correlation between electrostatic interaction and the thermostability of proteins.15,20,21 In order to confirm such a profile, in the present study we performed a similar simulation method using a more thermostable DNA Pol I Taq, ie, Klentaq1. We show here that interface domains of Klentaq1 are stabilized by higher numbers of electrostatic interactions compared to thermolabile elements, thereby strengthening the correlation between the number of electrostatic interactions at the interface domain and the thermostable property of DNA Pol I.
Materials and Methods
System preparation
Initial coordinates for the Klentaq1 structure were obtained from the Protein Data Bank website (http://www.rscb.org) 22 with the entry code of 1QSS. 12 The deoxynucleoside triphosphate (dNTP) was removed from the complex enzyme using Visual Molecular Dynamics (VMD) software, 23 thereby leaving the apoenzyme. This apoenzyme was used as the initial structure of the enzyme throughout the simulations.
Molecular dynamics simulation parameters
All simulations were performed by the simulated annealing with nuclear magnetic resonance-derived energy restraints (SANDER) module of the Assisted Model Building with Energy Refinement (AMBER) 9.0 program. 24 Potential forces working on individual atoms were calculated using the AMBER force field (the Parm99 + frcmod ff03). 25 The implicit solvent (initial value for the generalized born; igb = 5) was used to describe the effect of solvent in MD simulation. 26 The mbondi2 radii and the reaction field cutoff (rbmax= 16) were employed to speed-up calculation of effective Born radii. In order to improve simulation efficiency, only those bond lengths involving hydrogen atoms were constrained with the SHAKE algorithm. 27 The protein consisted of 539 amino acid residues with a net molecular charge of about –8. The system was neutralized using 0.5 M NaCl. Temperature and pressure of the system were controlled using a Berendsen heat bath 28 and the external pressure of the bath was maintained at 1 atm, with relaxation time of 4 ps, respectively. The structure was first subjected to 5000 steps of energy minimization comprised of 500 steps by the steepest descent method, and then followed by the conjugate gradient to remove bad van der Waals contact. The trajectories and coordinates were saved every 1 ps with a 2 fs time step for structural analysis. MD trajectories of corresponding atoms were extracted using PTRAJ program. All images of the simulation systems and visualization of the structure were produced using VMD software.
Free energy calculation
The free energy perturbation (FEP) was calculated using the alchemical free energy method implemented in Scalable Molecular Dynamics (NAMD) 2.6. 29 The system was initially soaked in a cubic box (50 × 50 × 50 Å 3 ) of TIP3P water model. 30 A topological file containing hybrid amino acids for all point mutations was created with a VMD plugin called Mutator, based on standard Chemistry at HARvard Molecular Mechanics (CHARMM) topology. 31 The solvated system was equilibrated to the simulation temperature and 1 bar of pressure, and then run for FEP simulation in the isothermal-isobaric (NPT) ensemble. Electrostatic forces were calculated using Particle Mesh Ewald (PME) method.32,33 Ten Langevin dynamics simulations were performed with δλ of 0.1. The FEP calculations were run up to λ values of 1.0. In each λ, 50000 MD steps were carried out. The trajectories were saved every 2 fs.
Results
Conformational motion of Klentaq1
Thermostability of Klentaq1 was evaluated by performing the MD simulation at various temperatures using the implicit solvent model. Four different temperatures were set, including 300, 350, 373 and 400 K, to model temperature effects on the Klentaq1-structure. The simulation at 300 K was carried out to validate the setting parameters used in the entire simulation, and as reference for the native structure of the protein. The validated parameters at 300 K were then used for the other temperatures to simulate thermal unfolding of the protein.
The simulation result at 300 K was analyzed using a root-mean-square deviation (RMSD), 34 secondary structure composition, solvent accessible surface area (SASA) and B-factor in order to check the applied simulation parameters against the stability of Klentaq1. The RMSD analysis of the backbone atoms at 300 K showed that values converged at the relatively low level of 5 Å, indicating the protein conformation was less deviated from its initial structure (see Fig. 1). In terms of secondary structure composition, we found that α-helix and β-sheet contents were almost steady along the simulation time range. The same pattern was also found in the SASA profile; its value was almost constant at 300 K (data not shown). Both analyses suggest that the protein conformation was stable at secondary and tertiary structures. Further analysis was carried out to measure residual mobility of the protein by calculating root-mean-square fluctuation (RMSF) of Cα in each residue. RMSF of residues reflect flexibility/rigidity regions where those residues reside in the protein structure. The obtained RMSF value was the converted into B-factors and mapped onto the structure of protein. The map (see Fig. 2) suggests that amino acid residues in Klentaq1 are stable during simulation at 300 K. All analyses above thus confirm the validity of simulation parameters used in our study; therefore they were used in whole simulations carried out in this study.
Early unfolding process of Klentaq1
Evaluations of thermal unfolding simulations of Klentaq1 were performed at 350, 373 and 400 K. Simulation at 350 K was chosen based on Barnes 4 to mimic the natural temperature for polymerization. Barnes 4 reported that the optimum temperature of Klentaq1 for polymerization was about 343–348 K, whereas the thermal denaturation (Tm value) based on differential scanning calorimetry (DSC) was 373 K. 35 The 400 K temperature was selected to observe the effect of higher temperatures on the unfolding process of Klentaq1.
The RMSD of Cα was measured at the above temperatures and the results are shown in Figure 1. The RMSD at 400 K showed a sharp increase up to 65 Å early in the simulation (~600 ps) suggesting that the structure of the protein was disrupted. In contrast, the RMSD profile at 350 K yielded values that were constant during 10 Å to 12 ns of simulation. Meanwhile, the profile at 373 K increased smoothly and reached up to 60 Å after 4 ns. Further analyses on conformational stability at 350 and 373 K were carried out by evaluation of the SASA and secondary structure composition. These results showed that the α-helix and β-sheet at 350 K were gradually decreased, for the α-helix from 50% to 45% and for the β-sheet 11% to 10%, while the profile of the total SASA values was increased from 30000 Å 2 to 40000 Å 2 after 3 ns (data not shown). In contrast, the percentages of the secondary structure at 373 K sharply decreased from 50% to 30% for the α-helix and from 13% to 2% for the β-sheet after 4 ns of simulation (see Fig. 3A). To investigate detailed tertiary structure stability, each parameter of SASA values such as the whole surface area, polar region, non-polar region, and backbone of protein, was calculated. Each of these values was sharply increased after 1 ns simulation (see Fig. 3B). These results suggest that temperature at 373 K can be further used to analyze detailed conformational instability of Klentaq1.

Root-mean-square deviation (RMSD) values of the Cα of Klentaq 1 at various temperatures as a function of simulation time.

Thermostability map of Klentaq1 based on B-factor value for 300 K simulation.

(
Several unfolding trajectory snapshots were also analyzed in order to examine qualitative conformation. Snapshots of Klentaq1 at 373 K indicate that the unfolding was a stepwise process (see Fig. 4). The 3D structure of protein is stable up to 250 ps. Local changes, especially at the thumb and fingers subdomains, appeared at 250 to 750 ps. Beyond 750 ps, the thumb subdomain was disrupted followed by domain separation after 1 ns of simulation (see Fig. 4). The unfolding snapshot was consistent with the RMSD and SASA values observed. These data suggest that the unfolding process of Klentaq1 is initiated by disruption of fingers and thumb subdomains, followed by separation between 3′→5′ exonuclease and 5′→3′ polymerase domains.

Conformational changes in Klentaq1 observed in the unfolding profile at 373 K simulation.
Domain integrity and thermostability
In order to elucidate factors responsible for maintenance of interdomain integrity, we further analyzed the simulation trajectory of the thermal unfolding simulation at 373 K, particularly at the event of domain separation between 3′→5′ exonuclease and 5′→3′ polymerase domains. The critical event in the interdomain separation was found when two buried salt bridges formed by pairs of Asp371-Arg435 (I) and Lys354-Glu445 (II) were disrupted (see Figs. 4 and 5). The electrostatic interaction distance of each pair was monitored and the results showed that the distance between these interactions was initially constant at around 2 Å up to 400 ps, then increased slowly up to 8 Å, and then sharply increased up to 100 Å until after 2 ns of simulation (see Fig. 5). These data were consistent with the snapshot result, suggesting that the domains are intact until 400 ps. They were found since the distance between the opposite charged residues in each salt bridge separated within a similar time range to the sharp increase in RMSD and SASA (see Figs. 1 and 3B).

The distance of electrostatic interaction (salt bridges) ie, Asp371-Arg435 (blue line) and Lys354-Glu445 (red line) during 373 K simulation.
In order to further identify the important role of electrostatic interactions at interface domains, in silico mutations were carried out to create a variety of mutants. Four hypothetic mutants were created to evaluate interactions (I) and (II), ie, Asp371Glu, Asp371Asn, Glu445Gln and Glu445Asp (see Table 1). The Gibbs free energy solvation (ΔΔGsolv) of mutants was calculated using FEP methods and their values were compared with the wild type. The Asp371Glu (I.1), Asp371Asn (I.2), and Glu445Gln (II.2) mutants showed positive values of ΔΔGsolv. These values were 25.94, 63.45, and 56.57 kcal/mole respectively (see Table 1), suggesting that these mutants were less stable compared to the wild type. In contrast, the Glu445-Asp (II.1) mutant showed a negative value of ΔΔGsolv (–20.72 kcal/mole), suggesting that this mutant was more stable than the wild type.
The Gibbs free energy solvation (ΔΔGsolv) of mutants.
Discussion
Thermostability is the description of how a protein unfolds globally in response to thermal energy, but still retains its catalytic function. 17 Proteins and enzymes from thermophilic microorganisms offer good model systems by which to study the origins of thermostability. 36 Various factors have been shown to contribute to the thermostability of a protein from thermophiles, such as increased packing density, increased core hydrophobicity and decreased flexibility of the loop region.37,38 These factors have been studied through both experimental (mutagenesis study) and computational methods. 39 The use of a computational approach provides deeper insight into atomic motion in protein structure and dynamic behavior of enzymes.40–42 Based on computational methods, thermostability can be studied by performing high temperature simulation to denature the protein, namely thermal unfolding simulation. 43 In this report, MD simulations of Klentaq1 at elevated temperatures were performed to elucidate factors governing the stability of protein conformation.
Throughout the simulations in this study, we used an implicit solvent system instead of an explicit one. The implicit model provides an adiabatic solvent response that is important for the fast unfolding simulation. 44 Therefore, we can probe factors responsible for interdomain integrity within a relatively short time. This is because when one uses an implicit solvent, the viscosity of a system is very low and almost null due to the lack of frictional force of water molecules. This means that the unfolding process of the protein will occur rapidly under elevated temperature. 45 It has been reported elsewhere that simulation in an implicit solvent has comparable accuracy to simulation conducted in an explicit solvent. MD simulation carried out by Huang and Stultz 46 showed that the paired-helical filament 6 (PHF6) peptide in Tau protein simulated using an implicit solvent could accurately reproduce the set of local energy minimums resulting from explicit solvent simulation. Moreover, all implicit solvent models used by Huang and Stultz 46 demonstrated that PHF6 forms extended β-structures in solution, which is consistent with the experimental result that PHF6 forms intracellular aggregates causing Alzheimer's disease. 47
It is shown above that the two electrostatic interactions/salt-bridges (Asp371-Arg435 and Lys354-Glu445) buried in the interface domain region play an important role in maintaining the stability of the interdomain integrity (Table 1). Our finding is supported by Korolev and colleagues 14 who suggested that electrostatic interaction at the interface domain between 3′→5′ exonuclease and 5′→3′ polymerase domains is a key factor for the thermostability of an enzyme, based on their 3D structure analysis of Klentaq1. By using a variation of salt concentration, the importance of electrostatic interaction is also found in KF.48,49 However, the authors of the above study 14 did not mention the amino acid residues interactions that play a role in maintaining the stability of the enzyme. Studies of multidomain proteins also show the importance of interdomain interactions to protein stability, such as glycosyl hydrolase 50 and β- and γ-crystallin eye lens protein. 51 Glycosyl hydrolase is a multidomain enzyme that is similar to DNA Pol, consisting of a catalytic domain, a second catalytic domain with similar or different activity (bifunctional enzyme), substrate-binding domains, a surface-layer domain and a linker domain whose function is unknown. 52 Kataeva and colleagues 50 showed that domain interactions affect the thermostability of glycosyl hydrolase from Clostridium thermocellum. Furthermore, Vieille and Zeikus 15 suggested that interdomain or intersubunit interactions are a major mechanism of stability for most hyperthermophilic proteins, such as for glutamate dehydrogenase (GluDH) from P.furiosus, P.kodakaraensis, and T.litoralis.53,54
Our previous results18,19 in thermolabile DNA Pol I (KF) and moderately thermostable DNA Pol I (Klenow-like DNA Pol I ITB-1) also suggested that interdomain interaction contributes to the stability of enzymes. Moreover, we also found that there was similarity in the early unfolding processes of these enzymes. The process initiated by domain separation between 3′→5′ exonuclease and 5′→3′ polymerase domains. The results also suggested that the unfolding process of an enzyme from a thermophilic microorganism was much slower than that of its mesophilic counterpart. This phenomenon might be due to the increase in electrostatic interaction at the interface domain; ie, 1 in KF and 2 in both of Klenow-like DNA Pol I ITB-1 and Klentaq1. The above results are also supported by findings regarding hyperthermostable GluDH. Isolated from P.furiosus, this contains 7 electrostatic interactions buried at interdomain interfaces. In contrast, the GluDH from mesophilic Clostridium symbiosum consists of only one electrostatic interaction at the interface domain. 55 Our previous results also suggested that Klentaq1 was more stable than the Klenow-like DNA Pol I ITB-1 since the unfolding process of Klenow-like DNA Pol I ITB-1 was faster than Klentaq1. These findings are also supported by experimental polymerization data,4,56,57 which shows that Bacillus DNA Pol I has a lower polymerization temperature (around 60° C–65° C) compared to that of Klentaq1. The results of the MD simulation also suggest that these 2 enzymes (Klenow-like DNA Pol I ITB-1 and Klentaq1) have 2 pairs of electrostatic interactions 18 at their interface domains (Table 1). The interface domain interaction difference between both enzymes only differs in the second interaction; specifically, Asp371-Arg435 on Klentaq1. The Asp-Arg interaction is reportedly 15 better adapted to high temperature compared to the Lys-Glu interaction. This might due to the finding that, at elevated temperature, Arg exhibits more stable maintenance of electrostatic interactions and a net positive charge, since the pKa of Arg side chains are more basic (12.1) than Lys (11.1) (pKa values drop as the temperature increase). As with Arg, the side chains of Asp (pKa 3.9) are more acidic than Glu (pKa 4.2) hence Asp will be more stable during maintenance of a net negative charge than Glu. The above conditions suggest that an electrostatic interaction formed by Arg-Asp would be stronger than an electrostatic interaction formed by Lys-Glu. 15
Author Contributions
Conceived and designed the experiments: SN, RH. Analysed the data: SN. Wrote the first draft of the manuscript: SN. Contributed to the writing of the manuscript: SN, RH, Akh. Agree with manuscript results and conclusions: RH, MAM, Akh. Jointly developed the structure and arguments for the paper: SN, RH, MAM, Akh. Made critical revisions and approved final version: SN, RH. All authors reviewed and approved of the final manuscript.
Funding
The present work has partially been supported by research grant from Institut Teknologi Bandung (ITB) No 174/K0107/PL/2007 to RH and Institut Teknologi Bandung scholarship (voucher) to SN.
Disclosures and Ethics
As a requirement of publication author(s) have provided to the publisher signed confirmation of compliance with legal and ethical obligations including but not limited to the following: authorship and contributorship, conflicts of interest, privacy and confidentiality and (where applicable) protection of human and animal research subjects. The authors have read and confirmed their agreement with the ICMJE authorship and conflict of interest criteria. The authors have also confirmed that this article is unique and not under consideration or published in any other publication, and that they have permission from rights holders to reproduce any copyrighted material. Any disclosures are made in this section. The external blind peer reviewers report no conflicts of interest.
Footnotes
Author(s) disclose no potential conflicts of interest.
