
Abstract

Multiple inositol polyphosphate phosphatase 1 (Minpp1) in higher organisms dephosphorylates InsP₆, the most abundant inositol phosphate. It also dephosphorylates less phosphorylated InsP₅ and InsP₄ and more phosphorylated InsP₇ or InsP₈. Minpp1 is classified as a member of the histidine acid phosphatase super family of proteins with functional resemblance to phytases found in lower organisms. This study took a bioinformatics approach to explore the extent of evolutionary diversification in Minpp1 structure and function in order to understand its physiological relevance in higher organisms. The human Minpp1 amino acid (AA) sequence was BLAST searched against available national protein databases. Phylogenetic analysis revealed that Minpp1 was widely distributed from lower to higher organisms. Further, we have identified that there exist four isoforms of Minpp1. Multiple computational tools were used to identify key functional motifs and their conservation among various species. Analyses showed that certain motifs predominant in higher organisms were absent in lower organisms. Variation in AA sequences within motifs was also analyzed. We found that there is diversification of key motifs and thus their functions present in Minpp1 from lower organisms to higher organisms. Another interesting result of this analysis was the presence of a glucose-1-phosphate interaction site in Minpp1; the functional significance of which has yet to be determined experimentally. The overall findings of our study point to an evolutionary adaptability of Minpp1 functions from lower to higher life forms.

Keywords

inositol phosphates; multiple inositol polyphosphate phosphatase; inositol (1,3,4,5)-tetrakisphosphate 3-phosphatase; Minpp1; Minpp2; Mipp; HiPER; bioinformatics; computational protein structure prediction; phytases; evolution of proteins; protein motifs; protein domains; evolutionary significance

Introduction

Inositol phosphates (InsPs) are a group of vital molecules naturally occurring in both animal and plant cells. They are essential for regulating diverse cellular processes such as calcium mobilization, vesicular trafficking, chromatin remodeling, and apoptosis.^1–3 Changes in cellular levels of InsPs have been implicated in the regulation of cell physiology.⁴ Among InsPs, inositol hexakisphosphate (InsP₆) is well studied for its role in apoptosis and other cellular signaling processes.⁵ Plant cells contain more InsP₆ than animal cells.⁶ In higher animals, the amount of InsP₆ ranges from nanomolar to micromolar concentrations.⁷ Multiple inositol polyphosphate phosphatase 1 (Minpp1) is an enzyme that dephosphorylates 3-phosphate from the most abundant InsP₆^8–12 as well as the less abundant InsP₇ and InsP₈. Minpp1 is classified as a member of the histidine acid phosphatase super family of proteins. In vitro studies revealed that higher InsPs act as competitive substrates for Minpp1 with higher affinity than lower InsPs.

Subcellular localization of Minpp1 was studied in an attempt to understand the physiological role of Minpp1. Previous studies indicate that Minpp1 is localized inside the endoplasmic reticulum (ER) as a soluble luminal protein with limited access to predominantly cytosolic InsPs.⁸ This has prompted researchers to search for alternative functions of Minpp1 that are different from its role in InsPs hydrolysis. For example, in human follicular thyroid carcinomas, Minpp1 was shown to have a role in cell differentiation¹³ and proliferation.¹⁴ The levels of InsP₆ and InsP₅ were noticeably higher in Minpp1 gene knockout mice than in wild type. Exogenous reintroduction of Minpp1 into cells resulted in decreased levels of InsP₆, which shows its role in maintenance of InsPs homeostasis. Additional studies have shown that there exists a relationship between Minpp1 and insulin-dependent alkaline phosphatase. On deleting the Minpp1 gene, there was a decrease in insulin-dependent alkaline phosphatase levels, which apparently influenced phosphate stability.¹⁵ Subsequently, overexpression of Minpp1 in chondrogenic cells decreased levels of InsP₆, which impaired chondrogenesis and cellular differentiation. In a different study conducted in osteoblastic mouse cells, Minpp1 was used as an osteoblastic differentiation marker for bone development.¹⁶ Minpp1 has also been implicated in regulating oxygen binding to hemoglobin in erythrocytes by hydrolysis of 2,3-bisphosphoglycerate (2,3-BPG) to 2-phosphoglycerate (2-PG), bypassing the intermediate mutase reaction in Rapoport–Luebering shunt in the glycolytic pathway.¹⁷ The chick homolog of Minpp1 is referred to as HiPER; it regulates maturation of chondrocytes from the proliferative to the hypertrophic stage in the growth plate region of the bone.¹⁸ Minpp1 also dephosphorylates other phosphorylated organic compounds, eg, p-nitrophenyl phosphate. This demonstrates that Minpp1 may function as a non-specific dephosphorylating enzyme.

Despite several attempts to study Minpp1 structure, function, and its subcellular location, our understanding of its physiological significance remains unclear. There are also a number of other concerns, eg, its occurrence in lower and higher organisms, cellular location, functional diversification, and non-specific dephosphorylation property. It is not clear how Minpp1 functions vary in a phylogenetic context ranging from microorganisms like prokaryotes to lower eukaryotes, plants, and animals. Since there is no distinct cell organelle differentiation in prokaryotes, does subcellular or organelle-specific localization of this protein have any significant impact on its physiological function?

The purpose of our study was to understand the evolutionary significance of Minpp1, ie, whether Minpp1 function is conserved through evolution or if there is diversification from a simple prokaryotic protein to a more complex functional protein in eukaryotes. To accomplish this goal, we took a bioinformatics approach to analyze Minpp1 related sequences for their distribution across taxa. We identified and collected sequences from 40 different species that shared homology with Minpp1 and compared similarity and variability within conserved motifs among these sequences. Phylogenetic analysis was performed to understand the homology among the collected organisms and to study the evolutionary pattern of Minpp1 from lower to higher organisms. Further, we analyzed the amino acid (AA) sequences by multiple sequence alignment, predicted a 3D model of the Minpp1 protein using I-TASSER, and used motif scanning to predict possible functions. This is an innovative bioinformatics approach to analyze active site and ligand interaction at the molecular level. This approach allowed us to characterize the structure and function of Minpp1 and to predict its possible interactions with potential substrates.

Methods

Collection of Minpp1 Sequences

Homology modeling of the Minpp1 protein was performed to determine the structural and functional relation that exists between the Minpp1 proteins of other species. The full-length human Minpp1 (hMinpp1) AA sequence, as shown in Figure 1, was obtained from the National Center for Biotechnology Information (NCBI) with gene ID: 9562, NCBI RefSeq: NP_004888_2, UniProt ID: Q9UNW1 (other accession IDs: O05286, Q59EJ2, Q9UGA3), and protein code (E.C 3.1.3.62). This was used to find related sequences using the Basic Local Alignment Search Tool (BLAST) from NCBI (www.blast.ncbi. Blast.cgi) and The European Bioinformatics Institute (EMBL-EBI) (www.ebi.ac.uk/) databases. Multiple sequence alignment of the various sequences was achieved using ClustalX2. We predicted the three-dimensional structure of Minpp1 protein sequence (Fig. 1) using I-TASSER online-Zhang-Server (I-TASSER, http://zhanglab.umich.edu/I-TASSER/). Templates judged to be appropriate were downloaded from the protein database (PDB, http://www.pdb.org) connected to the above server and were regenerated using PyMOL.

Figure 1.

AA sequence of hMinpp1.

Phylogenetic Analysis: Evolutionary Relevance of Minpp1

A hierarchical clustering approach was adopted to analyze the relatedness among the AA sequences. Results of the analysis are summarized in a phylogenetic tree constructed by the neighbor joining of a Jones–Taylor–Thornton (JTT) matrix using MEGA6 (M6, http://megasoftware.net/). Sequences used for analyses were cross-checked for their appropriateness to be included by a pairwise distance analysis method; a value of one (1) was considered reliable for further analysis. The analysis did not contain any duplicate sequences. The next step was to confirm the average AA identity to estimate the reliability of the aligned sequences. This was done by estimating the p-distance value among sequences using MEGA 6, which resulted in a value of 0.6 (<0.8 is an acceptable range). Since we obtained two distinct phylogenetic clusters, ie, higher and lower organisms, with a greater degree of variation in similarity (20–30%), we separated the sequences of the two clusters. Intra-cluster similarity was analyzed by pairwise sequence comparison, and separate phylogenetic trees were then generated using maximum likelihood statistical methods as above.
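The p-distance mentioned above is the proportion of aligned sites at which two sequences differ. The following is a minimal Python sketch of that calculation, assuming two pre-aligned sequences of equal length and ignoring any column that contains a gap. The function name and the toy fragments are illustrative assumptions; this is not the MEGA implementation and the fragments are not real Minpp1 data.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues over gap-free aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # pairwise deletion: skip columns containing a gap
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differing / compared


# Hypothetical aligned fragments (assumption, for illustration only)
toy_a = "RHGTRYPT-K"
toy_b = "RHGSRYPTAK"
print(round(p_distance(toy_a, toy_b), 3))  # one mismatch over nine compared columns
```

Percent identity over the same columns is simply 100 × (1 − p-distance).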

Prediction of 3D Model for Minpp1

The full-length hMinpp1 AA sequence (NP_004888_2) was used to predict a 3D model using I-TASSER, an online platform that predicts the 3D protein structure from AA sequences.¹⁹ The 3D models are built based on multiple-threading alignment using LOMETS. LOMETS utilizes a number of internal servers such as MUSTER, HHSEARCH, SAM-T02, SPARKS2, SP3, PROSPECT2, and PPA. Each server relies on its own inherent template modeling (TM) score cut-off values, which can be biased toward the individual algorithms.^20,21 However, an overall TM score gives the confidence level of the particular server and the sequence identity between the query and the template. This scoring function is termed the confidence score. The top 10 models are selected based on the confidence score. The root mean squared deviation (RMSD) values for the selected top 10 models ranged from 0.91 Å to 3.70 Å. The selected Minpp1 model had an RMSD value of 0.91 Å and a TM score of 0.89. Our top selected models had a unique global fold,^20,21 suggesting quality modeling. The pdb file of the predicted Minpp1 model was downloaded and was reproduced by the PyMOL Molecular Graphics System. The 3D model was regenerated using Protein Homology/analogY Recognition Engine V2.0 (http://www.sbg.bio.ic.ac.uk/~phyre/). The inositol phosphatase motif RHGxRxP was then highlighted and labeled.
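To make the TM score concrete, the sketch below evaluates the commonly used TM-score expression for one fixed superposition: each aligned residue pair contributes 1/(1 + (d/d0)²), and the sum is divided by the target length, with d0 = 1.24·(L − 15)^(1/3) − 1.8 Å. The function names are illustrative assumptions, and the sketch omits the search over superpositions that a full TM-score program performs, so it is not a substitute for the I-TASSER output.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms) used by the TM-score."""
    if target_length <= 21:
        # For very short chains the formula gives a tiny or negative d0;
        # that case is not handled in this sketch.
        raise ValueError("this sketch assumes a target longer than 21 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length: int) -> float:
    """TM-score for one given superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the target (query) protein.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A 487-residue protein (the length reported for hMinpp1) with a perfect match
print(tm_score([0.0] * 487, 487))
print(round(tm_d0(487), 2))
```

Because the score is normalized by target length, values are comparable across proteins of different sizes, which is why it is often reported alongside RMSD.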

Structural Motif Analysis: Biological Relevance of Minpp1

To understand the biological relevance of the protein, online bioinformatic tools were used to first find functional motifs in the hMinpp1 sequence and then to compare these motifs among other related sequences. Tools used included: PROSITE Scan prosite tool (PROSITE, http://prosite.expasy.org), motif scan (SIB myhits, http://myhits.motif_scan), Sanger Pfam (pfam, http://pfam.sanger.ac.uk), Simple Modular Architecture Research Tool (SMART, http://smart.embl-heidelberg.de/), Protein ANalysis THrough Evolutionary Relationships (PANTHER, http://www.pantherdb.org/), CATH: Protein Structure Classification (CATH/Gene3D, http://www.cathdb.info/), and SCOP: Structural Classification of Proteins (superfamily, http://supfam.org/). A total of five unique motifs were selected based on the highest number of hits or scores using the above tools. Selected motifs were ER retention signal, phosphotransferase, phytase, inositol phosphate phosphatase, and pleckstrin homology (PH) domain. The presence of the above motifs was searched among all collected sequences. The sequences of these motifs were then compared for conservation and variation using MEGA 6.
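Sequence motifs of the kind listed above can also be located with simple pattern matching. The sketch below is a minimal illustration in Python, assuming that "x" or "X" in a motif means any residue and using a made-up toy sequence; it does not reproduce the scoring of PROSITE, Pfam, or the other tools named above, and the dictionary keys are hypothetical labels.

```python
import re

# Motif patterns written as regular expressions ("." = any residue)
MOTIFS = {
    "inositol_phosphate_phosphatase": r"RHG.R.P",  # RHGxRxP
    "phosphotransferase": r"D.D.[TV]",             # DXDX[T/V]
}


def scan_motifs(sequence: str, motifs=MOTIFS):
    """Return {motif name: [(start, end, matched text), ...]}.

    Positions are 1-based and inclusive, as in the AA numbering of the text.
    """
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping occurrences are also found
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
            for m in regex.finditer(sequence.upper())
        ]
    return hits


# Hypothetical toy sequence (assumption; not the hMinpp1 sequence)
toy = "MKLAARHGTRYPTKKQDIDDTLLSA"
for motif_name, found in scan_motifs(toy).items():
    print(motif_name, found)
```

Running the same scan over every collected sequence gives a presence/absence table of motifs per species, which is the kind of comparison described in this section.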

Ligand-Binding Prediction of Minpp1

Besides predicting the 3D structure of Minpp1, we predicted protein ligand-binding or enzyme active site interaction with possible substrates. We did this in order to find out whether Minpp1 binds to ligands other than InsPs. This was achieved with COACH, a highly rated bioinformatics tool for protein–ligand binding site prediction (COACH, http://zhanglab.umich.edu/COACH/). This is a meta-server approach that combines the use of state-of-the-art tools (COFACTOR, TM-SITE, S-SITE, FINDSITE, and ConCavity) to predict protein–ligand binding. First, the Minpp1 primary sequence was provided as input to generate a 3D structure using I-TASSER. I-TASSER feeds the 3D structure into the COACH pipeline for ligand-binding site prediction. COACH utilizes the BioLiP database, which houses data on known proteins with their specific ligands. The predicted ligands with their binding sites on the protein were selected based on their scoring values.

Results and Discussion

Identification of Functional Domains in Minpp1 Sequence

In order to identify structural domains and functional motifs in the hMinpp1 protein sequence, NCBI RefSeq: NP_004888_2 was used. This accession number is for the full-length hMinpp1 isoform, which contains 487 AAs. The major domains and motifs present in Minpp1 as identified using the above-mentioned bioinformatics tools are labeled in the AA sequence of the protein as shown in Figure 1 and summarized in Table 1. The use of multiple bioinformatics tools provided us with varied criteria to identify the true motifs within a domain. This also allowed us to minimize prediction errors and to cross-check the results of the various approaches. The motifs with higher TM scores were chosen for further analysis. TM is a measure of similarity between two protein structures with different tertiary structures and is considered an accurate reflection of true protein structure.^19–21 The overall TM scores were between 0.55 and 0.75, a reliable range to consider for an identified functional motif.²²

Table 1.

Summary of functional domains identified with references.

IDENTIFIED FUNCTIONAL DOMAINS OF MINPP1
NO.	AMINO ACID POSITIONS	FUNCTIONAL DOMAIN	REFERENCE
1	1–30	Signal peptide	Chi, H et al. 1999
2	71–429	Acid phosphatase A	Cho, J et al. 2006
3	74–207	Phosphoglyceromutase acid phosphatase	Cho, J et al. 2008
4	88–94	Acid phosphatase domain	Caffrey, J.J et al. 1999
5	242–245, 481–484	N-glycosylation	Liu, T et al. 2005
6	107–112, 113–118, 128–133	N-myristoylation	http://myhits.motif_scan
7	401–480	Protein kinase phosphorylation	Matthias Frech, et al. 1997
8	485–487	Endoplasmic reticulum retention signal	Chi, H et al. 1999
9	306–309	Glucose-1-phosphatase	Collet, J.F et al. 1998

Note: Identified functional domains as marked in Figure 1 obtained by protein sequence with accession ID NP_004888_2 from NCBI database as summarized with references.

Most domains and motifs identified in our analysis have also been described in the literature (Table 1). AAs 1–30 at the N-terminal end constitute a signal peptide,¹⁰ AAs 71–429 comprise the acid phosphatase-A (AP-A) domain,⁹ and AAs 74–207 span a domain for phosphoglyceromutase acid phosphatase (PGAM).¹⁷ These two domains share AAs 74–207, contributing to the complexity of the protein. A protein kinase B domain²³ identified between AAs 401–480 overlaps with AP-A. The most conserved region known for InsPs phosphatase activity is AAs 88–94, with the signature sequence RHGTRYP.⁹ This motif is shared between the PGAM and AP-A domains. AAs 242–245 (NATA) and 481–484 (NSTS) are N-glycosylation sites that might be involved in shuttling of proteins between the ER and Golgi complex. However, no evidence of such a transportation mechanism is described in the literature. AAs 149–160 (KGRQDMRQLALR) were identified as a PH domain motif, AAs 306–309 (DIDD) as glucose-1-phosphatase (G1P)/glucose transferase, and AAs 485–487 (KDEL) as an ER retention signal.¹⁰
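The overlaps among the ranges listed in Table 1 can be checked mechanically. The short Python sketch below treats each domain as an inclusive residue interval and computes the intersection; the dictionary keys and helper names are illustrative assumptions, while the coordinates are those given in Table 1.

```python
# Inclusive residue ranges taken from Table 1 (names are illustrative labels)
DOMAINS = {
    "signal_peptide": (1, 30),
    "acid_phosphatase_A": (71, 429),
    "PGAM": (74, 207),
    "InsP_phosphatase_motif": (88, 94),
    "protein_kinase_phosphorylation": (401, 480),
}


def overlap(a, b):
    """Return the shared inclusive range of two intervals, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def contains(outer, inner):
    """True if the inner interval lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print(overlap(DOMAINS["acid_phosphatase_A"], DOMAINS["PGAM"]))
print(overlap(DOMAINS["acid_phosphatase_A"], DOMAINS["protein_kinase_phosphorylation"]))
print(contains(DOMAINS["PGAM"], DOMAINS["InsP_phosphatase_motif"]))
```

With these coordinates, the RHGTRYP motif (88–94) lies inside both the AP-A and PGAM ranges, and the kinase-related region overlaps AP-A only at its C-terminal end.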

Three signature motifs for phytases were also identified in Minpp1: AAs 126–128 (DLG), AAs 103–200 (QJHYH), and AAs 48–53 (GTKTRY).²⁴ The presence of N-glycosylation sites and an ER retention signal predicts a role for Minpp1 in trafficking between ER–Golgi and other endomembrane systems. The tools also gave hits for N-myristoylation sites at three different positions: AAs 107–112, 113–118, and 128–133. All tools predicted and collectively confirmed the presence of acid phosphatase and phosphoglycerate mutase domains and motifs for phytases and InsPs phosphatase. These features qualify Minpp1 as a genuine member of the histidine phosphatase super family that possesses the signature RHGxRxP catalytic motif. A considerable number of references for the members of this category are available.^2,9,15,25 It is one of the largest functionally diversified families of proteins known. Members of this family also include bisphosphoglycerate mutase, which is known for dual phosphatase and mutase activities, and fructose-2,6-bisphosphatase, which is involved in glycolysis and gluconeogenesis.²⁶

Identification of Local Similarity between Sequences

Both NCBI and EMBL-EBI databases were searched using the Minpp1 sequence for the presence of homologous sequences. There was a wide range of variation in homology between species given the breadth of taxa examined (eg, yeast, zebrafish, plants, and mammals). The BLAST results produced a total of 286 hits with a significant matching score >100 belonging to varied taxonomic groups like bacteria, metazoa, fungi, and plants. This suggests that Minpp1 is a widely distributed protein through evolution. Such a distribution of Minpp1 has also been addressed by Chi and associates.¹⁰ While collecting data from the two databases, care was taken not to collect duplicate sequences and any putative, hypothetical, or synthetic sequences.

Species were selected for phylogenetic analysis based on AA sequence identity. The plant species Hordeum vulgare, Triticum aestivum, and Lilium longiflorum were not found during the regular BLAST query. However, there is evidence that Minpp1 exists in these species. Therefore, to take these three species into account, pairwise comparisons with NCBI BLAST 2 SEQUENCES were performed to determine the identity between each species and the query Minpp1 protein. This resulted in 20–30% sequence identity for the plant species. Low sequence identity is a possible explanation for why these species did not appear in the regular BLAST search. A total of 40 species from a wide range of taxa were finally selected for further analysis and are summarized in Table 2.

Table 2.

Summary of the identified species with their sequence identity ranges.

SEQUENCE IDENTITY	ORGANISMS
90%–100%	Homo sapiens, Pan troglodytes, Pongo abelii, Macaca fascicularis, Callithrix jacchus, Canis lupus
70%–90%	Bos taurus, Sus scrofa, Ailuropoda melanoleuca, Rattus norvegicus, Equus caballus, Mus musculus
60%–70%	Monodelphis domestica
50%–60%	Gallus gallus, Xenopus tropicalis
40%–50%	Danio rerio, Takifugu rubripes, Tetraodon nigroviridis
30%–40%	Branchiostoma floridae, Hydra magnipapillata, Oikopleura dioica, Daphnia pulex, Camponotus floridanus, Ciona intestinalis, Solenopsis invicta
20%–30%	Harpegnathos saltator, Apis mellifera, Drosophila sechellia, Pediculus humanus corporis, Phytophthora infestans, Dictyostelium purpureum, Polysphondylium pallidum, Lilium longiflorum, Hordeum vulgare, Triticum aestivum, Anopheles darlingi, Aedes aegypti, Bacteroides coprocola, Bifidobacterium pseudocatenulatum, Prevotella veroralis

Notes: Minpp1 protein sequence with accession ID NP_004888_2 used to BLAST NCBI and EMBL-EBI databases. Hits are summarized based on sequence identity.

In our study, the query resulted in 104 and 108 hits in NCBI and EMBL-EBI databases, respectively. The sequences selected from the above categories were based on their presence in both databases and upon elimination of predicted, constructed, and putative sequences. Results show that highly evolved species like Homo sapiens and Pan troglodytes share 90–100% sequence identity. Mus musculus and Rattus norvegicus shared 70–90% sequence identity, while numerous other groups such as plants, insects, and bacteria shared about 20–30% sequence identity with the former. It is interesting to note the similarity of Minpp1 between lower organisms and highly evolved species.²⁷

Analysis of hMinpp1 Isoforms

Location of the Minpp1 protein in higher organisms is restricted to the ER because of the presence of the ER retention signal (KDEL), while InsPs are present in the cytosol. Minpp1 must hydrolyze these InsPs outside the ER, as no association of InsPs with the ER has been demonstrated. Until now it was not known how Minpp1 and InsPs interact for enzymatic activities against InsPs. To address this, we examined available databases for the existence of Minpp1 isoforms and their subcellular locations. Four isoforms of hMinpp1 were found during our search using UniProt (www.uniprot.org/uniprot/Q9UNW1). These isoforms are believed to be formed by alternative splicing of the gene, for which experimental confirmation is still lacking. The first isoform with UniProt ID Q9UNW1 is considered variant 1. It represents the longest isoform encoded. The second isoform (variant 2) of the protein with UniProt ID Q9UNW1-2 is shorter than variant 1. Variant 2 lacks two alternate coding exons. The third isoform (variant 3) of the protein with UniProt ID Q9UNW1-3 is even shorter than variant 2. Variant 3 differs in the 5′ UTR and coding sequence as compared to variant 1. Variant 3 has a shorter and distinct N-terminus compared to variant 1. There were variations in AA sequences 279–284 between variants 1 and 3; AAs 285–487 were missing in variant 3. The fourth isoform (variant 4) with UniProt ID Q9UNW1-4 is the shortest. In this isoform, AAs 1–213 are absent at the N-terminal end of the sequence. Like variant 1, variant 4 has a C-terminal KDEL motif. Functionality of these isoforms is still unknown and no experimental data are available. Sequence alignment was done (Fig. 2) to see the similarity among all four isoforms of hMinpp1. Considerable sequence identity is seen for AAs 213–278; AA residues H231 and H248 are conserved. The NATA N-glycosylation site is shown to be conserved (Fig. 2) in all four human isoforms. Of the four isoforms, two are predicted to be present as luminal proteins in the ER because of the presence of the KDEL motif. The other two isoforms are perhaps cytosolic because they lack a KDEL-like sequence at the C-terminus.

Figure 2.

Multiple sequence alignment of isoforms of hMinpp1

The 3-phosphatase activity of Minpp1 in erythrocyte plasma membrane appears similar to Minpp1 activity found in the ER.^17,28 It is likely that the phosphatase activity in lower organisms is associated with the cell membrane.

Evolutionary Relationship of Minpp1 Determined by Phylogenetic Analysis

The phylogenetic tree constructed using all 40 species of Minpp1 is shown in Figure 3. The reliability and reproducibility of the tree were evaluated using a bootstrap procedure. Minimum replication value, a measure of reproducibility, was set at 2000 in this study. The resultant cladogram from all 40 species represents the evolutionary pattern of Minpp1 protein (Fig. 3). The tree provides strong evidence of conservation of Minpp1 protein across a wide range of species ranging from primitive species like Prevotella veroralis, to plants such as T. aestivum, to Oikopleura dioica, a primitive chordate with only 15,000 genes, to highly evolved species like Bos taurus, Equus caballus, Macaca fascicularis, and H. sapiens. Its evolutionary breadth of occurrence indicates that Minpp1 is a significant protein that might play a crucial role across primitive to advanced taxa.²⁷

Figure 3.

Phylogenetic tree for Minpp1 related sequences.

Since there was a wide variation in sequence identity between the two major clusters of the phylogenetic tree, we examined the degree of identity among organisms within each cluster, that is, whether lower organisms, eg, bacteria, share a greater degree of identity among themselves. The phylogenetic trees generated from this analysis are given as Supplementary Figures 6A and B. Our analysis of the cluster representing 22 lower organisms generated the phylogenetic tree showing the highest log likelihood (-9452.8222), which indicates that they have a close evolutionary relatedness to each other. In the cluster for higher organisms, a group of 18 sequences, the analysis showed the highest log likelihood (-1878.8844), and the partitions reproduced were within 50% bootstrap replicates. The overall pairwise alignment value was 0.386, indicating more evolutionary divergence among higher organisms.

Prediction of 3D Structure for Minpp1

The 3D structure of Minpp1 (NP_004888_2) was predicted by using I-TASSER. The top 10 models were selected based on the confidence score. The TM score, which gives the confidence level of the sequence identity between the query and the template, was taken into account. Since a crystal structure of Minpp1 is not available, we relied on the predicted structure quality of our 3D modeling. The RMSD values for the top 10 highest scoring models ranged from 0.91 to 3.70 Å. The selected Minpp1 model had an RMSD value of 0.91 Å and a TM score of 0.89, suggesting the high quality of the predicted model. The global folding was also unique among the top scoring models. The predicted PDB structure consisted of α-helix represented in green, extended β-sheet and bridged β structures in magenta, and coils in pink (Fig. 4A). The motif RHGxRxP for phosphatase activity is highlighted in Figure 4B.

Figure 4.

3D modeling of Minpp1 protein generated using the AA sequence.

Motif Analysis

One of the objectives of our study was to analyze any functional variation in Minpp1 activity that might have occurred through evolution. We examined the conserved functional motifs in the Minpp1 related sequences. We addressed whether basic Minpp1 function was conserved among species or functional complexity developed and new functions were added and adopted as more complex functional needs arose through evolution. The homology modeling tools described in the methods section were used to find the functional motifs in Minpp1 sequences. Motifs that were studied are discussed in the paragraphs that follow.

Inositol Phosphate Phosphatase Motif (RHGXRXP)

Phosphorylation and dephosphorylation are fundamental processes that regulate physiological functions in the cell. The histidine acid phosphatase family of enzymes is characterized by the presence of the conserved RHGXRXP (Arg-His-Gly-X-Arg-X-Pro) motif²⁵ that is essential for InsPs hydrolysis. The RHGXRXP motif has a nucleophilic histidine, an active arginine, and aspartic or glutamic acid residues. The tripeptide RHG interacts directly with the phosphate group of the substrate, making it more susceptible to nucleophilic attack given the positive charge of the guanidino group of the arginine residue. The conserved arginines in the RHGxRxP motif are important for binding the highly negatively charged phosphate group. During hydrolysis, the aspartic acid residue from the C-terminal motif protonates the substrates and the nucleophilic histidine residue is involved in the formation of a covalent phospho-histidine intermediate. In Minpp1, the phosphatase motif is generally conserved. The organisms that did deviate from the conserved pattern are microbial organisms and insects (Supplementary Fig. 1). The variations in RHG noted were THG, MHG, and VHG.

ER Retention Motif (K/SDEL)

The ER participates in folding, sorting, and transport of proteins to various cellular destinations. Because of a retention motif, the majority of ER-resident proteins are retained in the ER. This motif is composed of four AAs at the C-terminal end of the protein. KDEL (Lys-Asp-Glu-Leu) is a signal for permanent retention of proteins in the ER.¹⁸ The KDEL sequence is recognized by a membrane-bound receptor that continually retrieves proteins from the Golgi compartment of the secretory pathway and returns them to the ER via retrograde transport vesicles. In Minpp1, AAs 485–487 at the C-terminal represent the ER retention signal.

Variations in the KDEL motif include RDEL, QDEL, KNEL, KEDL, DKEL, or KDEL.²⁹ It is not known whether such variations can lead to sub-ER localization.³⁰ Our analysis, obtained by pairwise multiple alignment using ClustalX2, shows that KDEL is highly conserved in higher organisms. Exceptions include VKTEL in two of three species; RNTEL in Danio rerio and KKTEL in Hydra magnipapillata. However, as shown in Supplementary Figure 2, insects and bacterial species lack the retention signal.
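Checking a sequence for a C-terminal retention signal amounts to comparing its last four residues against a set of accepted tetrapeptides. The Python sketch below illustrates this, assuming the variants listed in this section plus SDEL from the section heading; the function name and the toy sequences are hypothetical, and the check says nothing about whether a given variant is functional in vivo.

```python
# Tetrapeptides treated as KDEL-like in this sketch: the variants named in
# the text, plus SDEL from the "K/SDEL" section heading (assumption).
KDEL_LIKE = {"KDEL", "SDEL", "RDEL", "QDEL", "KNEL", "KEDL", "DKEL"}


def er_retention_signal(sequence: str, signals=KDEL_LIKE):
    """Return the C-terminal tetrapeptide if it is KDEL-like, else None."""
    tail = sequence.strip().upper()[-4:]
    return tail if tail in signals else None


# Hypothetical toy sequences (not real Minpp1 isoforms)
print(er_retention_signal("MAAAGTWKDEL"))   # ends with KDEL
print(er_retention_signal("MAAAGTWKDELA"))  # signal is not at the C-terminus
```

Applied to each isoform or homolog, the same check separates sequences predicted to be retained in the ER from those lacking a retention signal.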

Phosphotransferase Motif (DXDX[T/V])

Phosphotransferases are a complex group of enzymes that catalyze reactions by phosphorylation. This family is characterized by the presence of a conserved aspartate residue in the amino-terminal region. The first aspartate of the motif DXDX(T/V) (Asp-X-Asp-X[Thr/Val]) is well conserved.³¹ Phosphotransferase is considered to be functionally conserved from prokaryotes to higher eukaryotes, and it belongs to a large hydrolase family comprising several phosphatases. The family is slightly larger than the bisphosphoglycerate mutase family, which comprises two mutases and three phosphatases. The phosphatases have the phosphorylated residue that is a histidine in a conserved RHG motif. The conserved DXDX(T/V) motif is considered characteristic of phosphatases/phosphomutases acting on phosphate esters.^28,31 The first aspartate in the motif is responsible for phosphorylation and is position sensitive. The second aspartate in the motif is involved in catalysis, which acts as a nucleophile and in some cases as an acid-base catalyst.³² The conserved motif of phosphotransferase is well conserved in a hierarchical fashion from lower to higher organisms. Some of the AA variations that were observed are depicted in Supplementary Figure 3. The two extremes include variations in insects and a complete absence in bacteria.

PH Domain Motif (K-X_n-[K/R]-X-R)

The PH domain occurs in a wide range of proteins. It was first detected in pleckstrin, the major substrate for protein kinase C in platelets. It is usually made up of AA residues K-X_n-(K/R)-X-R (Lys-X_n-[Lys/Arg]-X-Arg). It serves as a constituent of the cytoskeleton and is involved in intracellular signaling by binding to the beta/gamma subunit of heterotrimeric G proteins, phosphatidylinositol-4,5-bisphosphate (PIP2), phosphorylated serine/threonine residues, and protein kinase C.^33,34 The PH domain also interacts with insulin receptor substrate proteins mediating transcriptional responses in pancreatic islet cells and has a role in postnatal growth. PH domain-interacting protein (PHIP) is overexpressed in metastatic melanomas. Recently, an in vitro study showed that the PH domain plays a role in fission and vesicle release, and an in vivo study showed a role in clathrin-mediated endocytosis.³⁵ The PH domain was found to be conserved in higher organisms as shown in Supplementary Figure 4; the motif varies in some other organisms and is absent in insects and bacteria.

Phytase Motif (DXG, GDXXY, GNH[E/D], GHXH)

Phytases are a ubiquitous class of proteins. They have broad substrate specificity and have the ability to hydrolyze many phosphorylated compounds that are not structurally similar to phytic acid. Purple acid phosphatase (PAP) phytases and histidine superfamily phytases primarily hydrolyze InsPs. Purple acid phosphatases are commonly present in lower organisms. They have seven conserved residues in five conserved motifs: DXG, GDXXY, GNH(D/E), VXXH, and GHXH. These phytases hydrolyze phytic acid to myo-inositol and inorganic phosphate. Our analysis revealed that not all five motifs were equally conserved, as represented in Supplementary Figures 5A–C; variations exist in both lower and higher organisms.

Active Site Residue–Ligand/Substrate Binding Prediction Analysis

AA residues that are active in ligand binding were predicted using a consensus-based algorithm (COACH). InsP₆ or inositol hexakissulfate (InsS₆), glucose-1-phosphate (GP), and PO₄ ligand binding sites were studied. AAs involved in InsP₆/InsS₆ binding are T49, K50, R88, H89, R92, T95, K97, R186, F228, Q321, H370, and A371; and in PO₄ binding are R88, H89, R92, R186, H370, and A371. This shows that R88 and H89 are actively involved in phosphatase activity. GP ligand binding showed active binding sites at R88, R92, R186, H370, A371, and E372. G1P belongs to the histidine acid phosphatase family and acts primarily as a glucose scavenger. It is well studied in E. coli, where it is located in the periplasmic space of cells. It appears to have high selectivity for inositol phosphate hydrolysis.³⁶ The enzyme is also potentially involved in pathogenic inositol phosphate signal transduction pathways via type III secretion into the host cell.³⁶ Similarly, N-acetyl-D-glucosamine was identified as a potential ligand with active residues that include N222, H231, T471, S472, L477, A478, R479, A480, and N481. The analysis scores of the predicted potential active residues for all ligands were within acceptable ranges as shown in Figures 5A-D (COACH >1.2, COFACTOR >, TM-SITE >0.2, S-SITE >0.3).

Figure 5.

Potential ligand-binding site prediction by a consensus-based algorithm (COACH).

The AAs R88, R92, R186, H370, and A371 in hMinpp1 are identified as the most active residues for binding to phosphorylated substrates. Our studies revealed that GP is also a potential substrate for this active site. It would be interesting to see if mammalian Minpp1 also utilizes GP as a substrate. This prediction is based on the nature of Minpp1 as a non-InsP-specific phosphatase, but this needs to be experimentally confirmed.
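
The five residues named above are exactly those common to the three predicted binding-site lists (InsP₆/InsS₆, PO₄, and GP) reported in this section. A minimal sketch of that comparison follows; the residue lists are copied from the text, while the variable names and the helper `by_position` are illustrative assumptions.

```python
# Predicted binding residues as listed in the text (COACH output).
insp6_site = {"T49", "K50", "R88", "H89", "R92", "T95", "K97",
              "R186", "F228", "Q321", "H370", "A371"}
po4_site = {"R88", "H89", "R92", "R186", "H370", "A371"}
gp_site = {"R88", "R92", "R186", "H370", "A371", "E372"}


def by_position(residue):
    """Sort key: the numeric position that follows the one-letter residue code."""
    return int(residue[1:])


# Residues predicted to contact all three phosphorylated ligands.
shared = sorted(insp6_site & po4_site & gp_site, key=by_position)

# Residues predicted for GP only, and residues predicted for InsP6 and PO4 but not GP.
gp_only = sorted(gp_site - insp6_site - po4_site, key=by_position)
not_in_gp = sorted((insp6_site & po4_site) - gp_site, key=by_position)

print(shared)     # ['R88', 'R92', 'R186', 'H370', 'A371']
print(gp_only)    # ['E372']
print(not_in_gp)  # ['H89']
```

The comparison also makes explicit that E372 appears only in the GP prediction and that H89 is predicted for InsP₆/InsS₆ and PO₄ but not for GP.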

Conclusion

In an attempt to understand the physiological relevance of Minpp1 in mammalian systems, we took a bioinformatics approach to compare the hMinpp1 sequence across a broad range of taxa. Two major databases encompassing 40 species were used to study relatedness, divergence, and complexity adopted in function through evolution.

Phylogenetic analysis showed that Minpp1-related proteins clustered into two major groups, one representing lower and the other higher organisms. Since we observed two very distinct clusters with a greater degree of variation in sequence identity, we carried out a separate analysis on each cluster to assess relatedness within lower and within higher organisms. While lower organisms were phylogenetically more related to each other, they were distinct from higher organisms (only 20–30% similarity). Additionally, we observed more divergence among higher organisms. It is likely that lower organisms developed a simpler version of Minpp1 that functioned primarily in non-specific dephosphorylation of phosphorylated organic compounds. In higher organisms, this function was adopted by way of natural selection and diversified through evolution. More functional motifs were added over time, giving rise to a more complex structure and a larger enzyme in higher organisms. This is evident from the presence of fewer functional motifs in primitive than in higher organisms.

One of the key findings of our analysis was the identification of four spliced variants (isoforms) of Minpp1 in humans. These isoforms vary in size and in apparent localization in the cell because not all isoforms contain, for example, the KDEL motif at the C-terminus for ER retention. The existence of these isoforms, to our knowledge, has not been documented in the literature. Whether other species also have similar isoforms has not yet been analyzed.

Another key finding of our predictive bioinformatics analysis was the identification of a predicted interaction of GP with Minpp1. It will be interesting to determine the significance of this interaction experimentally. The presence of G1P activity has been documented in lower organisms. However, the presence of a G1P motif in Minpp1 in higher organisms is intriguing with regard to its potential role in glucose metabolism. Our future studies are directed at establishing experimentally any catalytic activity of Minpp1 against GP or related compounds, such as glucose-6-phosphate, as potential substrates.

Authors' Contributions

SPK performed bioinformatics analysis and writing of the first draft. AS assisted with bioinformatics tools. WHB guided the appropriateness of the tools and the study. NA conceived the idea, guided the study, and participated in writing. All authors participated in editing of the manuscript. All authors reviewed and approved of the final manuscript.

Supplementary Material

Supplementary Figure 1. Multiple sequence alignment of the Inositol phosphatase RHGxRxP motif using ClustalX2.

Forty protein sequences collected from the NCBI and EMBL-EBI databases were aligned using ClustalX2 to see similarity and diversification in the conserved catalytic “RHGxRxP” motif. The highlighted (blue) region shows that the motif is conserved from lower to higher organisms, especially Arg, His, and Gly. Of the 40 species used in the analysis, 34 showed the conserved pattern; exceptions were seen in insects and microorganisms.

Supplementary Figure 2. Multiple sequence alignment of the ER retention S/KDEL motif using ClustalX2.

ER resident proteins are retrieved and retained in the ER by a retention motif. In Minpp1, the KDEL residues (485–487) at the C-terminus are the ER retention signal. All collected sequences were aligned using ClustalX2 to see similarity and diversification in the protein sequences. In 36 of the 40 species (highlighted in blue), the motif is conserved from lower to higher organisms; in plants it is VKTEL, and in some insects and bacteria it is absent.

Supplementary Figure 3. Multiple sequence alignment of the phosphotransferase DXDX(T/V) motif using ClustalX2.

The 40 sequences collected from the NCBI and EMBL-EBI databases were aligned using ClustalX2 to see similarity and diversification in the conserved phosphotransferase motif “DXDX(T/V)”. The 36 highlighted species (out of 40) show conservation of aspartic acid in all aligned species of higher organisms, with some exceptions in insects. The first aspartic acid is responsible for phosphorylation and is position sensitive.

Supplementary Figure 4. Multiple sequence alignment of the PH Domain motif K-Xn-(K/R)-X-R using ClustalX2.

The PH domain is found in many proteins that are involved in intracellular signalling and as constituents of the cytoskeleton. The conservation and diversification of the motif were determined using ClustalX2. The motif, highlighted in blue for 37 species, is conserved; some amino acid residue variations are seen in more complex organisms, and the motif is completely absent in some insects and bacteria.

Supplementary Figure 5. Multiple sequence alignment of the phytase motifs DXG, GDXXY, GNH(D/E), VXXH, and GHXH using ClustalX2.

Phytases are enzymes that catalyze the stepwise hydrolysis of the phosphate monoesters of phytate, forming myo-inositol (pentakis-, tetrakis-, tris-, bis-, and mono-) phosphates and inorganic phosphate. In our analysis, lower and higher organisms were separated into two clusters using ClustalX2. The motifs were characterized as follows: (a) DXG was conserved in 36 species, with glycine conserved in both lower and higher organisms, whereas GDXXY and VXXH did not show any conserved pattern in either lower or higher organisms; (b) GNH(D/E) was conserved in 34 species; and (c) GHXH was conserved in 36 species with variations.

Supplementary Figure 6A. Molecular phylogenetic analysis by maximum likelihood method (lower organisms).

The evolutionary history was inferred by using the Maximum Likelihood method based on the JTT matrix-based model. The tree with the highest log likelihood (-9452.8222) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 22 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 210 positions in the final dataset. Evolutionary analyses were conducted in MEGA6.
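
The statement that all positions containing gaps and missing data were eliminated appears to correspond to what MEGA calls complete deletion: an alignment column is dropped if any sequence has a gap or missing-data symbol at that position. A minimal sketch on a toy alignment follows; the function name, the sequence labels, and the use of "-" and "?" as the gap and missing-data symbols are assumptions for illustration, and the sequences are not taken from the dataset.

```python
def complete_deletion(alignment, missing="-?"):
    """Drop every column in which at least one sequence has a gap or missing symbol."""
    names = list(alignment)
    rows = [alignment[name] for name in names]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*rows) iterates over alignment columns; keep only fully resolved ones.
    kept = [col for col in zip(*rows) if not any(symbol in missing for symbol in col)]
    # Rebuild each sequence from the retained columns.
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Hypothetical three-sequence alignment of length 10.
toy_alignment = {
    "seq_a": "MK-RHGTRAP",
    "seq_b": "MKLRHGSRSP",
    "seq_c": "M-LRHG?RYP",
}
trimmed = complete_deletion(toy_alignment)
print(trimmed)
print(len(trimmed["seq_a"]), "positions retained")
```

In this toy case three of the ten columns are removed, which illustrates why the final dataset (210 positions here) can be much shorter than the full-length proteins.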

Supplementary Figure 6B. Molecular phylogenetic analysis by maximum likelihood method (higher organisms).

The evolutionary history was inferred by using the Maximum Likelihood method based on the JTT matrix-based model. The tree with the highest log likelihood (-1878.8844) is shown. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50% bootstrap replicates are collapsed. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 18 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 136 positions in the final dataset. Evolutionary analyses were conducted in MEGA6.

Footnotes

Acknowledgments

SPK thanks UALR Tech Launch, University of Arkansas at Little Rock, for a graduate assistantship.

References

1. Ali, Craxton, Sumner, Shears S.B. Effects of aluminium on the hepatic inositol polyphosphate phosphatase. Biochem J. 1995; 305(Pt 2): 557–61.

2. Shears S.B. The versatility of inositol phosphates as cellular signals. Biochim Biophys Acta. 1998; 1436(1-2): 49–67.

3. York J.D., Odom A.R., Murphy, Ives E.B., Wente S.R. A phospholipase C-dependent inositol polyphosphate kinase pathway required for efficient messenger RNA export. Science. 1999; 285(5424): 96–100.

4. Agarwal, Hassen, Ali. Changes in cellular levels of inositol polyphosphates during apoptosis. Mol Cell Biochem. 2010; 345(1-2): 61–8.

5. Vucenik, Shamsuddin A.M. Cancer inhibition by inositol hexaphosphate (IP₆) and inositol: from laboratory to clinic. J Nutr. 2003; 133(11): 3778S–84.

6. Mehta B.D., Jog S.P., Johnson S.C., Murthy P.P. Lily pollen alkaline phytase is a histidine phosphatase similar to mammalian multiple inositol polyphosphate phosphatase (MINPP). Phytochemistry. 2006; 67(17): 1874–86.

7. Letcher A.J., Schell M.J., Irvine R.F. Do mammals make all their own inositol hexakisphosphate? Biochem J. 2008; 416(2): 263–70.

8. Ali, Shears S.B. Hepatic Ins(1,3,4,5)P4 3-phosphatase is compartmentalized inside endoplasmic reticulum. J Biol Chem. 1993; 268: 6161–7.

9. Caffrey J.J., Hidaka, Matsuda, Hirata, Shears S.B. The human and rat forms of multiple inositol polyphosphate phosphatase: functional homology with a histidine acid phosphatase up-regulated during endochondral ossification. FEBS Lett. 1999; 442(1): 99–104.

10. Chi, Tiller G.E., Dasouki M.J. Multiple inositol polyphosphate phosphatase: evolution as a distinct group within the histidine phosphatase family and chromosomal localization of the human and mouse genes to chromosomes 10q23 and 19. Genomics. 1999; 56(3): 324–36.

11. Craxton, Ali, Shears S.B. Comparison of the activities of a multiple inositol polyphosphate phosphatase obtained from several sources: a search for heterogeneity in this enzyme. Biochem J. 1995; 305(Pt 2): 491–8.

12. Nogimori, Hughes P.J., Glennon M.C., Hodgson M.E., Putney J.W. Jr., Shears S.B. Purification of an inositol (1,3,4,5)-tetrakisphosphate 3-phosphatase activity from rat liver and the evaluation of its substrate specificity. J Biol Chem. 1991; 266(25): 16499–506.

13. Gimm, Chi, Dahia P.L. Somatic mutation and germline variants of MINPP1, a phosphatase gene located in proximity to PTEN on 10q23.3, in follicular thyroid carcinomas. J Clin Endocrinol Metab. 2001; 86(4): 1801–5.

14. Windhorst, Lin, Blechner. Tumour cells can employ extracellular Ins(1,2,3,4,5,6)P6 and multiple inositol-polyphosphate phosphatase 1 (MINPP1) dephosphorylation to improve their proliferation. Biochem J. 2013; 450(1): 115–25.

15. Hidaka, Kanematsu, Caffrey J.J., Takeuchi, Shears S.B., Hirata. The importance to chondrocyte differentiation of changes in expression of the multiple inositol polyphosphate phosphatase. Exp Cell Res. 2003; 290(2): 254–64.

16. Miron R.J., Bosshardt D.D., Zhang, Buser, Sculean. Gene array of primary human osteoblasts exposed to enamel matrix derivative in combination with a natural bone mineral. Clin Oral Investig. 2013; 17(2): 405–10.

17. Cho, King J.S., Qian, Harwood A.J., Shears S.B. Dephosphorylation of 2,3-bisphosphoglycerate by MIPP expands the regulatory capacity of the Rapoport-Luebering glycolytic shunt. Proc Natl Acad Sci USA. 2008; 105(16): 5998–6003.

18. Romano P.R., Wang, O'Keefe R.J., Puzas J.E., Rosier R.N., Reynolds P.R. HiPER1, a phosphatase of the endoplasmic reticulum with a role in chondrocyte maturation. J Cell Sci. 1998; 111(Pt 6): 803–13.

19. Roy, Kucukural, Zhang. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010; 5(4): 725–38.

20. Pei, Kim B.H., Grishin N.V. PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 2008; 36(7): 2295–300.

21. Zhang, Skolnick. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005; 33(7): 2302–9.

22. Zhang. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008; 9: 40.

23. Andjelkovic, Alessi D.R., Meier. Role of translocation in the activation and function of protein kinase B. J Biol Chem. 1997; 272(50): 31515–24.

24. Mishra I.G., Sharad. Molecular characterization and comparative phylogenetic analysis of phytases from fungi with their prospective applications. FTB-3196, Review; 2013.

25. Rigden D.J. The histidine phosphatase superfamily: structure and function. Biochem J. 2008; 409(2): 333–48.

26. Fothergill-Gilmore L.A., Watson H.C. The phosphoglycerate mutases. Adv Enzymol Relat Areas Mol Biol. 1989; 62: 227–313.

27. Stentz, Osborne, Horn. A bacterial homolog of a eukaryotic inositol phosphate signaling enzyme mediates cross-kingdom dialog in the mammalian gut. Cell Rep. 2014; 6(4): 646–56.

28. Van Dijken, Lammers A.A., Van Haastert P.J. In Dictyostelium discoideum inositol 1,3,4,5-tetrakisphosphate is dephosphorylated by a 3-phosphatase and a 1-phosphatase. Biochem J. 1995; 308(Pt 1): 127–30.

29. Wilson D.W., Lewis M.J., Pelham H.R. pH-dependent binding of KDEL to its receptor in vitro. J Biol Chem. 1993; 268(10): 7465–8.

30. Cabrera, Muniz, Hidalgo, Vega, Martin M.E., Velasco. The retrieval function of the KDEL receptor requires PKA phosphorylation of its C-terminus. Mol Biol Cell. 2003; 14(10): 4114–25.

31. Collet J.F., Stroobant, Pirard, Delpierre, Van Schaftingen. A new class of phosphotransferases phosphorylated on an aspartate residue in an amino-terminal DXDX(T/V) motif. J Biol Chem. 1998; 273(23): 14107–12.

32. Kim, Gentry S.M., Harris E.T., Wiley E.S., Lawrence C.J.J., Dixon E.J. A conserved phosphatase cascade that regulates nuclear membrane biogenesis. Proc Natl Acad Sci USA. 2007; 104(16): 6596–601.

33. Lemmon M.A., Ferguson K.M. Signal-dependent membrane targeting by pleckstrin homology (PH) domains. Biochem J. 2000; 350(Pt 1): 1–18.

34. Rebecchi M.J., Scarlata. Pleckstrin homology domains: a common fold with diverse functions. Annu Rev Biophys Biomol Struct. 1998; 27: 503–28.

35. Ramachandran, Pucadyil J.T., Liu Y.W. Membrane insertion of the pleckstrin homology domain variable loop 1 is critical for dynamin-catalyzed vesicle scission. Mol Biol Cell. 2009; 20(22): 4630–9.

36. Lee D.C., Cottrill M.A., Forsberg C.W., Jia. Functional insights revealed by the crystal structures of Escherichia coli glucose-1-phosphatase. J Biol Chem. 2003; 278(33): 31412–8.

Computational Analysis Reveals a Successive Adaptation of Multiple Inositol Polyphosphate Phosphatase 1 in Higher Organisms through Evolution
