Abstract
BACKGROUND:
Leucine-rich alpha-2-glycoprotein (LRG) has been repeatedly proposed as a potential plasma biomarker for myelodysplastic syndrome (MDS).
OBJECTIVE:
The goal of our work was to establish the total LRG plasma level and LRG posttranslational modifications (PTMs) as a suitable MDS biomarker.
METHODS:
The total plasma LRG concentration was determined with ELISA, whilst the LRG-specific PTMs and their locations were established using mass spectrometry and re-analysis of public mass spectrometry data. Homology modelling and sequence analysis were used to establish the potential impact of PTMs on LRG functions via their impact on the LRG structure.
RESULTS:
While the results showed that the total LRG plasma concentration is not a suitable MDS marker, alterations within two LRG sites correlated with MDS diagnosis (
CONCLUSIONS:
We report the presence of LRG proteoforms that correlate with diagnosis in the plasma of MDS patients. The combination of mass spectrometry, re-analysis of publicly available data, and homology modelling represents an approach that can be used for any protein to predict clinically relevant protein sites for biomarker research, even when the character of the PTMs is unknown.
Introduction
The physiological role of leucine-rich alpha-2-glycoprotein (LRG) is not yet fully understood. Studies have demonstrated the possible involvement of LRG in the innate immune response [1], and LRG has been proposed as a potential marker of granulocytic differentiation [2]. Alterations in LRG serum expression have been observed in several malignant diseases, including lung [3] and pancreatic [4] cancers. In our previous work focusing on the proteomics of myelodysplastic syndrome (MDS) [5, 6, 7, 8], LRG was proposed as a potential biomarker candidate, especially for advanced MDS subgroups [9]. Our results suggested that the degree of LRG plasma alterations could be related to the severity of the disease. However, the available 2D electrophoretic data cannot directly identify the source of the changes (total plasma concentration or posttranslational modification (PTM) specific alterations). An increase in the LRG plasma level could be an acceptable explanation, as up-regulated LRG serum expression has also been observed for other malignant diseases [3, 4]. Moreover, western blot analysis for the RAEB-2 MDS subgroup suggested an increase in LRG concentration [8]. On the other hand, mass spectrometry-based relative quantification of selected LRG peptides (for the RAEB-1 and RAEB-2 MDS subgroups) strongly indicated the presence of PTMs [6, 8].
In order to shed more light on the existing LRG-related knowledge, we used mass spectrometry in combination with ELISA to map the LRG total plasma concentration and its PTMs for different MDS patient subgroups, together with healthy controls. In addition, re-analysis of published mass spectrometry plasma proteomic data was performed to target the potential PTMs and their positions in LRG. In order to estimate the potential impact of PTMs on LRG function via modifications in the LRG structure, we performed homology modelling and sequence analysis.
Methods
Plasma samples
Blood samples were collected as described previously [5]. In total, there were 101 different plasma samples used in the study: 71 samples obtained from MDS patients and 30 samples from healthy controls. The diagnoses of MDS subgroups were established according to the WHO classification criteria [10]. Samples were obtained and analysed in accordance with the Ethical Committee regulations of the Institute of Hematology and Blood Transfusion in Prague and with the Helsinki Declaration. All individuals tested agreed to participate in the study on the basis of informed consent.
Mass spectrometry
Relative label-free peptide quantification on the basis of Selected Ion Monitoring was performed as described in detail previously [6] using an HCT ultra ion-trap mass spectrometer with nanoelectrospray ionization (Bruker Daltonics, Bremen, Germany) coupled to a nanoLC system (UltiMate 3000; Dionex, Sunnyvale, CA, USA). Plasma samples were processed using modified protocols of the acetonitrile precipitation of plasma proteins and trypsin digestion as described by Kay et al. [11]. All LRG peptides were selected according to the PeptideAtlas [12]. Results were validated on the basis of the MS/MS spectra and retention times [13].
ELISA
Two different ELISA kits were used to establish the LRG total plasma concentration: SEB934Hu (Cloud-Clone Corp., Houston, TX, USA) and ELH-LRG1 (RayBiotech, Norcross, GA, USA). Samples were prepared and processed according to the manufacturers’ instructions.
Western blot
Western blot analysis was performed as described previously [14]. Monoclonal mouse anti-leucine-rich alpha-2-glycoprotein antibody (1:500; Abcam ab57992) and rabbit anti-mouse IgG antibody conjugated with peroxidase (1:80000; Sigma-Aldrich A9044) were used as primary and secondary antibodies, respectively. The membranes were incubated for 60 minutes at 30
LRG affinity isolation
LRG was isolated from blood and purified according to the protocol by Weivoda et al. [15].
Deglycosylation assay
An N-deglycosylation assay was performed according to Zielinska et al. [16] using PNGase F (G1549; Sigma-Aldrich, St. Louis, MO, USA).
LRG plasma level quantified by ELISA. Box plot of LRG plasma level measured by monoclonal- (A) and polyclonal-based (B) ELISA kits in the control cohort (N) and MDS patient subgroups (RA-RARS, RCMD, RAEB-1, and RAEB-2).
In order to search for LRG posttranslational modifications, different published plasma mass spectrometry data sets (healthy and cancer samples, depleted and non-depleted plasma samples, using data-dependent and data-independent acquisition, etc.) were re-analysed [17, 18, 19, 20, 21]. The following software was used: AB Sciex MS Data Converter (Beta 1.3), DIA Umpire 2.1.3 [22], ProteoWizard 3.0 [23], and Trans-Proteomic Pipeline 5.1.0 [24].
Sequence analysis
A total of 119 mammalian reference sequences of the LRG protein (Supplement S1.1) were downloaded from the NCBI database [25] and used for sequence analysis of orthologous proteins. For species with more than one splice variant, only the most common variant was considered. For paralogue analysis, sequences designated as paralogues to the LRG (ENST00000306390.7) protein in the Ensembl database [26] were used.
Sequences for both analyses were aligned with ClustalX [27] and sequence logos were created by the weblogo server [28]. Sequence analyses focused on the protein regions where PTMs were detected by mass spectrometry (E149 to K164 and V251 to R260).
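The idea behind reading conservation off an alignment can be illustrated with a minimal sketch. It is not the ClustalX/weblogo workflow used in the study; it only computes, for each alignment column, the most frequent residue and its frequency. The function name and the toy alignment are hypothetical and do not represent real LRG sequences.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return (most common residue, frequency) for each alignment column.

    Gap characters ('-') are counted like any other symbol, so a column
    that is mostly gaps is reported as such rather than silently ignored.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        residue, n = counts.most_common(1)[0]
        result.append((residue, n / len(aligned_seqs)))
    return result


# Hypothetical toy alignment (NOT real LRG orthologue data).
toy_alignment = [
    "ENQLEV",
    "ENQLDV",
    "DNRLEV",
    "ENQLEI",
]
conservation = column_conservation(toy_alignment)
# 0-based indices of columns where every sequence carries the same residue.
fully_conserved = [i for i, (_, freq) in enumerate(conservation) if freq == 1.0]
```

A sequence logo conveys the same per-column information graphically, weighting residues by information content rather than by raw frequency.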
Homology modelling
Homology modelling of human protein LRG (NP_443204) was performed with Modeller [29]. Alignment using crystal structures 2Z66_A [30] and 5A5C_A [31] as templates was prepared in ClustalX and adjusted manually. The best of the 25 models was chosen according to the Modeller objective function and stereochemical properties calculated with Procheck [32].
Statistical methods
For continuous data, we assessed statistical significance using non-parametric tests and correlation measures suitable for non-normally distributed data. The Mann-Whitney U test was used to assess the differences in LRG concentrations. The results were analysed as follows: Spearman’s rank-order correlation (for age and iron level), Mann-Whitney U test (for sex), and Kruskal-Wallis test (for diagnosis). A ROC curve was used to estimate LRG diagnostic ability. Sample power calculations were performed (80% of power,
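The link between the Mann-Whitney U test and the ROC curve can be shown with a minimal sketch: the U statistic divided by the product of the group sizes equals the area under the ROC curve. The statistical software used in the study is not stated in the text, so the function names and concentration values below are hypothetical, and no p-value is computed.

```python
def rank_with_ties(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: number of (a, b) pairs with a > b, ties = 0.5."""
    ranks = rank_with_ties(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


def auc_from_u(u, n_a, n_b):
    """Area under the ROC curve implied by the U statistic."""
    return u / (n_a * n_b)


# Hypothetical concentrations (NOT data from the study).
patients = [30.0, 35.0, 42.0, 50.0]
controls = [20.0, 22.0, 35.0]
u = mann_whitney_u(patients, controls)
auc = auc_from_u(u, len(patients), len(controls))
```

An AUC of 0.5 corresponds to no discrimination between the groups, and values approaching 1.0 to near-perfect separation.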
Characterisation of LRG peptides for relative label-free quantification
RT – retention time, PI – product ion.
Relative LRG quantification results
Fold change vs control group (N) for RA, RARS, RCMD, RAEB-1, and RAEB-2 subgroups.
The results indicated a decreased LRG concentration in MDS samples (Fig. 1A) with the lowest levels in the advanced MDS subgroups (mean/median values: 46.7/44.6, 47.5/47.7, 54.7/56.0, 35.8/38.8, and 38.5/37.7
In order to quantify the total LRG plasma concentration, we subsequently used a polyclonal antibody-based ELISA kit (ELH-LRG1). The results showed an increase in the LRG level in MDS samples (Fig. 1B) compared to the healthy controls (mean/median values: 22.6/21.9, 24.4/22.6, 32.4/30.5, 35.2/23.1, and 42.9/37.3
To specify MDS-related LRG PTMs, we analysed plasma samples of different MDS subgroups by MS/MS. No specific LRG modifications were found in the patient and control plasma samples; even the use of affinity-isolated and purified LRG provided no clues. Western blot analysis of N-glycosylation also did not show any changes (Supplement S1.2). For this reason, we further performed relative label-free quantification of LRG peptides among all of the samples in order to map peptides that change their PTMs in MDS; peptides for this assay were selected using the PeptideAtlas [12] and previously collected spectra (Table 1). Two of the peptides were found to change markedly among the samples (251–260 VAAGAFQGLR, 149–164 ENQLEVLEVSWLHGLK), especially in advanced MDS subgroups (Table 2). Subsequent statistical analysis showed a strong correlation with diagnosis for ENQLEVLEVSWLHGLK, indicating a relation of this peptide’s PTM(s) to MDS. No correlation with the iron level (a possible source of oxidative PTMs) was observed. As these two LRG peptides emerged as promising MDS-related PTM locations, we searched publicly available MS/MS data for possible LRG PTMs in order to find out whether these peptides are modified in other cancer-related or control plasma or serum cohorts. ENQLEVLEVSWLHGLK was found to be modified (Trp to kynurenine) in two datasets; this PTM was not identified in our MDS dataset. Therefore, the results collected thus far showed the presence of unknown PTM(s) and two LRG sites whose alterations correlated with MDS diagnosis.
The physiological role of LRG is not yet fully understood; however, some findings have shown its role in apoptosis [33, 34, 35, 36]. This involvement could represent a (patho)physiological link of LRG to MDS, as apoptosis is one of the hallmarks of myelodysplastic syndrome [37]. Since we were not able to determine the identity of the LRG PTMs but did detect their locations, we focused on homology modelling and sequence analysis in order to establish their potential impact on LRG functions through their influence on the LRG structure.
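A fold change versus the control group, as reported for the peptide quantification, can be sketched as a ratio of summary intensities. The text does not state which summary statistic was used, so the choice of the median here is an assumption, and the function name and peak intensities are hypothetical.

```python
from statistics import median


def fold_change(subgroup_intensities, control_intensities):
    """Ratio of the subgroup median peptide intensity to the control median.

    Using the median is an assumption of this sketch, chosen because it is
    robust to outliers in non-normally distributed data.
    """
    control_median = median(control_intensities)
    if control_median == 0:
        raise ValueError("control median is zero; fold change is undefined")
    return median(subgroup_intensities) / control_median


# Hypothetical peak intensities for one peptide (NOT data from the study).
control = [100.0, 120.0, 110.0]
subgroup = [55.0, 44.0, 66.0]
fc = fold_change(subgroup, control)
```

A value below 1 means that the unmodified peptide signal is lower in the subgroup than in controls, which is the pattern expected when part of the peptide pool carries a modification that shifts its mass.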
To determine the position of the sequences of interest within the protein tertiary structure, a homology model was constructed (Supplemental Fig. 1A and B), since no crystal structure of any LRG protein is available. Crystal structure 5A5C_A (the LRR-containing domain of a synthetic construct of murine LRRTM2 protein) was used as a template for amino acids 38 to 306 of human LRG because it has high homology with LRG (34%) and because it covers a long part of the modelled protein. We also looked for a crystal structure containing a disulfide bond at the positions corresponding to C43 and C56 in LRG, because such a disulfide bond is predicted in LRG [38]. In addition, we required a template with high homology to LRG, and the resolution of the crystal structure was the decisive factor among structures fulfilling these criteria. On this basis, crystal structure 2Z66 was considered the best template for the N-terminal region (amino acids 6 to 46) of human LRG. We also tried to improve the model using crystal structure 3RFE [39] as a template for the C-terminal part; this did not lead to improvement, so this structure is not included in the presented final model. When evaluating the homology model, one must be aware that LRG is a glycoprotein and that saccharides may influence protein secondary structure. The structure of the saccharides bound to LRG is, to the best of our knowledge, unknown; for this reason, we could not incorporate them into the model.
The presented structure is, as far as we know, the first homology model of human LRG protein built by an approach other than a “black-box” server; the mouse LRG structure was predicted by Wang et al. [40]. The model shows a molecule of a solenoid shape with a 10-stranded parallel
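How a percentage such as the 34% quoted for the template can be derived from a pairwise alignment is sketched below. Several definitions of percent identity exist and the text does not say which one was applied; this sketch divides the number of identical positions by the number of columns in which neither sequence has a gap. The function name and the aligned fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns without a gap in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments (NOT the real LRG/template alignment).
target = "LSNLQ-LDL"
template = "LRNLEELHL"
identity = percent_identity(target, template)
```

Dividing by the full alignment length or by the length of the shorter sequence instead would give a lower or different value for the same alignment, so the definition should be stated whenever identities are compared between templates.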
Sequence analyses were performed to identify conserved amino acids in the regions where PTMs were identified by mass spectrometry. Alterations of conserved amino acids are believed to have a bigger impact on protein structure and/or function than alterations of other amino acids. The only amino acid residue that is conserved among both orthologues and paralogues of LRG in the region E149 to K164 is N150 (Supplemental Fig. 2A and B), which is part of the LRR structure motif. Among paralogues, hydrophobic residues are conserved at positions 152, 155, 160 and 163 (except for threonine in LRRC52 at this position); among orthologues, L152, L160 and L163 are conserved. Among orthologues, a hydrophobic amino acid is also conserved at position 155 and tryptophan is conserved at position 159 (except in 5 of the 119 species: 3 bearing glycine and 2 bearing serine). This suggests that W159 is important for LRG and is likely involved in its function. Therefore, modification of W159 would be expected to influence LRG properties and functions substantially. Interestingly, this amino acid was found to be modified (Trp to kynurenine) in the data re-analysis, as mentioned above.
The other region bearing a PTM (V251 to R260; Supplemental Fig. 2C and D) is considerably less conserved than the previous one. There is no homology among paralogues. Sequence analysis of orthologues revealed conserved hydrophobic amino acids at positions 251, 256 and 259. The absence of conserved amino acid residues does not allow us to predict which residues are important for protein structure and which PTM may be of importance. For this reason, we do not discuss this region further. N150 is the only amino acid residue conserved among orthologues and paralogues of LRG. This amino acid is a part of the LxxLxLxxNxL sequence motif of LRR repeats and participates in formation of the
Conclusions
In conclusion, we report the presence of LRG proteoforms that correlate with diagnosis in the plasma of MDS patients. Using mass spectrometry, data re-analysis and other techniques, we specified two posttranslationally modified LRG sites with alterations related to MDS. Despite the unknown character of these modifications, homology modelling and sequence analyses indicated their potential influence on LRG functions. The results show the potential impact of LRG alterations in MDS biomarker research and specify the area for future focus. The nature of the PTMs can be difficult to reveal for several reasons: PTMs can be too unstable to be analysed by the MS method used; there can be combinations of several PTMs at the protein site (sequence), so that the specific proteoforms (PTM combinations) are too low in abundance to be identified; the natural individual variability of PTMs may hamper the identification of biomarker-relevant PTM(s); etc. The approach that combines mass spectrometry, re-analysis of publicly available data, and homology modelling can be used for any protein to predict clinically relevant protein sites for marker research even though the character of the PTMs is unknown.
Author contributions
Conception: PM.
Interpretation or analysis of data: PM, ZS, KP, JS, ZG, PP, VI, JED.
Preparation of the manuscript: PM, ZS, KP, PP, VI, JED.
Revision for important intellectual content: PM, ZS, KP, PP, VI.
Supervision: PM.
Supplementary data
The supplementary files are available to download from http://dx.doi.org/10.3233/CBM-210033.
sj-docx-1-cbm-10.3233_CBM-210033.docx - Supplemental material
Footnotes
Acknowledgments
This work was supported by the Czech Science Foundation (grant number 20-10845S) and by the Ministry of Health of the Czech Republic project for the conceptual development of the research organization (Institute of Hematology and Blood Transfusion, 00023736).
Conflict of interest
The authors declare no conflict of interest.
