Abstract
This study aimed to assess the diagnostic performance of magnetic resonance spectroscopy in differentiating high-grade gliomas from metastases. PubMed, Embase, and Chinese Biomedical databases were systematically searched for relevant studies published through 10 July 2016. Based on the data from eligible studies, heterogeneity and threshold effect tests were performed, and the pooled sensitivity, specificity, and area under the summary receiver-operating characteristic curve of magnetic resonance spectroscopy were calculated. Seven studies with a total of 261 patients were included. Quantitative synthesis showed that the pooled sensitivity/specificity of the choline/N-acetyl-aspartate (Cho/NAA) and choline/creatine (Cho/Cr) ratios in the peritumoral region were 0.85 (95% confidence interval: 0.79–0.90)/0.93 (95% confidence interval: 0.80–0.99) and 0.86 (95% confidence interval: 0.76–0.92)/0.86 (95% confidence interval: 0.73–0.94), respectively. The corresponding areas under the summary receiver-operating characteristic curve were 0.95 and 0.90. The overall pooled sensitivity, specificity, and area under the curve of magnetic resonance spectroscopy for distinguishing high-grade gliomas from metastases were 0.85 (95% confidence interval: 0.79–0.90), 0.84 (95% confidence interval: 0.75–0.90), and 0.90, respectively. We concluded that magnetic resonance spectroscopy demonstrated moderate diagnostic performance in distinguishing high-grade gliomas from metastases. Furthermore, the Cho/NAA ratio showed higher specificity and a higher area under the curve than the Cho/Cr ratio in the peritumoral region. We suggest that the Cho/NAA ratio of the peritumoral region be used to improve the diagnostic accuracy of magnetic resonance spectroscopy for differentiating high-grade gliomas from metastases.
Introduction
Similar to primary brain tumors, brain metastases (METs) impair neuronal function, displace and destroy normal brain tissue, and induce cerebral edema, resulting in neurocognitive impairment. 1 Solitary intracranial metastatic lesions are usually indistinguishable from high-grade gliomas (HGGs) on magnetic resonance imaging (MRI). This distinction is important because the approach to imaging, diagnostic workup, planning, treatment modalities, and follow-up differs between these two common brain tumors. 2
However, conventional MRI has a limited capacity to differentiate these two types of intracerebral lesions because their neuroimaging appearance is often similar, equivocal, or indistinguishable. 2 Magnetic resonance spectroscopy (MRS) is an MR method that has begun to play an important role in determining brain tumor type and grade. 3 MRS provides information about metabolic tissue composition, and advanced spectroscopic methods have been used to quantify markers of tumor metabolism (e.g. glucose), membrane turnover and proliferation (e.g. choline (Cho)), energy homoeostasis (e.g. creatine (Cr)), intact glioneuronal structures (e.g. N-acetyl-aspartate (NAA)), and necrosis (e.g. lactate (Lac) or lipids). 4
Few reports describe the sensitivity (SEN) and specificity of MRS for differentiating HGGs from METs. A previous systematic review 5 dates back to 2006, and a substantial amount of literature has been published since then. The aim of this work was to systematically review and meta-analyze the diagnostic performance of MRS in differentiating HGGs from METs based on the eligible published studies.
Materials and methods
Literature search strategy
The PubMed, Embase, and Chinese Biomedical (CBM) databases were systematically searched to select relevant published articles (through 10 July 2016), with the language restricted to English and Chinese. We used the following keywords: (“magnetic resonance spectroscopy” or “MR spectroscopy” or “MRS”) AND (glioma OR “brain neoplasm”) AND (“Neoplasm Metastasis” or “tumor metastasis”). Additionally, the reference lists of all retrieved articles were checked for other potential eligible studies that were not identified in the initial search.
Inclusion and exclusion criteria
The inclusion criteria were as follows: (1) MRS was used to differentiate HGGs from METs in patients with no clinical history of previous surgery, chemotherapy, or radiotherapy; (2) values of true positive (TP), false positive (FP), false negative (FN), true negative (TN), SEN, specificity (SPE), positive likelihood ratio (LR+), and negative likelihood ratio (LR−) could be accurately calculated from the data reported; (3) at least 14 patients were included; (4) pathology and/or clinical follow-up were used as the reference standard; (5) data were not overlapping; and (6) only English and Chinese language full-text publications were included. The following types of studies were excluded: animal studies, abstracts, reviews, case reports, letters, editorials, comments, and conference proceedings.
Potentially relevant articles were evaluated independently by two authors (Q.W. and W.L.X.) following the inclusion and exclusion criteria. If the two authors could not reach agreement, the inconsistencies were discussed and resolved by a third author (B.N.X.).
Data extraction and quality assessment
The screened articles were assessed independently by two of the reviewers (J. S. Zhang and X.L. Chen). For each included study, basic characteristics (authors, year of publication, and country of origin), patient characteristics (mean age, sex, number, and type of tumors), and technical aspects (imaging field strength, techniques of spectrum acquisition, device parameters, metabolite ratios, cut-off value, metabolite ratio mean value in different types of tumors, and reference standard) were obtained. For the differentiation, HGGs (grades III–IV) were defined as positive and METs as negative. The numbers of TP, FP, FN, and TN results were calculated and recorded. The methodological quality of the studies was assessed using the Quality Assessment Tool for Diagnostic Accuracy Studies version 2 (QUADAS-2). 6 Disagreements were resolved by consensus.
Statistical analysis
Standard methods recommended for meta-analysis of diagnostic accuracy were used.7,8 We computed the Spearman correlation coefficient between the logit of SEN and the logit of (1−SPE) to assess whether a threshold effect contributed to heterogeneity. A strong positive correlation with p < 0.05 would suggest a threshold effect. The extent of heterogeneity was then assessed using the chi-square test and the inconsistency index (I²) of the diagnostic odds ratio (DOR). Significant heterogeneity was considered present if p < 0.1 or I² > 50%. In that case, a random-effects coefficient binary regression model was used to summarize test performance; otherwise, a fixed-effects model was used.9,10 Meta-regression analysis was used to explore the source of heterogeneity.
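The two heterogeneity checks described above can be sketched in a few lines. This is an illustrative outline only, not the Meta-DiSc implementation: the function names are hypothetical, the Spearman coefficient is computed as a Pearson correlation of average ranks, and I² is derived from Cochran's Q on the log DOR with inverse-variance weights. The p-values are omitted because they require a t or chi-square distribution from a statistics library.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation (assumes neither input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_threshold(sens, spec):
    """Spearman rho between logit(SEN) and logit(1 - SPE) across studies."""
    x = [logit(s) for s in sens]
    y = [logit(1.0 - s) for s in spec]
    return pearson(average_ranks(x), average_ranks(y))


def cochran_q_i2(log_dors, variances):
    """Cochran's Q and I^2 (in percent) for per-study log DORs."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, log_dors)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, log_dors))
    df = len(log_dors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```

A rho close to +1 would point toward a threshold effect, and an I² above 50% would trigger the random-effects model under the rule stated above.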
SEN, SPE, LR+, LR−, and DOR, with their 95% confidence intervals (CIs), were calculated for each study and pooled, and the same approach was used in the subgroup analyses. We added a value of 0.5 to all cells of studies that had SENs or SPEs of 100%. The summary receiver-operating characteristic curve (SROC), area under the curve (AUC), and Q* index (the point on the SROC at which SEN and SPE are equal, used as a global measure of diagnostic performance) were calculated. AUC values of 51%–70%, 71%–90%, and >90% indicated low, moderate, and high diagnostic accuracy, respectively.
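The per-study measures follow directly from the 2×2 counts. The sketch below is a minimal illustration with a hypothetical function name. It applies the 0.5 correction whenever any cell is zero, which covers the 100% SEN or SPE case described above but is slightly broader, and it uses the usual log-scale standard error for the DOR confidence interval, which is an assumption about the exact method used.

```python
import math


def accuracy_metrics(tp, fp, fn, tn, z=1.96):
    """SEN, SPE, LR+, LR-, and DOR (with CI) from one study's 2x2 counts.

    HGG is the positive class and MET the negative class.
    """
    # Continuity correction: add 0.5 to every cell if any cell is zero
    # (assumption: broader than "SEN or SPE of 100%", which is fn == 0 or fp == 0).
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))

    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    lr_pos = sen / (1.0 - spe)
    lr_neg = (1.0 - sen) / spe
    dor = (tp * tn) / (fp * fn)

    # Standard error of ln(DOR) and the corresponding confidence interval.
    se_log_dor = math.sqrt(1.0 / tp + 1.0 / fp + 1.0 / fn + 1.0 / tn)
    dor_ci = (
        math.exp(math.log(dor) - z * se_log_dor),
        math.exp(math.log(dor) + z * se_log_dor),
    )
    return {
        "sen": sen,
        "spe": spe,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
        "dor_ci": dor_ci,
    }


# Hypothetical counts, not taken from any included study.
example = accuracy_metrics(tp=40, fp=5, fn=10, tn=45)
```

Note that the DOR equals LR+ divided by LR− within a single study, but this identity need not hold exactly for separately pooled values.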
Subgroup analysis was performed when a homogeneous set of studies shared similar design variables, and only when more than three studies could be included. Tests of interaction were performed to assess differences between subgroups. 11 The statistical analyses mentioned above were performed using Meta-DiSc statistical software version 1.4. 8
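A test of interaction between two subgroup estimates can be sketched as a z-test on the difference of the log estimates. This is a generic outline under stated assumptions rather than the exact procedure of reference 11: the standard errors are recovered from the 95% CIs on the log scale, the two estimates are treated as independent (which does not hold if subgroups share studies), and the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log ratio measure, recovered from its CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def interaction_test(est_a, ci_a, est_b, ci_b):
    """Two-sided z-test for the difference between two ratio estimates.

    est_a, est_b: ratio measures such as DORs.
    ci_a, ci_b:   (lower, upper) 95% confidence limits.
    Returns (z, p).
    """
    diff = math.log(est_a) - math.log(est_b)
    se_diff = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

The function could be applied to two subgroup DORs and their confidence limits once those are available; a small p-value would suggest that the subgroups differ by more than chance.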
Publication bias was assessed by Deeks' funnel plot asymmetry test. Formal testing for publication bias was conducted using a regression of the diagnostic log odds ratio against 1/ESS^(1/2), the inverse square root of the effective sample size (ESS), weighted by the ESS, with p < 0.1 indicating significant asymmetry. 12 This statistical analysis was performed using Stata 12.0 software (StataCorp LP, College Station, TX, USA).
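The regression behind the funnel plot asymmetry test can be sketched as a weighted least-squares fit. The outline below is illustrative and is not the Stata routine: function names are hypothetical, ESS is taken as 4·n1·n2/(n1 + n2) with n1 and n2 the numbers of HGG and MET cases, and only the slope and its standard error are returned, since the p-value requires a t distribution with k − 2 degrees of freedom from a statistics library.

```python
import math


def effective_sample_size(n_pos, n_neg):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_pos * n_neg / (n_pos + n_neg)


def deeks_regression(log_dors, n_pos, n_neg):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), weights = ESS.

    Needs at least three studies. Returns (slope, intercept, se_slope).
    """
    ess = [effective_sample_size(a, b) for a, b in zip(n_pos, n_neg)]
    x = [1.0 / math.sqrt(e) for e in ess]
    w_sum = sum(ess)

    # Weighted means of predictor and outcome.
    x_bar = sum(w * xi for w, xi in zip(ess, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(ess, log_dors)) / w_sum

    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(ess, x))
    sxy = sum(
        w * (xi - x_bar) * (yi - y_bar)
        for w, xi, yi in zip(ess, x, log_dors)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Weighted residual variance and standard error of the slope.
    resid = sum(
        w * (yi - intercept - slope * xi) ** 2
        for w, xi, yi in zip(ess, x, log_dors)
    )
    dof = len(x) - 2
    se_slope = math.sqrt(resid / dof / sxx)
    return slope, intercept, se_slope
```

A slope that differs significantly from zero indicates asymmetry, that is, a systematic relation between study size and reported accuracy.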
Results
Study selection and characteristics
The study selection process is described in Figure 1. After a full-text review, the remaining seven studies,2,13–18 comprising a total of 261 patients (274 lesions) with radiologically suspected untreated HGG or MET, met all inclusion and exclusion criteria. The detailed characteristics of the included studies are summarized in Table 1.

Figure 1. Flow diagram of the study selection process.
Table 1. Characteristics of studies included in the meta-analysis of MRS for the differential diagnosis of HGGs from METs.
EL: enhancing lesion; His: histology; F: female; FN: false negative; FP: false positive; LM13: lipid and macromolecular peaks at 1.3 ppm; LTE: long echo time; M: male; MVS: multi-voxel spectroscopy; NA: not available; NL: nonenhancing lesion; ROI: region of interest; STE: short echo time; SVS: single-voxel spectroscopy; TN: true negative; TP: true positive; Cho: choline; Cr: creatine.
As shown in Table 1, four studies14–17 were retrospective cohort studies, and three studies2,13,18 were prospective. Sample sizes ranged from 18 to 65. Histological results obtained from either surgical biopsy or resection were used as the main reference standard in all studies. Among the 274 lesions with MRS of appropriate quality, there were 162 HGGs and 112 METs. The detailed grading of HGGs was available in all but one study. 16
As for imaging field strength, 1.5-T MRI scanners were used in five studies and 3.0-T scanners in two. Five studies used multi-voxel spectroscopy (MVS), one study 13 used single-voxel spectroscopy (SVS), and one study 17 used both techniques. The echo time (TE) of the spectroscopic sequence was also recorded: a short TE sequence was used in three studies and a long TE sequence in four.
The ratios of brain metabolites (Cho/Cr, Cho/NAA, and NAA/Cr) were measured in different regions of interest to distinguish HGGs from METs. Most studies used metabolite ratios in the peritumoral region (PTR), while two articles used only lipid in the intratumoral area. The risk of bias and concerns regarding the applicability of the studies are shown in Figure 2. In most studies, the risk of bias was low or unclear. Overall, the study quality was satisfactory.

Figure 2. The methodological quality analysis of the seven eligible studies using the QUADAS-2 tool.
Quantitative synthesis
Overall pooled synthesis
The Spearman correlation coefficient was 0.432 (p = 0.333), indicating no notable threshold effect. The fixed-effects model was used because no other heterogeneity was detected (I² = 6.1%). The pooled weighted values were as follows: SEN: 0.85 (95% CI: 0.79–0.90); SPE: 0.84 (95% CI: 0.75–0.90); LR+: 5.19 (95% CI: 3.35–8.06); LR−: 0.19 (95% CI: 0.13–0.27); and DOR: 25.04 (95% CI: 13.01–48.20). The forest plots from the seven studies are shown in Figure 3(a). The AUC of the SROC curve was 0.8966 (Figure 4(a)). We could not perform subgroup analyses by imaging field strength (1.5 and 3.0 T) or technique of spectrum acquisition (MVS and SVS) because of the limited number of studies.
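As a rough plausibility check, the AUC and Q* of a symmetric SROC curve can be written in closed form as functions of a single DOR. The sketch below rests on that symmetry assumption and is not the curve fitted by Meta-DiSc, so it will only approximate the reported AUC. The function names and the "uninformative" label are hypothetical; the other bands follow the cut-offs given in the Statistical analysis section.

```python
import math


def symmetric_sroc_auc(dor):
    """AUC of a symmetric SROC curve with a constant DOR."""
    if dor == 1.0:
        return 0.5
    return dor / (dor - 1.0) ** 2 * ((dor - 1.0) - math.log(dor))


def q_star(dor):
    """Point on a symmetric SROC curve where SEN equals SPE."""
    root = math.sqrt(dor)
    return root / (1.0 + root)


def accuracy_band(auc):
    """Map an AUC (as a proportion) to the bands used in this review."""
    if auc > 0.90:
        return "high"
    if auc > 0.70:
        return "moderate"
    if auc > 0.50:
        return "low"
    return "uninformative"  # label is an assumption, not from the source


# Overall pooled DOR reported above.
approx_auc = symmetric_sroc_auc(25.04)
band = accuracy_band(0.8966)
```

Under this approximation the pooled DOR of 25.04 gives an AUC close to the reported 0.8966, which falls in the moderate band.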

Figure 3. Forest plot showing the sensitivity and specificity of different metabolite ratios for the differentiation of HGGs from METs: (a) overall, (b) Cho/NAA ratio, and (c) Cho/Cr ratio.

Figure 4. Summary receiver-operating characteristic curve (SROC): (a) overall, (b) Cho/NAA ratio, and (c) Cho/Cr ratio. AUC: area under the curve.
Cho/NAA ratio and Cho/Cr ratio
A total of 134 MRS examinations were performed to detect HGGs by calculating the ratio of Cho to NAA in the PTR in four included studies.2,14,16,17 The diagnostic threshold of Cho/NAA ranged between 1 and 1.11. Neither a threshold effect (p = 0.600) nor heterogeneity (I² = 11.9%) was detected among the individual studies. The pooled weighted values were SEN: 0.85 (95% CI: 0.79–0.90); SPE: 0.93 (95% CI: 0.80–0.99); LR+: 9.25 (95% CI: 3.65–23.41); LR−: 0.11 (95% CI: 0.06–0.22); and DOR: 74.9 (95% CI: 20.68–270.95). The forest plots from the four studies are shown in Figure 3(b). The AUC of the SROC curve was 0.9504 (Figure 4(b)).
A total of 133 MRS examinations were analyzed in three studies2,15,16 to distinguish HGGs from METs by calculating the ratio of Cho to Cr in the PTR. The diagnostic threshold of Cho/Cr ranged between 0.4 and 1.24. No threshold effect (p = 0.667) was found. Significant heterogeneity was observed among the individual studies (I² = 50.1%). Therefore, the test performance was summarized using a random-effects coefficient binary regression model. The pooled SEN and SPE values were 0.86 (95% CI: 0.76–0.92) and 0.86 (95% CI: 0.73–0.94), respectively (Figure 3(c)). The pooled LR+ was 5.36 (95% CI: 1.43–20.03) and the pooled LR− was 0.18 (95% CI: 0.15–0.30). The pooled DOR was 29.41 (95% CI: 5.52–156.72) and the AUC of the SROC curve was 0.8959 (Figure 4(c)).
Only two included studies13,18 used lipid in the intratumoral area to detect HGGs. Quantitative synthesis of studies measuring other metabolites was not possible because of limited data.
Finally, we compared the AUC among the three groups with different metabolites. The Cho/NAA group (AUC = 0.9504) showed the best diagnostic performance. The pooled SEN, SPE, LR+, LR−, DOR, and AUC of the different groups are summarized in Table 2.
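To illustrate how random-effects pooling widens the uncertainty when studies disagree, the sketch below uses the DerSimonian–Laird estimator on per-study log DORs. This is a commonly used technique named here as an assumption; it is not necessarily identical to the random-effects coefficient binary regression model referred to above, and the function name is hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate of per-study effects (e.g. ln DOR).

    Returns (pooled, se, tau2), where tau2 is the between-study variance.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    w_sum = sum(w)

    # Fixed-effect estimate and Cochran's Q.
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / w_sum
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of the between-study variance.
    c = w_sum - sum(wi ** 2 for wi in w) / w_sum
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Re-weight each study by its total (within + between) variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

When tau2 is zero the result coincides with the fixed-effects estimate; otherwise the weights become more equal across studies and the confidence interval becomes wider.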
Table 2. Subgroup analyses of diagnostic accuracy variables.
AUC: area under the curve; Cho: choline; CI: confidence intervals; DOR: diagnostic odds ratio; LR+: positive likelihood ratio; LR−: negative likelihood ratio; SEN: sensitivity; SPE: specificity; SE: standard error.
Heterogeneity analysis and publication bias
In the pooled analysis of the Cho/Cr group, we found significant heterogeneity that could not be explained by a threshold effect. Meta-regression analysis could not be conducted because of the limited number of studies. There was no significant heterogeneity in the pooled analyses of the overall group (p = 0.67) and the Cho/NAA group (p = 0.43).
Deeks' funnel plot asymmetry test was performed to assess publication bias. No evidence of publication bias was found in the overall group (p = 0.67) or the Cho/Cr group (p = 0.43), while publication bias may exist in the Cho/NAA group (p = 0.04) (Figure 5).

Figure 5. The funnel plot of publication bias: (a) overall, (b) Cho/NAA ratio, and (c) Cho/Cr ratio.
Discussion
The earlier systematic review 5 concluded that it was uncertain whether MRS could reliably differentiate HGGs from METs because of the limited number of studies. A recently published meta-analysis 19 also concluded that HGGs and METs cannot be reliably differentiated by MRS. In those reviews, few of the included articles described the FP and FN ratios for differentiating HGGs from METs using MRS. The main problem was that the authors extracted data from the related articles without distinguishing intratumoral from peritumoral regions, which would inevitably increase heterogeneity and make the conclusions unreliable. Therefore, we performed this systematic review and meta-analysis based on accurate calculations of relevant data.
Overall pooled synthesis
According to the quantitative synthesis, the pooled SEN and SPE were 0.85 and 0.84, and the AUC of the SROC curve was 0.8966, suggesting a moderate level of overall accuracy. The DOR is a single indicator of test accuracy that combines the SEN and SPE data into a single number. 20 In our meta-analysis, the pooled DOR for the overall group was 25.04, indicating that MRS may be helpful in distinguishing HGGs from METs. LR+ and LR− were also used to assess diagnostic accuracy because these values are more readily interpreted in clinical practice than the SROC curve and the DOR. An LR+ of 5.19 suggests that HGG patients have about a 5-fold higher chance of a positive test (>cutoff value) compared with MET patients. On the other hand, the LR− was 0.19, meaning that a negative test (metabolite or metabolite ratio below the cutoff value) is about one-fifth as likely in HGG patients as in MET patients, which is not low enough to exclude HGG. Neither heterogeneity nor publication bias was detected in the overall pooled analysis.
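Likelihood ratios act on odds, so turning them into a probability of HGG for an individual patient requires a pre-test probability. The sketch below applies Bayes' rule in odds form. The function name is hypothetical, and the pre-test probability of 162/274 is only an illustrative assumption taken from the proportion of HGG lesions in the pooled sample; it is not a clinical prevalence estimate.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Post-test probability from a pre-test probability and an LR."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Illustrative pre-test probability: share of HGG lesions in the pooled sample.
pre_test = 162 / 274

# Probability of HGG after a positive and after a negative MRS result,
# using the overall pooled LR+ and LR- reported above.
after_positive = post_test_probability(pre_test, 5.19)
after_negative = post_test_probability(pre_test, 0.19)
```

With a different pre-test probability the same likelihood ratios yield different post-test probabilities, which is why the LR− alone cannot be read as the chance that a patient has HGG.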
Comparisons between Cho/NAA ratio and Cho/Cr ratio
As previously reported, MRS is able to differentiate HGGs from METs through analysis of the peritumoral edema. 2 An increased Cho/Cr ratio in the PTR has been reported in HGG compared with that in MET.18,21,22 Another reason for its use is that creatine concentration is known to be relatively stable during the formation of the anaplastic foci, in contrast to Cho concentration, which increases progressively. 3 However, the AUC values using peritumoral Cho/Cr and Cho/NAA ratios for discrimination of HGGs from METs were not consistent.2,23 The optimal ratio remains controversial.
Based on our meta-analysis, the Cho/NAA ratio (SEN = 0.85, SPE = 0.93) showed higher SPE than the Cho/Cr ratio (SEN = 0.86, SPE = 0.86) in the diagnosis of HGGs. For the Cho/NAA and Cho/Cr ratios, the AUC of the SROC curve was 0.9504 and 0.8959, respectively, indicating that the Cho/NAA ratio has a high level of overall accuracy. The pooled DORs of the Cho/NAA and Cho/Cr ratios were 74.9 and 29.41, respectively, indicating that the Cho/NAA ratio may be more helpful in the diagnosis of HGGs than the Cho/Cr ratio. Taking the diagnostic performance, SEN, and SPE into consideration, the Cho/NAA ratio may be a superior index for distinguishing HGGs from METs in the PTR. However, the efficacy of the Cho/NAA ratio needs further confirmation because of the limited data available at present.
Heterogeneity was observed in the pooled analysis of the Cho/Cr ratio group. Therefore, we used a random-effects coefficient binary regression model to calculate the test performance of the Cho/Cr ratio. We could not find the source of the heterogeneity. There may be publication bias for the Cho/NAA ratio according to the results of Deeks' funnel plot asymmetry test (p = 0.04), owing to small-study effects (the tendency for the small studies in a meta-analysis to show high accuracy).
Other metabolite ratios
Previous studies 24 have reported that MRS values in the intratumoral region do not differ significantly between brain METs and gliomas. However, some studies indicate that the lipid signals provide significant information toward determining whether a tumor is a solitary metastasis or a glioblastoma. 18 In our meta-analysis, two included studies13,18 used lipid in the intratumoral area to detect HGG. The SEN/SPE was 78%/83% and 80%/79%, respectively. Considering this relatively low accuracy, we do not recommend using intratumoral spectra to differentiate HGGs from METs. The number of articles regarding other metabolite ratios was not adequate to perform a meta-analysis with credible results.
Limitations
This meta-analysis revealed moderate overall diagnostic accuracy for MRS in distinguishing HGGs from METs; however, some design limitations should be considered when interpreting our results.
First, heterogeneity that a threshold effect cannot explain was found in the Cho/Cr ratio group. Factors such as year of publication, country, study design, imaging field strength, voxel technique, and TE may have contributed to heterogeneity, but we were unable to perform the corresponding subgroup analyses because of the limited data. In addition, although heterogeneity was not present in the overall and Cho/NAA groups, there was considerable variation in study design. The MRI devices used in the included studies were not clearly described and may have varied, and the heterogeneity caused by this was unavoidable.
Second, there may be publication bias in the Cho/NAA group. Our meta-analysis was based only on published studies, which tend to report high accuracy; studies with lower accuracy are often rejected or not even submitted. However, it has been suggested that the quality of the data reported in articles accepted for publication in peer-reviewed journals is superior to the quality of unpublished data. 25 In addition, this review was restricted to full-text articles published in English and Chinese, omitting eligible studies that were unpublished or reported in other languages, which is also likely to result in bias.
Finally, the diagnostic accuracy may be affected by the limited number of studies and patients. Studies with small sample sizes are strongly affected by adding 0.5 to each cell as a correction for zero entries. Furthermore, the conclusion is potentially valuable only as a general guide, and more multicenter trials with large samples are needed to confirm it.
Conclusion
This meta-analysis provides evidence that MRS has moderate diagnostic performance in distinguishing HGGs from METs. The Cho/NAA ratio showed higher SPE and a higher AUC than the Cho/Cr ratio in the PTR. MRS can differentiate HGGs from METs, especially with peritumoral measurements, and the Cho/NAA ratio of the PTR may be a superior index. We suggest that the Cho/NAA ratio of the PTR be used to improve the diagnostic accuracy of MRS for differentiating HGGs from METs.
Footnotes
Acknowledgements
The scientific guarantor of this publication is JianMin Zhang, PhD. The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article. One of the authors (J.Z.) has significant statistical expertise. Neither institutional review board approval nor written informed consent was required because of the nature of our study, which was a systematic review and meta-analysis. Methodology: meta-analysis, performed at one institution. Q.W., J.Z., and W.X. are the first authors and they contributed equally to this work. No overlap with previously published works was present except for the statistical methods of meta-analysis.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by Hospital Young Doctor Funding Plan of Chinese PLA General Hospital (grant number: 15KMM19), Hospital Clinical Sponsor Foundation Plan of Chinese PLA General Hospital (grant number: 2016FC-TSYS-1023), and National Natural Science Foundation of China (grant number: NSFC81271515). The sponsor had no role in the design or conduct of this research.
