Abstract
Purpose
The aim of this study was to investigate the prognostic relevance of the Ki-67 labeling index (LI) in gastrointestinal stromal tumor (GIST) through a systematic review, meta-analysis and diagnostic test accuracy review.
Method
The study included 1,967 GIST cases from 24 eligible studies. We investigated the correlation between high Ki-67 LI and survival and the proper criteria for high Ki-67 LI. In addition, a diagnostic test accuracy review was conducted to evaluate the predictive role of high Ki-67 LI for higher risk of tumor recurrence.
Results
A high Ki-67 LI was significantly correlated with worse disease-free survival (DFS) (hazard ratio [HR] 3.658, 95% confidence interval [CI] 2.687-4.979, p<0.001) and overall survival (OS) (HR 3.730, 95% CI 2.819-4.936, p<0.001). With regard to DFS and OS, the subgroup with a cutoff value of >4% for high Ki-67 LI had a higher HR than the subgroup with a ≤4% cutoff. In the diagnostic test accuracy review, a high Ki-67 LI was significantly correlated with higher risk of tumor recurrence (pooled sensitivity = 0.44, pooled specificity = 0.87, area under the curve on the summary receiver operating characteristic curve = 0.656).
Conclusions
Our results showed that a high Ki-67 LI was significantly correlated with worse prognosis and higher risk of tumor recurrence in GIST. Further prospective studies of the prognostic role of Ki-67 LI are necessary prior to application in clinical practice.
Introduction
Most gastrointestinal stromal tumors (GISTs) have at least some potential for malignancy. The AJCC staging system uses tumor size and mitotic rate as the staging parameters for these tumors (1). While the maximum tumor diameter is easy to determine, there is a possible bias in evaluating the number of mitoses (2). Uncontrolled proliferation is one of the chief characteristics of tumor cells and can be assessed using various methods, including counting mitoses under a microscope. However, stringent criteria are needed for defining mitosis in order to avoid counting pyknotic or dyskaryotic nuclei as mitoses (1).
One of the most widely practiced measurements is the immunohistochemical assessment of Ki-67 antigen, a nuclear marker expressed in all phases of the cell cycle other than G0 (3, 4). Previous investigations have demonstrated that the percentage of Ki-67-positive tumor cells, also known as the Ki-67 labeling index (LI), can be used to stratify risk in patients with GIST (5-7). However, Ki-67 LI is not an established prognostic factor in clinical practice or in the management of patients with GIST. In addition, Ki-67 LI can be affected by various factors, such as differences in immunohistochemical methods, evaluation methods and observers.
In the present study, we investigated the optimal criteria for defining high Ki-67 LI and its correlation with survival in 1,967 GISTs from 24 eligible studies. Moreover, a diagnostic test accuracy review was conducted to confirm the predictive role of high Ki-67 LI for high risk of tumor recurrence.
Materials and methods
Published Studies Search and Selection Criteria
Relevant articles were obtained by searching the PubMed and MEDLINE databases through April 30, 2015. Searches were performed using the following keywords: “gastrointestinal stromal tumor(s),” “Ki-67” (also “Ki 67” and “Ki67”) and “MIB-1” (also “MIB 1” and “MIB1”). Two independent investigators first screened titles, abstracts and keywords to determine whether each study met the criteria described below, and then retrieved the full-text publications of all potentially eligible articles. Review articles were also screened to identify additional eligible studies. The search results were then assessed against the following inclusion and exclusion criteria: (1) Ki-67 LI was investigated in human GIST tissue; (2) the correlation between Ki-67 LI and survival was reported; (3) case reports were excluded; and (4) the publication was in English or Korean.
Data Extraction
The following items were collected from the full texts of selected articles by J. Pyo and were verified by G. Kang: publication date, first author's name, clone name/source and dilution, cutoff for assessing high Ki-67 LI, and data allowing estimation of the impact of Ki-67 expression on disease-free survival (DFS) and/or overall survival (OS). Whenever possible, data were extracted according to the NIH risk-group stratification system (based on tumor size and mitotic index). We did not define a minimum number of patients needed to include a study in our analysis. Data associated with survival were extracted after a 60-month follow-up period. When the authors reported the same patient population in several publications, only the most recent or complete study was included in the analysis. Any disagreements were resolved by consensus.
Statistical Analysis
Ki-67 LI was considered high or low according to the cutoff values provided by the authors. For quantitative aggregation of survival results, the impact of Ki-67 on survival was analyzed according to hazard ratios (HRs) using 1 of 3 methods. In studies not quoting the HR or its confidence interval (CI), these variables were calculated from the presented data using the following parameters: the HR point estimate, the log-rank statistic or its p value, and the O-E statistic (the difference between the number of observed and expected events) or its variance. If those data were unavailable, the total number of events, the number of patients at risk in each group, and the log-rank statistic or its p value were used to estimate the HR. Finally, if the only useful data were in the form of graphical representations of survival distributions, survival rates were extracted at specified times in order to reconstruct the HR estimate and its variance under the assumption that patients were censored at a constant rate during the time intervals (8). The published survival curves were read independently by 2 authors in order to reduce reading variability.
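The first two estimation routes described above can be sketched in a few lines of Python. This is a minimal illustration using commonly cited approximations, not necessarily the exact formulas the authors applied: the function names and arguments are hypothetical, the confidence interval is assumed to be a symmetric 95% interval on the log scale, and the log-rank route assumes that the variance of O-E can be approximated from the total number of events and the two group sizes.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal distribution (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def log_hr_from_ci(hr, lower, upper):
    """Return (ln HR, standard error) from a reported HR and its 95% CI.

    Assumes the CI is symmetric around ln(HR) on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return math.log(hr), se


def log_hr_from_logrank(p_value, events, n_high, n_low, high_is_worse=True):
    """Approximate (ln HR, standard error) from a two-sided log-rank p value.

    Assumption: Var(O-E) ~ events * n_high * n_low / (n_high + n_low)^2,
    so that ln HR = z / sqrt(Var) and SE = 1 / sqrt(Var).
    """
    z = NormalDist().inv_cdf(1 - p_value / 2)
    variance_oe = events * n_high * n_low / (n_high + n_low) ** 2
    log_hr = z / math.sqrt(variance_oe)
    if not high_is_worse:
        log_hr = -log_hr
    return log_hr, 1 / math.sqrt(variance_oe)


# Example with the pooled DFS estimate reported in this article.
log_hr, se = log_hr_from_ci(3.658, 2.687, 4.979)
print(f"ln HR = {log_hr:.3f}, SE = {se:.3f}")
```

The third route, reading survival rates from published curves, needs interval-by-interval bookkeeping of patients at risk and is not sketched here.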
The HRs were then combined into an overall HR using Peto's method (9). In the current meta-analysis, the random-effect model was more suitable than the fixed-effect model, because all eligible studies used various methods and cutoff values for evaluation of Ki-67 LI. In addition, the random-effect model is suitable to estimate the mean of an effect distribution. Therefore, the pooled estimates in the current meta-analysis were analyzed using the random-effect model. Heterogeneity between studies was assessed using the Q and I² statistics. For assessment of publication bias, Begg's funnel plot and Egger's test were performed. If significant publication bias was found, the fail-safe N and trim-and-fill tests were additionally conducted to confirm the degree of publication bias. The results were 2-sided and considered statistically significant at p<0.05. These analyses were performed using Comprehensive Meta-Analysis (CMA) version 2.0 (Biostat).
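To make the pooling step concrete, the sketch below combines study-level ln HR values under a random-effects model and reports Q and I². It assumes the DerSimonian-Laird estimator of between-study variance, which is a common choice but is not confirmed by the text as the computation performed by the CMA software used here. The function name and the input values in the usage example are hypothetical and are not data from the eligible studies.

```python
import math


def random_effects_pool(log_hrs, ses):
    """Pool ln HR values with a DerSimonian-Laird random-effects model.

    Returns a dict with the pooled HR, its 95% CI, Cochran's Q,
    I-squared (percent), and the between-study variance tau^2.
    """
    weights = [1 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_hrs)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1

    # Method-of-moments estimate of tau^2, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * y for w, y in zip(re_weights, log_hrs)) / sum(re_weights)
    se_pooled = math.sqrt(1 / sum(re_weights))

    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical inputs for illustration only.
result = random_effects_pool([1.10, 1.45, 0.90, 1.60], [0.30, 0.25, 0.40, 0.35])
print(result)
```

When tau² is estimated as zero, the random-effects weights reduce to the fixed-effect inverse-variance weights, so the two models give the same pooled estimate.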
To confirm the correlation between high Ki-67 LI and risk stratification, we performed a diagnostic test accuracy review using the Meta-Disc program (version 1.4) (10). The forest plots for sensitivity and specificity and the summary receiver operating characteristic (SROC) curve were obtained through diagnostic test accuracy review. The value of the area under the curve (AUC) on SROC and the diagnostic odds ratio (OR) were calculated using the Meta-Disc program.
Results
Characteristics of the Studies
Using the search keywords we identified 211 potentially relevant publications at the level of the abstract and title. Studies were excluded if any of the following criteria applied: 1) incomplete or insufficient information; 2) non-English; 3) duplicate publication; 4) performed using animal models or cell lines; 5) review or case report; and 6) study of another disease. Applying the inclusion and exclusion criteria, we eventually selected 24 reports for the meta-analysis (Fig. 1 and Tab. I) (7, 11-33). The total number of patients was 1,967. Eligible studies used various antibodies and methodologies for Ki-67 immunohistochemistry (IHC) and cutoff values of Ki-67, as shown in Table I. The cutoff values of Ki-67 LI varied by up to 22%, and the rates of high Ki-67 LI ranged from 5.9% to 68.9%. All eligible studies except for Carrillo's report (13), which used image analysis, evaluated Ki-67 LI through microscopic observation. None of the subjects received pre- or postoperative imatinib in eligible studies.
Main characteristics of eligible studies
N = number of patients; LI = labeling index; ND = no description.

Flow chart of study search and selection methods.
Systematic Review and Meta-Analysis
Correlation between High Ki-67 LI and Survival
Among the eligible studies, 15 and 11 analyzed subsets provided data for the correlation between Ki-67 LI and DFS or OS, respectively. A high Ki-67 LI was significantly associated with worse DFS (HR 3.658, 95% CI 2.687-4.979, p<0.001, Fig. 2A) and OS (HR 3.730, 95% CI 2.819-4.936, p<0.001, Fig. 2B). The eligible studies showed significant heterogeneity in DFS (I² = 58.6%, p = 0.003) but not OS (I² = 7.2%, p = 0.375). In the sensitivity analysis, the eligible studies had no effect on the pooled HR, and the range of HRs was 3.289-3.952 and 3.476-4.141 in DFS and OS, respectively. Egger's test showed a considerable publication bias for DFS (p = 0.045). To confirm the degree of publication bias, fail-safe N and trim-and-fill tests were conducted. The number of missing studies that would bring the p value to >alpha was 642 in the fail-safe N test. Since there were only 15 observed studies, the publication bias could not be large. Furthermore, there was no significant difference between the observed and adjusted values in the trim-and-fill test. However, with regard to OS, there was no evidence of publication bias in Egger's test and no asymmetry in Begg's funnel plot.
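The reported heterogeneity values can be related back to Cochran's Q through the standard definition I² = (Q - df)/Q, where df is the number of analyzed subsets minus 1. The short sketch below inverts that relation. It assumes the 15 DFS and 11 OS subsets each contributed one estimate; the function name is hypothetical.

```python
def q_from_i2(i2_percent, n_studies):
    """Back-calculate Cochran's Q from I-squared (percent) and study count.

    Only valid for 0 < I2 < 100, because I2 is truncated at zero and
    a value of zero therefore does not determine Q.
    """
    if not 0 < i2_percent < 100:
        raise ValueError("I-squared must be strictly between 0 and 100")
    df = n_studies - 1
    return df / (1 - i2_percent / 100)


q_dfs = q_from_i2(58.6, 15)  # DFS: 15 analyzed subsets
q_os = q_from_i2(7.2, 11)    # OS: 11 analyzed subsets
print(round(q_dfs, 1), round(q_os, 1))  # prints 33.8 10.8
```

These implied Q values can be compared against a chi-square distribution with 14 and 10 degrees of freedom, respectively, as a consistency check on the reported heterogeneity p values.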

Forest plot diagrams of hazard ratios for correlations between high Ki-67 labeling index and disease-free survival (A) and overall survival (B).
Subgroup Analysis
Next, we conducted a subgroup analysis based on Ki-67 LI to evaluate the optimal criteria of high Ki-67 LI. Accordingly, eligible studies were divided into categories based on the cutoff value. The results of the subgroup analysis based on the criteria for high Ki-67 LI are shown in Figure 3. In DFS, subgroups with a >4% cutoff showed significantly poorer survival rates than those with a cutoff ≤4% (HR 3.954, 95% CI 2.769-5.646 vs. HR 2.736, 95% CI 1.324-5.652, respectively). In OS, HRs were 2.175 (95% CI 1.305-3.624) and 4.500 (95% CI 3.291-6.153) in subgroups with a ≤4% and >4% cutoff, respectively.
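One way to examine whether two subgroup HRs differ is a z-test on the difference of their logarithms, with standard errors recovered from the reported 95% CIs. The sketch below is an illustrative calculation under the assumption that the subgroups are independent (they contain different studies) and that the CIs are symmetric on the log scale. It is not a test reported in this article, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def compare_subgroup_hrs(hr_a, ci_a, hr_b, ci_b):
    """Return (z, two-sided p) for the difference ln(HR_a) - ln(HR_b).

    ci_a and ci_b are (lower, upper) 95% confidence limits.
    """
    se_a = (math.log(ci_a[1]) - math.log(ci_a[0])) / (2 * Z95)
    se_b = (math.log(ci_b[1]) - math.log(ci_b[0])) / (2 * Z95)
    z = (math.log(hr_a) - math.log(hr_b)) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Subgroup estimates as reported above (>4% cutoff vs. <=4% cutoff).
z_dfs, p_dfs = compare_subgroup_hrs(3.954, (2.769, 5.646), 2.736, (1.324, 5.652))
z_os, p_os = compare_subgroup_hrs(4.500, (3.291, 6.153), 2.175, (1.305, 3.624))
print(f"DFS: z = {z_dfs:.2f}, p = {p_dfs:.3f}")
print(f"OS:  z = {z_os:.2f}, p = {p_os:.3f}")
```

Because the standard errors are reconstructed from rounded CI limits, the resulting p values are approximate.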

Forest plot diagrams of hazard ratios according to criteria of high Ki-67 labeling index.
Diagnostic Test Accuracy Review
To confirm the diagnostic accuracy of the predictive role of high Ki-67 LI for higher (high and intermediate) risk, we performed a diagnostic test accuracy review. Among the eligible studies, 10 studies showed a correlation between Ki-67 LI and risk stratification of tumor recurrence (15, 18, 21, 23-25, 29, 30, 34, 35). The pooled sensitivity and specificity of these 10 studies were 0.44 (95% CI 0.41-0.48) and 0.87 (95% CI 0.84-0.91), respectively (Fig. 4). The range of sensitivity was 0.35 to 0.72, and the range of specificity was 0.38 to 1.00. The AUC on the SROC curve was 0.656 (Fig. 5), and the pooled diagnostic OR was 4.54 (95% CI 2.69-7.65).
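For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, likelihood ratios and the diagnostic OR relate to one another for a single 2x2 table. The function names are hypothetical, and the calculation is a per-table identity, not the pooling procedure used by Meta-Disc.

```python
def accuracy_from_counts(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 table counts.

    tp/fn: higher-risk cases with high/low Ki-67 LI;
    fp/tn: lower-risk cases with high/low Ki-67 LI.
    """
    return tp / (tp + fn), tn / (tn + fp)


def derived_measures(sensitivity, specificity):
    """Return positive LR, negative LR and diagnostic OR."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Using the pooled sensitivity and specificity reported above.
lr_pos, lr_neg, dor = derived_measures(0.44, 0.87)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.2f}")
```

The diagnostic OR obtained this way from the pooled sensitivity and specificity need not equal the pooled diagnostic OR of 4.54, because the latter is pooled across studies on its own scale.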

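Pooled sensitivity (A) and specificity (B) of high Ki-67 labeling index for predicting high and intermediate risks of tumor recurrence.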

Summary receiver operating characteristic (SROC) curve of high Ki-67 labeling index for predicting high and intermediate risks of tumor recurrence.
Discussion
The recurrence and prognosis of GISTs can be clinically predicted using tumor size and mitotic rate. Ki-67 LI has been widely used as a prognostic marker in various tumors such as neuroendocrine tumors, soft tissue tumors and breast tumors. However, the prognostic relevance and diagnostic standardization of Ki-67 LI in GISTs have not yet been fully elucidated. The present study is the first systematic review and meta-analysis of published studies for evaluation of the role of Ki-67 LI as a prognostic marker in GIST.
In clinical practice, evaluation for tumor recurrence distinguishes high, intermediate, low and very low risks through the combination of tumor size and mitotic rate. In addition, there are different staging criteria between primary tumor locations, regardless of similar parameters. The mitotic rate is expressed as the number of mitotic figures per 50 high-power fields (HPF). However, unlike tumor size, counting mitoses under a microscope can vary due to numerous causes including inter- and intra-observer differences, difficulty in distinguishing mitoses from pyknosis, and the tumor area observed. Because the evaluation of mitosis has not been fully standardized in GIST, risk evaluation among studies may show discrepancies. Prognostic parameters based on IHC, such as Ki-67, can be useful in decreasing such discrepancies. Previous studies have reported a correlation between Ki-67 LI and mitotic rate in GIST (16, 18, 23, 36, 37); however, this conclusion is controversial. The current meta-analysis showed that high Ki-67 LI was significantly concordant with higher mitotic count (pooled sensitivity = 0.60, 95% CI 0.51-0.69; pooled specificity = 0.91, 95% CI 0.88-0.94; AUC in SROC = 0.977; data not shown). In addition, previous studies reported that Ki-67 LI was an independent parameter for prediction of prognosis by multivariate analysis (7, 11-33). Taken together, these results suggest that Ki-67 LI can be used as an alternative marker of the mitotic rate.
Biopsy is needed if preoperative therapy is being considered for unresectable or metastatic GIST. In addition, Ki-67 LI has been shown to predict malignant potential in fine-needle aspiration (FNA) biopsy specimens of GISTs (37, 38). On the other hand, a recent study showed that the mean mitotic count of tissue obtained from endoscopic ultrasound-guided core biopsies was significantly lower than that in resected specimens (39). The main reason for this discrepancy is uneven distribution of mitotic figures within a tumor; in addition, mitotically active areas can be missed in a small biopsy sample (40). Despite these limitations, the National Comprehensive Cancer Network (NCCN) recommends Ki-67 IHC analysis when the mitotic index is based on fewer than 50 HPF (i.e., in small biopsy samples), which could further support the proliferation rate as determined by the mitotic rate (41). In our diagnostic test accuracy review, the specificity was as high as 0.87. If low Ki-67 LI is identified, the probability of low or very low risk of tumor recurrence might be very high, regardless of tumor size. Therefore, Ki-67 LI might be useful for risk stratification in small biopsy and FNA specimens.
Our meta-analysis showed that high Ki-67 LI was associated with a higher risk of recurrence and a worse survival at variable cutoff values, which raises the question of an optimal Ki-67 cutoff point. Until now, there has been a lack of consensus on scoring for Ki-67 IHC and an appropriate cutoff point for positivity or high expression in GISTs. Many cutoff values have been used in previous studies, varying from a few percent to more than 20%. These cutoff values have limited value outside the studies from which they were derived and the laboratories that performed them (4). In the absence of standardized IHC methodology, we were unable to achieve an ideal cutoff point that could be used in routine pathology practice. However, with regard to which part of the tumor should be assessed (e.g., hot spots, overall average), it seems reasonable that Ki-67 LI be obtained from the tumor area with the highest nuclear labeling on screening, much like the grading of neuroendocrine tumors (1). The criteria used for Ki-67 LI are 2% and 20% in neuroendocrine tumors. However, more detailed criteria would be needed for evaluation of the usefulness of Ki-67 LI in GIST. Subgroup analysis could be helpful for confirmation of the proper cutoff value in determining high Ki-67 LI. Interestingly, the difference in HRs for DFS was largest between subgroups using a 4% cutoff (Fig. 3A). In OS, the difference in HRs between subgroups was larger in a subgroup using 2%-4% than in a subgroup using 5% as the cutoff for high Ki-67 LI (Fig. 3B). Therefore, our data suggest that the optimal cutoff value for high Ki-67 LI is 4%. To establish the optimal cutoff value of Ki-67 LI, more cumulative studies will be required.
There are a number of limitations to the present study. First, with the current meta-analysis we could not evaluate the differences in prognostic roles between surgical specimens and other types of specimens, such as small biopsy and FNA specimens, due to insufficient information in the eligible studies. Further prospective studies are needed to investigate the prognostic role of Ki-67 LI in small biopsy and FNA specimens of GIST. Second, the follow-up periods differed between eligible studies. To avoid bias from follow-up periods, survival data were extracted after a 60-month follow-up period. Although the follow-up period had no effect on the correlation between Ki-67 LI and survival, the correlation between Ki-67 LI and survival could differ from those in previous reports. Third, the present study included only 1 study using image analysis (13). To assess the effect on HR, sensitivity analysis was conducted, which showed that Carrillo's study (13) had no effect on the pooled HR (HR 3.517, 95% CI 2.679-4.618). Although the concordance of Ki-67 LI was good between microscopic observation and image analysis in a study using a cutoff of 10% in GISTs (7), additional analysis of the correlation between microscopic observation and image analysis could not be performed due to the small number of studies using image analysis. Lastly, analysis related to specific organs of the gastrointestinal tract, such as stomach and small bowel, could not be performed due to insufficient information from eligible studies.
In conclusion, the current meta-analysis showed that high Ki-67 LI was significantly correlated with worse survival and high risk of tumor recurrence. Further prospective studies for diagnostic standardization and correlation between Ki-67 LI and mitotic rate are required prior to application in various tumor locations and various types of specimens.
Footnotes
Financial support: None.
Conflict of interest: The authors declare that they have no conflict of interest.
