Abstract
Hematopoietic cell transplantation (HCT) is a potentially curative therapy for hematologic malignancies that relies on the graft-
Keywords
Introduction
Hematopoietic cell transplantation (HCT) is a potentially curative therapy for hematologic malignancies that relies on the graft-
The primary treatment of GVHD is immunosuppression with high-dose systemic glucocorticoids, usually lasting for a minimum of several months. Even when steroid therapy results in complete resolution of GVHD symptoms, intensive immunosuppression leads to significant morbidity and mortality, including opportunistic infections, avascular necrosis, osteopenia, osteoporotic fractures, metabolic disturbances, and neuropsychiatric abnormalities.4–6 The response of GVHD to therapy has traditionally been measured by the change in clinical symptoms, or the reduction in overall grade following the start of treatment. Patients who do not respond to primary therapy face dismal outcomes, with greater than 50% mortality.2,7 The clinical response to therapy after 4 weeks has been the most useful predictor of nonrelapse mortality (NRM), and serves as the primary endpoint in most clinical trials of GVHD treatment.8,9 Yet the change in clinical symptoms has a poor positive predictive value (PPV) and better predictive methods are urgently needed.
Acute GVHD biomarkers
In the past decade, serum biomarkers have emerged as an additional potential measurement of acute GVHD severity. 10 Several cytokines, cytokine receptors, and T cell surface markers have shown positive correlations with clinical GVHD outcomes. The first validated systemic biomarkers of acute GVHD were combined into a four-biomarker panel consisting of serum concentrations of IL-2Rα, TNFR-1, IL-8, and hepatocyte growth factor (HGF). 11 Elevated levels of T-cell immunoglobulin and mucin domain-3 (TIM3), a protein that reflects T cell exhaustion, predicted severe (grade III/IV) GVHD and 1-year NRM.12,13 IL-6 was also observed to be elevated early post-transplant in patients who later developed GVHD, and its blockade has emerged as a potential prophylactic strategy.13,14
In addition to such markers of systemic inflammation, several proteins related to GVHD organ damage have now been identified. Elafin, an elastase inhibitor, was the first validated biomarker that was specific for GVHD of the skin. 15 Cytokeratin 18 fragments and HGF were found to correlate with visceral (gut and liver) GVHD.16,17 Several biomarkers have been identified and validated for GI GVHD, which is the target organ most refractory to treatment. Regenerating islet-derived 3 alpha (REG3α), either alone or in combination with other markers, has been validated as a biomarker of lower GI GVHD and long-term mortality.18–20 Suppressor of tumorigenesis 2 (ST2), which derives primarily from GI tissues during GVHD, has been shown to correlate with poor outcomes by a number of different groups.21–23 Amphiregulin, a weak agonist of the epidermal growth factor receptor produced by type 2 innate lymphoid cells that can heal damaged mucosa in murine models of GVHD, has been shown to improve the accuracy of clinical severity of GVHD in predicting the risk of NRM.24–26
The discovery and validation of GVHD biomarkers is a principal objective of the Mount Sinai Acute GVHD International Consortium (MAGIC), a group of 25 HCT centers conducting GVHD research. MAGIC has validated an algorithm that combines two GI biomarkers (ST2 and REG3α) into a single value that estimates the probability of 6 month NRM for individual patients, known as the MAGIC algorithm probability (MAP). The MAP also predicts response to treatment and maximum GVHD severity, and is now commercially available and widely used among scores of centers in clinical practice. Both academic and commercial laboratories have demonstrated that MAPs are highly reproducible, with 92% of samples receiving the same risk category assigned by different laboratories. 27 The MAP will be the focus of the remainder of this review, with consideration of the categorization of types of biomarkers as defined by the United States National Institutes of Health (NIH) and Food and Drug Administration (FDA).
The MAP and GI GVHD biology
Gastrointestinal crypt damage is the major driver of GVHD mortality, and both triggers and amplifies systemic inflammation in GVHD through myriad mechanisms that relate to the biomarkers ST2 and REG3α. 28 ST2 is shed from multiple cell types of the gastrointestinal epithelium, endothelium, and stroma.21,29 REG3α is concentrated in mucus and stored in Paneth cells, and plays an important role in GI homeostasis. Its release into the systemic circulation correlates with damage to the crypt, and its rising levels inversely correlate with the ability to regenerate gastrointestinal tissue. 30 Each of these biomarkers reflects different aspects of gastrointestinal GVHD pathology, and their combination can be considered as a ‘liquid biopsy’ that quantitates crypt damage throughout the intestine; irreversible GI crypt damage is the principal driver of NRM from GVHD, and accounts for the predictive accuracy of the MAP. Indeed, a recent study showed that 83% of NRM deaths were directly due to acute GVHD with or without infection. 31
NIH-FDA overview
The Biomarker Working Group convened by both the Food and Drug Administration and the National Institutes of Health has recently issued a report of biomarker definitions and their uses, entitled
A risk biomarker is one that ‘indicates the potential for developing a disease or medical condition in an individual who does not currently have clinically apparent disease or the medical condition.’ 32 Relevant examples of risk biomarkers from oncology are mutations in breast cancer genes 1 and 2 (BRCA 1/2), which increase the likelihood that individuals harboring them will develop breast, ovarian, prostate, and other types of cancer.33–35 As a risk biomarker for GVHD, the MAP accurately predicts the development of severe and lethal acute GVHD prior to the onset of clinical symptoms when measured at 7 days following HCT. 36 In a multicenter, prospective cohort study, HCT patients with elevated MAPs had a significantly greater incidence of NRM by 6 months compared with those with low MAPs (26%
A diagnostic biomarker is one that is ‘used to detect or confirm presence of a disease or condition of interest or to identify individuals with a subtype of the disease.’ 32 One of the most commonly used diagnostic biomarkers in medicine is an elevated level of glycosylated hemoglobin (HbA1c) to identify individuals with diabetes mellitus.37,38 As a diagnostic biomarker, REG3α has the ability to distinguish diarrhea caused by GVHD from diarrhea caused by infection (e.g. cytomegalovirus), the other primary cause of lower GI dysfunction early after HCT. 18 The combination of ST2 and REG3α in the MAP has not been formally validated as a diagnostic biomarker of acute GVHD.
A prognostic biomarker is one that is used ‘to identify the likelihood of a clinical event, disease recurrence or progression in patients who have the disease or medical condition of interest.’ 32 In acute myelogenous leukemia, measurable residual disease, or the ability to detect small numbers of malignant blasts below the traditional morphologic-based threshold of 5%, has emerged as a powerful prognostic biomarker for eventual relapse and mortality. 39 At the onset of GVHD symptoms, the MAP is prognostic of long-term outcomes of significant clinical consequence such as NRM and overall survival. 19 The MAP at onset of GVHD stratifies patients into three Ann Arbor GVHD scores that each have a distinct risk of NRM (Figure 1). Ann Arbor 1 patients (MAP < 0.141) account for almost half of patients at onset, and have a 6 month incidence of NRM of 8%. Patients with intermediate MAPs (0.141 ⩽ MAP ⩽ 0.290) represent one-quarter of the total, and have 6 month NRM of 24%, and patients with high MAPs of >0.290 account for the remaining quarter and have 6 month NRM of 46%. 36 After 1 week of treatment, the MAP also predicts day 28 response, NRM, and overall survival (OS) for steroid-refractory GVHD (Figure 2). 40 Recent data show that the MAP predicts both NRM and OS better than the clinical response to treatment when both are measured after 4 weeks of therapy. 39 Thus the MAP is a prognostic marker of GVHD outcomes when measured at a single time point.
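The stratification described above can be written out as a short sketch. It assumes only the two onset thresholds (0.141 and 0.290) and the 6 month NRM figures quoted in this section; it does not compute the MAP itself, whose formula is not given here. The names `ann_arbor_score` and `REPORTED_6MO_NRM` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: maps an already-computed MAP value to an
# Ann Arbor GVHD score using the onset thresholds quoted in the text.
# The function and constant names are hypothetical, not part of any
# published software.

LOW_CUTOFF = 0.141   # below this value: Ann Arbor 1
HIGH_CUTOFF = 0.290  # above this value: Ann Arbor 3

# 6 month NRM incidences reported in the text for each score at GVHD onset.
REPORTED_6MO_NRM = {1: 0.08, 2: 0.24, 3: 0.46}


def ann_arbor_score(map_value: float) -> int:
    """Return the Ann Arbor score (1, 2, or 3) for a MAP measured at onset."""
    # The MAP is a probability, so values outside [0, 1] are rejected.
    if not 0.0 <= map_value <= 1.0:
        raise ValueError("MAP is a probability and must lie in [0, 1]")
    if map_value < LOW_CUTOFF:
        return 1
    if map_value <= HIGH_CUTOFF:
        return 2
    return 3


if __name__ == "__main__":
    for value in (0.05, 0.141, 0.290, 0.45):
        score = ann_arbor_score(value)
        print(value, score, REPORTED_6MO_NRM[score])
```

Note that both boundary values fall into the intermediate group, matching the inequality 0.141 ⩽ MAP ⩽ 0.290 given above.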

Figure 1. MAP at GVHD onset divides patients at onset into Ann Arbor groups. Left: Six month cumulative incidences of NRM in each AA GVHD score (

Figure 2. MAP after 1 week of treatment predicts risk of NRM in steroid-resistant patients. Cumulative incidence of NRM for patients whose MAPs were either above (–) or below (–) the post-treatment threshold MAP = 0.290. The difference in NRM was statistically significant (
A predictive biomarker is one that is used ‘to identify individuals who are more likely than similar individuals without the biomarker to experience a favorable or unfavorable effect from exposure to a medical product or an environmental agent.’ 32 The deletion of the 5q chromosome in myelodysplastic syndromes (MDS) is an example of a predictive biomarker that indicates the likelihood of response to lenalidomide therapy. 40 Such biomarkers can personalize therapy for individual patients. ST2 was identified by comparing proteins elevated in glucocorticoid responsive and glucocorticoid nonresponsive patients. 21
In clinical practice, a biomarker may be prognostic, prognostic but not predictive, or both prognostic and predictive. A biomarker that is prognostic but not predictive is one that indicates the likelihood of a clinical event without indicating the clinical benefit of a particular therapy. 5q deletion is both a prognostic biomarker for outcomes of MDS and a predictive biomarker of lenalidomide treatment of MDS. 41 The MAP is a prognostic biomarker because it determines the likelihood of NRM after several types of therapy.31,36,40 The MAP is also a predictive marker that indicates the likelihood of response to steroid therapy for GVHD. When the MAP was evaluated 7 days after HCT, patients with high MAPs experienced significantly higher rates of steroid refractory GVHD than those with low MAPs (35%
Biomarkers that are measured repeatedly over time to assess ‘status of a disease or medical condition or for evidence of exposure to (or effect of) a medical product or an environmental agent,’ are known as monitoring biomarkers. 32 A commonly used monitoring biomarker for anticoagulant therapy is the prothrombin time/international normalized ratio that monitors warfarin activity. Because the MAP functions well as a prognostic biomarker throughout the course of HCT and early GVHD, we hypothesized that it may also function as a monitoring biomarker. To assess its monitoring potential, we measured the MAP after 1, 2, and 4 weeks of systemic treatment for GVHD, and found it predicted 6 month NRM at each time point tested. 40 The addition of the clinical response to biomarker concentrations in a single algorithm did not improve the predictive accuracy of the MAP alone. Therefore, MAPs either alone or in combination with clinical data appear to be useful monitoring biomarkers for the treatment of acute GVHD.
Response biomarkers are a subset of monitoring biomarkers that can be used ‘to show that a biological response has occurred in an individual who has been exposed to a medical product or an environmental agent.’ 32 To evaluate whether the MAP could determine response to treatment for acute GVHD, we measured the change in MAPs between the start of treatment and 28 days later and compared that change in patients who experienced 6 month NRM to those who did not. Across all Ann Arbor groups, an increase in MAP correlated with nonrelapse death, whereas reductions were associated with survival (Figure 3). In patients with low MAPs at the start of treatment (Ann Arbor 1, MAP < 0.141), the rate of 6 month NRM was low, but most of those who died had substantial increases in MAP over the 1st month of treatment, and the increase in MAP for patients who died was significantly larger than in those who lived (

Figure 3. Change in MAP after 4 weeks measures response to treatment. The change in MAP after 28 days of systemic treatment with glucocorticoids is shown as reverse waterfall plots (left) and box-and-whisker plots (right). Patients who experienced 6 month NRM are shown as – and those who did not are shown as –. The difference in change in MAP was statistically significant (
The pattern of change in MAP over the 1st month of treatment led us to hypothesize that a patient whose MAP rose above a threshold defined after 4 weeks of treatment would fare worse. Previous work had validated a single post-treatment threshold (MAP = 0.290) that separated patients into two groups with distinctly different risks of NRM. 40 Indeed, Ann Arbor 1 and 2 patients whose MAPs rose above this threshold experienced a dramatic increase in 6 month NRM. In addition, Ann Arbor 3 patients who dropped below the threshold had better survival that was nearly identical to those who had remained Ann Arbor 2 throughout the first month of therapy. These findings suggest a novel endpoint for a clinical trial might be to achieve a MAP <0.290. 40 According to the BEST definition, a reasonably likely surrogate endpoint is one that is ‘supported by strong mechanistic and/or epidemiologic rationale such that an effect on the surrogate endpoint is expected to be correlated with an endpoint intended to assess clinical benefit in clinical trials.’ The MAP may qualify as such an endpoint, but prospective trials with confirmation of survival benefit would be needed to fully validate the usefulness of such an endpoint.
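As a rough illustration of the threshold-crossing idea in this paragraph, the sketch below compares a MAP at the start of treatment with a MAP after 4 weeks against the single post-treatment threshold of 0.290. It is not the authors' analysis code. The function names are hypothetical, and treating a value of exactly 0.290 as not above the threshold is an assumption, since the text does not say how that boundary is handled.

```python
# Illustrative sketch only: describes how a patient's MAP moved relative to
# the post-treatment threshold quoted in the text. All names are hypothetical.

POST_TREATMENT_THRESHOLD = 0.290


def is_above_threshold(map_value: float) -> bool:
    """True when the MAP exceeds the post-treatment threshold.

    Assumption: a MAP of exactly 0.290 counts as not above the threshold.
    """
    if not 0.0 <= map_value <= 1.0:
        raise ValueError("MAP is a probability and must lie in [0, 1]")
    return map_value > POST_TREATMENT_THRESHOLD


def map_change(start_map: float, week4_map: float) -> float:
    """Change in MAP over the first 4 weeks (positive means an increase)."""
    return week4_map - start_map


def threshold_trajectory(start_map: float, week4_map: float) -> str:
    """Label the movement of the MAP relative to the threshold."""
    started_high = is_above_threshold(start_map)
    ended_high = is_above_threshold(week4_map)
    if not started_high and ended_high:
        return "rose_above"      # the group described as faring worse
    if started_high and not ended_high:
        return "dropped_below"   # the group described as faring better
    return "stayed_above" if ended_high else "stayed_below"


if __name__ == "__main__":
    print(threshold_trajectory(0.10, 0.35), map_change(0.10, 0.35))
    print(threshold_trajectory(0.40, 0.20), map_change(0.40, 0.20))
```

Keeping the change in MAP and the threshold label as separate functions mirrors the two views used in the text: the magnitude of change as a response measure, and crossing 0.290 as a candidate endpoint.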
Many clinicians now use the MAP after treatment of acute GVHD because its PPV is significantly higher than that of clinical response to treatment (51%
Conclusion
The definitions developed by the FDA-NIH Biomarker Working Group have helped clarify the role of biomarkers in clinical practice. Using this framework, MAPs provide useful guidance for GVHD treatment in several scenarios, including determination of the risk of HCT patients who have not yet developed GVHD, assessment of the prognosis of GVHD patients, monitoring of the clinical status of GVHD patients after treatment, and evaluation of the response of patients to GVHD therapy. The MAP has consistently proven more accurate than clinical metrics such as the severity of GVHD symptoms at onset or the change in GVHD symptoms after treatment. In the future, MAPs may also serve as diagnostic biomarkers, as predictive biomarkers for individual therapies, and as reasonably likely surrogate endpoints in clinical trials.
Footnotes
Acknowledgements
The authors thank the patients, their families, and the research staff at the following MAGIC centers who contributed data and samples: Bambino Gesù Hospital, Children’s Hospital Los Angeles, Children’s Hospital of Philadelphia, City of Hope Cancer Center, Columbia University Medical Center, Emory University, Erlangen University, Hospital for Sick Children, Icahn School of Medicine at Mount Sinai, King Chulalongkorn Memorial Hospital, Massachusetts General Hospital, Mayo Clinic, Ohio State University, University Hospital Carl Gustav Carus, University Medical Center Hamburg, University of Michigan, University of Pennsylvania Health System, University of Regensburg, Würzburg University, and Vanderbilt University.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and publication of this article: This work was supported by grants (P01CA03942 and P30CA196521) from the National Cancer Institute and (TL1 TR001434) from the National Center for the Advancement of Translational Science of the National Institutes of Health.
Conflict of interest statement
John E. Levine and James L.M. Ferrara are co-inventors on a GVHD biomarker patent.
