Abstract
The 6-Minute Walk Distance (6-MWD) has been the most utilized endpoint for judging the efficacy of pulmonary arterial hypertension (PAH) therapy in clinical trials conducted over the past two decades. Despite its simplicity, widespread use in recent trials and overall prognostic value, the 6-MWD has often been criticized in recent years, and several PAH experts have called in the literature for alternative endpoints that more reliably reflect pulmonary vascular resistance and cardiac status in PAH and their response to therapy. A meeting of PAH experts and representatives from regulatory agencies and pharmaceutical companies was convened in early 2012 to discuss the validity of current as well as emerging endpoints. The current work represents the proceedings of the conference.
Clinical trials to assess the efficacy of drug therapy for pulmonary arterial hypertension (PAH) have routinely used the 6-Minute Walk Distance (6-MWD) as a validated primary endpoint since the first controlled trial over 15 years ago.[1] Although it reflects functional status and predicts survival in PAH, this simple and practical measure has certain limitations, and there have therefore been recent calls for alternative endpoints[2] that are clinically significant, have pathophysiological relevance to the disease and are sensitive enough to be subjected to statistical analysis. The following is a summary of a roundtable discussion on clinical endpoints that brought together experts in PAH and representatives from the pharmaceutical industry and from regulatory agencies (i.e., the US Food and Drug Administration), whose task was to review the strengths and limitations of current endpoints and to make recommendations for improvement.
TRADITIONAL ENDPOINTS
Functional class
Functional classification (FC) is widely used as a marker of disease severity in cardiovascular disease and is strongly predictive of mortality.[1,3–6] In pulmonary hypertension (PH), it provides a measure of the limits imposed on a patient by the disease.[7,8] Regulatory agencies include FC in their labeling of PAH-specific therapies. Published treatment guidelines include FC in their recommendations for the evaluation and treatment of patients,[7,8] and FC is commonly employed as an endpoint in clinical studies of PH therapies. In addition, FC correlates with quality of life (QoL) assessment, reinforcing its usefulness as an intermediate endpoint.[9]
New York Heart Association and World Health Organization Classification
The New York Heart Association (NYHA) Functional Classification System was primarily developed and validated in heart failure studies.[10] In 1998, the World Health Organization (WHO) expert panel amended the NYHA diagnostic classification system specifically for patients with PH in order to include symptoms such as dyspnea, fatigue and chest pain, as well as syncope and near syncope more relevant to patients with PH.[11] Patients who have experienced syncope are generally assigned to WHO FC IV (although this is not explicitly stated in the WHO Functional Classification System). Due to similarities between the two classification systems, many clinicians refer to them collectively as NYHA/WHO Functional Classification.[12]
Prognostic value of functional classification
NYHA/WHO FC is a powerful predictor of survival in PH.[4,6] In untreated patients with idiopathic PAH (IPAH) or heritable PAH, the median survival was six months for WHO FC IV, two and a half years for WHO FC III and six years for WHO FC I and II.[13] In the REVEAL registry, patients who were NYHA/WHO FC IV at baseline had a significantly lower one-year survival rate than those with better functional class.[4] In observational studies of patients receiving epoprostenol therapy, survival was significantly longer in patients at NYHA FC III, compared with those with NYHA FC IV.[5] In a follow-up of patients receiving subcutaneous treprostinil, worse FC at baseline was associated with lower survival rates.[14] There was an association between increased mortality and baseline NYHA FC IV in a three-year follow-up of patients receiving oral bosentan.[6]
Functional classification as an endpoint in clinical trials
FC is an important endpoint in clinical trials of PAH therapy as it reflects patient well-being. The key advantages of FC as an endpoint are that changes can be easily measured and assessed within three months of therapy and are predictive of mortality. Therefore, FC can be used to trigger treatment adjustment.[11] It has also been included in the parameters defining goal-oriented treatment strategies.[15] Several clinical trials of prostacyclins, endothelin receptor antagonists and phosphodiesterase-5 inhibitors have shown improvements in FC [Table 1]. In the BREATHE-1 study, 42% of the bosentan-treated patients and 30% of the placebo-treated patients were in a better WHO FC at Week 16 than at baseline, which coincided with improvements in exercise capacity, Borg dyspnea score and reduced time to clinical worsening (TTCW).[16] In the SUPER study, sildenafil significantly improved WHO FC in addition to exercise capacity and hemodynamics.[17] In some clinical trials, improvements in FC were not evident despite improvements in other clinical endpoints. This may be related in part to the background treatments being used in these trials, making it more difficult to improve FC.[20,21]
Improvements in New York Heart Association/World Health Organization functional classification shown in pulmonary arterial hypertension clinical trials
Advantages/disadvantages of FC as a clinical endpoint
Strengths of FC as a clinical endpoint:
Convenience
Ease of classification
Widely and broadly used
Can be predictive of survival as well as QoL
Weaknesses of FC as a clinical endpoint:
Self-reporting is required by patients
The subjective nature of functional classification results in great variability in how classifications are judged between physicians[9]
Multiple factors not mentioned in NYHA/WHO definitions may be used
Definitions of symptoms may differ widely among clinicians and are not reliable in children
A questionnaire could be used to aid standardization (AIR study, unpublished)
Inconsistencies make inter-trial comparisons difficult
The simplistic nature of this endpoint may mean that this classification is poorly discriminating and that subtle changes in clinical status will not be detected[12]
The reliability and validity of this measure are not clearly established.
Recommendations
FC will continue to be an important secondary endpoint in future clinical trials as well as a component of primary composite endpoints
It provides a useful indicator of survival, physical capacity and well-being
It is recognized in guidelines and by regulatory authorities
However, as PH treatment improves, the focus of clinicians may shift to early detection of PH in patients
Development of tools to promote a uniform approach to NYHA/WHO Functional Classification is an important step in helping to standardize the clinical care of patients with PAH and in performing and interpreting clinical studies
A questionnaire for standardization to harmonize understanding of FC may be useful
There may be scope for development of a tighter or subdivided functional classification system[12]
NYHA/WHO Functional Classification may not serve as a single primary endpoint in clinical trials but may be useful as an important part of a composite endpoint. Indeed, it has been utilized successfully as part of a combined primary endpoint previously.[19]
Exercise capacity
The 6-Minute Walk Distance test
The use of exercise capacity as a measure of disease severity and treatment response is common in PH clinical trials and also in clinical practice. The most commonly used measure of exercise capacity is the 6-MWD test. It is not uncommon to enhance the 6-MWD test with a dyspnea rating at the end of the test, using either the Borg Dyspnea Index (BDI) or the Mahler dyspnea index. Adjunctive measures such as pulse oximetry (SpO2) and heart rate (HR) can also be added to further characterize exercise performance during the 6-MWD test.
Two major strengths of the 6-MWD test are its simplicity and its widespread use and validation in PAH. It is a test that reflects activities of daily living and, to the extent that 6-MWD can be improved, it is a worthwhile metric. In addition, the 6-MWD has been shown to predict survival in several cardiopulmonary disorders including PAH.[1,22] Baseline 6-MWD, as well as thresholds of 6-MWD reached under treatment, has been shown to correlate with patient outcome.[4,23–25] In contrast, the change in 6-MWD, either in response to treatment or as patients deteriorate, has not been shown to correlate well with outcome,[25,26] although results have been variable depending on the length of observation.[27] A recent study, using both distributional and anchor-based methods and a large cohort of PAH patients, determined the minimal important difference (MID) in the 6-MWD test to be approximately 33 m.[28] Limitations of the 6-MWD test include an inability to account for physical patient characteristics such as stride length and weight, a learning effect and an inability to provide information on the physiologic response to exercise. Recent data suggest that the 6-MWD test alone is not sufficient to define the clinical status of the patient.[29]
Recommendations
The 6-MWD test should continue to be used as part of the clinical assessment of PH patients; however, it should not be considered a mandatory test
Performance of the 6-MWD test should be standardized and should follow American Thoracic Society (ATS) guidelines[30]
Patients must be developmentally able to perform the 6-MWD test and should not have physical or mental comorbidities that could influence the performance of the test
Adjunctive measurements such as HR and SpO2 can be used to enhance the interpretation of the test
The use of the 6-MWD test alone as a primary endpoint in PH clinical trials should be restricted to instances in which the results are projected to be both statistically and clinically significant
Cardiopulmonary exercise testing (CPET)
The physiologic response to exercise can be assessed with a comprehensive evaluation of several key exercise variables known to be affected by pulmonary vascular disease. These include HR, rest and exercise blood pressure, submaximal oxygen consumption (anaerobic threshold, AT), peak oxygen consumption (peak VO2), ventilatory inefficiency (VE/VCO2, PETCO2) and the exercise and recovery patterns of these variables.
Strengths of CPET include its ability to evaluate physiologic severity, its prognostic use and its highly reproducible nature.[31–33] Limitations relative to the 6-MWD test include the need for technical expertise and longer time for administration and interpretation. The use of CPET in clinical trials has been discouraged due to the technical expertise required; however, this has not been the case in studies of other cardiopulmonary disorders such as heart failure, where CPET has often been the gold standard reference test.
Recommendations
CPET should continue to be used as part of the clinical assessment of PH patients; however, it should not be considered a mandatory test
Performance of CPET should be standardized and should follow ATS guidelines[34]
Patients must be developmentally able to perform CPET and should not have physical or mental comorbidities that could influence the performance of the test
As drug development with new targets and combination therapies emerges, the use of CPET as a primary endpoint in PH clinical trials should be reconsidered
A core CPET lab must be used to systematically interpret all physiologic data captured at clinical recruiting sites
Recruiting sites charged with obtaining CPET data for clinical trials must operate using standardized validation procedures and must be validated by the core CPET lab.
Hemodynamics
Use of hemodynamic endpoints to assess response to therapy in clinical trials is justified and reasonable as hemodynamic alterations are integral to the causal pathway of PAH. The significance of baseline hemodynamic alterations has long been recognized. For instance, elevated right atrial pressure (RAP) and decreased cardiac index (CI) are strong predictors of death and/or lung transplantation.[13,35,36] However, traditional hemodynamic measures of disease severity, such as CI and RAP, are inconsistently associated with outcomes in certain PAH groups such as scleroderma-associated PAH (SSc-PAH).[37–39] This may be related to differential responses to cardiac loads between SSc-PAH and IPAH as demonstrated in studies utilizing pressure-volume relationships, suggesting decreased mean ventricular pressure at any given afterload in SSc-PAH.[40] Other hemodynamic measurements such as pulmonary arterial capacitance (as estimated by stroke volume divided by pulmonary artery pulse pressure) independently predict survival in SSc-PAH.[38] Further, stroke volume index (SVI), perhaps a more specific measure of right ventricular (RV) function than CI, is also strongly predictive of outcome in this cohort; neither CI nor RAP independently predicted survival in this specific group of patients.[36]
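The derived indices mentioned above can be made concrete with a small calculation. The following Python sketch is illustrative only: it computes pulmonary arterial capacitance as stroke volume divided by pulmonary artery pulse pressure, as described in the text, and assumes the conventional definitions of stroke volume (cardiac output divided by heart rate) and SVI (stroke volume divided by body surface area). The function names and the example values are hypothetical and are not taken from any of the cited studies.

```python
def stroke_volume_ml(cardiac_output_l_min: float, heart_rate_bpm: float) -> float:
    """Stroke volume in mL, assuming SV = cardiac output / heart rate."""
    if heart_rate_bpm <= 0:
        raise ValueError("heart rate must be positive")
    return cardiac_output_l_min * 1000.0 / heart_rate_bpm


def pa_capacitance_ml_per_mmhg(sv_ml: float, pa_systolic_mmhg: float,
                               pa_diastolic_mmhg: float) -> float:
    """Pulmonary arterial capacitance estimated as SV / pulse pressure."""
    pulse_pressure = pa_systolic_mmhg - pa_diastolic_mmhg
    if pulse_pressure <= 0:
        raise ValueError("systolic pressure must exceed diastolic pressure")
    return sv_ml / pulse_pressure


def stroke_volume_index(sv_ml: float, bsa_m2: float) -> float:
    """Stroke volume index in mL/m^2, assuming SVI = SV / body surface area."""
    if bsa_m2 <= 0:
        raise ValueError("body surface area must be positive")
    return sv_ml / bsa_m2


if __name__ == "__main__":
    # Hypothetical patient: CO 4.0 L/min, HR 80 bpm, PA pressure 80/30 mmHg, BSA 2.0 m^2
    sv = stroke_volume_ml(4.0, 80)
    print("SV (mL):", sv)
    print("SV/PP (mL/mmHg):", pa_capacitance_ml_per_mmhg(sv, 80, 30))
    print("SVI (mL/m^2):", stroke_volume_index(sv, 2.0))
```

Both indices are built from the same stroke volume, so they can be derived from a single right heart catheterization without additional measurements.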
Changes in hemodynamic values have also been examined in more recent studies. Reductions in RAP and mean pulmonary arterial pressure (mPAP) and increases in CI after intravenous epoprostenol therapy are associated with improved survival.[5] A failure of pulmonary vascular resistance (PVR) to decrease after therapy with bosentan or epoprostenol is a harbinger of poor prognosis.[25,27] Low CI and elevated RAP and PVR after 3 months of therapy with inhaled iloprost are associated with an increased risk of death.[41] Some of these hemodynamic endpoints have clearly been validated as surrogate markers in controlled trials. Significant reductions in mPAP and PVR and increases in CI have been shown after a 12-week treatment with intravenous epoprostenol, the only controlled study in PAH that has shown improved survival with treatment,[1] and in response to sildenafil therapy.[17] Improved hemodynamics were also recently shown in a randomized, double-blind, placebo-controlled, dose-ranging study of sildenafil in treatment-naïve children with PAH.[42] However, the FDA recently recommended against the use of this drug in this population since there was an increased risk of death in the high- versus low-dose groups. Hemodynamic changes (decreased mPAP and PVR) in response to intravenous epoprostenol have also been shown in SSc-PAH, although there was no change in survival in this group.[43] Other trials, however, have shown little or no change in hemodynamics between drug and placebo.[19,44] Hemodynamic values are not currently accepted as endpoints by regulatory authorities.[45]
Strengths
Hemodynamic data are accurate and reproducible and closely reflect the disease as a measure of integrated cardiopulmonary function
Hemodynamic measurements are performed routinely in all referral centers and have been standardized
They have prognostic value at baseline and change in response to therapy (at least in IPAH).
Weaknesses
They are invasive and time-consuming and may not necessarily represent a direct benefit to the patient
They are usually obtained at baseline at rest and may not accurately reflect alterations related to exercise
The optimal hemodynamic endpoint has not been defined
Hemodynamic measures have not changed consistently in recent trials; however, this may be related to the short time frame (e.g., 12 weeks), patient composition (e.g., SSc-PAH patients unlikely to show hemodynamic changes with current therapy) and add-on therapy trial designs.
Recommendations
Hemodynamic data (e.g., PVR, SVI, stroke volume/pulse pressure [SV/PP]) may be considered as primary endpoints in select trials (e.g., pediatric trials where other endpoints may be less reliable) and in randomized controlled trials (RCTs) lasting longer than four to six months, although missing values for patients who drop out or refuse a repeat catheterization represent a significant limitation.
OTHER CLINICAL ENDPOINTS
Clinical endpoints for PAH have undergone an evolution from the straight 6-MWD test to more comprehensive endpoints that reflect disease progression and/or medical failure. Amongst these endpoints, death and transplantation remain relatively clear, but the definition of “time to clinical worsening” has not always been consistent between trials.
Health-related quality of life in pulmonary arterial hypertension
Ideally, therapeutic interventions in PAH should improve symptoms, prolong survival and enhance QoL. Of these three therapeutic goals, the impact of therapeutic interventions on PAH-specific QoL is least well characterized. Most available health-related QoL (HR-QoL) data in PAH has been derived from existing generic (e.g., SF-36, EQ-5D) or condition-specific heart failure instruments such as the Congestive Heart Failure Questionnaire directly employed or adapted for use as secondary endpoints in pharmaceutical-sponsored studies.[46,47] In this setting, instrument domains related to physical functioning appear to be the most sensitive to therapeutic interventions demonstrated to improve other functional parameters, such as the 6-MWD. However, the minimal change indicative of clinically meaningful improvement in PAH has not been determined for any instrument and the lack of consistency of HR-QoL instruments employed in therapeutic trials has made between-study comparisons difficult.
More recently, the Cambridge Pulmonary Hypertension Outcome Review (CAMPHOR) has been developed as a PAH-specific HR-QoL instrument.[48] This instrument was derived and validated from separate cohorts of PAH patients in the UK. It has also been validated outside the UK (in the US),[49] although its performance in response to therapeutic interventions has yet to be determined. At least one additional PAH-specific HR-QoL instrument is currently in development.
The lack of a disease-specific instrument to assess HR-QoL in PAH in response to therapeutic interventions is currently an unmet need. The committee supports the development of specific and fully validated instruments for assessing HR-QoL in PAH.
All-cause mortality
Survival is the most meaningful clinical endpoint when evaluating new therapies; however, it can require the study of more than a thousand patients, which is not feasible in an orphan disease such as PAH. All-cause mortality is one endpoint that is easily measured but may overestimate the number of deaths attributable to PAH. Another option is to use “disease-related mortality” which would only include deaths due to PAH. This would require a clinical events committee to determine whether a death was due to PAH. Unfortunately, this is not always clear and may compromise the integrity of a trial. Further, mortality alone would not be a suitable endpoint because of the low event rate and inability to power a study adequately for a short-term trial.
Lung transplantation
Whether or not a patient undergoes lung transplantation is also straightforward to capture; however, there is still room for error here. There are likely center-specific patterns in lung transplant referral which may relate to the presence of a robust transplant program, success rates and average wait times, in addition to the patient's disease severity and projected prognosis. Therefore, the likelihood of listing and actual transplantation during the course of a clinical trial may differ between centers. One may account for this in part by noting at the trial baseline whether a patient is “actively listed” or not. This information should be routinely included in baseline data collection. It seems reasonable to analyze the “time to transplant” for those “actively listed” at trial onset separately from those who are not. For the patients who were not previously listed, the need for “new transplant listing” should be the worsening event and the transplant a censored event. This approach is still not free of bias arising from center-specific practices.
Composite endpoints: TTCW
Since single surrogate endpoints such as the 6-MWD test are not ideal for clinical trials in PAH, composite endpoints have been proposed.[50] Composite endpoints increase the overall event rate and thereby reduce the number of patients needed for a trial. Indeed, the European Medicines Agency (EMA) encourages the use of a composite endpoint such as TTCW as the primary endpoint in PAH clinical trials (Table 2; EMA, 2009). TTCW has emerged as a frequently used secondary endpoint in recent long-term PH clinical trials. However, as McLaughlin et al.[45] emphasize in the Dana Point recommendations from 2009, “time to clinical worsening” certainly has value as a composite endpoint but has not been entirely consistent in its definition across recent trials. The endpoint was designed to provide a more comprehensive analysis of disease progression, and these inconsistencies in definition make comparisons between trials more difficult. The most commonly used components of TTCW include events such as (1) all-cause mortality, (2) need for an interventional procedure including transplant or septostomy, (3) PAH-related hospitalization, and (4) some additional measures of clinical worsening which may include WHO FC progression, decline in 6-MWD by at least 15%, signs of worsening right heart failure and/or need for additional PAH-targeted therapies. Table 2[45] reports the various definitions used for TTCW in recent clinical trials. Composite endpoints include those that measure disease progression (e.g., TTCW) and those that assess improvement in a patient's physical capacity and well-being.
Definition of time to clinical worsening in different trials
Endpoints measuring disease progression and deterioration
The impact of a treatment on disease progression associated with PAH can be measured by TTCW. This endpoint is viewed as clinically relevant by clinicians and regulatory agencies and has been used in several clinical trials as a secondary, or more recently a primary endpoint.[51] The composition of this endpoint varies from study to study. The main components include the following:
Change in physical capacity, such as a 10-20% decrease in 6-MWD
Deterioration in NYHA FC
Significant clinical events such as need for hospitalization or additional therapy, transplantation, or mortality.
Endpoints measuring improvement of patient's physical capacity and well-being
Despite the usefulness of endpoints measuring disease progression and deterioration, from the perspective of both the patient and the treating physician, it may be more relevant to assess improvement in physical capacity and well-being. A composite endpoint was selected as the primary endpoint in the AIR study in order to give a more rigorous assessment of the efficacy of iloprost.[19] It included (1) an increase of at least 10% in the 6-MWD, (2) improvement in the NYHA/WHO FC and (3) absence of a deterioration in the clinical condition or death. A significant effect of treatment in favor of iloprost (P = 0.007) with an estimated odds ratio of 3.97 (95% confidence interval 1.47-10.75) was found. Nearly 40% of patients showed increased 6-MWD by at least 10%. Approximately 20% of patients showed improvement in FC. Not all patients with improved FC had a 10% increase in 6-MWD. Thus, more patients met lesser criteria for clinical improvement, sufficient to warrant continuation of therapy, than met the full primary endpoint. Despite the usefulness of composite endpoints in measuring physical capacity and well-being and the obvious limitations of single surrogate endpoints, to date, the AIR study is the only large clinical trial that has employed such an endpoint.
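The three-part improvement endpoint described above can be expressed as a simple responder rule. The Python sketch below is a minimal illustration, assuming that all three criteria must be satisfied for a patient to count as a responder and that a lower FC number means a better class; the class name, field names and example values are hypothetical and do not reproduce the AIR study's actual case report form or analysis code.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    walk_baseline_m: float   # 6-MWD at baseline, in meters
    walk_followup_m: float   # 6-MWD at the end of the study period, in meters
    fc_baseline: int         # NYHA/WHO FC at baseline (1-4)
    fc_followup: int         # NYHA/WHO FC at follow-up (1-4)
    deteriorated: bool       # clinical deterioration during the study
    died: bool               # death during the study


def walk_improved_10pct(f: FollowUp) -> bool:
    # followup >= 1.10 * baseline, written with integer factors to avoid
    # floating-point rounding right at the 10% boundary
    return 10 * f.walk_followup_m >= 11 * f.walk_baseline_m


def fc_improved(f: FollowUp) -> bool:
    # Assumption: a lower class number is a better functional class
    return f.fc_followup < f.fc_baseline


def is_responder(f: FollowUp) -> bool:
    # Assumption: all three criteria must hold simultaneously
    return (walk_improved_10pct(f)
            and fc_improved(f)
            and not (f.deteriorated or f.died))


if __name__ == "__main__":
    patient = FollowUp(300, 340, 3, 2, deteriorated=False, died=False)
    print("Responder:", is_responder(patient))
```

Keeping the individual criteria as separate functions makes it possible to report, alongside the composite, how many patients met each lesser criterion on its own.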
Advantages/disadvantages of composite endpoints
Composite endpoints are derived from a combination of individual endpoints and have been validated in heart failure trials. They have several advantages over single endpoints:[52] (1) precision (and therefore statistical power) increases with event rate; (2) a composite endpoint can make it easier to detect a therapeutic benefit compared with analyzing each component separately, without requiring an increase in sample size (the higher the number of events, the smaller the sample size required based on more power to detect any treatment effect); and (3) besides mortality, clinically relevant components such as 6-MWD or NYHA/WHO FC may be incorporated, offering a more global assessment of the patient and their clinical condition. For both patients and physicians, it is more relevant to assess improvement over a short period of time rather than waiting for deterioration or death. Use of composite “improvement” endpoints allows individual responders to be identified, lowers the placebo response and thereby also lowers the number of patients needed. It also permits the investigation of a drug effect in a shorter period of time.
However, the use of composite endpoints in clinical trials also has several disadvantages: (1) for TTCW, the event rate may vary and is sometimes hard to predict at the start of a study. To mitigate this, more recent trials are “event-driven,” that is, they keep patients enrolled until an endpoint occurs, which sometimes leads to considerable adjustments of the sample size and duration of the study;[51] (2) an individual component may confound the entire composite endpoint;[53] (3) outcomes such as hospitalization can be driven by social and nonmedical factors and need to be defined as disease driven;[54] (4) the inclusion of individual endpoints with country-specific availability, that is, lung transplantation, may pose an imbalance in multinational studies; (5) a composite endpoint assumes that each of the components has equal implications to the patient and the physician. This may not always be the case. For example, TTCW may be driven by deterioration in 6-MWD as opposed to death; and (6) due to the rigorousness of a composite “improvement” endpoint, the responder rate may be viewed as low even though a high proportion of patients may have benefitted in their clinical well-being overall.
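The relationship between event rate and sample size noted among the advantages can be illustrated numerically. The Python sketch below is not taken from the source; it assumes one common approximation (Schoenfeld's formula for a two-arm time-to-event comparison with 1:1 allocation), in which the required number of events depends only on the hazard ratio, the significance level and the power, and the number of patients is then the number of events divided by the overall probability of observing an event during the trial. The hazard ratio and event probabilities used are hypothetical.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio: float, alpha: float = 0.05,
                    power: float = 0.80) -> int:
    """Events needed under Schoenfeld's approximation (two-sided test, 1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


def required_patients(hazard_ratio: float, event_probability: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients needed when a fraction `event_probability` is expected to have an event."""
    if not 0 < event_probability <= 1:
        raise ValueError("event_probability must be in (0, 1]")
    events = required_events(hazard_ratio, alpha, power)
    return math.ceil(events / event_probability)


if __name__ == "__main__":
    # Hypothetical comparison: a rare single endpoint versus a more frequent composite
    for label, p in [("single endpoint, 10% event probability", 0.10),
                     ("composite endpoint, 40% event probability", 0.40)]:
        print(label, "->", required_patients(0.6, p), "patients")
```

The sketch also reflects the caveat in point (1) above: the event probability must be guessed at the design stage, which is why event-driven trials fix the number of events rather than the number of patients.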
Recommendations
Appropriately designed and validated composite endpoints can provide a clinically relevant and valid means of investigating new treatments in trials
Should a composite endpoint such as TTCW or improvement be incorporated into a trial, the individual components of such an endpoint should be clinically relevant, of prognostic value and ideally standardized across clinical trials in PAH[50]
For non-PAH indications, the composition of such an endpoint may read differently and should be developed according to the underlying disease
The successful design and implementation of composite endpoints into clinical trials will require a consensus to be reached between PH experts, pharmaceutical companies and regulatory authorities
The group of experts from the Dana Point 4th World Symposium proposed the following:
A uniform definition of TTCW should be used in future pivotal (Phase III) RCTs in PAH. In the definition of TTCW, hard events would include the following:
All-cause mortality
Nonelective hospital stay for PAH (with predefined criteria, usually for initiation of intravenous prostanoids, lung transplantation, or septostomy)
Disease progression defined as a reduction from baseline in the 6-MWD by 15%, confirmed by two studies done within two weeks plus worsening FC (except for patients already in FC IV)
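The proposed disease progression criterion can be sketched as a small rule. The Python example below is one possible reading of the definition and not an adjudication algorithm from the source: it assumes that "reduction by 15%" means a decline of at least 15% from baseline, that confirmation means two qualifying walk tests no more than 14 days apart, and that the FC requirement is waived for patients already in FC IV at baseline because they cannot worsen further. All function names and example values are hypothetical.

```python
from datetime import date


def walk_decline_confirmed(baseline_m, tests, window_days=14):
    """True if two tests at least 15% below baseline fall within `window_days` of each other.

    `tests` is an iterable of (date, distance_in_meters) pairs.
    """
    # distance <= 0.85 * baseline, written with integer factors to avoid
    # floating-point rounding right at the 15% boundary
    low_dates = sorted(day for day, dist in tests if 100 * dist <= 85 * baseline_m)
    return any((later - earlier).days <= window_days
               for earlier, later in zip(low_dates, low_dates[1:]))


def fc_criterion_met(fc_baseline, fc_current):
    """Worsening FC, waived for patients already in FC IV (assumed interpretation)."""
    return fc_baseline == 4 or fc_current > fc_baseline


def is_disease_progression(baseline_m, tests, fc_baseline, fc_current):
    return (walk_decline_confirmed(baseline_m, tests)
            and fc_criterion_met(fc_baseline, fc_current))


if __name__ == "__main__":
    tests = [(date(2012, 3, 1), 335), (date(2012, 3, 10), 330)]
    print("Progression:", is_disease_progression(400, tests, fc_baseline=2, fc_current=3))
```

A rule of this kind would still need the event adjudication infrastructure discussed next, since borderline walk tests and FC assignments involve clinical judgment.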
The consensus was that when TTCW is used in an RCT, an infrastructure would be required to adjudicate events in question. This will be necessary particularly with respect to “worsening PH” events. Other insights from the FDA on the use of TTCW as an endpoint have been that, while it is acceptable, assigning a numerical value to each component might further enhance the reliability of this endpoint. In addition, capturing the total number of events would provide a broader, more inclusive endpoint. A numerical system which would include multiple events for a given patient could be designed.
BIOMARKERS
BNP and NT-pro BNP
Plasma brain natriuretic peptide (BNP)[55–57] and its N-terminal prohormone (NT-pro BNP)[58] are secreted mainly by the ventricular myocytes in response to volume overload and increased wall stress. Hesselstrand et al.[59] showed that natriuretic peptide levels were related to the transtricuspid gradient in 227 consecutive patients with scleroderma. Several studies have evaluated BNP and NT-pro BNP as biomarkers of prognosis in patients with PAH.[55–58] Patients with BNP levels above various cutoff values have had poorer outcomes than patients with lower levels.[60] BNP and NT-pro BNP have also been used in patients with PH in the setting of chronic parenchymal lung disease,[61,62] congenital systemic-to-pulmonary shunts,[63] and in acute and chronic thromboembolic disease.[55,64,65]
Uric acid
Serum uric acid (UA) is a marker of impaired oxidative metabolism and is elevated in several chronic conditions such as heart failure and chronic obstructive pulmonary disease (COPD). In a study of 99 IPAH patients, Nagaya et al.[66] showed that serum UA levels were elevated, correlated with pulmonary hemodynamics, had a strong association with long-term mortality and decreased with vasodilator therapy.
Renal function
Decreased renal function, as measured by elevated blood urea nitrogen levels[67] or by increased serum creatinine and decreased glomerular filtration,[68] has been shown to be associated with a worse hemodynamic profile and to be an independent predictor of mortality in patients with PAH.
Other circulating markers
Markers of endothelial dysfunction are of great interest in PAH. Endothelin-1 (ET-1) is a potent vasoconstrictor produced by endothelial cells[69] and has shown some promise as a biomarker for PAH. A small study[70] found that active ET-1 and its precursor, big ET-1, correlated with cardiopulmonary hemodynamics and 6-MWD and were strong prognostic markers for patients with IPAH. In a recent study of PAH patients,[71] ET-1/ET-3 ratio had a strong correlation with RAP, mixed venous oxygen saturation, WHO FC and 6-MWD.
D-dimer is elevated in patients with IPAH compared with controls and is associated with disease severity and 1-year survival.[72] Synthesized mainly in endothelial cells, plasma von Willebrand factor (vWF) plays a role in platelet aggregation and adhesion at sites of vascular injury, is elevated in severe PAH and changes in parallel with improvements in hemodynamics in response to prostacyclin therapy.[73] In a retrospective cohort study of PAH patients, increased vWF levels at baseline and follow-up were associated with reduced survival.[74] Elevated plasma vWF antigen (vWF: Ag) has also been found in PAH and baseline vWF: Ag correlated with the risk of death in the subsequent year.[75,76]
Several markers of inflammation, such as C-reactive protein,[77] growth differentiation factor-15,[78] and certain interleukins[79] have been shown to have potential for prognostic information as well; however, these require further study and validation. Cardiac troponin-T is a sensitive and specific marker for myocardial injury and can be detected in the setting of acute RV failure from acute pulmonary embolism.[80] Preliminary information suggests that detectable cardiac troponins may be a marker of poor prognosis in patients with PAH.[81] Very little information is available regarding changes in any of these biomarkers in response to therapy. A recent study showed an improvement in levels of angiopoietin 2, matrix metalloproteinase 9 and vascular endothelial growth factor with the addition of intravenous treprostinil.[82]
There is very little experience with using blood biomarkers to assess response to therapy in PAH, and thus there are few data on their utility for this purpose in clinical practice.
Recommendations regarding blood biomarkers
We should determine which biomarker (e.g., BNP vs. NT-pro BNP) should be used in clinical trials to ensure adequate validation of that variable.
All clinical trials going forward should, at a minimum, include BNP or NT-pro BNP as an exploratory measure of outcome.
All clinical trials should include at least two other biomarkers as exploratory outcome measures for future validation.
Blood and tissue repositories should be created in association with all clinical trials going forward so that if new biomarkers become available in the future, their validity may be determined objectively.
Imaging of the RV in pulmonary arterial hypertension clinical trials
Despite the tremendous attention that left ventricular (LV) failure has received, RV failure has remained understudied at both the preclinical and clinical levels, even though in patients with PAH the status of the RV is the most important predictor of both morbidity and mortality.[83–85]
RV function can be affected by experimental therapies that target the pulmonary circulation. For example, the PDE 5 inhibitors have direct effects on the hypertrophied (but not normal) RV.[86] The other two classes of currently approved drugs for PAH were both initially developed to treat LV diseases, and both failed clinical trials with potentially increased mortality, suggesting possible adverse effects on the myocardium. Thus, the possibility that a negative response to an experimental therapy is due to suppression of RV function (despite beneficial effects on the pulmonary vessels) needs to be considered, as it could completely alter the interpretation of the results.
Echocardiography
Echocardiography is used widely in the assessment of patients with PAH and RV disease, although it remains inferior to magnetic resonance imaging (MRI) for overall assessment of RV function (mostly due to the complex, crescent-like shape of the RV). Recently, however, two methods have emerged as reliable indices of RV function and contractility. Tricuspid Annular Plane Systolic Excursion (TAPSE) reflects the longitudinal systolic excursion of the lateral tricuspid valve annulus toward the apex. It is usually measured using M-mode imaging in the 4-chamber view, and studies have shown good correlation between TAPSE and RV ejection fraction measured by radionuclide angiography.[87–89] The second is a noninvasive index of contractility based on the myocardial isovolumic acceleration (IVA) assessed by tissue Doppler. IVA reflects RV myocardial contractile function, is less affected by preload and afterload within a physiologic range than either dP/dt max or elastance, and has been extensively validated clinically.[90–92] Both methods are used clinically and can be standardized for clinical trials.
Magnetic resonance imaging
Cardiac MRI (cMRI) is the gold standard for evaluating right heart structure and function. The complex 3D structure of the RV can be directly evaluated with MRI in order to measure RV volume, mass and function (e.g., ejection fraction)[93,94] without the need for computational assumptions; values for RV mass and volume in normal cohorts have also been reported.[95] Recent studies have demonstrated the prognostic value of MRI-assessed RV mass and end-diastolic volumes in PAH.[96] MRI measurements of chamber volumes and mass have very high inter-study reproducibility,[97,98] making MRI an important tool for clinical trials.
Pulmonary angiography may also be performed using MRI, and pulmonary blood flow can be quantified in patients with PAH.[99] In addition, RV stress (e.g., adenosine) perfusion protocols can be added in a manner similar to those applied for LV ischemia.[7,100,101] Recent studies showed evidence of MRI-measured ischemia in the RV of SSc-PAH patients.[102] If RV ischemia is considered a contributor to RV failure, this technique may allow protocols to measure perfusion directly. In addition, if experimental therapies to modulate angiogenesis are tested in PAH, their potential effect on the RV should be considered. MRI offers the ability to measure lung parenchyma and RV free wall ischemia in the same setting. Its ability to offer a “one-stop shop” comprehensive assessment of the “RV-pulmonary circulation” unit is increasingly being recognized.[86]
Metabolic and molecular imaging
There is some evidence that the metabolism of the RV, which changes as it hypertrophies, is etiologically involved in RV failure and can be therapeutically targeted.[103] This means that it could be followed by imaging tools such as positron emission tomography (PET). Many questions still need to be resolved with mechanistic studies (e.g., whether a switch in metabolism might be related to the transition from compensated to decompensated RV function). In addition, appropriate PET studies are difficult to standardize, as some patients with PAH may have a generalized metabolic disturbance (e.g., insulin resistance[104]). Overall, the use of PET is promising, but its inclusion in clinical trials may be premature.
Recommendations
cMRI is the gold standard test for assessment of RV function and remodeling. Therefore, some cMRI parameters (e.g., RV mass and RVEF) should be validated as endpoints for clinical trials.
All clinical trials should include imaging sub-studies which would allow validation of valuable imaging endpoints.
TAPSE should be validated as a reliable endpoint in response to therapy.
CONCLUSIONS
In this document, we have reviewed the evidence related to the validity of current and emerging endpoints in clinical trials for PAH. We believe there is an urgent need to identify and validate novel endpoints that reliably reflect the disease status (from both a pulmonary vascular and an RV standpoint) and its response to therapy. Composite endpoints seem to be most valuable at this time, although defining the best objective endpoints (including survival and lung transplantation) to be included in a composite score may be challenging. As treatment of the disease slowly moves toward more effective targeted therapy, this effort at defining reliable endpoints should be rewarding.
