
Abstract

Background. Several novel drug- and cell-based potential therapies for spinal cord injury (SCI) have either been applied or will be considered for future clinical trials. Limitations on the number of eligible patients require that trials be undertaken in a highly efficient and effective manner. However, this is particularly challenging because people living with incomplete SCI (iSCI) represent a very heterogeneous population in terms of recovery patterns and can improve spontaneously over the first year after injury. Objective. The current study addresses 2 requirements for designing SCI trials: first, enrollment of as many eligible participants as possible; second, refined stratification of participants into homogeneous cohorts from a heterogeneous iSCI population. Methods. This is a retrospective, longitudinal analysis of prospectively collected SCI data from the European Multicenter study about Spinal Cord Injury (EMSCI). We applied conditional inference trees to provide a prediction-based stratification algorithm that could be used to generate decision rules for the appropriate inclusion of iSCI participants in a trial. Results. Based on baseline clinical assessments and a defined subsequent clinical endpoint, conditional inference trees partitioned iSCI participants into more homogeneous groups with regard to the illustrative endpoint, upper extremity motor score. Assuming a continuous endpoint, the conditional inference tree was validated both internally and externally, providing stable and generalizable results. Conclusion. The application of conditional inference trees is feasible for iSCI participants and provides easily implementable, prediction-based decision rules for inclusion and stratification. This algorithm could be utilized to model various trial endpoints and outcome thresholds.

Keywords

unbiased recursive partitioning, inclusion/exclusion criteria, upper extremity motor score, clinical trial design

Introduction

In the field of spinal cord injury (SCI) research, many discoveries from basic research and clinical investigation are worthy of consideration as potential SCI therapeutics and have entered the translational process (for an overview of SCI clinical trials, see www.scope-sci.org). As SCI has a relatively low incidence,¹ there is a limited number of potentially appropriate acute and subacute trial participants. With the increasing number of new interventions, the recruitment of participants to acute and subacute SCI trials needs to be developed in a more inclusive and effective manner.

If SCI trials only proceed by sequentially enrolling participants with very narrow inclusion criteria,² so as to ensure subject homogeneity, enrollment progress will remain slow, negatively affecting the timely and financially sustainable accomplishment of the planned goals. In addition, with a narrow recruitment strategy, the knowledge gained from a completed trial will be restricted to this narrow SCI subpopulation; additional studies will then be required for SCI subjects with other characteristics, which is inefficient. Alternatively, once safety has been established, a more inclusive strategy would enroll participants with varying degrees of sensorimotor complete and incomplete SCI (iSCI). However, this can only be sensibly implemented with stratification algorithms to limit subject heterogeneity within a study cohort.^3,4

Including iSCI participants is justified and required since many of the experimental treatments being translated for human SCI were developed using animal models with incomplete lesions. It is also expected that iSCI trial participants are more likely to benefit from interventions that target the spared sensorimotor function. An inclusive enrollment strategy will more efficiently utilize potential trial participants, as well as decrease the time and therefore the costs for completing a SCI trial program.

However, inclusive recruitment and enrollment strategies for iSCI participants pose specific challenges. Incomplete SCI comprises a highly heterogeneous population in terms of level and severity of injury, as well as the diversity of recovery patterns.⁴ Since iSCI participants have some preserved sensory and/or motor function, any adverse events could also compromise spontaneous recovery.

Predictive algorithms are required that will quickly and accurately exclude those patients whose spontaneous recovery would be so extensive it would obscure any therapeutic benefit. In an acute study, the algorithm would have to accurately stratify participants on the basis of early clinical characteristics. Thus, the selection of sensitive and reliable endpoints is critical and such endpoints may be specific to a subset of stratified cohorts.

To capitalize on an inclusive recruitment approach, while carefully monitoring its limitations, requires the ability to recognize early clinically relevant subgroups of patients. The idea of identifying and stratifying similar study participants is demanding, but not new.⁵ In this study, we propose an advanced prediction-based stratification approach based on a recently developed Unbiased Recursive Partitioning technique called Conditional inference TREEs⁶ (URP-CTREE). Our aim is to assess the potential of URP-CTREE in defining a priori stratifications in a heterogeneous neurological disorder as represented by iSCI participants.

Methods

Data Source

The data utilized in this study were extracted from the European Multicenter study about Spinal Cord Injury (EMSCI, www.emsci.org, ClinicalTrials.gov Identifier: NCT01571531) according to the inclusion criteria reported below. EMSCI encompasses 21 centers from 7 European countries that have been tracking the functional and neurological status of patients during the first year of recovery from SCI in a rigorously standardized way since 2001.

Inclusion Criteria

From the 2597 patients present in the EMSCI database, we extracted all those de-identified individuals with an initial baseline assessment (<2 weeks) having the most caudal spinal segment with intact sensorimotor function (ie, neurological level of injury [NLI]) between C4 and C7 as defined by the International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI)⁷ and described by the American Spinal Injury Association Impairment Scale (AIS). SCI participants have typically been classified using the 5 grades (A-E) of the AIS. These SCI severity categories are broad and have limited value for research studies. Nevertheless, it is a commonly used classification system and a reasonable reference point for describing the participants in this study. We included AIS-B iSCI participants with sensory perception to S4 to S5, but no motor function preserved more than 3 segments below the cervical NLI (sensory incomplete SCI), as well as AIS-C participants (nonfunctional motor incomplete), where sensory and motor function are partially preserved below the NLI, but more than half the key muscles, below the NLI, have a muscle grade strength of <3/5. An additional minimum criterion was a neurological assessment at 6 months with entry of an upper extremity motor score (UEMS), since this was the primary endpoint of this illustrative analysis. Patients with suspected central cord syndrome⁸ were excluded (n = 5), as routinely done in many clinical trials.

We also excluded from this analysis tetraplegic sensorimotor complete (AIS-A) patients, as they have been previously examined and reported.⁹ Finally, we excluded cervical AIS-D participants. Spontaneous motor recovery after a cervical AIS-D injury is substantial with little room on the UEMS scale to measure a therapeutic benefit (ie, ceiling effect).

Predictors and Clinical Endpoint

Clinical neurological predictors included in this analysis, as well as the illustrative clinical endpoint, were selected based on the published literature^10,11 and clinical experience of the authors. The set of clinical characteristics (predictors) characterizing the neurological status of patients at baseline (assessed <2 weeks after iSCI) included the following: the NLI, the right and left motor levels, the right and left sensory levels, the bilateral sensory scores (light touch, pin prick), and motor scores (upper and lower extremity motor score), as well as epidemiological information, such as age at injury and cause of injury. The set of predictors characterize the neurological status of the subjects according to the criteria of the ISNCSCI examination.⁷

The illustrative clinical endpoint chosen for this study was the bilateral UEMS at 6 months after injury, which consists of the sum of the manual muscle tested (MMT) strength for each of the 10 key muscles of the upper extremity (5 on each side of the body) and ranges between 0 and 50. This endpoint has been previously related to changes in upper extremity function,¹² as well as suggested¹³ as a potential endpoint in clinical trials. A number of trials also considered the change in UEMS as the primary endpoint.^3,14 In this regard, we note that the stratification based on our illustrative endpoint does not preclude the use of change in motor score as a trial endpoint. In fact, the 2 endpoints are strongly interconnected and their simultaneous consideration seems sensible. First, the direct application of URP-CTREE to the change in UEMS creates a situation in which both participants with severe lesions and those with high initial UEMS achieve similar, small changes in UEMS (the former due to the severity of their lesion, the latter solely due to the ceiling effect on the scale used to measure motor function), making stratification for trials questionable. In addition, one has to consider UEMS so as to know whether the postulated treatment effect (especially if defined on the change in UEMS) can actually be measured on the UEMS scale. It should also be noted that, since the identification of more homogeneous groups with regard to UEMS at 6 months after injury is based on baseline UEMS (among others), information on changes in UEMS is actually contained in our conditional tree (see Figure 1). More generally, UEMS is an impairment-based clinical endpoint and may or may not be the most appropriate primary endpoint for a future trial. Phase III pivotal trials will likely require a function-based endpoint. Functional (activity-based) endpoints can be analyzed using the methods described here, but this is beyond the scope of this article.

Figure 1.

Conditional inference tree for the endpoint Upper Extremity Motor Score (UEMS) at 6 months after cervical motor-incomplete (American Spinal Injury Association Impairment Scale B and C) spinal cord injury (N = 122), using a broad set of neurological predictors assessed within the first 2 weeks after injury.

Conditional Inference Trees

In our analysis, we adopted the approach of recursive partitioning by conditional inference⁶ (URP-CTREE). Tree-based regression models of this type are particularly useful for screening heterogeneous patient populations to identify more homogeneous patient subgroups. In fact, this type of approach has been successfully applied in disparate clinical fields, including cardiovascular disease,¹⁵ genetic marker–tumor association studies,^16,17 and traumatic brain injury.¹⁸

The URP-CTREE algorithm is based on the iterative application of a variable selection and a population splitting procedure. In the first step, the algorithm assesses whether any early clinical predictor (baseline variable) is associated with the selected clinical endpoint. To each association, a multiple testing-corrected P value based on permutation tests is assigned. When no early clinical predictor conveys any information about the clinical endpoint (response variable), and therefore no association is statistically significant (P ≥ .05), the algorithm stops. In contrast, when at least one early predictor is significantly associated with the selected clinical endpoint, a split in the patient population is performed, based on the predictor that showed the most significant association with the endpoint. The split is established in a way that maximizes the difference between the 2 newly formed subgroups with regard to the endpoint. Subgroups that will be further split in successive iterations are referred to as inner nodes and shown graphically as circles indicating the predictor on which a further split is performed (as well as the multiple-testing corrected P value for the predictor-endpoint association; see upper part of Figure 1). In that sense, inner nodes are just graphical placeholders for subgroups that will be further split by the algorithm. The steps outlined above are reiterated until there are no statistically significant associations left between early clinical predictors and the selected clinical endpoint. When this is the case, the algorithm stops and returns a structure resembling an upside-down tree that divides an initially heterogeneous study population into more homogeneous subgroups with respect to the endpoint specified. Figure 1 shows an example where the initial, heterogeneous iSCI population “enters” the tree on the top inner node and is subsequently subdivided by the internal rules of the algorithm until no further partitioning is possible. At this stage, the endpoint distribution within each subgroup is displayed in the URP-CTREE’s terminal nodes (see boxplots in Figure 1).
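The variable selection step described above can be illustrated with a minimal Python sketch. It is not the implementation in the R package used for this analysis. The test statistic (absolute Pearson correlation), the Monte Carlo permutation scheme, the Bonferroni correction, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np

def permutation_p_value(x, y, n_perm, rng):
    """Monte Carlo permutation p-value for the association between x and y.

    Assumption: |Pearson correlation| is used as the test statistic.
    """
    observed = abs(np.corrcoef(x, y)[0, 1])
    exceed = sum(
        abs(np.corrcoef(x, rng.permutation(y))[0, 1]) >= observed
        for _ in range(n_perm)
    )
    # The +1 terms count the observed ordering itself, so p is never 0.
    return (exceed + 1) / (n_perm + 1)

def select_predictor(X, y, names, alpha=0.05, n_perm=1000, seed=0):
    """Return (name of selected predictor or None, adjusted p-values).

    Each predictor (column of X) is tested against the endpoint y. The
    p-values are Bonferroni-adjusted (an assumed correction). If no
    adjusted p-value is below alpha, None is returned: the node is not
    split and becomes a terminal node.
    """
    rng = np.random.default_rng(seed)
    n_pred = X.shape[1]
    p_adj = [
        min(1.0, n_pred * permutation_p_value(X[:, j], y, n_perm, rng))
        for j in range(n_pred)
    ]
    best = int(np.argmin(p_adj))
    if p_adj[best] >= alpha:
        return None, p_adj
    return names[best], p_adj
```

Applied recursively to each newly formed subgroup, together with a split point search on the selected predictor, this selection step yields the tree structure described in the text.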

A detailed technical description with several applications of the statistical model is reported in Hothorn et al,⁶ and a more pertinent SCI explanation, using cervical AIS-A SCI, has been recently completed.¹⁹ In this previous study,¹⁹ we investigated the value of URP-CTREE in comparison to established multivariate statistical approaches (eg, linear and logistic regression) for predicting clinical endpoints and for prospective stratification of sensorimotor-complete (AIS-A) SCI participants. Our analyses concluded that, while all methods employed had similar prediction accuracy, URP-CTREE provided additional advantages by directly identifying more homogeneous subgroups of patients, and describing the full endpoint distribution within each subgroup.

Fitting and Validation

Following model fitting (see Figure 1) with the EMSCI data, we inspected the resulting conditional inference tree with regard to variable selection and split point estimation. Internal validation of the EMSCI-based URP-CTREE was investigated by means of a resampling technique. From the original EMSCI sample, we drew 1000 random samples with replacement. For each random sample (of sample size n = 122), we fitted a new conditional inference tree allowing the same predictors. We then investigated the frequency of predictor selection and the distribution of split points.
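The resampling procedure for the top inner node can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis code. Predictor selection is approximated by the largest absolute correlation with the endpoint instead of a permutation test, the split criterion is a standardized two-group statistic, and the names `best_cut`, `bootstrap_root_node`, and `min_bucket` are hypothetical.

```python
import numpy as np
from collections import Counter

def best_cut(x, y, min_bucket=7):
    """Cut point on x that maximizes the standardized left-group sum of y."""
    n, ybar = len(y), y.mean()
    ss = ((y - ybar) ** 2).sum()
    best, best_stat = None, -np.inf
    for cut in np.unique(x)[:-1]:
        left = x <= cut
        n_left = int(left.sum())
        n_right = n - n_left
        if min(n_left, n_right) < min_bucket or ss == 0:
            continue
        # Variance of the left-group sum under random relabeling.
        var = n_left * n_right * ss / (n * (n - 1))
        stat = abs(y[left].sum() - n_left * ybar) / np.sqrt(var)
        if stat > best_stat:
            best, best_stat = float(cut), stat
    return best

def bootstrap_root_node(X, y, names, n_boot=1000, seed=0):
    """Resample with replacement; record selected predictor and split point."""
    rng = np.random.default_rng(seed)
    n = len(y)
    selected = Counter()
    splits = {name: [] for name in names}
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap sample of size n
        Xb, yb = X[idx], y[idx]
        # Stand-in for the permutation-test ranking (assumption).
        strength = [abs(np.corrcoef(Xb[:, j], yb)[0, 1])
                    for j in range(Xb.shape[1])]
        j = int(np.argmax(strength))
        selected[names[j]] += 1
        cut = best_cut(Xb[:, j], yb)
        if cut is not None:
            splits[names[j]].append(cut)
    return selected, splits
```

The counter gives the selection frequency of each predictor, and the lists of cut points give the split point distribution, which are the two quantities inspected in the internal validation.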

External validity was assessed using a similar but independent sample of patient data from the Sygen²⁰ trial. We applied the decision rules provided by the inner nodes of the EMSCI-based tree (Figure 1) to Sygen participants with initial characteristics complying with the inclusion criteria outlined above. We then compared EMSCI-based terminal nodes to the results obtained for Sygen participants. Ninety-five percent confidence intervals for the median difference in all 4 pairs of terminal nodes were computed. All analyses were performed in the computing environment R, version 3.0.1,²¹ and based on the statistical packages “party: A laboratory for Recursive Partitioning (Version 1.0-13)”²² for conditional inference trees, and “coin: Conditional Inference Procedures in a Permutation Test Framework (Version 1.0-23)”²³ for confidence interval computation.
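Applying the decision rules of a fitted tree to an independent sample amounts to a few nested comparisons. The sketch below assumes the structure reported for the EMSCI-based tree (inner node 1 splits on baseline UEMS, inner node 2 on baseline light touch, inner node 5 on baseline UEMS) and assumes that values at or below a cut point go to the lower-numbered node. The cut points are parameters because their values appear only in Figure 1. All function and parameter names, as well as the example values used below, are hypothetical.

```python
def assign_terminal_node(uems, light_touch,
                         uems_cut_1, light_touch_cut, uems_cut_5):
    """Return the terminal node (3, 4, 6, or 7) for one participant."""
    if uems <= uems_cut_1:                  # inner node 1: lower baseline UEMS
        # inner node 2: split on baseline light touch
        return 3 if light_touch <= light_touch_cut else 4
    # inner node 5: second split on baseline UEMS
    return 6 if uems <= uems_cut_5 else 7

def outcomes_by_node(participants, **cuts):
    """Group 6-month UEMS values by assigned terminal node.

    participants: iterable of (baseline_uems, baseline_light_touch,
    uems_at_6_months) tuples from the independent data set.
    """
    groups = {3: [], 4: [], 6: [], 7: []}
    for uems, light_touch, outcome in participants:
        node = assign_terminal_node(uems, light_touch, **cuts)
        groups[node].append(outcome)
    return groups

# Placeholder cut points for illustration; NOT the values of the fitted tree.
example_cuts = dict(uems_cut_1=20, light_touch_cut=30, uems_cut_5=35)
example = outcomes_by_node([(5, 10, 12), (5, 50, 28), (25, 60, 38)],
                           **example_cuts)
```

The outcome distributions collected per node can then be compared with those of the data set used to fit the tree.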

Results

Fitting the Conditional Inference Tree

URP-CTREE for UEMS at 6 months based on a sample of 122 tetraplegic (ie, cervical) AIS-B (n = 53) and AIS-C (n = 69) participants extracted from EMSCI is shown in Figure 1. The iterative identification of more homogeneous subgroups is based on 2 baseline neurological assessments, namely, UEMS and light touch sensory perception. Splits are indicated at the inner nodes 1, 2, 5. URP-CTREE produced 4 terminal nodes (3, 4, 6, 7), for which the endpoint distribution is shown in the boxplots at the bottom. Table 1 reports a numeric summary of outcome distribution as well as the initial baseline AIS distribution for each terminal node (Figure 1).

Table 1.

Distribution of the Illustrative Endpoint UEMS at 6 Months After Injury for Each Terminal Node in the Conditional Inference Tree Based on EMSCI Participants^a.

UEMS	Node 3	Node 4	Node 6	Node 7
Minimum	5	6	18	32
25% Quantile	10	23	31.75	41.75
Median	13	30	37.5	44.5
75% Quantile	20	36	42.25	48
Maximum	43	48	50	50
AIS-B (total 53)	8 (15%)	18 (34%)	15 (28%)	12 (23%)
AIS-C (total 69)	5 (7%)	19 (27%)	33 (48%)	12 (17%)

Abbreviations: UEMS, upper extremity motor score; EMSCI, European Multicenter study about Spinal Cord Injury; AIS, American Spinal Injury Association Impairment Scale.

The distribution of baseline AIS grades is also reported.

Variable Selection

Variable selection is the fundamental step of URP-CTREE and relies on the computation of permutation tests between predictors and endpoint as well as the identification of the most significant associations. The P-value-based ranking of the best 4 predictors for each inner node is reported in Table 2. Baseline UEMS is the variable selected in inner node 1 (P value < .001) and inner node 5 (P value = .001), and baseline light touch sensation is selected in inner node 2 (P value = .024). The EMSCI-based conditional inference tree shows a stable variable selection, for which the selected variable is clearly the most significant from within all predictors included in the analysis. The conditional inference tree stopped growing (no further split was implemented), as no further predictor–endpoint association was significant at the P value = .05 level.

Table 2.

Variable Selection for the Inner Nodes of the EMSCI-Based Conditional Inference Tree^a.

Node 1		Node 2		Node 5
Predictors	P Value	Predictors	P Value	Predictors	P Value
UEMS	3.1 × 10⁻¹¹	Light touch	.024	UEMS	.001
ML (right)	.0002	UEMS	.437	ML (right)	.410
ML (left)	.001	ML (right)	.542	ML (left)	.476
NLI	.001	Pin prick	.868	Age	.724
Other predictors		Other predictors		Other predictors

Abbreviations: EMSCI, European Multicenter study about Spinal Cord Injury; UEMS, upper extremity motor score; ML, motor level; NLI, neurological level of injury.

The first 4 predictors are reported by increasing P value (decreasing importance). Nominal significance value is P = .05.

AIS Grades

As can be seen in Table 1, the initial classification of iSCI severity as AIS-B or AIS-C has little predictive value for the distribution of UEMS scores at 6 months. If it did, then most of the AIS-B participants would be stratified to cohort 3 where the lowest final UEMS scores were recorded and AIS-C participants would be more evident within cohort 7 where the highest UEMS scores were observed. Instead, a participant initially classified AIS-B was just as likely to show a substantial amount of motor function as an AIS-C participant (terminal node 7 has 12 AIS-B and 12 AIS-C). Thus, UEMS after 6 months could not be predicted by the initial baseline AIS grade. AIS-B and AIS-C participants at 6 months were distributed across all cohorts, often fairly equally.

Split Points

Following variable selection, a split is set so as to maximize discrepancy between the 2 newly formed subgroups. The split points estimated in each inner node are shown in Figure 2. The plots represent the standardized test statistics⁶ for each possible split as a measure of subgroup discrepancy; the vertical dotted line is the split point used in the EMSCI-based URP-CTREE. All vertical dotted lines aligned with the value that maximizes discrepancy between the 2 newly formed subgroups, which substantiates the validity of all 3 split points for the EMSCI-based conditional inference model. Even for inner node 2, whose split statistics seem to have 2 local maxima (2 “bumps”), the split was placed at the highest value of the statistic and was therefore correctly implemented.
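The search for a split point can be sketched as a scan over all candidate cut points of the selected predictor. The standardized statistic below (left-group sum of the endpoint, centered and scaled by its mean and variance under random relabeling) is one plausible form of such a discrepancy measure and is an assumption, as are the function name and the `min_bucket` safeguard against very small subgroups.

```python
import numpy as np

def best_split(x, y, min_bucket=7):
    """Return (cut point, standardized statistic) maximizing discrepancy."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    ybar = y.mean()
    ss = ((y - ybar) ** 2).sum()
    if ss == 0:
        return None, 0.0  # constant endpoint: nothing to separate
    best_cut, best_stat = None, -np.inf
    for cut in np.unique(x)[:-1]:           # every observed value but the max
        left = x <= cut
        n_left = int(left.sum())
        n_right = n - n_left
        if min(n_left, n_right) < min_bucket:
            continue
        expected = n_left * ybar             # mean of the left-group sum
        variance = n_left * n_right * ss / (n * (n - 1))
        stat = abs(y[left].sum() - expected) / np.sqrt(variance)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat
```

Plotting the statistic against the candidate cut points gives a curve of the kind shown in Figure 2, with the chosen split at its maximum.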

Figure 2.

Split point estimation of inner nodes for the European Multicenter Study about Spinal Cord Injury-based conditional inference tree.

Internal Validation

We examined the internal validity of the EMSCI-based tree using a standard resampling technique and investigated the frequency of predictor selection and the distribution of split points. Results of this resampling technique are shown in Figure 3. For inner node 1, baseline UEMS was selected in all resampling iterations (predictor 8, top panel left, Figure 3), and the split points resulted in an almost symmetric distribution around the first split point defining the maximal discrepancy for the 2 subgroups in EMSCI (vertical line, top panel right).

Figure 3.

Internal validation of variable selection and split point estimation based on resampling.

For inner node 2, baseline light touch was consistently selected, and no pattern of alternative variable selection emerged (predictor 6, middle panel left, Figure 3). For all iterations where light touch was the variable selected, the second split point distribution showed an approximately symmetric distribution centered at the split point implemented in the EMSCI-based conditional inference tree (vertical line, middle panel right).

For inner node 5, baseline UEMS was consistently selected, and no pattern of alternative variable selection emerged (predictor 8, bottom panel left, Figure 3). For all iterations where UEMS was the variable selected, the third split point distribution showed an almost symmetric distribution around the split point implemented in the EMSCI-based conditional inference tree (vertical line, bottom panel right).

External Validation

True validation of a statistical model can only be obtained by assessing its performance on an independent data set. We analyzed n = 83 participants from the Sygen²⁰ trial (from both control and treatment groups, as no significant difference was reported for these 2 different cohorts) by applying the decision rules provided by the inner nodes of the EMSCI-based tree and compared its terminal nodes to the results obtained for Sygen participants (Figure 4). Overall, we observed similar distributions for UEMS within terminal nodes for EMSCI and Sygen subjects.

Figure 4.

External validation of EMSCI-trained conditional inference tree based on Sygen participants with corresponding baseline characteristics.

Terminal node 6 was very similar for both data sets, whereas terminal nodes 3 and 7 showed a nonsignificant shift in distribution toward slightly higher UEMS values for Sygen participants. Terminal node 4 is similar for both data sets, especially in terms of interquartile range, but Sygen participants show a nonsignificant shift toward lower UEMS values. Terminal nodes 4 and 7 for Sygen as well as terminal node 3 for EMSCI represent a relatively small sample size.

Based on an asymptotic Wilcoxon Mann–Whitney rank sum test, a 95% confidence interval (CI) for the median difference (EMSCI − Sygen) was computed for each pair of terminal nodes. The sample estimates for the difference in medians (hereafter med.diff) and the corresponding 95% CIs were the following: med.diff (node 3) = −6.0 (CI: −13.0, 2.0); med.diff (node 4) = 3.0 (CI: −7.0, 13.0); med.diff (node 6) = 0.0 (CI: −4.0, 4.0); med.diff (node 7) = −2.0 (CI: −5.0, 0.0).
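One common large-sample construction of such an interval is the Hodges-Lehmann shift estimate with a confidence interval taken from the ordered pairwise differences. The sketch below illustrates that construction only. It ignores tie corrections, and the R package used in this analysis may compute the interval by a different numerical route, so results need not match exactly. The function name is hypothetical.

```python
import math
from statistics import NormalDist

import numpy as np

def shift_estimate_with_ci(a, b, level=0.95):
    """Hodges-Lehmann estimate of the shift (a minus b) with an
    approximate confidence interval based on the normal approximation
    to the Wilcoxon Mann-Whitney statistic (no tie correction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = len(a), len(b)
    # All m * n pairwise differences, sorted ascending.
    diffs = np.sort((a[:, None] - b[None, :]).ravel())
    z = NormalDist().inv_cdf(0.5 + level / 2)
    c = math.floor(m * n / 2 - z * math.sqrt(m * n * (m + n + 1) / 12))
    c = max(c, 1)
    lower = diffs[c - 1]        # c-th smallest difference
    upper = diffs[m * n - c]    # c-th largest difference
    return float(np.median(diffs)), float(lower), float(upper)
```

An interval that contains zero, as for all 4 node pairs above, provides no evidence of a shift between the two samples at the chosen level.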

Discussion

The intent of this study was to determine whether well-informed and predictable stratification algorithms can be achieved for an acute or subacute SCI clinical trial. Previously, this has been difficult because of the highly variable spontaneous recovery patterns after iSCI. There is a strong need for a clear a priori stratification algorithm and, as seen with the current example, URP-CTREE may be useful in this regard. The advantage of this tree-based regression model is its ability to automatically stratify participants into relatively homogeneous subgroups on the basis of early clinical characteristics.

So far, very few studies in SCI have attempted to provide models for long-term prediction of neurological endpoints that could be used for patient stratification. To our knowledge, existing clinical algorithms refer to ambulatory^24,25 or functional²⁶ endpoints, which require substantial recovery and are therefore likely to depend on either a large treatment effect and/or spontaneous recovery. Thus, they are potentially less useful for application to an acute SCI clinical trial investigating a subtle therapeutic effect that would be initially detectable by some small neurological change. In addition, those functional endpoint studies are based on the statistical techniques of multiple linear and logistic regression. In a recent study,¹⁹ we reported that those techniques do not provide direct rules for identifying more homogeneous cohorts and do not consider the full endpoint distribution. Here, we demonstrate the value of conditional inference trees⁶ to provide an easily implementable, data-driven stratification rationale for the inclusion of iSCI into future SCI trial programs.

Conditional Inference Tree for EMSCI Data

The conditional inference tree for tetraplegic AIS-B and AIS-C based on EMSCI data produces 3 split points based on baseline UEMS and baseline light touch sensation (Figure 1). All 3 split points (inner nodes 1, 2, 5; Figure 1) are based on the most significant predictor (early participant characteristic) at that inner node.

As expected, baseline UEMS after iSCI plays an important role in predicting UEMS values 6 months later. A nonlinear relationship between baseline UEMS and the 6-month value is clearly shown by the 2 subsequent splits of baseline UEMS, as has been shown for general motor recovery in previous studies.^4,20 The lower range of UEMS is further divided depending on the perception of light touch. In our analysis, light touch was clearly more predictive for future UEMS than pin prick sensation (see Table 2). This seems to be in contrast with previous findings.^4,27 Those findings predicted ambulation for motor complete, sensory incomplete (AIS-B) participants based on their lower extremity and sacral pinprick score. Our results are therefore complementary rather than contrasting.

Although not always statistically significant, Table 2 also shows that the baseline motor level is consistently ranked within the first 3 most important predictors for each node. The relevance of the motor level for spontaneous motor recovery is dependent on the nature of SCI and has been reported before.²⁸ Its explicit incorporation in future analysis of motor endpoints is warranted.

AIS Grades for SCI Patient Stratification

Over the past few decades, stratification procedures for SCI trials were narrow,² broad,³ or omitted.¹⁴ Clinicians have been classifying SCI in wide-ranging categories, the most common of which is the AIS classification. As can be seen in Table 1, the initial classification of iSCI severity as AIS-B or AIS-C had little predictive value for the distribution of participants with regard to their scores at 6 months. AIS-B and AIS-C participants were distributed across all cohorts, often fairly equally (Figure 1, nodes 3, 4, 6, 7). AIS grades have been repeatedly proven not to be a valuable measure for endpoint definition,^2,13,20 especially so in the context of SCI clinical trials with expected small treatment effects. Our analysis, even though based on AIS-B and AIS-C participants only, and other studies²⁹ show that AIS grades might not be appropriate for the fine-grained stratification required by clinical trials either.

Validation

The main goal of any clinical predictive model is to provide valid endpoint predictions for future participants.³⁰ To investigate the stability and generalizability of the conditional inference tree, we undertook internal and external validation of our results.

Internal validation was based on resampling and showed that predictor selection was consistent for all inner nodes, where the predictors selected in the EMSCI-based URP-CTREE were reliably selected (Figure 3, left panels) across resampling iterations. In addition, no pattern of selection for other predictors emerged from the resampling technique. The split point distribution for the predictors showed an approximately symmetric distribution around the original split (vertical line in right panels, Figure 3). Both predictor selection and split point distribution provide strong evidence for the stability of the EMSCI-based conditional inference tree.

To externally validate and support the general applicability of our conditional inference tree for UEMS, we applied the decision rules provided by the inner nodes of the EMSCI-based tree to an independent sample of Sygen participants (n = 83; from both control and treatment groups, as no significant difference was reported). Overall, we observed similar terminal nodes (final subgroups, Figure 4) for EMSCI and Sygen subjects, which provides evidence for the external validity of the EMSCI conditional inference tree. Similarities between European and North American spontaneous recovery profiles have already been reported for motor recovery after sensorimotor complete cervical (AIS-A) SCI.³¹ Our analyses suggest that parallels extend to the iSCI population.

Specifically, nodes 3 and 4 (Figure 4) for Sygen patients provided a less sharp distinction between low (EMSCI boxplot node 3, Figure 4) and intermediate (EMSCI boxplot node 4, Figure 4) UEMS status at 6 months after injury. Terminal node 3 in Sygen showed higher UEMS scores than the corresponding EMSCI terminal node, while terminal node 4 presents the opposite behavior.

The different enrollment time frames of Sygen (first assessment within 3 days of SCI) compared to EMSCI (first assessment within 2 weeks, on average 8 days after injury) may contribute to any observed differences in terminal nodes. To support this hypothesis, we selected those EMSCI participants (n = 25) having an initial baseline assessment within 3 days of injury (ie, same time window as for Sygen), and we noted a median UEMS of 38 at 6 months. For those EMSCI participants who had an initial assessment >3 days after injury (n = 97), the median UEMS was 34 at 6 months. This analysis of the EMSCI data supports our interpretation that enrollment time frame can influence UEMS outcomes.

Despite the observed differences (Figure 4), all 95% confidence intervals for the median difference of each terminal node pair included the value of zero, providing no evidence for a statistically significant shift in distribution in our data. Nonetheless, absence of evidence cannot be interpreted as evidence of absence, especially because of the small sample sizes.

While assessing external validity, we explicitly did not fit a new URP-CTREE to the Sygen data set as the goal of external validation is to see whether a fitted model provides valid prediction of an endpoint for an independent data set, and not whether the same model results from the 2 different datasets.

Applications to Clinical Trial Design

Our analysis was intended to provide an illustrative example of how URP-CTREE may be employed in the planning and designing of future clinical trials. Of the 4 nodes (3, 4, 6, 7) in Figure 1 and Table 1, a conservative enrollment strategy of iSCI participants in a phase II SCI clinical trial would only include participants within nodes 4 and 6. In this way, enrollment is restricted to iSCI participants that are likely to respond to a treatment, and participants where any treatment effect can be readily detected. In short, the assessment of a treatment effect is not hampered by floor or ceiling effects for the endpoint measurement. Participants in node 7 would likely be excluded because they will be constrained by a ceiling effect for the UEMS scale. For node 7, the median UEMS at 6 months is 44.5/50. Thus, only 5.5 motor points are available to detect a therapeutic effect in comparison to controls.

Participants in terminal node 3 could be excluded as well, because they show limited final upper extremity scores and any treatment targeting the enhancement of their spared sensorimotor function might only have a small effect. In addition, those participants are distinctly different from the other iSCI nodes in terms of final UEMS. In terms of UEMS and improvement in motor levels, the patterns for node 3 participants were very similar to those of cervical sensorimotor complete (AIS-A) participants and a similar clinical endpoint (2 motor level improvement) might be justified.^9,32 Therefore, iSCI participants like those in terminal node 3 could be stratified with cervical AIS-A participants in a clinical trial with an appropriate endpoint.

The potential number of eligible iSCI participants that may be recruited to any clinical study is an important consideration. Even if we applied the more restrictive criteria and suggested recruiting only participants within nodes 4 and 6 (Figure 1), a hypothetical trial could still recruit about 70% of eligible iSCI subjects (EMSCI sample: (37 + 48)/122 ≈ 70%). Undoubtedly, a number of other inclusion and exclusion criteria will influence the final number of participants enrolled (the “funnel effect”³³), but the potential inclusion of such a large percentage of iSCI participants, combined with the a priori identification of more homogeneous subgroups, represents a clear improvement in trial design.
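The recruitable share quoted above can be reproduced in a few lines. The node sizes (37 and 48) and the EMSCI sample size (122) come from the text; the variable names are illustrative only.

```python
node_sizes = {4: 37, 6: 48}  # participants in terminal nodes 4 and 6 (from the text)
emsci_sample = 122           # all iSCI participants in the EMSCI sample

included = sum(node_sizes.values())
fraction = included / emsci_sample
print(f"{included}/{emsci_sample} = {fraction:.1%}")  # 85/122 = 69.7%
```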

Modeling Specific Endpoints

Independent of its use in clinical trials, one of the key advantages of URP-CTREE over established statistical approaches^24-26 is that it presents the complete endpoint distribution within each terminal node.¹⁹ This allows the exploration of different neurological or functional endpoints and their appropriateness for distinct iSCI subgroups (terminal nodes).

Clearly, the use of URP-CTREE does not guarantee partitioning of a heterogeneous population in the same way as presented here. URP-CTREE is specific for a given patient population, clinical endpoint, and early clinical predictors (baseline variables). Changes in any of these components may influence the resulting partitioning by generating different splits, a different number of terminal nodes (subgroups), or a different endpoint distribution within each node.

With any comprehensive historical database, URP-CTREE can be employed to accomplish all the necessary modeling variations and gather insight into the expected behavior of the control group for any chosen endpoint. This is a major improvement and should provide additional confidence for the final trial protocol submission. In the study reported here, we have deliberately not chosen a threshold value for UEMS at 6 months after iSCI. A threshold for a binary endpoint could be explored using URP-CTREE, but selection of any endpoint threshold requires consideration of what is reasonable and/or meaningful.
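As a sketch of how candidate thresholds for a binary endpoint might be explored within one terminal node, the snippet below computes the share of participants at or above each threshold. No threshold was chosen in this study; the scores, thresholds, and function name are all hypothetical.

```python
def responder_rate(scores, threshold):
    """Share of participants whose score is at or above the threshold."""
    return sum(score >= threshold for score in scores) / len(scores)


# Hypothetical 6-month UEMS values for a single terminal node.
node_scores = [22, 27, 31, 34, 36, 39, 41, 45]

# Hypothetical candidate thresholds; none of these is proposed by the study.
for threshold in (30, 35, 40):
    rate = responder_rate(node_scores, threshold)
    print(f"UEMS >= {threshold}: {rate:.0%}")
```

Whether any such threshold is reasonable or clinically meaningful is a separate question that the computation itself cannot answer.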

Limitations

URP-CTREE is applicable to several types of regression problems, including nominal, ordinal, numeric, and censored endpoints.⁶ It remains to be demonstrated whether neurological scores like UEMS can validly be analyzed as a continuous endpoint in the way that time or distance can. Although treating it as continuous is the current default approach in SCI, total UEMS is in fact a sum of several ordinal variables (each manual muscle score, from a key muscle, is an ordinal scale in its own right). In other fields, it has been shown that analyzing an ordinal scale with methods designed for continuous endpoints can lead to errors in inference.^34-36 We are currently investigating methods to address this limitation.
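The concern about summing ordinal grades can be made concrete with a small sketch. It assumes the standard UEMS layout of 10 key muscles (5 per arm), each graded 0 to 5, which matches the maximum of 50 used above; the two muscle profiles and the function name are hypothetical.

```python
def uems_total(grades):
    """Sum 10 ordinal key-muscle grades (each 0 to 5) into a total score."""
    if len(grades) != 10 or any(g not in range(6) for g in grades):
        raise ValueError("expected 10 integer grades between 0 and 5")
    return sum(grades)


# Two hypothetical participants with very different muscle profiles.
profile_a = [5, 5, 5, 0, 0, 5, 5, 5, 0, 0]  # some muscles full strength, others absent
profile_b = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]  # uniformly partial strength

print(uems_total(profile_a), uems_total(profile_b))  # 30 30
```

Both profiles give the same total, so a method that treats the total as continuous cannot tell them apart and implicitly assumes that every one-grade step is equally large.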

Conclusion

Our analyses provide an example of how URP-CTREE can be employed to facilitate the design of more inclusive clinical trials. We utilized a conditional inference tree to provide an early prediction-based stratification of cervical iSCI patients, anchored by UEMS at 6 months after injury, that can inform inclusion/exclusion criteria. Internal validation of the EMSCI-derived URP-CTREE showed it to be very stable. External validation based on comparable incomplete SCI participants from the independent Sygen study supports its generalizability. Our analysis supports the use of URP-CTREE as a statistical tool for modeling various other clinical endpoints and different patient populations. External validity and the analysis of ordinal variables with techniques designed for continuous endpoints should be the subject of further investigation.

Footnotes

Acknowledgements

We appreciate the continuous assistance of René Koller with the EMSCI database.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors acknowledge the support of the European Multicenter study about Spinal Cord Injury (EMSCI), the International Foundation for Research in Paraplegia (IFP), the Spinal Cord Outcomes Partnership Endeavor (SCOPE), and the Clinical Research Priority Program in Neuro-rehabilitation (CRPP NeuroRehab) of the University of Zurich.

References

1. Wyndaele. Incidence, prevalence and epidemiology of spinal cord injury: what learns a worldwide literature survey? Spinal Cord. 2006;44:523-529.

2. Lammertse, Jones LAT, Charlifue. Autologous incubated macrophage therapy in acute, complete spinal cord injury: results of the phase 2 randomized controlled multicenter trial. Spinal Cord. 2012;50:661-671.

3. Casha, Zygun, McGowan, Bains, Yong, John Hurlbert. Results of a phase II placebo-controlled randomized trial of minocycline in acute spinal cord injury. Brain. 2012;135:1224-1236.

4. Fawcett, Curt, Steeves. Guidelines for the conduct of clinical trials for spinal cord injury as developed by the ICCP panel: spontaneous recovery after spinal cord injury and statistical power needed for therapeutic clinical trials. Spinal Cord. 2006;45:190-205.

5. Zhu, Zhan. Identification of prognosis-relevant subgroups in patients with chemoresistant triple-negative breast cancer. Clin Cancer Res. 2013;19:2723-2733.

6. Hothorn, Hornik, Zeileis. Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat. 2006;15. http://amstat.tandfonline.com/doi/full/10.1198/106186006X133933. Accessed January 20, 2014.

7. Kirshblum, Waring, Biering-Sorensen. Reference for the 2011 revision of the international standards for neurological classification of spinal cord injury. J Spinal Cord. 2011;34:547-554.

8. Pouw, van Middendorp, van Kampen. Diagnostic criteria of traumatic central cord syndrome. Part 1: A systematic review of clinical descriptors and scores. Spinal Cord. 2010;48:652-656.

9. Kramer JLK, Lammertse, Schubert, Curt, Steeves. Relationship between motor recovery and independence after sensorimotor-complete cervical spinal cord injury. Neurorehabil Neural Repair. 2012;26:1064-1071.

10. Al-Habib, Attabib, Ball, Bajammal, Casha, Hurlbert. Clinical predictors of recovery after blunt spinal cord trauma: systematic review. J Neurotrauma. 2011;28:1431-1443.

11. Wilson, Cadotte, Fehlings. Clinical predictors of neurological outcome, functional status, and survival after traumatic spinal cord injury: a systematic review. J Neurosurg Spine. 2012;17:11-26.

12. Rudhe, van Hedel HJA. Upper extremity function in persons with tetraplegia: relationships between strength, capacity, and the spinal cord independence measure. Neurorehabil Neural Repair. 2009;23:413-421.

13. Steeves, Lammertse, Curt. Guidelines for the conduct of clinical trials for spinal cord injury (SCI) as developed by the ICCP panel: clinical trial outcome measures. Spinal Cord. 2006;45:206-221.

14. Bracken, Shepard, Collins. A randomized, controlled trial of methylprednisolone or naloxone in the treatment of acute spinal-cord injury: results of the Second National Acute Spinal Cord Injury Study. N Engl J Med. 1990;322:1405-1411.

15. Costello, Swartz, Sabripour, Sharma, Etzel. Use of tree-based models to identify subgroups and increase power to detect linkage to cardiovascular disease traits. BMC Genet. 2003;4:S66.

16. Owzar. Alternate statistical tools and limitations in genetic marker association studies in single-arm drug cancer trials. J Clin Oncol. 2008;26:1400-1401.

17. De Roock, Claes, Bernasconi. Effects of KRAS, BRAF, NRAS, and PIK3CA mutations on the efficacy of cetuximab plus chemotherapy in chemotherapy-refractory metastatic colorectal cancer: a retrospective consortium analysis. Lancet Oncol. 2010;11:753-762.

18. Brown, Malec, McClelland, Diehl, Englander, Cifu. Clinical elements that predict outcome after traumatic brain injury: a prospective multicenter recursive partitioning (decision-tree) analysis. J Neurotrauma. 2005;22:1040-1051.

19. Tanadini, Steeves, Hothorn. Identifying homogeneous subgroups in neurological disorders: unbiased recursive partitioning in cervical complete spinal cord injury. Neurorehabil Neural Repair. 2014;28:507-515.

20. Geisler, Coleman, Grieco, Poonian, Group. The SYGEN® multicenter acute spinal cord injury study. Spine. 2001;26:S87-S98.

21. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/. Accessed January 21, 2015.

22. Hothorn, Hornik, Strobl, Zeileis. party: a laboratory for recursive partitioning (R package, Version 1.0-13). http://cran.r-project.org/package=party. Accessed January 21, 2015.

23. Hothorn, Hornik, van de Wiel, Zeileis. coin: conditional inference procedures in a permutation test framework (R package, Version 1.0-23). http://cran.r-project.org/package=coin. Accessed January 21, 2015.

24. Zörner, Blanckenhorn, Dietz, Curt. Clinical algorithm for improved prediction of ambulation and patient stratification after incomplete spinal cord injury. J Neurotrauma. 2010;27:241-252.

25. Van Middendorp, Hosman, Donders ART. A clinical prediction rule for ambulation outcomes after traumatic spinal cord injury: a longitudinal cohort study. Lancet. 2011;377:1004-1010.

26. Wilson, Grossman, Frankowski. A clinical prediction model for long-term functional outcome after traumatic spinal cord injury based on acute clinical and imaging factors. J Neurotrauma. 2012;29:2263-2271.

27. Oleson, Burns, Ditunno, Geisler, Coleman. Prognostic value of pinprick preservation in motor complete, sensory incomplete spinal cord injury. Arch Phys Med Rehabil. 2005;86:988-992.

28. Coleman. Injury severity as primary predictor of outcome in acute spinal cord injury: retrospective results from a large multicenter clinical trial. Spine J. 2004;4:373-378.

29. Velstra, Bolliger, Tanadini. Prediction and stratification of upper limb function and self-care in acute cervical spinal cord injury with the graded redefined assessment of strength, sensibility, and prehension (GRASSP). Neurorehabil Neural Repair. 2014;28:632-642.

30. Steyerberg. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating (Softcover reprint of hardcover 1st ed). New York, NY: Springer; 2009.

31. Steeves, Kramer, Fawcett. Extent of spontaneous motor recovery after traumatic cervical sensorimotor complete spinal cord injury. Spinal Cord. 2011;49:257-265.

32. Steeves, Lammertse, Kramer JLK. Outcome measures for acute/subacute cervical sensorimotor complete (AIS-A) spinal cord injury during a phase 2 clinical trial. Top Spinal Cord Inj Rehabil. 2012;18:1-14.

33. Jones LAT, Lammertse, Charlifue. A phase 2 autologous cellular therapy trial in patients with acute, complete spinal cord injury: pragmatics, recruitment, and demographics. Spinal Cord. 2010;48:798-807.

34. Winship, Mare. Regression models with ordinal variables. Am Sociol Rev. 1984;512-525.

35. Hastie, Botha, Schnitzler. Regression with an ordered categorical response. Stat Med. 1989;8:785-794.

36. Scott, Goldberg, Mayo. Statistical assessment of ordinal outcomes in comparative studies. J Clin Epidemiol. 1997;50:45-55.
