Abstract
Background:
Currently, 6-month oxaliplatin-based chemotherapy has been recommended as the preferred adjuvant treatment against high-risk stage 2 and stage 3 colon cancer patients.
Methods:
Records were retrieved from PubMed, Web of Science, the Cochrane Central Register of Controlled Trials, and the American Society of Clinical Oncology and European Society for Medical Oncology meeting libraries from inception to November 2019. Randomized controlled trials comparing the survival and tolerability of different adjuvant systemic regimens for high-risk stage 2 and stage 3 colon cancer were eligible. Disease-free survival was the primary endpoint. Network calculation was based on a random-effects model, and the relative ranking of each node was numerically indicated by the p score.
Results:
A total of 30 trials were included, corresponding to 54,109 patients. Regarding disease-free survival, none of the analyzed regimens displayed significant superiority against the common comparator, 6-month capecitabine plus oxaliplatin (XELOX), while 12-month capecitabine [network hazard ratio (HR) 0.81 (0.60–1.10); 0.79 (0.57–1.10)] and 3-month XELOX [0.95 (0.86–1.04); 0.93 (0.83–1.05)] were the top-ranking regimens showing non-inferiority among overall and stage 3 patients. Moreover, by pairwise meta-analysis, 3-month XELOX demonstrated significant superiority against 6-month XELOX among low-risk stage 3 patients [pairwise HR 0.78 (0.63–0.97)]. Concerning adverse events, 3-month oxaliplatin-based chemotherapy was significantly better than its 6-month counterpart with respect to peripheral sensory neuropathy, thrombocytopenia and fatigue. The 12-month capecitabine monotherapy failed to display non-inferiority for other major adverse events.
Conclusions:
The 3-month XELOX treatment could be an alternative to the 6-month regimen among low-risk stage 3 patients. Among high-risk stage 3 patients, 6-month oxaliplatin-based regimens still seem more competitive. In addition, 12-month capecitabine monotherapy should be applied with caution despite its top rankings, especially in non-Asian countries.
Keywords
Introduction
Colorectal cancer is currently the fourth most common and fifth most lethal malignancy worldwide. Nearly 1,400,000 cases are estimated to occur annually, and almost 700,000 patients die of the disease each year.1 In the United States, although the overall incidence of colorectal cancer has reportedly decreased over the past decade owing to earlier diagnosis and better cancer prevention,2,3 more effort is still required to improve survival among these patients.
Adjuvant systemic chemotherapy has become the standard of care for stage 2 and stage 3 colon cancer following curative surgery. Both the National Comprehensive Cancer Network (NCCN; 2019.V2) and Chinese Society of Clinical Oncology (2019.V1) guidelines recommend oxaliplatin-based regimens in the adjuvant setting for stage 3 and high-risk stage 2 operable colon cancer, while fluoropyrimidine monotherapy (capecitabine or 5-FU/leucovorin) is optional for low-risk stage 2 patients.3,4 The latest European Society for Medical Oncology (ESMO) colon cancer guideline (2013) suggests that oxaliplatin-based regimens should be regarded as the preferred options among stage 3 patients, while no specific regimen is recommended for high-risk stage 2 cases.5 In the Japanese Society for Cancer of the Colon and Rectum guideline for colorectal cancer (2019), oxaliplatin-based regimens are also recognized as the preferred regimens for stage 3 cases, while capecitabine, S-1 and UFT monotherapy are considered effective alternatives. However, fluoropyrimidine monotherapy is not indicated for stage 2 cases because evidence from randomized controlled trials (RCTs) in the Japanese population is lacking.6 In recent years, the duration of adjuvant treatment has become a research hotspot in this field. Five large-scale RCTs, namely ACHIEVE, HORG-IDEA, IDEA, SCOT and TOSCA (results of CALGB/SWOG 80702 had not been formally published in a journal or meeting abstract), studied the relative efficacy and tolerability of 3-month versus 6-month oxaliplatin-based regimens,1,7–10 and generally concluded that the choice between shorter and longer treatment depended on regimen type and patient characteristics. The 3-month capecitabine plus oxaliplatin (XELOX) regimen seemed to be non-inferior to the 6-month regimen, while 3-month FOLFOX failed to show non-inferiority to its 6-month counterpart. Meanwhile, longer duration was more beneficial among high-risk stage 3 patients. However, the proportions of FOLFOX and XELOX regimens were not comparable across the five trials, and neither were the authors’ conclusions.1,7–10 All these results therefore added complexity to regimen selection in the adjuvant colorectal cancer setting.
Currently, there is still a scarcity of comprehensive hierarchical evidence that compares and ranks all possible regimens simultaneously, which could offer more statistically straightforward and accurate outcomes than pairwise comparisons. Network meta-analysis can provide indirect estimates for regimens that lack direct comparisons.11 Hence, in consideration of the rapidly growing number of chemotherapeutic strategies, as well as the methodological limitations of pairwise RCTs and meta-analyses, we decided to perform the first systematic review and network meta-analysis in this field.
Methods
Registration and guidelines
The protocol of our systematic review and network meta-analysis was registered in PROSPERO [CRD42020147304]. The design, conduct and writing of this systematic review and network meta-analysis complied with the requirements of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist for network meta-analysis and the Cochrane Handbook 5.1. Each step was performed by two researchers in our group. Any disagreement was resolved by a third researcher.
Search strategy
Electronic databases including PubMed, Web of Science and the Cochrane Central Register of Controlled Trials were thoroughly searched. We also searched major databases of meeting abstracts, including the American Society of Clinical Oncology (ASCO) and ESMO meeting libraries. The search was conducted from 1 July to 10 November 2019 and covered records published from inception to November 2019. Both the abstract and main text of the retrieved records were checked to ensure accurate selection. The full electronic search strategy is presented in the Supplemental Materials.
Selection criteria
Studies that met all of the following criteria were included (Participants, Intervention, Comparator, Outcome and Study design [PICOS] framework):
(1) Participants: patients should be diagnosed with previously untreated high-risk stage 2 and stage 3 resectable colon cancer without pathological selection. For trials studying targeted therapies, subgroup data of certain pathological or genetic status was permitted; however, overall results of unselected population should also be provided. Upper rectal cancer cases were also allowed since they shared similar biological features and therapeutic options with colon cancer patients. Patients with synchronous malignancies other than colon cancer were not permitted.
(2) Intervention: adjuvant systemic treatments should be given after curative surgeries, including intravenous or oral chemotherapeutic and targeted medications. Since there were several outdated drugs that were used against colon cancer but were no longer utilized currently in the clinical setting (such as mitomycin-C, methotrexate, vincristine, semustine and edrecolomab), we only included chemotherapeutic and targeted drugs that were currently approved and recommended for use against colon cancer by major countries, including capecitabine, oxaliplatin, irinotecan, 5-FU/leucovorin, S-1, UFT, bevacizumab, cetuximab and raltitrexed. Comparisons between different regimens deriving from any of these drugs in the adjuvant setting were deemed eligible. Moreover, comparisons between different durations of treatment by the same chemotherapeutic regimen were also qualified. Therefore, trials containing hyperthermic intraperitoneal chemotherapy, intraarterial chemotherapy, preoperative or postoperative radiotherapy were regarded as ineligible. Also, since adjuvant systemic treatments had been widely accepted as standard of care for high-risk stage 2 and stage 3 colon cancer, trials featuring comparisons between chemotherapy and observation only were also not included.
(3) Comparator: ‘XELOX (6M)’ (6-month capecitabine plus oxaliplatin regimen), ‘FP (6M)’ (6-month fluoropyrimidine plus platinum regimen) were common comparator nodes of network meta-analysis under different scenarios.
(4) Outcome: time-to-event disease-free survival (DFS) data (hazard ratio or Kaplan–Meier curves) were mandatory, while results of overall survival (OS) and adverse events were dispensable.
(5) Study design: phase II and phase III RCTs reported from inception to November 2019 without language limitations.
Studies were excluded for the following reasons:
(1) Besides chemotherapeutic or targeted medications, auxiliary therapeutics were also contained and comparatively studied, including non-steroidal anti-inflammatory drugs (NSAIDs), nutritional supportive methods (vitamins), unspecified herbal medicine (lentinan), general immunomodulators (interferons, polysaccharide K, polyadenylic–polyuridylic acid and Bacillus Calmette–Guerin) or levamisole (eTable 1).
(2) Cross-over design of RCTs.
Risk-of-bias assessment
The quality of each eligible trial was assessed with the Cochrane Risk-of-Bias Tool. The tool consists of seven categories: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting and other sources of bias. According to Cochrane Handbook 5.1, each category is scored as low, unclear or high risk of bias when certain criteria are met. If the majority of items were judged as low risk of bias, the overall methodological design of the network meta-analysis was regarded as low risk of bias, and vice versa. Trials were regarded as low quality if four or more categories were evaluated as high risk of bias.
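The trial-level quality rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example trial are hypothetical and are not taken from the included trials or from the authors' workflow.

```python
from collections import Counter

# The seven domains of the Cochrane Risk-of-Bias Tool named in the text.
DOMAINS = (
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "incomplete outcome data",
    "selective reporting",
    "other sources of bias",
)


def is_low_quality(judgements):
    """Return True if four or more domains are judged 'high' risk of bias."""
    missing = set(DOMAINS) - set(judgements)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    return sum(judgements[d] == "high" for d in DOMAINS) >= 4


def judgement_shares(trials):
    """Share of 'low', 'unclear' and 'high' judgements over all trials and domains."""
    counts = Counter(trial[d] for trial in trials for d in DOMAINS)
    total = sum(counts.values())
    return {level: counts[level] / total for level in ("low", "unclear", "high")}


# Hypothetical open-label trial: only participant blinding is high risk.
example_trial = {d: "low" for d in DOMAINS}
example_trial["blinding of participants and personnel"] = "high"
example_trial["blinding of outcome assessment"] = "unclear"
```

Under this rule, an open-label design alone (one high-risk domain) does not make a trial low quality.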
Data extraction
Pre-designed forms were utilized to collect and organize the original data. Baseline characteristics, efficacy and tolerability data were extracted from main text, tables, survival curves or supplemental material, which had been cross-checked by two different researchers in our group before quantitative synthesis.
Endpoints and nodes
The primary endpoint was DFS, while secondary endpoints included OS and adverse events. The definitions of DFS were mainly consistent across different trials (Supplemental Material). In terms of adverse events, we analyzed 12 common types of treatment-related adverse events including leucopenia, neutropenia, anaemia, thrombocytopenia, nausea/vomiting, anorexia, diarrhoea, fatigue, hand–foot syndrome, peripheral sensory neuropathy, alanine transaminase (ALT)/aspartate transaminase (AST) and creatinine. We only counted grade 3 or higher (National Cancer Institute Common Terminology Criteria for Adverse Events) adverse events due to their clinical significance. Criteria for adverse events judgement were also generally consistent across different trials (Supplemental Material).
The major principle for node classification was to combine homogenic arms together so that sample sizes and advantages of direct randomization could be enlarged. Key indicators to ensure homogeneity were clinical and methodological features, which jointly contributed to statistical homogeneity across the trials. Since we only included RCTs into our pooled analysis, methodological heterogeneity was low among included trials. Therefore, clinical features were critical for maintaining homogeneity inside each node, such as treatment regimens and pathological stages. Moreover, since DFS was the primary endpoint, baseline DFS rate (3 years and 5 years) was crucial for preliminarily judging statistical homogeneity, which also reflected clinical homogeneity across different trials within the same node. Hence, taken together, we classified nodes by different treatment regimens since it was the main focus of our meta-analysis and also acted as the major clinical heterogenic factor inside the network. Nevertheless, if baseline survival rates of different studies inside the same node were still not consistent, this might hint other underlying clinical heterogeneity besides treatment regimens, such as clinical stages and lymphadenectomy statuses, which would be further analyzed via sensitivity and subgroup analysis. On the other hand, in order to form an intact network for statistical calculation and also minimize an unnecessary number of nodes to enhance statistical power, we also integrated some regimens that were slightly different in terms of treatment schedules into one node, as long as their baseline DFS rates were comparable. To be specific, the majority of nodes in our meta-analysis were made according to their original treatment schedules, such as node ‘XELOX (6M)’ corresponding to 6-month capecitabine plus oxaliplatin regimen. Although different studies utilized slightly different regimens of FOLFIRI, there was only one ‘FOLFIRI’ node inside our network. 
Among all eligible studies, regimens of 5-FU plus leucovorin had several types of variations; therefore, node ‘LV5FU2 (6M)’, ‘FU/FA-RP (Roswell Park regimen)’ and ‘FU/FA-MC (Mayo Clinic regimen)’ were created to fit different schedules. For chemotherapeutic drugs plus bevacizumab or cetuximab, although the actual chemotherapeutic regimens were not completely identical across included trials, we still integrated them into two nodes ‘F/bevacizumab’ and ‘F/Cetuximab’ (F here represented fluoropyrimidines) to facilitate network calculation, since their baseline survival rates were quite comparable. In addition, there were two types of node classification systems within our network meta-analysis, namely Node-1 and Node-2. The only difference between these two types was that Node-1 separated all fluoropyrimidine-plus-platinum regimens into specific regimens, such as mFOLFOX6, FOLFOX4, SOX and XELOX, while Node-2 combined them together so that comparisons between the 3-month schedule versus the 6-month schedule were much easier. As abovementioned, although we tried hard to restrict heterogeneity inside each node, there might still be certain degrees of heterogeneity that warranted further sensitivity or subgroup analysis. Treatment schedules of all included trials were listed in the Supplemental Material.
Statistical analysis
The hazard ratio (HR) and its 95% confidence interval (95% CI) were used as the effect size for DFS and OS. The risk ratio (RR) and its 95% CI were used as the effect size for adverse events. If survival data were not directly provided, we estimated the values from Kaplan–Meier curves by methods described elsewhere.12,13
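Pooling HRs is usually done on the log scale, which requires a standard error for each trial. The following is a minimal sketch, assuming the reported 95% CI is symmetric on the log scale (a common convention, not something stated in the text); the function names are illustrative.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def log_hr_and_se(hr, ci_low, ci_high, z=Z_95):
    """Recover (log HR, standard error) from a reported HR and its 95% CI."""
    if min(hr, ci_low, ci_high) <= 0 or not (ci_low <= hr <= ci_high):
        raise ValueError("expected 0 < ci_low <= hr <= ci_high")
    # The CI spans 2 * z standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return math.log(hr), se


def hr_with_ci(log_hr, se, z=Z_95):
    """Back-transform a log HR and its standard error to (HR, lower, upper)."""
    return (
        math.exp(log_hr),
        math.exp(log_hr - z * se),
        math.exp(log_hr + z * se),
    )


# An HR reported as 0.95 (0.86-1.04), used here only as an input example.
log_hr, se = log_hr_and_se(0.95, 0.86, 1.04)
hr, lower, upper = hr_with_ci(log_hr, se)
```

The round trip reproduces the reported bounds only approximately, because published intervals are rounded to two decimals.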
It was well known that network meta-analysis could offer a hierarchical ranking among multiple arms despite lacking direct comparisons.14,15 This vital advantage was based on two key assumptions of network meta-analysis that were known as transitivity and consistency, respectively.15,16
When pairwise comparisons of A versus C and B versus C are separately available, transitivity allows a valid statistical comparison between A and B. However, it requires comparable baseline characteristics as a prerequisite for minimizing selection bias and thereby justifying connections between indirectly compared arms.17 Because all eligible studies were randomized trials without significant heterogeneity in methodological design, clinical features were crucial in determining baseline heterogeneity and network transitivity. We carefully compared key clinical features among different arms inside each node and then removed those with significant heterogeneity in sensitivity analysis. Besides possible clinical and methodological disparities, we also evaluated statistical heterogeneity within our network calculation. I² was used as the main indicator of statistical heterogeneity, with values of <25%, 25–50% and >50% suggesting low, moderate and high heterogeneity, respectively. The Q statistic of heterogeneity was also used to assess statistical heterogeneity.
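The heterogeneity thresholds above can be illustrated with the standard pairwise definitions of Cochran's Q and I². This is a simplified sketch: a network model uses a generalized Q over the whole network, and the effect sizes below are hypothetical.

```python
def heterogeneity(effects, ses):
    """Cochran's Q, its degrees of freedom, and I-squared (in percent).

    effects: per-trial log effect sizes; ses: their standard errors.
    """
    if len(effects) != len(ses) or len(effects) < 2:
        raise ValueError("need at least two trials with matching inputs")
    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


def classify_i2(i2):
    """Apply the thresholds from the text: <25% low, 25-50% moderate, >50% high."""
    if i2 < 25.0:
        return "low"
    if i2 <= 50.0:
        return "moderate"
    return "high"


# Hypothetical log HRs from two trials that disagree strongly.
q, df, i2 = heterogeneity([0.0, 0.5], [0.1, 0.1])
label = classify_i2(i2)
```

I² is truncated at zero, so a Q below its degrees of freedom is reported as 0% rather than as a negative value.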
Consistency, the other main assumption of network meta-analysis, refers to statistical agreement between direct and indirect estimates of the same comparison. Significant differences between direct and indirect results suggest inconsistency across the network, as well as a violation of transitivity. We therefore used several approaches to evaluate network consistency, including the loop-specific method and the Q statistic. The loop-specific method analyzes the difference between direct and indirect results within closed loops. The inconsistency factor (IF) is the quantitative indicator of the loop-specific method and suggests inconsistency when its 95% CI excludes zero.15 Furthermore, the Q statistic of inconsistency was another indicator used to estimate consistency across the network. Both consistency and homogeneity are fundamental requirements for producing reliable results by network meta-analysis. When inconsistency or significant heterogeneity was detected, data from the most inconsistent or heterogeneous comparisons were removed to examine whether the results remained stable.
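A loop-specific check can be sketched for a single closed loop A-B-C. The sketch assumes the indirect estimate is formed by subtracting two independent log-scale contrasts against a common comparator, and it keeps the sign of the difference so that "CI excludes zero" can be tested directly; some software reports the absolute value instead. All numbers are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def indirect_estimate(d_ac, se_ac, d_bc, se_bc):
    """Indirect A-versus-B log effect through common comparator C."""
    return d_ac - d_bc, math.sqrt(se_ac ** 2 + se_bc ** 2)


def inconsistency_factor(direct, se_direct, indirect, se_indirect, z=Z_95):
    """Return (IF, lower, upper, inconsistent) for one closed loop."""
    diff = direct - indirect
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    lower, upper = diff - z * se, diff + z * se
    # Inconsistency is suggested when the 95% CI excludes zero.
    return diff, lower, upper, not (lower <= 0.0 <= upper)


# Hypothetical log HRs: A vs C, B vs C, and a direct A vs B comparison.
ind, se_ind = indirect_estimate(0.30, 0.04, 0.18, 0.03)
if_value, if_low, if_high, flagged = inconsistency_factor(0.10, 0.05, ind, se_ind)
```

In this example the direct and indirect estimates differ by far less than their combined uncertainty, so the loop is not flagged.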
A network plot and a funnel plot were used to demonstrate network structure and detect publication bias, respectively. The more symmetrical the funnel plot, the less publication bias the pooled results were likely to have. We performed random-effects network calculation based on a frequentist model, with either HR or RR as the effect size. Based on the non-inferiority margin used in previous literature on a similar topic,18 we set the non-inferiority margin at 1.12 for the HR and 1.25 for the RR in our network calculation. In addition, we used the p score to rank all regimens based on their network estimates. The closer the p score was to 1, the better the regimen. However, if a regimen ranked in top place but crossed the non-inferiority margin, it could not be fully recommended. The final conclusion was made by considering both the network ranking and the non-inferiority of each regimen. Sensitivity analysis was performed to test the stability of pooled outcomes by deleting studies with significant clinical heterogeneity. Subgroup analysis by pathological stage was also conducted to validate potential sources of heterogeneity and to provide more clinically meaningful evidence. Network meta-analysis was conducted in R 3.4.3, assisted by STATA 14.0 for graphical functions.
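The ranking step can be illustrated with a small sketch of how a p score may be computed in a frequentist network. It assumes the commonly used definition: for each regimen, average over all competitors the one-sided probability, under a normal approximation, that the regimen is better. It also assumes that a lower log HR is better. The regimen names, contrasts and standard errors are hypothetical, and this is not the authors' R code.

```python
from statistics import NormalDist


def p_scores(names, diff, se):
    """P score per treatment.

    diff[i][j]: network log HR of treatment i versus j (negative favours i).
    se[i][j]:   standard error of that contrast.
    """
    phi = NormalDist().cdf
    n = len(names)
    scores = {}
    for i in range(n):
        # Probability that treatment i is better than each competitor j.
        probs = [phi(-diff[i][j] / se[i][j]) for j in range(n) if j != i]
        scores[names[i]] = sum(probs) / len(probs)
    return scores


# Hypothetical three-node network with antisymmetric contrasts.
names = ["A", "B", "C"]
diff = [
    [0.0, -0.2, -0.1],
    [0.2, 0.0, 0.1],
    [0.1, -0.1, 0.0],
]
se = [[0.1] * 3 for _ in range(3)]
scores = p_scores(names, diff, se)
```

A high p score says only that a regimen tends to rank well; it carries no information about whether the CI stays inside the non-inferiority margin, which is why the text combines both criteria.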
Role of the funding source
The sponsors had no role in study design, data collection, data analysis, data interpretation or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Results
Baseline features
After screening 7053 preliminary records, 51 records were included into our systematic review and network meta-analysis, corresponding to 30 RCTs. Selection flowchart and reasons of ineligibility by full-text assessment are described in Figure 1 and eTable 1, respectively. Due to limitation on number of references, we included citations of all eligible trials in the Supplemental Material.

Figure 1. Selection flowchart.
All 30 trials were phase III RCTs, and the majority had formal registration identifiers. Of these, 22 trials were conducted in Western populations, while only 8 were conducted in Asia, all by Japanese institutions. The eligible trials enrolled 54,109 patients in total, with individual sample sizes ranging from 169 to 6088. All included trials were relatively comparable in terms of median age (around 60 years) and sex ratio of enrolled patients (male dominant). Fifteen trials recruited only stage 3 colon cancer patients, while the other 15 studied both high-risk stage 2 and stage 3 colon cancer cases. The distribution of tumour location and performance status of included patients was also consistent across trials. Overall, therefore, the baseline clinical features of all eligible trials were comparable, and the impact of different stages was further examined by subgroup analysis (Tables 1 and 2).
Table 1. Baseline features of all eligible trials (part 1).
The numbers indicate the combined numbers of ECOG-0 and ECOG-1 patients.
For trials studying targeted drugs such as cetuximab and bevacizumab, results of efficacy were based on the unselected population to maintain histopathological homogeneity across all eligible trials. For ‘region’, some trials were completed by several Western countries instead of a sole nation, therefore ‘Western’ was used under this circumstance. Details of the rationale to name and constitute each node within our meta-analysis were depicted in the Methods section. The HR was the result of the upper arm versus lower arm in each trial, except for those that were specifically labelled, such as ‘m3 versus m6’.
6M, 6-month regimen; 12M, 12-month regimen; DFS, disease-free survival; ECOG, Eastern Cooperative Oncology Group; F3/F6, FOLFOX4 (3M/6M); FA-MC, Mayo Clinic regimen; FA-RP, Roswell Park regimen; FB, FOLFOX4/bevacizumab; HR, hazard ratio; m3/m6, mFOLFOX6 (3M/6M); M/F, male/female; NA, not available; OS, overall survival; R/L, right/left; X3/X6, XELOX (3M/6M); XB, XELOX/bevacizumab; XELOX, capecitabine plus oxaliplatin.
Table 2. Baseline features of all eligible trials (part 2).
The numbers indicate the combined numbers of ECOG-0 and ECOG-1 patients.
For trials studying targeted drugs such as cetuximab and bevacizumab, results of efficacy were based on the unselected population to maintain histopathological homogeneity across all eligible trials. For ‘region’, some trials were completed by several Western countries instead of a sole nation, therefore ‘Western’ was used under this circumstance. Details of the rationale to name and constitute each node within our meta-analysis were depicted in the Methods section. The HR was the result of the upper arm versus lower arm in each trial, except for those that were specifically labelled, such as ‘m3 versus m6’.
6M, 6-month; 12M, 12-month; CI, confidence interval; DFS, disease-free survival; ECOG, Eastern Cooperative Oncology Group; F3/F6, FOLFOX4 (3M/6M); FA, folinic acid; FA-MC, Mayo Clinic regimen; FA-RP, Roswell Park regimen; FLOX, bolus fluorouracil/leucovorin plus oxaliplatin; FB, FOLFOX4/bevacizumab; FOLFIRI, fluorouracil, leucovorin and irinotecan; FOLFOX, fluorouracil, leucovorin and oxaliplatin; FP, fluoropyrimidine plus platinum; FU, fluorouracil; HR, hazard ratio; m3/m6, mFOLFOX6 (3M/6M); LV, leucovorin; M/F, male/female; NA, not available; OS, overall survival; R/L, right/left; SOX, S-1 plus oxaliplatin; UFT, tegafur–uracil; X3/X6, XELOX (3M/6M); XB, XELOX/bevacizumab; XELOX, capecitabine plus oxaliplatin.
Risk of bias
Generally, the entire systematic review had a low risk of bias, since more than half of the indicators were scored as low risk of bias (56%), while unclear risk (24%) and high risk of bias (20%) accounted for smaller proportions (Figure 2). Individually, none of the included trials was at high risk of bias in methodological design (eTable 2).

Figure 2. Risk-of-bias assessment of eligible trials.
Of note, since the majority of trials were rigorously randomized and centrally allocated, 70% and 87% of included trials were scored as low risk of bias in terms of random sequence generation and allocation concealment, respectively, and no high risk of bias was reported in these two key domains. Owing to open-label design and the impossibility of masking arms with markedly different administration, all included trials (100%) were scored as high risk of bias in terms of blinding of participants and personnel. The majority of trials did not report relevant information on blinding of outcome assessment, especially whether independent reviewers were involved in the evaluation of DFS; therefore, most of them were scored as unclear risk of bias (87%). Since efficacy and tolerability in the majority of trials were based on the intention-to-treat and safety analysis sets, respectively, and most trials reported sufficient endpoints, 93% and 67% of the eligible trials had low risk of bias regarding incomplete outcome data and selective reporting, respectively. Additionally, since many eligible trials featured balanced clinical characteristics, 70% of all qualified trials were scored as low risk of bias with respect to other sources of bias (Figure 2).
Primary endpoint: disease-free survival
Network geometry. Thirty RCTs were included in the quantitative analysis, corresponding to 19 network nodes by Node-1 classification (Figure 3 and Tables 1 and 2).

Figure 3. Network structure plot of disease-free survival.
Transitivity. As mentioned in the Methods section, we rearranged all included arms by node to evaluate the homogeneity inside each node (eTable 3), especially the baseline DFS rate. For nodes ‘FU/FA-MC’, ‘FOLFOX4 (3M)’, ‘FOLFOX4 (6M)’, ‘SOX (6M)’, ‘capecitabine (12M)’, ‘mFOLFOX6 (3M)’, ‘mFOLFOX6 (6M)’, ‘S-1 (6M)’, ‘UFT/LV (18M)’, ‘F/bevacizumab (12M)’ and ‘Raltitrexed (6M)’, baseline survival rates were relatively comparable within each node, thus supporting transitivity across the network. For nodes ‘FOLFIRI’, ‘FU/FA-RP’, ‘XELOX (3M)’, ‘XELOX (6M)’, ‘UFT/LV (6M)’, ‘capecitabine (6M)’, ‘LV5FU2 (6M)’ and ‘F/cetuximab (6M)’, each node had one or two trials with baseline survival rates slightly different from those of other trials in the same node, and those trials were subsequently removed in the sensitivity analysis (eTable 3). Homogeneity inside each node of our network meta-analysis was therefore considered adequate, supporting the assumption of transitivity.
Consistency and heterogeneity. Five closed loops were found inside our network meta-analysis. The 95% CI of the IF of all closed loops contained zero, suggesting no inconsistency between direct and indirect results (eTable 4). The Q statistic for inconsistency (Q inconsistency) also implied no inconsistency within the network (Q inconsistency: p = 0.262). In terms of statistical heterogeneity, both the I² statistic (I² = 0%) and the Q statistic (Q heterogeneity: p = 0.969) indicated no significant heterogeneity across eligible trials.
Publication bias. The funnel plot showed a symmetrical distribution of effect sizes, suggesting no evident publication bias among the included trials (eFigure 1).
Network calculation. By Node-1 classification, ‘capecitabine (12M)’ [network HR 0.81 (0.60–1.10), p score = 0.967] was the highest-ranking regimen that displayed non-inferiority against common comparator ‘XELOX (6M)’ together with ‘XELOX (3M)’ [network HR 0.95 (0.86–1.04), p score = 0.834; Figures 4 and 5].

Figure 4. Network forest plot of disease-free survival.

Figure 5. Network league table of disease-free survival.
Sensitivity analysis. First, by Node-2 classification, which integrated all fluoropyrimidine-plus-platinum regimens, ‘capecitabine (12M)’ topped the entire ranking and was non-inferior to common comparator ‘FP (6M)’ [network HR 0.80 (0.61–1.05), p score = 0.981; eFigure 2]. The network remained in low heterogeneity and high consistency inside despite changing node classifications (data not shown). Besides, ‘FP (6M)’ demonstrated borderline superiority against ‘FP (3M)’ [network HR 1.08 (1.00–1.16)]. Second, by deleting trials that displayed incomparable baseline survival rates with other counterparts in the same node, as well as trials that contained smaller sample sizes (less than 200; eTable 3), ‘XELOX (3M)’ was the only node displaying non-inferiority against ‘XELOX (6M)’ (data not shown). Third, since the definitions of DFS were not always consistent among all eligible studies, we additionally deleted trials defining DFS similar to recurrence-free survival, which only counted tumour recurrences but not secondary malignancies as events (Supplemental Material). As a result, ‘XELOX (3M)’ was still the only non-inferior regimen compared with ‘XELOX (6M)’ in the hierarchy (data not shown). All these suggested that network outcomes for DFS were stable and solid.
Network subgroup analysis. Via Node-1 classification, we only calculated subgroup data for stage 3 due to insufficient data of high-risk stage 2 cases (eTables 5 and 6). Here, ‘capecitabine (12M)’ [network HR 0.79 (0.57–1.10), p score = 0.948] still ranked as the top node and demonstrated non-inferiority to ‘XELOX (6M)’, together with ‘XELOX (3M)’ [network HR 0.93 (0.83–1.05), p score = 0.775; Figure 6]. By Node-2 classification, ‘capecitabine (12M)’ topped the entire hierarchy among stage 3 patients and displayed non-inferiority against ‘FP (6M)’ [network HR 0.80 (0.61–1.06), p score = 0.965; eFigure 3]. On the other hand, ‘FP (3M)’ failed to show non-inferiority against ‘FP (6M)’ among stage 3 and high-risk stage 2 patients [stage 3: network HR 1.07 (0.99–1.16), p score = 0.551; high-risk stage 2: network HR 1.14 (0.95–1.38), p score = 0.207; eFigure 4].

Figure 6. Network forest plot of disease-free survival among stage 3 patients.
Pairwise subgroup analysis. Based on the abovementioned network calculation results, we did more specific pairwise meta-analyses to eliminate certain heterogenic factors that might bias comparisons between 3-month and 6-month regimens. Here, only the five major largescale RCTs, including ACHIEVE, HORG-IDEA, IDEA, SCOT and TOSCA were included (results of CALGB/SWOG 80702 had not been formally published in a journal or meeting abstract). Among low-risk stage 3 patients, the ‘XELOX (3M)’ regimen was significantly better than the ‘XELOX (6M)’ regimen [pairwise HR 0.78 (0.63–0.97), p = 0.02], while both ‘mFOLFOX6 (3M)’ [pairwise HR 1.16 (0.95–1.42), p = 0.15] and ‘FP (3M)’ [pairwise HR 1.03 (0.92–1.16), p = 0.60] could not demonstrate non-inferiority against their 6-month counterparts. Within high-risk stage 3 patients, ‘XELOX (3M)’ [pairwise HR 1.05 (0.90–1.23), p = 0.50] failed to show non-inferiority against its longer-duration counterpart, while ‘mFOLFOX6 (3M)’ [pairwise HR 1.31 (1.11–1.55), p = 0.002] and ‘FP (3M)’ [pairwise HR 1.14 (1.01–1.29), p = 0.03] were significantly worse than ‘mFOLFOX6 (6M)’ and ‘FP (6M)’, respectively.
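The way these pairwise results are read against the non-inferiority margin set in the Methods (1.12 for HR, 1.25 for RR) can be sketched as a simple decision rule. The rule below is an assumption inferred from how the results are described, not the authors' stated algorithm: a CI entirely below 1 is read as superior, entirely above 1 as inferior, and otherwise non-inferiority requires the upper bound to stay below the margin.

```python
HR_MARGIN = 1.12  # non-inferiority margin for hazard ratios (from the Methods)
RR_MARGIN = 1.25  # non-inferiority margin for risk ratios (from the Methods)


def classify(ci_low, ci_high, margin):
    """Classify a short-versus-long comparison from its 95% CI."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if ci_high < 1.0:
        return "superior"
    if ci_low > 1.0:
        # Checked before the margin, so a significantly worse result
        # is never labelled non-inferior.
        return "inferior"
    if ci_high < margin:
        return "non-inferior"
    return "non-inferiority not shown"


# 95% CIs of the pairwise HRs reported above (3-month versus 6-month).
pairwise = {
    "XELOX, low-risk stage 3": (0.63, 0.97),
    "mFOLFOX6, low-risk stage 3": (0.95, 1.42),
    "FP, low-risk stage 3": (0.92, 1.16),
    "XELOX, high-risk stage 3": (0.90, 1.23),
    "mFOLFOX6, high-risk stage 3": (1.11, 1.55),
    "FP, high-risk stage 3": (1.01, 1.29),
}
labels = {name: classify(lo, hi, HR_MARGIN) for name, (lo, hi) in pairwise.items()}
```

Applied to the reported intervals, this rule reproduces the wording used in the text for each comparison.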
Secondary endpoint: overall survival
Twenty-five trials were included in the OS calculation (Tables 1 and 2). Regardless of Node-1 or Node-2 classification, ‘capecitabine (12M)’ was the best node among all analyzed counterparts, displaying non-inferiority against the common comparator ‘XELOX (6M)’ [Node-1: network HR 0.71 (0.48–1.06), p score = 0.976; Node-2: network HR 0.72 (0.51–1.02), p score = 0.990]. Moreover, ‘FP (3M)’ also demonstrated non-inferiority against ‘FP (6M)’ [network HR 1.01 (0.93–1.08), p score = 0.771; eFigures 5 and 6]. Overall, inconsistency and heterogeneity remained at a very low level (data not shown).
Secondary endpoint: adverse events
Details of the safety profile are displayed in eTable 7. Node-2 classification was used here to present network results, since not all included types of adverse events provided separate data for Node-1 classification. ‘FP (3M)’ was significantly better than its 6-month counterpart with respect to peripheral sensory neuropathy [network RR 0.31 (0.23–0.42)], thrombocytopenia [network RR 0.68 (0.47–0.98)] and fatigue [network RR 0.56 (0.32–0.95)], while being non-inferior to the 6-month regimen regarding neutropenia [network RR 0.74 (0.45–1.21)], leucopenia [network RR 0.81 (0.57–1.13)] and diarrhoea [network RR 0.79 (0.61–1.02)]. For anaemia [network RR 1.31 (0.37–4.62)], anorexia [network RR 1.14 (0.70–1.85)] and nausea/vomiting [network RR 1.09 (0.81–1.47)], ‘FP (3M)’ did not display non-inferiority against ‘FP (6M)’. The 12-month capecitabine monotherapy exhibited superiority against 6-month oxaliplatin-based chemotherapy only in terms of leucopenia [network RR 0.02 (0.00–0.94)] and thrombocytopenia [network RR 0.01 (0.00–0.66)], and failed to display non-inferiority for the other major adverse events.
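For readers less familiar with the effect size used for adverse events, a trial-level RR and its 95% CI can be computed from event counts as sketched below. This uses the standard large-sample standard error on the log scale; the counts are hypothetical and are not taken from eTable 7, and the network RRs above additionally pool such estimates across trials.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def risk_ratio(events_a, n_a, events_b, n_b, z=Z_95):
    """Risk ratio of arm A versus arm B with a 95% CI (log-scale normal approx.)."""
    if min(events_a, events_b) <= 0:
        # Zero-event arms need a continuity correction, omitted in this sketch.
        raise ValueError("both arms need at least one event")
    if events_a > n_a or events_b > n_b:
        raise ValueError("events cannot exceed arm size")
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical grade 3 or higher neuropathy counts: 30/1000 versus 100/1000.
rr, lower, upper = risk_ratio(30, 1000, 100, 1000)
```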
Discussion
Adjuvant systemic treatments for resectable colon cancer have drawn a lot of academic attention during the past decade. Currently, XELOX and FOLFOX regimens have been widely accepted as the standard options, especially for high-risk stage 2 and stage 3 patients.3,4,6 However, the evidence is mainly based on pairwise RCTs, and sometimes it is difficult to make accurate comparisons among so many regimens, especially since novel medications are constantly introduced to the market. Therefore, network meta-analysis is a necessity in this situation.
In terms of DFS, 12-month capecitabine monotherapy topped the hierarchy and showed non-inferiority against the 6-month XELOX regimen, as did the 3-month XELOX regimen, which ranked third but had the narrowest confidence interval. Similar results were obtained in the subgroup analysis among stage 3 patients. The more specific pairwise subgroup analysis suggested that 3-month XELOX was better than its 6-month counterpart among low-risk stage 3 patients, while none of the 3-month regimens displayed non-inferiority against 6-month treatments among high-risk stage 3 patients, and the 6-month mFOLFOX6 regimen was even significantly better than its 3-month counterpart. However, when we applied Node-2 classification by pooling all fluoropyrimidine-plus-platinum regimens, 12-month capecitabine monotherapy, rather than 3-month oxaliplatin-based chemotherapy, became non-inferior to 6-month oxaliplatin-based chemotherapy among stage 3 patients, and 3-month oxaliplatin-based chemotherapy failed to reach non-inferiority against its 6-month counterpart among both low-risk and high-risk stage 3 patients. This implies that the type of fluoropyrimidine and the schedule might affect the survival benefit of 3-month treatment. For high-risk stage 2 patients, 6-month oxaliplatin-based chemotherapy was still the optimal option, since none of the included regimens appeared to be at least non-inferior to it. Nevertheless, more original trials are warranted, because the network calculation for this subgroup was based only on Node-2 classification owing to inadequate data for individual arms, and could therefore be biased by differences in fluoropyrimidine type and schedule.
Regarding OS, 12-month capecitabine monotherapy topped the hierarchy, exhibiting non-inferiority against 6-month XELOX, while 3-month XELOX failed to do so. However, under Node-2 classification, both 12-month capecitabine monotherapy and 3-month oxaliplatin-based chemotherapy displayed non-inferiority against 6-month oxaliplatin-based chemotherapy. The main reason is probably that most trials investigating 3-month versus 6-month regimens took DFS as the primary endpoint and did not report OS data, which left the network estimate for the 3-month XELOX regimen with a wide confidence interval that crossed the non-inferiority margin. Therefore, for 12-month capecitabine monotherapy and 3-month fluoropyrimidine-plus-platinum regimens, more studies are needed to investigate their OS benefits before reliable conclusions can be drawn.
Regarding adverse events, although network analysis was possible only under Node-2 classification, 3-month oxaliplatin-based chemotherapy was at least non-inferior to its 6-month counterpart for most common adverse events, and was significantly better for peripheral sensory neuropathy, thrombocytopenia and fatigue. This result is anticipated and easily understood, since a shorter period of chemotherapy causes fewer detrimental effects on recipients. By contrast, 12-month capecitabine monotherapy exhibited superiority against 6-month oxaliplatin-based chemotherapy only in terms of leucopenia and thrombocytopenia, and failed to display non-inferiority for the other major adverse events. This may hint that prolonged chemotherapy, even with capecitabine monotherapy, still worsens tolerability among treatment recipients. However, since only one trial has reported 12-month capecitabine monotherapy so far, the possibility of insufficient statistical power should also be taken into account when judging its true safety profile.
The current NCCN guideline on colon cancer suggests that the 3-month XELOX regimen could be used among low-risk stage 3 patients owing to its non-inferiority against its 6-month counterpart, while 6-month oxaliplatin-based chemotherapy remains a more reliable choice for high-risk stage 3 patients. It also supports the application of 6-month oxaliplatin-based chemotherapy among high-risk stage 2 patients based on current evidence.3 Meanwhile, the latest pooled analysis of six IDEA trials, reported by Sobrero and colleagues at the 2020 ASCO annual meeting, also suggests that 3-month oxaliplatin-based regimens are non-inferior to 6-month regimens, especially among low-risk stage 3 patients. Although our network meta-analysis did not yield findings beyond those reflected in current guidelines, it is the first systematic review and network meta-analysis in this field, and might provide useful hints for the design of future large-scale RCTs. The confirmation provided by our meta-analysis might further support the use of the corresponding regimens, which should be recognized as the major significance and novelty of our work. Somewhat surprisingly, 12-month capecitabine monotherapy also displayed non-inferiority against current standard treatments, despite the lack of Western data on its suitability and its possibly higher toxicity and worse compliance. Because the ranking of 12-month capecitabine monotherapy is the product of indirect network calculation, it could be regarded as a possible topic for future randomized trials.
Although our systematic review and network meta-analysis were rigorously designed and conducted, there were still some limitations. First, although all eligible trials were shown to be clinically comparable without significant heterogeneity, and sensitivity analysis was conducted to ensure the homogeneity of baseline survival rates within the same node, the impact of underlying heterogeneity, such as differences in region, race and extent of lymphadenectomy, could not be fully eliminated. Therefore, future updates, especially individual patient data network meta-analyses, are welcomed. Second, more trials (including the CALGB/SWOG 80702 trial) are still needed to enhance statistical power and to provide more subgroup analyses for better clinical interpretation, such as separate subgroup data for low-risk and high-risk stage 3 patients.
Taken together, with its at least non-inferior survival benefit and better safety profile, 3-month XELOX treatment could be an alternative to the traditional 6-month regimen among low-risk stage 3 patients. Among high-risk stage 3 patients, 6-month oxaliplatin-based regimens still seem more competitive. For high-risk stage 2 cases, we still recommend 6-month oxaliplatin-based regimens until more compelling evidence emerges. In addition, owing to inadequate statistical power and possibly higher toxicity, clinical application of 12-month capecitabine monotherapy should still be undertaken with caution, despite its top ranking, especially in non-Asian countries.
Supplemental Material
Supplemental material, sj-docx-1-tam-10.1177_1758835920974195 for Comparative efficacy and tolerability of adjuvant systemic treatments against resectable colon cancer: a network meta-analysis by Ji Cheng, Xiaoming Shuai, Jinbo Gao, Guobin Wang, Kaixiong Tao and Kailin Cai in Therapeutic Advances in Medical Oncology
Footnotes
Acknowledgements
We thank all members of our department for offering clinical and methodological suggestions throughout the conduct of our meta-analysis.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: this meta-analysis was funded by National Natural Science Foundation of China (81902487) to Ji Cheng and National Natural Science Foundation of China (81874184) to Kaixiong Tao.
Conflict of interest statement
The authors declare that there is no conflict of interest.
Author contributions
Study design: Ji Cheng, Guobin Wang and Kaixiong Tao. Manuscript writing and revision: Ji Cheng, Kaixiong Tao and Kailin Cai. Literature retrieval: Ji Cheng and Jinbo Gao. Eligibility assessment: Ji Cheng and Xiaoming Shuai. Quality assessment: Ji Cheng and Jinbo Gao. Data extraction: Ji Cheng and Xiaoming Shuai. Statistical analysis: Ji Cheng and Kaixiong Tao.
Supplemental material
Supplemental material for this article is available online.
References
