Abstract
Background:
While immune-chemotherapy represents the first-line standard for advanced esophageal squamous cell carcinoma (ESCC), fluorouracil-platinum (FP) and taxane-platinum (TP) regimens remain the fundamental cytotoxic backbones. Despite their widespread clinical application, robust comparative data on their relative efficacy and safety in real-world populations are lacking, creating an evidence gap that hampers optimal backbone selection.
Objectives:
This study aimed to compare the effectiveness and safety of first-line FP versus TP chemotherapy in patients with advanced ESCC using large-scale, multicenter real-world data.
Design:
This was a large-scale, multicenter, retrospective cohort study employing a propensity score-matched design to compare two treatment cohorts: patients receiving FP chemotherapy versus those receiving TP chemotherapy for unresectable advanced ESCC. Data spanned from January 2013 to December 2023, ensuring a contemporary and representative treatment population.
Methods:
Data were extracted from a national cancer database. Patients with unresectable ESCC receiving first-line FP or TP chemotherapy were included. Propensity score matching (PSM) in a 1:1 ratio was used to balance the baseline characteristics. The primary endpoint was progression-free survival (PFS). Secondary endpoints included objective response rate (ORR), duration of response (DOR), overall survival (OS), and safety.
Results:
After PSM, 3440 matched pairs were generated (N = 6880 total). The median follow-up was 6.3 years. The TP regimen demonstrated superior short-term efficacy compared to FP: median PFS was 5.2 versus 4.4 months (hazard ratio (HR), 0.91; 95% confidence interval (CI), 0.86–0.96; p = 0.0004); ORR was 22.2% versus 19.5% (p = 0.026); and among responders, median DOR was 10.8 versus 4.4 months (HR, 0.77; 95% CI, 0.64–0.94; p = 0.009). However, no significant difference was observed in OS (median OS: 15.9 vs 16.1 months; HR, 1.00; 95% CI, 0.94–1.06; p = 0.901). Toxicity profiles were distinctly regimen-specific, with FP associated with higher rates of hand-foot syndrome and gastrointestinal events, whereas TP was linked to higher incidences of neuropathy and alopecia.
Conclusion:
In this large-scale real-world analysis, first-line TP chemotherapy provided significantly better disease control and response durability compared to FP in advanced ESCC, albeit without an OS benefit. The distinct toxicity profiles of each regimen offer additional considerations for individualized treatment selection. These findings offer critical evidence to guide backbone selection in contemporary treatment algorithms.
Introduction
Esophageal cancer ranks as the sixth leading cause of cancer-related deaths globally and is histopathologically divided into two principal types: esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC). These two entities exhibit distinct epidemiological patterns, genetic profiles, and clinical outcomes. 1 In contrast to Western populations, where EAC predominates, ESCC constitutes approximately 90% of esophageal cancer cases in Asian countries, representing a major public health burden in this region. 2
The integration of immunotherapy with chemotherapy has fundamentally redefined the first-line treatment paradigm for advanced ESCC, as evidenced by landmark phase III trials such as KEYNOTE-590 and CheckMate-648, in which these combinations achieve objective response rates (ORRs) as high as 65%
Beyond historical practice patterns and institutional preferences, this geographical variation may also be underpinned by a putative biological rationale. Mechanistically, platinum agents exert their cytotoxic effects primarily through the formation of DNA adducts, leading to DNA damage and apoptosis. 7 Fluorouracil inhibits thymidylate synthase, and its metabolites are misincorporated into DNA and RNA, disrupting nucleic acid synthesis and function in tumor cells. 8 In contrast, taxanes act as antimitotic agents that stabilize microtubules, disrupt mitotic spindle dynamics, and induce cell cycle arrest and apoptosis. Importantly, emerging evidence suggests that taxanes possess immunomodulatory properties
Nevertheless, no prospective randomized trials have directly compared the efficacy of TP versus FP chemotherapy in advanced ESCC, despite both being widely used. Current evidence relies on small, single-arm studies or indirect comparisons, resulting in inconsistent conclusions and insufficient guidance for clinical decision-making. This evidence gap often leads to backbone selection based on regional preference and toxicity management rather than robust comparative data.
Therefore, this study aimed to leverage large-scale, multicenter real-world data. Utilizing rigorous propensity score matching (PSM), we conducted a head-to-head comparison of TP versus FP regimens as first-line therapy for advanced ESCC. This investigation seeks to generate high-quality, real-world evidence to inform and optimize the selection of the chemotherapeutic backbone in the contemporary era of chemoimmunotherapy.
Study design and methods
Data sources
Data for this retrospective analysis were derived from the National Cancer Information Database (NCID), a longitudinal, electronic medical record-based repository that aggregates comprehensive diagnostic and treatment information from tumor patients across 1400 hospitals in China. This study utilized records from 53 participating institutions, comprising 16 municipal-level and 37 provincial-level Grade A, Class 3 hospitals from 21 provinces and municipalities across China. This broad geographic representation enhances the generalizability of our findings to the broader Chinese ESCC population. All personally identifiable information was anonymized to ensure patient confidentiality.
Study subjects
We retrieved all patients from the NCID database with an initial diagnosis of pathologically confirmed ESCC between January 1, 2013, and December 31, 2023. Follow-up data were censored as of October 31, 2025. The median follow-up for the entire matched cohort was 6.3 years (range, 2.1
Eligible patients were categorized into two cohorts according to the first-line treatment regimen received. The TP group comprised patients administered paclitaxel, docetaxel, or nab-paclitaxel in combination with cisplatin, carboplatin, nedaplatin, or lobaplatin. The FP group included those receiving 5-fluorouracil, tegafur, capecitabine, or S-1 in combination with cisplatin, carboplatin, nedaplatin, or lobaplatin.
The inclusion criteria were as follows: (1) Age ⩾18 years; (2) Histologically or cytologically confirmed ESCC via gastroscopic biopsy; (3) Radiologically confirmed unresectable stage III or IV disease according to the AJCC 8th edition TNM staging system; (4) No prior systemic therapy for advanced disease; (5) Treatment with chemotherapy only (specifically the two-drug combinations stated above) from initiation until disease progression (DP), with no concomitant radiotherapy, surgery, local therapy, targeted therapy, or immune checkpoint inhibitors; (6) Stage III patients not candidates for radical radiotherapy or concurrent chemoradiotherapy due to a history of radical radiotherapy or chemoradiotherapy for the primary tumor, the presence of an esophageal-tracheal or esophageal mediastinal fistula, or any contraindication to radiotherapy; (7) Prior neoadjuvant or adjuvant therapy was considered as first-line treatment if DP occurred within 6 months of its completion; and (8) Completion of at least two cycles of first-line chemotherapy.
The exclusion criteria included: (1) History of another primary malignant tumor; (2) Presence of severe acute or chronic comorbidities that could influence treatment efficacy or survival evaluation; (3) Absence of essential clinical data, including baseline radiological or pathological reports; or (4) Presence of any additional comorbid condition that could confound outcomes.
Endpoints
Primary endpoint
Progression-free survival (PFS) was defined as the time from the initiation of first-line chemotherapy to the first occurrence of either radiographic DP, as assessed by Response Evaluation Criteria In Solid Tumors (RECIST) 1.1, or death from any cause. Patients who did not experience progression or death by the last follow-up were censored at the time of their last radiographic assessment. Those lost to follow-up were censored on the last date they were documented to be alive and progression-free.
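The censoring rule above can be sketched as a small helper that turns per-patient dates into a (time, event) pair for survival analysis. This is an illustrative sketch only: the function name `pfs_record`, its argument names, and the days-to-months conversion factor are assumptions, not details taken from the study's analysis code.

```python
from datetime import date
from typing import Optional, Tuple

# Assumption: average month length used to convert days to months.
DAYS_PER_MONTH = 30.4375


def pfs_record(
    start: date,
    progression: Optional[date],
    death: Optional[date],
    last_progression_free: date,
) -> Tuple[float, int]:
    """Return (PFS in months, event flag) for one patient.

    event = 1 if radiographic progression or death was observed
    (whichever came first); otherwise the patient is censored
    (event = 0) at the last date documented as progression-free.
    """
    observed = [d for d in (progression, death) if d is not None]
    if observed:
        end, event = min(observed), 1
    else:
        end, event = last_progression_free, 0
    return (end - start).days / DAYS_PER_MONTH, event


# Hypothetical patients, for illustration only.
progressed = pfs_record(date(2020, 1, 1), date(2020, 6, 1), date(2021, 1, 1), date(2020, 5, 15))
censored = pfs_record(date(2020, 1, 1), None, None, date(2020, 3, 1))
```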
Secondary endpoints
(1) ORR, defined as the proportion of patients achieving a best overall response of complete response (CR) or partial response (PR) within the efficacy-evaluable population; (2) Duration of response (DOR), measured from the first documented radiological evidence of response (CR or PR) to the earliest of radiological DP, death from any cause, or initiation of subsequent anticancer therapy. Patients without a documented event were censored at the last radiological confirmation of response; (3) Overall survival (OS), defined as the time from the first dose of first-line chemotherapy until death from any cause. Survivors were censored at the last known alive date. Patients lost to follow-up were censored on the last date of confirmed survival.
Statistical methods
The reporting of this study conforms to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement. 11 The completed checklist is provided as a Supplemental File. This retrospective cohort study utilized data from the NCID. To minimize baseline confounding, 1:1 PSM was performed between the TP and FP groups. Matching covariates included age, sex, tumor stage, number of first-line treatment cycles, and year of first-line treatment initiation. Nearest-neighbor matching was performed with a caliper width set to 0.01 of the standard deviation of the logit of the propensity score, a stringent criterion chosen based on methodological literature to optimize covariate balance while retaining adequate sample size in our large cohort. 11 While some covariates showed statistically significant p-values post-matching due to the large sample size, the negligible standardized mean differences (SMDs <0.1) indicate that these differences are not practically meaningful. 12
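As an illustration of caliper-based nearest-neighbor matching, the following minimal sketch pairs patients 1:1 on the logit of the propensity score. It is not the study's implementation (the analyses were run in R): the greedy matching order, matching without replacement, the use of the pooled standard deviation, and the function name `caliper_match` are all assumptions made for the example, and the propensity scores are hypothetical.

```python
import math
from statistics import pstdev


def logit(p: float) -> float:
    """Log-odds of a propensity score strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def caliper_match(treated: dict, control: dict, caliper_sd: float = 0.01) -> list:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated / control map patient id -> propensity score.
    A pair is accepted only if the logit difference is within
    caliper_sd * SD of the pooled logit propensity scores.
    """
    t_logit = {k: logit(v) for k, v in treated.items()}
    c_logit = {k: logit(v) for k, v in control.items()}
    # Assumption: pooled (population) SD across both groups.
    sd = pstdev(list(t_logit.values()) + list(c_logit.values()))
    caliper = caliper_sd * sd

    available = dict(c_logit)
    pairs = []
    for t_id, t_val in sorted(t_logit.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda k: abs(available[k] - t_val))
        if abs(available[c_id] - t_val) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs


# Hypothetical propensity scores, for illustration only.
treated = {"T1": 0.30, "T2": 0.60, "T3": 0.90}
control = {"C1": 0.30, "C2": 0.61, "C3": 0.10}
strict_pairs = caliper_match(treated, control, caliper_sd=0.01)
loose_pairs = caliper_match(treated, control, caliper_sd=0.2)
```

With the narrow 0.01 caliper only the exact match (T1, C1) is retained in this toy example, whereas a wider caliper also accepts (T2, C2). This shows the trade-off between balance and retained sample size that the caliper controls.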
Categorical variables are presented as numbers (percentages) and were compared using Pearson’s χ2 test or Fisher’s exact test when the expected cell counts were <5. Continuous variables were tested for normality using the Shapiro–Wilk test. Normally distributed variables are summarized as mean ± standard deviation and compared using the independent samples t test, with Levene’s test for homogeneity of variances; Welch’s correction was applied when heterogeneity was detected. Non-normally distributed variables are expressed as the median with interquartile range and compared using the Mann–Whitney U test. PFS and OS were analyzed using the Kaplan–Meier method, with between-group comparisons performed with the log-rank test. Hazard ratios (HR) and corresponding 95% confidence intervals (CI) were estimated using Cox proportional hazards models. All statistical analyses were conducted with R software (version 4.2.1; R Foundation for Statistical Computing, Vienna, Austria). A two-tailed p-value <0.05 was considered statistically significant.
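For readers less familiar with the Kaplan–Meier method, the product-limit estimator and the median survival time can be sketched in a few lines. This is a didactic sketch on made-up data, not the R code used for the analyses; the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if censored
    Returns a list of (time, S(time)) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Made-up follow-up times (months) and event indicators.
toy_times = [2, 3, 3, 5, 8, 9]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

A median that is never reached (the function returns `None`) corresponds to the "not estimable" bounds that can appear in Kaplan–Meier summaries.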
Results
Baseline characteristics and PSM
As presented in Table 1, the baseline variables were well balanced after PSM. Prior to PSM, substantial imbalances existed between the FP (N = 3622) and TP (N = 11,219) cohorts across key baseline characteristics, including age, gender, tumor stage, and year of treatment initiation (all p < 0.001 except tumor stage p = 0.055, with SMD up to 0.292). Following 1:1 PSM, 3440 matched pairs were generated, which effectively balanced all measured covariates. Post-matching differences were substantially reduced, with SMDs for all variables falling below 0.1 (range: 0–0.071), indicating negligible and clinically insignificant absolute differences. Specifically, the mean age was 62.3 versus 61.9 years (SMD = 0.048, p = 0.058), the proportion of male patients was 82.0% versus 79.3% (SMD = 0.071, p = 0.005), and the distribution of Stage IV disease was 53.0% versus 50.5% (SMD = 0.05, p = 0.038). This successful matching created well-balanced comparison groups for robust analysis of subsequent clinical outcomes.
Table 1. Comparison of matching variables before and after PSM.
PSM, propensity score matching; SMD, standardized mean differences.
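The SMD for a binary covariate can be approximated from the two group proportions alone. The sketch below uses one common formula, which is assumed here to be the one applied in the study. Because it starts from the rounded percentages reported in the text, it gives roughly 0.068 for male sex and 0.050 for Stage IV disease, close to but not identical with the reported values of 0.071 and 0.05.

```python
import math


def smd_binary(p1: float, p2: float) -> float:
    """Standardized mean difference for a binary covariate.

    Uses |p1 - p2| / sqrt((p1(1 - p1) + p2(1 - p2)) / 2).
    """
    pooled_var = (p1 * (1.0 - p1) + p2 * (1.0 - p2)) / 2.0
    return abs(p1 - p2) / math.sqrt(pooled_var)


# Rounded post-matching proportions quoted in the text.
smd_male = smd_binary(0.820, 0.793)
smd_stage_iv = smd_binary(0.530, 0.505)

# Conventional balance threshold referred to in the text.
balanced = all(v < 0.1 for v in (smd_male, smd_stage_iv))
```

Unlike a p-value, the SMD does not shrink as the sample grows, which is why it is the preferred balance diagnostic in large matched cohorts.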
The post-matching baseline characteristics are detailed in Table 2. The FP (N = 3440) and TP (N = 3440) cohorts were well-balanced across all measured baseline and treatment characteristics. Comparisons showed no statistically significant differences (all p > 0.05). Specifically, key demographic and clinical factors, including body mass index (mean 21.5 vs 21.3), ECOG performance status (89.8% vs 90.0% for scores 0–1), smoking history, and alcohol history, were comparable between groups. Tumor characteristics such as differentiation grade, T stage, N stage, and initial disease status (79.4% vs 78.6% newly diagnosed) were also evenly distributed. The patterns of metastatic disease, including the presence of distant metastases (53.0% vs 50.5%), specific metastatic sites, and the number of involved sites, showed no significant imbalance. Critically, the profiles of subsequent lines of therapy were highly similar, with no significant differences observed in the utilization rates of chemotherapy, immunotherapy, targeted therapy, radiotherapy, local treatment, combination regimens, or participation in clinical research. This balance in post-progression treatment is essential for valid OS comparisons, as it minimizes the confounding effect of differential subsequent therapy access.
Table 2. Comparison of baseline characteristics between the fluorouracil and taxane groups after propensity score matching.
Tumor response
Based on the RECIST 1.1 guidelines, efficacy was evaluable in 2318 (67.4%) and 2227 (64.7%) patients from the FP and TP groups, respectively. As summarized in Table 3, a statistically significant difference was observed in the ORR, which was higher in the TP group compared to the FP group (22.2% vs 19.5%; p = 0.026). This absolute improvement of 2.7 percentage points in ORR suggests that the TP regimen induces more frequent tumor shrinkage, which may be clinically meaningful for patients requiring rapid symptom palliation or disease control. In contrast, the disease control rate was similar between the two groups (81.7% vs 81.1%; p = 0.622), indicating that while TP achieves more objective responses, both regimens are equally effective at preventing early DP.
Table 3. Efficacy evaluation between the fluorouracil and taxane groups after propensity score matching.
CR, complete response; DCR, disease control rate; ORR, objective response rate; PD, progressive disease; PR, partial response; SD, stable disease.
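The ORR comparison can be approximately checked with a two-proportion z-test, whose squared statistic equals Pearson's χ2 without continuity correction for a 2×2 table. The sketch below assumes that the denominators are the efficacy-evaluable populations (2227 for TP, 2318 for FP) and works from the rounded percentages, so it only approximates the reported p = 0.026. Exact responder counts are not given in the text and are not reconstructed here.

```python
import math


def two_proportion_test(p1: float, n1: int, p2: float, n2: int):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). z**2 equals the Pearson chi-square statistic
    (no continuity correction) for the corresponding 2x2 table.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Rounded ORRs and evaluable denominators quoted in the text (TP vs FP).
z_orr, p_orr = two_proportion_test(0.222, 2227, 0.195, 2318)
```

Under these assumptions the p-value is approximately 0.025, consistent with the reported result.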
Safety profile
The incidences of treatment-related adverse events (AEs) for the PSM cohorts are presented in Table 4. Analysis revealed distinct toxicity profiles associated with each regimen, characterized by several clinically relevant differences in both low-grade (Grade 1–2) and high-grade (Grade 3–4) events. Notably, the FP regimen was associated with a substantially higher burden of dermatological and gastrointestinal toxicities. This included a markedly greater incidence of hand-foot syndrome (42.6% vs 0.6% for Grade 1–2; 12.4% vs 0% for Grade 3–4), oral mucositis (18.7% vs 3.6% for Grade 1–2), and diarrhea (16.5% vs 6.4% for Grade 1–2). In contrast, the TP regimen was linked to higher rates of neurological and other regimen-specific AEs, including peripheral neuropathy (34.9% vs 3.6% for Grade 1–2), alopecia (8.6% vs 0.2% for Grade 1–2), arthralgia/myalgia (13.6% vs 8.7% for Grade 1–2), and allergic reactions (8.6% vs 1.2% for Grade 1–2). Edema was exclusively observed in the TP group (5.2% for Grade 1–2). The incidence of common hematological toxicities and constitutional symptoms such as nausea/vomiting, anorexia, and hepatic dysfunction was broadly comparable between the two treatment groups.
Table 4. Comparison of adverse reactions among the fluorouracil and taxane groups.
Survival outcomes
Survival analyses are shown in Table 5. Treatment with a taxane-based regimen was associated with significantly improved PFS compared to a fluorouracil-based regimen, with a median PFS (mPFS) of 5.2 months (95% CI, 5.0–5.5) versus 4.4 months (95% CI, 4.1–4.7), respectively (HR, 0.91; 95% CI, 0.86–0.96; p = 0.0004). The 6-month PFS rate was 45.5% (95% CI, 43.8–47.3) in the TP group compared to 39.9% (95% CI, 38.2–41.7) in the FP group. Among responders, the median DOR (mDOR) was more than twofold longer in the TP group (10.8 months; 95% CI, 7.0–not estimable) than in the FP group (4.4 months; 95% CI, 3.6–6.7; HR, 0.77; 95% CI, 0.64–0.94; p = 0.009). This difference in DOR is the most striking efficacy finding of this study and underscores the durability advantage of taxane-based chemotherapy in responders. In contrast, no significant difference was observed in OS between the two groups (median OS (mOS): TP, 15.9 months (95% CI, 15.2–16.8) vs FP, 16.1 months (95% CI, 15.4–17.2); HR, 1.00; 95% CI, 0.94–1.06; p = 0.901). The OS rates were closely aligned, with 12-month OS rates of 60.5% and 60.8%, and 24-month OS rates of 36.7% and 38.5% for the TP and FP groups, respectively.
Table 5. Survival analysis between the fluorouracil and taxane groups after propensity score matching.
CI, confidence interval; HR, hazard ratio.
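A reported HR and its 95% CI can be converted back into an approximate standard error and Wald p-value on the log scale, which is a quick way to check that the three figures are mutually consistent. The sketch below is a rough check only: it uses the rounded HRs and CIs quoted in the text, so its output will differ from the p-values reported by the authors, which come from their own analysis of the patient-level data. The function name is hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_from_hr(hr: float, ci_low: float, ci_high: float):
    """Back-calculate (SE of log HR, z, two-sided p) from an HR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_975)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Rounded values quoted in the text.
se_pfs, z_pfs, p_pfs = wald_from_hr(0.91, 0.86, 0.96)  # PFS
se_dor, z_dor, p_dor = wald_from_hr(0.77, 0.64, 0.94)  # DOR
```

Both back-calculated p-values fall below 0.01, in line with the direction and strength of the reported results. An HR rounded to exactly 1.00, as for OS, yields z = 0 in this approximation and therefore cannot recover the reported p-value.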
Discussion
Immunotherapy combined with chemotherapy is now the global first-line standard for advanced ESCC. Despite this remarkable advancement, chemotherapy remains the indispensable backbone of first-line treatment regimens. However, the optimal chemotherapy backbone, whether fluoropyrimidine-based or taxane-based, has remained a subject of clinical debate due to the absence of head-to-head comparative trials. Currently, preliminary evidence from several small-scale studies suggests that TP combination chemotherapy may confer superior survival benefits in patients with advanced ESCC. Reported outcomes for this regimen include an ORR ranging from 30% to 62.1%, an mPFS between 4.7 and 7.9 months, and an mOS of 9.7–12 months. 13–16 Conversely, fluorouracil combined with platinum-based chemotherapy has shown an ORR of 22.7%–57.8%, an mPFS of 2.5–6.5 months, and an mOS of 7.5–12.7 months in similar patient populations. 17–19 The findings from our extensive retrospective real-world study (n = 6880) align with the established trend in the existing literature. We directly evaluated these two backbone strategies in a PSM cohort of patients with unresectable ESCC. Our findings demonstrate that the TP regimen confers a statistically significant advantage in short-term efficacy. However, this does not translate into an OS benefit.
Our findings confirm the superior antitumor activity of TP chemotherapy. The regimen yielded a significantly higher ORR (22.2% vs 19.5%; p = 0.026) and a longer mPFS (5.2 vs 4.4 months; HR, 0.91; p = 0.0004). The superior antitumor activity of taxanes may be attributed to their distinct mechanism of action. Taxanes, by stabilizing microtubules and inducing potent, rapid apoptosis during the G2/M phase, are fundamentally cytotoxic agents that deliver swift tumor debulking. In contrast, fluorouracil primarily inhibits thymidylate synthase, disrupts DNA synthesis, and induces S-phase arrest without directly triggering rapid cell death. 8 This property is particularly valuable in oncological emergencies (e.g., superior vena cava syndrome) and may improve surgical conversion rates and disease control. Notably, to avoid confounding effects on PFS, we excluded locally advanced ESCC patients who were given opportunities for surgery after chemotherapy, precluding further analysis of conversion surgery rates.
The benefit was most pronounced among responders, with the TP group achieving a more than twofold longer mDOR (10.8 vs 4.4 months; HR, 0.77; p = 0.009). This is further corroborated by the separation in PFS rates over time (e.g., an absolute improvement of 5.6 percentage points at 6 months), indicating that the initial cytoreductive advantage translates into a durable delay in DP. This sustained benefit may be partially attributable to the immunomodulatory properties of taxanes. Accumulating evidence indicates that taxanes can induce immunogenic cell death (ICD), characterized by the release of damage-associated molecular patterns such as calreticulin, ATP, and HMGB1, which promote dendritic cell maturation and enhance tumor antigen presentation. 20,21 Furthermore, taxanes have been shown to deplete immunosuppressive regulatory T cells and myeloid-derived suppressor cells, thereby favorably remodeling the tumor immune microenvironment. 20 These immunological effects may establish durable immune surveillance mechanisms that extend disease control beyond the direct cytotoxic phase, potentially explaining the prolonged DOR observed in our TP cohort.
However, this clear short-term efficacy advantage does not culminate in an OS benefit (mOS: 15.9 vs 16.1 months; HR, 1.00). The clinical interpretation of this PFS-OS dissociation warrants careful consideration. This dissociation is likely the result of multiple factors. First, the widespread and balanced use of highly effective subsequent therapies in our matched cohorts—including immunotherapy, targeted therapy, and radiotherapy—likely attenuates the impact of first-line PFS differences on ultimate OS. In the modern treatment landscape where multiple lines of effective therapy are available, first-line PFS differences may be progressively diluted by the cumulative effect of subsequent treatments. For example, the potential OS difference may have been diminished by both crossover to taxane-containing regimens among patients initially assigned to the fluorouracil group and the differential effects of subsequent anti-tumor therapies after first-line progression. Second, the absolute gain in mPFS (0.8 months), while statistically significant, may be of a magnitude insufficient to manifest as a detectable survival advantage when potent subsequent options are available. Nonetheless, the 0.8-month PFS improvement represents an 18% increase relative to the FP group’s mPFS, which, combined with the significantly higher ORR and remarkably prolonged DOR, constitutes a clinically meaningful short-term efficacy advantage. Furthermore, the high censoring rate in both groups impeded a comprehensive analysis of OS. Consequently, PFS was chosen as the primary endpoint to more reliably isolate and evaluate the treatment effect attributable to first-line therapy. This endpoint selection is consistent with regulatory precedent, as PFS has been accepted as a valid surrogate endpoint for drug approval in multiple solid tumor types, reflecting its ability to capture direct treatment effects without the confounding influence of subsequent therapies.
The comprehensive safety analysis revealed profoundly distinct and regimen-specific toxicity profiles between the two treatment groups. The FP regimen was characterized by a significantly higher burden of dermatological and gastrointestinal AEs, most notably hand-foot syndrome, oral mucositis, and diarrhea. In contrast, the TP regimen was associated with a markedly higher incidence of neurological toxicity (predominantly peripheral neuropathy), alopecia, arthralgia/myalgia, and allergic reactions. The incidence of common hematologic and constitutional toxicities was broadly comparable between the groups. This comparative safety profile is consistent with our previously published meta-analysis. 22 These distinct toxicity profiles have important implications for individualized treatment selection. For patients with preexisting neuropathy, diabetes, or occupations requiring fine motor skills, the FP regimen may be preferable despite its gastrointestinal toxicity burden. Conversely, for patients with inflammatory bowel disease, prior hand-foot syndrome, or those prioritizing response durability, the TP regimen may offer advantages.
Several limitations to these conclusions should be acknowledged. First, as a retrospective study, AE data were extracted from medical records and may be incomplete due to inter-hospital variability. For example, AE data may be subject to under-reporting. Second, although PSM achieved excellent balance across matching confounders (all post-matching SMDs <0.1), residual bias from unmeasured factors cannot be entirely excluded. Potential unmeasured confounders include performance status fluctuations, nutritional status, and physician preference factors that may influence regimen selection. Third, detailed data on later-line treatments, while balanced in category, lacked granularity on sequence and response, limiting a deeper exploration of their confounding effect on OS. Fourth, the inherent selection bias in retrospective cohort studies—whereby patients receiving different regimens may differ in ways not captured by available covariates—represents a fundamental limitation that only randomized controlled trials can definitively address. Furthermore, as the standard of care evolves to incorporate first-line immunotherapy, the relative efficacy of these chemotherapy backbones within modern immunochemotherapy combinations warrants prospective investigation.
In conclusion, this large-scale, multicenter real-world analysis indicates that TP chemotherapy demonstrates significant advantages in tumor shrinkage and progression delay in the first-line treatment for advanced ESCC, making it a compelling option when rapid disease control is a priority. The remarkable durability of response observed in the TP group—with mDOR exceeding 10 months—further supports its consideration in patients where sustained disease control is valued. However, this did not translate into a long-term survival benefit. The distinct toxicity profiles of each regimen provide additional considerations for personalized treatment selection based on individual patient characteristics and preferences. These findings provide robust, real-world evidence to inform clinical decision-making in the contemporary treatment landscape. Prospective studies are warranted to validate these results and further evaluate the performance of these backbones when combined with immune checkpoint inhibitors (Figures 1–3).

Figure 1. Kaplan–Meier curves of progression-free survival for the fluorouracil and taxane groups after propensity score matching.

Figure 2. Kaplan–Meier curves of duration of response for the fluorouracil and taxane groups after propensity score matching.

Figure 3. Kaplan–Meier curves of overall survival for the fluorouracil and taxane groups after propensity score matching.
Supplemental Material
Supplemental material, sj-doc-1-tam-10.1177_17588359261436621 for Comparative effectiveness of first-line taxane-platinum versus fluorouracil-platinum chemotherapy in advanced esophageal squamous cell carcinoma: a propensity score-matched real-world study by Sisi Ye, Chao Li, Rongrui Liu, Chuanhua Zhao, Juan Li and Jianming Xu in Therapeutic Advances in Medical Oncology
Footnotes
Acknowledgements
The authors thank the participating institutions of the National Cancer Information Database (NCID) for providing the data.
Declarations
Supplemental material
Supplemental material for this article is available online.
Artificial intelligence assistance disclosure
No generative artificial intelligence was used in any stage of the preparation of this manuscript.
References
