Abstract
This study aimed to develop and validate a risk score for early prediction of venous thromboembolism (VTE) in patients with lung cancer. A total of 827 patients with lung cancer treated in our hospital from February 2013 to February 2018 were retrospectively analyzed. Demographic and clinicopathological variables independently associated with VTE were used to develop the risk score in the development group, and the score was then tested in the validation group. The regression coefficients from multivariable logistic regression were used to assign points in the risk score system. The incidence of VTE was 12.3% in all patients, 12.7% in the development group, and 11.8% in the validation group. The 496 patients in the development group were classified into 3 groups: low risk (scores ≤3), moderate risk (scores 4-5), and high risk (scores ≥6). The risk of VTE was significantly and positively related to the risk scores in both development and validation groups. The risk score system stratified patients by VTE risk well in the development and validation groups (c statistic = 0.819 and 0.827, respectively). This risk score system, based on the factors most significantly associated with VTE, showed good predictive ability and is potentially useful for predicting VTE in patients with lung cancer. However, it was developed and validated by a retrospective analysis and has significant limitations, and a prospective validation with all the classic variables assessing the thrombotic risk is needed for a solid conclusion.
Introduction
Venous thromboembolism (VTE), including deep vein thrombosis (DVT) and pulmonary embolism (PE), is a well-known complication of malignant disease, and it is recognized that patients with cancer have a 4 to 7 times higher risk of VTE than the general population. 1,2 Several epidemiological studies have found that lung cancer is one of the malignant diseases with the highest incidence of VTE. 3 The rate of VTE in patients with lung cancer has been estimated at 1.4% to 15%. 3–5 Moreover, retrospective studies have reported associations between VTE and longer length of stay, a higher rate of clinical complications, a higher in-hospital mortality rate, greater disability upon discharge, and higher costs. 6–8 Thus, a method to estimate the risk of developing VTE in patients with lung cancer would be clinically valuable. Several risk factors for cancer-related VTE have been identified, including cancer type, age, gender, bed rest, central venous catheter (CVC), and anticancer treatment. 7,9 Although several VTE risk score systems have been published in recent years, they rely on biochemical predictors or are not lung cancer specific and thus provide limited clinical benefit. 10–12 Hence, the aim of this retrospective cohort study was to identify clinical predictors of VTE in Chinese patients with lung cancer and to develop a scoring system that provides a reliable estimate of VTE risk.
Methods
Patients
We conducted a retrospective cohort study using the hospital records of the First Affiliated Hospital of Sun Yat-sen University in Guangzhou to assemble a population of 926 patients who were diagnosed with lung cancer of any stage between February 2013 and February 2018. Data were retrieved by data managers and research coordinators. All patients enrolled in our study were inpatients, because much of the treatment of lung cancer in China is still delivered as inpatient therapy, partially owing to China’s medical insurance policy, which requires hospitalization for reimbursement. Reasons for hospitalization included resectable tumor, new diagnosis, tumor progression requiring a new chemotherapy regimen, radiotherapy, and serious adverse events secondary to outpatient chemotherapy or radiotherapy. Patients with a history of other malignant tumors, acute myocardial infarction, or acute cerebral infarction were excluded. Those who did not undergo VTE assessment during hospitalization were also excluded. A total of 827 patients were finally included (Figure 1). Demographic and clinicopathological characteristics were collected for all patients, such as sex, age, body mass index (BMI), bed rest, clinical stage, histology, smoking status, hypertension, diabetes mellitus, hyperlipidemia, history of chemotherapy, history of radiotherapy, history of surgery, leukocyte, hemoglobin, platelet, albumin, alanine aminotransferase (ALT), aspartate aminotransferase (AST), creatinine (Cr), sodium, C-reactive protein (CRP), prothrombin time (PT),

Flowchart of patient inclusion and exclusion.
Outcome Definition and VTE Prophylaxis
The primary outcome was the development of VTE, defined as DVT and/or PE. DVT was defined as the formation of a blood clot in a deep vein, most commonly in the legs. PE results from detachment of a clot that travels to the pulmonary artery. Upper extremity DVT and catheter-associated DVT were also included, as were both segmental and subsegmental PE. DVT was confirmed by ultrasound, while PE was confirmed by computed tomography or magnetic resonance imaging, which was independently reviewed by 2 radiologists. In general, hospitalized patients were to be evaluated for both VTE and bleeding risk within 24 hours of admission and periodically during the hospital stay. The Padua Risk Assessment Model was used to assess VTE risk in medical patients, and the Caprini Risk Assessment Model was used to assess VTE risk in surgical patients. Early and frequent ambulation, mechanical prophylaxis, or pharmacologic prophylaxis was recommended based on comprehensive consideration of VTE and bleeding risk.
Statistical Analysis
Patients were randomly assigned into 2 groups: development group (60%, 496 patients) and validation group (40%, 331 patients). Random numbers were generated from the sequence of the medical records numbers, and grouping was then determined according to the ranking of random numbers via the software SPSS.
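The random 60/40 allocation described above can be illustrated with a minimal Python sketch. The study used SPSS; the function name `split_cohort`, the seed, and the use of sequential integers in place of medical record numbers are assumptions made only for illustration.

```python
import random

def split_cohort(record_ids, dev_fraction=0.6, seed=2018):
    """Randomly assign record IDs to a development and a validation group.

    Each record receives a random number, records are ranked by that number,
    and the first `dev_fraction` of the ranking forms the development group.
    """
    rng = random.Random(seed)  # fixed seed (hypothetical) for reproducibility
    ranked = sorted(record_ids, key=lambda _id: rng.random())
    n_dev = round(len(ranked) * dev_fraction)
    return ranked[:n_dev], ranked[n_dev:]

# Sequential integers stand in for the 827 medical record numbers.
dev, val = split_cohort(range(1, 828))
print(len(dev), len(val))
```

With 827 records and a 60% fraction, the group sizes match the 496 and 331 patients reported for the development and validation groups.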
All data were analyzed using IBM SPSS Statistics version 22.0. Categorical variables were summarized as frequencies and compared using the χ2 test.
The risk score system was developed in the development data set in several steps. First, univariate analysis was performed to identify the preoperative risk factors for VTE. Second, factors with a significant P value (<.05) in the first step were further analyzed with multivariable stepwise logistic regression analysis, and the odds ratios (ORs) and 95% confidence intervals (CIs) were calculated. The variables with a significant P value (<.05) were retained as risk factors for VTE. The Hosmer-Lemeshow statistic was used to assess goodness of fit. Next, each β coefficient was divided by the smallest absolute value among the regression coefficients and rounded to the nearest integer to give a weighted score. Finally, the weighted scores of all factors present in a patient were summed to give that patient’s risk score. In this way, every patient in the development group and validation group was assigned a risk score, which was then used to classify patients into risk groups. Receiver operating characteristic curve analysis was then applied to establish the cutoff values of the risk score most predictive of VTE.
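The conversion of regression coefficients into integer points can be sketched as follows. This is a minimal illustration, not the authors' SPSS workflow: the factor names and β values are hypothetical placeholders rather than the coefficients estimated in this study, and the function names are assumptions.

```python
def points_from_betas(betas):
    """Divide each beta by the smallest absolute beta and round to an integer.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    smallest = min(abs(b) for b in betas.values())
    return {name: round(b / smallest) for name, b in betas.items()}

def risk_score(points, patient):
    """Sum the points of every factor that is present in the patient."""
    return sum(pts for name, pts in points.items() if patient.get(name))

# Hypothetical coefficients for illustration only (not the study's estimates).
example_betas = {"factor_a": 0.52, "factor_b": 0.98, "factor_c": 1.61}
points = points_from_betas(example_betas)

# A hypothetical patient with factor_a and factor_c present.
patient = {"factor_a": True, "factor_b": False, "factor_c": True}
print(points, risk_score(points, patient))
```

Dividing by the smallest coefficient makes the weakest predictor worth 1 point, so the integer points preserve the approximate ratios between the log odds ratios.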
The risk score system was then tested. In brief, the discriminant validity of the risk score system was assessed by the c statistic. A total of 800 bootstrap samples were drawn from each of the development and validation data sets to calculate the 95% CI for the c statistic. All tests were 2-tailed, and statistical significance was defined as P < .05.
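A bootstrap confidence interval for the c statistic can be sketched in plain Python as below. The percentile method, the pairwise definition of the c statistic, the function names, and the toy data are all assumptions for illustration; the source does not state which bootstrap interval was used.

```python
import random

def c_statistic(scores, outcomes):
    """Probability that a random VTE case has a higher score than a random
    non-case, with ties counted as one half."""
    cases = [s for s, y in zip(scores, outcomes) if y]
    controls = [s for s, y in zip(scores, outcomes) if not y]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

def bootstrap_ci(scores, outcomes, n_boot=800, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the c statistic (resampling patients)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both cases and non-cases
            stats.append(c_statistic([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[round(n_boot * alpha / 2)]
    upper = stats[round(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

# Toy data for illustration only: risk scores and VTE outcomes (1 = VTE).
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
point_estimate = c_statistic(toy_scores, toy_outcomes)
ci_low, ci_high = bootstrap_ci(toy_scores, toy_outcomes)
print(point_estimate, ci_low, ci_high)
```

Resamples containing only cases or only non-cases are redrawn, because the c statistic is undefined for them.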
Results
Patient Characteristics and VTE Incidence
The hospital medical records of 827 patients with lung cancer were reviewed. Overall, 423 patients received surgical resection as initial therapy, 259 patients received first-line chemotherapy as initial therapy, 114 patients received targeted therapy, and 31 patients received chemoradiation as initial therapy. The incidence of VTE in all patients was 12.3% (102/827); 78 (76.4%) patients were symptomatic, whereas 24 (23.6%) patients were asymptomatic. The incidence of VTE was similar between the development group and the validation group (12.7% vs 11.8%; P = .694; Figure 2). Similarly, there were no significant differences between the 2 groups in the incidence of DVT only (7.5% vs 7.3%; P = .910), PE only (2.4% vs 2.1%; P = .775), or DVT + PE (2.6% vs 2.7%; P = .932). The basic characteristics of the 2 data sets are shown in Table 1.
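The comparison of VTE incidence between the 2 groups can be checked with a short Pearson χ2 calculation. The event counts used here (63 of 496 and 39 of 331) are assumptions back-calculated from the reported percentages and the total of 102 events; they are not stated in the text. The function name is likewise hypothetical, and no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]; returns the statistic and the 2-sided P value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Assumed counts: VTE / no VTE in the development and validation groups.
chi2, p_value = chi_square_2x2(63, 496 - 63, 39, 331 - 39)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

Under these assumed counts, the P value agrees with the reported P = .694 to 3 decimal places.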

Incidence of venous thromboembolism (VTE) in patients with lung cancer.
The Characteristics of Patients in Development and Validation Groups.
Abbreviations: AD, adenocarcinoma; ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; Cr, creatinine; CRP, C-reactive protein; CVC, central venous catheter; PT, prothrombin time.
Risk Factors for VTE Identified by Univariate Analysis
There were statistically significant differences between non-VTE patients and patients with VTE in the variables of sex, age, bed rest, clinical stage, histology, history of chemotherapy, history of surgery,
Contrast of Variables Between Non-VTE and VTE Patients in the Development Group.
Abbreviations: AD, adenocarcinoma; ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; CI, confidence interval; Cr, creatinine; CRP, C-reactive protein; CVC, central venous catheter; OR, odds ratio; PT, prothrombin time; VTE, venous thromboembolism.
The Risk Score System Developed by Multivariable Analyses for Predicting VTE
Variables showing statistically significant difference in univariate analysis were tested in the multivariable logistic regression model. Significant variables (P < .05) of male, age ≥65 years, clinical stage III-IV, adenocarcinoma, history of chemotherapy, history of surgery,

Receiver operating characteristic (ROC) curves for the venous thromboembolism (VTE) prediction models using the development group and validation group.
Predictors of VTE Determined for the Development Data Sets by Multivariate Analysis.
Abbreviations: CVC, central venous catheter; VTE, venous thromboembolism; OR, odds ratio; CI, confidence interval.
According to the predicted incidence of the risk score, patients were classified into 3 groups (Table 4): low risk (scores ≤3 [predicted incidence < 5%, n = 215]), moderate risk (scores 4-5 [predicted incidence 5%-20%, n = 184]), and high risk (scores ≥6 [predicted incidence >20%, n = 97]). Incidence of VTE by 3 risk classes in the development group is shown in Figure 4.
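The 3-level classification can be written as a small lookup, sketched here in Python. The cutoffs are those stated above; the function name is an assumption, and integer scores are assumed.

```python
def risk_class(score):
    """Map an integer risk score to the risk group defined by the cutoffs."""
    if score <= 3:
        return "low"       # predicted incidence <5%
    if score <= 5:
        return "moderate"  # predicted incidence 5%-20%
    return "high"          # predicted incidence >20%

for s in (0, 3, 4, 5, 6, 9):
    print(s, risk_class(s))
```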
Classification of the Patients According to the Predicted Risk of the Risk Score and the Actual Incidence of VTE.
Abbreviations: NPV, negative predictive value; PPV, positive predictive value; VTE, venous thromboembolism.

Incidence of venous thromboembolism (VTE) by the 3 risk classes for the development and validation group.
Verification of the Predictive Efficacy for the Risk Score System of VTE
The VTE score system showed good discriminatory ability in the validation group (c statistic = .827 [.782-.866]; Figure 3B). The risk of developing VTE was significantly and positively associated with the risk scores in the validation group (Pearson contingency coefficient = .429, P for trend <.001). Using the classification above, the 331 patients in the validation group were classified into 3 groups: low risk (scores ≤3 [predicted incidence <5%, n = 107]), moderate risk (scores 4-5 [predicted incidence 5%-20%, n = 145]), and high risk (scores ≥6 [predicted incidence >20%, n = 79]). The incidence of VTE by the 3 risk classes in the validation group is shown in Figure 4. The rates of VTE in each of the 3 risk groups in the validation set were similar to those in the corresponding risk groups of the development set.
Discussion
In this study involving 827 patients admitted with lung cancer during a 5-year period, we found that the incidence of VTE in patients with lung cancer was 12.3%, which is higher than that reported in some other studies. 3,5 In the literature, the reported epidemiology of VTE varies widely, probably owing to differences in risk factors. In the present study, a high proportion of patients had clinical stage III-IV disease and adenocarcinoma histology, and multiple treatment strategies such as chemotherapy, radiotherapy, and surgery were used, which might explain the high VTE incidence. 7
It is important to know that the incidence of VTE increases exponentially with the number of risk factors present. 13 Hence, the first step in lowering the incidence of VTE is to identify all potential risk factors and their effect on risk and to tailor prophylactic management to the patient’s risk factors. A thorough risk score system should help physicians plan specific prophylactic approaches for patients prone to develop VTE. In this study, factors significantly associated with risk were identified by logistic regression, and all 8 factors were included in the score system.
To our knowledge, our score system is the first to give a predictive score for the incidence of VTE specifically in Chinese patients with lung cancer. The risk score system showed an exponential increase in the incidence of VTE with increasing score. Scores of ≤3 corresponded to a low risk of VTE (2.2%), whereas scores of ≥6 corresponded to a high risk (35.8%). The discriminant validity of this VTE score system was satisfactorily confirmed in the validation group. Moreover, the significant difference in prognosis among the 3 risk groups illustrates the importance of risk classification. In addition, it is important for physicians to record the full panel of risk factors for VTE, alongside traditional factors such as age and bed rest, considering the high correlation of each variable with the 3 risk groups.
The Ottawa score, the Khorana score, and the Caprini VTE risk assessment are the 3 most common and valuable predictive scoring systems for VTE in the cancer population. 14–17 Some of the clinical factors included in those systems were also covered in our risk score system; these included, but were not limited to, male sex, clinical stage, history of chemotherapy, history of surgery, and history of CVC. Therefore, more attention should be paid to these factors, as they are more likely to influence the pathogenesis. The variables unique to our score system may be connected with the pathophysiology of particular patient groups. For instance, the aforementioned systems were established in patients with various types of cancer, while our system was based only on patients with lung cancer. Body mass index (≥30 or 35 kg/m2), which is present in the Khorana score and the Caprini VTE risk assessment, was not included in our risk model, 15,16 potentially owing to poor nutritional status in patients with lung cancer. In addition, bed rest (≥3 d) is another important risk predictor that is used in the Caprini VTE risk assessment but was not adopted in our system. 16 Our center has established a prevention strategy for bedridden patients because of the well-recognized correlation between bed rest and VTE. Hemogram parameters (hemoglobin, platelets, and leukocytes) that were used in other studies 18–20 were not included in our system because few values above the normal range were observed in our data set. Finally, the candidate risk factors considered in this study differ from those in previous predictive models. Future studies are needed to confirm this presumption.
Consistent with previous studies, 21,22 adenocarcinoma was one of the most powerful predictors of VTE development in our predictive scoring system. The risk of VTE in patients with adenocarcinoma was almost 2.5 times higher than that in patients with non-adenocarcinoma in this study. Autopsy and retrospective studies have indicated that adenocarcinomas are the tumors most strongly associated with VTE. Blom et al 21 studied thrombotic risk in 537 patients with non-small cell lung cancer (NSCLC) and observed that patients with adenocarcinoma had a 3-fold higher risk (incidence of 66.7‰) than those with squamous cell carcinoma (incidence of 21.2‰). Tagalakis et al 22 also reported a high incidence (13.6%) of DVT in a cohort of 493 patients with NSCLC. All of these findings have led to the widespread belief that adenocarcinoma activates a procoagulant factor by secreting mucin components, which may result in VTE.
According to the results shown in Figure 4, the risk of VTE was highly and positively correlated with the risk scores. We would therefore recommend pharmacologic prophylaxis for Chinese patients with high VTE risk and low bleeding risk, mechanical prophylaxis for those with high VTE risk and high bleeding risk, and mechanical prophylaxis for those with moderate VTE risk. In addition, we believe that this score is likely applicable to populations other than Chinese patients. First, this risk score showed excellent predictive ability based on 827 patients with lung cancer in our center. As our institution is a large tertiary center in Southern China, our patients represent a diverse population. Second, the general characteristics of our patients are consistent with those in reports from the United States and Europe. Nevertheless, this score needs to be validated using data from other populations.
There are several major limitations in the current study. First, the absence of a prospective validation cohort and of data on personal and family history of venous thromboembolic disease, on anticoagulant or antiplatelet treatment at the time of lung cancer diagnosis, and on antithrombotic use during treatment may limit the generalizability of our results. Second, subgroup analysis by treatment type was not conducted because of the small sample size, so the impact of treatment type on VTE could not be assessed. Third, the candidate predictors of VTE in this study were dichotomized based on clinical experience or the literature rather than on their best threshold values. Finally, our risk score was developed and validated in a single center, and the study population may represent a particular population background. It has to be acknowledged that all these limitations restrict the generalizability of this risk score system.
Conclusions
In conclusion, VTE is a frequent complication in patients with lung cancer, with an incidence of 12.3% in our study. A novel risk score was developed and validated by incorporating both demographic and clinicopathological characteristics. Our newly developed risk score system showed good predictive power for screening patients at high risk of VTE. Individual risk prediction and risk stratification based on the risk score may help clinicians assess the risk of VTE in patients with lung cancer. However, the score was developed and validated by a retrospective analysis and has significant limitations, and a prospective validation with all the classic variables assessing the thrombotic risk is needed for a solid conclusion.
Footnotes
Authors’ Note
Zilun Li and Guolong Zhang have contributed equally. Huiwen Weng and Zhenwei Peng contributed to conception and design; Zilun Li, Guolong Zhang, Mengping Zhang, and Jie Mei contributed to analysis and interpretation; Guolong Zhang, Mengping Zhang, Jie Mei, Huiwen Weng, and Zhenwei Peng contributed to data collection; Zilun Li and Guolong Zhang contributed in writing the manuscript; Huiwen Weng and Zhenwei Peng contributed to critical revision; Zilun Li, Guolong Zhang, Mengping Zhang, Jie Mei, Huiwen Weng, and Zhenwei Peng contributed to approval of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
