Abstract
Objective
This study aimed to evaluate the ability of seven bleeding risk assessment tools to predict bleeding events in patients treated with apixaban or rivaroxaban.
Methods
A retrospective analysis was conducted on 313 patients with non-valvular atrial fibrillation (NVAF) (apixaban: n = 164; rivaroxaban: n = 149) with at least one year of follow-up at a tertiary hospital in Malaysia. Bleeding risk scores were calculated at treatment initiation using original definitions. Bleeding events were defined according to International Society on Thrombosis and Haemostasis (ISTH) criteria. Predictive performance was assessed using the area under the receiver operating characteristic curve (AUROC), with good diagnostic accuracy defined as AUROC ≥ 0.7. Subgroup analyses by age and sex were also performed.
Results
Overall bleeding incidence was 12.46%. Among rivaroxaban users, ORBIT (AUROC 0.68), RE-LY (0.66), and HEMORR2HAGES (0.66) showed sufficient accuracy. For apixaban users, A4C performed best (AUROC 0.64); HAS-BLED showed no diagnostic value (AUROC 0.42). Some tools showed good accuracy in age subgroups, for example RE-LY in rivaroxaban users under 65 years (AUROC 0.79) and HAS-BLED in rivaroxaban users aged 75 years or older (AUROC 0.74).
Conclusion
None of the bleeding risk assessment tools achieved good overall diagnostic accuracy in NVAF patients treated with apixaban or rivaroxaban. Further studies are needed to validate existing tools and to develop a bleeding risk score with acceptable diagnostic accuracy for this patient population.
Introduction
Apixaban and rivaroxaban are now the mainstay of treatment for thromboembolic prevention across a range of clinical conditions, including stroke prevention in atrial fibrillation (SPAF), venous thromboembolism (VTE), and post-operative thromboprophylaxis. 1 These drugs offer effective stroke and thromboembolism risk reduction, with a generally more favourable bleeding profile compared to traditional vitamin K antagonists (VKAs) such as warfarin. 2 However, the potential for bleeding remains a critical consideration in prescribing decisions, warranting the use of bleeding risk assessment tools to guide treatment, monitor follow-up, and inform patient counseling.
Current guidelines, including the 2024 European Society of Cardiology (ESC) guidelines for the management of atrial fibrillation, emphasize that bleeding risk assessment should not be used to deny patients oral anticoagulation, but rather to identify modifiable risk factors and optimize management. 3 In clinical practice, tools such as HAS-BLED, ATRIA, ORBIT, and HEMORR2HAGES remain widely used; however, these scores were originally developed and validated in VKA-treated cohorts.4–7 Their performance and applicability in patients receiving direct oral anticoagulants (DOACs) are therefore uncertain. To address this gap, several tools, including simplified ABH, RE-LY, and A4C, have been specifically designed or adapted for DOAC-treated populations.8–10
Despite the development of these tools, there remains limited comparative data evaluating their performance in real-world DOAC-treated patients, particularly those with non-valvular atrial fibrillation (NVAF). Understanding the predictive accuracy of these tools is essential to improving clinical decision-making and patient safety. This study aimed to evaluate the ability of seven bleeding risk assessment tools to predict bleeding events in patients treated with apixaban or rivaroxaban.
Methods
Study Setting and Population
This study was conducted at Hospital Canselor Tuanku Muhriz (HCTM), a tertiary hospital in Malaysia that provides anticoagulation monitoring services. The study population consisted of patients with non-valvular atrial fibrillation (NVAF) who initiated apixaban or rivaroxaban therapy between January 2015 and December 2020. A 1-year follow-up period was chosen to align with the design of existing bleeding risk assessment tools, which were derived from cohorts of newly anticoagulated atrial fibrillation patients and validated to predict bleeding events within the first year of treatment.11–13 Follow-up ran from the initiation of apixaban or rivaroxaban until the occurrence of bleeding, drug discontinuation, switching of therapy or the end of the study period. Ethical approval was obtained from the Universiti Kebangsaan Malaysia Ethical Committee (UKM PPI/111/8/JEP-2023-583).
Inclusion and Exclusion Criteria
Patients were eligible for inclusion if they (i) were aged 18 years or older, (ii) had been receiving either apixaban or rivaroxaban for at least one year, and (iii) had a diagnosis of non-valvular atrial fibrillation, defined as atrial fibrillation without mitral stenosis or a history of valvular surgery. Patients were excluded if they: (i) had a history of treatment for deep vein thrombosis (DVT) or pulmonary embolism (PE); (ii) had undergone mechanical or bioprosthetic heart valve surgery; (iii) experienced bleeding events attributed to anticoagulants other than apixaban or rivaroxaban, based on clinical records or laboratory testing; or (iv) had no identifiable exposure to apixaban or rivaroxaban, defined as the absence of documented dispensing and administration despite a recorded prescription.
Data Collection
Data were collected retrospectively from patients' medical folders or electronic medical records. The patient characteristics recorded were the start date of apixaban or rivaroxaban therapy, age at therapy initiation, sex, and body weight at the start of treatment. Lifestyle factors, including smoking and alcohol consumption, were also recorded. Comorbidities were documented at the start of therapy, along with any history of clinically significant bleeding or stroke prior to treatment. Non-bleeding-related hospitalizations in the preceding 12 months were noted. Laboratory data included serum creatinine, liver function tests (ie, ALT, ALP, bilirubin), and a full blood count (ie, red blood cell count, platelet count, haemoglobin levels). Estimated glomerular filtration rate was calculated using the Modification of Diet in Renal Disease (MDRD) formula. 14 Concomitant medications taken alongside apixaban or rivaroxaban, including those used within three months before bleeding events, were also recorded.
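As an illustration of the eGFR step, the following is a minimal sketch of the four-variable MDRD equation. The text does not state which version of the equation was used; the sketch assumes the IDMS-traceable form (coefficient 175) with serum creatinine in mg/dL. The function names, the unit-conversion helper and the optional race coefficient are illustrative assumptions, not details taken from the study.

```python
def umol_l_to_mg_dl(scr_umol_l: float) -> float:
    """Convert serum creatinine from micromol/L to mg/dL (divide by 88.4)."""
    return scr_umol_l / 88.4


def egfr_mdrd(scr_mg_dl: float, age_years: float, female: bool,
              black: bool = False) -> float:
    """Four-variable MDRD eGFR in mL/min/1.73 m^2.

    Assumes the IDMS-traceable coefficient (175); the original
    equation used 186 for non-standardised creatinine assays.
    """
    if scr_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742  # sex coefficient
    if black:
        egfr *= 1.212  # race coefficient from the original equation
    return egfr


# Hypothetical patient: 70-year-old man, creatinine 88.4 micromol/L (1.0 mg/dL)
example = egfr_mdrd(umol_l_to_mg_dl(88.4), 70, female=False)
```

Under these assumptions the hypothetical patient has an eGFR of roughly 74 mL/min/1.73 m².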
Calculation and Recording of Bleeding Risk Assessment Scores
Bleeding risk scores were calculated for each patient at treatment initiation, following the specific definitions and scoring methods of the risk assessment tools included in the study. If a score was already available in the patient's records, it was recorded directly; if not, the scores were manually calculated based on the relevant patient data. This process ensured that all patients had an accurate bleeding risk assessment at the initiation of treatment.
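The manual calculation can be illustrated with HAS-BLED, one of the seven tools. The sketch below assumes the commonly cited scoring of one point per criterion, with "elderly" taken as age over 65 years; the field names are hypothetical, and how each criterion is judged from the clinical record is left to the original tool definition. Labile INR is optional and contributes no points when missing, mirroring its unavailability in DOAC-treated patients.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HasBledInputs:
    # Field names are illustrative; criteria follow the original tool definition.
    hypertension: bool
    abnormal_renal_function: bool
    abnormal_liver_function: bool
    stroke: bool
    bleeding_history: bool
    age_years: float
    drugs: bool            # concomitant antiplatelets or NSAIDs
    alcohol: bool
    labile_inr: Optional[bool] = None  # None = not available (eg, DOAC users)


def has_bled_score(p: HasBledInputs) -> int:
    """Sum one point per HAS-BLED criterion present (maximum 9)."""
    criteria = [
        p.hypertension,
        p.abnormal_renal_function,
        p.abnormal_liver_function,
        p.stroke,
        p.bleeding_history,
        p.age_years > 65,      # assumed cut-off for "elderly"
        p.drugs,
        p.alcohol,
        bool(p.labile_inr),    # missing INR data scores 0
    ]
    return sum(1 for c in criteria if c)


# Hypothetical patient: hypertensive, 76 years old, taking an NSAID
patient = HasBledInputs(True, False, False, False, False, 76, True, False)
score = has_bled_score(patient)
```

Treating an unavailable criterion as absent is one possible convention; it lowers the maximum attainable score and should be reported alongside the results.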
Study Outcomes
The primary outcome measure was the predictive performance of the bleeding risk assessment tools (ie, HAS-BLED, ATRIA, HEMORR2HAGES, ORBIT, simplified ABH, RE-LY and A4C). Bleeding events were defined according to International Society on Thrombosis and Haemostasis (ISTH) criteria. 15
Statistical Analysis
Data analyses were performed using IBM SPSS Statistics v29 and STATA v17. Categorical variables were expressed as frequencies and proportions, while continuous variables were presented as medians and interquartile ranges for non-normally distributed data and as means and standard deviations for normally distributed data. Normality was assessed using the Shapiro–Wilk test. The Kaplan–Meier method was used to estimate the time to first bleeding event (major or clinically relevant non-major bleeding) among patients receiving apixaban and rivaroxaban. Survival curves were generated for each treatment group, and mean bleeding-free survival times with 95% confidence intervals (CI) were reported. The area under the receiver operating characteristic curve (AUROC) was used to evaluate the diagnostic ability of the different assessment tools in predicting bleeding events. AUROC values ranging from 0.5 to 1.0 were considered clinically useful, 16 with values between 0.5 and 0.6 indicating poor diagnostic accuracy, 0.6 to 0.7 indicating sufficient accuracy, 0.7 to 0.8 indicating good accuracy, 0.8 to 0.9 indicating very good accuracy and 0.9 to 1.0 indicating excellent accuracy. An AUROC <0.5 suggested no diagnostic value, indicating that outcomes were likely due to chance. All statistical tests were two-tailed, and p-values <0.05 were considered statistically significant.
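The AUROC and its interpretation bands can be sketched as follows. This is not the SPSS or STATA procedure used in the study. It computes the AUROC through its rank-based (Mann–Whitney) interpretation: the proportion of bleeder/non-bleeder pairs in which the bleeder has the higher score, counting ties as one half. The function names are illustrative, and assigning each boundary value to the higher band (for example, 0.70 as "good") is an assumption, since the text gives overlapping ranges.

```python
from typing import Sequence


def auroc(scores: Sequence[float], events: Sequence[bool]) -> float:
    """AUROC as the probability that a patient with an event scores higher
    than a patient without one (ties count as 0.5)."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_band(value: float) -> str:
    """Map an AUROC to the descriptive bands used in this study.
    Boundary values are assigned to the higher band (an assumption)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("AUROC must lie between 0 and 1")
    if value < 0.5:
        return "no diagnostic value"
    if value < 0.6:
        return "poor"
    if value < 0.7:
        return "sufficient"
    if value < 0.8:
        return "good"
    if value < 0.9:
        return "very good"
    return "excellent"


# Toy data: risk scores for four hypothetical patients, two with bleeding
toy_auc = auroc([3, 1, 2, 2], [True, False, True, False])  # 3.5 of 4 pairs
```

With this convention, the AUROC of 0.68 reported for ORBIT in rivaroxaban users falls in the "sufficient" band and the 0.42 reported for HAS-BLED in apixaban users falls in "no diagnostic value".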
Results
Demographic Characteristics
Figure 1 illustrates the screening and selection process for this study. A total of 1868 patients prescribed either apixaban (n = 844) or rivaroxaban (n = 1024) were initially screened. After exclusion of 759 patients diagnosed with conditions other than NVAF and 118 patients with untraceable records, 991 patients with a confirmed diagnosis of non-valvular atrial fibrillation (NVAF) were identified. A further 678 patients were lost to follow-up during the one-year observation period, leaving a total of 313 patients for inclusion in the final analysis (rivaroxaban: n = 149; apixaban: n = 164).

Figure 1. Patient Screening and Selection Process Flow Chart.
Table 1 summarizes the demographic and clinical characteristics of the study population. The majority of patients were male (51.76%) and of Chinese ethnicity (56.55%), with a mean age of 70.92 years. The cohort was at high risk of stroke, as reflected by a median CHA2DS2-VASc score of 4. Most patients had multiple comorbidities, including hypertension, diabetes mellitus, prior stroke or transient ischemic attack (TIA), congestive heart failure, chronic kidney disease, prior bleeding, chronic obstructive pulmonary disease (COPD), cancer, anemia and vascular disease. Overall, 12.46% of the cohort experienced major or clinically relevant non-major bleeding events during the follow-up period. The NVAF patients receiving apixaban were followed for a median duration of 2.59 years [interquartile range (IQR): 1.66–3.63], while those receiving rivaroxaban were followed for a median of 3.42 years [IQR: 2.24–4.78].
Table 1. Demographic and Clinical Characteristics of NVAF Patients.
NVAF: non-valvular atrial fibrillation; DOACs: direct oral anticoagulants; CHF: congestive heart failure; CKD: chronic kidney disease; COPD: chronic obstructive pulmonary disease; CCB: calcium channel blocker; ACEi: angiotensin converting enzyme inhibitors; ARB: angiotensin receptor blockers; eGFR: estimated glomerular filtration rate; CHA2DS2-VASc: Congestive heart failure, hypertension, age ≥75 (doubled), diabetes, stroke (doubled), vascular disease, age 65 to 74 and sex category (female); CHADS2: Congestive heart failure, hypertension, age of or greater than 75 years, or diabetes mellitus, and two points for a previous history of stroke or TIA; HAS-BLED: Hypertension, abnormal renal/liver function, stroke, bleeding history or predisposition, labile INR, elderly, drugs/alcohol concomitantly; HEMORR2HAGES: Hepatic or renal disease, ethanol (alcohol abuse), malignancy, older age (>75 years old), reduced platelet count or function, re-bleeding, hypertension, anaemia, genetic risk factors, excessive fall risk, stroke; simplified ABH: Age, history of bleeding, and non-bleeding related hospitalisation in the preceding 12 months. Values are expressed in: n (number of patients); mean ± standard deviation; frequency (proportions); median [interquartile ranges].
Kaplan–Meier Survival Analysis of Time to First Bleeding Event in NVAF Patients Receiving Apixaban and Rivaroxaban Therapy
Kaplan–Meier survival curves were generated to estimate the time to first bleeding event (defined as either major or clinically relevant non-major bleeding) among NVAF patients treated with rivaroxaban and apixaban. The mean bleeding-free survival time was 4.38 years (95% CI: 4.09–4.66) for the rivaroxaban group and 6.78 years (95% CI: 6.35–7.21) for the apixaban group. Details are presented in Table 2.
Table 2. Estimated Bleeding-Free Survival Time among NVAF Patients Receiving Rivaroxaban.
Estimation is limited to the largest survival time if it is censored.
The survival curves demonstrated a progressive decline in bleeding-free survival over time, with a more pronounced drop observed between the third and fourth years in the apixaban cohort. In contrast, the rivaroxaban group exhibited a steadier decline throughout the follow-up period. Censored observations—patients who did not experience a bleeding event during follow-up—are indicated by tick marks on the curves. Kaplan–Meier survival curves for each treatment group are presented in Figure S1 of the Supplemental files.
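The Kaplan–Meier estimate and the mean bleeding-free survival time can be sketched as below. This is an illustrative implementation, not the SPSS routine used in the study: it applies the product-limit estimator and takes the mean as the area under the survival curve up to the largest observed time, which is one way to realise a mean that is limited to the largest survival time when that observation is censored. Function names and the toy data are assumptions.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimator.
    Returns (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_censored = 0
        # Group all observations that share this time point
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_events + n_censored
    return curve


def restricted_mean_survival(times: Sequence[float],
                             events: Sequence[bool]) -> float:
    """Area under the Kaplan-Meier step curve from 0 to the largest
    observed time (event or censored)."""
    t_max = max(times)
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (t_max - prev_t)


# Toy data: follow-up in years; True = bleeding event, False = censored
toy_times = [1, 2, 3, 4]
toy_events = [True, False, True, False]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_mean = restricted_mean_survival(toy_times, toy_events)
```

In the toy data, survival drops to 0.75 at year 1 and to 0.375 at year 3, and the restricted mean is 2.875 years. Because the mean is truncated at the longest follow-up, groups with different follow-up durations are not directly comparable on this statistic.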
Diagnostic Accuracy of Bleeding Risk Assessment Tools for Predicting Bleeding Events for Rivaroxaban and Apixaban
For rivaroxaban-treated patients, the RE-LY and A4C tools had AUROCs of 0.66 and 0.50, respectively. The RE-LY tool achieved sufficient diagnostic accuracy, comparable to ORBIT (0.68) and HEMORR2HAGES (0.66), whereas A4C was at the threshold of poor diagnostic accuracy.
Among apixaban-treated patients, the RE-LY tool (0.50) was at the threshold of poor diagnostic accuracy, while A4C (0.64) demonstrated sufficient diagnostic accuracy. In contrast, the HAS-BLED tool (0.42) showed no diagnostic value. A comprehensive summary of AUROC values is presented in Table 3, with rivaroxaban and apixaban results visualized in Figures 2 and 3, respectively.

Figure 2. Area Under the ROC Curves of the Different Bleeding Risk Assessment Tools to Predict Bleeding Events among NVAF Patients with Rivaroxaban.

Figure 3. Area Under the ROC Curves of the Different Bleeding Risk Assessment Tools to Predict Bleeding Events among NVAF Patients with Apixaban.
Table 3. Area Under the ROC Curve of Different Bleeding Risk Assessment Tools.
AUROC: Area under the ROC curve of the different bleeding risk assessment tools; CI: confidence interval.
Age-Based Subgroup Analysis of Bleeding Risk Assessment Tools for Predicting Bleeding Events with Rivaroxaban and Apixaban
The RE-LY tool demonstrated good diagnostic accuracy in rivaroxaban users under 65 years old (AUROC 0.79). Among those aged 65 to 74 years, ORBIT had the highest AUROC (0.68), and among those aged 75 years or older, HAS-BLED had the highest AUROC (0.74). These AUROC values correspond to sufficient to good diagnostic accuracy.
Among patients taking apixaban in the 65–74 age group, A4C, RE-LY, and ORBIT scores all exhibited sufficient diagnostic accuracy (AUROC ≥ 0.61). Notably, HEMORR2HAGES was the only tool that demonstrated good diagnostic accuracy in this age-stratified analysis. Details are presented in Table S1 of the supplemental file.
Sex-Based Subgroup Analysis of Bleeding Risk Assessment Tools for Predicting Bleeding Events with Rivaroxaban and Apixaban
For male patients taking rivaroxaban, the tool with the highest AUROC was HEMORR2HAGES. For apixaban users, no tool reached good accuracy; however, RE-LY performed better in females (AUROC 0.60) than in males (AUROC 0.48). Details are presented in Table S2 of the supplemental file.
Discussion
This study assessed the predictive performance of seven bleeding risk assessment tools in NVAF patients treated with apixaban or rivaroxaban. Despite their widespread application in clinical settings, none of the tools demonstrated good overall diagnostic accuracy (AUROC ≥ 0.7) within the study population. These findings highlight the limitations of current scoring systems when applied to DOAC-treated patients, particularly in real-world practice. Although simplified ABH, RE-LY and A4C were specifically developed for use in DOAC cohorts, their inconsistent performance in this analysis suggests they may not fully reflect the complex and multifactorial nature of bleeding risk in these patients.
Notably, subgroup analysis by age revealed that tools such as RE-LY, HAS-BLED, and HEMORR2HAGES achieved AUROCs exceeding 0.7, indicating good predictive accuracy within specific age groups. Age is a key risk factor for atrial fibrillation, and the risk of major bleeding may increase with age, especially among patients receiving direct oral anticoagulants. 17 Conversely, tools with poor or no diagnostic value may offer limited clinical utility for guiding treatment or identifying high-risk patients. In such cases, unreliable stratification may lead to inappropriate clinical decisions, including underestimation or overestimation of bleeding risk. Therefore, cautious interpretation and application of these tools are recommended when managing patients on apixaban or rivaroxaban. This aligns with current international recommendations, which increasingly emphasize moving away from rigid reliance on bleeding risk scores alone and instead advocate for a comprehensive, individualized clinical evaluation. 3
The results of this study align with the AUROCs from previous studies that evaluated risk assessment tools in patients on DOACs.18,19 Gorman et al evaluated the predictive ability of the HAS-BLED bleeding risk assessment tool in atrial fibrillation patients receiving rivaroxaban. 18 In that cohort, 15 patients experienced bleeding events, while 90 did not. The area under the curve (AUC) for HAS-BLED was reported as 0.68, indicating sufficient discriminative ability. Lip et al assessed the HAS-BLED, ATRIA and ORBIT bleeding risk assessment tools in patients from Danish registries who took apixaban, rivaroxaban or dabigatran for atrial fibrillation. The study reported that at 1 year, AUCs for these scores were only 0.58, 0.59, and 0.61, respectively. 19 The HAS-BLED tool provided the most benefit in terms of clinical usefulness. As a result, the 2018 European Heart Rhythm Association recommended the use of the HAS-BLED tool to assess the risk of bleeding in AF patients who will take DOACs. 20 In this study, the AUROCs of HAS-BLED for rivaroxaban and apixaban were 0.63 and 0.42, respectively. Meanwhile, the AUROCs of ATRIA for rivaroxaban and apixaban were 0.61 and 0.56, respectively, and those of ORBIT were 0.68 and 0.58, respectively.
This study highlights the limited ability of current bleeding risk assessment tools to accurately predict bleeding in NVAF patients treated with apixaban or rivaroxaban, especially when applied broadly without accounting for individual patient differences. Bleeding risk assessment tools such as HAS-BLED, ORBIT, and HEMORR2HAGES demonstrated varying levels of diagnostic performance in older patients (≥75 years), with HAS-BLED showing good, ORBIT poor, and HEMORR2HAGES sufficient diagnostic accuracy. However, their respective AUCs suggest they should be applied with caution in clinical settings. Importantly, this is consistent with evolving guideline recommendations that discourage the use of a single bleeding risk score to make treatment decisions. Instead, risk scores should serve as supportive tools, complementing broader clinical judgment that accounts for comorbidities, concomitant therapies, frailty, and patient preferences. 3
Therefore, clinicians are advised not to rely solely on these tools but to incorporate them into a broader clinical context, considering patient-specific factors and individual judgment (eg, evaluating renal and hepatic function, assessing concomitant use of antiplatelets or NSAIDs, reviewing recent bleeding history, and accounting for frailty or fall risk). Consistent with these findings, prior research has also reported sufficient diagnostic accuracy of HAS-BLED and HEMORR2HAGES in DOAC-treated patients, with none achieving an AUC above 0.7. 21 These findings underscore the need to improve existing risk assessment tools or develop new, DOAC-specific risk assessment tools that more accurately reflect the characteristics of modern patient populations and current treatment practices.
This study has several limitations. Firstly, the low bleeding event rate prevented us from drawing conclusions about the performance of one risk assessment tool relative to another. The low event rate may reflect a relatively safer profile of apixaban and rivaroxaban in the Asian population, or the younger age of our cohort (70.92 ± 9.55 years) and shorter duration of follow-up compared with cohorts in other studies. Future studies should therefore consider a larger sample size, which might improve the precision of the performance estimates for risk assessment tools in patients on apixaban and rivaroxaban therapy, and a longer follow-up period, which may yield different findings. While real-time, indefinite monitoring of all patients would be ideal, it is resource-intensive. Secondly, because of the retrospective design, outcomes could be captured only from physicians' documentation and patient reports. Some information, such as minor bleeding or consultations with a different physician between visits, might not have been documented, which may have led to underrepresentation of clinical events. Thirdly, genetic factors and fall risk (which are included in the HEMORR2HAGES tool), albumin less than 35 g/L (included in the A4C tool), body mass index (included in the RE-LY tool) and international normalized ratio (INR) lability (included in the HAS-BLED tool) were excluded from our analysis due to the lack of available information. This approach is consistent with previous studies,19,22,23 which also excluded such data when unavailable.
While the inclusion of genetic data could improve the predictive accuracy of bleeding risk assessment tools, it is not routinely collected in standard clinical practice. Therefore, its omission is unlikely to significantly impact the overall utility of the risk assessment tools, which are designed to work with the most readily accessible clinical information. Future studies might explore the role of specific biomarkers (ie, D-dimer, high-sensitivity troponin, cystatin C, and growth differentiation factor-15) in predicting bleeding risk. These biomarkers have shown potential in identifying patients at higher risk of bleeding, particularly in conjunction with risk assessment tools. 24 Additionally, INR values were not routinely monitored in our study because apixaban and rivaroxaban do not require INR testing, given their predictable pharmacokinetics. 25 This aligns with current clinical practice, where the absence of INR data may limit some risk assessment tools but reflects real-world prescribing and monitoring patterns. 26 Lastly, another potential limitation relates to the calculation of bleeding risk assessment scores. In this study, some scores were directly obtained from existing medical records, while others were calculated retrospectively by the researchers based on available clinical data. This mixed approach may introduce variability in the accuracy of the scores, although efforts were made to standardize calculations using predefined criteria.
Conclusion
None of the bleeding risk assessment tools had good diagnostic accuracy in predicting bleeding events in patients receiving apixaban or rivaroxaban. The RE-LY, HAS-BLED, ATRIA, HEMORR2HAGES and ORBIT tools performed similarly in predicting bleeding events in patients treated with rivaroxaban, with sufficient diagnostic accuracy. In patients treated with apixaban, A4C was the only tool with sufficient diagnostic accuracy, while the ATRIA, HEMORR2HAGES, ORBIT and simplified ABH tools had poor diagnostic accuracy. Further studies are needed to evaluate the performance of existing bleeding risk assessment tools in patients taking apixaban or rivaroxaban and to develop a bleeding risk assessment tool with acceptable diagnostic accuracy (AUROC ≥ 0.7) for predicting bleeding risk.
Supplemental Material
sj-pdf-1-cat-10.1177_10760296251390902 - Supplemental material for Bleeding Risk Prediction in Non-Valvular Atrial Fibrillation: Comparing Risk Scores in Apixaban- and Rivaroxaban-Treated Patients
Supplemental material, sj-pdf-1-cat-10.1177_10760296251390902 for Bleeding Risk Prediction in Non-Valvular Atrial Fibrillation: Comparing Risk Scores in Apixaban- and Rivaroxaban-Treated Patients by Christian Andrew Almalbis, Adyani Md Redzuan and Shamin Mohd Saffian in Clinical and Applied Thrombosis/Hemostasis
Footnotes
Acknowledgments
The authors are grateful to the University of San Agustin and the Department of Science and Technology-Science Education Institute (DOST-SEI) for their support. At the time of writing, C.A.A. was a recipient of the DOST-SEI scholarship.
Ethics Approval
Ethical approval was granted by the Universiti Kebangsaan Malaysia Ethical Committee (UKM PPI/111/8/JEP-2023-583).
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Author Contributions
All authors contributed to the study conception and design. Data collection and analysis were performed by C.A.A., S.M.S., and AM. The first draft of the manuscript was written by C.A.A. and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability
Not applicable.
Supplemental Material
Supplemental material for this article is available online.
References
