Abstract
Imbalanced classification remains a critical challenge in decision-sensitive domains such as healthcare, finance, and cybersecurity, where minority-class recognition is often paramount. This paper introduces FRSTU-Forest, a novel hybrid framework that integrates k-Nearest Neighbor (k-NN) imputation, Fixed Random State Undersampling (FRSTU), and Random Forest to enhance both minority-class detection and model reproducibility. Unlike conventional undersampling, FRSTU applies deterministic sampling with a fixed random seed, ensuring consistent training subsets across runs and significantly reducing performance variance. The framework was comprehensively evaluated on seven benchmark datasets with moderate imbalance ratios (1.25–3.36) and rigorously tested on synthetic datasets with extreme imbalance ratios up to 1:100, high dimensionality (100 features), and substantial label noise (20%). FRSTU-Forest consistently outperformed baseline models (RF, k-NNimp+RF, RSTU+RF), achieving an average accuracy of 87.88%, a minority-class F1-score of up to 99.78%, and a Cohen's Kappa of 0.86 on benchmark datasets. More importantly, under extreme imbalance conditions (1:100 ratio), it maintained a balanced accuracy of 0.807 with 100 features, demonstrating remarkable robustness. Statistical significance was confirmed via the Bonferroni-Dunn test.
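The three-stage pipeline the abstract describes (k-NN imputation, then fixed-seed undersampling, then Random Forest) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the helper `frstu_undersample` and all parameter values (seed, neighbor count, data shapes) are assumptions; only the overall structure follows the abstract.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.ensemble import RandomForestClassifier

SEED = 42  # fixed random state: the source of FRSTU's run-to-run reproducibility

def frstu_undersample(X, y, seed=SEED):
    """Deterministically undersample every class down to the minority-class count.

    Hypothetical helper illustrating the FRSTU idea: with the same seed and
    data, the selected subset is identical on every run.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    idx.sort()
    return X[idx], y[idx]

# Synthetic imbalanced data with missing values (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = np.array([0] * 270 + [1] * 30)          # 9:1 class imbalance
X[rng.random(X.shape) < 0.05] = np.nan      # inject ~5% missingness

X_imp = KNNImputer(n_neighbors=5).fit_transform(X)       # stage 1: k-NN imputation
X_bal, y_bal = frstu_undersample(X_imp, y)               # stage 2: FRSTU
clf = RandomForestClassifier(random_state=SEED).fit(X_bal, y_bal)  # stage 3: RF

# Determinism check: a second call with the same seed yields the same subset.
X_bal2, y_bal2 = frstu_undersample(X_imp, y)
```

With a fixed seed, rerunning the sampler reproduces the training subset exactly, which is the property the abstract credits for reduced performance variance across runs.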
