Abstract
Parkinson's disease (PD) is a neurodegenerative disorder of the brain that primarily affects motor function. Two clinical challenges are diagnosing patients accurately in the early stages of the disease and predicting how the condition will progress. This project aims to improve PD detection by combining feature selection with supervised classification. Two publicly available datasets, the PD speech dataset and the Parkinson's Disease Classification Dataset, are used to evaluate model performance across diverse features. The proposed work first balances the classes with the Synthetic Minority Oversampling Technique (SMOTE) to correct the pronounced class imbalance in the data. The Relief algorithm is then used for feature selection to identify the most relevant predictors. Finally, the selected features are classified by a stacked ensemble of Random Forest (RF), XGBoost, and K-Nearest Neighbors (KNN) classifiers, chosen because this combination was more accurate than the other classifier combinations considered. The RF-XGBoost-KNN model stack achieved classification accuracies of 94.56% on the PD speech dataset and 93.53% on the Parkinson's Disease Classification Dataset, demonstrating its potential as a robust tool for early and accurate PD diagnosis.
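The two preprocessing steps named in the abstract, SMOTE balancing followed by Relief feature weighting, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `smote_oversample` and `relief_weights`, all parameter values, and the synthetic data are assumptions made for illustration, and the Relief variant shown is the basic two-class form with a single nearest hit and nearest miss. The RF-XGBoost-KNN stack is left out.

```python
import numpy as np


def smote_oversample(X, y, minority_label, k=5, seed=0):
    """SMOTE-style balancing for two classes (hypothetical helper).

    Each synthetic sample lies on the segment between a minority sample and
    one of its k nearest minority neighbours. Needs >= 2 minority samples.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int(np.sum(y != minority_label)) - len(X_min)
    if n_new <= 0:
        return X, y
    # Pairwise Euclidean distances among minority samples only.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    k = min(k, len(X_min) - 1)
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    X_new = X_min[base] + gap * (X_min[chosen] - X_min[base])
    y_new = np.full(n_new, minority_label, dtype=y.dtype)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic two-class Relief (hypothetical helper).

    A feature gains weight when it differs on the nearest sample of the
    other class (miss) and loses weight when it differs on the nearest
    sample of the same class (hit).
    """
    rng = np.random.default_rng(seed)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # avoid dividing by zero on constant features
    Xs = (X - X.min(axis=0)) / span  # scale every feature to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(Xs), size=n_iter):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance
        dist[i] = np.inf  # exclude the sampled instance itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n_iter
    return w


# Synthetic, imbalanced stand-in data: 90 majority vs 30 minority samples.
# Only feature 0 carries class information; the other three are noise.
rng = np.random.default_rng(42)
y = np.array([0] * 90 + [1] * 30)
X = rng.normal(size=(120, 4))
X[:, 0] += 4.0 * y

X_bal, y_bal = smote_oversample(X, y, minority_label=1)
weights = relief_weights(X_bal, y_bal, n_iter=200)
ranking = np.argsort(weights)[::-1]  # feature indices, most relevant first
print("class counts after balancing:", np.bincount(y_bal))
print("features ranked by Relief weight:", ranking)
```

In a full pipeline, the top-ranked columns of `X_bal` would be passed to the classifier stack. To avoid leaking synthetic samples into the evaluation, oversampling is normally applied to the training split only.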
