Abstract
A huge amount of ECG data is available for heart disease diagnosis, but it is difficult to handle. Recently, many researchers have focused on mining diagnostic data to uncover hidden patterns and their relevant features. Mining biomedical data is a prominent research area, in which clustering techniques are emphasized for heart disease diagnosis. However, few studies deal with large heart disease datasets and classify them according to heart disease features. We propose an anomaly-threshold method based on multiple classifiers, which is well suited to datasets containing abnormal data, and use the XGBoost algorithm as a sub-classifier to process massive ECG data. This research focuses on the heart disease classification problem. The dataset is first divided into two categories and then classified into more specific categories; experimental results show that this method can improve classification accuracy. The experiments are conducted on a large number of instances of different heart diseases obtained from actual hospital cases and on two UCI datasets. We compared the SVM, C4.5, Naive Bayes, Logistic, RandomForest, and XGBoost algorithms and found that tree-based classifiers are the best fit for predicting arrhythmia. The proposed method is significant for systems that process and make predictions from large medical datasets, and it promotes the development of smart healthcare.
