Komparasi Model Ensemble dan Algoritma Machine Learning Untuk Memprediksi Penyakit Jantung

Muhammad Syarief Albani, Dedy Kurniawan, Ken Ditha Tania

Medicine

Building of Informatics, Technology and Science (BITS)

Introduction

A comparative study of ensemble models and machine learning algorithms (including CatBoost and Random Forest) for predicting heart disease. CatBoost performs best, with 98% accuracy, and the results are intended to support clinical decision-making and diagnosis.

Abstract

This study compares the performance of nine machine learning algorithms in predicting heart disease, using a dataset that dates back to 1988 and combines four databases (Cleveland, Hungary, Switzerland, and Long Beach) for a total of 1025 records. The dataset includes medical features that reflect physiological states, clinical examination results, and cardiovascular risk factors: age, gender, type of chest pain, resting blood pressure, serum cholesterol level, fasting blood sugar level, resting electrocardiography results, maximum heart rate, exercise-induced chest pain, ST segment depression, ST segment slope, number of major blood vessels visible by fluoroscopy, and thalassemia status. The study proceeds through data cleaning, data transformation, and evaluation. Evaluation uses a train/test split as well as K-fold cross-validation, with accuracy, precision, recall, F1 score, and AUC-ROC as metrics. The algorithms compared are Decision Tree, Random Forest, Support Vector Machine, MLP Classifier, Bagging Classifier, Gradient Boosting, CatBoost, XGBoost, and LightGBM. Ensemble-based models such as CatBoost, Random Forest, XGBoost, and LightGBM showed consistently strong performance across the evaluation metrics compared with the non-ensemble models. Among all models tested, CatBoost performed best, with an accuracy of 98%, an F1 score of 0.980, and a recall of 0.9875, followed by the other ensemble algorithms Random Forest, XGBoost, and LightGBM. These results indicate that ensemble models are more effective in predicting heart disease.
The study aims to provide an in-depth comparison of ensemble algorithms and modern machine learning methods for heart disease prediction, to enrich the literature on Knowledge Discovery in the health sector, and to provide a basis for selecting more reliable prediction algorithms to support clinical decision-making and the development of machine learning-based heart disease diagnosis support systems.
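
As a reading aid for the metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and AUC-ROC are defined for a binary classifier. It is not the authors' code: the function names (`classification_metrics`, `auc_roc`) and the toy labels and scores are hypothetical, and the sketch uses only the Python standard library rather than whatever tooling the study used.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from hard predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc_roc(y_true, y_score, positive=1):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = heart disease, 0 = no heart disease).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)

# Hypothetical predicted probabilities for a second, smaller example.
auc = auc_roc([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.7, 0.3, 0.2])

print(metrics)
print(auc)
```

Recall is the metric that penalizes missed cases of heart disease, which is why it is commonly reported alongside accuracy in diagnostic settings.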

Review

This study, titled "Komparasi Model Ensemble dan Algoritma Machine Learning Untuk Memprediksi Penyakit Jantung," presents a comprehensive comparative analysis of various machine learning algorithms, including both traditional and ensemble models, for the prediction of heart disease. Utilizing a dataset compiled from four distinct databases (Cleveland, Hungary, Switzerland, and Long Beach) dating back to 1988 and comprising 1025 records with 13 medical features, the research aims to identify the most effective predictive models. The chosen features span physiological states, clinical examination results, and cardiovascular risk factors, ensuring a robust foundation for the analysis. The methodology employed is thorough, encompassing essential steps such as data cleaning and transformation, followed by rigorous evaluation using both data splitting for training/testing and K-fold cross-validation. Performance was assessed across a suite of standard metrics including accuracy, precision, recall, F1-score, and AUC-ROC, providing a multi-faceted view of model effectiveness. Nine algorithms were tested: Decision Tree, Random Forest, Support Vector Machine, MLP Classifier, Bagging Classifier, Gradient Boosting, CatBoost, XGBoost, and LightGBM. A key finding consistently demonstrated across various evaluation metrics was the superior performance of ensemble-based models (CatBoost, Random Forest, XGBoost, and LightGBM) compared to their non-ensemble counterparts. Specifically, CatBoost emerged as the best performer, achieving an impressive accuracy of 98%, an F1-Score of 0.980, and a Recall of 0.9875. The results of this study clearly underscore the enhanced effectiveness of ensemble models in predicting heart disease, offering valuable insights for the selection of reliable prediction algorithms in healthcare. 
This research makes a significant contribution to the literature on Knowledge Discovery in the health sector, particularly in demonstrating the practical application and benefits of advanced machine learning techniques for critical medical predictions. The findings provide a strong empirical basis for developing more accurate and robust machine learning-based heart disease diagnosis support systems and can directly inform clinical decision-making, ultimately improving patient outcomes through earlier and more reliable detection.
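
The K-fold cross-validation mentioned in the review can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names (`k_fold_indices`, `cross_val_accuracy`), the toy data, and the majority-class baseline, which stands in for a real model and is not one of the nine algorithms in the study. The number of folds the authors used is not stated in the text above, so `k=5` is only a placeholder.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_majority(X_train, y_train):
    """Hypothetical stand-in model: remember the most common training label."""
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, X_test):
    """Predict the remembered label for every test row."""
    return [model for _ in X_test]


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, repeat for each fold."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        preds = predict(model, [X[j] for j in test_idx])
        hits = sum(p == y[j] for p, j in zip(preds, test_idx))
        scores.append(hits / len(test_idx))
    return scores


# Hypothetical toy data: 20 rows with a single feature and a binary label.
X = [[i] for i in range(20)]
y = [1] * 12 + [0] * 8

scores = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
print(scores)
print(sum(scores) / len(scores))
```

Each record is used for testing exactly once, so the averaged score depends less on one particular train/test split than a single hold-out evaluation does.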
