Comparative Analysis of K-NN and Naïve Bayes Algorithms for Early-Stage Chronic Kidney Disease Classification
This study compares the K-NN and Naïve Bayes machine learning algorithms for early Chronic Kidney Disease (CKD) classification. Naïve Bayes achieved 95.83% accuracy, outperforming K-NN on this dataset, and could potentially serve as a rapid screening tool.
Chronic Kidney Disease (CKD) is a global health issue characterized by low early detection rates and high diagnostic costs. Artificial intelligence, particularly machine learning, offers a promising solution as a rapid and cost-effective decision support system. This study analyzes and compares the performance of two simple and interpretable classification algorithms, K-Nearest Neighbor (K-NN) and Naïve Bayes (NB), for predicting CKD from clinical data. The dataset was sourced from the UCI Machine Learning Repository and comprises 400 instances and 25 clinical attributes, such as blood pressure and serum creatinine. The methodology included data preprocessing (median imputation for numerical features, mode imputation for categorical features), encoding, Min-Max normalization, data splitting (70:30 ratio), model training, optimization of the K parameter for K-NN via 5-fold cross-validation, and evaluation using accuracy, precision, recall, F1-Score, and the confusion matrix. Experimental results showed that Naïve Bayes performed best, with an accuracy of 95.83%, precision of 95.95%, recall of 97.26%, and F1-Score of 96.60%. K-NN with the optimal K=5 attained an accuracy of 91.67%. A paired t-test (α=0.05) yielded a p-value of 0.012, indicating that this performance difference was statistically significant. The study concludes that Naïve Bayes is more effective for this CKD dataset, likely due to its robustness under the feature-independence assumption and across varied data scales. This model holds strong potential for development into an early-stage CKD screening tool to assist healthcare professionals.
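The K-NN side of the methodology described above (median imputation, Min-Max normalization, a 70:30 split, and selection of K by 5-fold cross-validation) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code. The data is a synthetic stand-in with two made-up numeric features, all function names are hypothetical, and the candidate values of K are an assumption. The sketch also fits the imputation and scaling statistics on the training portion only, which is a choice made here rather than something the abstract specifies. Categorical encoding, mode imputation, and the Naïve Bayes model are omitted for brevity.

```python
import random
import statistics
from collections import Counter


def fit_column(train_col):
    """Learn median, min and max of a numeric column from observed training values."""
    observed = [v for v in train_col if v is not None]
    return statistics.median(observed), min(observed), max(observed)


def transform_column(col, med, lo, hi):
    """Median-impute missing values (None), then Min-Max scale with the training range."""
    span = (hi - lo) or 1.0  # guard against a constant column
    return [((med if v is None else v) - lo) / span for v in col]


def preprocess(train_rows, test_rows):
    """Fit per-column statistics on the training rows and apply them to both splits."""
    train_cols, test_cols = [], []
    for j in range(len(train_rows[0])):
        stats = fit_column([r[j] for r in train_rows])
        train_cols.append(transform_column([r[j] for r in train_rows], *stats))
        test_cols.append(transform_column([r[j] for r in test_rows], *stats))
    # Transpose the columns back into rows.
    return [list(r) for r in zip(*train_cols)], [list(r) for r in zip(*test_cols)]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training rows nearest to x (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    return Counter(label for _, label in dists[:k]).most_common(1)[0][0]


def cv_accuracy(X, y, k, folds=5):
    """Mean accuracy of k-NN over `folds` contiguous folds of (X, y)."""
    n = len(X)
    scores = []
    for f in range(folds):
        test_idx = set(range(f * n // folds, (f + 1) * n // folds))
        tr_X = [X[i] for i in range(n) if i not in test_idx]
        tr_y = [y[i] for i in range(n) if i not in test_idx]
        hits = sum(knn_predict(tr_X, tr_y, X[i], k) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds


def make_synthetic(n=200, seed=0):
    """Hypothetical data: [blood_pressure, serum_creatinine], about 10% missing pressures."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = CKD, 0 = not CKD (illustrative only)
        bp = rng.gauss(90 if label else 75, 8)
        sc = rng.gauss(3.0 if label else 1.0, 0.5)
        X.append([None if rng.random() < 0.1 else bp, sc])
        y.append(label)
    return X, y


X, y = make_synthetic()
cut = int(0.7 * len(X))  # 70:30 split; the synthetic rows are already in random order
train_X, test_X = preprocess(X[:cut], X[cut:])
train_y, test_y = y[:cut], y[cut:]

# Choose K by 5-fold cross-validation on the training portion only.
candidates = [1, 3, 5, 7, 9]
best_k = max(candidates, key=lambda k: cv_accuracy(train_X, train_y, k))
test_acc = sum(
    knn_predict(train_X, train_y, x, best_k) == t for x, t in zip(test_X, test_y)
) / len(test_y)
print(f"best K = {best_k}, held-out accuracy = {test_acc:.4f}")
```

Because K-NN relies on distances, unscaled features with large ranges would dominate the vote, which is why normalization precedes the neighbor search in this sketch. The numbers it prints describe the synthetic data only and say nothing about the study's results.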
This study presents a timely and relevant comparative analysis of K-Nearest Neighbor (K-NN) and Naïve Bayes (NB) algorithms for early-stage Chronic Kidney Disease (CKD) classification, addressing the critical need for rapid and cost-effective diagnostic support systems. Given the global health burden of CKD and the challenges of early detection, the authors' objective of evaluating simple and interpretable machine learning models for this task is commendable. The chosen algorithms, known for their transparency, are particularly suitable for clinical applications where model interpretability is paramount for trust and adoption by healthcare professionals. The methodology is clearly outlined, beginning with a standard dataset from the UCI Machine Learning Repository comprising 400 instances and 25 clinical attributes. Data preprocessing included median imputation for numerical features, mode imputation for categorical features, encoding, Min-Max normalization, and a 70:30 data split. Model training incorporated optimization of the K parameter for K-NN via 5-fold cross-validation, demonstrating a thorough approach to model configuration. Performance evaluation used accuracy, precision, recall, F1-Score, and the confusion matrix. The experimental results indicate that the Naïve Bayes algorithm outperformed K-NN, achieving an accuracy of 95.83% and an F1-Score of 96.60%, compared to K-NN's 91.67% accuracy (with optimal K=5). This performance difference was reported to be statistically significant via a paired t-test (p-value=0.012). The conclusion that Naïve Bayes is more effective for this specific CKD dataset is well-supported by the detailed evaluation. The authors' attribution of NB's superiority to its robustness under the feature-independence assumption and across varied data scales offers a plausible explanation.
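The reported Naïve Bayes figures can be cross-checked arithmetically. The sketch below assumes a 120-instance test set (30% of 400) and uses one set of confusion-matrix counts that is consistent with the published percentages. These counts are inferred for illustration and are not reported in the abstract. Which class counts as positive is likewise an assumption, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Assumed counts (inferred, not taken from the paper): 71 + 3 + 2 + 44 = 120 test rows.
nb = classification_metrics(tp=71, fp=3, fn=2, tn=44)
for name, value in nb.items():
    print(f"{name}: {100 * value:.2f}%")
# accuracy: 95.83%
# precision: 95.95%
# recall: 97.26%
# f1: 96.60%

# K-NN: 91.67% accuracy on 120 rows corresponds to 110 correct predictions.
print(f"K-NN accuracy: {100 * 110 / 120:.2f}%")  # K-NN accuracy: 91.67%
```

Under these assumptions, the gap between the two models amounts to five test instances (115 versus 110 correct out of 120).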
The strong potential of this model for development into an early-stage CKD screening tool is a significant implication, promising valuable assistance to healthcare professionals. While the study provides a solid foundational comparison, future work could further explore the generalizability of these findings across larger, more diverse, and potentially multi-centric datasets, as well as investigate the integration of more complex feature engineering or ensemble techniques to potentially enhance robustness and predictive power in real-world clinical settings.
Published in Building of Informatics, Technology and Science (BITS).
By Sciaria