Optimasi Hyperparameter Random Forest untuk Klasifikasi Depresi Mahasiswa Menggunakan GridSearchCV dan RandomizedSearchCV

Eka Wahyu Utami, Defri Kurniawan

Informatics

Building of Informatics, Technology and Science (BITS)

Introduction

Hyperparameter optimization of Random Forest for classifying student depression using GridSearchCV and RandomizedSearchCV, with the aim of improving the accuracy of depression detection and the management of student mental health.

Abstract

Student mental health is an important issue that calls for a data-driven approach to classifying student depression. This study aims to analyze the factors that cause depression and to optimize the performance of a classification model built with the Random Forest algorithm. The data are secondary data from the Student Depression Dataset obtained from the Kaggle platform, with a total of 27,901 data points. The research begins with data collection, followed by Exploratory Data Analysis (EDA), which includes descriptive statistical analysis and a correlation heatmap of the variables. Data preprocessing involves removing irrelevant features, handling missing values, encoding categorical data, and splitting the data into training and testing sets. Models are developed in three scenarios: a baseline model, hyperparameter optimization using GridSearchCV, and hyperparameter optimization using RandomizedSearchCV. Model performance is evaluated with a confusion matrix, from which accuracy, precision, recall, and F1-score are derived. The results show that all models produce relatively stable accuracy in the range of 0.84–0.85. The model optimized with GridSearchCV performs best, with a recall of 0.8869 and an F1-score of 0.8719. This increase in recall matters because it reduces the risk of false negatives, that is, students experiencing depression who are not identified. The findings are intended to support decision support systems that help educational institutions detect and manage students' mental health more accurately.
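The three scenarios named in the abstract can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions, not the authors' code: the data are a small synthetic stand-in generated with `make_classification` rather than the Kaggle dataset, and the parameter grids, the `scoring="recall"` choice, the fold count, and the helper name `evaluate` are all hypothetical, since the abstract does not report them.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)

# Synthetic stand-in for the preprocessed data (assumption: NOT the Kaggle dataset).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)


def evaluate(model):
    """Return recall and F1 of a fitted model on the held-out test set."""
    pred = model.predict(X_test)
    return {"recall": recall_score(y_test, pred), "f1": f1_score(y_test, pred)}


# Scenario 1: baseline Random Forest with fixed settings.
baseline = RandomForestClassifier(n_estimators=50, random_state=42)
baseline.fit(X_train, y_train)

# Scenario 2: exhaustive search over a small, illustrative grid.
grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    scoring="recall",  # assumption: tune for recall to limit false negatives
    cv=3,
)
grid.fit(X_train, y_train)

# Scenario 3: random sampling of 4 candidates from illustrative distributions.
rand = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        "n_estimators": randint(50, 150),
        "max_depth": [None, 5, 10],
    },
    n_iter=4,
    scoring="recall",
    cv=3,
    random_state=42,
)
rand.fit(X_train, y_train)

results = {
    "baseline": evaluate(baseline),
    "grid_search": evaluate(grid),
    "randomized_search": evaluate(rand),
}
for name, scores in results.items():
    print(name, scores)
```

Both search objects refit the best configuration on the full training split by default, so `evaluate` scores the tuned model directly. The numbers printed for synthetic data say nothing about the paper's results; the sketch only shows how the three scenarios can be compared on the same held-out split.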

Review

This study presents a timely and well-structured approach to classifying student depression, a critical issue in contemporary education. The authors effectively utilize a substantial dataset from Kaggle, employing a systematic methodology that includes comprehensive Exploratory Data Analysis (EDA) and meticulous data preprocessing. A notable strength is the comparison of a baseline Random Forest model against optimized versions using both GridSearchCV and RandomizedSearchCV, demonstrating a commitment to enhancing model performance. The explicit focus on improving recall to minimize false negatives is particularly commendable, as it directly addresses a crucial practical concern in the early detection and intervention of mental health issues.

While the study's methodology is robust, certain details could further enrich the presented findings. The abstract indicates a relatively stable accuracy range (0.84–0.85) across all models, suggesting that the baseline Random Forest model already performed quite well. A more explicit quantitative comparison of the baseline model's recall and F1-score against the optimized models would better highlight the specific impact and practical significance of the hyperparameter tuning. Additionally, although the study aimed to analyze factors causing depression and conducted correlation analysis during EDA, the abstract does not detail specific insights derived from these analyses regarding influential features or the "irrelevant features" mentioned as being removed during preprocessing, which could offer deeper contextual understanding.

Overall, this research makes a valuable contribution to the application of machine learning in addressing student mental health. The systematic use of hyperparameter optimization, particularly the emphasis on improving recall, offers a promising pathway for developing more effective decision support systems for educational institutions. The findings provide a strong foundation for more accurate and timely identification of students experiencing depression, which is vital for enabling proactive support and interventions. Future work could potentially expand on these results by exploring the interpretability of features, comparing performance with a wider array of classification algorithms, or investigating the model's generalizability across different student populations.
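To make the link between recall and false negatives concrete, the sketch below computes the four metrics named in the abstract from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not taken from the paper. The helper `implied_precision` (also a hypothetical name) rearranges the F1 definition to estimate the precision implied by a reported recall and F1-score, which for the GridSearchCV figures in the abstract (recall 0.8869, F1 0.8719) comes out at roughly 0.86, subject to rounding in the reported values.

```python
def metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # falls as false negatives (missed cases) rise
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, given recall R and F1."""
    return f1 * recall / (2 * recall - f1)


# Hypothetical counts (assumption: illustrative only, not from the paper).
scores = metrics(tp=80, fp=15, fn=10, tn=95)
print(scores)

# Precision implied by the recall and F1-score reported in the abstract.
print(implied_precision(recall=0.8869, f1=0.8719))
```

Reporting these four values side by side for the baseline and both tuned models would give the quantitative comparison the review asks for.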
