Predicting AI Job Salary Classes Through a Comparative Study of Machine Learning Algorithms

Vincent Vincent, Robet Robet, Edi Wijaya

Informatics

JURNAL RISET KOMPUTER (JURIKOM)

Introduction

This study predicts AI job salary classes (Low, Medium, High) using supervised machine learning. Logistic Regression achieves the highest accuracy at 85.4%. Experience, remote work ratio, and key skills emerge as the most influential features for AI labor market analysis.

Abstract

The rapid growth of Artificial Intelligence (AI) has brought significant transformation to the global job market, particularly in salary structures across various AI-related professions. This study aims to classify AI job salaries into three categories—Low, Medium, and High—using supervised machine learning algorithms. The dataset, sourced from Kaggle, combines two real-world datasets featuring key attributes such as experience level, job type, education level, technical skills, remote work ratio, and salary in USD. Preprocessing techniques include One-Hot Encoding for categorical data, StandardScaler for normalization, and MultiLabelBinarizer to handle multi-skill entries. Four machine learning models—Logistic Regression, Random Forest, Gradient Boosting, and XGBoost—were trained and evaluated using consistent pipelines, with evaluation metrics including accuracy, precision, recall, and F1-score, applying macro-averaging to address class imbalance. Logistic Regression achieved the highest performance with 85.4% accuracy and 77.6% F1-score, followed by Gradient Boosting with 84.8% accuracy and 76.3% F1-score. High-salary classes were predicted with higher precision and recall than low-salary classes, indicating skewness in class distribution. Feature importance analysis shows that experience, remote work ratio, and key skills such as Python and SQL significantly affect prediction accuracy. This study demonstrates that traditional machine learning methods, when applied with appropriate preprocessing, can effectively support salary classification and labor market analysis in the AI domain.
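
The gap between the reported accuracy (85.4%) and F1-score (77.6%) is the kind of difference macro-averaging tends to expose when classes are imbalanced, since every class counts equally regardless of its size. The following is a minimal sketch of how the macro-averaged metrics are computed. The labels and predictions are made-up toy values, not the study's data, the function name `macro_scores` is hypothetical, and scoring an undefined precision or recall as 0.0 is an assumed convention.

```python
def macro_scores(y_true, y_pred, labels):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        # One-vs-rest counts for this class.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Assumed convention: an undefined ratio is scored as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    n = len(labels)
    # Macro-averaging: unweighted mean of the per-class scores.
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy, imbalanced example (illustrative values only, not the study's data).
y_true = ["High"] * 6 + ["Medium"] * 3 + ["Low"]
y_pred = ["High"] * 6 + ["Medium", "Medium", "High"] + ["Medium"]

accuracy, macro_p, macro_r, macro_f1 = macro_scores(
    y_true, y_pred, ["Low", "Medium", "High"]
)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
```

In this toy case accuracy is 0.80 while macro F1 is about 0.53, because the single Low example is missed entirely and its zero F1 weighs as much as the well-predicted High class.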

Review

This study presents a timely and relevant investigation into predicting AI job salary classes using a comparative analysis of supervised machine learning algorithms. The authors clearly state their objective, which is to categorize salaries into Low, Medium, and High, and employ a well-structured methodology covering data preprocessing (One-Hot Encoding, StandardScaler, MultiLabelBinarizer) and the evaluation of four common models: Logistic Regression, Random Forest, Gradient Boosting, and XGBoost. A key strength is the consistent application of evaluation metrics: accuracy together with macro-averaged precision, recall, and F1-score, which appropriately accounts for class imbalance in the reported scores. The finding that Logistic Regression achieved the highest performance (85.4% accuracy, 77.6% F1-score) is notable, demonstrating the potential for simpler models to yield robust results in this domain. The identification of experience, remote work ratio, and specific technical skills as significant predictors offers valuable insights for both job seekers and recruiters.

While the study provides a solid foundation, several areas warrant further consideration. The reliance on a Kaggle dataset, albeit one combining two "real-world" sources, raises questions about its representativeness and potential biases concerning geographic scope, industry sectors, or temporal relevance. A more detailed account of the dataset's provenance, timeframes, and limitations would make it easier to judge how far the findings generalize. Furthermore, despite the use of macro-averaging, the abstract notes that "High-salary classes were predicted with higher precision and recall than low-salary classes, indicating skewness in class distribution." Macro-averaging only changes how imbalance is reported, not how the models are trained, so techniques that address imbalance during training (e.g., weighted loss functions, or resampling of the training data such as SMOTE oversampling or random undersampling) could be explored to improve prediction parity across all salary classes. It would also be useful to know whether the "consistent pipelines" included hyperparameter tuning for all models, especially the more complex ensemble methods, to rule out the possibility that Logistic Regression's superior performance stems from sub-optimal tuning of its counterparts.

In conclusion, this research makes a valuable contribution to understanding the dynamics of AI job salaries by effectively applying traditional machine learning techniques. The demonstration that experience, remote work ratio, and specific skills are important determinants of salary class provides actionable intelligence for the AI labor market. Future work could strengthen this foundation by diversifying data sources to improve external validity, exploring more sophisticated class imbalance strategies, and investigating factors not mentioned in the abstract, such as company size or regional economic differences. Such enhancements would further solidify the practical implications of this study for career guidance, workforce planning, and policy-making within the rapidly evolving AI landscape.
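
One of the imbalance strategies mentioned above, a weighted loss, can be sketched as follows. The sketch uses the inverse-frequency heuristic `n_samples / (n_classes * class_count)`, which is the formula scikit-learn documents for `class_weight="balanced"`. The toy labels and the function name `balanced_class_weights` are hypothetical, and nothing in the abstract indicates that the authors applied class weighting.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy training labels (illustrative only, not the study's class distribution).
y_train = ["High"] * 6 + ["Medium"] * 3 + ["Low"]

weights = balanced_class_weights(y_train)
for cls in ("Low", "Medium", "High"):
    print(cls, round(weights[cls], 3))

# With these weights every class contributes the same total weight to the loss.
totals = {cls: weights[cls] * y_train.count(cls) for cls in weights}
```

Rare classes receive larger weights, so a misclassified Low example costs the model more than a misclassified High example. Passing such a mapping to a classifier's class-weight option is one way to apply it. Resampling methods such as SMOTE act on the training data instead and should be fitted on the training split only.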
