QUANTUM-ASSISTED FEATURE SELECTION FOR IMPROVING PREDICTION MODEL ACCURACY ON LARGE AND IMBALANCED DATASETS
Safii Safii, Mochamad Wahyudi, Dedy Hartama



Introduction

This study aims to improve machine learning prediction accuracy on large, imbalanced datasets through quantum-assisted feature selection. Feature selection is formulated as a Quadratic Unconstrained Binary Optimization (QUBO) problem solved with Simulated Annealing, and the selected features are combined with SMOTE to achieve superior prediction performance.


Abstract

Selecting representative and relevant features from large, imbalanced datasets is one of the biggest obstacles to building accurate machine learning models. Class imbalance frequently reduces accuracy and introduces bias, while an excessive number of features raises the risk of overfitting and the computational cost. This study addresses these problems with a quantum-assisted feature selection method that formulates feature selection as a Quadratic Unconstrained Binary Optimization (QUBO) problem and solves it with Simulated Annealing. The objective is to minimize both the number of selected features and the redundancy between them. The experimental dataset contains 28 features, with 102,487 samples in the majority class and 11,239 in the minority class. The feature selection process identified nine optimal features (12, 14, 15, 22, 23, 24, 25, 27, and 28). A Random Forest Classifier was trained on an 80:20 split and assessed with ten-fold cross-validation. The proposed QUBO+SMOTE method showed exceptional performance, with precision, recall, f1-score, and accuracy all reaching 1.00. By comparison, QUBO without SMOTE performed worse, with an accuracy of 0.95 and a minority-class f1-score of only 0.71, while a traditional Recursive Feature Elimination (RFE) approach achieved an accuracy of 0.97 and a minority-class f1-score of 0.94. These findings indicate that QUBO can reduce dimensionality but does not by itself address class imbalance, which requires its integration with SMOTE. This study demonstrates how quantum computing can enhance the effectiveness and efficiency of machine learning, especially for large-scale imbalanced datasets.
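The abstract describes casting feature selection as a QUBO problem and solving it with Simulated Annealing. The sketch below illustrates that general idea with a hand-rolled annealer; the specific objective terms (a correlation-based redundancy penalty, a per-feature cost, and a target-relevance reward, the last of which the abstract does not mention explicitly) and all weights are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch of QUBO-style feature selection solved with classical
# simulated annealing. Objective terms and weights are illustrative assumptions.
import numpy as np

def build_qubo(X, y, relevance_weight=2.0, redundancy_weight=1.0, sparsity_weight=0.5):
    """Build a QUBO matrix Q so that the energy of a binary selection vector x is x^T Q x."""
    n = X.shape[1]
    feat_corr = np.abs(np.corrcoef(X, rowvar=False))                  # |correlation| between feature pairs
    target_corr = np.abs(np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n)]))
    Q = redundancy_weight * np.triu(feat_corr, k=1)                   # off-diagonal: penalize redundant pairs
    Q += np.diag(sparsity_weight - relevance_weight * target_corr)    # diagonal: selection cost minus relevance reward
    return Q

def simulated_annealing(Q, n_steps=20_000, t_start=2.0, t_end=0.01, seed=0):
    """Flip one bit at a time, accepting worse moves with a temperature-dependent probability."""
    rng = np.random.default_rng(seed)
    n = Q.shape[0]
    x = rng.integers(0, 2, size=n)
    current_e = float(x @ Q @ x)
    best_x, best_e = x.copy(), current_e
    for step in range(n_steps):
        t = t_start * (t_end / t_start) ** (step / n_steps)           # geometric cooling schedule
        i = rng.integers(n)
        candidate = x.copy()
        candidate[i] ^= 1
        cand_e = float(candidate @ Q @ candidate)
        delta = cand_e - current_e
        if delta < 0 or rng.random() < np.exp(-delta / t):
            x, current_e = candidate, cand_e
            if current_e < best_e:
                best_x, best_e = x.copy(), current_e
    return best_x

# Toy example: 28 features, two of which drive the (hypothetical) target.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 28))
y = (X[:, 0] + X[:, 3] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
selected = np.flatnonzero(simulated_annealing(build_qubo(X, y)))
print("selected feature indices:", selected)
```

On quantum hardware the same QUBO matrix could be handed to a quantum annealer instead of this classical loop, which is what makes the formulation "quantum-assisted" rather than quantum-executed.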


Review

The paper "QUANTUM-ASSISTED FEATURE SELECTION FOR IMPROVING PREDICTION MODEL ACCURACY ON LARGE AND IMBALANCED DATASETS" addresses a critical challenge in machine learning: selecting features from high-dimensional, imbalanced datasets in order to prevent overfitting, reduce computational cost, and improve model generalization. The authors propose a quantum-inspired approach to feature selection that uses Simulated Annealing to solve a Quadratic Unconstrained Binary Optimization (QUBO) problem. The framework aims to identify a parsimonious yet representative subset of features by minimizing inter-feature redundancy, thereby contributing to more accurate and robust prediction models, particularly in scenarios complicated by significant class imbalance.

The methodology centers on formulating feature selection as a QUBO problem, which is then solved with a classical Simulated Annealing algorithm and presented as a "quantum-assisted" method because the QUBO formulation maps directly onto quantum annealing hardware. The study uses a substantial, highly imbalanced dataset with 28 initial features (102,487 majority versus 11,239 minority samples), demonstrating the practical relevance of the approach. A key aspect of the work is the necessary integration of the QUBO-selected features with the Synthetic Minority Over-sampling Technique (SMOTE) to counter the class imbalance. The performance of a Random Forest Classifier trained on the chosen features (9 of 28) is evaluated using 10-fold cross-validation with an 80:20 split, providing a robust assessment of the model's predictive capability across standard classification metrics.

The experimental results showcase the considerable advantages of the proposed QUBO+SMOTE method, which reports precision, recall, f1-score, and accuracy all reaching 1.00. This highly impressive outcome, while raising questions about the dataset's inherent separability or potential data leakage, nonetheless highlights the efficacy of the combined approach. Crucially, the paper shows that QUBO alone, without SMOTE, performed significantly worse (accuracy 0.95, minority-class f1-score 0.71), underscoring the vital role of imbalance handling. The QUBO+SMOTE method also outperforms a traditional Recursive Feature Elimination (RFE) baseline, which achieved an accuracy of 0.97 and a minority-class f1-score of 0.94.

These findings collectively support the authors' claim that quantum-assisted optimization, when integrated with appropriate data preprocessing techniques such as SMOTE, can substantially enhance machine learning performance on challenging large-scale, imbalanced datasets.
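To make the evaluation protocol concrete, the sketch below shows one way the selected features could be combined with SMOTE and a Random Forest Classifier under 10-fold cross-validation and an 80:20 hold-out split. The synthetic data, the 0-based column indices, the library choices (scikit-learn and imbalanced-learn), and all hyperparameters are assumptions for illustration, not the paper's actual setup; fitting SMOTE inside each training fold, as done here via an imblearn Pipeline, is one way to address the data-leakage concern raised above.

```python
# Illustrative SMOTE + Random Forest evaluation pipeline on the QUBO-selected
# columns. Data, indices, and hyperparameters are assumptions, not the paper's.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

# Synthetic stand-in for the paper's dataset: 28 features, roughly 9:1 class imbalance.
X, y = make_classification(n_samples=20_000, n_features=28, weights=[0.9, 0.1], random_state=0)

# Hypothetical QUBO-selected columns (the paper reports 9 of 28 features), as 0-based indices.
selected = [11, 13, 14, 21, 22, 23, 24, 26, 27]
X_sel = X[:, selected]

pipeline = Pipeline([
    ("smote", SMOTE(random_state=0)),              # oversampling happens only inside each training fold
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# 10-fold cross-validation, mirroring the protocol described in the abstract.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(pipeline, X_sel, y, cv=cv, scoring=["accuracy", "f1"])
print("mean accuracy:", scores["test_accuracy"].mean())
print("mean minority-class f1:", scores["test_f1"].mean())

# 80:20 hold-out evaluation, as also reported in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.2, stratify=y, random_state=0)
pipeline.fit(X_tr, y_tr)
print(classification_report(y_te, pipeline.predict(X_te)))
```

Keeping SMOTE inside the pipeline ensures synthetic minority samples never appear in the evaluation folds, so the reported minority-class f1-score reflects performance on genuine, unseen data.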


Full Text

The full text of this article is available from JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer).
