Deteksi Cyberbullying pada Komentar Media Sosial Berbahasa Indonesia Menggunakan Pendekatan Hibrida IndoBERTweet-BiLSTM

Reza Ramadhon Aditya, Arry Maulana Syarif

Informatics

Building of Informatics, Technology and Science (BITS)

Introduction

Automatic detection of cyberbullying on Indonesian social media using a hybrid IndoBERTweet-BiLSTM model. The model achieves an F1-Score of 87.53% in identifying harmful comments.

Abstract

Cyberbullying on Indonesian-language social media has become a serious issue with significant psychological and social consequences, necessitating the development of reliable automated detection systems. However, the informal, ambiguous, and highly contextual nature of social media language, including the frequent use of slang and sarcasm, poses substantial challenges for conventional text classification approaches. This study proposes a hybrid cyberbullying detection model that integrates the domain-specific pre-trained language model IndoBERTweet with a Bidirectional Long Short-Term Memory (BiLSTM) architecture. IndoBERTweet is employed to generate contextualized semantic representations aligned with the linguistic characteristics of Indonesian Twitter data, while BiLSTM is utilized to capture bidirectional sequential dependencies at the sentence level. Experiments were conducted using a publicly available, manually annotated Indonesian Twitter dataset consisting of 13,091 samples, which were reformulated into a binary classification scheme. To address class imbalance, a combination of class weighting and label smoothing was applied during model training. Model performance was evaluated using Accuracy, Precision, Recall, F1-Score, ROC-AUC, and PR-AUC metrics. Experimental results show that the IndoBERTweet–BiLSTM model achieved the best performance with an F1-Score of 87.53%, Recall of 88.80%, Precision of 86.31%, ROC-AUC of 92.91%, and PR-AUC of 94.25%. This performance consistently outperforms baseline models based on IndoBERT and IndoBERT-p1 with identical architectural configurations. These findings highlight the critical role of domain alignment in enhancing cyberbullying detection performance for Indonesian social media text.
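The abstract states that class weighting and label smoothing were combined during training but gives neither the weights nor the smoothing factor. The following is a minimal sketch of one simple way to combine the two for a binary classifier, not the authors' implementation. The inverse-frequency weighting scheme, the smoothing factor of 0.1, the toy labels, and all function names are assumptions made for illustration.

```python
import math

def class_weights(labels, num_classes=2):
    """Inverse-frequency weights (assumed scheme): n / (num_classes * count).
    Assumes every class appears at least once in `labels`."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (num_classes * c) for c in counts]

def smoothed_targets(label, num_classes=2, eps=0.1):
    """Replace the one-hot target with a softened distribution.
    eps = 0.1 is an assumed value, not taken from the paper."""
    off = eps / num_classes
    return [1.0 - eps + off if k == label else off for k in range(num_classes)]

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def weighted_smoothed_ce(logits, label, weights, eps=0.1):
    """Cross-entropy against the smoothed target, scaled by the
    weight of the sample's true class."""
    logp = log_softmax(logits)
    target = smoothed_targets(label, len(logits), eps)
    return -weights[label] * sum(t * lp for t, lp in zip(target, logp))

# Toy example: 3 non-bullying (0) samples and 1 bullying (1) sample.
toy_labels = [0, 0, 0, 1]
weights = class_weights(toy_labels)
loss = weighted_smoothed_ce([0.2, 1.5], 1, weights)
print(weights, loss)
```

In this sketch the minority class receives the larger weight, so errors on it contribute more to the loss, while the smoothed target keeps the model from being pushed toward fully confident predictions.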

Review

The paper "Deteksi Cyberbullying pada Komentar Media Sosial Berbahasa Indonesia Menggunakan Pendekatan Hibrida IndoBERTweet-BiLSTM" addresses a highly pertinent and challenging problem: the automated detection of cyberbullying on Indonesian-language social media. Given the significant psychological and social ramifications of cyberbullying, the development of robust detection systems is crucial. The authors correctly identify the inherent difficulties posed by the informal, ambiguous, and context-dependent nature of social media language, particularly the frequent use of slang and sarcasm in Indonesian. This study's central contribution lies in proposing a novel hybrid model that aims to overcome these challenges through a synergistic combination of domain-specific language modeling and sequence learning.

Methodologically, the research presents a well-conceived approach. The integration of IndoBERTweet, a pre-trained language model specifically tailored for Indonesian Twitter data, is a strong methodological choice, ensuring that the model generates contextualized semantic representations aligned with the target domain's linguistic intricacies. This is effectively complemented by a Bidirectional Long Short-Term Memory (BiLSTM) architecture, which is adept at capturing the sequential dependencies crucial for understanding nuanced textual patterns at the sentence level. The experimental setup utilizes a publicly available, manually annotated Indonesian Twitter dataset of substantial size (13,091 samples), reformulated into a binary classification scheme. Furthermore, the authors demonstrate an awareness of practical challenges by employing class weighting and label smoothing to mitigate issues arising from class imbalance, thereby enhancing the reliability of the training process.

The experimental results convincingly demonstrate the efficacy of the proposed IndoBERTweet-BiLSTM model. Achieving an F1-Score of 87.53%, alongside strong performance across other critical metrics such as Recall (88.80%), Precision (86.31%), ROC-AUC (92.91%), and PR-AUC (94.25%), indicates a highly effective detection system. Crucially, the model consistently outperforms established baseline models (IndoBERT and IndoBERT-p1) when configured identically, underscoring the significant advantage of the domain-aligned IndoBERTweet component. These findings provide compelling evidence for the critical role of domain-specific linguistic modeling in enhancing cyberbullying detection performance within the complex landscape of Indonesian social media text. This study represents a valuable advancement in the field and offers a robust foundation for further research and practical applications.
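As a quick consistency check on the reported figures, the F1-Score can be recomputed as the harmonic mean of the reported Precision and Recall. The sketch below uses only the rounded percentages quoted above; the function and variable names are illustrative, and a small difference from the published value is expected because the inputs are already rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported for the IndoBERTweet-BiLSTM model, as fractions.
reported_precision = 0.8631
reported_recall = 0.8880
reported_f1 = 0.8753

f1 = f1_score(reported_precision, reported_recall)
# The recomputed value should agree with the reported F1 to within rounding.
print(f"recomputed F1 = {f1:.4f}, reported F1 = {reported_f1:.4f}")
```

The recomputed value agrees with the reported 87.53% to within about 0.01 percentage points, which is consistent with rounding of the published Precision and Recall.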
