Identifikasi Pernyataan Mengandung Pelecehan Seksual Berdasarkan Komentar Media Sosial Menggunakan Indobert Embedding dan Algoritma Support Vector Machine
Identification of Statements Containing Sexual Harassment Based on Social Media Comments Using IndoBERT Embedding and the Support Vector Machine Algorithm

Date
2024
Author
Amanda, Christine
Advisor(s)
Jaya, Ivan
Arisandi, Dedy
Abstract
Social media gives users nearly unrestricted freedom to interact, which has led to an increase in online harassment in various forms. One of the most common is sexual harassment through comments containing sexualized and demeaning language, which often appear in response to social media posts, especially those shared by women. Such comments create an uncomfortable situation for the recipient. Identifying comments that contain sexual harassment remains difficult, not only because of the number of comments that must be reviewed but also because subjective judgments differ, which undermines the consistency and credibility of deciding whether a comment is sexually harassing. As a result, manual identification takes a long time. A system that can identify comments containing sexual harassment is therefore needed so that sexual harassment on social media can be managed and handled more effectively. This research combines IndoBERT embeddings with the Support Vector Machine algorithm to classify Indonesian-language social media comments as either containing or not containing sexual harassment statements. The data set consists of 3,500 comments collected from Instagram and YouTube. Evaluation with a confusion matrix shows an accuracy of 91.1%. Based on this evaluation, combining IndoBERT as the word embedding with the Support Vector Machine algorithm can effectively distinguish comments that contain sexual harassment statements from those that do not.
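The abstract reports accuracy derived from a confusion matrix. As a minimal sketch of how such a figure is typically computed for a two-class task, the Python below counts the four cells of the matrix and divides the correct predictions by the total. The label encoding (1 for harassment, 0 otherwise), the function names, and the toy labels are assumptions made for illustration; they are not taken from the thesis, and the 80.0% it prints is unrelated to the reported 91.1%.

```python
from collections import Counter

HARASSMENT = 1      # hypothetical label: comment contains sexual harassment
NOT_HARASSMENT = 0  # hypothetical label: comment does not


def confusion_matrix(y_true, y_pred):
    """Count (true label, predicted label) pairs for a binary task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter(zip(y_true, y_pred))
    return {
        "tp": counts[(HARASSMENT, HARASSMENT)],
        "fn": counts[(HARASSMENT, NOT_HARASSMENT)],
        "fp": counts[(NOT_HARASSMENT, HARASSMENT)],
        "tn": counts[(NOT_HARASSMENT, NOT_HARASSMENT)],
    }


def accuracy(cm):
    """Share of all predictions that are correct: (TP + TN) / total."""
    total = cm["tp"] + cm["fn"] + cm["fp"] + cm["tn"]
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (cm["tp"] + cm["tn"]) / total


# Toy labels for illustration only; these are NOT the thesis data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)                                # {'tp': 4, 'fn': 1, 'fp': 1, 'tn': 4}
print(f"accuracy = {accuracy(cm):.1%}")  # accuracy = 80.0%
```

In a full pipeline, `y_pred` would come from the trained classifier applied to held-out comments. Accuracy alone can hide class imbalance, so the off-diagonal cells (`fn`, `fp`) are worth inspecting alongside it.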
Collections
- Undergraduate Theses [1181]