    •   USU-IR Home
    • Faculty of Computer Science and Information Technology
    • Department of Information Technology
    • Undergraduate Theses
    • View Item

    Deteksi Spam pada Media Sosial X Berdasarkan Post dan Repost dengan Menggunakan Metode Random Forest Classifier

    Spam Detection on Social Media X Based on Post and Repost Using Random Forest Classifier

    Date
    2024
    Author
    Sinaga, Suryana Meissarah Zaini
    Advisor(s)
    Jaya, Ivan
    Purnamasari, Fanindia
    Abstract
    In the era of globalization, technology is advancing rapidly, making communication easier through social media. One of the most widely used platforms is X, formerly known as Twitter. X has had a major impact on industry, business, and politics, with 19.5 million users in Indonesia out of a global total of 500 million. However, its popularity also attracts spammers who engage in activities such as political campaigns, dissemination of misleading information, and irrelevant promotions. Spam, defined as unwanted mass messages, disrupts user privacy and convenience. Research is therefore needed to distinguish spam from non-spam posts in order to improve the convenience and security of users. This study aims to detect Indonesian-language spam on social media X based on posts and reposts using the Random Forest Classifier and TF-IDF. The study uses 2,800 posts and reposts collected from X user accounts. The preprocessing stages were removing unwanted variables and emojis, converting text to lowercase, removing punctuation and symbols, normalization, stop-word removal, and tokenization. TF-IDF weighting was used to convert the words in the data into vectors, which were then classified using the Random Forest Classifier. The model was evaluated with a confusion matrix, yielding an accuracy of 0.97. Based on the evaluation results, it can be concluded that the algorithm used in this study detects spam posts and reposts with high performance.
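The pipeline described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the thesis's implementation: the stop-word list, the normalization map, the regular expressions, and all function names are hypothetical, and the TF-IDF formula shown is the textbook `tf * log(N / df)` variant, which may differ from the weighting the author used. The Random Forest step is left out; in practice a library classifier such as scikit-learn's `RandomForestClassifier` would be trained on the resulting vectors.

```python
import math
import re
import string
from collections import Counter

# Hypothetical resources: the thesis's actual stop-word list and
# normalization dictionary are not given in the abstract.
STOP_WORDS = {"yang", "dan", "di", "ke", "ini"}
NORMALIZATION = {"yg": "yang", "gk": "tidak"}


def preprocess(text):
    """Lowercase, strip URLs/mentions/emojis/punctuation, normalize, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)   # drop URLs and mentions
    text = text.encode("ascii", "ignore").decode()   # crude emoji removal: drops all non-ASCII
    text = text.translate(
        str.maketrans(string.punctuation, " " * len(string.punctuation))
    )
    # Tokenize on whitespace first, then normalize each token.
    tokens = [NORMALIZATION.get(tok, tok) for tok in text.split()]
    return [tok for tok in tokens if tok not in STOP_WORDS]


def tf_idf(docs):
    """Return (vocabulary, vectors) using tf = count / doc length, idf = log(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(doc_freq)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(counts[t] / length) * math.log(n_docs / doc_freq[t]) for t in vocab]
        )
    return vocab, vectors


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = spam and 0 = non-spam."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Accuracy is the share of correct predictions among all predictions."""
    return (tp + tn) / (tp + fp + fn + tn)
```

An accuracy of 0.97, as reported in the abstract, would mean that 97 of every 100 evaluated posts fall in the `tp` or `tn` cells of the confusion matrix. Accuracy alone does not show how the remaining errors split between missed spam (`fn`) and falsely flagged posts (`fp`).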
    URI
    https://repositori.usu.ac.id/handle/123456789/96527
    Collections
    • Undergraduate Theses [767]

    Repositori Institusi Universitas Sumatera Utara (RI-USU)
    Universitas Sumatera Utara | Perpustakaan | Resource Guide | Katalog Perpustakaan
    DSpace software copyright © 2002-2016  DuraSpace
    Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     
