
Hate Speech Analysis and Moderation on Twitter Data Using BERT and Ensemble Techniques

EasyChair Preprint 13799, version 1

Versions: 1, 2

6 pages • Date: July 3, 2024

Sukriti Narang, Sejal Karki, Suhani Chauhan, Keshav Garg and Surender Samant

Abstract

Twitter, a popular social media platform, has become a venue for spreading hate speech, racism, sexism, and other such sentiments. This has raised ethical, social, and legal concerns, and researchers have developed methods to identify and classify hate speech. This paper investigates Twitter discourse with a focus on detecting hate speech, a prevalent form of online expression. The study uses a curated dataset to analyze negative tweets, employing a BERT model and ensemble techniques trained to detect and classify hateful content. The best classification results were achieved by BERT and CatBoost with hyper-parameter tuning, yielding accuracies of 92% and 91.1% on the test data, respectively. Additionally, response strategies are devised to moderate content and foster constructive engagement among users. Sentiment analysis is employed to explore the emotional landscape of Twitter discourse. The research is further extended by using clustering to group hate speech, aiming for a more detailed characterization of online hate speech. The analysis also includes a dedicated exploration of racism and sexism detection, identifying tweets that exhibit bias. The study aims to provide a comprehensive understanding of online discourse, with potential applications in content moderation, user engagement strategies, and the cultivation of a more positive digital space.

Keyphrases: BERT, CatBoost, Classification, Ensemble Techniques, Moderation, Sentiment, cluster, deep learning, hyper-parameter

Links:

https://easychair.org/publications/preprint/p3Nn

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:13799,
  author    = {Sukriti Narang and Sejal Karki and Suhani Chauhan and Keshav Garg and Surender Samant},
  title     = {Hate Speech Analysis and Moderation on Twitter Data Using BERT and Ensemble Techniques},
  howpublished = {EasyChair Preprint 13799},
  year      = {EasyChair, 2024}}
