Abusive language detection is an important NLP task that can lead to improved content moderation and is crucial for a safer online experience for all. Most online spaces, such as social media platforms and video games, employ some form of content moderation and allow users to report instances of abusive language, but they fall short when it comes to real-time detection and intervention.
Our work focuses on abusive language detection in social online domains. We begin by exploring the use of pre-trained transformer models for this task and provide a set of baseline results. We introduce an offline filtering step that identifies and removes non-keywords using pointwise mutual information, addressing linguistic challenges such as domain-specific language and misspelled words. This process emphasizes the remaining words and builds a stronger association between word-category pairs. We also explore an online filtering framework that aims to replicate the effect of the offline filtering step, and thereby remove the need for it, by using transformer-produced context-aware embeddings to score words. Identifying and removing non-keywords can significantly improve performance on this task, but it relies on knowledge of word-category co-occurrences.
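To make the offline filtering step concrete, below is a minimal sketch of PMI-based non-keyword removal over a labeled token corpus. The function names (pmi_scores, filter_non_keywords), the token-level counting scheme, and the threshold value are illustrative assumptions, not the thesis's actual implementation.

```python
import math
from collections import Counter

def pmi_scores(docs, labels):
    """Score word-category association with pointwise mutual information.

    docs: list of token lists; labels: parallel list of category labels.
    Returns a dict mapping (word, category) to its PMI over token events.
    """
    word_counts, label_counts, pair_counts = Counter(), Counter(), Counter()
    total = 0
    for tokens, label in zip(docs, labels):
        for tok in tokens:
            word_counts[tok] += 1
            label_counts[label] += 1
            pair_counts[(tok, label)] += 1
            total += 1
    # PMI(w, c) = log( p(w, c) / (p(w) * p(c)) )
    return {
        (w, c): math.log((n * total) / (word_counts[w] * label_counts[c]))
        for (w, c), n in pair_counts.items()
    }

def filter_non_keywords(tokens, scores, categories, threshold=0.1):
    """Drop tokens whose strongest PMI with any category falls below threshold."""
    def strongest(tok):
        return max(scores.get((tok, c), float("-inf")) for c in categories)
    return [tok for tok in tokens if strongest(tok) >= threshold]

# Illustrative usage on a toy corpus (tokens and labels are made up).
docs = [["you", "noob", "gg"], ["nice", "game", "gg"]]
labels = ["abusive", "neutral"]
scores = pmi_scores(docs, labels)
print(filter_non_keywords(["you", "noob", "gg"], scores, set(labels)))
# -> ['you', 'noob']: "gg" co-occurs with both categories, so its PMI is
# low with each and it is filtered as a non-keyword.
```

The intuition the sketch captures is that a word strongly associated with at least one category is likely informative, while a word spread evenly across categories contributes little and can be removed to sharpen the remaining word-category associations.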
Identifier | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/49336
Date | 25 September 2024
Creators | Saravanan, Duruvan
Contributors | Plummer, Bryan
Source Sets | Boston University
Language | en_US
Detected Language | English
Type | Thesis/Dissertation
Rights | Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/