Global ETD Search

Cyberbullying detection in Urdu language using machine learning

Cyberbullying has become a significant problem with the surge in the use of social media. The most basic way to prevent cyberbullying on these social media platforms is to identify and remove offensive comments. However, it is hard for humans to read and remove all the comments manually. Current research work focuses on using machine learning to detect and eliminate cyberbullying. Although most of the work on cyberbullying detection has been conducted on English texts, little to no work can be found in Urdu. This paper aims to detect cyberbullying from the users' comments posted in Urdu on Twitter using machine learning and Natural Language Processing (NLP) techniques. To the best of our knowledge, cyberbullying detection on Urdu text comments has not been performed due to the lack of a publicly available standard Urdu dataset. In this paper, we created a dataset of offensive user-generated Urdu comments from Twitter. The comments in the dataset are classified into five categories. N-gram techniques are used to extract features at character and word levels. Various supervised machine-learning techniques are applied to the dataset to detect cyberbullying. Evaluation metrics such as precision, recall, accuracy and F1 scores are used to analyse the performance of machine learning techniques.
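
The abstract names character-level and word-level n-gram features and the precision, recall, accuracy and F1 metrics without showing them. The following is a minimal Python sketch of those general ideas, not the authors' implementation. The function names, the n-gram sizes (3 characters, 2 words), whitespace tokenisation, and the reduction to a single positive class are all assumptions made for illustration; the paper itself uses five categories and does not specify these details here.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return all overlapping word n-grams (assumed whitespace tokenisation)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, char_n=3, word_n=2):
    """Count character and word n-grams in one bag-of-n-grams feature map.

    The "c:" and "w:" prefixes keep the two feature levels from colliding.
    The default sizes are illustrative assumptions, not values from the paper.
    """
    features = Counter()
    features.update("c:" + g for g in char_ngrams(text, char_n))
    features.update("w:" + g for g in word_ngrams(text, word_n))
    return features


def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, accuracy and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}
```

Because Python strings are sequences of Unicode code points, the same functions apply to Urdu text without modification, although combining marks are treated as separate characters. For a five-category dataset, one option would be to call binary_metrics once per category, treating that category as the positive class, and average the results.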

Cyberbullying

Machine learning

Natural language processing

Twitter

Identifier	oai:union.ndltd.org:BRADFORD/oai:bradscholars.brad.ac.uk:10454/19312
Date	11 January 2023
Creators	Khan, Sara; Qureshi, Amna
Publisher	IEEE
Source Sets	Bradford Scholars
Language	English
Detected Language	English
Type	Conference paper, Accepted manuscript
Rights	© 2022 IEEE. Reproduced in accordance with the publisher's self-archiving policy., Unspecified
