
dc.contributor.advisor: Nunavath, Vimala
dc.contributor.author: Tabassum, Israt
dc.date.accessioned: 2024-06-27T16:42:59Z
dc.date.available: 2024-06-27T16:42:59Z
dc.date.issued: 2024
dc.identifier: no.usn:wiseflow:7125873:59380203
dc.identifier.uri: https://hdl.handle.net/11250/3136316
dc.description.abstract: Social media sites like Facebook, Instagram, Twitter, and LinkedIn are important channels for content creation and distribution that have a large impact on business, politics, and interpersonal relationships. People spend their free time on social media uploading pictures, posts, and videos to share their daily activities with other people and to view other people's activities. Due to their concise and captivating format, short videos have recently become increasingly popular on these platforms. However, they frequently receive comments that mix positive and negative content and take the form of text, images, and multi-modal data such as memes, which makes it more difficult to recognize and deal with instances of cyberbullying. Cyberbullying is a serious problem in which victims of abusive online communication can experience despair, anxiety, and loneliness. Many studies have been conducted on the classification of cyberbullying, but the majority concentrated on binary classification of multi-modal data or multi-class classification of textual data. Despite significant advances in deep learning techniques for cyberbullying classification, there was a gap in the multi-class classification of cyberbullying using multi-modal data. The goal of this thesis was to close this gap by accurately classifying cyberbullying across multi-modal data types using a hybrid (RoBERTa+ViT) deep learning approach that combined a Vision Transformer (ViT) for the image part and RoBERTa for the text part of the multi-modal data.
Two datasets were used in this thesis to classify cyberbullying: a private dataset collected from comments on social media videos and a public dataset downloaded from existing research. Three sets of experiments were conducted for multi-class classification of cyberbullying. The first set used text data with deep learning models such as LSTM, GRU, RoBERTa, BERT, DistilBERT, and a hybrid (CNN+LSTM) model for the public dataset and the RoBERTa model for the private dataset; the second set used image data with deep learning models such as ResNet-50, CNN, and ViT for the public dataset and the ViT model for the private dataset; and the last set used multi-modal data (i.e., memes) from both the public and private datasets with a hybrid (RoBERTa+ViT) deep learning model. Using the public dataset, we trained nine deep learning models: ResNet-50, CNN, and ViT for image data, and LSTM, GRU, RoBERTa, BERT, DistilBERT, and the hybrid (CNN+LSTM) model for textual data. The experimental results showed that the ViT model obtained an accuracy of 99.5% and an F1-score of 0.995 for multi-class classification on image data, whereas the RoBERTa model performed better than the other models on textual data with an accuracy of 99.2% and an F1-score of 0.992. Based on this outcome, the RoBERTa model for text data and the ViT model for image data were also applied to the private dataset; the RoBERTa model attained an F1-score of 0.986 and an accuracy of 98.6%, whereas the ViT model achieved an F1-score of 0.9319 and an accuracy of 93.20% on image data. For multi-modal data, a hybrid model with a late-fusion module (RoBERTa+ViT) was developed that combined the RoBERTa and ViT models for multi-class classification of cyberbullying and attained accuracies of 99.24% and 96.01% and F1-scores of 0.992 and 0.9599 on the public and private datasets, respectively.
From the obtained results, it can be concluded that deep learning models such as RoBERTa and the Vision Transformer (ViT) are very effective for classifying various forms of cyberbullying. RoBERTa works well with text, producing nearly perfect results, whereas ViT is particularly strong at handling images. Furthermore, when these models were combined into a hybrid (RoBERTa+ViT) model, they became even more effective at classifying cyberbullying in multi-modal data, such as memes.
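
As a rough illustration of the late-fusion architecture described in the abstract, the following PyTorch sketch combines a RoBERTa text encoder with a ViT image encoder by concatenating their [CLS] embeddings before a shared classification head. This is a minimal sketch, not the thesis's exact implementation: the checkpoint names, hidden sizes, dropout rate, fusion head layout, and the number of cyberbullying classes are illustrative assumptions.

# Hedged sketch of a late-fusion RoBERTa+ViT classifier for multi-class
# cyberbullying detection on memes (text + image). Details such as the
# pretrained checkpoints and the 5-class output are assumptions, not
# taken from the thesis.
import torch
import torch.nn as nn
from transformers import RobertaModel, ViTModel

class HybridRobertaViT(nn.Module):
    def __init__(self, num_classes: int = 5):
        super().__init__()
        # Pretrained encoders; both base variants produce 768-d token embeddings.
        self.text_encoder = RobertaModel.from_pretrained("roberta-base")
        self.image_encoder = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
        # Late fusion: concatenate the two modality embeddings, then classify.
        self.classifier = nn.Sequential(
            nn.Linear(768 + 768, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, num_classes),
        )

    def forward(self, input_ids, attention_mask, pixel_values):
        # [CLS]-token embedding from the text branch.
        text_feat = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state[:, 0]
        # [CLS]-token embedding from the image branch.
        image_feat = self.image_encoder(pixel_values=pixel_values).last_hidden_state[:, 0]
        # Fuse the two modalities and return class logits.
        fused = torch.cat([text_feat, image_feat], dim=-1)
        return self.classifier(fused)

In practice the text branch would be fed tokenized comment text (e.g., from a RoBERTa tokenizer) and the image branch preprocessed meme images (e.g., from a ViT image processor); training with a cross-entropy loss over the cyberbullying classes then optimizes both encoders and the fusion head jointly.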
dc.language: eng
dc.publisher: University of South-Eastern Norway
dc.title: A Hybrid Deep-Learning Approach for Multi-class Classification of Cyberbullying using Social Media's Multi-modal Data
dc.type: Master thesis

