  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Sentiment analysis : text, pre-processing, reader views and cross domains

Haddi, Emma January 2015
Sentiment analysis has attracted significant attention because its results can benefit a wide variety of applications, such as news analytics, marketing, question answering and knowledge management. The field is still at an early stage of development, however, and improvements are urgently needed on many issues, particularly the performance of sentiment classification. This thesis outlines three key challenges affecting sentiment classification and presents innovative ways of addressing them. First, text pre-processing is found to be crucial to sentiment classification performance; consequently, a combination of several existing pre-processing methods is proposed for the sentiment classification process. Second, text properties of financial news are used to build models that predict sentiment. Two models are proposed: one uses financial events to predict financial news sentiment, and the other takes a new perspective that considers the view of the opinion reader, as opposed to the classic approach that examines the view of the opinion holder. A new method to capture reader sentiment is suggested. Third, financial news stretches over a number of domains, and inferring sentiment across different domains is very challenging. Various approaches to cross-domain sentiment analysis are proposed and critically evaluated.
2

DEFENDING BERT AGAINST MISSPELLINGS

Nivedita Nighojkar (8063438) 06 April 2021
Defending models against adversarial attacks in Natural Language Processing is challenging because text data is discrete. However, given the variety of Natural Language Processing applications, it is important to make text processing models more robust and secure. This paper aims to develop techniques that help text processing models such as BERT withstand adversarial samples that contain misspellings. The resulting models are more robust than off-the-shelf spelling checkers.
3

Comparing LSTM and GRU for Multiclass Sentiment Analysis of Movie Reviews.

Sarika, Pawan Kumar January 2020
Today, we live in a data-driven world. With the surge in data generation, there is a need for efficient and accurate techniques to analyze data. One kind of data that needs to be analyzed is text reviews of movies. Rather than classifying the reviews as positive or negative, we classify the sentiment of the reviews on a scale of one to ten. In doing so, we compare two recurrent neural network algorithms, long short-term memory (LSTM) and gated recurrent unit (GRU). The main objective of this study is to compare the accuracies of LSTM and GRU models. To train the models, we collected data from two different sources. To filter the data, we used Porter stemming and stop-word removal. We coupled LSTM and GRU with convolutional neural networks to increase performance. In our experiments, LSTM performed better at predicting border values, whereas GRU predicted every class equally. Overall, GRU predicted multiclass text data of movie reviews slightly better than LSTM. GRU was computationally expensive compared to LSTM.
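The filtering step mentioned in the abstract (stop-word removal followed by stemming) can be sketched as below. This is a rough illustration only: the stop-word list is a small hypothetical sample, and `crude_stem` is a simplified suffix stripper standing in for the Porter stemmer, not an implementation of it.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "this", "it"}

# Suffixes tried in order; a stand-in for Porter stemming, NOT the real algorithm.
SUFFIXES = ("ing", "ed", "ly", "s")


def crude_stem(token: str) -> str:
    """Strip the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(review: str) -> list[str]:
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("The acting was amazing and the plot moved quickly"))
```

In practice a library stemmer would replace `crude_stem`; the resulting token lists are what a model such as an LSTM or GRU would consume after being mapped to integer indices.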
4

Combining Lexicon- and Learning-based Approaches for Improved Performance and Convenience in Sentiment Classification

Sommar, Fredrik, Wielondek, Milosz January 2015
Sentiment classification is the process of categorizing data based on its polarity, with a wide array of applications across several industries. This report examines a combination of two prominent approaches to sentiment classification, using a lexicon of weighted words and machine learning respectively. These approaches are compared with the combined hybrid approach in order to give an account of their relative strengths and weaknesses. When run on a set of IMDb movie reviews, the results indicate that the hybrid model performs better than the lexicon-based approach but is in turn outperformed by the learning-based approach. However, the gain in convenience from eliminating the need for training data makes the hybrid model an appealing alternative to the other approaches, with a slight trade-off in performance. / Classifying text into categories based on the sentiment it expresses is a topical area today and can be applied across many industries. The report examines a combination of the two prominent approaches to this type of classification, based respectively on a lexicon of defined word weights and on machine learning. This hybrid solution is compared with the two other approaches in order to present their relative strengths and weaknesses. On a dataset of film reviews from IMDb, the machine learning classifier achieves the best results, followed by the hybrid solution and lastly the lexicon-based solution. Nevertheless, the hybrid solution may be preferable in situations where it is infeasible or unreasonable to prepare training data for the machine learning classifier, albeit with some sacrifice in performance.
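A lexicon of weighted words can be applied to text roughly as in the sketch below. The lexicon entries, the threshold, and the `pseudo_label` step are all assumptions made for illustration: the abstract does not describe how the report combines the two approaches, and using lexicon scores to label unlabelled reviews for a learner is only one plausible reading of "eliminating the need for training data".

```python
# Hypothetical weighted lexicon; real lexicons contain thousands of entries.
LEXICON = {"great": 2.0, "good": 1.0, "dull": -1.0, "terrible": -2.0}


def lexicon_score(text: str) -> float:
    """Sum the weights of all known words; unknown words contribute 0."""
    return sum(LEXICON.get(token, 0.0) for token in text.lower().split())


def classify(text: str) -> str:
    """Label text by the sign of its lexicon score (ties count as positive)."""
    return "positive" if lexicon_score(text) >= 0 else "negative"


def pseudo_label(reviews: list[str], threshold: float = 1.0) -> list[tuple[str, str]]:
    """Keep only reviews the lexicon scores confidently, paired with a label.

    Assumption: the labelled pairs could then serve as training data for a
    machine learning classifier, so no hand-labelled data is required.
    """
    return [
        (review, classify(review))
        for review in reviews
        if abs(lexicon_score(review)) >= threshold
    ]


print(pseudo_label(["great film", "a film", "terrible and dull"]))
```

The threshold trades coverage for label quality: a higher value discards more reviews but keeps only those where the lexicon signal is strong.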
5

Sentiment Analysis Of IMDB Movie Reviews : A comparative study of Lexicon based approach and BERT Neural Network model

Domadula, Prashuna Sai Surya Vishwitha, Sayyaparaju, Sai Sumanwita January 2023
Background: Movies have become an important marketing and advertising tool that can influence consumer behaviour and trends. Reading film reviews is an important part of watching a movie, as it can help viewers gain a general understanding of the film and provide filmmakers with feedback on how their work is being received. Sentiment analysis is a method of determining whether a review has positive or negative sentiment, and this study investigates a machine learning method for classifying sentiment from film reviews. Objectives: This thesis aims to perform comparative sentiment analysis on textual IMDb movie reviews using lexicon-based and BERT neural network models. Different performance evaluation metrics are then used to identify the most effective learning model. Methods: This thesis employs a quantitative research technique, with data analysed using traditional machine learning. The labelled data set comes from the Kaggle website (https://www.kaggle.com/datasets) and contains movie review information. Algorithms such as the lexicon-based approach and the BERT neural network are trained using the chosen IMDb movie reviews data set. To discover which model performs best at predicting sentiment, the constructed models are assessed on the test set using evaluation metrics such as accuracy, precision, recall and F1 score. Results: In the experiments conducted, the BERT neural network model is the most efficient algorithm for classifying the IMDb movie reviews into positive and negative sentiments. This model achieved the highest accuracy score of 90.67% over the trained data set, followed by the BoW model with an accuracy of 79.15% and the TF-IDF model with 78.98%. The BERT model has the best precision and recall, with 0.88 and 0.92 respectively, followed by the BoW and TF-IDF models. The BoW model has a precision and recall of 0.79, and the TF-IDF model has a precision of 0.79 and a recall of 0.78. The BERT model also has the highest F1 score of 0.88, followed by the BoW model with an F1 score of 0.79 and the TF-IDF model with 0.78. Conclusions: Of the two models evaluated, the lexicon-based approach and the BERT transformer neural network, the BERT neural network is the most efficient, with a good performance score on the measured performance criteria.
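The evaluation metrics named in the abstract follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the counts in the usage lines are made-up illustrative values, not figures from the thesis.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall and F1 (their harmonic mean) for the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
print(accuracy(tp=80, tn=80, fp=20, fn=20))
print(precision_recall_f1(tp=80, fp=20, fn=20))
```

Reported figures can differ from this per-class calculation when a study averages the metrics across classes, so the averaging scheme is worth checking when comparing models.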
