21

The Impact of the Retrieval Text Set for Text Sentiment Classification With the Retrieval-Augmented Language Model REALM / Effekten av hämtningstextsetet för sentimenttextklassificering med den hämtningsförstärkta språkmodellen REALM

Blommegård, Oscar January 2023
Large Language Models (LLMs) have demonstrated impressive results across various language technology tasks. By training on large corpora of diverse text collected from the internet, these models learn to process text effectively and acquire comprehensive world knowledge. However, this knowledge is stored implicitly in the parameters of the model, and ever-larger networks must be trained to capture more information. Retrieval-augmented language models have been proposed as a way of improving the interpretability and adaptability of standard language models by consulting a separate retrieval text set at application time. These models have demonstrated state-of-the-art results on knowledge-intensive tasks such as question answering and fact checking, but their effectiveness for text classification remains unexplored. This study investigates the impact of the retrieval text set on the performance of the retrieval-augmented language model REALM for sentiment text classification. The results indicate that adding retrieval text data fails to improve REALM's predictions for sentiment text classification. This outcome is mainly due to the difference in how the retrieval mechanism functions during pre-training and fine-tuning. During pre-training, the neural knowledge retriever focuses on retrieving factual knowledge such as dates, cities, and names to enhance the model's predictions; during fine-tuning, the retriever aims to retrieve texts that can strengthen the prediction for the sentiment classification task. The findings suggest that retrieval-augmented models may hold limited potential for improving performance on text sentiment classification tasks.
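
To make the retrieve-then-predict idea concrete, the following is a minimal sketch of retrieval-augmented sentiment classification. It is not REALM itself (REALM learns a dense neural retriever jointly with the language model over a Wikipedia-scale corpus); here a TF-IDF retriever and a logistic-regression classifier stand in, and all texts, labels, and the augment() helper are invented for illustration.

```python
# Minimal sketch of the retrieve-then-predict pattern behind
# retrieval-augmented classification. REALM itself uses a learned dense
# retriever over a large corpus; a TF-IDF retriever and a logistic
# regression classifier stand in here, and all texts are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical retrieval text set (the knowledge the model can consult).
retrieval_texts = [
    "The film was praised for its warm, uplifting story.",
    "Critics called the plot dull and the acting wooden.",
    "The release date was moved to October 2021.",
]

# Hypothetical labelled sentiment data.
train_texts = ["An uplifting, warm experience.", "Dull plot, wooden acting."]
train_labels = [1, 0]  # 1 = positive, 0 = negative

vectorizer = TfidfVectorizer().fit(retrieval_texts + train_texts)
retrieval_vecs = vectorizer.transform(retrieval_texts)

def augment(text: str, k: int = 1) -> str:
    """Append the k most similar retrieval texts to the input."""
    sims = cosine_similarity(vectorizer.transform([text]), retrieval_vecs)[0]
    top = sims.argsort()[::-1][:k]
    return text + " " + " ".join(retrieval_texts[i] for i in top)

augmented_train = [augment(t) for t in train_texts]
clf = LogisticRegression().fit(vectorizer.transform(augmented_train), train_labels)

query = "A warm and uplifting film."
pred = clf.predict(vectorizer.transform([augment(query)]))[0]
print("augmented input:", augment(query))
print("prediction:", "positive" if pred else "negative")
```
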
22

Aspektbaserad Sentimentanalys för Business Intelligence inom E-handeln / Aspect-Based Sentiment Analysis for Business Intelligence in E-commerce

Eriksson, Albin, Mauritzon, Anton January 2022
Many companies strive to make data-driven decisions. To achieve this, they need to explore new tools for Business Intelligence. The aim of this study was to examine the performance and usability of aspect-based sentiment analysis as a tool for Business Intelligence in E-commerce. The study was conducted in collaboration with Ellos Group AB, which supplied anonymous customer feedback data. The implementation consists of two parts: aspect extraction and sentiment classification. The first part, aspect extraction, was implemented using dependency parsing and various aspect grouping techniques. The second part, sentiment classification, was implemented using the language model KB-BERT, a Swedish version of BERT. The method for aspect extraction achieved a satisfactory precision of 79.5% but only a recall of 27.2%. Moreover, the result for sentiment classification was unsatisfactory, with an accuracy of 68.2%. Although the results fall short of expectations, we conclude that aspect-based sentiment analysis in general is a useful tool for Business Intelligence, both as a means of generating customer insights from previously unused data and as a way to increase productivity. However, it should only be used as a supportive tool and not replace existing processes for decision-making.
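
As an illustration of the dependency-parsing step, the sketch below extracts (aspect, opinion) pairs from two common dependency patterns and scores them with a tiny polarity lexicon. It is only a sketch: the thesis worked on Swedish customer feedback and used KB-BERT for classification, whereas this example uses an English spaCy model and an invented lexicon as placeholders.

```python
# Sketch of dependency-based aspect extraction followed by a toy sentiment
# step. The thesis classified Swedish text with KB-BERT; here an English
# spaCy model and a tiny polarity lexicon are used purely for illustration
# (first run: python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

# Hypothetical polarity lexicon standing in for a trained classifier.
LEXICON = {"fast": 1, "great": 1, "slow": -1, "poor": -1}

def aspect_opinions(text):
    """Yield (aspect, opinion, polarity) triples from two dependency patterns."""
    doc = nlp(text)
    for token in doc:
        # Pattern 1: adjectival modifier of a noun, e.g. "fast shipping".
        if token.dep_ == "amod" and token.head.pos_ == "NOUN":
            yield token.head.text, token.text, LEXICON.get(token.lemma_, 0)
        # Pattern 2: predicative adjective, e.g. "the fabric is poor".
        if token.dep_ == "acomp":
            subjects = [c for c in token.head.children if c.dep_ == "nsubj"]
            if subjects:
                yield subjects[0].text, token.text, LEXICON.get(token.lemma_, 0)

for triple in aspect_opinions("Fast shipping, but the fabric is poor."):
    print(triple)
```
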
23

透過圖片標籤觀察情緒字詞與事物概念之關聯 / An analysis on association between emotion words and concept words based on image tags

彭聲揚, Peng, Sheng-Yang Unknown Date
This study starts from psychology and asks how emotional states should be categorized. To link emotion with semantics, we treat images as stimuli for emotional states and sample and observe content created and shared by the Flickr community. Using the basic emotion words from psychological research together with their part-of-speech variants, we extracted 12,000 photos carrying word tags and computed the co-occurrence of tag words with emotion-category words, as well as association rules between them. In addition, using the semantic differential scale, we propose a new coordinate-based classification of polarity and intensity. Through frequency-threshold filtering, part-of-speech annotation, and merging words by stem, we obtained 272 concept words with emotional polarity from 65,983 unique text tags, together with association rules for positive and negative polarity. To verify through images whether these words are associated with the emotional states that image content evokes in people, we selected 42 photos for the final comparison through three query channels: single-word Flickr search, single-word Google Image search, and our own composite photo-tag indicators (the proportion of emotion words and community-filtering parameters). Using the semantic differential scale, we measured whether the three groups of photos, as judged by 136 users, fit the intensity-polarity model proposed earlier. The experimental results show that our method returns results similar to Google Image; the user survey supports our method's judgments of positive and negative polarity, and it separates strong from weak emotions better than Google.
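
The co-occurrence and association-rule step can be sketched in a few lines; the photo tag sets and the emotion word list below are invented, and only support and confidence for rules of the form tag → emotion are computed.

```python
# Toy sketch of the tag co-occurrence and association-rule computation: for
# each candidate concept tag, count how often it co-occurs with an emotion
# word across photos and derive support/confidence for "tag -> emotion".
# All tag sets below are invented for illustration.
from collections import Counter
from itertools import product

EMOTION_WORDS = {"happy", "joy", "sad", "fear"}

# Each set is the tag list of one hypothetical Flickr photo.
photos = [
    {"beach", "sunset", "happy"},
    {"beach", "friends", "joy"},
    {"rain", "alley", "sad"},
    {"beach", "surf"},
]

pair_counts = Counter()
tag_counts = Counter()
for tags in photos:
    concepts = tags - EMOTION_WORDS
    emotions = tags & EMOTION_WORDS
    tag_counts.update(concepts)
    pair_counts.update(product(concepts, emotions))

# Rule tag -> emotion: support = P(tag, emotion), confidence = P(emotion | tag).
for (tag, emo), n in pair_counts.items():
    support = n / len(photos)
    confidence = n / tag_counts[tag]
    print(f"{tag} -> {emo}: support={support:.2f}, confidence={confidence:.2f}")
```
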
24

Sentiment-Driven Topic Analysis Of Song Lyrics

Sharma, Govind 08 1900
Sentiment Analysis is an area of Computer Science that deals with the impact a document makes on a user. The field is further sub-divided into Opinion Mining and Emotion Analysis, the latter of which is the basis for the present work. Work on songs is aimed at building affective interactive applications such as music recommendation engines. Using song lyrics, we are interested in both supervised and unsupervised analyses, each of which has its own pros and cons. For an unsupervised analysis (clustering), we use a standard probabilistic topic model called Latent Dirichlet Allocation (LDA). It mines topics from songs, which are probability distributions over the vocabulary of words. Some of the topics appear sentiment-based, motivating us to continue with this approach. We evaluate our clusters against a gold dataset collected from a suitable website and obtain positive results. This approach is useful in the absence of a labelled dataset. In another part of our work, we argue that some supervision is unavoidable, in the sense that the returned topics must still be analysed manually. Further, we also use explicit supervision in the form of a training dataset for a classifier to learn sentiment-specific classes. This analysis helps reduce dimensionality and improve classification accuracy. We obtain excellent dimensionality reduction using Support Vector Machines (SVM) for feature selection. For re-classification, we use the Naive Bayes Classifier (NBC) and SVM, both of which perform well. We also use Non-negative Matrix Factorization (NMF) for classification, but observe that its results coincide with those of NBC, with no exceptions. This drives us towards establishing a theoretical equivalence between the two.
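
The unsupervised part of such a pipeline can be sketched with scikit-learn's LDA implementation as below; the lyric snippets and the choice of two topics are placeholders, not the thesis's actual corpus or settings.

```python
# Minimal sketch of topic mining over lyrics: fit LDA on a bag-of-words
# representation and print the top words per topic. The lyric snippets and
# the number of topics are invented for illustration.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

lyrics = [
    "love heart forever together tonight",
    "tears alone cry rain goodbye",
    "dance party night lights music",
    "broken heart tears goodbye alone",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(lyrics)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[::-1][:5]
    print(f"topic {k}:", ", ".join(terms[i] for i in top))
```
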
25

All Negative on the Western Front: Analyzing the Sentiment of the Russian News Coverage of Sweden with Generic and Domain-Specific Multinomial Naive Bayes and Support Vector Machines Classifiers / På västfronten intet gott: attitydanalys av den ryska nyhetsrapporteringen om Sverige med generiska och domänspecifika Multinomial Naive Bayes- och Support Vector Machines-klassificerare

Michel, David January 2021
This thesis explores to what extent Multinomial Naive Bayes (MNB) and Support Vector Machines (SVM) classifiers can be used to determine the polarity of news, specifically the news coverage of Sweden by the Russian state-funded news outlets RT and Sputnik. Three experiments are conducted.  In the first experiment, an MNB and an SVM classifier are trained with the Large Movie Review Dataset (Maas et al., 2011) with a varying number of samples to determine how training data size affects classifier performance.  In the second experiment, the classifiers are trained with 300 positive, negative, and neutral news articles (Agarwal et al., 2019) and tested on 95 RT and Sputnik news articles about Sweden (Bengtsson, 2019) to determine if the domain specificity of the training data outweighs its limited size.  In the third experiment, the movie-trained classifiers are put up against the domain-specific classifiers to determine if well-trained classifiers from another domain perform better than relatively untrained, domain-specific classifiers.  Four different types of feature sets (unigrams, unigrams without stop words removal, bigrams, trigrams) were used in the experiments. Some of the model parameters (TF-IDF vs. feature count and SVM’s C parameter) were optimized with 10-fold cross-validation.  Other than the superior performance of SVM, the results highlight the need for comprehensive and domain-specific training data when conducting machine learning tasks, as well as the benefits of feature engineering, and to a limited extent, the removal of stop words. Interestingly, the classifiers performed the best on the negative news articles, which made up most of the test set (and possibly of Russian news coverage of Sweden in general).
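
The experimental setup lends itself to a short sketch: TF-IDF n-gram features feed either a Multinomial Naive Bayes or a linear SVM classifier, with the SVM's C parameter tuned by cross-validation. The texts and labels below are placeholders; the thesis trained on the Large Movie Review Dataset and on domain-specific news articles.

```python
# Sketch of the experimental setup: TF-IDF n-gram features feeding either
# Multinomial Naive Bayes or a linear SVM, with the SVM's C parameter tuned
# by cross-validation. The texts and labels are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

texts = ["great cooperation praised", "harsh criticism and failure",
         "successful visit welcomed", "scandal condemned strongly"] * 5
labels = [1, 0, 1, 0] * 5  # 1 = positive, 0 = negative

nb = Pipeline([("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
               ("clf", MultinomialNB())]).fit(texts, labels)

svm = Pipeline([("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
                ("clf", LinearSVC())])
search = GridSearchCV(svm, {"clf__C": [0.01, 0.1, 1, 10]}, cv=5).fit(texts, labels)

print("NB  train acc:", nb.score(texts, labels))
print("SVM best C:", search.best_params_["clf__C"],
      "cv acc:", round(search.best_score_, 3))
```
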
