Global ETD Search

21	Finfördelad Sentimentanalys : Utvärdering av neurala nätverksmodeller och förbehandlingsmetoder med Word2Vec / Fine-grained Sentiment Analysis : Evaluation of Neural Network Models and Preprocessing Methods with Word2Vec Phanuwat, Phutiwat January 2024
Sentiment analysis is a technique for automatically identifying the emotional tone of a text. Typically, text is classified as positive, neutral, or negative. The downside of this scheme is that nuances are lost when text is sorted into only three categories. An extension is to add two further categories: very positive and very negative. The challenge with this five-class scheme is that high performance becomes harder to achieve as the number of categories grows, which motivates exploring different methods for the problem. The purpose of the study is therefore to evaluate various classifiers, such as MLP, CNN, and Bi-GRU, in combination with word2vec for classifying the sentiment of text into five categories. The study also explores which preprocessing method yields higher performance for word2vec. The models were developed using the SST dataset, a well-known dataset in fine-grained sentiment analysis. To determine which preprocessing method works best for word2vec, the dataset was preprocessed in four ways: simple preprocessing (EF), EF combined with stop-word removal (EF+Without Stopwords), EF combined with lemmatization (EF+Lemmatization), and EF combined with both (EF+Without Stopwords/Lemmatization). Dropout was used to help the models generalize better, and training was regulated with early stopping. To determine which classifier performs best, the preprocessing method with the highest performance was used and the optimal hyperparameters were explored.
The metrics used in the study to evaluate performance are accuracy and F1-score. The results showed that the EF method performed best among the preprocessing methods explored. The model with the highest accuracy and F1-score in the study was Bi-GRU.
Keywords: fine-grained sentiment analysis, machine learning, word2vec, MLP, CNN, Bi-GRU
Subject: Computer Sciences
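As an illustration of the two metrics named in the abstract above, the following is a minimal sketch of accuracy and F1-score for a five-class task. The abstract does not say how F1 was averaged across classes, so macro averaging is an assumption here, and the integer label encoding 0 to 4 is hypothetical.

```python
# Assumption: the five sentiment classes are encoded as integers 0..4
# (very negative .. very positive); the thesis may encode them differently.
LABELS = [0, 1, 2, 3, 4]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that never occurs and is never predicted contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)
```

Macro averaging weights every class equally, which matters when the extreme classes are rarer than the neutral one; a weighted or micro average would give different numbers on the same predictions.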
22	Coronavirus-Related Sentiment and Stock Prices : Measuring Sentiment Effects on Swedish Stock Indices / Coronavirus-relaterat sentiment och aktiepriser : En studie av sentimenteffekter på svenska aktieindex Piksina, Olga; Vernholmen, Patricia January 2020
This thesis examines the effect of coronavirus-related sentiment on Swedish stock market returns during the coronavirus pandemic. We study returns on the large cap and small cap price indices OMXSLCPI and OMXSSCPI during the period January 2, 2020 to April 30, 2020. Coronavirus sentiment proxies are constructed from news articles clustered into topics using latent Dirichlet allocation and scored through sentiment analysis. The impact of the sentiment proxies on the stock indices is then measured using a dynamic multiple regression model. The results show that the proxies representing fundamental changes in our model (Swedish Politics and Economic Policy) have a strongly significant impact on the returns of both indices, which is consistent with financial theory. We also find that the sentiment proxies Sport and Coronavirus Spread are statistically significant and impact Swedish stock prices. This implies that coronavirus-related news influenced market sentiment in Sweden during the research period and could be exploited to uncover arbitrage. Finally, the amount of sentiment-inducing news published daily is shown to have an impact on stock price volatility.
Keywords: market sentiment, behavioural finance, market efficiency, coronavirus, Swedish stock market, text analytics, sentiment analysis, news mining
Subject: Engineering and Technology
23	Evaluation of Approaches for Representation and Sentiment of Customer Reviews / Utvärdering av tillvägagångssätt för representation och uppfattning om kundrecensioner Giorgis, Stavros January 2021
Classifying sentiment in customer reviews is a real-world application for many companies that offer text analytics and opinion extraction on reviews from domains such as consumer electronics, hotels, restaurants, and car rental agencies. Recent progress in natural language processing has produced many new state-of-the-art approaches for representing the meaning of sentences, phrases, and words using vector space models, so-called embeddings. In this thesis, we evaluated the most current and most popular text representation techniques against traditional methods as a baseline. The evaluation compares automatic analyses with human judgments on customer reviews of varying length from different domains, provided by a text analysis company. By exploring candidate training datasets, we evaluated which were the most suitable for this specific task. Furthermore, we explored different techniques that could be used to alter a language model's decisions without retraining it. Finally, all the methods were evaluated for time performance and resource requirements, giving an overall experimental assessment that could help the company decide which technique is the most appropriate replacement for its system in a production environment.
Keywords: machine learning, nlp, text analytics, sentiment analysis, transformers, tfidf, bow, fasttext, word2vec, bert, xlnet, roberta
Subject: Computer and Information Sciences
24	The Exposed Gender : The representation of trans gender in Czech media 2017-2020, a corpus-based discourse analysis Thál, Jonas January 2022
This thesis explores the representation of transgender individuals in online Czech discourses. The study uses corpus linguistics to analyze news media and Facebook in order to understand the attitudes towards transgender individuals in these discourses. Previous research on gender in Czech and other languages is considered in order to contextualize the findings. The research reflects on the Czech Press Act and Audio-visual Act, which provide guidelines for objectivity and balanced reporting in the media. Using data from the Czech National Corpus, the study aims to shed light on the representation of transgender individuals in online Czech discourses, to expose potential hate speech towards trans people or imbalance between the discourses, and to reflect on whether the requirements of Czech legislation are mirrored in these online environments.
Keywords: Corpus, discourse, gender, transgender, Czech media, Facebook, news, sentiment analysis, representation, hate speech, quantitative analysis
Subject: Specific Languages; Gender Studies
25	Best Practices for Innovation Management. : A Study on Large Companies in Sweden. / God innovationsledningspraxis. : En studie om stora företag i Sverige. CELUKANOVS, ANDREJS; WATTLE BJÖRK, SEBASTIAN January 2019
The overall aim of this thesis was to identify and analyze good innovation management practices in Sweden's most innovative large companies, excluding government-owned organizations. Out of 500 large organizations in Sweden, the 25 most innovative companies were ranked based on over 7,000 printed press articles from 2018 available through Retriever Media. The companies are ranked by their innovation score, which is calculated from the number of articles a company is mentioned in, adjusted for company size, and multiplied by the mean sentiment score. The top 25 companies from the ranking were compared with 25 reference companies that received a lower innovation score and are active within the same industry according to the Swedish Standard Industrial Classification (SNI) number. Good innovation management practices were analyzed based on 14 qualitative interviews in 12 of the top 15 ranked companies and a quantitative survey answered by 20 top-ranked and 17 reference companies. The interviews were semi-structured with open-ended questions to identify the practices used and the reasoning behind them. Spearman's rank correlation was used to investigate whether there was any correlation between a company's innovation score, the mean performance score, and the mean importance score rated by respondents. The company case studies provide authentic examples of how and when different methods and concepts are used within industry. However, while theoretical frameworks are often strictly defined and described in isolation, the interviews showed that practice in industry is rather the opposite: in many of the interviewed companies, frameworks and methods are modified, combined, and constantly evolving.
Aspects that the interviewees identified as important for an innovative company are that innovation and change should be iterative, decentralized, and started on a small scale while receiving full support from top management. Examples of identified practices are: the innovation vision is used in the decision-making process for new ideas; keywords connected to innovation are used to guide new aspirations; and there is an overall aim to become industry and/or digital leaders. Although the interviewed companies had similar innovation management practices, these were usually modified to fit the company's own organization and industry. The interviews contributed an interesting collection of practices in their authentic setting from which other companies could draw inspiration. Lastly, a handbook was created describing how to conduct the innovation ranking annually, including a description of how to use the software as well as the required code.
Keywords: Innovation, Innovation Management, Best Practices for Innovation Management, Innovation Practices, ISO 56002, Ten Types of Innovation, Innovation Ranking, Sweden's Most Innovative Companies, Organizational Innovativeness, Innovation Management System, Sentiment analysis, Opinion mining
Subject: Engineering and Technology
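The innovation score and the Spearman correlation mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the article count is adjusted for company size, so dividing by the number of employees is a guess, and the function and parameter names are hypothetical rather than taken from the thesis handbook.

```python
def innovation_score(article_count, employees, mean_sentiment):
    """Article mentions, size-adjusted, times mean sentiment.

    Assumption: "adjusted for company size" is modelled as division by
    the number of employees; the thesis may use a different adjustment.
    """
    return (article_count / employees) * mean_sentiment


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Because the correlation is computed on ranks rather than raw values, it only measures whether companies with a higher innovation score also tend to be rated higher, not whether the relationship is linear.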
26	Feature Selection for Sentiment Analysis of Swedish News Article Titles / Val av datarepresentation för sentimentsanalys av svenska nyhetsrubriker Dahl, Jonas January 2018
The aim of this study was to investigate the possibility of performing sentiment analysis on Swedish news article titles using machine learning approaches, and to find how the text is best represented for that purpose. Sentiment analysis has traditionally been conducted by part-of-speech tagging and counting word polarities, which performs well for large domains and in the absence of large sets of training data. For narrower domains and previously labeled data, supervised learning can be used. This thesis tested the performance of a convolutional neural network and a Support Vector Machine on different sets of data. The data sets were constructed to represent various language features. These included, for example, a simple unigram bag-of-words model storing word counts, a bigram bag-of-words model that captures word order, and an integer vector summarizing general properties of the title. The study concluded that each of the tested feature sets gave information about the sentiment to varying extents. The neural network approach with all feature sets combined performed better than the two annotators of the study. Despite the limited data set, overfitting did not seem to be a problem when the features were used together.
Keywords: sentiment analysis, machine learning, sentiments, ml, ai, neural networks, ann, artificial neural networks, convolutional neural network, cnn, language technology, natural language, news titles, Swedish
Subject: Computer Sciences
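The unigram and bigram bag-of-words representations described in the abstract above can be sketched as follows. This is a minimal illustration, not the thesis's feature pipeline: whitespace tokenization and lowercasing are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def tokenize(title):
    """Assumption: lowercase and split on whitespace; no stemming."""
    return title.lower().split()


def ngram_counts(tokens, n):
    """Count each run of n consecutive tokens (n=1 unigrams, n=2 bigrams)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def vectorize(counts, vocabulary):
    """Turn counts into a fixed-length count vector over a given vocabulary."""
    return [counts[term] for term in vocabulary]


tokens = tokenize("Stocks fall as stocks slide")
unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
vocabulary = sorted(unigrams)
vector = vectorize(unigrams, vocabulary)
```

Unigram counts discard word order entirely, while bigram counts retain the order of adjacent words, which is the property the abstract attributes to the bigram model.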
27	Effect of polysemy and homography on sentiment analysis / Effekten av polysemi och homografi på sentimentanalys Ljung, Oskar January 2024
This bachelor's thesis studied the difference in sentiment between different homographic or polysemous senses of individual words. It did this by training a linear regression model on a version of the British National Corpus that had been disambiguated along WordNet word senses (synsets) and analysing sentiment data from SentiWordNet. Results were partial, but indicated that word senses differ somewhat in sentiment. In the process of this study, a new and improved version of the Lesk disambiguation algorithm was also developed, named Normalised Lesk. The validation of that algorithm compared to the regular Lesk algorithm is presented here as well.
Keywords: sentiment analysis, disambiguation, vector space, polysemy, homonymy, homography, linear regression, British National Corpus, SentiWordNet, WordNet, Lesk Algorithm, Normalised Lesk
Subject: General Language Studies and Linguistics
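For readers unfamiliar with the regular Lesk algorithm mentioned in the abstract above, the following is a minimal sketch of its simplified form: choose the sense whose dictionary gloss shares the most words with the context. It is not the thesis's Normalised Lesk, which the abstract does not describe. The two glosses, the sense identifiers, and the stop-word list are hypothetical toy data, not WordNet entries.

```python
# Hypothetical toy glosses for illustration only; not WordNet synsets.
BANK_SENSES = {
    "bank.financial": "an institution that accepts deposits and lends money",
    "bank.river": "sloping land beside a body of water such as a river",
}

# Hypothetical minimal stop-word list.
STOPWORDS = frozenset({"a", "an", "and", "as", "of", "on", "such", "that", "the", "to"})


def simplified_lesk(word, context, sense_glosses, stopwords=STOPWORDS):
    """Return the sense whose gloss overlaps most with the context words.

    Ties go to the sense listed first in sense_glosses.
    """
    context_words = set(context.lower().split()) - stopwords - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        gloss_words = set(gloss.lower().split()) - stopwords
        overlap = len(context_words & gloss_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


river = simplified_lesk(
    "bank", "she sat on the bank of the river and watched the water", BANK_SENSES
)
money = simplified_lesk("bank", "he went to the bank to deposit money", BANK_SENSES)
```

Raw overlap counts tend to favour senses with longer glosses, which is one commonly noted weakness of the basic method.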
