Global ETD Search

21	Anemone: a Visual Semantic Graph Ficapal Vila, Joan January 2019 (has links) Semantic graphs have been used for optimizing various natural language processing tasks as well as augmenting search and information retrieval tasks. In most cases these semantic graphs have been constructed through supervised machine learning methodologies that depend on manually curated ontologies such as Wikipedia or similar. In this thesis, which consists of two parts, we explore in the first part the possibility of automatically populating a semantic graph from an ad hoc data set of 50 000 newspaper articles in a completely unsupervised manner. The utility of the visual representation of the resulting graph is tested on 14 human subjects performing basic information retrieval tasks on a subset of the articles. Our study shows that, for entity finding and document similarity, our feature engineering is viable and the visual map produced by our artifact is visually useful. In the second part, we explore the possibility of identifying entity relationships in an unsupervised fashion by employing abstractive deep learning methods for sentence reformulation. The reformulated sentence structures are qualitatively assessed with respect to grammatical correctness and meaningfulness as perceived by 14 test subjects. We evaluate the outcomes of this second part negatively, as they were not good enough to support any definitive conclusion, but they have opened new doors to explore. Neo4j Topic Modelling Semantic Graph Latent Dirichlet Allocation (LDA) NER Sentence Reformulation. Computer and Information Sciences Data- och informationsvetenskap
22	Influencers in Confinement : Measuring Covid-19’s Impact on Leadership in Pro-Eating Disorder Twitter Communities Ennis, Jacquelynn January 2024 (has links) The Covid-19 pandemic presented unprecedented challenges, with global lockdowns impacting individuals on a profound scale. Many took to social media to cope with feelings of anxiety and isolation. Lockdown conditions and social media carry with them particular challenges, triggers and temptations for those with eating disorders, namely in the form of online communities promoting eating disorders and disordered behaviors as a legitimate lifestyle choice. This study examines pro-eating disorder Twitter communities before, during and after the initial Covid-19 lockdown (mid-March to May 2020) to assess the influence of confinement on leadership dynamics and content trends. Using data obtained through Twitter’s Academic API, I constructed monthly retweet network time-slices spanning from November 2019 to September 2020. Through social network analysis and analysis of the turnover rates of top users, the evolution of influential users was assessed to test whether the circumstances created by Covid-19 restrictions would disrupt the established leadership paradigm, or whether leadership would remain stable over the period, as the literature on preferential attachment in scale-free networks would lead one to expect. Contrary to expectations, influential users exhibited high turnover throughout the period and the network showed no tendency towards preferential attachment or any scale-free behavior in degree distributions. The high rate of leader turnover further increased in May, and a higher proportion of new users achieved the highest number of in-degree ties in the latter months, but this hint at a cohort shift did not align with the Covid-19 lockdown as predicted, instead occurring at the end of lockdown and continuing until the end of the studied period. 
Ultimately, users’ mostly fleeting popularity was largely based on the current content interests of the group rather than the individual user’s network position. The increase in activity predicted to co-occur with Covid-19 restrictions did not materialize until the summer months and therefore cannot be definitively linked to lockdown. The fluctuations in topic popularity detected in the topic model suggest a possible seasonal component to the rhythms of this community that requires further research. This exclusive longitudinal analysis of retweet networks as they were affected by Covid-19 lockdown conditions challenges previous research on influence in social networks and online communities with findings of more dynamic leadership. Understanding the influence dynamics of this community can inform efforts to combat the spread of potentially harmful content and provide valuable insights for eating disorder specialists navigating the influences that may be affecting their patients. Social Network Analysis Twitter Pro-eating Disorder Eating Disorders Covid-19 Topic Modelling Online Communities Sociology Sociologi
23	Intersecting Identities : A Computational Exploration of Gender and Race in The Guardian’s Political Coverage, 2017 – 2022 Sampa, Vasiliki January 2024 (has links) This study examines The Guardian’s portrayal of intersectional feminism, with a focus on gender and race, analysing how social movements, particularly Black Lives Matter, influence its political coverage. Drawing on Kimberlé Crenshaw’s concept of intersectionality, which recognizes the interconnected nature of various forms of oppression and privilege, the research employs a combination of quantitative and qualitative methods to analyse 647 political articles. Quantitative methods, including topic modeling and keyword frequency analysis, provide the structural framework of the thesis. Topic modeling reveals twenty topics, and keyword frequency analysis highlights nine keywords related to intersectional feminism and their prevalence. Qualitative methods, such as collocation analysis and close reading, focus particularly on “gender” and “race”. Close reading is used for a deeper examination at every step of the analysis. Despite variations in theme, certain subjects such as the gender gap and gender identity recur consistently, underscoring their enduring significance. Discussions related to Black Lives Matter show spikes in coverage post-2020, indicating an increased emphasis on diversity and racial justice themes. However, the infrequent use of the term “intersectionality” suggests a potential disparity between the conceptual framework and its direct representation in The Guardian’s political articles. Intersectional feminism gender race Black Lives Matter distant reading topic modelling Other Humanities not elsewhere specified Övrig annan humaniora
24	An Approach to Extending Ontologies in the Nanomaterials Domain Leshi, Olumide January 2020 (has links) As recently as the last decade or two, data-driven science workflows have become increasingly popular and semantic technology has been relied on to help align often parallel research efforts in the different domains and foster interoperability and data sharing. However, a key challenge is the size of the data and the pace at which it is being generated, so much so that manual procedures lag behind, which calls for the automation of most workflows. In this study, the effort is to continue investigating ways by which some tasks performed by experts in the nanotechnology domain, specifically in ontology engineering, could benefit from automation. An approach featuring phrase-based topic modelling and formal topical concept analysis, together with formal implication rules, is further motivated to uncover new concepts and axioms relevant to two nanotechnology-related ontologies. A corpus of 2,715 nanotechnology research articles helps showcase that the approach can scale, as seen in a number of experiments conducted. The usefulness of document text ranking as an alternative form of input to topic models is highlighted, as well as the benefit of implication rules to the task of concept discovery. In all, a total of 203 new concepts are uncovered by the approach to extend the referenced ontologies. Ontology Nanomaterials Concept Discovery Formal Concept Analysis (FCA) Topic Modelling Association Rule Mining (ARM) Duquenne-Guigues Information Systems
25	Att hitta populism i nyhetsmedier : En temamodellering av artiklar publicerade i svenska nyhetsmedier 2012–2022 / Finding Populism in News Media : topic modelling of articles published in Swedish news 2012–2022 Flygt Branje, Richard January 2023 (has links) In an explorative approach, this thesis draws upon the benefits of using Artificial Intelligence (AI) for analysing text with Topic Modelling in an attempt to measure populism in Swedish news. This project breaks new ground in the field of media and communication studies by including 15 200 000 words from Swedish news articles published between 2012 and 2022 and steps into the next generation of news analysis that incorporates data-driven methods to lighten the burden of quantitative content analysis. By extracting the most salient aspects of populism and feeding them to the Top2Vec algorithm, keywords related to populism are measured over time and space, and a new value describing the degree to which news agencies are complicit in media populism is developed. Some of the most noticeable findings include the identification of keywords related to populism, Aftonbladet’s elevated degree of media populism, and the shift in focus of Swedish media populism over time, from the “People” to the “Anti-Elite” aspects of populism. topic modelling text analysis populism democracy news analysis media and communication studies media populism Media and Communications Medie- och kommunikationsvetenskap Media Studies Medievetenskap Communication Studies Kommunikationsvetenskap
26	COMPARING PSO-BASED CLUSTERING OVER CONTEXTUAL VECTOR EMBEDDINGS TO MODERN TOPIC MODELING Samuel Jacob Miles (12462660) 26 April 2022 (has links) Efficient topic modeling is needed to support applications that aim at identifying main themes from a collection of documents. In this thesis, a reduced vector embedding representation and particle swarm optimization (PSO) are combined to develop a topic modeling strategy that is able to identify representative themes from a large collection of documents. Documents are encoded using a reduced, contextual vector embedding from a general-purpose pre-trained language model (sBERT). A modified PSO algorithm (pPSO) that tracks particle fitness on a dimension-by-dimension basis is then applied to these embeddings to create clusters of related documents. The proposed methodology is demonstrated on three datasets across different domains. The first dataset consists of posts from the online health forum r/Cancer. The second dataset is a collection of NY Times abstracts and is used to compare the proposed model to LDA. The third is a standard benchmark dataset for topic modeling which consists of a collection of messages posted to 20 different news groups. It is used to compare state-of-the-art generative document models (i.e., ETM and NVDM) to pPSO. The results show that pPSO is able to produce interpretable clusters. Moreover, pPSO is able to capture both common topics as well as emergent topics. The topic coherence of pPSO is comparable to that of ETM and its topic diversity is comparable to NVDM. The assignment parity of pPSO on a document completion task exceeded 90% for the 20News-Groups dataset. This rate drops to approximately 30% when pPSO is applied to the same Skip-Gram embedding derived from a limited, corpus specific vocabulary which is used by ETM and NVDM. Digital processor architectures Particle Swarm Optimization Algorithm Topic Modelling Vector Embedding Natural Language Processing Computer Engineering
27	The state of network research / Tillståndet för nätverksforskning Zhu, Haoyu January 2020 (has links) In the past decades, networking research has experienced great changes. Being familiar with the development of networking research is the first step for most scholars starting their work. Knowing the targeted areas, useful documents, and active institutions helps in setting up new research. This project focuses on developing an assistant tool, based on publicly accessible papers and information on the Internet, that allows researchers to view the most cited papers in networking conferences and journals. NLP tools are applied to the crawled full text in order to classify the papers and extract keywords. Papers are located based on their authors to show the most active countries around the world working in this area. References are analyzed to view the most cited topics and detailed paper information. We draw some interesting conclusions from our system, showing that some topics have attracted more attention in the past decades. Natural Language Processing Network Spider Network Topic Modelling State of Research. Computer and Information Sciences Data- och informationsvetenskap
28	Improving information gathering for IT experts. : Combining text summarization and individualized information recommendation. Bergenudd, Anton January 2022 (has links) Information gathering and information overload is an ever growing topic of concern for Information Technology (IT) experts. The amount of information dealt with on an everyday basis is large enough that sifting through it all to find the relevant information takes up valuable time. In the application area of IT, time is directly related to money, as wasting valuable production time on information gathering and allocation of human resources is a direct loss of profits for any given company. Two issues are mainly addressed in this thesis: texts that are too lengthy and the difficulty of finding relevant information. Through the use of Natural Language Processing (NLP) methods such as topic modelling and text summarization, a proposed solution is constructed in the form of a technical basis which can be implemented in most business areas. An experiment along with an evaluation session is set up in order to evaluate the performance of the technical basis and address the focus of this paper, namely “How effective is text summarization combined with individualized information recommendation in improving information gathering of IT experts?”. Furthermore, the solution includes a construction of user profiles in an attempt to individualize content and theoretically present more relevant information. The results for this project are affected by the substandard quality and magnitude of data points; however, positive trends are discovered. It is stated that the use of user profiles further enhances the amount of relevant articles presented by the model, along with the increasing recall and precision values per iteration and accuracy per number of updates made per user. 
Not enough time was spent on the evaluation process to confidently state the validity of the results beyond noting that they are inconsistent and insufficient in magnitude. However, the positive trends discovered invite further speculation about what the project could achieve if given enough time and resources to reach its full potential. Essentially, one can theoretically improve information gathering by summarizing texts combined with individualization. Text summarization information gathering individualization topic modelling natural language processes profiling. Computer Sciences Datavetenskap (datalogi)
29	Happy, risky assets: Uncertainty and (mis)trust in non-fungible token (NFT) conversations on Twitter Meyns, Sarah C.A. January 2022 (has links) Background: Non-fungible token (NFT) trade has grown drastically over recent years. While scholarship on the technical aspects and potential applications of NFTs has been steadily increasing, less attention has been directed to the human perception of or attitudes toward this new type of digital asset; in particular, about potential concerns that users may have around the use of NFTs. Aim: The aim of this research is to investigate what concerns, if any, are expressed in relation to non-fungible tokens by those who engage with NFTs on the social media platform Twitter, with special attention to possible concerns about crime, using NFT marketplaces, and market dynamics. Methods: This research offers a mixed methods, largely qualitative, study. The method of data gathering is online non-participant observation of NFT-related posts and conversations on the social media platform Twitter. The methods of data analysis are topic modelling and thematic analysis, with additional attention to visual analysis of images and animated or video material associated with posts. Two datasets (with 18,373 and 36,354 individual tweet records respectively) were obtained for quantitative analysis; two smaller-scale datasets (both ca. 1000 records, with supplementary conversation details and visual material) were obtained for qualitative analysis. Conclusion: This study proposes an interpretation of NFTs as functioning as ‘happy objects’ in NFT conversations on Twitter, wherein NFTs are represented as digital objects that hold a ‘promise’ of the happiness or fulfilment associated with financial gain. 
Concerns around NFTs as expressed on Twitter fall into broadly three categories: (1) concerns relating to not being able to engage in, or being locked out of the possibility of, NFT trade; (2) concerns about the conditions, security and safety of engaging in NFT trade; and (3) concerns about whether any of the anticipated (financial) rewards or gains from engaging in NFTs will actually be obtained. Hence, many of the concerns that come up within NFT conversations on Twitter relate to conditions that may stand in the way of these happy objects in fact bringing about their desired result. Overall, this study offers a better understanding of the expressions of attitudes of concern, uncertainty and possible experience of barriers associated with NFT trading. These findings contribute to theoretical insight, and can moreover function as a basis for developing practical (design or policy) interventions. non-fungible token NFT concern blockchain Twitter happy object digital asset topic modelling Information Systems, Social aspects
30	Covid-19 Related Conspiracy Theories on Social Media : How to identify misinformation through patterns in language usage on social media / Covid-19 relaterade konspirationsteorier på sociala medier Savinainen, Oskar, Hvidbjerg Hansen, Thor January 2022 (has links) Distinguishing between information and disinformation is an ever growing issue. The dramatic structure of a conspiracy theory easily captures a large audience and, with the advent of social media, this disinformation can spread at an ever growing rate. This is especially true of the infodemic following the Covid-19 pandemic in early 2020, when there was a drastic increase in Covid-19 related misinformation on social media. When misinformation replaces fact, some people will inevitably follow borderline dangerous advice. This could unfortunately be seen in the ivermectin issue, where people injected this substance in the hope of preventing or curing a Covid-19 infection. This is why finding patterns in disinformation that distinguish it from facts would allow us to take measures against the spread of conspiracy theories. We have found patterns in our dataset suggesting that there is a significant difference in the language patterns for terms relating to conspiracy theories and non-conspiratorial terms. We find that the sentiment of conspiracy theories is very volatile when compared to that of non-conspiratorial terms, which follow a more neutral pattern in terms of sentiment. This suggests that the language usage in a post can be used as a factor when determining the credibility of its content. We also find that conspiracy theories tend to see a drastic increase in mentions after previously being relatively low in mentions. The result of this thesis could therefore be used as a start for developing tools and processes which would seek to combat the spread of conspiracy theories and limit the potential harm that could come from them. 
LDA sentiment analysis covid-19 conspiracy theories topic modelling social media reddit language patterns machine learning Other Computer and Information Science Annan data- och informationsvetenskap
