11

Perceiving Umeå : Instagram's Lens on Neighborhoods in the City

Fuhler, Rick January 2023 (has links)
This master's thesis in human geography explores how neighborhoods are represented and perceived on the popular social media platform Instagram. By analyzing user-generated content, both visually and textually, the study aims to uncover the predominant themes, characteristics, and subjective perspectives associated with neighborhood representation on Instagram. Through a systematic analysis of content shared by Instagram users, applying topic modelling and sentiment analysis in Orange, the research identifies recurring themes, visual motifs, and distinguishing features that emerge when different neighborhoods are portrayed and experienced. The study focuses specifically on Umeå, allowing for a deeper understanding of how Instagram users perceive and portray the various neighborhoods within the city. The findings hold potential implications for urban planning practice, as they shed light on the factors influencing neighborhood representation on Instagram and their relevance to decision-making on urban development, community engagement, and social well-being. Overall, the study provides valuable insights into the interplay between social media and neighborhood representation.
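As an illustration of the kind of pipeline the abstract describes (the thesis itself used Orange), a minimal Python sketch combining topic modelling with lexicon-based sentiment scoring might look as follows; scikit-learn, NLTK's VADER, and the example captions are stand-ins and placeholders, not the thesis's actual tools or data.

```python
# Minimal sketch: topic modelling plus sentiment scoring of Instagram captions.
# scikit-learn and NLTK stand in for the Orange workflow used in the thesis;
# the captions below are invented placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from nltk.sentiment import SentimentIntensityAnalyzer
import nltk

nltk.download("vader_lexicon", quiet=True)

captions = [
    "sunset walk along the river tonight",
    "new coffee place opened downtown, great vibe",
    "snowy morning commute through the old neighborhood",
]

# Bag-of-words representation of the captions.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(captions)

# Fit a small LDA model and report the top words per topic.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)
vocab = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top_words = [vocab[i] for i in weights.argsort()[-5:][::-1]]
    print(f"topic {k}: {top_words}")

# Sentiment per caption via VADER's compound score (-1 negative .. +1 positive).
sia = SentimentIntensityAnalyzer()
for text in captions:
    print(text, sia.polarity_scores(text)["compound"])
```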
12

Unveiling the Swedish philosophical landscape : A topic model study of the articles of a Swedish philosophical journal from 1980-2020

Lindqvist, Björn January 2023 (has links)
Bibliometric research is an important tool for examining the scientific output of various fields of study. By conducting such research, it is possible to see how the influences of different people, ideologies and discoveries have affected scientific discourse. One way of doing this is topic modelling, which organizes the words used within a set of text data into different topics. To the author's knowledge, no topic modelling study of Swedish philosophy had previously been conducted. This study therefore aimed to partially fill that gap by exploring the publications of one specific Swedish philosophical journal. Using Python, a topic model with 14 topics was created from the journal Filosofisk tidskrift, and the change in these topics between 1980 and 2020 was examined. Specific attention was given to possible differences between analytic and Continental philosophy. To validate the results, an interview was also held with Fredrik Stjernberg, professor of theoretical philosophy. The results showed varied popularity and change for each topic. Too little Continental philosophy was found for a proper comparison, leading to the conclusion that Continental philosophy is not very influential in Swedish philosophical discourse. Future research should be conducted on peer-reviewed articles and be supported by greater professional philosophical expertise.
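A minimal sketch of the kind of 14-topic pipeline described above, using gensim as one plausible Python choice; the documents and preprocessing are placeholders, not the Filosofisk tidskrift corpus.

```python
# Minimal sketch of building a 14-topic LDA model with gensim, in the spirit of
# the pipeline described above. The documents here are invented placeholders.
from gensim import corpora
from gensim.models import LdaModel

documents = [
    "kunskap och sanning i den analytiska traditionen",
    "etik moral och praktisk filosofi",
    "medvetande sprak och mening",
    # ... in the thesis, one entry per journal article, preprocessed beforehand
]

# Tokenize and build the dictionary and bag-of-words corpus.
texts = [doc.lower().split() for doc in documents]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# Train an LDA model; the thesis settled on 14 topics.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=14,
               random_state=0, passes=10)

# Inspect the topics as weighted word lists.
for topic_id, words in lda.print_topics(num_topics=14, num_words=5):
    print(topic_id, words)
```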
13

The Cognitive Revolution – Fact or Fiction? : Using topic modelling to look for signs of a paradigm shift in a Swedish journal

Fagerlind, Johannes January 2023 (has links)
Traditionally, when social scientists have wanted to analyze large numbers of documents, they have resorted to manual coding techniques. This process can be made easier with machine learning approaches. One such approach, topic modelling, finds which words commonly occur together and thereby provides the researcher with semantically coherent topics. This thesis uses topic modelling to investigate Nordic Psychology, a psychology journal published in the Nordic languages. Articles published between 1949 and 2005 are examined to map how the discourse changed during the second half of the 20th century. Psychology textbooks and researchers active in the late sixties frequently refer to a cognitive revolution taking place, and accounts of this revolution paint a picture of something resembling a paradigm shift. This thesis therefore sets out to look for signs that the cognitive revolution was a paradigm shift. The topic model used in this thesis does not, however, find traces of a paradigm shift in the dataset, suggesting that if a paradigm shift did take place, it was not reflected in the Nordic Psychology journal.
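A hedged sketch of the central step, tracking topic prevalence by publication year and comparing it around a candidate break point, is shown below; the document-topic matrix, years, and break year are all invented placeholders.

```python
# Minimal sketch: average topic prevalence per publication year, the kind of
# trend one would inspect for signs of a shift toward cognitive topics.
# doc_topics and years are placeholder arrays, not the thesis data.
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_topics = 200, 10
doc_topics = rng.dirichlet(np.ones(n_topics), size=n_docs)  # document-topic proportions
years = rng.integers(1949, 2006, size=n_docs)               # publication year per article

# Mean prevalence of each topic for each year.
prevalence = {}
for year in np.unique(years):
    prevalence[year] = doc_topics[years == year].mean(axis=0)

# Compare a topic's average share before and after a candidate break point.
topic, break_year = 3, 1970
before = doc_topics[years < break_year][:, topic].mean()
after = doc_topics[years >= break_year][:, topic].mean()
print(f"topic {topic}: mean share {before:.3f} before {break_year}, {after:.3f} after")
```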
14

Spatial Regularization for Analysis of Text and Epidemiological Data

Maiti, Aniruddha (ORCID: 0000-0002-1142-6344) January 2022 (has links)
The use of spatial data has become an important aspect of data analysis, as location information can provide useful insight into a dataset. Advances in sensor technologies and improved data connectivity have made it possible to generate large amounts of passively collected user location data. Apart from such passively generated data, commercial vendors have made explicit efforts to curate large amounts of location-related data, such as residential histories, from a variety of sources including credit records, litigation data, and driving license records. Such spatial data, when linked with other datasets, can provide useful insights. In this dissertation, we show that spatial information enables us to derive useful insights in the domains of text analysis and epidemiology. We investigated primarily two types of data with spatial information: text data with location information and disease-related data with residential address information. We show that in the case of text data, spatial information helps us find spatially informative topics. In the case of epidemiological data, we show that residential information can be used to identify high-risk spatial regions. There are instances where a primary analysis is not sufficient to establish a statistically robust conclusion; in domains such as epidemiology, for instance, a finding is not considered relevant unless statistical significance is established. We therefore propose techniques for significance tests that can be applied to text analysis, topic modelling, and disease mapping tasks in order to establish the significance of the findings. / Computer and Information Science
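One simple way to frame the significance testing the dissertation calls for is a permutation test of a candidate region's case rate; the sketch below uses invented indicators and is only an illustration of the idea, not the dissertation's actual method.

```python
# Minimal sketch of a permutation test for whether a candidate spatial region
# has an elevated case rate. All data here are invented placeholders; the
# dissertation's actual tests are more involved.
import numpy as np

rng = np.random.default_rng(42)
n_points = 1000
in_region = rng.random(n_points) < 0.1   # indicator: point lies in the candidate region
cases = rng.random(n_points) < 0.05      # indicator: point is a case

observed_rate = cases[in_region].mean()

# Null distribution: shuffle the region labels and recompute the rate.
n_perm = 2000
null_rates = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(in_region)
    null_rates[i] = cases[shuffled].mean()

p_value = (null_rates >= observed_rate).mean()
print(f"observed rate in region: {observed_rate:.3f}, permutation p-value: {p_value:.3f}")
```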
15

Simplifying Q&A Systems with Topic Modelling

Kozee, Troy January 2017 (has links)
No description available.
16

Latent Dirichlet Allocation for the Detection of Multi-Stage Attacks

Lefoane, Moemedi, Ghafir, Ibrahim, Kabir, Sohag, Awan, Irfan U. 19 December 2023 (has links)
The rapid shift to, and increase in, remote access to organisation resources have led to a significant increase in the number of attack vectors and attack surfaces, which in turn has motivated the development of newer and more sophisticated cyber-attacks. Such attacks include Multi-Stage Attacks (MSAs). In MSAs, the attack is executed through several stages, and classifying malicious traffic into stages to get more information about the attack life-cycle becomes a challenge. This paper proposes a malicious traffic clustering approach based on Latent Dirichlet Allocation (LDA), a topic modelling approach used in natural language processing to address similar problems. The proposed approach is unsupervised and will therefore be beneficial in scenarios where traffic data is not labeled and analysis still needs to be performed. The approach uncovers intrinsic contexts that relate to different categories of attack stages in MSAs. These are vital insights needed across different areas of cybersecurity teams, such as Incident Response (IR) within the Security Operations Center (SOC), and they could have a positive impact in ensuring that attacks are detected at early stages of MSAs. For IR in particular, these insights help in understanding attack behavioural patterns and lead to reduced recovery time following an incident. The proposed approach is evaluated on a publicly available MSAs dataset, and the performance results are promising, as evidenced by over 99% accuracy in the identified malicious traffic clusters.
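A rough sketch of the clustering idea, treating each traffic flow as a "document" of categorical feature tokens and assigning it to its dominant LDA topic, is given below; the flow tokens and the number of stages are invented placeholders.

```python
# Minimal sketch: cluster traffic flows by dominant LDA topic, treating each
# flow's categorical features as tokens. Tokens below are invented placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
import numpy as np

flows = [
    "dst_port_445 proto_tcp flag_syn long_duration",
    "dst_port_80 proto_tcp flag_ack short_duration",
    "dst_port_445 proto_tcp flag_syn many_bytes",
    "dst_port_53 proto_udp short_duration",
]

vectorizer = CountVectorizer(token_pattern=r"\S+")
counts = vectorizer.fit_transform(flows)

# One topic per assumed attack stage; the number of stages is a modelling choice.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)

# Assign each flow to the cluster given by its dominant topic.
clusters = np.argmax(doc_topics, axis=1)
for flow, cluster in zip(flows, clusters):
    print(cluster, flow)
```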
17

Fifty Years of Information Management Research: A Conceptual Structure Analysis using Structural Topic Modeling

Sharma, A., Rana, Nripendra P., Nunkoo, R. 10 January 2021 (has links)
Information management is the management of the organizational processes, technologies, and people that collectively create, acquire, integrate, organize, process, store, disseminate, access, and dispose of information. It is a vast, multi-disciplinary domain that brings together various subdomains and intermingles closely with other domains. This study aims to provide a comprehensive overview of the information management domain from 1970 to 2019. Drawing on methodology from statistical text analysis research, it summarizes the evolution of knowledge in the domain by examining publication trends by author, institution, country, and so on. Further, the study proposes a probabilistic generative model based on structural topic modeling to understand and extract the latent themes from research articles related to information management, and it graphically visualizes the variations in topic prevalence over the period 1970 to 2019. The results highlight that the most common themes are data management, knowledge management, environmental management, project management, service management, and mobile and web management. The findings also identify themes such as knowledge management, environmental management, project management, and social communication as academic hotspots for future research.
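The study's prevalence analysis relies on structural topic modeling (commonly the R stm package); the sketch below only mimics the prevalence-over-time aspect with a plain linear trend on placeholder topic proportions, as a rough illustration rather than the authors' model.

```python
# Illustrative sketch only: structural topic modelling jointly models topic
# prevalence with covariates, whereas here a plain linear fit of topic
# proportion against publication year stands in for the prevalence trend.
# All data are invented placeholders.
import numpy as np

rng = np.random.default_rng(1)
n_docs, n_topics = 500, 6
years = rng.integers(1970, 2020, size=n_docs)
doc_topics = rng.dirichlet(np.ones(n_topics), size=n_docs)

# Slope of each topic's proportion over time (rising or declining theme).
for topic in range(n_topics):
    slope, intercept = np.polyfit(years, doc_topics[:, topic], deg=1)
    trend = "rising" if slope > 0 else "declining"
    print(f"topic {topic}: slope {slope:+.5f} per year ({trend})")
```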
18

Characterisation of a developer’s experience fields using topic modelling

Déhaye, Vincent January 2020 (has links)
Finding the most relevant candidate for a position is a ubiquitous challenge for organisations. It can also be arduous for candidates to convey on a concise resume what they have experience with: because they usually have to select which experiences to present and filter out others, the person carrying out the search may not detect that they do in fact have the desired experience. In the field of software engineering, building up one's experience usually leaves traces behind: the code one has produced. This project explores approaches to tackle these screening challenges with an automated way of extracting experience directly from code, by defining common lexical patterns in code for different experience fields using topic modelling. Two techniques were compared. On one hand, Latent Dirichlet Allocation (LDA) is a generative statistical model which has proven to yield good results in topic modelling. On the other hand, Non-Negative Matrix Factorization (NMF) factorizes a matrix representing the code corpus as word counts per piece of code. The code gathered consisted of 30 random repositories from the collaborators of the open-source Ruby on Rails project on GitHub, to which common natural language processing transformation steps were then applied. The results of the two techniques were compared using perplexity for LDA, reconstruction error for NMF, and topic coherence for both. The first two represent how well the data can be represented by the topics produced, while the latter estimates how well the elements of a topic hang and fit together, and can reflect human understandability and interpretability. Given that we did not have any similar work to benchmark against, the performance indicated by the values obtained is hard to assess scientifically. However, the method seems promising, as we would have been rather confident in assigning labels to 10 of the topics generated. The results imply that one could probably use natural language processing methods directly on code in order to extend the detected fields of experience of a developer, with a finer granularity than traditional resumes and with field definitions evolving dynamically with the technology.
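A rough sketch of the comparison described above, with LDA scored by perplexity and NMF by reconstruction error in scikit-learn, is given below; the code-token snippets are invented, and topic coherence (for example via gensim's CoherenceModel) is omitted to keep the sketch short.

```python
# Minimal sketch of the LDA vs NMF comparison: LDA scored by perplexity,
# NMF by reconstruction error. The "documents" are invented code-token snippets.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation, NMF

snippets = [
    "def render view template html partial",
    "class ActiveRecord migration schema table column",
    "describe it expect spec test assert",
    "socket connect send recv buffer thread",
]

n_topics = 2

# LDA on raw word counts, evaluated with perplexity (lower is better).
count_vec = CountVectorizer(token_pattern=r"\S+")
counts = count_vec.fit_transform(snippets)
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(counts)
print("LDA perplexity:", lda.perplexity(counts))

# NMF on tf-idf weights, evaluated with reconstruction error (lower is better).
tfidf_vec = TfidfVectorizer(token_pattern=r"\S+")
tfidf = tfidf_vec.fit_transform(snippets)
nmf = NMF(n_components=n_topics, random_state=0).fit(tfidf)
print("NMF reconstruction error:", nmf.reconstruction_err_)
```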
19

Improving the speed and quality of an Adverse Event cluster analysis with Stepwise Expectation Maximization and Community Detection

Erlanson, Nils January 2020 (has links)
Adverse drug reactions are unwanted effects alongside the intended benefit of a drug and may be responsible for 3-7% of hospitalizations. Finding such reactions is partly done by analysing individual case safety reports (ICSRs) of adverse events. The reports consist of categorical terms that describe the event. Data-driven identification of suspected adverse drug reactions using this data typically considers single adverse event terms, one at a time. This single-term approach narrows the identification of reports, and information in the reports is ignored during the search. If one instead assumes that each report is connected to a topic, then a cluster of the reports connected to that topic would identify more reports, and the topics themselves would provide more context. This thesis takes place at Uppsala Monitoring Centre, which has implemented a probabilistic model of how an ICSR, and its topic, is assumed to be generated. The parameters of the model are estimated with expectation maximization (EM), which also assigns the reports to clusters. The clusters are improved with consensus clustering, which identifies groups of reports that tend to be grouped together across several runs of EM. Additionally, in order not to cluster outlying reports, all clusters below a certain size are excluded. The objective of the thesis is to improve the algorithm in terms of computational efficiency and quality, as measured by stability and clinical coherence. The convergence of EM is improved using stepwise EM, which resulted in a speed-up of at least 1.4 and a decrease in computational complexity. With all the speed improvements, the speed-up factor of the entire algorithm can reach 2, but it is constrained by the size of the data. To improve cluster quality, the community detection algorithm Leiden is used. It improves the stability, with the added benefit of increasing the number of clustered reports, although the clinical coherence score is worse with Leiden. There are good reasons to further investigate the benefits of Leiden, as there were indications that community detection identified clusters with greater resolution that still appeared clinically coherent in a post hoc analysis.
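A rough sketch of the consensus step, building a co-clustering graph from several runs and applying Leiden community detection, is shown below; it assumes the python-igraph and leidenalg packages are available, and the run labels are invented placeholders rather than actual EM output.

```python
# Rough sketch of the consensus step: count how often pairs of reports are
# clustered together across several runs, turn the co-occurrence matrix into a
# weighted graph, and apply Leiden community detection. Labels are invented
# placeholders; the real pipeline uses the EM cluster assignments.
import numpy as np
import igraph as ig
import leidenalg as la

n_reports, n_runs = 50, 10
rng = np.random.default_rng(0)

# Placeholder: cluster label per report for each EM run.
runs = rng.integers(0, 5, size=(n_runs, n_reports))

# Co-occurrence matrix: fraction of runs in which two reports share a cluster.
cooccurrence = np.zeros((n_reports, n_reports))
for labels in runs:
    cooccurrence += (labels[:, None] == labels[None, :]).astype(float)
cooccurrence /= n_runs
np.fill_diagonal(cooccurrence, 0.0)

# Weighted undirected graph and Leiden partition (modularity objective as one choice).
graph = ig.Graph.Weighted_Adjacency(cooccurrence.tolist(), mode="undirected")
partition = la.find_partition(graph, la.ModularityVertexPartition, weights="weight")
print("number of consensus clusters:", len(partition))
```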
20

Cop Topics: Topic Modeling-Assisted Discoveries of Police-Related Themes in African-American Journalistic Texts

Lemire Garlic, Nicole January 2017 (has links)
Mainstream newspaper content has long been mined by communication scholars and researchers for insights into public opinion and perceptions, and in recent years scholars have been examining African-American-authored periodicals to obtain similar insights. Hearkening back to the civil rights movement of the 1950s and 1960s in the United States, the highly publicized killings of African-American men by police officers during the past several years have highlighted longstanding strained police-community relations. Because African-American journalistic texts serve as both a reflection of, and an advocate for, the African-American community, they contain a wealth of data about African-American public opinion about, and perceptions of, the police. In years past, media content analysts would manually sift through newspapers to identify interesting police-related themes and variables worthy of study. But with the exponential growth of digitized texts, communication scholars are experimenting with computerized text analysis tools, such as topic modeling software, to aid them in their content analyses. This thesis considers to what degree topic modeling software can be used at the exploratory stage of designing a content analysis study to aid in uncovering themes and variables worthy of further investigation. Appendix A contains the results of the manual exploratory content analysis; the list of topics generated by the topic modeling software may be found in Appendix B. / Media Studies & Production / Accompanied by one .pdf file: NLG Thesis Appendices Final.pdf
