51

Unsupervised Topic Modeling to Improve Stormwater Investigations

Arvidsson, David January 2022 (has links)
Stormwater investigations are an important part of the detail plan that companies and industries must produce. The detail plan is used to show that an area is well suited for, among other things, construction. Writing these detail plans is a costly and time-consuming process, and it is not uncommon for them to be rejected, because it is difficult to find information about the criteria that must be met and what the investigation needs to address. This thesis aims to make this problem less ambiguous by applying the topic modeling algorithm LDA (latent Dirichlet allocation) to identify the structure of stormwater investigations. Moreover, sentences that contain words from the topic modeling are extracted to show how each word can be used in the context of writing a stormwater investigation. Finally, a knowledge graph is created from the extracted topics and sentences. The results of this study indicate that topic modeling and NLP (natural language processing) can be used to identify the structure of stormwater investigations, and to extract useful information that can serve as guidance when learning to write stormwater investigations.
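LDA, the algorithm this thesis applies, is commonly fitted with a collapsed Gibbs sampler. A minimal from-scratch sketch is below; the toy corpus, topic count, and hyperparameters are illustrative assumptions, not the thesis's actual configuration:

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA. docs: list of token lists.
    Returns the vocabulary and the topic-word distributions phi."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    wid = {w: i for i, w in enumerate(vocab)}
    # z[d][i]: topic assigned to the i-th token of document d
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    ndk = [[0] * n_topics for _ in docs]       # doc-topic counts
    nkw = [[0] * V for _ in range(n_topics)]   # topic-word counts
    nk = [0] * n_topics                        # tokens per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]; v = wid[w]
                ndk[d][k] -= 1; nkw[k][v] -= 1; nk[k] -= 1
                # full conditional p(z = k | everything else)
                weights = [(ndk[d][k2] + alpha) * (nkw[k2][v] + beta)
                           / (nk[k2] + V * beta) for k2 in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][v] += 1; nk[k] += 1
    # smoothed topic-word distributions
    phi = [[(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
           for k in range(n_topics)]
    return vocab, phi
```

Each row of `phi` is a probability distribution over the vocabulary; inspecting its highest-weight words is what turns the fitted model into human-readable section topics.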
52

Topic propagation over time in internet security conferences : Topic modeling as a tool to investigate trends for future research / Ämnesspridning över tid inom säkerhetskonferenser med hjälp av topic modeling

Johansson, Richard, Engström Heino, Otto January 2021 (has links)
When conducting research, it is valuable to find high-ranked papers closely related to the specific research area without spending too much time reading insignificant ones. An automated way to extract topics from documents would make this process more effective, and this is possible using topic modeling. Topic modeling can also be used to trace topic trends: where a topic is first mentioned and who the original author was. In this paper, over 5000 articles are scraped from four different top-ranked internet security conferences using a web scraper built in Python. Fourteen topics are extracted from the articles using the topic modeling library Gensim and LDA Mallet, and the topics are visualized in graphs to reveal which topics are emerging and which are fading away over twenty years. This research finds that topic modeling is a powerful tool for extracting topics and that, when put into a time perspective, it makes it possible to identify topic trends that can be explained when placed in a bigger context.
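The emerging/fading distinction described above can be sketched as a comparison of each topic's share of publications in an early versus a late era. The split year, threshold, and toy data below are illustrative assumptions, not the paper's actual method:

```python
from collections import defaultdict

def topic_trends(records, split_year, threshold=0.05):
    """records: iterable of (year, topic) pairs.
    Returns {topic: (early_share, late_share, label)} where the label is a
    coarse 'emerging' / 'fading' / 'stable' call based on the share change."""
    counts = defaultdict(lambda: [0, 0])   # topic -> [early count, late count]
    totals = [0, 0]
    for year, topic in records:
        era = 0 if year < split_year else 1
        counts[topic][era] += 1
        totals[era] += 1
    trends = {}
    for topic, (early, late) in counts.items():
        s0 = early / totals[0] if totals[0] else 0.0
        s1 = late / totals[1] if totals[1] else 0.0
        delta = s1 - s0
        label = ("emerging" if delta > threshold
                 else "fading" if delta < -threshold else "stable")
        trends[topic] = (round(s0, 3), round(s1, 3), label)
    return trends
```

Plotting the per-year shares instead of two coarse eras gives the kind of trend graphs the paper describes; the two-era version is just the smallest sketch that captures the idea.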
53

The Mass-Personal Divide: Bridging Scholarship and Paving Ground Through the Lens of Environmental Discourse on Public Land Use

Seroka, Laura A. 19 November 2019 (has links)
No description available.
54

Topic modeling of IS research on the Covid-19 pandemic / Temamodellering på IS forskning relaterad till Covid-19 pandemin

Gräntz, Carl January 2023 (has links)
This study presents eighteen topics and their distribution over a corpus of 891 abstracts within the scope of IS research on Covid-19, with the goal of describing the IS field's contribution to society in fighting the Covid-19 pandemic. The topics were created by collecting 844 abstracts from 63 IS journals and 160 IS-related abstracts from non-IS journals, all from the Web of Science Core Collection database. The abstracts were then fitted with the topic model BERTopic, which provided the eighteen topics that were subsequently labeled manually. Limitations of this study are that it uses a relatively small corpus for topic modeling and that BERTopic cannot assign documents to multiple topics. The result resembles a previous literature review but lacks the distinct topics of government response and IS field agendas. However, this study's resulting topics can give a more general perspective over a considerably larger body of research papers and help identify further research directions.
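BERTopic derives its topic descriptions with a class-based TF-IDF over clustered documents. A from-scratch sketch of that scoring idea (not the library itself, and with an illustrative two-cluster corpus) might look like:

```python
import math
from collections import Counter

def ctfidf(class_docs):
    """class_docs: {cluster_label: list of tokens pooled across the cluster}.
    Returns {label: {word: score}} using the class-based TF-IDF idea behind
    BERTopic: term frequency within the cluster, inverse frequency over
    all clusters."""
    tf = {c: Counter(toks) for c, toks in class_docs.items()}
    total_words = sum(sum(cnt.values()) for cnt in tf.values())
    avg = total_words / len(tf)          # average words per cluster
    freq = Counter()                     # word frequency across all clusters
    for cnt in tf.values():
        freq.update(cnt)
    scores = {}
    for c, cnt in tf.items():
        n = sum(cnt.values())
        scores[c] = {w: (k / n) * math.log(1 + avg / freq[w])
                     for w, k in cnt.items()}
    return scores

def top_words(scores, label, k=3):
    """Highest-scoring words for one cluster: its human-readable topic."""
    return [w for w, _ in sorted(scores[label].items(), key=lambda x: -x[1])[:k]]
```

Words frequent inside one cluster but rare across clusters score highest, which is why the extracted topics come out distinctive enough to label manually.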
55

Topic discovery and document similarity via pre-trained word embeddings

Chen, Simin January 2018 (has links)
Throughout history, humans have generated an ever-growing volume of documents about a wide range of topics. We now rely on computer programs to automatically process these vast collections of documents in various applications. Many applications require a quantitative measure of document similarity. Traditional methods first learn a vector representation for each document using a large corpus, and then compute the distance between two document vectors as the document similarity. In contrast to this corpus-based approach, we propose a straightforward model that directly discovers the topics of a document by clustering its words, without the need for a corpus. We define a vector representation called normalized bag-of-topic-embeddings (nBTE) to encapsulate these discovered topics and compute the soft cosine similarity between two nBTE vectors as the document similarity. In addition, we propose a logistic word importance function that assigns words different importance weights based on their relative discriminating power. Our model is efficient in terms of average time complexity, and the nBTE representation is interpretable, as it allows for topic discovery within the document. On three labeled public data sets, our model achieved k-nearest-neighbor classification accuracy comparable with five state-of-the-art baseline models. Furthermore, from these three data sets we derived four multi-topic data sets where each label refers to a set of topics; on these four challenging multi-topic data sets, our model consistently outperforms the state-of-the-art baselines by a large margin. Together, these results answer the research question of this thesis: can we construct an interpretable document representation by clustering the words in a document, and effectively and efficiently estimate the document similarity?
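The soft cosine similarity used to compare nBTE vectors generalizes ordinary cosine similarity by letting distinct but related features contribute to the dot product. A minimal sketch, with a hypothetical feature-similarity function standing in for embedding similarity:

```python
import math

def soft_cosine(a, b, sim):
    """Soft cosine between sparse vectors a, b (dicts feature -> weight),
    given sim(f, g): a similarity between features with sim(f, f) == 1."""
    def dot(x, y):
        # cross terms weighted by feature similarity, unlike plain cosine
        return sum(wx * wy * sim(fx, fy)
                   for fx, wx in x.items() for fy, wy in y.items())
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0
```

With an identity similarity (1 if features match, else 0) this reduces exactly to cosine similarity; the off-diagonal terms are what let a "car" topic match an "automobile" topic.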
56

Exploring the Potential of Twitter Data and Natural Language Processing Techniques to Understand the Usage of Parks in Stockholm / Utforska potentialen för användning av Natural Language Processing på Twitter data för att förstå användningen av parker i Stockholm

Norsten, Theodor January 2020 (has links)
Traditional methods for investigating the usage of parks consist of questionnaires, which are both very time- and resource-consuming. Today more than four billion people use some form of social media platform daily. The huge amounts of data generated every day through these platforms constitute a potential new source of large-scale data. This report investigates a modern approach: using Natural Language Processing on Twitter data to understand how parks in Stockholm are being used. Natural Language Processing (NLP) is an area within artificial intelligence that refers to the process of reading, analyzing, and understanding large amounts of text data, and it is considered the future for understanding unstructured text. Twitter data were obtained through Twitter's open API. Data from three parks in Stockholm were collected for the period 2015-2019. Three analyses were then performed: temporal, sentiment, and topic modeling. The results show that it is possible to understand what attitudes and activities are associated with visiting parks using NLP on social media data. It is clear that sentiment analysis is a difficult task for computers and is still at an early stage of development; the results of the sentiment analysis indicate some uncertainties. More reliable results would require much more data, more thorough cleaning methods, and a focus on English tweets. One significant conclusion is that people's attitudes and activities linked to each park are clearly correlated with the attributes of that park. Another clear pattern is that park usage peaks significantly during holiday celebrations, and positive sentiment is the emotion most strongly linked with park visits. These findings suggest that future studies combine the approach in this report with geospatial data from a social media platform where users share their geolocation to a greater extent.
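Lexicon-based scoring is one common way to run the kind of sentiment analysis described above. The tiny lexicon and one-token negation rule below are illustrative assumptions, not the report's actual pipeline:

```python
def lexicon_sentiment(text, lexicon):
    """Toy lexicon-based sentiment: average polarity of known words,
    flipping the polarity of a word that directly follows a negator.
    Returns a score in [-1, 1], or 0.0 if no lexicon word is found."""
    negators = {"not", "no", "never"}
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    scores, flip = [], False
    for t in tokens:
        if t in negators:
            flip = True          # negate the next scored word
            continue
        if t in lexicon:
            scores.append(-lexicon[t] if flip else lexicon[t])
        flip = False             # negation only reaches one token ahead
    return sum(scores) / len(scores) if scores else 0.0
```

Real tweet sentiment needs far more than this (sarcasm, emoji, mixed languages), which is consistent with the uncertainty the report notes in its own sentiment results.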
57

A Data-Driven Approach for Incident Handling in DevOps

Annadata, Lakshmi Ashritha January 2023 (has links)
Background: Maintaining system reliability and customer satisfaction in a DevOps environment requires effective incident management. In the modern day, due to increasing system complexity, several incidents occur daily. Incident prioritization and resolution are essential to manage these incidents and lessen their impact on business continuity. Prioritization of incidents, estimation of recovery time objective (RTO), and resolution times are traditionally subjective processes that rely more on the DevOps team’s competence. However, as the volume of incidents rises, it becomes increasingly challenging to handle them effectively.  Objectives: This thesis aims to develop an approach that prioritizes incidents and estimates the corresponding resolution times and RTO values leveraging machine learning. The objective is to provide an effective solution to streamline DevOps activities. To verify the performance of our solution, an evaluation is later carried out by the users in a large organization (Ericsson).  Methods: The methodology used for this thesis is design science methodology. It starts with the problem identification phase, where a rapid literature review is done to lay the groundwork for the development of the solution. Cross-Industry Standard Process for Data Mining (CRISP-DM) is carried out later in the development phase. In the evaluation phase, a static validation is carried out in a DevOps environment to collect user feedback on the tool’s usability and feasibility.  Results:  According to the results, the tool helps the DevOps team prioritize incidents and determine the resolution time and RTO. Based on the team’s feedback, 84% of participants agree that the tool is helpful, and 76% agree that the tool is easy to use and understand. 
On the three metrics chosen for priority estimation, the tool achieved, averaged over all four priority levels, 93% accuracy, 78% recall, and an 87% F1 score; the BERT model's accuracy for estimating the resolution time range was 88%. Hence, the tool can be expected to speed up incident response and decrease resolution time.  Conclusions: The tool's validation and implementation indicate that it has the potential to increase system reliability and the effectiveness of incident management in a DevOps setting. Prioritizing incidents and predicting resolution time ranges based on impact and urgency can enable the DevOps team to make well-informed decisions. Future work could investigate integrating the tool with third-party DevOps tools and exploring guidelines for handling sensitive incident data. Another direction is to evaluate the tool in a live project and obtain feedback.
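The accuracy, recall, and F1 figures quoted above are standard classification metrics. A minimal sketch of how they are computed, macro-averaged over priority levels as the abstract describes (the labels below are toy data, not the thesis's):

```python
def prf_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged recall and F1 over the labels in y_true."""
    labels = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    acc = sum(t == p for t, p in pairs) / len(pairs)
    recalls, f1s = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        rec = tp / (tp + fn) if tp + fn else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        recalls.append(rec)
        f1s.append(f1)
    # macro average: every priority level weighs equally, regardless of size
    return acc, sum(recalls) / len(labels), sum(f1s) / len(labels)
```

Macro averaging matters for incident data because low-priority incidents usually dominate in volume; it keeps rare high-priority classes from being drowned out of the reported score.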
58

Topic Modeling for Customer Insights : A Comparative Analysis of LDA and BERTopic in Categorizing Customer Calls

Axelborn, Henrik, Berggren, John January 2023 (has links)
Customer calls serve as a valuable source of feedback for financial service providers, potentially containing a wealth of unexplored insights into customer questions and concerns. However, these call data are typically unstructured and challenging to analyze effectively. This thesis project focuses on leveraging Topic Modeling techniques, a sub-field of Natural Language Processing, to extract meaningful customer insights from recorded customer calls to a European financial service provider. The objective of the study is to compare two widely used Topic Modeling algorithms, Latent Dirichlet Allocation (LDA) and BERTopic, in order to categorize and analyze the content of the calls. By leveraging the power of these algorithms, the thesis aims to provide the company with a comprehensive understanding of customer needs, preferences, and concerns, ultimately facilitating more effective decision-making processes.  Through a literature review and dataset analysis, i.e., pre-processing to ensure data quality and consistency, the two algorithms, LDA and BERTopic, are applied to extract latent topics. The performance is then evaluated using quantitative and qualitative measures, i.e., perplexity and coherence scores as well as the interpretability and usefulness of the resulting topics. The findings contribute to knowledge on Topic Modeling for customer insights and enable the company to improve customer engagement and satisfaction and to tailor its customer strategies.  The results show that LDA outperforms BERTopic in terms of topic quality and business value. Although BERTopic demonstrates a slightly better quantitative performance, LDA aligns much better with human interpretation, indicating a stronger ability to capture meaningful and coherent topics within the company's customer call data.
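Coherence scores of the kind used here come in several variants; which one the thesis used is not stated, so the sketch below implements the simple UMass variant as an assumption. It scores a topic's word list by how often the words co-occur in the same documents:

```python
import math

def umass_coherence(topic_words, docs):
    """UMass topic coherence: sum over ordered word pairs of
    log((co-document frequency + 1) / document frequency of the earlier word).
    Assumes every topic word occurs in at least one document."""
    docsets = [set(d) for d in docs]

    def df(w):
        return sum(w in d for d in docsets)

    def codf(w1, w2):
        return sum(w1 in d and w2 in d for d in docsets)

    score = 0.0
    for i in range(1, len(topic_words)):
        for j in range(i):
            score += math.log(
                (codf(topic_words[i], topic_words[j]) + 1)
                / df(topic_words[j]))
    return score
```

Higher (less negative) is better: word pairs that never co-occur drag the score down, which is exactly the failure mode of an incoherent topic. The thesis's observation that quantitative scores and human judgment can disagree is a known limitation of such automatic measures.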
59

Trend Analysis on Artificial Intelligence Patents

Cotra, Aditya Kousik 28 June 2021 (has links)
No description available.
60

Changing research topic trends as an effect of publication rankings – The case of German economists and the Handelsblatt Ranking

Buehling, Kilian 07 September 2023 (has links)
In order to arrive at informed judgments about the quality of research institutions and individual scholars, funding agencies, academic employers, and researchers have turned to publication rankings. While such rankings, often based on journal citations, promise a more efficient and transparent funding allocation, individual researchers are at risk of showing adaptive behavior. This paper investigates whether the use of journal rankings in assessing the quality of scholarly research results in the unintended consequence of researchers adapting their research topics to the publishing interests of high-ranked journals. The introduction of the Handelsblatt Ranking (HBR) for economists at German-language institutions serves as a quasi-natural experiment, allowing for an examination of research topic dynamics in economics via topic modeling and text classification. It is found that the Handelsblatt Ranking did not cause a significant shift in the topics researched by German-affiliated authors in comparison to their international counterparts, even though topic convergence is apparent.
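A quasi-natural experiment of this shape is commonly analyzed as a difference-in-differences on topic shares: German-affiliated authors as the treated group, international authors as the control, with the ranking's introduction splitting the periods. A minimal sketch with illustrative numbers (the DiD framing and figures are assumptions, not necessarily the paper's exact estimator):

```python
def diff_in_diff(shares):
    """shares: {(group, period): topic share}, with groups
    'treated' / 'control' and periods 'pre' / 'post'.
    Returns the difference-in-differences estimate: the treated group's
    change minus the control group's change."""
    return ((shares[("treated", "post")] - shares[("treated", "pre")])
            - (shares[("control", "post")] - shares[("control", "pre")]))
```

An estimate near zero for each topic is exactly the paper's finding: German-affiliated authors' topic shares moved roughly in step with international trends rather than chasing the ranking.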
