391

Evaluation of methods for question answering data generation : Using large language models / Utvärdering av metoder för skapande av fråge-svar data

Bissessar, Daniel; Bois, Alexander January 2022
One of the largest challenges in the field of artificial intelligence and machine learning is the acquisition of a large quantity of quality data to train models on. This thesis investigates and evaluates approaches to data generation in a telecom domain for the task of extractive QA. To do this, a pipeline was built using a combination of BERT-like models and T5 models for data generation. We then evaluated our generated data using the downstream task of QA on a telecom domain data set. We measured the performance using EM and F1-scores. We achieved results that are state of the art on the telecom domain data set. We found that synthetic data generation is a viable approach to obtaining synthetic telecom QA data with the potential of improving model performance when used in addition to human-annotated data. We also found that using models from the general domain provided results that are on par with or better than domain-specific models for the generation, which makes it possible to use a single generation pipeline for many different domains. Furthermore, we found that increasing the amount of synthetic data provided little benefit for our models on the downstream task, with diminishing returns setting in quickly. We were unable to pinpoint the reason for this. In short, our approach works, but much more work remains to understand and optimize it for better results.
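The abstract above reports EM and F1 on the downstream QA task. As a point of reference, the sketch below shows how these extractive-QA metrics are commonly computed in SQuAD-style evaluation; the normalization rules and the example strings are illustrative assumptions, not details taken from the thesis.

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace (SQuAD convention)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized prediction equals the normalized gold answer, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction: str, gold: str) -> float:
    """Token-overlap F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the base station", "base station"))          # 1.0 after normalization
print(round(f1_score("a 5G base station", "base station"), 2))  # 0.8
```

Averaged over all question-answer pairs in a data set, these two functions yield the EM and F1 figures that extractive QA work typically reports.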
392

Improving customer support efficiency through decision support powered by machine learning

Boman, Simon January 2023
More and more aspects of today’s healthcare are becoming integrated with medical technology and dependent on medical IT systems, which consequently puts stricter requirements on the companies delivering these solutions. As a result, companies delivering medical technology solutions need to spend a lot of resources maintaining high-quality, responsive customer support. In this report, possible ways of increasing customer support efficiency using machine learning and NLP are examined at Sectra, a medical technology company. This is done through a qualitative case study, where empirical data collection methods are used to elicit requirements and find ways of adding decision support. Next, a prototype is built featuring a ticket recommendation system powered by GPT-3 and based on 65 000 available support tickets, which is integrated into the customer support workflow. Lastly, this is evaluated by having six end users test the prototype for five weeks, followed by a qualitative evaluation consisting of interviews and a quantitative measurement of the user-perceived usability of the proposed prototype. The results provide some support that machine learning can be used to create decision support in a customer support context, as six out of six test users believed that their long-term efficiency could improve using the prototype in terms of reducing the average ticket resolution time. However, one out of the six test users expressed some skepticism towards the relevance of the recommendations generated by the system, indicating that improvements to the model must be made. The study also indicates that the use of state-of-the-art NLP models for semantic textual similarity can possibly outperform keyword searches.
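The prototype described above recommends similar historical tickets by semantic textual similarity. The sketch below illustrates that retrieval step under stated assumptions: the open sentence-transformers library stands in for the GPT-3 embeddings used in the thesis, and the ticket texts are invented for illustration.

```python
from sentence_transformers import SentenceTransformer, util

# Invented examples standing in for historical support tickets.
historical_tickets = [
    "Image viewer crashes when opening large CT studies",
    "PACS login fails after password reset",
    "Slow loading of MR series in the diagnostic workstation",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
ticket_embeddings = model.encode(historical_tickets, convert_to_tensor=True)

def recommend(new_ticket: str, top_k: int = 2):
    """Return the top_k historical tickets most semantically similar to the new ticket."""
    query_embedding = model.encode(new_ticket, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, ticket_embeddings, top_k=top_k)[0]
    return [(historical_tickets[h["corpus_id"]], float(h["score"])) for h in hits]

print(recommend("Viewer freezes while loading a big CT examination"))
```

A keyword search would miss this match ("freezes" vs. "crashes"), which is the kind of case where semantic similarity can outperform keyword matching.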
393

Utilizing Transformers with Domain-Specific Pretraining and Active Learning to Enable Mining of Product Labels

Norén, Erik January 2023
Structured Product Labels (SPLs), the package inserts that accompany drugs governed by the Food and Drug Administration (FDA), hold information about Adverse Drug Reactions (ADRs) associated with drugs post-market. This information is valuable for actors working in the field of pharmacovigilance aiming to improve the safety of drugs. One such actor is Uppsala Monitoring Centre (UMC), a non-profit conducting pharmacovigilance research. In order to access the valuable information in the package inserts, UMC has constructed an SPL mining pipeline to mine SPLs for ADRs. This project aims to investigate new approaches to the Scan problem, the part of the pipeline responsible for extracting mentions of ADRs. The Scan problem is approached as a Named Entity Recognition task, a subtask of Natural Language Processing. By using the transformer-based deep learning model BERT with domain-specific pre-training, an F1-score of 0.8220 was achieved. Furthermore, the chosen model was used in an iteration of Active Learning in order to efficiently extend the available data pool with the most informative examples. Active Learning improved the F1-score to 0.8337. However, when Active Learning was benchmarked against a data set extended with random examples, the scores improved similarly; therefore, this application of Active Learning could not be determined to be effective in this project.
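The abstract describes extending the training pool with the most informative examples via Active Learning. The sketch below shows one common acquisition strategy, least-confidence sampling, purely as an assumption; the abstract does not state which strategy was used, and `predict_token_probs` is a placeholder standing in for the fine-tuned BERT tagger.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_token_probs(sentence: str) -> np.ndarray:
    """Placeholder for the fine-tuned BERT tagger: per-token class probabilities, shape (tokens, labels)."""
    logits = rng.normal(size=(len(sentence.split()), 3))  # 3 assumed labels, e.g. O, B-ADR, I-ADR
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def least_confidence(sentence: str) -> float:
    """Uncertainty = 1 - mean(max class probability per token); higher means more informative."""
    probs = predict_token_probs(sentence)
    return 1.0 - float(np.mean(probs.max(axis=1)))

def select_for_annotation(unlabelled: list[str], budget: int) -> list[str]:
    """Pick the `budget` most uncertain sentences to send for annotation and add to the data pool."""
    return sorted(unlabelled, key=least_confidence, reverse=True)[:budget]

pool = [
    "Patients reported severe headache after administration.",
    "The tablet should be stored at room temperature.",
    "Cases of nausea and dizziness have been observed post-market.",
]
print(select_for_annotation(pool, budget=2))
```

The random-baseline comparison mentioned in the abstract corresponds to replacing `least_confidence` with a random score.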
394

The future of IT Project Management & Delivery: NLP AI opportunities & challenges

Viznerova, Ester January 2023
This thesis explores the opportunities and challenges of integrating recent Natural Language Processing (NLP) Artificial Intelligence (AI) advancements into IT project management and delivery (PM&D). Using a qualitative design through a hermeneutic phenomenology strategy, the study employs a semi-systematic literature review and semi-structured interviews to delve into NLP AI's potential impacts in IT PM&D, from both theoretical and practical standpoints. The results revealed numerous opportunities for NLP AI application across Project Performance Domains, enhancing areas such as stakeholder engagement, team productivity, project planning, performance measurement, project work, delivery, and risk management. However, challenges were identified in areas including system integration, value definition, team and stakeholder-related issues, environmental considerations, and ethical concerns. In-house and third-party model usage also presented their own sets of challenges, emphasizing cost implications, data privacy and security, result quality, and dependence issues. The research concludes that the immense potential of NLP AI in IT PM&D is tempered by these challenges, and calls for robust strategies, sound ethics, comprehensive training, new ROI evaluation frameworks, and responsible AI usage to effectively manage these issues. This thesis provides valuable insights to academics, practitioners, and decision-makers navigating the rapidly evolving landscape of NLP AI in IT PM&D.
395

A Method for the Assisted Translation of QA Datasets Using Multilingual Sentence Embeddings / En metod för att assistera översättning av fråga-svarskorpusar med hjälp av språkagnostiska meningsvektorer

Vakili, Thomas January 2020
This thesis presents a method which reduces the amount of labour required to translate the English question answering dataset SQuAD into Swedish. The purpose of the study is to contribute to shrinking the gap between natural language processing research in English and research in lesser-resourced languages by providing a method for creating datasets in these languages which are counterparts to those used in English. This would allow results from English studies to be evaluated in more languages. The method put forward by this thesis uses multilingual sentence embeddings to search for and rank answers to English SQuAD questions in the Swedish Wikipedia articles associated with each question. The resulting search results are then used to pair SQuAD questions with sentences that contain their answers. We also estimate to what extent SQuAD questions have answers in the Swedish edition of Wikipedia, concluding that this proportion of questions is small but still of useful size. Further, the evaluation of the method shows that it provides a clear reduction in the labour required for translating SQuAD into Swedish, while impacting the number of data points retained in a resulting translation to a degree which is acceptable for many use-cases. Manual labour is still required for translating the SQuAD questions and for locating the answers within the Swedish sentences which contain them. Researching ways to automate these processes would further increase the utility of the approach, but is outside the scope of this thesis.
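The core step of the method is to embed an English SQuAD question and candidate Swedish sentences in a shared multilingual space and rank the sentences by similarity. A minimal sketch is given below; the specific embedding model and the example sentences are illustrative assumptions rather than details from the thesis.

```python
from sentence_transformers import SentenceTransformer, util

# A multilingual sentence-embedding model; assumed here, not necessarily the one used in the thesis.
model = SentenceTransformer("distiluse-base-multilingual-cased-v2")

question = "When was the Nobel Prize first awarded?"
swedish_sentences = [
    "Nobelpriset delades ut för första gången 1901.",
    "Alfred Nobel föddes i Stockholm 1833.",
    "Priset delas ut vid en ceremoni i december varje år.",
]

q_emb = model.encode(question, convert_to_tensor=True)
s_emb = model.encode(swedish_sentences, convert_to_tensor=True)
scores = util.cos_sim(q_emb, s_emb)[0]  # cosine similarity between the question and each sentence

# Rank candidates; the top-ranked sentence is proposed as the one containing the answer.
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.3f}  {swedish_sentences[idx]}")
```

Because the embeddings are language-agnostic, the English question and the Swedish candidate sentences can be compared directly without translating either side first.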
396

Evaluating Hierarchical LDA Topic Models for Article Categorization

Lindgren, Jennifer January 2020
With the vast amount of information available on the Internet today, helping users find relevant content has become a prioritized task in many software products that recommend news articles. One such product is Opera for Android, which has a news feed containing articles the user may be interested in. In order to easily determine which articles to recommend, they can be categorized by the topics they contain. One approach to categorizing articles is to use Machine Learning and Natural Language Processing (NLP). A commonly used model is Latent Dirichlet Allocation (LDA), which finds latent topics within large datasets of, for example, text articles. An extension of LDA is hierarchical Latent Dirichlet Allocation (hLDA). In hLDA, the latent topics found among a set of articles are structured hierarchically in a tree. Each node represents a topic, and the levels represent different levels of abstraction in the topics. A further extension of hLDA is constrained hLDA, where a set of predefined, constrained topics is added to the tree. The constrained topics are extracted from the dataset by grouping highly correlated words. The idea of constrained hLDA is to improve the topic structure derived by an hLDA model by making the process semi-supervised. The aim of this thesis is to create an hLDA and a constrained hLDA model from a dataset of articles provided by Opera. The models are then evaluated using the novel metric word frequency similarity, which is a measure of the similarity between the words representing the parent and child topics in a hierarchical topic model. The results show that word frequency similarity can be used to evaluate whether the topics in a parent-child topic pair are too similar, so that the child does not specify a subtopic of the parent. It can also be used to evaluate whether the topics are too dissimilar, so that they seem unrelated and perhaps should not be connected in the hierarchy. The results also show that the two topic models created had comparable word frequency similarity scores, and neither model seemed to significantly outperform the other with regard to the metric.
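The word frequency similarity metric is defined in the thesis itself; as a rough illustration only, the sketch below shows one plausible reading of it, assumed here to be the cosine similarity between the word-frequency vectors of a parent topic and a child topic.

```python
# Assumption: "word frequency similarity" read as cosine similarity between the
# word-frequency vectors of a parent and a child topic; the thesis gives the exact definition.
from collections import Counter
import math

def word_frequency_similarity(parent_words: list[str], child_words: list[str]) -> float:
    """Cosine similarity between the word-frequency vectors of two topics."""
    p, c = Counter(parent_words), Counter(child_words)
    vocab = set(p) | set(c)
    dot = sum(p[w] * c[w] for w in vocab)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in c.values()))
    return dot / norm if norm else 0.0

# Scores near 1 suggest the child adds no specificity over its parent;
# scores near 0 suggest the pair may be unrelated and should perhaps not be linked.
parent = ["sport", "football", "match", "team", "goal"]
child = ["football", "goal", "penalty", "referee", "team"]
print(round(word_frequency_similarity(parent, child), 3))  # 0.6
```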
397

Hierarchical Text Topic Modeling with Applications in Social Media-Enabled Cyber Maintenance Decision Analysis and Quality Hypothesis Generation

SUI, ZHENHUAN 27 October 2017 (has links)
No description available.
398

Designing Cost Effective and Flexible Vinyl Windows Supply Chain: Assembly Line Design Using CM/SERU Concepts and Simultaneous Selection of Facilities and Suppliers

Khan, Mohd Rifat 19 September 2017 (has links)
No description available.
399

An ontology-based framework for formulating spatio-temporal influenza (flu) outbreaks from twitter

Jayawardhana, Udaya Kumara 29 July 2016 (has links)
No description available.
400

Text Curation for Clustering of Free-text Survey Responses / Textbehandling för klustring av fritextsresponer i enkäter

Gefvert, Anton January 2023
When issuing surveys, having the option for free-text answer fields is only feasible when the number of respondents is small, as the work to summarize the answers becomes unmanageable with a large number of responses. Using NLP techniques to cluster these answers and summarize them would allow a greater range of survey creators to incorporate free-text answers in their surveys without making their workload too large. Academic work in this domain is sparse, especially for smaller languages such as Swedish. The Swedish company iMatrics is regularly hired to do this kind of summarizing, specifically for workplace-related surveys. Their method of clustering has been semi-automatic, where both manual preprocessing and postprocessing have been necessary to accomplish the task. This thesis aims to explore whether using more advanced, unsupervised NLP text representation methods, namely SentenceBERT and Sent2Vec, can improve upon these results and reduce the manual work needed for this task. Specifically, three questions are to be answered. Firstly, do the methods show good results? Secondly, can they remove the time-consuming postprocessing step of combining a large number of clusters into a smaller number? Lastly, can a model be found whose unsupervised learning metrics correlate with its real-world usability, thus indicating that these metrics can be used to optimize the model for new data? To answer these questions, several models are trained, employed, and then compared using both internal and external metrics: Sent2Vec, SentenceBERT, and traditional baseline models. A manual evaluation procedure is performed to assess what the real-world usability of the clusterings looks like, to see how well the models perform as well as whether there is any correlation between this result and the internal metrics for the clustering. The results indicate that improving the text representation step is not sufficient for fully automating this task. Some of the models show promise in the human evaluation, but given the unsupervised nature of the problem and the large variance between models, it is difficult to predict performance on new data. Thus, the models can serve as an improvement to the workflow, but the need for manual work remains.
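The pipeline the abstract describes can be illustrated with a short sketch: embed the free-text responses, cluster them, and compute an internal metric such as the silhouette score. The embedding model, the Swedish example responses, and the fixed number of clusters below are assumptions for illustration, not the thesis's actual setup.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Invented Swedish survey responses standing in for the workplace-survey data.
responses = [
    "Jag vill ha mer flexibla arbetstider",
    "Bättre kommunikation från ledningen behövs",
    "Fler möjligheter till distansarbete",
    "Cheferna informerar för sällan om beslut",
    "Lönen borde ses över oftare",
    "Mer transparens kring löneprocessen",
]

# A multilingual sentence embedding model as a stand-in text representation.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
embeddings = model.encode(responses)

# Cluster the embeddings and report an internal clustering metric.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(embeddings)
print("cluster labels:", kmeans.labels_)
print("silhouette:", round(silhouette_score(embeddings, kmeans.labels_), 3))
```

The thesis's finding that internal metrics do not reliably predict real-world usability is exactly the caveat to keep in mind when reading the silhouette score printed here.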
