About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university, consortium, or country archive and want to be added, details can be found on the NDLTD website.
1

Can an LLM find its way around a Spreadsheet?

Lee, Cho Ting 05 June 2024
Spreadsheets are routinely used in business and scientific contexts, and one of the most vexing challenges data analysts face is performing data cleaning prior to analysis and evaluation. The ad-hoc and arbitrary nature of data cleaning problems, such as typos, inconsistent formatting, missing values, and a lack of standardization, often creates the need for highly specialized pipelines. We ask whether an LLM can find its way around a spreadsheet and how to support end-users in taking their free-form data processing requests to fruition. Just as RAG retrieves context to answer users' queries, we demonstrate how we can retrieve elements from a code library to compose data processing pipelines. Through comprehensive experiments, we demonstrate the quality of our system and how it is able to continuously augment its vocabulary by saving new code and pipelines back to the code library for future retrieval. / Master of Science / Spreadsheets are frequently utilized in both business and scientific settings, and one of the most challenging tasks that must be accomplished before analysis and evaluation can take place is the cleansing of the data. The ad-hoc and arbitrary nature of issues in data quality, such as typos, inconsistent formatting, missing values, and lack of standardization, often creates the need for highly specialized data cleaning pipelines. Within the scope of this thesis, we investigate whether a large language model (LLM) can navigate its way around a spreadsheet, as well as how to assist end-users in bringing their free-form data processing requests to fruition. Just as Retrieval-Augmented Generation (RAG) retrieves context to answer user queries, we demonstrate how we can retrieve elements from a Python code reference to compose data processing pipelines. Through comprehensive experiments, we showcase the quality of our system and how it is capable of continuously improving its code-writing ability by saving new code and pipelines back to the code library for future retrieval.
2

Investigating the use of LLMs for automated test generation: challenges, benefits, and suitability

Hurani, Muaz, Idris, Hamzeh January 2024
This thesis investigates the application of Large Language Models (LLMs) in automated test generation for software development, focusing on their challenges, benefits, and suitability for businesses. The study employs a mixed-methods approach, combining a literature review with empirical evaluations through surveys, interviews, and focus groups involving software developers and testers. Key findings indicate that LLMs enhance the efficiency and speed of test case generation, offering substantial improvements in test coverage and reducing development costs. However, the integration of LLMs poses several challenges, including technical complexities, the need for extensive customization, and concerns about the quality and reliability of the generated test cases. Additionally, ethical issues such as data biases and the potential impact on job roles were highlighted. The results show that while LLMs excel in generating test cases for routine tasks, their effectiveness diminishes in complex scenarios requiring deep domain knowledge and intricate system interactions. The study concludes that with proper training, continuous feedback, and iterative refinement, LLMs can be effectively integrated into existing workflows to complement traditional testing methods.
3

Evaluating the Impact of Hallucinations on User Trust and Satisfaction in LLM-based Systems

Oelschlager, Richard January 2024
Hallucinations in LLMs refer to instances where the models generate outputs that are unrelated, incorrect, or misleading based on the input provided. This thesis investigates the impact of hallucinations in large language model (LLM)-based systems on user trust and satisfaction, a critical issue as AI becomes increasingly integrated into everyday applications. Hallucinations in LLMs—instances where the model generates incorrect or misleading information—pose significant challenges for user reliability and overall system effectiveness. Given the expanding role of AI in sectors requiring high trust levels, such as healthcare and finance, understanding and mitigating these errors is paramount.

To address this issue, a controlled experiment was designed to systematically assess how hallucinations affect user trust and satisfaction. Participants interacted with an AI system designed to exhibit varying levels of hallucinatory behavior. Quantitative measures of trust and satisfaction were collected through standardized questionnaires pre- and post-interaction, accompanied by statistical analyses to evaluate changes in user perception.

The results clearly demonstrate that hallucinations significantly diminish user trust and satisfaction, confirming the hypothesis that the accuracy of AI outputs is crucial for user reliance. These findings not only contribute to the academic discourse on human-AI interaction, but also have practical implications for AI developers and policymakers focusing on creating and regulating reliable AI technologies.

This study bridges a crucial knowledge gap and provides a foundation for future research aimed at developing more robust and trustworthy AI systems. Readers engaged in AI development, implementation, and policymaking will find the insights particularly relevant, encouraging further exploration into strategies that could enhance user trust in AI technologies.
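The pre-/post-interaction questionnaire design described in this abstract lends itself to a paired analysis. The sketch below computes a paired t-statistic over hypothetical 7-point trust ratings; the numbers are fabricated for illustration and are not the study's data.

```python
import math

def paired_t(pre, post):
    """Paired t-statistic for pre/post scores from the same participants."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample variance of the paired differences (Bessel's correction).
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

# Hypothetical 7-point trust ratings before and after a hallucination-heavy session.
pre = [6, 5, 6, 7, 5, 6, 6, 5]
post = [3, 4, 2, 4, 3, 3, 4, 3]
t = paired_t(pre, post)  # a large negative t indicates a drop in trust
```

In practice the study's statistical analysis may differ; this only illustrates the pre/post comparison the abstract outlines.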
4

Analyzing Large Language Models For Classifying Sexual Harassment Stories With Out-of-Vocabulary Word Substitution

Seung Yeon Paik (18419409) 25 April 2024
Sexual harassment is regarded as a serious issue in society, with a particularly negative impact on young children and adolescents. Online sexual harassment has recently gained prominence as a significant number of communications have taken place online. Online sexual harassment can happen anywhere in the world because of the global nature of the internet, which transcends geographical barriers and allows people to communicate electronically. It can occur in a wide variety of environments, such as through work mail or chat apps in the workplace, on social media, in online communities, and in games (Chawki & El Shazly, 2013). However, non-native English speakers in particular may vary in their understanding or interpretation of text-based sexual harassment due to cultural differences and language barriers (Welsh, Carr, MacQuarrie, & Huntley, 2006). To bridge this gap, previous studies have proposed large language models to detect and classify online sexual harassment, prompting a need to explore how language models comprehend the nuanced aspects of sexual harassment data. Before exploring the role of language models, it is critical to recognize the current gaps in knowledge that these models could potentially address in order to comprehend and interpret the complex nature of sexual harassment.

Large Language Models (LLMs) have attracted significant attention recently due to their exceptional performance on a broad spectrum of tasks. However, these models are characterized by being very sensitive to input data (Fujita et al., 2022; Wei, Wang, et al., 2022). Thus, the purpose of this study is to examine how various LLMs interpret data that falls under the domain of sexual harassment and how they comprehend it after replacing out-of-vocabulary words.

This research examines the impact of out-of-vocabulary words on the performance of LLMs in classifying sexual harassment behaviors in text. The study compares the story classification abilities of cutting-edge LLMs before and after the replacement of out-of-vocabulary words. Through this investigation, the study provides insights into the flexibility and contextual awareness of LLMs when managing delicate narratives in the context of sexual harassment stories, as well as raising awareness of sensitive social issues.
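The out-of-vocabulary substitution step can be illustrated with a toy example. The tiny vocabulary and synonym map below are hypothetical, not the study's actual resources; a real pipeline would draw substitutes from a thesaurus or embedding neighborhood.

```python
# Illustrative sketch of out-of-vocabulary (OOV) word substitution before
# classification. VOCAB and SYNONYMS are invented for this example.

VOCAB = {"he", "sent", "me", "unwanted", "messages", "at", "work"}
SYNONYMS = {"dms": "messages", "creepy": "unwanted"}  # OOV -> in-vocab stand-in

def substitute_oov(text, unk="[UNK]"):
    """Replace each OOV token with a known synonym, else a placeholder."""
    out = []
    for tok in text.lower().split():
        if tok in VOCAB:
            out.append(tok)
        else:
            out.append(SYNONYMS.get(tok, unk))
    return " ".join(out)

print(substitute_oov("He sent me creepy dms at work"))
# -> "he sent me unwanted messages at work"
```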
5

Augmenting Large Language Models with Humor Theory To Understand Puns

Ryan Rony Dsilva (18429846) 25 April 2024
This research explores the application of large language models (LLMs) to the comprehension of puns. Leveraging the expansive capabilities of LLMs, this study delves into the domain of pun classification by examining it through the prism of two humor theories: the Computational Model of Humor and the Benign Violation theory, which is an extension of the N+V Theory. The computational model posits that for a phrase to qualify as a pun, it must possess both ambiguity and distinctiveness, characterized by a word that can be interpreted in two plausible ways, each interpretation being supported by at least one unique word. The Benign Violation theory, on the other hand, posits that puns work by breaching one linguistic rule while conforming to another, thereby creating a "benign violation." Using these frameworks, this research scrutinizes a curated collection of English-language puns to assess the validity and effectiveness of each theory in accurately classifying them. We undertake controlled experiments on the dataset, selectively removing a condition specific to one theory and then evaluating the puns against the criteria of the other theory to see how well it classifies the altered inputs. This approach allows us to uncover deeper insights into the processes that facilitate the recognition of puns and to explore the practical implications of applying humor theories. The findings of our experiments, detailed in the subsequent sections, shed light on how altering specific conditions impacts the ability of LLMs to classify puns accurately according to each theory, with each component of a theory influencing the result to a different extent, thereby contributing to our understanding of humor mechanics through the eyes of LLMs.
6

Automatic text summarization of French judicial data with pre-trained language models, evaluated by content and factuality metrics

Adler, Malo January 2024
During an investigation carried out by a police officer or a gendarme, interview reports are written whose length can reach several pages. The high-level goal of this thesis is to study various automatic and reliable text summarization methods to help with this time-consuming task. One challenge comes from the specific, French and judicial data that we wish to summarize; another challenge comes from the need for reliable and factual models. First, this thesis focuses on automatic summarization evaluation, in terms of both content (how well the summary captures essential information of the source text) and factuality (to what extent the summary only includes information from or coherent with the source text). Factuality evaluation, in particular, is of crucial interest when using LLMs for judicial purposes, because of their hallucination risks. Notably, we propose a light variation of SelfCheckGPT, which has a stronger correlation with human judgment (0.743) than the widespread BARTScore (0.542) on our study dataset. Other paradigms, such as Question-Answering, are also studied in this thesis, but underperform in comparison. Then, extractive summarization methods are explored and compared, including one based on graphs via the TextRank algorithm, and one based on greedy optimization. The latter (overlap rate: 0.190, semantic similarity: 0.513) clearly outperforms the base TextRank (overlap rate: 0.172, semantic similarity: 0.506). An improvement of TextRank with a threshold mechanism is also proposed, leading to a non-negligible improvement (overlap rate: 0.180, semantic similarity: 0.513). Finally, abstractive summarization, with pre-trained LLMs based on a Transformer architecture, is studied. In particular, several general-purpose and multilingual models (Llama-2, Mistral and Mixtral) were objectively compared on a summarization dataset of judicial procedures from the French police.
Results show that the performance of these models is highly related to their size: Llama-2 7B struggles to adapt to uncommon data (overlap rate: 0.083, BARTScore: -3.099), while Llama-2 13B (overlap rate: 0.159, BARTScore: -2.718) and Llama-2 70B (overlap rate: 0.191, BARTScore: -2.479) have proven quite versatile and efficient. To improve the performance of the smallest models, empirical prompt-engineering and parameter-efficient fine-tuning are explored. Notably, our fine-tuned version of Mistral 7B reaches performance comparable to that of much larger models (overlap rate: 0.185, BARTScore: -2.060), without the need for empirical prompt-engineering, and with a linguistic style closer to what is expected.
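The extractive methods compared in this abstract can be illustrated with a dependency-free TextRank sketch. The word-overlap similarity follows the original TextRank formulation, and the `threshold` parameter mirrors the thresholded variant the thesis proposes; the sentences and parameter values below are illustrative only.

```python
import math

def similarity(s1, s2):
    """Word-overlap similarity, length-normalized as in the TextRank paper."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    if len(w1) < 2 or len(w2) < 2:
        return 0.0
    return len(w1 & w2) / (math.log(len(w1)) + math.log(len(w2)))

def textrank(sentences, threshold=0.0, d=0.85, iters=50):
    """Score sentences by power iteration; edges at or below threshold are dropped."""
    n = len(sentences)
    sim = [[similarity(a, b) if i != j else 0.0
            for j, b in enumerate(sentences)] for i, a in enumerate(sentences)]
    sim = [[v if v > threshold else 0.0 for v in row] for row in sim]
    scores = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                out_weight = sum(sim[j])
                if out_weight > 0 and sim[j][i] > 0:
                    rank += sim[j][i] / out_weight * scores[j]
            new.append((1 - d) + d * rank)
        scores = new
    return scores

sentences = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "rockets fly to space",
]
scores = textrank(sentences, threshold=0.1)  # isolated sentences score lowest
```

A summary is then the top-k sentences by score; the thresholded variant prunes weak edges so that loosely related sentences do not reinforce each other.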
7

Exploring artificial intelligence bias : a comparative study of societal bias patterns in leading AI-powered chatbots.

Udała, Katarzyna Agnieszka January 2023
The development of artificial intelligence (AI) has revolutionised the way we interact with technology and each other, both in society and in professional careers. Although they come with great potential for productivity and automation, AI systems have been found to exhibit biases that reflect and perpetuate existing societal inequalities. With the recent rise of artificial intelligence tools exploiting large language model (LLM) technology, such as ChatGPT, Bing Chat and Bard AI, this research project aims to investigate the extent of AI bias in said tools and explore its ethical implications. By reviewing and analysing responses to carefully crafted prompts generated by three different AI chatbot tools, the author intends to determine whether the content generated by these tools indeed exhibits patterns of bias related to various social identities, as well as to compare the extent to which such bias is present across all three tools. This study will contribute to the growing body of literature on AI ethics and inform efforts to develop more equitable and inclusive AI systems. By exploring the ethical dimensions of AI bias in selected LLMs, this research will shed light on the broader societal implications of AI and the role of technology in shaping our future.
8

Characterizing, classifying and transforming language model distributions

Kniele, Annika January 2023
Large Language Models (LLMs) have become ever larger in recent years, typically demonstrating improved performance as the number of parameters increases. This thesis investigates how the probability distributions output by language models differ depending on the size of the model. For this purpose, three features for capturing the differences between the distributions are defined, namely the difference in entropy, the difference in probability mass in different slices of the distribution, and the difference in the number of tokens covering the top-p probability mass. The distributions are then put into different distribution classes based on how they differ from the distributions of the differently-sized model. Finally, the distributions are transformed to be more similar to the distributions of the other model. The results suggest that classifying distributions before transforming them, and adapting the transformations based on which class a distribution is in, improves the transformation results. It is also shown that letting a classifier choose the class label for each distribution yields better results than using random labels. Furthermore, the findings indicate that transforming the distributions using entropy and the number of tokens in the top-p probability mass makes the distributions more similar to the targets, while transforming them based on the probability mass of individual slices of the distributions makes the distributions more dissimilar.
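Two of the three features this abstract defines, the entropy difference and the number of tokens covering the top-p probability mass, are straightforward to compute from a pair of next-token distributions. The `small` and `large` distributions below are invented for illustration and are not the thesis's data.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def top_p_count(probs, p=0.9):
    """Smallest number of tokens whose cumulative probability reaches p."""
    total, count = 0.0, 0
    for prob in sorted(probs, reverse=True):
        total += prob
        count += 1
        if total >= p:
            break
    return count

# Hypothetical next-token distributions from a smaller and a larger model.
small = [0.25, 0.25, 0.25, 0.25]  # flat: high entropy, wide top-p set
large = [0.7, 0.2, 0.05, 0.05]    # peaked: low entropy, narrow top-p set
entropy_diff = entropy(small) - entropy(large)  # positive: small model is flatter
topp_diff = top_p_count(small) - top_p_count(large)
```

The third feature, probability mass per slice of the sorted distribution, is a cumulative-sum over fixed index ranges and follows the same pattern.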
9

Exploring Generative AI for Enhanced Guided Buying Efficiency : A Case Study at Battery Manufacturing Firm

Gupta, Sparsh January 2024
The rapidly evolving domain of artificial intelligence has given rise to generative AI technology, which, unlike traditional machine learning, is capable of learning patterns from data and generating new, meaningful outputs. These models have applications in various domains, including customer service, content creation, and personalized recommendations. Understanding the implementation of generative AI is essential for business leaders to harness its potential and drive innovation. This thesis focuses on the application of generative AI for guided buying within the context of Company X, aiming to address the challenges and potential solutions in streamlining the purchase of goods and services. The research methodology draws on elements of the grounded theory approach, using focus group discourse for the empirical analysis. By exploring the impact of generative AI on procurement processes and an organization's orientation to guided buying, the study contributes to enhancing the strategic capabilities of the organization within the competitive industrial landscape. The results indicate three dimensions for the effective introduction of generative AI into procurement practices: 1) Operational Stakeholders, 2) Generative AI Robustness, and 3) Information Management. Overall, the thesis contributes to the broader academic effort to understand how to integrate generative AI technologies into various enterprise functions, specifically within Supply Chain and Procurement.
10

Advanced Large Language Models in Practice: A Study of ChatGPT-4 and Google Bard in Disinformation Management

Ahmadi, Aref, Barakzai, Ahmad Naveed January 2023
This study explores the capabilities and limitations of advanced large language models (LLMs), with a particular focus on ChatGPT-4 and Google Bard. It begins with a historical background on artificial intelligence and how that development led to the creation of these models. A critical analysis of their performance in language processing and problem solving follows. By evaluating their effectiveness in handling news content and social media, as well as in performing creative tasks such as puzzles, the study highlights their abilities in linguistic processing and the challenges they face in understanding nuance and exercising creative thinking.

The study found that LLMs have an advanced ability to understand and respond to complex language structures. This ability is not without limitations, however, especially for tasks that require careful judgment to distinguish truth from falsehood. This observation highlights a critical aspect of the current capacity of LLMs: they are effective in many areas, but still face challenges in handling the finer nuances of human language and thinking. The results also emphasize the importance of human oversight when using artificial intelligence (AI), pointing to the need for realistic expectations of AI's capabilities and underscoring the importance of responsible AI development, in which careful attention to ethical aspects is central. A combination of human intelligence and AI is proposed as a solution for handling complex challenges, contributing to a deeper understanding of the dynamics of advanced language models and their role in the broader development and application of AI.
