Global ETD Search

41	Developing Intelligent Chatbots at Scania : Integrating Technological Solutions and Data Protection Considerations Söderberg, Johan January 2024 (has links) This thesis researches the complex intersection of Data Protection and Intelligent Chatbots (IC) at Scania Group. Developing intelligent chatbots in a secure and GDPR-compliant way is a highly complicated and multifaceted task. The purpose of this research is to provide Scania with organizational knowledge on how this can be achieved. This study utilizes the Action Design Research framework to develop an artifact which integrates technological solutions with data protection considerations. By conducting a literature review and semi-structured interviews with employees at Scania, three potential solutions are identified and evaluated: ChatGPT Enterprise, the Secured AI Knowledge Repository (SAIKR), and Techtalker. Each solution offers different capabilities and compliance strategies: ChatGPT Enterprise, while practical, relies on contractual assurances for GDPR compliance with data stored in the USA. SAIKR, on the other hand, offers more control with data stored and encrypted in Sweden, allowing for the use of advanced privacy-preserving techniques. Techtalker, which is hosted directly by Scania, provides enhanced security measures tailored to specific technical use cases. Based on the artifact and conclusions of this research, generalized design principles for developing intelligent chatbots within a corporate structure are formulated. These four design principles encourage the utilization of RAG and LLMs, safe and legal data localization, strong contractual safeguards with third-party providers, and a comprehensive risk analysis with stringent security measures. Read more Generative AI Data Protection Privacy GDPR Large Language Models RAG Design Principles Communication Systems Kommunikationssystem Computer Systems Datorsystem
42	Narrative Engineering: Tools, Computational Structure, and Impact of Stories DeBuse, Michael A. 23 December 2024 (has links) (PDF) Computational Linguistics has a long history of applying mathematics to the grammatical and syntactic structure of language; however, applying math to the more complex aspects of language, such as narrative, plot, scenes, character relations, causation, etc. remains a difficult topic. The goal of my research is to bridge the narrative humanities with mathematics, to computationally grasp at these difficult topics, and help develop the field of Narrative Engineering. I view narrative and story with the same mathematical scrutiny as other engineering fields, to take the creativity and fluidity of story and encode it in mathematical representations that have meaning beyond probability and statistical predictions that are the primary function of modern large language models. Included in this research is how stories and narratives are structured, evolve, and change, implying that there exists an inherent narrative computation that we as humans do to merge and combine ideas into new and novel ones. Our thoughts and knowledge and opinions determine the stories we tell, as a combination of everything we have seen, read, heard, and otherwise experienced. Narratives have the ability to inform and change those thoughts and opinions, which then lead to the creation of new and novel narratives. In essence, stories can be seen as a programming language for people. My dissertation, then, is to better understand stories and the environments in which stories are shared. I do this through developing tools that detect, extract, and model aspects of stories and their environments; developing mathematical models of stories and their spread environments; and investigating the impact and effects on stories and their spread environments. I then finish with a discussion on the ethical concerns of research in narrative influence and opinion control.
Read more Narrative Modeling Opinion Dynamics Analysis Tools Natural Language Processing Large Language Models Machine Learning Ethics Physical Sciences and Mathematics
43	Topic Modeling for Heterogeneous Digital Libraries: Tailored Approaches Using Large Language Models Dasu, Pradyumna Upendra 10 January 2025 (has links) Digital libraries hold vast and diverse content, with electronic theses and dissertations (ETDs) being among the most diverse. ETDs span multiple disciplines and include unique terminology, making achieving clear and coherent topic representations challenging. Existing topic modeling techniques often struggle with such heterogeneous collections, leaving a gap in providing interpretable and meaningful topic labels. This thesis addresses these challenges through a three-step framework designed to improve topic modeling outcomes for ETD metadata. First, we developed a custom preprocessing pipeline to enhance data quality and ensure consistency in text analysis. Second, we applied and optimized multiple topic modeling techniques to uncover latent themes, including LDA, ProdLDA, NeuralLDA, Contextualized Topic Models, and BERTopic. Finally, we integrated Large Language Models (LLMs), such as GPT-4, using prompt engineering to augment traditional topic models, refining and interpreting their outputs without replacing them. The framework was tested on a large corpus of ETD metadata, including through preliminary testing on a small subset. Quantitative metrics and user studies were used to evaluate performance, focusing on the clarity, accuracy, and relevance of the generated topics. The results demonstrated significant improvements in topic coherence and interpretability, with user study participants highlighting the value of the enhanced representations. These findings underscore the potential of combining customized preprocessing, advanced topic modeling, and LLM-driven refinements to better represent themes in complex collections like ETDs, providing a foundation for downstream tasks such as searching, browsing, and recommendation. 
/ Master of Science / Digital libraries store vast information, including books, research papers, and electronic theses and dissertations (ETDs). ETDs are incredibly diverse, covering most academic fields and using highly specialized language. This diversity makes it challenging to create clear and meaningful summaries of the main themes within these collections. Our study addresses this challenge by developing a three-step framework and applying it to ETDs. First, we cleaned and standardized the data to make it easier to analyze. Second, we used advanced techniques to uncover patterns and group similar topics together. Finally, we improved these topics using powerful tools like GPT-4, which helped make the themes more precise, more accurate, and easier to interpret. We tested this framework on both a small and a large collection of ETDs. Combining quantitative evaluations and user feedback showed that our methods significantly improved how the topics represented the content. This work lays the foundation for more effective future tools to help people search, explore, and navigate large collections of academic works. Read more Topic Modeling Natural Language Processing Large Language Models Electronic Theses and Dissertations Digital Libraries Information Storage and Retrieval Artificial Intelligence Search and Recommendation
44	LLMS FOR SENTIMENT ANALYSIS IN EDUCATION: A STUDY IN RESOURCE-LIMITED SETTINGS J Hwang (10867428) 06 March 2025 (has links) Sentiment analysis is a computational technique employed to extract and interpret subjective information from textual data. It involves the identification and classification of sentiments, opinions, and emotions expressed within the text. By analyzing linguistic cues, such as word choice, syntax, and sentiment lexicons, sentiment analysis can discern a range of emotions, from positive to negative, as well as more nuanced sentiments, such as anger, joy, or surprise. This powerful tool has the potential to unlock valuable insights from vast amounts of unstructured text data, which enables informed decision-making and effective communication in various domains, including education. Recent advances in sentiment analysis have leveraged the power of deep neural networks, particularly general-purpose Large Language Models (LLMs) trained on extensive labeled datasets. However, real-world applications frequently encounter challenges related to the availability of large, high-quality labeled data and the computational resources necessary for training such models. This research addresses these challenges by investigating effective strategies for utilizing LLMs in scenarios with limited data and computational resources. Specifically, this study explores three techniques: zero-shot learning, N-shot learning, and fine-tuning. By evaluating these methods, this research aims to demonstrate the feasibility of employing general-purpose LLMs for sentiment analysis within educational contexts even when access to computational resources and labeled data is limited. The findings of this study reveal that different adaptation methods lead to significantly different LLM performance outcomes.
Read more Natural language processing Affective computing Context learning Deep learning Large Language Models (LLMs) Sentiment Analysis Education
45	How do wage wars affect employer reputation in a competitive labor market? Evidence from Indeed.com reviews Catabia, Hannah B. 05 March 2025 (has links) 2024 / This thesis empirically evaluates the impact of voluntary minimum wage changes on firm reputation using data from the hiring website Indeed.com. As a starting point, I show that when Target and Amazon unilaterally raised their minimum wages, their ratings on Indeed.com improved substantially across multiple dimensions: work-life balance, compensation, job security, management, and culture. Next, I examine the impact of a focal firm voluntarily raising its minimum wage on the ratings of similar firms in proximal locations. Using a differences-in-differences (DiD) design, I present preliminary evidence that competitors that are located near the focal firm may experience a negative reputational shock relative to similar firms that are geographically distant. Additionally, this thesis applies novel sentiment analysis techniques to evaluate minimum wage policies on review text. Using state-of-the-art NLP models such as Claude, ChatGPT, and RoBERTa, I identify and score two topics that are important to job reviewers, but do not receive star ratings on Indeed.com: "Scheduling and Hours" and "Workload and Compensation." Finally, I use LLMs to perform zero-shot fine-grained sentiment analysis to investigate how a company's reputation with regard to these topics is impacted by voluntary minimum wage policies. In these analyses, I am not able to reject the null hypothesis, though the method demonstrates promise for further development. Read more Computer science Economics Marketing difference-in-differences (DiD) Indeed.com Labor market Large language models (LLMs) Minimum wage Online reviews
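The differences-in-differences comparison described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's estimator: the function name `did_estimate` and all star ratings below are hypothetical, and a real analysis would typically use a regression with controls and standard errors.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 differences-in-differences estimate.

    Change in the treated group minus change in the control group.
    Here "treated" stands for competitors located near the focal firm
    and "control" for similar firms that are geographically distant.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical star ratings, invented purely for illustration.
nearby_pre = [3.5, 3.5]
nearby_post = [3.25, 3.25]
distant_pre = [3.5, 3.0]
distant_post = [3.5, 3.0]

effect = did_estimate(nearby_pre, nearby_post, distant_pre, distant_post)
print(f"DiD estimate: {effect:+.2f} stars")
```

A negative estimate in this toy setup would correspond to nearby competitors losing rating points relative to distant ones, which is the direction of the preliminary evidence the abstract reports.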
46	Generating Terraform Configuration Files with Large Language Models / Att skapa Terraform-konfigurationsfiler med stora språkmodeller Bonde, Oskar January 2022 (has links) This thesis explores how large language models can be used to generate configuration files for Terraform from natural language descriptions. Few-shot and fine-tuning paradigms are evaluated on decoder-only models of varying size, including the state-of-the-art Codex model. The generated configuration files are evaluated with regard to functional correctness on a custom dataset using Terraform, to account for the large space of functionally equivalent configuration files. Results show that the largest model, Codex, is very capable at generating configuration files given an English description of network infrastructure even without fine-tuning. The result could be a useful tool for engineers who know Terraform fundamentals and have experience with the cloud platforms AWS, GCP, or Azure. A future study could fine-tune Codex for Terraform using OpenAI's API or create an open source Codex replication by fine-tuning the GPT-3 replication OPT, which in turn can be fine-tuned for Terraform. / This thesis investigates how large language models can be used to generate configuration files for Terraform from natural language descriptions. Both few-shot and fine-tuning paradigms are evaluated on decoder-only models of various sizes, including Codex. To account for configuration files that look different but are functionally equivalent, the configuration files are evaluated on the basis of their function. The results show that Codex, the largest model, is able to generate configuration files given an English description of network infrastructure, even though Codex has not undergone fine-tuning. The result could be a useful tool for engineers who have basic knowledge of Terraform and experience with the cloud platforms AWS, GCP, or Azure.
A future study could train Codex for Terraform using OpenAI's API, or create a Codex replica by training the GPT-3 replica OPT, which in turn can be trained for Terraform. Read more Terraform Transformer models Generating configuration files Large Language Models Codex Terraform Transformer-modeller Generera konfigurationsfiler Stora språkmodeller Codex Computer Sciences Datavetenskap (datalogi)
47	Language Models as Evaluators : A Novel Framework for Automatic Evaluation of News Article Summaries / Språkmodeller som Utvärderare : Ett Nytt Ramverk för Automatiserad Utvärdering av Nyhetssammanfattningar Helgesson Hallström, Celine January 2023 (has links) The advancements in abstractive summarization using Large Language Models (LLMs) have brought with them new challenges in evaluating the quality and faithfulness of generated summaries. This thesis explores a human-like automated method for evaluating news article summaries. By leveraging two LLMs with instruction-following capabilities (GPT-4 and Claude), the aim is to examine to what extent the quality of summaries can be measured by predictions of an LLM. The proposed framework involves defining specific attributes of desired summaries, which are used to design generation prompts and evaluation questions. These questions are presented to the LLMs in natural language during evaluation to assess various summary qualities. To validate the effectiveness of the evaluation method, an adversarial approach is employed, in which a dataset comprising summaries with distortions related to various summary attributes is generated. In an experiment, the two LLMs evaluate the adversarial dataset, and their ability to detect known distortions is measured and analyzed. The findings suggest that the LLM-based evaluations demonstrate promise in detecting binary qualitative issues, such as incorrect facts. However, the reliability of the zero-shot evaluation varies depending on the evaluating LLM and the specific questions used. Further research is required to validate the accuracy and generalizability of the results, particularly in subjective dimensions where the results of this thesis are inconclusive. Nonetheless, this thesis provides insights that can serve as a foundation for future advancements in the field of automatic text evaluation.
/ The advances made in abstractive summarization using large language models (LLMs) have brought new challenges in evaluating the quality and truthfulness of generated summaries. This thesis explores a human-inspired automated method for evaluating summaries of news articles. By making use of two LLMs with instruction-following capabilities (GPT-4 and Claude), the aim is to examine to what extent the quality of summaries can be determined using language models as evaluators. The proposed framework involves defining specific properties of desired summaries, which are used to design generation prompts and evaluation questions. These questions are presented to the language models in natural language during evaluation in order to assess various qualities of summaries. To validate the evaluation method, an adversarial approach is used in which a dataset comprising summaries with distortions related to various summary attributes is generated. In an experiment, the two language models evaluate the adversarial summaries, and their ability to detect known distortions is measured and analyzed. The results suggest that the language models show promise in detecting binary qualitative problems, such as factual errors. However, the reliability of the evaluation varies depending on which language model is used and the specific questions asked. Further research is required to validate the reliability and generalizability of the results, particularly for subjective dimensions where the results are uncertain. Nevertheless, this work provides insights that can form a basis for future advances in the field of automatic text evaluation.
Read more Natural Language Processing Large Language Models Automatic Text Evaluation Text Summarization Multilingualism Naturlig Språkbehandling Stora Språkmodeller Automatisk Textutvärdering Textsammanfattning Flerspråkighet Computer and Information Sciences Data- och informationsvetenskap
48	Cookie Monsters : Using Large Language Models to Measure GDPR Compliance in Cookie Banners Automatically Otterström, Marcus, Palonkorpi, Oliver January 2023 (has links) There is a widespread problem of cookie banners not being compliant with the General Data Protection Regulation (GDPR), which negatively impacts user experience and violates personal data rights. To mitigate this issue, strides need to be made in violation detection to assist developers, designers, lawyers, organizations, and authorities in designing and enforcing GDPR-compliant cookie banners. In this thesis, we present a novel method and an open-source tool for automatically analyzing the GDPR compliance of cookie banners. The tool uniquely leverages large language models together with static code analysis to locate and analyze any cookie banner, using only the website address as input. Informed by the Design Science Research methodology, our research process involved interviews with GDPR legal experts and a thorough review of current literature in order to understand the problem context and define the objectives for our solution. After an initial version of the tool was created, an evaluation was performed by a GDPR legal expert. The feedback revealed that even at this early development stage, the tool approaches the capabilities of a trained eye, which illustrates its potential. Furthermore, our proposed method is generalizable and can be used in many domains to solve various problems (e.g., more generalized web scraping). However, further development and testing with the help of legal experts are required to enhance the tool's accuracy and validity. Read more cookie banners gdpr compliance consent large language models design science research Information Systems, Social aspects
49	Characterizing, classifying and transforming language model distributions Kniele, Annika January 2023 (has links) Large Language Models (LLMs) have become ever larger in recent years, typically demonstrating improved performance as the number of parameters increases. This thesis investigates how the probability distributions output by language models differ depending on the size of the model. For this purpose, three features for capturing the differences between the distributions are defined, namely the difference in entropy, the difference in probability mass in different slices of the distribution, and the difference in the number of tokens covering the top-p probability mass. The distributions are then put into different distribution classes based on how they differ from the distributions of the differently-sized model. Finally, the distributions are transformed to be more similar to the distributions of the other model. The results suggest that classifying distributions before transforming them, and adapting the transformations based on which class a distribution is in, improves the transformation results. It is also shown that letting a classifier choose the class label for each distribution yields better results than using random labels. Furthermore, the findings indicate that transforming the distributions using entropy and the number of tokens in the top-p probability mass makes the distributions more similar to the targets, while transforming them based on the probability mass of individual slices of the distributions makes the distributions more dissimilar. Read more Large Language Models (LLMs) GPT BERT NLP deep learning machine learning computational linguistics language technology
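The three features named in the abstract above (difference in entropy, difference in probability mass per slice, and difference in the number of tokens covering the top-p mass) can be illustrated with a short sketch. This is an assumption-laden example rather than the thesis's implementation: the function names, the rank-based definition of a "slice", the default slice boundaries, and the toy distributions are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def top_p_count(probs, top_p=0.9):
    """Number of highest-probability tokens needed to cover top_p mass."""
    total = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= top_p:
            return count
    return len(probs)


def slice_mass(probs, start, stop):
    """Probability mass held by tokens ranked start..stop-1 (assumed slicing)."""
    ranked = sorted(probs, reverse=True)
    return sum(ranked[start:stop])


def distribution_features(small, large, top_p=0.9,
                          slices=((0, 1), (1, 10), (10, None))):
    """Feature differences (large minus small) between two next-token
    distributions over the same vocabulary."""
    return {
        "entropy_diff": entropy(large) - entropy(small),
        "top_p_count_diff": top_p_count(large, top_p) - top_p_count(small, top_p),
        "slice_mass_diff": [
            slice_mass(large, a, b) - slice_mass(small, a, b) for a, b in slices
        ],
    }


# Toy distributions over a four-token vocabulary, invented for illustration.
small_model = [0.25, 0.25, 0.25, 0.25]
large_model = [0.6, 0.25, 0.1, 0.05]
features = distribution_features(small_model, large_model, top_p=0.8)
print(features)
```

In this toy case the second distribution is more peaked, so its entropy and top-p token count are both lower than those of the uniform one; features of this kind could then feed a classifier that assigns each distribution to a class before transformation.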
50	Self-Reflection on Chain-of-Thought Reasoning in Large Language Models / Självreflektion över Chain-of-Thought-resonerande i stora språkmodeller Praas, Robert January 2023 (has links) A strong capability of large language models is Chain-of-Thought reasoning. Prompting a model to ‘think step-by-step’ has led to great performance improvements in solving problems such as planning and question answering, and with the extended output it provides some evidence about the rationale behind an answer or decision. In search of better, more robust, and interpretable language model behavior, this work investigates self-reflection in large language models. Here, self-reflection consists of feedback from large language models on medical question answering, and the question is whether this feedback can be used to accurately distinguish between correct and incorrect answers. GPT-3.5-Turbo and GPT-4 provide zero-shot feedback scores to Chain-of-Thought reasoning on the MedQA (medical question-answering) dataset. The question answering is evaluated on traits such as being structured, relevant, and consistent. We test whether the feedback scores are different for questions that were either correctly or incorrectly answered by Chain-of-Thought reasoning. The potential differences in feedback scores are statistically tested with the Mann-Whitney U test. Graphical visualization and logistic regressions are performed to preliminarily determine whether the feedback scores are indicative of whether the Chain-of-Thought reasoning leads to the right answer. The results indicate that among the reasoning objectives, the feedback models assign higher feedback scores to questions that were answered correctly than to those that were answered incorrectly. Graphical visualization shows potential for reviewing questions with low feedback scores, although logistic regressions that aimed to predict whether or not questions were answered correctly mostly defaulted to the majority class.
Nonetheless, there seems to be a possibility for more robust output from self-reflecting language systems. / A strong capability of large language models is Chain-of-Thought reasoning. Prompting a model to think step by step has led to large performance improvements in solving problems such as planning and question answering, and the extended output provides some evidence about the logic behind an answer or decision. In search of better, more robust, and interpretable language model behavior, this work investigates self-reflection in large language models. The research question is: to what extent can feedback from large language models, such as GPT-3.5-Turbo and GPT-4, accurately distinguish between correct and incorrect answers in medical question-answering tasks through the use of Chain-of-Thought reasoning? Here, GPT-3.5-Turbo and GPT-4 give zero-shot feedback scores to Chain-of-Thought reasoning on the MedQA (medical question-answering) dataset. The question answering should be structured, relevant, and consistent. The feedback scores are compared between two groups of questions, based on whether they were initially answered correctly or incorrectly. Statistical testing is performed on the difference in feedback scores with the Mann-Whitney U test. Graphical visualization and logistic regressions are performed to preliminarily determine whether the feedback scores are indicative of whether Chain-of-Thought reasoning leads to the right answer. The results indicate that among the reasoning objectives, the feedback models assign more positive feedback scores to questions that were answered correctly than to those that were answered incorrectly. Graphical visualization shows potential for reviewing questions with low feedback scores, although logistic regressions that aimed to predict whether or not questions were answered correctly mostly defaulted to the majority class. Nevertheless, there appears to be potential for more robust output from self-reflecting language systems.
Read more Large language models Chain-of-Thought reasoning Metareasoning Question answering Self-correction Ethical AI Stora språkmodeller Chain-of-Thought-resonemang Metareasoning Frågesvar Självkorrigering Etisk AI Computer and Information Sciences Data- och informationsvetenskap
