61

<b>Leveraging Advanced Large Language Models To Optimize Network Device Configuration</b>

Mark Bogdanov (18429435) 24 April 2024 (has links)
<p dir="ltr">Recent advancements in large language models such as ChatGPT and AU Large allow LLMs to be integrated effectively into network devices such as switches and routers, where they can play a role in configuration and management. These devices are an essential part of every network infrastructure, and physical networking topologies are complex by nature, so optimal network efficiency and security depend on meticulous, precise configurations.</p><p dir="ltr">The research explores the potential of an AI-driven interface that uses AU Large to streamline, enhance, and automate the configuration of network devices while guaranteeing the security of the whole process by running the entire system on-premise. The study concerns three core areas: the effectiveness of integrating AU Large into network management systems; the impact on efficiency, accuracy, and error rates in network configurations; and the scalability and adaptability to more complex requirements and growing network environments.</p><p dir="ltr">The key performance metrics evaluated are the error rate of the generated configurations, scalability as more network devices are added, and the accuracy of highly complex configurations. The high-level results show a clear correlation: as device count and prompt complexity increase, the performance of the AU Large model from Mistral AI degrades.</p><p dir="ltr">This research has significant potential to alter established network management practices by applying AI to make network configuration more efficient, reduce the scope for human error, and provide an adaptable tool for diverse and complex networking environments. It contributes to both the AI and network management fields by highlighting a path toward the "future of network management."</p>
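The on-premise configuration workflow described above implies a validation step between model output and live devices. As a hedged illustration (not code from the thesis: the required sections and the unsafe-command check are invented placeholder policies), such a guardrail might look like:

```python
# Illustrative guardrail for model-generated device configurations.
# REQUIRED_SECTIONS and the "erase" check are placeholder policies,
# not rules taken from the thesis.
REQUIRED_SECTIONS = ("hostname", "interface")

def validate_config(config_text):
    """Check generated configuration text before it is pushed to a device.

    Rejects output that is missing required sections or that contains an
    obviously destructive command, returning a small report dict.
    """
    lowered = config_text.lower()
    missing = [s for s in REQUIRED_SECTIONS if s not in lowered]
    unsafe = [line for line in lowered.splitlines()
              if line.strip().startswith("erase")]
    return {"ok": not missing and not unsafe,
            "missing": missing, "unsafe": unsafe}
```

A real deployment would replace these string checks with a vendor configuration parser and a much fuller policy.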
62

Natural Language Based AI Tools in Interaction Design Research : Using ChatGPT for Qualitative User Research Insight Analysis

Saare, Karmen January 2024 (has links)
This thesis investigates the use of Artificial Intelligence, specifically the Large Language Model (LLM) application ChatGPT, in the context of qualitative user research, with the goal of enhancing the analysis of user research interviews. Through an empirical study in which ChatGPT was used in a typical user research insight analysis, the limitations and opportunities of the AI tool are examined. The study's results highlight the most significant insights from the empirical investigation, serving as examples to raise awareness of the implications of using ChatGPT for user interview analysis. The study concludes that ChatGPT has the potential to enhance the interpretation of primarily individual interviews by generating well-articulated summaries, provided their accuracy can be verified. Additionally, ChatGPT may be particularly useful in low-risk design projects where the consequences of potential misinterpretations are minimal. Finally, the study points out that clearly articulated written instructions to ChatGPT are essential for good results.
63

The shifting landscape of data : learning to tame distributional shifts

Ibrahim, Adam 05 1900 (has links)
Les modèles d'apprentissage automatique (ML) atteignent des performances remarquables sur les tâches pour lesquelles ils sont entraînés. Cependant, ils sont souvent sensibles aux changements dans la distribution des données, ce qui peut nuire à leur fiabilité. Cela peut se produire lorsque la distribution des données rencontrées au déploiement diffère de celle vue pendant l'entraînement, entraînant une dégradation considérable des performances. Pire encore, les attaquants peuvent également provoquer de tels changements afin d'induire les modèles d'apprentissage automatique en erreur. Enfin, cela peut même arriver si l'entraînement est effectué séquentiellement sur des distributions de données différentes. Ces changements de distribution sont omniprésents en ML, nuisant à l'équité, à la fiabilité, à la sécurité et à l'efficacité des modèles d'apprentissage automatique. Cette thèse se concentre sur la compréhension et l'amélioration de la robustesse et de l'adaptation des modèles de ML aux changements de distribution, englobant à la fois des travaux théoriques et expérimentaux. Tout d'abord, nous étudions les limites fondamentales de l'optimisation différentiable à plusieurs objectifs. Une meilleure compréhension de ces limites est importante car les travaux sur les changements de distribution reposent souvent sur des formulations de la théorie des jeux. Nous fournissons de nouvelles bornes inférieures sur la vitesse de convergence d'une large classe de méthodes, ainsi que de nouvelles métriques de conditionnement qui aident à évaluer la difficulté d'optimiser des classes de jeux, et expliquent le potentiel de convergence rapide, même sans forte convexité ou forte concavité. Deuxièmement, nous abordons le manque de robustesse aux attaques adversarielles contre plusieurs types d'attaques, une limitation courante des méthodes de pointe.
Nous proposons une approche inspirée de la généralisation de domaine, utilisant l'extrapolation des risques (REx) pour promouvoir la robustesse à plusieurs attaques. Notre méthode atteint des performances supérieures aux bases de référence existantes, que les attaques aient été vues ou non lors de l'entraînement. Enfin, nous nous intéressons aux défis du pré-entraînement continu pour les grands modèles de langage (LLM). Ces modèles sont confrontés à un compromis : soit ils oublient de manière catastrophique les connaissances antérieures lorsqu'ils sont mis à jour sur de nouvelles données, soit ils nécessitent un réentraînement complet coûteux en calcul. Nous démontrons qu'une combinaison de réchauffement et de re-décroissance du taux d'apprentissage, et de réutilisation des données précédemment utilisées permet aux LLM d'apprendre continuellement à partir de nouvelles distributions tout en préservant leurs performances sur les données auparavant apprises. Cette approche permet d'atteindre les performances d'un réentraînement complet, mais à une fraction du coût en calcul. Dans l'ensemble, cette thèse apporte des considérations importantes pour améliorer la robustesse et l'adaptation aux changements de distribution. Ces contributions ouvrent des voies prometteuses pour relever les défis du ML du monde réel dans l'optimisation multiobjectif, la défense contre les adversaires et l'apprentissage continu des grands modèles de langage. / Machine learning (ML) models achieve remarkable performance on tasks they are trained for. However, they are often sensitive to shifts in the data distribution, which may lead to unexpected behaviour. This can happen when the data distribution encountered during deployment differs from that used for training, leading to considerable degradation of performance. Worse, attackers may also induce such shifts to fool machine learning models. Finally, this can even happen when training sequentially on different data distributions.
These distributional shifts are pervasive in ML, hindering the fairness, reliability, safety and efficiency of machine learning models. This thesis is focused on understanding and improving the robustness and adaptation of ML models to distributional shifts, encompassing both theoretical and experimental work. First, we investigate the fundamental limits of differentiable multiobjective optimisation. This investigation is important because works on distributional shifts often rely on game-theoretic formulations. We provide new lower bounds on the speed of convergence of a large class of methods, along with novel condition numbers that help assess the difficulty of optimising classes of games, and explain the potential for fast convergence even without strong convexity or strong concavity. Second, we address the lack of adversarial robustness against multiple attack types, a common limitation of state-of-the-art methods. We propose a domain generalisation-inspired approach, using Risk Extrapolation (REx) to promote robustness across a range of attacks. Our method achieves performance superior to existing baselines for both seen and novel types of attacks. Finally, we tackle the challenges of continual pretraining for large language models (LLMs). These models face a trade-off: either they catastrophically forget previous knowledge when updated on new data, or they require computationally expensive full retraining. We demonstrate that a combination of learning rate re-warming, re-decaying, and the replay of previous data allows LLMs to continually learn from new distributions while preserving past knowledge. This approach matches the performance of full retraining, but at a fraction of the computational cost. Overall, this thesis contributes important considerations for improving robustness and adaptation to distributional shifts.
These contributions open promising avenues for addressing real-world ML challenges across multiobjective optimisation, adversarial defense, and continual learning of large language models.
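The continual-pretraining recipe summarised in the abstract (learning-rate re-warming, re-decaying, and replay of previously seen data) can be sketched as follows. This is an illustrative sketch under assumed hyperparameters; the cosine shape and the 5% replay fraction are not taken from the thesis:

```python
import math

def rewarmed_cosine_lr(step, warmup_steps, total_steps, max_lr, min_lr):
    """LR schedule for continual pretraining on a new dataset:
    re-warm linearly up to max_lr, then re-decay with a cosine to min_lr."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

def mix_with_replay(new_batch, old_batch, replay_fraction=0.05):
    """Swap a small fraction of each new-data batch for previously seen
    examples to mitigate catastrophic forgetting (fraction is illustrative)."""
    n_old = max(1, int(len(new_batch) * replay_fraction))
    return new_batch[n_old:] + old_batch[:n_old]
```

Each training step would fetch a replay-mixed batch and set the optimiser's learning rate from the schedule before the update.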
64

Direct Preference Optimization for Improved Technical Writing Assistance : A Study of How Language Models Can Support the Writing of Technical Documentation at Saab / En studie i hur språkmodeller kan stödja skrivandet av teknisk dokumentation på Saab

Bengtsson, Hannes, Habbe, Patrik January 2024 (has links)
This thesis explores the potential of Large Language Models (LLMs) to assist in the technical documentation process at Saab. With the increasing complexity and regulatory demands on such documentation, the objective is to investigate advanced natural language processing techniques as a means of streamlining the creation of technical documentation. Although many standards exist, this thesis focuses on the ASD-STE100 standard, Simplified Technical English (STE), a controlled language for technical documentation. STE's primary aim is to ensure that technical documents are understandable to individuals regardless of their native language or English proficiency.  The study focuses on the implementation of Direct Preference Optimization (DPO) and Supervised Instruction Fine-Tuning (SIFT) to refine the capabilities of LLMs in producing clear and concise outputs that comply with STE. Through a series of experiments, we investigate the effectiveness of LLMs in interpreting and simplifying technical language, with a particular emphasis on adherence to STE standards. The study uses a dataset comprising target data paired with synthetic source data generated by an LLM. We apply various model training strategies, including zero-shot performance, supervised instruction fine-tuning, and direct preference optimization. We evaluate the models' output using established quantitative metrics for text simplification, substituting human evaluators with company-internal software that evaluates adherence to company standards and STE. Our findings suggest that while LLMs can significantly contribute to the technical writing process, the choice of training methods and the quality of data play crucial roles in the model's performance. The study shows how LLMs can improve productivity and reduce manual work, examines the remaining problems, and suggests ways to improve the automation of technical documentation in the future.
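The DPO objective referred to in the abstract has a standard closed form: the policy is trained to widen its log-probability margin between preferred and rejected outputs relative to a frozen reference model. A minimal sketch for a single preference pair (illustrative, not the authors' code; the β value is an arbitrary choice):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are sequence log-probabilities under the trained policy and a
    frozen reference model; beta scales the implicit KL-style penalty.
    """
    chosen_logratio = logp_chosen - ref_logp_chosen
    rejected_logratio = logp_rejected - ref_logp_rejected
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)): small when the policy prefers the chosen output
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At a zero margin the loss is log 2; it shrinks as the policy learns to prefer the chosen (e.g. STE-compliant) completion.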
65

Människors förtroende för AI: Könsrelaterad bias i AI-språkmodeller / People's Trust in AI: Gender Bias in Large Language Models

Forsman, Angela, Martinsson, Jonathan January 2024 (has links)
I en tid då AI-språkmodeller används alltmer i vår vardag, blir det relevant att undersöka hur det påverkar samhället. Denna studie undersöker, utifrån teorier om etik och jämställdhet, hur AI-språkmodeller i sina texter ger uttryck för mångfald, icke-diskriminering och rättvisa. Studien fokuserar på att identifiera och analysera förekomsten av könsbias i AI-språkmodellernas svar samt hur det påverkar människors förtroende för dessa system. En fallstudie genomfördes på tre AI-språkmodeller - ChatGPT 3.5, Gemini och Llama-2 70B, där data insamlades via intervjuer med dessa modeller. Därefter gjordes intervjuer med mänskliga informanter som reflekterade över AI-språkmodellernas svar. AI-språkmodellerna visade en obalans i hur de behandlar kvinnor och män, vilket kan förstärka befintliga könsstereotyper. Detta kan påverka människors förtroende för AI-språkmodeller och informanterna lyfte problematiken om vad neutralitet och rättvisa innebär. För att skapa mer ansvarsfulla och rättvisa AI-system krävs medvetna insatser för att integrera etiska och jämställdhetsperspektiv i AI-utveckling och användning. / In a time when Large Language Models (LLMs) are increasingly used in our daily lives, it becomes important to investigate how this affects society. This study examines how LLMs express diversity, non-discrimination, and fairness in texts, based on theories of ethics and gender equality. The study focuses on identifying and analyzing the presence of gender bias in the responses of LLMs and how this impacts people's trust in these systems. A case study was conducted on three LLMs: ChatGPT 3.5, Gemini, and Llama-2 70B, where data was collected through interviews with them. Subsequently, interviews were conducted with human informants who reflected on the LLMs’ responses. The LLMs showed an imbalance in how they treat women and men, potentially reinforcing existing gender stereotypes.
This can affect people's trust in LLMs, and the informants highlighted the issue of what neutrality and fairness entail. To create more responsible and fair AI systems, conscious efforts are required to integrate ethical and equality perspectives into AI development and usage.
66

Applying Large Language Models in Business Processes : A contribution to Management Innovation / Tillämpning av stora språkmodeller i affärsprocesser : Ett bidrag till Management Innovation

Bergman Larsson, Niklas, Talåsen, Jonatan January 2024 (has links)
This master thesis explores the transformative potential of Large Language Models (LLMs) in enhancing business processes across various industries, with a specific focus on Management Innovation. As organizations face the pressures of digitalization, LLMs emerge as powerful tools that can revolutionize traditional business workflows through enhanced decision-making, automation of routine tasks, and improved operational efficiency. The research investigates the integration of LLMs within four key business domains: Human Resources, Tender Management, Consultancy, and Compliance. It highlights how LLMs facilitate Management Innovation by enabling new forms of workflow automation, data analysis, and compliance management, thus driving substantial improvements in efficiency and innovation. Employing a mixed-method approach, the study combines an extensive literature review with surveys and interviews with industry professionals to evaluate the impact and practical applications of LLMs. The findings reveal that LLMs not only offer significant operational benefits but also pose challenges related to data security, integration complexities, and privacy concerns. This thesis significantly contributes to the academic and practical understanding of LLMs, proposing a framework for their strategic adoption to foster Management Innovation. It underscores the need for businesses to align LLM integration with both technological capabilities and strategic business objectives, paving the way for a new era of management practices shaped by advanced technologies. / Denna masteruppsats utforskar den transformativa potentialen hos Stora Språkmodeller (LLMs) i att förbättra affärsprocesser över olika industrier, med särskilt fokus på Management Innovation. 
När organisationer möter digitaliseringens press, framträder LLMs som kraftfulla verktyg som kan revolutionera traditionella affärsarbetsflöden genom förbättrat beslutsfattande, automatisering av rutinuppgifter och förbättrad operationell effektivitet. Forskningen undersöker integrationen av LLMs inom fyra centrala affärsområden: Human Resources, Anbudshantering, Konsultverksamhet och Regelefterlevnad. Den belyser hur LLMs underlättar Management Innovation genom att möjliggöra nya former av arbetsflödesautomatisering, dataanalys och efterlevnadshantering, vilket driver påtagliga förbättringar i effektivitet och innovation. Genom att använda en blandad metodansats kombinerar studien en omfattande litteraturöversikt med enkäter och intervjuer med branschproffs för att utvärdera påverkan och praktiska tillämpningar av LLMs. Resultaten visar att LLMs inte bara erbjuder betydande operationella fördelar utan även medför utmaningar relaterade till datasäkerhet, integrationskomplexitet och integritetsfrågor. Denna uppsats bidrar avsevärt till den akademiska och praktiska förståelsen av LLMs, och föreslår en ram för deras strategiska antagande för att främja Management Innovation. Den understryker behovet för företag att anpassa LLM-integrationen med både teknologiska kapabiliteter och strategiska affärsmål, vilket banar väg för en ny era av ledningspraxis formad av avancerade teknologier.
67

Minds, Machines & Metaphors : Limits of AI Understanding

Másson, Mímir January 2024 (has links)
This essay critically examines the limitations of artificial intelligence (AI) in achieving human-like understanding and intelligence. Despite significant advancements in AI, such as the development of sophisticated machine learning algorithms and neural networks, current systems fall short in comprehending the cognitive depth and flexibility inherent in human intelligence. Through an exploration of historical and contemporary arguments, including Searle's Chinese Room thought experiment and Dennett's Frame Problem, this essay highlights the inherent differences between human cognition and AI. Central to this analysis is the role of metaphorical thinking and embodied cognition, as articulated by Lakoff and Johnson, which are fundamental to human understanding but absent in AI. Proponents of AGI, like Kurzweil and Bostrom, argue for the potential of AI to surpass human intelligence through recursive self-improvement and technological integration. However, this essay contends that these approaches do not address the core issues of experiential knowledge and contextual awareness. By integrating insights from contemporary scholars like Bender, Koller, Buckner, Thorstad, and Hoffmann, the essay ultimately concludes that AI, while a powerful computational framework, is fundamentally incapable of replicating the true intelligence and understanding unique to humans.
68

KERMIT: Knowledge Extractive and Reasoning Model usIng Transformers

Hameed, Abed Alkarim, Mäntyniemi, Kevin January 2024 (has links)
In the rapidly advancing field of artificial intelligence, Large Language Models (LLMs) like GPT-3, GPT-4, and Gemini have revolutionized sectors by automating complex tasks. Despite their advancements, LLMs and, more noticeably, smaller language models (SLMs) still face challenges, such as generating unfounded content, known as "hallucinations." This project aims to enhance SLMs for broader accessibility without extensive computational infrastructure. By supervised fine-tuning of smaller models with the new datasets SQUAD-ei and SQUAD-GPT, the resulting model, KERMIT-7B, achieved superior performance on TYDIQA-GoldP, demonstrating improved information extraction while retaining generative quality. / Inom det snabbt växande området artificiell intelligens har stora språkmodeller (LLM) som GPT-3, GPT-4 och Gemini revolutionerat sektorer genom att automatisera komplexa uppgifter. Trots sina framsteg står dessa modeller, framför allt mindre språkmodeller (SLMs), fortfarande inför utmaningar, till exempel att generera ogrundat innehåll, "hallucinationer". Denna studie syftar till att förbättra SLMs för bredare tillgänglighet utan krävande infrastruktur. Genom supervised fine-tuning av mindre modeller med nya dataset, SQUAD-ei och SQUAD-GPT, uppnådde den resulterande modellen, KERMIT-7B, överlägsen prestanda i TYDIQA-GoldP, vilket visar förbättrad informationsutvinning samtidigt som den generativa kvaliteten bibehålls.
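One mechanical detail behind supervised fine-tuning on SQuAD-style QA pairs, as used here, is that the loss is typically computed only on answer tokens, with prompt tokens masked out. A minimal sketch (illustrative, not the authors' code; -100 follows the ignore-index convention of common training libraries):

```python
def sft_label_mask(prompt_ids, answer_ids, ignore_index=-100):
    """Build (input_ids, labels) for supervised fine-tuning on a QA pair.

    The model sees prompt + answer, but only answer positions contribute
    to the loss: prompt positions are filled with ignore_index so a
    standard cross-entropy loss skips them.
    """
    input_ids = list(prompt_ids) + list(answer_ids)
    labels = [ignore_index] * len(prompt_ids) + list(answer_ids)
    return input_ids, labels
```

This keeps the model from being trained to regenerate the question, focusing capacity on extractive answering.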
69

Introducing Generative Artificial Intelligence in Tech Organizations : Developing and Evaluating a Proof of Concept for Data Management powered by a Retrieval Augmented Generation Model in a Large Language Model for Small and Medium-sized Enterprises in Tech / Introducering av Generativ Artificiell Intelligens i Tech Organisationer : Utveckling och utvärdering av ett Proof of Concept för datahantering förstärkt av en Retrieval Augmented Generation Model tillsammans med en Large Language Model för små och medelstora företag inom Tech

Lithman, Harald, Nilsson, Anders January 2024 (has links)
In recent years, generative AI has made significant strides, likely leaving an irreversible mark on contemporary society. The launch of OpenAI's ChatGPT 3.5 in 2022 demonstrated the power of the technology, highlighting its performance and accessibility. This has led to a demand for implementation solutions across various industries, with companies eager to leverage the new opportunities generative AI brings. This thesis explores the common operational challenges faced by a small-scale Tech Enterprise and, with these challenges identified, examines the opportunities that contemporary generative AI solutions may offer. Furthermore, the thesis investigates what type of generative technology is suitable for adoption and how it can be implemented responsibly and sustainably. The authors approach this topic through 14 interviews involving several AI researchers and the employees and executives of a small-scale Tech Enterprise, which served as a case company, combined with a literature review.  The information was processed using multiple inductive thematic analyses to establish a solid foundation for the investigation, which led to the development of a Proof of Concept. The findings and conclusions emphasize the high relevance of having a clear purpose for the implementation of generative technology. Moreover, the authors predict that a sustainable and responsible implementation can create the conditions necessary for the specified small-scale company to grow.  When the authors investigated potential operational challenges at the case company, it became clear that the most significant issue arose from unstructured and partially absent documentation.
The conclusion reached by the authors is that a data management system powered by a retrieval model in an LLM presents a potential path forward for significant value creation, as this solution enables data retrieval from unstructured project data and also mitigates a major inherent issue with the technology, namely hallucinations. Furthermore, in terms of implementation circumstances, both empirical and theoretical findings suggest that responsible use of generative technology requires training; hence, the authors have developed an educational framework named "KLART".  Moving forward, the authors describe that sustainable implementation necessitates transparent systems, as transparency increases understanding, which in turn affects trust and secure use. The findings also indicate that sustainability is strongly linked to the user-friendliness of the AI service, leading the authors to emphasize the importance of human-centered design (HCD) while developing and maintaining AI services. Finally, the authors argue for the value of automation, as it allows for continuous data and system updates that can potentially reduce maintenance.  In summary, this thesis aims to contribute to an understanding of how small-scale Tech Enterprises can implement generative AI technology sustainably to enhance their competitive edge through innovation and data-driven decision-making.
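The retrieval-augmented data management system described above follows a simple pattern: retrieve the most relevant chunks of project data, then instruct the model to answer only from them. A toy sketch (word-overlap scoring stands in for the embedding-based retriever a production system would use; none of this is from the thesis):

```python
def retrieve(query, documents, k=2):
    """Toy lexical retriever: rank document chunks by word overlap with
    the query. A real RAG system would use dense embeddings instead."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents, k=2):
    """Ground the model's answer in retrieved text, which is the mechanism
    by which RAG mitigates hallucination on unstructured project data."""
    context = "\n".join(retrieve(query, documents, k))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")
```

The returned prompt would then be sent to the LLM; because the context is quoted verbatim from stored documents, answers can be traced back to their source.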
