61 |
An initial investigation of Automatic Program Repair for Solidity Smart Contracts with Large Language Models / En första undersökning av automatisk lagning av solidity smarta kontrakt med stora språkmodeller
Cruz, Erik January 2023
This thesis investigates how Large Language Models can be used to repair Solidity Smart Contracts automatically. Its main contribution is the Transformative Repair Tool, which achieves results similar to current state-of-the-art tools on the Smartbugs Curated Dataset and is the first published tool that uses Large Language Models to repair Solidity Smart Contracts. The thesis also explores different prompt strategies for repairing Smart Contracts and assesses their performance.
|
62 |
DEEP LEARNING BASED METHODS FOR AUTOMATIC EXTRACTION OF SYNTACTIC PATTERNS AND THEIR APPLICATION FOR KNOWLEDGE DISCOVERY
Mdahsanul Kabir (16501281) 03 January 2024
<p dir="ltr">Semantic pairs, which consist of related entities or concepts, serve as the foundation for comprehending the meaning of language in both written and spoken forms. These pairs make it possible to grasp the nuances of relationships between words, phrases, or ideas, forming the basis for more advanced language tasks like entity recognition, sentiment analysis, machine translation, and question answering. They make it possible to infer causality, identify hierarchies, and connect ideas within a text, ultimately enhancing the depth and accuracy of automated language processing.</p><p dir="ltr">Nevertheless, extracting semantic pairs from sentences is a significant challenge, which is where syntactic dependency patterns (SDPs) become relevant. Semantic relationships adhere to distinct SDPs when connecting pairs of entities, which makes extracting these SDPs critically important, particularly for specific semantic relationships like hyponym-hypernym, meronym-holonym, and cause-effect associations. The automated extraction of such SDPs carries substantial advantages for various downstream applications, including entity extraction, ontology development, and question answering. Unfortunately, this facet of pattern extraction has remained relatively overlooked by researchers in natural language processing (NLP) and information retrieval.</p><p dir="ltr">To address this gap, I introduce an attention-based supervised deep learning model, ASPER. ASPER is designed to extract SDPs that denote semantic relationships between entities within a given sentential context. I evaluate the performance of ASPER across three distinct semantic relations (hyponym-hypernym, cause-effect, and meronym-holonym) using six datasets.
My experimental findings demonstrate ASPER's ability to automatically identify an array of SDPs that mirror the presence of these semantic relationships within sentences, outperforming existing pattern extraction methods by a substantial margin.</p><p dir="ltr">Second, I use the SDPs to extract semantic pairs from sentences, choosing cause-effect entities in medical literature as the target. This task is instrumental in compiling various causality relationships, such as those between diseases and symptoms, medications and side effects, and genes and diseases. Existing solutions excel in sentences where cause and effect phrases are straightforward, such as named entities, single-word nouns, or short noun phrases. However, in medical literature, cause and effect expressions often extend over several words, which defeats existing methods and results in incomplete extractions that provide low-quality, non-informative, and at times conflicting information. To overcome this challenge, I introduce PatternCausality, an unsupervised method for extracting cause and effect phrases that is tailored explicitly for medical literature. PatternCausality employs a set of cause-effect dependency patterns as templates to identify the key terms within cause and effect phrases. It then utilizes a novel phrase extraction technique to produce comprehensive and meaningful cause and effect expressions from sentences. Experiments conducted on a dataset constructed from PubMed articles reveal that PatternCausality significantly outperforms existing methods, achieving an order-of-magnitude improvement in the F-score metric over the best-performing alternatives. I also develop various PatternCausality variants that utilize diverse phrase extraction methods, all of which surpass existing approaches.
PatternCausality and its variants also exhibit notable performance improvements in extracting cause and effect entities on a domain-neutral benchmark dataset, in which cause and effect entities are confined to single-word nouns or noun phrases of one to two words.</p><p dir="ltr">Nevertheless, PatternCausality operates within an unsupervised framework and relies heavily on SDPs, motivating me to explore a supervised approach. Although SDPs play a pivotal role in semantic relation extraction, pattern-based methodologies remain unsupervised, and the multitude of potential patterns within a language can be overwhelming. Furthermore, patterns do not consistently capture the broader context of a sentence, leading to the extraction of false-positive semantic pairs. As an illustration, consider the hyponym-hypernym pattern <i>the w of u</i>, which correctly extracts a semantic pair from a phrase like <i>the village of Aasu</i> but fails to do so for the phrase <i>the moment of impact</i>. The root cause of this limitation lies in the pattern's inability to capture the nuanced meaning of words and phrases in a sentence and their contextual significance. These observations have spurred my exploration of a third model, DepBERT, which is a dependency-aware supervised transformer model. DepBERT's primary contribution lies in introducing the underlying dependency structure of sentences to a language model with the aim of enhancing token classification performance. To achieve this, I first reframe the task of semantic pair extraction as a token classification problem.
The DepBERT model can harness both the tree-like structure of dependency patterns and the masked language architecture of transformers, marking a significant milestone, as most large language models (LLMs) predominantly focus on semantics and word co-occurrence while neglecting the crucial role of dependency architecture.</p><p dir="ltr">In summary, my overarching contributions in this thesis are threefold. First, I validate the significance of the dependency architecture within various components of sentences and publish SDPs that incorporate these dependency relationships. Subsequently, I employ these SDPs in a practical medical domain to extract vital cause-effect pairs from sentences. Finally, my third contribution distinguishes this thesis by integrating dependency relations into a deep learning model, enhancing the understanding of language and the extraction of valuable semantic associations.</p>
|
63 |
<b>Leveraging Advanced Large Language Models To Optimize Network Device Configuration</b>
Mark Bogdanov (18429435) 24 April 2024
<p dir="ltr">Recent advancements in large language models (LLMs) such as ChatGPT and AU Large allow LLMs to be integrated effectively into the configuration and management of network devices such as switches and routers. These devices are an essential part of every network infrastructure, and physical networking topologies are complex, so meticulous and precise configurations are needed to ensure optimal network efficiency and security.</p><p dir="ltr">The research explores the potential of an AI-driven interface that utilizes AU Large to streamline, enhance, and automate the configuration process of network devices, while keeping the process secure by running the entire system on-premise. Three core areas are of primary concern in the study: the effectiveness of integrating AU Large into network management systems; the impact on efficiency, accuracy, and error rates in network configurations; and the scalability and adaptability to more complex requirements and growing network environments.</p><p dir="ltr">The key performance metrics evaluated are the error rate in the generated configurations, scalability as more network devices are added, and the ability to generate highly complex configurations accurately. The high-level results show a clear correlation between increases in device count and prompt complexity and a degradation in the performance of the AU Large model from Mistral AI.</p><p dir="ltr">This research has significant potential to alter preset network management practices by applying AI to make network configuration more efficient, reduce the scope for human error, and create an adaptable tool for diverse and complex networking environments.
This research contributes to both AI and network management fields by highlighting a path toward the “future of network management.”</p>
|
64 |
Natural Language Based AI Tools in Interaction Design Research : Using ChatGPT for Qualitative User Research Insight Analysis
Saare, Karmen January 2024
This thesis investigates the use of Artificial Intelligence, specifically the Large Language Model (LLM) application ChatGPT, in qualitative user research, with the goal of enhancing the analysis of user research interviews. Through an empirical study in which ChatGPT was used in a typical user research insight analysis, the limitations and opportunities of the AI tool are examined. The results highlight the most significant insights from the empirical investigation, serving as examples to raise awareness of the implications of using ChatGPT for user interview analysis. The study concludes that ChatGPT has the potential to enhance the interpretation of primarily individual interviews by generating well-articulated summaries, provided their accuracy can be verified. Additionally, ChatGPT may be particularly useful in low-risk design projects where the consequences of potential misinterpretations are minimal. Finally, the study points out that clearly articulated written instructions are important for getting the best results from ChatGPT.
|
65 |
The shifting landscape of data : learning to tame distributional shifts
Ibrahim, Adam 05 1900
Machine learning (ML) models achieve remarkable performance on the tasks they are trained for. However, they are often sensitive to shifts in the data distribution, which may lead to unexpected behaviour. This can happen when the data distribution encountered during deployment differs from that used for training, leading to considerable degradation of performance. Worse, attackers may also induce such shifts to fool machine learning models. Finally, this can even happen when training sequentially on different data distributions. These distributional shifts are pervasive in ML, hindering the fairness, reliability, safety and efficiency of machine learning models. This thesis is focused on understanding and improving the robustness and adaptation of ML models to distributional shifts, encompassing both theoretical and experimental work.
First, we investigate the fundamental limits of differentiable multiobjective optimisation. This investigation is important because works on distributional shifts often rely on game-theoretical formulations. We provide new lower bounds on the speed of convergence of a large class of methods, along with novel condition numbers that help assess the difficulty of optimising classes of games and explain the potential for fast convergence even without strong convexity or strong concavity.
Second, we address the lack of adversarial robustness against multiple attack types, a common limitation of state-of-the-art methods. We propose a domain generalisation-inspired approach, using Risk Extrapolation (REx) to promote robustness across a range of attacks. Our method achieves performance superior to existing baselines for both seen and novel types of attacks.
Finally, we tackle the challenges of continual pretraining for large language models (LLMs). These models face a trade-off: either they catastrophically forget previous knowledge when updated on new data, or they require computationally expensive full retraining. We demonstrate that a combination of learning rate re-warming, re-decaying, and the replay of previous data allows LLMs to continually learn from new distributions while preserving past knowledge. This approach matches the performance of full retraining, but at a fraction of the computational cost.
Overall, this thesis contributes impactful considerations towards improving robustness and adaptation to distributional shifts. These contributions open promising avenues for addressing real-world ML challenges across multiobjective optimisation, adversarial defense, and continual learning of large language models.
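The continual pretraining recipe named in the abstract (learning rate re-warming, re-decaying, and replay of previous data) can be illustrated with a small sketch. The abstract does not give the schedule shape or the replay ratio, so the linear warmup, cosine decay, function names, and the `replay_fraction` parameter below are all assumptions chosen for illustration, not the thesis's actual configuration.

```python
import math
import random


def rewarm_redecay_lr(step, total_steps, warmup_steps, max_lr, min_lr):
    """Learning rate for one continual-pretraining stage (hypothetical helper).

    Re-warming: linearly increase from min_lr to max_lr over warmup_steps.
    Re-decaying: cosine decay from max_lr back to min_lr for the remaining steps.
    """
    if step < warmup_steps:
        return min_lr + (max_lr - min_lr) * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def mix_with_replay(new_batch, old_pool, replay_fraction, rng):
    """Swap a fraction of a new-data batch for samples drawn from old data."""
    n_replay = int(round(len(new_batch) * replay_fraction))
    kept = new_batch[: len(new_batch) - n_replay]
    replayed = [rng.choice(old_pool) for _ in range(n_replay)]
    return kept + replayed


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 5, 10, 55, 100):
        print(step, round(rewarm_redecay_lr(step, 100, 10, 3e-4, 3e-5), 6))
    print(mix_with_replay(list(range(10)), ["old_a", "old_b"], 0.2, rng))
```

Each time a new data distribution arrives, the schedule would be restarted from step 0, which is what distinguishes re-warming from simply continuing at the final, low learning rate of the previous stage.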
|
66 |
Direct Preference Optimization for Improved Technical Writing Assistance : A Study of How Language Models Can Support the Writing of Technical Documentation at Saab / En studie i hur språkmodeller kan stödja skrivandet av teknisk dokumentation på Saab
Bengtsson, Hannes, Habbe, Patrik January 2024
This thesis explores the potential of Large Language Models (LLMs) to assist in the technical documentation process at Saab. With the increasing complexity and regulatory demands on such documentation, the objective is to investigate advanced natural language processing techniques as a means of streamlining the creation of technical documentation. Although many standards exist, this thesis focuses on ASD-STE100, Simplified Technical English (STE), a controlled language for technical documentation. STE's primary aim is to ensure that technical documents are understandable to individuals regardless of their native language or English proficiency. The study focuses on the implementation of Direct Preference Optimization (DPO) and Supervised Instruction Fine-Tuning (SIFT) to refine the capabilities of LLMs in producing clear and concise outputs that comply with STE. Through a series of experiments, we investigate the effectiveness of LLMs in interpreting and simplifying technical language, with a particular emphasis on adherence to STE. The study uses a dataset of target data paired with synthetic source data generated by an LLM. We compare several model configurations: zero-shot performance, supervised instruction fine-tuning, and direct preference optimization. We evaluate the models' output using established quantitative metrics for text simplification, and we replace human evaluators with company-internal software for evaluating adherence to company standards and STE. Our findings suggest that while LLMs can significantly contribute to the technical writing process, the choice of training methods and the quality of data play crucial roles in the model's performance. This study shows how LLMs can improve productivity and reduce manual work. It also examines the remaining problems and suggests ways to improve the automation of technical documentation in the future.
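For readers unfamiliar with Direct Preference Optimization, the following sketch computes the standard DPO loss for a single preference pair from summed token log-probabilities. It illustrates the general technique only: the function name, argument names, and the `beta` default are assumptions, and the abstract does not describe the thesis's actual training code, which would more likely rely on an existing training library.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair; hypothetical helper.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


if __name__ == "__main__":
    # Policy identical to the reference: loss equals log(2).
    print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
    # Policy favours the chosen (e.g. STE-compliant) response: lower loss.
    print(dpo_loss(-10.0, -18.0, -12.0, -15.0))
```

In a setting like the one described, the chosen response would be the STE-compliant rewrite and the rejected response a non-compliant one, so minimising this loss pushes the model toward compliant output relative to the reference model.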
|
67 |
Människors förtroende för AI: Könsrelaterad bias i AI-språkmodeller / People's Trust in AI: Gender Bias in Large Language Models
Forsman, Angela, Martinsson, Jonathan January 2024
In a time when Large Language Models (LLMs) are increasingly used in our daily lives, it becomes important to investigate how this affects society. This study examines how LLMs express diversity, non-discrimination, and fairness in texts, based on theories of ethics and gender equality. The study focuses on identifying and analyzing the presence of gender bias in the responses of LLMs and how this impacts people's trust in these systems. A case study was conducted on three LLMs: ChatGPT 3.5, Gemini, and Llama-2 70B, where data was collected through interviews with them. Subsequently, interviews were conducted with human informants who reflected on the LLMs' responses. The LLMs showed an imbalance in how they treat women and men, which can reinforce existing gender stereotypes. This can affect people's trust in LLMs, and the informants highlighted the issue of what neutrality and fairness entail. To create more responsible and fair AI systems, conscious efforts are required to integrate ethical and equality perspectives into AI development and usage.
|
68 |
Applying Large Language Models in Business Processes : A contribution to Management Innovation / Tillämpning av stora språkmodeller i affärsprocesser : Ett bidrag till Management Innovation
Bergman Larsson, Niklas, Talåsen, Jonatan January 2024
This master's thesis explores the transformative potential of Large Language Models (LLMs) in enhancing business processes across various industries, with a specific focus on Management Innovation. As organizations face the pressures of digitalization, LLMs emerge as powerful tools that can revolutionize traditional business workflows through enhanced decision-making, automation of routine tasks, and improved operational efficiency. The research investigates the integration of LLMs within four key business domains: Human Resources, Tender Management, Consultancy, and Compliance. It highlights how LLMs facilitate Management Innovation by enabling new forms of workflow automation, data analysis, and compliance management, thus driving substantial improvements in efficiency and innovation. Employing a mixed-method approach, the study combines an extensive literature review with surveys and interviews with industry professionals to evaluate the impact and practical applications of LLMs. The findings reveal that LLMs not only offer significant operational benefits but also pose challenges related to data security, integration complexities, and privacy concerns. This thesis contributes to the academic and practical understanding of LLMs, proposing a framework for their strategic adoption to foster Management Innovation. It underscores the need for businesses to align LLM integration with both technological capabilities and strategic business objectives, paving the way for a new era of management practices shaped by advanced technologies.
|
69 |
Minds, Machines & Metaphors : Limits of AI Understanding
Másson, Mímir January 2024
This essay critically examines the limitations of artificial intelligence (AI) in achieving human-like understanding and intelligence. Despite significant advancements in AI, such as the development of sophisticated machine learning algorithms and neural networks, current systems fall short of the cognitive depth and flexibility inherent in human intelligence. Through an exploration of historical and contemporary arguments, including Searle's Chinese Room thought experiment and Dennett's Frame Problem, this essay highlights the inherent differences between human cognition and AI. Central to this analysis is the role of metaphorical thinking and embodied cognition, as articulated by Lakoff and Johnson, which are fundamental to human understanding but absent in AI. Proponents of AGI, like Kurzweil and Bostrom, argue for the potential of AI to surpass human intelligence through recursive self-improvement and technological integration. However, this essay contends that these approaches do not address the core issues of experiential knowledge and contextual awareness. By integrating insights from contemporary scholars like Bender, Koller, Buckner, Thorstad, and Hoffmann, the essay ultimately concludes that AI, while a powerful computational framework, is fundamentally incapable of replicating the true intelligence and understanding unique to humans.
|
70 |
KERMIT: Knowledge Extractive and Reasoning Model usIng Transformers
Hameed, Abed Alkarim, Mäntyniemi, Kevin January 2024
In the rapidly advancing field of artificial intelligence, Large Language Models (LLMs) like GPT-3, GPT-4, and Gemini have revolutionized sectors by automating complex tasks. Despite their advancements, LLMs, and more noticeably smaller language models (SLMs), still face challenges such as generating unfounded content ("hallucinations"). This project aims to enhance SLMs for broader accessibility without extensive computational infrastructure. Through supervised fine-tuning of smaller models with new datasets, SQUAD-ei and SQUAD-GPT, the resulting model, KERMIT-7B, achieved superior performance on TYDIQA-GoldP, demonstrating improved information extraction while retaining generative quality.
|