1.
PRECISION PAIRINGS : Consultant Assignment Matching with Local Large Language Models. Arlt Strömberg, Wilmer, January 2023
This master thesis explores the application of local Large Language Models (LLMs) in the consultancy industry, specifically focusing on the challenge of matching consultants to client assignments. The study develops and evaluates a structured pipeline that integrates an LLM to automate the consultant-assignment matching process. The research follows a comprehensive methodology, culminating in a sophisticated LLM application. The core of the thesis is an in-depth analysis of how the pipeline's constituent components (nodes, embedding models, and vector store indexes) contribute to the matching process alongside the LLM itself. Special emphasis is placed on the LLM's temperature setting and its impact on match accuracy and quality. Through methodical experimentation and evaluation, the study sheds light on how effectively the LLM matches consultants to assignments and generates coherent motivations. The thesis establishes a foundational framework for the use of LLMs in consultancy matching, a significant step towards the integration of AI in the field, and opens avenues for future research aimed at enhancing the efficiency and precision of AI-driven consultant matching.
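To make the pipeline concrete, here is a minimal sketch of the embedding-and-ranking step. Everything below is illustrative rather than the thesis' actual implementation: `embed` is a toy bag-of-words stand-in for a real embedding model and vector store index.

```python
# Illustrative sketch, not the thesis' implementation: rank consultants
# against an assignment by embedding similarity, then hand the top match to
# the LLM for a written motivation. `embed` is a toy bag-of-words stand-in
# for a real embedding model backed by a vector store index.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

consultants = {
    "Alice": "python data engineering etl cloud pipelines",
    "Bob": "frontend react typescript accessibility audits",
}
assignment = "build cloud data pipelines in python"

ranked = sorted(consultants,
                key=lambda name: cosine(embed(consultants[name]), embed(assignment)),
                reverse=True)
# The top candidate plus the assignment text would go into an LLM prompt such
# as "Motivate why {name} fits this assignment", with the temperature setting
# controlling how conservative the generated motivation is.
print(ranked[0])  # -> Alice
```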
2.
Evaluating LLM-based web application penetration testing: How does AI improve efficiency? Brüsemeister, Patrick, 10 May 2024
This thesis examines the use of Large Language Models (LLMs) in web application penetration testing. The goal is to support the work of penetration testers and to accelerate the process, so that security vulnerabilities in web applications can be uncovered and fixed more effectively. The thesis compares different approaches and evaluates how LLMs such as ChatGPT can improve the efficiency of penetration testing, including whether applying them reduces the effort that penetration tests require. The work contributes to the topic by investigating and evaluating the possibilities and limitations of LLMs in the context of penetration testing and by outlining the current state of the art.
1 Intro
2 Basics
2.1 Web Application Security
2.2 Penetration Testing
2.3 Penetration Testing Standards
2.4 Penetration Testing Tools
2.5 Artificial Intelligence
2.6 Large Language Models
2.7 LLM Prompting Techniques
2.8 AI's Growing Role in Cybersecurity
2.9 Penetration Testing and AI
2.10 Research Objectives and Scope
2.11 Significance of the Study and Research Question
2.12 Structure of the Thesis
3 Literature Review
4 Market Analysis
4.1 Use of LLMs in Combination with Existing Penetration Testing Software
4.2 Open-Source Solutions Leveraging LLMs
4.3 Commercial Solutions Leveraging LLMs for Cybersecurity Purposes
4.4 ChatGPT-GPTs
4.5 Identifying the Need for Optimization in Penetration Testing Processes
4.6 Opinions of Penetration Testers on Generative AI Use
5 Methodology
5.1 Research Methods and Approaches
5.2 Benchmarks Used for Evaluation
6 Concept and Implementation
6.1 Limitations of LLMs
6.2 Deciding Which LLM Models to Use
6.3 Identifying and Executing Tasks with LLMs
6.4 Tailoring the LLM for Penetration Testing
6.5 Resource Requirements
7 Evaluation of LLMs for Penetration Testing
7.1 Interviews: Identifying the Use of LLMs for Pentesting
7.2 Preparing the Test Environment
7.3 Evaluation of Command Generation
7.4 ChatGPT Assistant GPT
7.5 Google Gemini Advanced
7.6 Discussion of Results
7.7 Answering the Research Question
7.8 Resulting Penetration Testing Workflow
8 Conclusion
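To illustrate the kind of harness that chapter 7's command-generation evaluation implies, here is a hedged sketch; it is not taken from the thesis. `ask_llm` is a hypothetical wrapper around any chat model (hard-coded here so the example runs offline), and every generated command is gated behind explicit human approval, for use only against systems you are authorised to test.

```python
# Hedged sketch, not the thesis' harness: generate a candidate command with a
# hypothetical `ask_llm` wrapper (hard-coded so the example runs offline) and
# gate execution behind explicit human approval in an isolated lab.
import shlex
import subprocess

def ask_llm(task: str) -> str:
    # A real harness would query ChatGPT, Gemini, etc. for a shell command.
    return "nmap -sV 127.0.0.1"

def evaluate_command_generation(task: str) -> None:
    command = ask_llm(task)
    print(f"Task: {task}\nProposed command: {command}")
    if input("Run in the lab environment? [y/N] ").strip().lower() != "y":
        print("Skipped by reviewer.")
        return
    # Only run generated commands against targets you are authorised to test.
    result = subprocess.run(shlex.split(command), capture_output=True, text=True)
    print(result.stdout)

evaluate_command_generation("enumerate service versions on the local test host")
```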
3.
Large Language Models for Documentation : A Study on the Effects on Developer Productivity. Alrefai, Adam, Alsadi, Mahmoud, January 2024
This thesis explores the integration of generative AI and large language models (LLMs) into software documentation processes, assessing their impact on developer productivity. The research focuses on the development of a documentation system powered by an LLM that automates the creation and retrieval of software documentation. The study employs a controlled experiment followed by a survey of professional software developers to quantify changes in productivity through metrics such as the effectiveness, velocity, and quality of documentation generated by the system.
Background: The increasing complexity of software development necessitates efficient documentation systems. Traditional methods, often manual and time-consuming, struggle to keep pace with the dynamics of software development, potentially leading to outdated and inadequate documentation.
Objectives: To investigate whether a documentation system powered by an LLM can enhance developers' productivity in software documentation tasks by assisting with the creation of development documentation and facilitating the retrieval of relevant information.
Method: A controlled experiment followed by a survey was conducted, in which participants were tasked with generating and using documentation through both manual and LLM-assisted methods. The effectiveness, velocity, and quality of the documentation were measured and compared.
Results: The findings indicate that the LLM-powered documentation system significantly enhances developer productivity. Developers using the system produced and comprehended documentation more quickly and accurately than those using the manual method. Furthermore, the quality of the documentation, assessed in terms of comprehensibility, completeness, and readability, was consistently higher when generated by the LLM system.
Conclusions: The integration of LLMs into software documentation processes can significantly enhance developer productivity by automating routine tasks and improving the quality of documentation. This supports software developers in maintaining current projects and also assists in onboarding new team members by providing easier access to necessary documentation.
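As an illustration of what such a system's creation step might look like (the thesis does not publish its implementation, so this is a sketch under assumptions), one can extract a function's source with Python's `inspect` module and build an LLM prompt from it; the model call itself is omitted so the example runs stand-alone.

```python
# Sketch under assumptions (the thesis does not publish its implementation):
# a plausible creation step for an LLM documentation system extracts each
# function's source and builds a prompt asking the model to draft a docstring
# for human review.
import inspect

def generate_doc_prompt(func) -> str:
    source = inspect.getsource(func)
    return ("Write a concise docstring (purpose, parameters, return value) "
            f"for the following Python function:\n\n{source}")

def parse_rate(csv_row: str, column: int = 2) -> float:
    # Example function the system would document.
    return float(csv_row.split(",")[column]) / 100.0

prompt = generate_doc_prompt(parse_rate)
print(prompt)  # In the full system, this prompt goes to the LLM and the
               # drafted docstring is stored for retrieval alongside the code.
```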
4.
DEMOCRATISING DEEP LEARNING IN MICROBIAL METABOLITES RESEARCH / DEMOCRATISING DEEP LEARNING IN NATURAL PRODUCTS RESEARCH. Dial, Keshav, January 2023
Deep learning models are dominating performance across a wide variety of tasks. From protein folding to computer vision to voice recognition, deep learning is changing the way we interact with data. The field of natural products, and more specifically genomic mining, has been slow to adopt these new technological innovations. Since we are in the midst of a data explosion, this is not for lack of training data; rather, it is due to the lack of a blueprint demonstrating how to correctly integrate these models to maximise performance and inference. During my PhD, I showcase the use of large language models across a variety of data domains to improve common workflows in the field of natural product drug discovery. I improved natural product scaffold comparison by representing molecules as sentences. I developed a series of deep learning models to replace archaic technologies and create a more scalable genomic mining pipeline, decreasing running times by a factor of eight. I integrated deep learning-based genomic and enzymatic inference into legacy tooling to improve the quality of short-read assemblies. I also demonstrate how intelligent querying of multi-omic datasets can facilitate the gene signature prediction of encoded microbial metabolites. The models and workflows I developed are broad in scope, in the hope of providing a blueprint for how these industry-standard tools can be applied across the entirety of natural product drug discovery. / Thesis / Doctor of Philosophy (PhD)
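The "molecules as sentences" idea can be illustrated with a small, self-contained sketch: treat a SMILES string as text and compare scaffolds through token n-gram overlap. This is only a crude stand-in for the learned language-model representations the thesis uses.

```python
# Crude illustration of "molecules as sentences": treat a SMILES string as
# text, tokenize it into character bigrams, and compare scaffolds by cosine
# similarity of their bigram counts. The thesis' actual models are learned
# language models; this only demonstrates the representational idea.
import math
from collections import Counter

def smiles_ngrams(smiles: str, n: int = 2) -> Counter:
    return Counter(smiles[i:i + n] for i in range(len(smiles) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[g] * b[g] for g in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
salicylic_acid = "OC(=O)C1=CC=CC=C1O"      # shares the aspirin scaffold
caffeine = "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"  # different scaffold

print(cosine(smiles_ngrams(aspirin), smiles_ngrams(salicylic_acid)))
print(cosine(smiles_ngrams(aspirin), smiles_ngrams(caffeine)))
```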
5.
Improving Vulnerability Description Using Natural Language Generation. Althebeiti, Hattan, 01 January 2023
Software plays an integral role in powering numerous everyday computing gadgets. As our reliance on software continues to grow, so does the prevalence of software vulnerabilities, with significant implications for organizations and users. As such, documenting vulnerabilities and tracking their development becomes crucial. Vulnerability databases address this need by storing a record with various attributes for each discovered vulnerability. However, their contents suffer from several drawbacks, which we address in our work. In this dissertation, we investigate the weaknesses associated with vulnerability descriptions in public repositories and alleviate them through Natural Language Processing (NLP) approaches. The first contribution examines vulnerability descriptions in those databases and approaches to improving them. We propose a new automated method leveraging external sources to enrich the scope and context of a vulnerability description, and we exploit fine-tuned pre-trained language models to normalize the resulting description. The second contribution investigates the need for a uniform and normalized structure in vulnerability descriptions. We address this need by breaking the description of a vulnerability into multiple constituents and developing a multi-task model that creates a new uniform and normalized summary, one that maintains the necessary attributes of the vulnerability using the extracted features while ensuring a consistent structure. Our method proved effective in generating new summaries with the same structure across a collection of varied vulnerability descriptions and types. Our final contribution investigates the feasibility of assigning the Common Weakness Enumeration (CWE) attribute to a vulnerability based on its description. CWE offers a comprehensive framework that categorizes similar weaknesses into classes, representing the types of exploitation associated with such vulnerabilities. Our approach, which utilizes fine-tuned pre-trained language models, is shown to outperform a Large Language Model (LLM) on this task. Overall, this dissertation provides various technical approaches exploiting advances in NLP to improve publicly available vulnerability databases.
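A typical realisation of the final contribution, sketched here under stated assumptions, is a sequence classifier over description text built with Hugging Face Transformers. The model name and three-label CWE subset below are illustrative, and the freshly initialized classification head would need fine-tuning before its predictions mean anything.

```python
# Sketch of the CWE-assignment idea (assumptions, not the dissertation's
# setup): a pre-trained encoder with a classification head maps a
# vulnerability description to a CWE class. Untrained, the head's output is
# arbitrary; fine-tuning on labelled descriptions is the step that matters.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CWE_LABELS = ["CWE-79", "CWE-89", "CWE-119"]  # toy label subset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(CWE_LABELS)
)

description = ("Improper neutralization of user input allows attackers to "
               "inject arbitrary web script into the generated page.")
inputs = tokenizer(description, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(CWE_LABELS[int(logits.argmax())])  # prediction (untrained: arbitrary)
```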
6.
Responsible AI in Educational Chatbots: Seamless Integration and Content Moderation Strategies / Ansvarsfull AI i pedagogiska chatbots: strategier för sömlös integration och moderering av innehåll. Eriksson, Hanna, January 2024
With the increasing integration of artificial intelligence (AI) technologies into educational settings, it becomes important to ensure responsible and effective use of these systems. This thesis addresses two critical challenges within AI-driven educational applications: the effortless integration of different Large Language Models (LLMs) and the mitigation of inappropriate content. An AI assistant chatbot was developed, allowing teachers to design custom chatbots and set rules for them, enhancing students’ learning experiences. Evaluation of LangChain as a framework for LLM integration, alongside various prompt engineering techniques including zero-shot, few-shot, zero-shot chain-of-thought, and prompt chaining, revealed LangChain’s suitability for this task and highlighted prompt chaining as the most effective method for mitigating inappropriate content in this use case. Looking ahead, future research could focus on further exploring prompt engineering capabilities and strategies to ensure uniform learning outcomes for all students, as well as leveraging LangChain to enhance the adaptability and accessibility of educational applications.
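A minimal sketch of the prompt-chaining strategy, written against LangChain's expression language (the imports and model name are assumptions that vary by LangChain version and require an API key), might look like this: a first chain screens the student's message against the teacher's rules, and only messages judged appropriate reach the tutoring chain.

```python
# Minimal prompt-chaining sketch in the spirit of the thesis' most effective
# moderation strategy; imports and model name are version-dependent
# assumptions, not the thesis' code.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o-mini")  # model choice is illustrative

moderate = (
    ChatPromptTemplate.from_template(
        "Rules set by the teacher: {rules}\n"
        "Student message: {message}\n"
        "Answer only ALLOWED or BLOCKED."
    )
    | llm
    | StrOutputParser()
)
tutor = (
    ChatPromptTemplate.from_template("You are a course tutor. {message}")
    | llm
    | StrOutputParser()
)

def answer(rules: str, message: str) -> str:
    # Chain 1 moderates; chain 2 only runs on messages that pass the screen.
    verdict = moderate.invoke({"rules": rules, "message": message})
    if "ALLOWED" not in verdict.upper():
        return "This question falls outside the rules for this chatbot."
    return tutor.invoke({"message": message})

print(answer("Only questions about linear algebra.", "What is a matrix rank?"))
```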
7.
Enhancing Document Accessibility and User Interaction through Large Language Model: A Comparative Study for Educational Content : A Comparative Analysis of LLM and Traditional Site Search. Umar, Fatima, January 2024
This research integrates LLMs with RAG (Retrieval-Augmented Generation) to develop a conversational interface that allows users to post queries and ask questions about a website's content. It compares the LLM RAG method with traditional site search functionality to determine which method users perceive as better, specifically regarding response quality and response time. The perceived results for response quality and response time were evaluated under the null hypothesis that there is no difference between the two methods. The study showed that the LLM RAG method was perceived as better in terms of response quality, and this result was statistically significant. For response time, the traditional site search method was perceived as better, but the result was not significant, so the null hypothesis could not be rejected. Overall, integrating LLMs with RAG frameworks promises to enhance information retrieval systems on digital platforms.
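The hypothesis testing described above can be illustrated with toy data. The thesis does not state which statistical test it used; a Mann-Whitney U test, shown here via SciPy, is one common choice for ordinal survey ratings, and the ratings below are fabricated purely for illustration.

```python
# Illustration only: the test choice is an assumption and the Likert-style
# ratings are made up. The logic mirrors the abstract: reject the null of
# "no difference between methods" only when p falls below the threshold.
from scipy.stats import mannwhitneyu

rag_quality = [5, 4, 5, 4, 5, 3, 4, 5]     # perceived quality, LLM RAG
search_quality = [3, 3, 4, 2, 3, 4, 3, 2]  # perceived quality, site search

stat, p = mannwhitneyu(rag_quality, search_quality, alternative="two-sided")
print(f"U={stat}, p={p:.4f}")  # p < 0.05 -> reject the null hypothesis
```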
8.
KGScore-Open: Leveraging Knowledge Graph Semantics For Open-QA Evaluation. Hausman, Nicholas, 01 June 2024
Evaluating live Question Answering (QA) systems, where users ask questions outside the original testing data, has proven difficult because answer quality is hard to gauge without ground-truth responses. We propose KGScore-Open, a configurable system capable of scoring questions and answers in Open Domain Question Answering (Open-QA) without ground truth answers, by leveraging DBPedia, a Knowledge Graph (KG) derived from Wikipedia. The system maps entities from questions and answers to DBPedia nodes, constructs a Knowledge Graph around these entities, and calculates a relatedness score. Our system is validated on multiple datasets, achieving up to 83% accuracy in differentiating relevant from irrelevant answers in the Natural Questions dataset, 55% accuracy in classifying correct versus incorrect answers (hallucinations) in the TruthfulQA and HaluEval datasets, and 54% accuracy on the QA-Eval task using the EVOUNA dataset. The contributions of this work include a novel scoring system that indicates both relevancy and answer confidence in Open-QA without the need for ground truth answers, demonstrated efficacy across various tasks, and an extendable framework applicable to different KGs for evaluating QA systems in other domains.
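A toy version of the scoring idea, with networkx standing in for DBPedia and the scoring function assumed rather than taken from the thesis, shows how graph proximity can become a relatedness score.

```python
# Toy illustration of the KGScore-Open idea (details assumed): map entities
# from a question and a candidate answer to knowledge-graph nodes, then turn
# graph proximity into a relatedness score. The three-edge graph is made up.
import networkx as nx

kg = nx.Graph()
kg.add_edges_from([
    ("Marie Curie", "Physics"),
    ("Marie Curie", "Nobel Prize"),
    ("Physics", "Isaac Newton"),
])

def relatedness(question_entities, answer_entities) -> float:
    # Average inverse shortest-path distance; unreachable pairs contribute 0.
    scores = []
    for q in question_entities:
        for a in answer_entities:
            try:
                d = nx.shortest_path_length(kg, q, a)
                scores.append(1.0 / (d + 1))
            except nx.NetworkXNoPath:
                scores.append(0.0)
    return sum(scores) / len(scores) if scores else 0.0

print(relatedness({"Marie Curie"}, {"Nobel Prize"}))   # close: 0.5
print(relatedness({"Marie Curie"}, {"Isaac Newton"}))  # farther: ~0.33
```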
9.
Can AI models solve the programming challenge Advent of Code? : Evaluating state-of-the-art large language models. Sandström, Johannes, January 2024
Large Language Models were developed during the 2010s, and chatbots like ChatGPT quickly became popular. The continued development of LLMs led to tools with specific use cases, one of which is software development. In this study, eight different LLMs are tested on their ability to solve the programming challenge Advent of Code, which consists of 25 problems, each with two parts. Each LLM is given five attempts to solve a problem by generating Python code, and after each attempt the tool receives feedback on any issues with its solution. The results show that ChatGPT-4 and GitHub Copilot generated the most correct solutions, with ChatGPT-4 producing the most correct solutions on the first attempt. The quality of the code is also examined using SonarQube, and ChatGPT-4 is the best in this regard as well. Of the tools tested, Google's Gemini and Gemini Advanced had the fewest correct solutions. Based on these results, it is clear that these LLMs are good at generating code, but Advent of Code 2023 remains too difficult for them to solve in full. Despite this, the tools demonstrate that they can be useful for programmers.
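A sketch of the five-attempt feedback loop the study describes might look like the following; the LLM call is stubbed with a canned answer so the example runs offline, and the harness details are assumptions rather than the study's code.

```python
# Sketch of a five-attempt evaluation loop with feedback; `ask_llm` is a
# hypothetical model call, hard-coded here so the example runs offline.
import subprocess
import sys
import tempfile

def ask_llm(prompt: str) -> str:
    # A real harness would send `prompt` to one of the eight LLMs and
    # return the Python source it generates.
    return 'print("42")'

def run_candidate(code: str, puzzle_input: str) -> str:
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
    proc = subprocess.run([sys.executable, f.name], input=puzzle_input,
                          capture_output=True, text=True, timeout=60)
    return (proc.stderr or proc.stdout).strip()

def solve(problem: str, puzzle_input: str, expected: str, attempts: int = 5) -> bool:
    prompt = problem
    for _ in range(attempts):
        result = run_candidate(ask_llm(prompt), puzzle_input)
        if result == expected:
            return True  # correct answer, stop early
        # Feed the failure back to the model, as the study does.
        prompt = f"{problem}\nYour previous solution produced: {result}\nFix it."
    return False

print(solve("Print the answer 42.", "", "42"))  # -> True
```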
10.
USING CHATGPT TO GENERATE REBECA CODES FROM UML STATE DIAGRAMS. Eriksson, Kevin, Alm Johansson, Albin, January 2024
Unified Modeling Language (UML) is recognized as a de facto standard for modeling various types of systems. However, its lack of formal semantics hinders formal verification, which is crucial for ensuring the correctness of models throughout the modeling process. Rebeca is an actor-based modeling language designed to formally verify reactive concurrent systems. Previous work has attempted to bridge this gap with translations that take advantage of the benefits of both UML and Rebeca, but these methods either require multiple UML diagrams and an understanding of Rebeca, or lack implementation solutions. We conducted experiments to explore the potential of zero-shot and few-shot learning with ChatGPT-4 as a tool for automating the translation from UML state diagrams to Rebeca code. The results indicated that this translation succeeds only partially and is not sufficient to derive correct Rebeca models. To mitigate this, we augmented the state diagrams with metadata, after which the generated code contained only minor errors and required slight adjustments to compile in the Rebeca model checking tool, Afra. The conclusion is that ChatGPT-4 can potentially facilitate the transformation of UML state diagrams into executable Rebeca code with minimal additional information. We provide a translation procedure from Rebeca code to UML state diagrams, a conceptual mapping of them in reverse, and a dataset that can be used for further research. The dataset and the results are published in the GitHub repository of our project.
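The few-shot setup can be sketched as a simple prompt builder; the example diagram, metadata format, and placeholder Rebeca snippet below are illustrative assumptions, not the paper's dataset.

```python
# Illustrative few-shot prompt builder for UML-to-Rebeca translation; the
# PlantUML-style diagram text, metadata format, and placeholder Rebeca
# snippet are assumptions, not taken from the paper.
FEW_SHOT_EXAMPLES = [
    {
        "uml": "[*] --> Idle\nIdle --> Busy : start\nBusy --> Idle : done",
        "rebeca": "// reference Rebeca model for the Idle/Busy machine ...",
    },
]

def build_prompt(target_uml: str, metadata: str) -> str:
    parts = ["Translate UML state diagrams into Rebeca code.\n"]
    for example in FEW_SHOT_EXAMPLES:
        parts.append(f"UML:\n{example['uml']}\nRebeca:\n{example['rebeca']}\n")
    # The metadata augmentation is what let the generated models compile
    # with only slight adjustments in the experiments.
    parts.append(f"UML (annotated):\n{target_uml}\nMetadata: {metadata}\nRebeca:")
    return "\n".join(parts)

print(build_prompt("[*] --> Off\nOff --> On : press", "actors: Switch; queue size: 5"))
```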