1 |
Sind Sprachmodelle in der Lage die Arbeit von Software-Testern zu übernehmen?: automatisierte JUnit Testgenerierung durch Large Language Models / Are language models able to take over the work of software testers?: Automated JUnit test generation with Large Language Models. Schäfer, Nils, 20 September 2024.
This bachelor's thesis examines the quality of language models in the context of generating unit tests for Java applications. The goal of the thesis is to analyze to what extent JUnit tests can be generated automatically using language models, and to derive from this with what quality they can take over and replace the work of software testers. To this end, an automated test-creation system in the form of a Python command-line tool is designed and implemented, which generates test cases via requests to the language model. To measure its quality, the generated tests are adopted without any manual intervention. As the basis of the evaluation, a run is carried out in which tests are generated for three Java Maven projects of differing complexity. The subsequent analysis follows a fixed evaluation procedure that assesses test code coverage and success rate and compares them with manual tests. The results show that language models are able to generate JUnit tests with satisfactory test coverage but exhibit an insufficient success rate compared to manual tests. It becomes clear that, due to quality deficiencies in the generated test code, they cannot fully replace the work of software testers. However, they offer a way to take over test-creation processes that end with a subsequent manual review, thus reducing testers' workload.

Contents:
List of Figures
List of Tables
List of Source Code Listings
List of Abbreviations
1 Introduction
1.1 Problem Statement
1.2 Objectives
2 Fundamentals
2.1 Software Development Lifecycle
2.2 Large Language Models
2.2.1 Definition and Introduction
2.2.2 Generative Pre-trained Transformer
2.3 Prompt Engineering
2.3.1 Prompt Elements
2.3.2 Prompt Techniques
2.4 Unit Testing
2.4.1 Fundamentals
2.4.2 Java with JUnit 5
2.5 SonarQube
3 Conception
3.1 Prerequisites
3.2 Requirements Analysis
3.3 Choice of the Large Language Model
3.4 Prompt Design
3.5 Program Flowchart
4 Implementation
4.1 Functionalities
4.1.1 User Query
4.1.2 Capturing Java Files in the Project
4.1.3 Prompt Creation
4.1.4 API Request for Test Generation
4.1.5 Test Verification with Repair Rounds
4.1.6 Logging
4.2 Integration of SonarQube, Plugins, and Dependencies
4.3 Test Run
5 Execution and Analysis
5.1 Execution
5.2 Evaluation of the Tests
5.2.1 Line Coverage
5.2.2 Branch Coverage
5.2.3 Overall Coverage
5.2.4 Success Rate
5.3 Test Code Analysis
5.4 Comparison with Manual Test Results
5.5 Interpretation of the Results
6 Conclusion
6.1 Conclusions
6.2 Outlook
Bibliography
A Appendix - Source Code
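As a rough illustration of the kind of pipeline the abstract describes (a command-line tool that requests JUnit tests from a language model and retries after failures), the following Python sketch shows a generate-and-repair loop. It is not taken from the thesis: the ask_llm callable, the file handling, the "mvn test" invocation, and the number of repair rounds are all assumptions.

```python
# Hedged sketch of a generate-and-repair loop for LLM-based JUnit test creation.
# ask_llm stands in for whatever model API the tool actually calls.
import pathlib
import subprocess
from typing import Callable

def generate_junit_test(java_file: pathlib.Path, test_file: pathlib.Path,
                        ask_llm: Callable[[str], str], repair_rounds: int = 2) -> bool:
    source = java_file.read_text(encoding="utf-8")
    prompt = f"Write a JUnit 5 test class for the following Java class:\n\n{source}"
    for _ in range(repair_rounds + 1):
        # Adopt the generated test without manual changes, as in the evaluation setup.
        test_file.write_text(ask_llm(prompt), encoding="utf-8")
        result = subprocess.run(["mvn", "-q", "test"], capture_output=True, text=True)
        if result.returncode == 0:
            return True  # the generated test compiles and passes
        # Repair round: feed the build/test output back to the model.
        prompt = (
            "The following JUnit test failed to compile or run.\n\n"
            f"{test_file.read_text(encoding='utf-8')}\n\nBuild output:\n"
            f"{result.stdout[-2000:]}\n\nReturn a corrected test class."
        )
    return False
```

In the setup described above, coverage of the resulting tests would then be measured separately, for example with SonarQube.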
|
2 |
Cyberbullying Detection on social platforms using Large Language Models. Ottosson, Dan, January 2023.
Social media and platforms utilise moderation to remove unwanted content such as cyberbullying, an aggressive act towards an individual or group that occurs over any type of digital technology, e.g. social platforms. However, moderating platforms manually is nearly impossible, and the demand for automatic moderation is rising. Research on technical solutions for cyberbullying detection on social platforms is scarce and is mostly focused on Machine Learning models to detect cyberbullying without the connection to platform moderation. This study aims to enhance the research on cyberbullying detection models by using a GPT-3 Large Language model and reduce the gap to platform moderation. The model is tweaked and tested to detect cyberbullying using popular cyberbullying datasets and compared to previous Machine Learning and Large Language models using common performance metrics. Furthermore, the latency of the model is measured to test if it can be used as an auto-moderation tool to detect cyberbullying on social platforms. The results show that the model is on par with the previous models and that fine-tuning a Large Language model is the preferred way to tweak the model in cyberbullying detection. Further, the results show that Large Language models have higher latency than Machine Learning models but can be improved by using multiple threads and can be used as a platform moderation tool to detect cyberbullying.
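To make the latency point concrete, the following Python sketch (illustrative only; the detect_cyberbullying stub stands in for the fine-tuned GPT-3 request, and the thread count is arbitrary) measures per-message latency and overlaps requests with multiple threads, the mitigation the abstract mentions.

```python
# Sketch of latency measurement and multi-threaded moderation; not the study's code.
import time
from concurrent.futures import ThreadPoolExecutor

def detect_cyberbullying(message: str) -> bool:
    """Placeholder for a request to the fine-tuned language model."""
    return "stupid" in message.lower()  # trivial stand-in logic

def moderate(messages: list[str], workers: int = 8) -> list[tuple[str, bool, float]]:
    def timed(msg: str) -> tuple[str, bool, float]:
        start = time.perf_counter()
        flagged = detect_cyberbullying(msg)
        return msg, flagged, time.perf_counter() - start
    # Multiple threads overlap per-request latency, as the abstract suggests.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(timed, messages))

print(moderate(["you are so stupid", "nice work on the project"]))
```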
|
3 |
Is Generative AI the New Business Partner?: Examining the Implementation Strategies and Benefits of Leveraging Generative AI in Organizational Settings. Sarri, Anton; Sjölund, Jonas, January 2024.
Introduction and Purpose – Emerging technologies such as GenAI are revolutionizing the business landscape and drastically changing the way organizations operate. As digital transformation accelerates, more and more organizations are using GenAI to streamline operations and strengthen their competitive position. This study therefore explores the enabling factors and challenges when implementing GenAI in organizational settings. It also examines the driving factors and leveraging benefits of GenAI in digital transformation efforts.

Methodology – The study uses an explorative qualitative research design with semi-structured interviews to gather data from different industries and business areas, collecting insights into the practical applications and challenges of GenAI. This approach allowed the authors to develop an in-depth understanding of the context and of GenAI as a complex phenomenon. Moreover, a theoretical framework was adapted and developed from the literature review and further guided the findings and analysis.

Findings and Analysis – The findings and analysis identified enabling factors for a successful implementation (Technological, Organizational, and Employees) and challenges concerning Ethics, Regulations, and Skill Gaps. These factors can act as both enablers and challenges, resonating with findings that emphasize adaptability and responsiveness in digital transformation efforts. Responsible AI remains an uncertainty due to the rapid evolution of the technology, which means that regulatory compliance does not keep up and can act as either a barrier or an enabler. It is clear that adopting GenAI is not a straightforward path, as several enabling factors need to be in place before scaling the technology into organizational settings. Organizations face challenges with technological infrastructure, data management, change management, and skill gaps. Lastly, the driving factors and leveraging benefits of GenAI stem from increased business value, divided into Efficiency and Productivity Enhancements, Innovative Product and Service Development, Knowledge Management, Personal Assistant, and Data-Driven Insights.

Discussion and Conclusion – The discussion is central to this study: the authors integrate theory and empirical findings to generate valuable contributions. The most central elements (Technological Readiness, Organizational Dynamics, and Responsible AI) are merged and discussed further, resulting in a new framework that guides the academic and practical discourse. Although GenAI facilitates significant value creation, efficiency, and competitive advantage, organizations are often hampered by the lack of these factors in the pursuit of digital transformation. In conclusion, the study underlines that there is no single enabling factor that needs to be in place before an implementation; rather, the factors need to coexist for a successful integration, emphasizing a transformation where technological advances meet human skills. Human interaction and monitoring are also crucial, setting organizational policies and standards in the effort to adapt to new regulations and ethical standards.
|
4 |
Empathetic AI for Enhanced Workplace Engagement / Empatisk AI för ökat arbetsplatsengagemang. Jusic, Samuel; Klockars, Love; Melinder, Anthony; Uddin, Anik; Wadman, Isak; Zanetti, Marcus, January 2024.
This report outlines the research focused on finding the system design for Happymaker AI, a large language model with a mission to promote well-being at workplaces through daily interactions. The study includes a market analysis of relevant system components, such as database, cloud storage, cloud computing service and large language model, as well as the development of a prototype. Despite facing challenges including limited training data and resource constraints, the prototype was developed using the Llama 2 13B model which was quantized to 8-bits and fine-tuned using LoRA. Through research and prototyping of Happymaker AI, recommendations for the system design were established. These findings provide a foundation for the further development of an ethical AI system, specifically tailored for user data security and scalability. The findings also introduce a new perspective on empathy and personal well-being within the AI field, emphasizing the importance of integrating human-centric values into technological advancements.
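A minimal sketch of the model-preparation step described above, assuming the Hugging Face transformers, bitsandbytes, and peft libraries; the checkpoint name and LoRA hyperparameters are illustrative guesses, not the project's actual configuration.

```python
# Hedged sketch: load Llama 2 13B with 8-bit weights and attach a LoRA adapter.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-13b-chat-hf"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit quantization
    device_map="auto",
)

lora = LoraConfig(                      # illustrative LoRA settings
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)     # only the adapter weights are trainable
model.print_trainable_parameters()
```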
|
5 |
Probabilistic Modeling of Airborne Spherical Object for Robotic Limbs Implementation Using Artificial Intelligence. Pham, Binh, 01 January 2024.
In recent years, the technological space has experienced a proliferation of generative AI models. A prominent type of such model is the language-model-based chatbot. The primary function of these models is to generate answers to questions from an extensive database and to sustain a stream of conversation at various levels of complexity. The database of these models encompasses diverse data types such as text (e.g., ChatGPT), audio (e.g., PlayHT), or images (e.g., DALL-E 2). The intricate process involves neural networks that undergo pre-training on the database, build results from the network architecture, are fine-tuned to create coherent results, use probability estimation to produce results in the correct context, and generate and refine answers as improvements over previous outputs. This proposal aims to delve deep into the probability estimation process of the generative AI model. A specific focus is to predict an airborne object's trajectory to create an understanding of how to adapt and adjust robotic limbs and enable them to intercept and capture the object with some degree of precision.
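As a greatly simplified illustration of the trajectory-prediction goal (not the probabilistic model the proposal develops), the following Python sketch fits a ballistic curve to noisy observations of a thrown ball and extrapolates where it will be, the kind of estimate a robotic limb controller could aim at. All numbers are invented.

```python
# Simplified trajectory prediction: least-squares ballistic fit to noisy samples.
import numpy as np

def predict_position(t_obs: np.ndarray, xy_obs: np.ndarray, t_future: float) -> np.ndarray:
    """t_obs: (n,) observation times; xy_obs: (n, 2) observed x/y positions."""
    x_coef = np.polyfit(t_obs, xy_obs[:, 0], 1)  # horizontal motion ~ linear
    y_coef = np.polyfit(t_obs, xy_obs[:, 1], 2)  # vertical motion ~ quadratic (gravity)
    return np.array([np.polyval(x_coef, t_future), np.polyval(y_coef, t_future)])

# Example: noisy samples of a projectile, then an interception estimate at t = 1.0 s.
t = np.linspace(0.0, 0.5, 10)
xy = np.column_stack([3.0 * t, 4.0 * t - 4.9 * t**2]) + np.random.normal(0, 0.01, (10, 2))
print(predict_position(t, xy, 1.0))
```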
|
6 |
Användning av generativ AI inom digital innovation : En kvalitativ studie ur innovatörers perspektiv / The use of generative AI in digital innovation : A qualitative study through the lens of innovatorsSüvari, Andreas, Wallmark, Rebecca January 2023 (has links)
Påskyndat av teknik går utvecklingen snabbare än någonsin. Generativ AI har blivit tillgänglig för allmänheten. Det ger möjligheter för verksamheter att nyttja AI-teknik utan större insatser och kunskap. Detta skiftar förutsättningarna inom digital innovation. Denna nya aktör skapar gap i litteraturen, där tidigare forskning behöver omvärderas. Ett viktigt forskningsområde är hur användningen av generativ AI påverkar digital innovation. En annan aspekt är hur innovatörer kan nyttja, och förhålla sig till generativ AI inom innovationsprocessen. För att undersöka detta har en kvalitativ studie genomförts, där empiri har samlats in genom åtta intervjuer. Studien har resulterat i en tematisk modell med följande teman: Generativ AI som en kollega; Generativ AI som resurs för digital innovation; Generativ AI ökar tillgängligheten till AI-teknik; Känslor gällande generativ AI; Problematik gällande generativ AI; Spridd och differentierad syn på digital innovation. Studien visar att generativ AI kan påverka digital innovation genom de resulterande temana. Vidare relateras dessa teman till innovationsprocessen, där en modifierad processmodell för innovation har tagits fram. Då användningen av generativ AI är ett relativt nytt fenomen är det sannolikt att innovatörer framöver kommer att öka sin användning av verktyget, vilket medför att fynden från denna studie riskerar att snabbt bli utdaterade. Vidare forskning bör därför utföra liknande studier med jämna mellanrum, för att fånga upp nya erfarenheter som uppstår av den ökade användningen. / Accelerated by technology, development is progressing faster than ever. Generative AI has become accessible to the general public. It provides opportunities for businesses to leverage AI technology without significant efforts and expertise. This shifts the conditions within digital innovation. This new actor creates gaps in the literature, where previous research needs to be reevaluated. An important research area is how the use of generative AI affects digital innovation. Another aspect is how innovators can utilize and engage with generative AI in the innovation process. To investigate this, a qualitative study has been conducted, where empirical data has been collected through eight interviews. The study has resulted in a thematic model with the following themes: Generative AI as a colleague; Generative AI as resource for digital innovation; Generative AI increases accessibility to AI technology; Emotions regarding generative AI; Challenges regarding generative AI; Diverse and differentiated views on digital innovation. The study shows that generative AI can affect digital innovation through the resulting themes. Furthermore, these themes were related to the innovation process, where a modified process model for innovation has been developed. Since the use of generative AI is a relatively new phenomenon, it is likely that innovators will increase their use of the tool in the future. This may render the findings from this study quickly outdated. Further research should therefore conduct similar studies at regular intervals to capture new experiences arising from increased usage.
|
7 |
Towards Manipulator Task-Oriented Programming: Automating Behavior-Tree Configuration. Yue Cao (18985100), 08 July 2024.
<p dir="ltr">Task-oriented programming is a way of programming manipulators in terms of high-level tasks instead of explicit motions. It has been a long-standing vision in robotics since its early days. Despite its potential, several challenges have hindered its full realization. This thesis identifies three major challenges, particularly in task specification and the planning-to-execution transition: 1) The absence of natural language integration in system input. 2) The dilemma of continuously developing non-uniform and domain-specific primitive-task libraries. 3) The requirement for much human intervention.</p><p dir="ltr">To overcome these difficulties, this thesis introduces a novel approach that integrates natural language inputs, eliminates the need on fixed primitive-task libraries, and minimizes human intervention. It adopts the behavior tree, a modular and user-friendly form, as the task representation and advances its usage in task specification and planning-to-execution transition. The thesis is structured into two parts – Task Specification and Planning-to-Execution Transition.</p><p dir="ltr">Task specification explores the use of large language models to generate a behavior tree from an end-user's input. A Phase-Step prompt is designed to enable the automatic behavior-tree generation from end-user's abstract task descriptions in natural languages. With the powerful generalizability of large language models, it breaks the dilemma that stays with fixed primitive-task libraries in task generation. A full-process case study demonstrated the proposed approach. An ablation study was conducted to evaluate the effectiveness of the Phase-Step prompts. Task specification also proposes behavior-tree embeddings to facilitate the retrieval-augmented generation of behavior trees. The integration of behavior-tree embeddings not only eliminates the need for manual prompt configuration but also provides a way to incorporate external domain knowledge into the generation process. Three types of evaluations were performed to assess the performance of the behavior-tree embedding method.</p><p dir="ltr">Planning-to-execution transition explores how to transit primitive tasks from task specification into manipulator executions. Two types of primitive tasks are considered separately: point-to-point movement tasks and object-interaction tasks. For point-to-point movement tasks, a behavior-tree reward is proposed to enable reinforcement learning over low-level movement while following high-level running order of the behavior tree. End-users only need to specify rewards on the primitive tasks over the behavior tree, and the rest of the process will be handled automatically. A 2D space movement simulation was provided to justify the approach. For object-interaction tasks, the planning-to-execution transition uses a large-language-model-based generation approach. This approach takes natural-language-described primitive tasks as input and directly produces task-frame-formalism set-points. Combined with hybrid position/force control systems, a transition process from primitive tasks directly into joint-level execution can be realized. Evaluations over a set of 30 primitive tasks were conducted.</p><p dir="ltr">Overall, this thesis proposes an approach that advances the behavior-tree towards automated task specification and planning-to-execution transitions. It opens up new possibilities for building better task-oriented manipulator programming systems.</p>
|
8 |
[pt] CONSULTANDO BANCOS DE DADOS COM LINGUAGEM NATURAL: O USO DE MODELOS DE LINGUAGEM GRANDES PARA TAREFAS DE TEXTO-PARA-SQL / [en] QUERYING DATABASES WITH NATURAL LANGUAGE: THE USE OF LARGE LANGUAGE MODELS FOR TEXT-TO-SQL TASKS. Eduardo Roger Silva Nascimento, 23 May 2024.
[en] The Text-to-SQL task involves generating an SQL query based on a
given relational database and a Natural Language (NL) question. While the
leaderboards of well-known benchmarks indicate that Large Language Models
(LLMs) excel in this task, they are evaluated on databases with simpler
schemas. This dissertation first investigates the performance of LLM-based
Text-to-SQL models on a complex and openly available database (Mondial)
with a large schema and a set of 100 NL questions. Running under GPT-3.5
and GPT-4, the results of this first experiment show that the performance of
LLM-based tools is significantly less than that reported in the benchmarks
and that these tools struggle with schema linking and joins, suggesting that
the relational schema may not be suitable for LLMs. This dissertation then
proposes using LLM-friendly views and data descriptions for better accuracy
in the Text-to-SQL task. In a second experiment, using the strategy with
better performance, cost and benefit from the previous experiment and another
set with 100 questions over a real-world database, the results show that the
proposed approach is sufficient to considerably improve the accuracy of the
prompt strategy. This work concludes with a discussion of the results obtained
and suggests further approaches to simplify the Text-to-SQL task.
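A small Python sketch of the prompting idea: instead of the raw relational schema, the model is shown a flattened, LLM-friendly view plus short data descriptions. The view, column descriptions, and question below are illustrative, not Mondial's actual schema or the dissertation's prompt.

```python
# Hedged sketch of a Text-to-SQL prompt built over an LLM-friendly view.
def build_text_to_sql_prompt(question: str) -> str:
    view_ddl = """
    CREATE VIEW country_overview AS      -- flattened, human-readable view
    SELECT c.name AS country_name,
           c.population,
           c.capital AS capital_city
    FROM country AS c;
    """
    descriptions = ("country_name: official country name; "
                    "population: number of inhabitants; "
                    "capital_city: name of the capital.")
    return (
        "Given the view below, write one SQL query that answers the question.\n"
        f"View definition:\n{view_ddl}\n"
        f"Column descriptions: {descriptions}\n"
        f"Question: {question}\nSQL:"
    )

print(build_text_to_sql_prompt("Which countries have more than 100 million inhabitants?"))
```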
|
9 |
GENERATING SQL FROM NATURAL LANGUAGE IN FEW-SHOT AND ZERO-SHOT SCENARIOS. Asplund, Liam, January 2024.
Making information stored in databases more accessible to users inexperienced in structured query language (SQL) by converting natural language to SQL queries has long been a prominent research area in both the database and natural language processing (NLP) communities. Numerous approaches have been proposed for this task, such as encoder-decoder frameworks, semantic grammars, and, more recently, the use of large language models (LLMs). When training LLMs to successfully generate SQL queries from natural language questions, there are three notable methods: pretraining, transfer learning, and in-context learning (ICL). ICL is particularly advantageous in scenarios where the hardware at hand is limited, time is of concern, and large amounts of task-specific labeled data are nonexistent. This study evaluates two ICL strategies, namely zero-shot and few-shot scenarios, using the Mistral-7B-Instruct LLM. The few-shot scenarios were evaluated using two techniques, random selection and Jaccard similarity. The zero-shot scenarios served as a baseline for the few-shot scenarios to overcome, which ended as anticipated: the few-shot scenarios using Jaccard similarity outperformed the other two methods, followed by the few-shot scenarios using random selection in second place, with the zero-shot scenarios performing the worst. Evaluation results based on execution accuracy and exact matching accuracy confirm that leveraging similarity when selecting demonstration examples for the prompt enhances the model's knowledge of the database schema and table names used during the inference phase, leading to more accurately generated SQL queries than leveraging diversity in demonstration examples.
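The Jaccard-based demonstration selection can be sketched in a few lines of Python; the candidate pool and questions below are invented, and the actual study would draw demonstrations from its benchmark data.

```python
# Sketch: pick few-shot demonstrations by Jaccard similarity over word sets.
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def select_demonstrations(question: str, pool: list[tuple[str, str]], k: int = 3) -> list[tuple[str, str]]:
    """pool: (natural-language question, gold SQL) pairs; returns the k most similar."""
    return sorted(pool, key=lambda pair: jaccard(question, pair[0]), reverse=True)[:k]

pool = [
    ("How many employees are there?", "SELECT COUNT(*) FROM employees;"),
    ("List the names of all departments.", "SELECT name FROM departments;"),
    ("How many departments are there?", "SELECT COUNT(*) FROM departments;"),
]
print(select_demonstrations("How many managers are there?", pool, k=2))
```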
|
10 |
Fake News Detection: Using a Large Language Model for Accessible Solutions. Jurgell, Fredrik; Borgman, Theodor, January 2024.
In an attempt to create a fake news detection tool using a large language model (LLM), the emphasis is on validating the effectiveness of this approach and then making the tooling readily available. The current gpt-4-turbo-preview model and its assistant capabilities are combined with simple prompts tailored to different objectives. While tools that detect fake news and simplify the process are not new, insight into how they work and why is not commonly available, most likely due to the monetization around the current services. This work therefore builds an open-source platform that others can expand upon, giving insight into the prompts used, enabling experimentation, and providing a baseline to start from when developing further or taking inspiration. Articles that are not willfully written as fake but are merely missing key data are, unsurprisingly, very hard to detect. However, common tabloid-style news, which is often shared to create an emotional response, shows more promising detection results.
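A hedged sketch of how such a prompt-based check might look in Python, assuming the openai client (version 1.x) and an API key in the environment. It uses the plain chat-completions endpoint rather than the assistant capabilities the thesis mentions, and the prompt wording is an assumption, not the authors' actual prompt.

```python
# Sketch: classify an article with a simple fact-checking prompt via chat completions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def assess_article(article_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4-turbo-preview",
        messages=[
            {"role": "system",
             "content": "You are a fact-checking assistant. Classify the article as "
                        "LIKELY_RELIABLE, LIKELY_FAKE, or INSUFFICIENT_EVIDENCE and "
                        "briefly explain your reasoning."},
            {"role": "user", "content": article_text},
        ],
    )
    return response.choices[0].message.content

# Example (requires network access and a valid API key):
# print(assess_article("Breaking: scientists confirm the moon is made of cheese."))
```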
|