11 |
Automating Software Development Processes Through Multi-Agent Systems : A Study in LLM-based Software Engineering / Automatisering av Mjukvaruutvecklingsprocesser genom användning av Multi-Agent System : En studie inom LLM-baserad mjukvaruutveckling
Peltomaa Åström, Samuel; Winoy, Simon. January 2024 (has links)
In the ever-evolving landscape of Software Development, the demand for more efficient, scalable, and automated processes is paramount. The advancement of Generative AI has unveiled new avenues for addressing this demand. This thesis explores one such avenue: the use of Multi-Agent Systems combined with Large Language Models (LLMs) to automate tasks within the development lifecycle. The thesis presents a structure for designing and developing an LLM-based multi-agent application, covering agent design principles and strategies for facilitating multi-agent collaboration, and offering insights into the selection of an appropriate agent framework. Furthermore, the thesis demonstrates the developed application's problem-solving capabilities through quantitative benchmarking results, and illustrates practical use through examples of real-world applications. The study shows the potential of LLM-based multi-agent systems to enhance software development efficiency, offering companies a promising and powerful tool for streamlining Software Engineering workflows.
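The thesis does not publish its implementation, but the collaboration pattern it describes can be sketched as role-specialized agents passing work between them. Everything below (the `call_llm` stub, the role prompts, the pipeline order) is an illustrative assumption, not the thesis's actual design:

```python
# Minimal sketch of LLM-based multi-agent collaboration: each "agent" is a
# role-specific system prompt, and agents take turns refining a shared task.
# call_llm is a stand-in for a real chat-completion API call (assumption:
# any LLM endpoint could be substituted here).

def call_llm(system_prompt: str, user_message: str) -> str:
    # Stub: a real implementation would call an LLM API here.
    return f"[{system_prompt.split(':')[0]}] response to: {user_message[:40]}"

AGENTS = {
    "Planner": "Planner: break the requirement into implementation steps.",
    "Coder": "Coder: write code for the given step.",
    "Reviewer": "Reviewer: critique the code and suggest fixes.",
}

def run_pipeline(requirement: str) -> list[tuple[str, str]]:
    """Pass the task through each agent in sequence, accumulating context."""
    transcript = []
    message = requirement
    for name, prompt in AGENTS.items():
        reply = call_llm(prompt, message)
        transcript.append((name, reply))
        message = reply  # next agent sees the previous agent's output
    return transcript

transcript = run_pipeline("Add input validation to the signup endpoint")
for agent, msg in transcript:
    print(agent, "->", msg)
```

Real frameworks add tool use, memory, and termination criteria on top of this loop; the sequential hand-off shown here is only the simplest collaboration strategy.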
|
12 |
[en] A DATA ANNOTATION APPROACH USING LARGE LANGUAGE MODELS / [pt] UMA ABORDAGEM PARA ANOTAÇÃO DE DADOS UTILIZANDO GRANDES MODELOS DE LINGUAGEM
CARLOS VINICIOS MARTINS ROCHA. 17 October 2024 (has links)
[en] Documents are essential for the economic and academic system; however,
exploring them can be complex and time-consuming. One approach to this
problem is to use Visual Question Answering (VQA) models to
extract information from documents through natural language prompts. In
VQA, as well as for the development of various models, it is necessary to have
annotated data for training and validation. However, creating these datasets is
challenging due to the high cost involved in the process. To face this challenge,
we propose a four-step process that combines Computer Vision Models and
Large Language Models (LLMs) for VQA data annotation in financial reports.
The proposed method starts with recognizing the textual structure of documents through Document Layout Analysis and Table Structure Extraction
models. Then, it uses two distinct LLMs for the generation and evaluation of
question and answer pairs, automating the construction and selection of the
best pairs to compose the final dataset. To evaluate the proposed method, we
generated a dataset to train and evaluate specialized VQA models.
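The generate-then-evaluate steps of the pipeline can be sketched as follows; both LLM calls are stubbed, and the function names, example pairs, and 0-to-1 scoring scale are illustrative assumptions rather than details from the dissertation:

```python
# Sketch of the QA-annotation core: one LLM proposes question-answer pairs
# from a document fragment, a second LLM scores each pair, and only pairs
# above a threshold enter the final dataset. Both calls are stubs.

def generator_llm(fragment: str) -> list[tuple[str, str]]:
    # Stub: a real call would prompt an LLM with the fragment.
    return [("What is the net revenue?", "R$ 12.4M"), ("Who signed?", "n/a")]

def evaluator_llm(question: str, answer: str, fragment: str) -> float:
    # Stub: a real call would ask a second LLM to rate answerability/quality.
    return 0.2 if answer == "n/a" else 0.9

def annotate(fragment: str, threshold: float = 0.5) -> list[tuple[str, str]]:
    """Generate candidate pairs, score them, and keep the best ones."""
    pairs = generator_llm(fragment)
    scored = [(q, a, evaluator_llm(q, a, fragment)) for q, a in pairs]
    return [(q, a) for q, a, s in scored if s >= threshold]

print(annotate("...financial report fragment..."))
```

Using two distinct LLMs for generation and evaluation (rather than one model grading itself) is the design choice the abstract highlights; the threshold here simply models the "selection of the best pairs" step.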
|
13 |
Unveiling the Values of ChatGPT : An Explorative Study on Human Values in AI Systems / Avslöjandet av ChatGPT:s värderingar : En undersökande studie om mänskliga värderingar i AI-system
Lindahl, Caroline; Saeid, Helin. January 2023 (has links)
Recent technological breakthroughs in natural language processing and artificial intelligence (AI), and the subsequent release of OpenAI's generative AI system ChatGPT, have warranted much attention from researchers and the general public alike. Some greet it with praise, foreseeing a brighter future for all, while others predict the end of humanity. As AI agents grow more complex and autonomous and gain the ability to handle trade-offs, the problem of embedding human values into these agents becomes more pressing. Embedding human values is a crucial part of developing aligned AI systems that act in accordance with human intents and desires. The black-box nature of large language models (LLMs) offers little insight into the mechanics of an AI agent's decision-making; for this reason, it is of great interest to explore what values an LLM might hold. This explorative study lets the most popular LLM chatbot today, ChatGPT, answer a set of questions focusing on human values. The questions were taken from the World Value Survey (WVS) and relate to current global values around subjects such as same-sex marriage, corruption, and raising children. The results were compared to the latest WVS data set (from 2022) to show how close ChatGPT's values are to respondents' values across countries. The findings contribute to a broader understanding of the challenges and implications of developing AI systems that align with human values, which is crucial to ensuring such systems' trustworthiness and beneficial impact on society. The findings of this explorative study indicate that ChatGPT's values are influenced by the values prevalent in developed democracies, with a leaning towards progressive/liberal views. The results also suggest that ChatGPT adopts a neutral attitude towards questioning established systems and institutions while emphasizing individual rights.
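One way to operationalize the comparison the study describes (hypothetical: the thesis does not publish its computation) is to treat numeric survey answers as vectors and ask which country's mean answers the chatbot's answers lie closest to. The item names and values below are invented for illustration:

```python
# Sketch of a values comparison: WVS items are typically Likert-scaled, so
# both country means and chatbot answers can be treated as numeric vectors
# and compared by Euclidean distance. All numbers here are invented.
import math

country_means = {
    "Sweden":    {"same_sex_marriage": 8.6, "trust_institutions": 6.1},
    "Country_X": {"same_sex_marriage": 2.3, "trust_institutions": 4.0},
}
chatbot = {"same_sex_marriage": 8.0, "trust_institutions": 5.5}

def distance(a: dict, b: dict) -> float:
    """Euclidean distance over the shared survey items."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

closest = min(country_means, key=lambda c: distance(chatbot, country_means[c]))
print(closest)  # with these invented numbers: Sweden
```

A per-item comparison or a rank correlation would serve equally well; the point is only that "how close or far the values are" becomes a concrete number once the answers are encoded on the survey's scales.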
|
14 |
Large language models as an interface to interact with API tools in natural language
Tesfagiorgis, Yohannes Gebreyohannes; Monteiro Silva, Bruno Miguel. January 2023 (has links)
In this research project, we aim to explore the use of Large Language Models (LLMs) as an interface for interacting with API tools in natural language. Bubeck et al. [1] shed some light on how LLMs could be used to interact with API tools. Since then, new versions of LLMs have been launched, and the question of how reliable an LLM can be at this task remains unanswered. The main goal of our thesis is to investigate the designs of the available system prompts for LLMs, identify the best-performing prompts, and evaluate the reliability of different LLMs when using those prompts. We employ a multiple-stage controlled experiment: a literature review in which we survey the system prompts used in the scientific community and in open-source projects; then, using the F1-score as a metric, an analysis of the precision and recall of the system prompts, aiming to select the best-performing ones for interacting with API tools; and, in a later stage, a comparison of a selection of LLMs using the best-performing prompts identified earlier. From these experiments, we find that AI-generated system prompts outperform the prompts currently used in open-source projects and the literature when paired with GPT-4, that zero-shot prompts perform better on this specific task with GPT-4, and that a good system prompt for one model does not generalize well to other models.
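The F1-based evaluation described above can be sketched by treating each API tool invocation as a label; the tool names below are invented, and the thesis's exact matching procedure may differ:

```python
# Sketch of the metric: precision/recall/F1 compare the set of tools an LLM
# chose for a query against the expected set of tools.

def f1_score(expected: set, predicted: set) -> float:
    """Set-based F1 over tool names for a single query."""
    if not expected or not predicted:
        return 0.0
    tp = len(expected & predicted)           # tools correctly chosen
    precision = tp / len(predicted)
    recall = tp / len(expected)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

expected = {"weather_api", "geocode_api"}
predicted = {"weather_api", "calendar_api"}
print(f1_score(expected, predicted))  # tp=1, precision=0.5, recall=0.5 -> 0.5
```

Averaging this per-query score over a benchmark gives a single number per system prompt, which is enough to rank prompts as the experiment requires.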
|
15 |
Efficient Sentiment Analysis and Topic Modeling in NLP using Knowledge Distillation and Transfer Learning / Effektiv sentimentanalys och ämnesmodellering inom NLP med användning av kunskapsdestillation och överföringsinlärning
Malki, George. January 2023 (has links)
This thesis presents a study in which knowledge distillation techniques were applied to a Large Language Model (LLM) to create smaller, more efficient models without sacrificing performance. Three configurations of the RoBERTa model were selected as ”student” models to gain knowledge from a pre-trained ”teacher” model, and the models' performance was measured on two downstream tasks: sentiment analysis and topic modeling. Multiple steps were used to improve the knowledge distillation process, such as copying some weights from the teacher to the student model and defining a custom loss function. The selected task for the knowledge distillation process was sentiment analysis on the Amazon Reviews for Sentiment Analysis dataset. The resulting student models showed promising performance on the sentiment analysis task, capturing sentiment-related information from text. The smallest of the student models obtained 98% of the teacher model's performance while being 45% lighter and taking less than a third of the time to analyze the entire IMDB Dataset of 50K Movie Reviews. However, the student models struggled to produce meaningful results on the topic modeling task; these results were consistent with the topic modeling results from the teacher model. In conclusion, the study showcases the efficacy of knowledge distillation techniques in enhancing the performance of LLMs on specific downstream tasks. While the model excelled in sentiment analysis, further improvements are needed to achieve desirable outcomes in topic modeling. These findings highlight the complexity of language understanding tasks and emphasize the importance of ongoing research and development to further advance the capabilities of NLP models.
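A custom distillation loss of the kind described is commonly a weighted mix of a temperature-softened KL-divergence term (teacher vs. student distributions) and ordinary cross-entropy on the gold label. The sketch below shows that standard form; the actual weights, temperature, and loss definition used in the thesis are not reproduced here, so these values are assumptions:

```python
# Sketch of a standard knowledge-distillation objective:
#   loss = alpha * T^2 * KL(teacher_soft || student_soft) + (1-alpha) * CE
# where T is the softmax temperature that softens both logit distributions.
import math

def softmax(logits: list[float], T: float = 1.0) -> list[float]:
    exps = [math.exp(x / T) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, true_idx,
                      T: float = 2.0, alpha: float = 0.5) -> float:
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL divergence from the softened teacher to the softened student.
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))
    # Ordinary cross-entropy on the hard label (temperature 1).
    ce = -math.log(softmax(student_logits)[true_idx])
    return alpha * (T * T) * kl + (1 - alpha) * ce

loss = distillation_loss([2.0, 0.5], [2.5, 0.1], true_idx=0)
print(round(loss, 4))
```

The `T * T` factor keeps the soft-target gradients on the same scale as the hard-label gradients, which is why it appears in most distillation recipes; in practice this loss is computed over tensors with an autodiff framework rather than plain floats.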
|
16 |
ChatGPT som socialt disruptiv teknologi : En fallstudie om studierektorers inställning till ChatGPT och dess påverkan på utbildning
Back, Hampus; Fischer, Fredrik. January 2023 (has links)
The technological development of large language models has recently received attention through the launch of OpenAI's ChatGPT. There have been discussions about what this means for society at large, but also about how education at universities is affected. The purpose of this study was to examine how much impact these tools have on education at Uppsala University. Five directors of studies from different departments were interviewed, and the data was then analyzed using the theory of socially disruptive technologies to investigate the degree of impact. The results show that it is mainly examinations that have been affected: some directors of studies have had to remove, or will remove, take-home assignments as a consequence of ChatGPT. Differences in change management exist between departments, which seem to be partly due to the lack of guidelines, but also due to educational structure and personal commitment. However, no systematic differences can be established between the different parts of the university. Furthermore, broader questions were discussed about students' learning and how a director of studies can relate to this development.
|
17 |
Bridging Language & Data : Optimizing Text-to-SQL Generation in Large Language Models / Från ord till SQL : Optimering av text-till-SQL-generering i stora språkmodeller
Wretblad, Niklas; Gordh Riseby, Fredrik. January 2024 (has links)
Text-to-SQL, which involves translating natural language into Structured Query Language (SQL), is crucial for enabling broad access to structured databases without expert knowledge. However, designing models for such tasks is challenging due to numerous factors, including the presence of ’noise,’ such as ambiguous questions and syntactical errors. This thesis provides an in-depth analysis of the distribution and types of noise in the widely used BIRD-Bench benchmark and of the impact of noise on models. While BIRD-Bench was created to model dirty and noisy database values, it was not intended to contain noise and errors in the questions and gold queries. A manual evaluation found that noise in questions and gold queries is highly prevalent in the financial domain of the dataset, and a further analysis of the other domains indicates the presence of noise there as well. The presence of incorrect gold SQL queries, which then generate incorrect gold answers, has a significant impact on the benchmark's reliability. Surprisingly, when evaluating models on corrected SQL queries, zero-shot baselines surpassed the performance of state-of-the-art prompting methods. The thesis then introduces the concept of classifying noise in natural language questions, aiming both to prevent noisy questions from entering text-to-SQL models and to annotate noise in existing datasets. Experiments using GPT-3.5 and GPT-4 on a manually annotated dataset demonstrated the viability of this approach, with classifiers achieving up to 0.81 recall and 80% accuracy. Additionally, the thesis explored the use of LLMs for automatically correcting faulty SQL queries, which showed a 100% success rate for specific query corrections, highlighting the potential of LLMs for improving dataset quality. We conclude that informative noise labels and reliable benchmarks are crucial to developing new text-to-SQL methods that can handle varying types of noise.
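The recall and accuracy figures reported for the noise classifiers can be computed as sketched below, treating "noisy" as the positive class; the labels are invented for illustration:

```python
# Sketch of evaluating a binary noise classifier: given gold noise labels
# for questions and a classifier's predictions, compute recall (share of
# truly noisy questions caught) and overall accuracy.

def recall_and_accuracy(gold: list[bool], pred: list[bool]) -> tuple[float, float]:
    tp = sum(1 for g, p in zip(gold, pred) if g and p)       # noisy, caught
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)   # noisy, missed
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return recall, correct / len(gold)

gold = [True, True, False, False, True]   # True = question is noisy
pred = [True, False, False, False, True]  # classifier output
r, acc = recall_and_accuracy(gold, pred)
print(r, acc)  # recall 2/3, accuracy 4/5
```

Recall is the more important of the two here: a missed noisy question flows into the text-to-SQL model unchecked, whereas a false positive only costs an extra review.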
|
18 |
The future of IT Project Management & Delivery: NLP AI opportunities & challenges
Viznerova, Ester. January 2023 (has links)
This thesis explores the opportunities and challenges of integrating recent Natural Language Processing (NLP) Artificial Intelligence (AI) advancements into IT project management and delivery (PM&D). Using a qualitative design through a hermeneutic phenomenology strategy, the study employs a semi-systematic literature review and semi-structured interviews to delve into NLP AI's potential impacts in IT PM&D, from both theoretical and practical standpoints. The results revealed numerous opportunities for NLP AI application across Project Performance Domains, enhancing areas such as stakeholder engagement, team productivity, project planning, performance measurement, project work, delivery, and risk management. However, challenges were identified in areas including system integration, value definition, team- and stakeholder-related issues, environmental considerations, and ethical concerns. In-house and third-party model usage also presented their own sets of challenges, emphasizing cost implications, data privacy and security, result quality, and dependence issues. The research concludes that the immense potential of NLP AI in IT PM&D is tempered by these challenges, and calls for robust strategies, sound ethics, comprehensive training, new ROI evaluation frameworks, and responsible AI usage to effectively manage these issues. This thesis provides valuable insights to academics, practitioners, and decision-makers navigating the rapidly evolving landscape of NLP AI in IT PM&D.
|
19 |
Java Unit Testing with AI: An AI-Driven Prototype for Unit Test Generation / Enhetstestning i Java med hjälp av AI: En AI-baserad prototyp för generering av enhetstester
Kahur, Katrin; Su, Jennifer. January 2023 (has links)
In recent years, artificial intelligence (AI) has become increasingly popular. An area where AI technology is used and has received much attention during the past year is chatbots. They can simulate an understanding of human language and formulate text responses to questions. Apart from generating text responses, they can also generate programming code, making them useful for tasks such as testing. Although testing is considered a crucial part of software development, many find it tedious and time-consuming. There are currently few AI-powered tools for generating unit tests in general, and even fewer for the programming language Java. This thesis addresses the lack of such tools for Java and introduces a research question accordingly. The purpose of the thesis is to address this issue by creating a prototype for generating unit tests in Java based on the AI model GPT-3.5-Turbo. The goal is to provide a basis for other professionals to create tools for generating unit tests, which was done by experimenting with different prompts and values of a randomness parameter and then proposing the prototype JUTAI. A quantitative research method with an experimental and comparative approach was used, and a comparison model with three criteria was brought forward to evaluate the results. The findings reveal that JUTAI outperformed the general-purpose AI tool ChatGPT across all three criteria, indicating that the goal of this thesis was achieved and the research question answered.
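A prototype like JUTAI plausibly drives a chat model with a role-fixing system prompt, the Java source as the user message, and the randomness (temperature) parameter exposed for experimentation. The sketch below shows that shape; the prompt wording and parameter values are assumptions, not JUTAI's actual ones:

```python
# Sketch of prompt construction for LLM-based JUnit test generation. Only
# the message-building step runs here; the actual API call is shown as a
# comment because it requires an API key and network access.

def build_messages(java_source: str) -> list[dict]:
    """Build a chat-completion message list asking for JUnit 5 tests."""
    return [
        {"role": "system",
         "content": "You write JUnit 5 unit tests for the given Java class. "
                    "Reply with compilable test code only."},
        {"role": "user", "content": java_source},
    ]

java_source = """
public class Adder {
    public int add(int a, int b) { return a + b; }
}
"""
messages = build_messages(java_source)

# With the official OpenAI Python SDK the call would look roughly like:
#   client.chat.completions.create(model="gpt-3.5-turbo",
#                                  messages=messages, temperature=0.2)
# where temperature is the randomness parameter varied in the experiments.
print(messages[0]["role"], len(messages))
```

Keeping the system prompt fixed while sweeping `temperature` is one simple way to reproduce the kind of randomness-parameter experiment the abstract describes.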
|
20 |
[pt] GERAÇÃO DE DESCRIÇÕES DE PRODUTOS A PARTIR DE AVALIAÇÕES DE USUÁRIOS USANDO UM LLM / [en] PRODUCT DESCRIPTION GENERATION FROM USER REVIEWS USING A LLM
BRUNO FREDERICO MACIEL GUTIERREZ. 04 June 2024 (has links)
[en] In the context of e-commerce, product descriptions have a great influence on the shopping experience. Well-made descriptions should ideally inform a potential consumer about relevant product details, clarifying potential doubts and facilitating the purchase. Generating good descriptions, however, is a costly activity that traditionally requires human effort. At the same time, a large number of products are launched every day. In this context, this work presents a new methodology for the automated generation of product descriptions, using reviews left by users as a source of information. The proposed method consists of three steps: (i) the extraction of sentences suitable for a description from the reviews; (ii) the selection of sentences among the candidates; and (iii) the generation of the product description from the selected sentences using a Large Language Model (LLM) in a zero-shot way. We evaluate the quality of descriptions generated by our method by comparing them to real product descriptions posted by the sellers themselves. In this evaluation, we had the collaboration of 30 evaluators, and we verified that our descriptions are preferred more often than the original descriptions, being considered more informative, readable, and relevant. Furthermore, in the same evaluation we replicated a method from the recent literature and performed a statistical test comparing its results with our method; this comparison showed that our method generates more informative and preferred descriptions overall.
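Step (iii), zero-shot generation from the selected sentences, amounts to assembling the sentences into a single instruction for the LLM. The prompt wording below is an assumption; the dissertation does not reproduce its exact prompt here:

```python
# Sketch of the zero-shot prompt-assembly step: the sentences selected in
# steps (i)-(ii) become grounding material for a single LLM instruction.

def build_description_prompt(product: str, sentences: list[str]) -> str:
    """Assemble selected review sentences into a zero-shot generation prompt."""
    bullet_list = "\n".join(f"- {s}" for s in sentences)
    return (
        f"Write a concise, factual product description for '{product}' "
        f"using only the information in these user-review sentences:\n"
        f"{bullet_list}\n"
        f"Description:"
    )

selected = [
    "The battery lasts about two days with normal use.",
    "The aluminum body feels sturdy.",
]
prompt = build_description_prompt("Phone X1", selected)
print(prompt)
```

Constraining the model to "only the information in these sentences" is what makes the approach zero-shot yet grounded: no task-specific fine-tuning, but also no invitation for the LLM to invent product attributes the reviews never mention.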
|