41 |
Large language models as an interface to interact with API tools in natural language. Tesfagiorgis, Yohannes Gebreyohannes; Monteiro Silva, Bruno Miguel. January 2023
In this research project, we explore the use of Large Language Models (LLMs) as an interface for interacting with API tools in natural language. Bubeck et al. [1] shed some light on how LLMs could be used to interact with API tools. Since then, new versions of LLMs have been launched, and the question of how reliable an LLM can be at this task remains unanswered. The main goal of our thesis is to investigate the designs of the available system prompts for LLMs, identify the best-performing prompts, and evaluate the reliability of different LLMs when using those prompts. We employ a multi-stage controlled experiment: first, a literature review revealing the system prompts used in the scientific community and in open-source projects; then, using the F1-score as a metric, an analysis of the precision and recall of these system prompts to select the best-performing ones for interacting with API tools; and in a later stage, a comparison of a selection of LLMs using the best-performing prompts identified earlier. From these experiments, we find that AI-generated system prompts outperform the prompts currently used in open-source projects and the literature with GPT-4, that zero-shot prompts perform better on this specific task with GPT-4, and that a good system prompt for one model does not generalize well to other models.
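The F1-score used to rank system prompts is the harmonic mean of precision and recall. A minimal sketch of such a comparison (the tool-call counts below are illustrative, not the thesis's data):

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example: a system prompt triggers 8 correct API calls, 2 spurious ones,
# and misses 2 expected calls.
p, r, f1 = precision_recall_f1(8, 2, 2)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.8 0.8 0.8
```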
|
42 |
ChatGPT’s Performance on the Brief Electricity and Magnetism Assessment. Melin, Jakob; Önerud, Elias. January 2024
In this study, we tested the performance of ChatGPT-4 on the concept inventory Brief Electricity and Magnetism Assessment (BEMA) to understand its potential as an educational tool in physics, especially in tasks requiring visual interpretation. Our results indicate that ChatGPT-4 performs similarly to undergraduate students in introductory electromagnetism courses, with an average score close to that of the students. However, ChatGPT-4 displayed significant differences compared to students, particularly in tasks involving complex visual elements such as electrical circuits and magnetic field diagrams. While ChatGPT-4 was proficient in proposing correct physical reasoning, it struggled with accurately interpreting visual information. These findings suggest that while ChatGPT-4 can be a useful supplementary tool for students, it should not be relied upon as a primary tutor for subjects heavily dependent on visual interpretation. Instead, it could be more effective as a peer, where its outputs are critically evaluated by students. Further research should focus on improving ChatGPT’s visual processing capabilities and exploring its role in diverse educational contexts.
|
43 |
Multimodal Multi-label Classification with Small Foundation Models. Martin Björkdahl, Liv. January 2024
The use of electronic health records (EHR) from various sources like text, images and time-series data to make predictions or diagnoses has been researched previously. Many previous methods have used separate models either for separate modalities or for distinct tasks. Recently, models trained to make medical predictions using multimodal input have emerged, as a unified approach would be beneficial for health practitioners. We present a single model to make medical predictions for several tasks, using diverse input from different modalities. We demonstrate the effectiveness of using an autoencoder method to project EHR data from three different modalities – images, text and time-series data – into the small language model Gemma-2B. Six projector models are used together with the small language model to perform multi-label prediction for 12 different medical prediction tasks. Results show that a jointly trained model using asymmetric loss, a loss function that dynamically emphasises positives that are poorly predicted, performs well and predicts evenly across tasks.
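The asymmetric loss mentioned above can be sketched as follows. This is a simplified NumPy version in the spirit of the asymmetric loss of Ridnik et al. (2021); the hyperparameter values are illustrative defaults, not the thesis's settings:

```python
import numpy as np

def asymmetric_loss(probs, targets, gamma_pos=1.0, gamma_neg=4.0, clip=0.05):
    """Asymmetric loss for multi-label classification (sketch).
    Hard positives are emphasised via the (1 - p)**gamma_pos focusing term,
    while easy negatives are down-weighted and probability-shifted by `clip`."""
    probs = np.clip(probs, 1e-8, 1 - 1e-8)
    # Positive term: large when a true label is predicted with low probability.
    loss_pos = targets * (1 - probs) ** gamma_pos * np.log(probs)
    # Negative term: shift probabilities so very easy negatives contribute ~0.
    probs_neg = np.clip(probs - clip, 1e-8, 1.0)
    loss_neg = (1 - targets) * probs_neg ** gamma_neg * np.log(1 - probs_neg)
    return -(loss_pos + loss_neg).mean()

# A poorly predicted positive incurs a much larger loss than a
# well-predicted one, which is what pushes training toward hard positives.
hard = asymmetric_loss(np.array([0.1]), np.array([1.0]))
easy = asymmetric_loss(np.array([0.9]), np.array([1.0]))
print(hard > easy)  # True
```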
|
44 |
Navigating Microservices with AI : Design Patterns and Communication Techniques in Modern IT Industries. Wijewarna Arachchige, Shehan Thamoda. January 2024
Identifying the most used design patterns and communication methods for a microservices architecture holds paramount importance in ensuring system stability within the IT industry. With a plethora of approaches available, encompassing various design patterns and communication techniques, the selection of technologies often hinges upon the expertise and knowledge of software designers. However, designing microservices solely through human effort poses significant challenges for companies. Moreover, the monitoring aspect presents its own set of complexities in the realm of microservices. This research aims to elucidate the prevalent design patterns and communication methods utilized in modern IT microservices environments. It also seeks to demonstrate the integration of Artificial Intelligence tools such as ChatGPT into the architecture design process, particularly in enhancing monitoring capabilities within microservices. Subsequently, a prototype is proposed to offer design guidance, as recommended by customized ChatGPT and Llama 2 models in accordance with the specified requirements. Additionally, the study seeks to identify prevalent design patterns and communication methods employed within the IT industry through surveys and interviews conducted with both technical and non-technical personnel.
|
45 |
Large Language Models as Advanced Data Preprocessors : Transforming Unstructured Text into Fine-Tuning Datasets. Vangeli, Marius. January 2024
The digital landscape increasingly generates vast amounts of unstructured textual data, valuable for analytics and various machine learning (ML) applications. These vast stores of data, often likened to digital gold, are challenging to process and utilize. Traditional text processing methods, lacking the ability to generalize, typically struggle with unstructured and unlabeled data. For many complex data management workflows, the solution involves human intervention in the form of manual curation and labeling — a time-consuming process. Large Language Models (LLMs) are AI models trained on vast amounts of text data. They have remarkable Natural Language Processing (NLP) capabilities and offer a promising alternative. This thesis serves as an empirical case study of LLMs as advanced data preprocessing tools. It explores the effectiveness and limitations of using LLMs to automate and refine traditionally challenging data preprocessing tasks, highlighting a critical area of research in data management. An LLM-based preprocessing pipeline, designed to clean and prepare raw textual data for use in ML applications, is implemented and evaluated. This pipeline was applied to a corpus of unstructured text documents, extracted from PDFs, with the aim of transforming them into a fine-tuning dataset for LLMs. The efficacy of the LLM-based preprocessing pipeline was assessed by comparing the results against a manually curated benchmark dataset using two text similarity metrics: the Levenshtein distance and the ROUGE score. The findings indicate that although LLMs are not yet capable of fully replacing human curation in complex data management workflows, they substantially improve the efficiency and manageability of preprocessing unstructured textual data.
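The Levenshtein distance used as one of the two similarity metrics counts the minimum number of single-character edits (insertions, deletions, substitutions) needed to turn one string into another. A minimal sketch:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits (insert, delete,
    substitute) needed to transform string a into string b."""
    # Dynamic programming over a single rolling row of the edit matrix.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
```

A distance of 0 means the LLM-cleaned text matches the benchmark exactly; larger values quantify how far the pipeline's output drifts from the manual curation.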
|
46 |
Digital Platform Dynamics: Governance, Market Design and AI Integration. Ilango Guru Muniasamy. 17 July 2024
<p dir="ltr">In my dissertation, I examine the dynamics of digital platforms, starting with the governance practices of established platforms, then exploring innovative design approaches, and finally the integration of advanced AI technologies in platforms. I structure this exploration into three essays: in the first essay, I discuss moderation processes in online communities; in the second, I propose a novel design for a blockchain-based green bond exchange; and in the third, I examine how AI-based decision-making platforms can be enhanced through synthetic data generation.</p><p dir="ltr">In my first essay, I investigate the role of moderation in online communities, focusing on its effect on users' participation in community moderation. Using data from a prominent online forum, I analyze changes in users' moderation actions (upvoting and downvoting of others' content) after they experience a temporary account suspension. While I find no significant change in their upvoting behavior, my results suggest that users downvote more after their suspension. Combined with findings on lower quality and conformity with the community while downvoting, the results suggest an initial increase in hostile moderation after suspension, although these effects dissipate over time. The short-term hostility post-suspension has the potential to negatively affect platform harmony, thus revealing the complexities of disciplinary actions and their unintended consequences.</p><p dir="ltr">In the second essay, I shift from established platforms to innovations in platform design, presenting a novel hybrid green bond exchange that integrates blockchain technology with thermodynamic principles to address market volatility and regulatory uncertainty. The green bond market, despite its high growth, faces issues like greenwashing, liquidity constraints, and limited retail investor participation. 
To tackle these challenges, I propose an exchange framework that uses blockchain for green bond tokenization, enhancing transparency and accessibility. By conceptualizing the exchange as a thermodynamic system, I ensure economic value is conserved and redistributed, promoting stability and efficiency. I include key mechanisms in the design to conserve value in the exchange and deter speculative trading. Through simulations, I demonstrate significant improvements in market stability, liquidity, and efficiency, highlighting the effectiveness of this interdisciplinary approach and offering a robust framework for future financial system development.</p><p dir="ltr">In the third essay, I explore the integration of advanced AI technologies, focusing on how large language models (LLMs) like GPT can be adapted for specialized fields such as education policy and decision-making. To address the need for high-quality, domain-specific training data, I develop a methodology that combines agent-based simulation (ABS) with synthetic data generation and GPT fine-tuning. This enhanced model provides accurate, contextually relevant, and interpretable insights for educational policy scenarios. My approach addresses challenges such as data scarcity, privacy concerns, and the need for diverse, representative data. Experiments show significant improvements in model performance and robustness, offering policymakers a powerful tool for exploring complex scenarios and making data-driven decisions. This research advances the literature on synthetic data in AI and agent-based modeling in education, demonstrating the adaptability of large language models to specialized domains.</p>
|
47 |
RAG-based data extraction : Mining information from second-life battery documents. Edström, Jesper. January 2024
With the constant evolution of Large Language Models (LLMs), methods for minimizing hallucinations are being developed to provide more truthful answers. With Retrieval-Augmented Generation (RAG), external data can be provided to the model on which its answers should be based. This project uses RAG in a data extraction pipeline tailored to second-life batteries. Because the prompts are pre-defined, the user only provides the documents to be analyzed; this ensures that the answers are in the correct format for further data processing. To process different document types, initial labeling takes place before more specific extraction suited to each document is applied. The best performance is achieved by grouping questions, which allows the model to reason about which questions are relevant so that no hallucinations occur. The model performs equally well regardless of whether there are two or three document types, and it is clear that a pipeline of this type is well suited to today's models. Further improvements could be achieved by using models with a larger context window and by initially applying Optical Character Recognition (OCR) to read text from the documents.
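The retrieval step at the heart of such a RAG pipeline can be illustrated with a toy similarity search. Real pipelines use embedding models rather than raw term frequencies, and the function names and document snippets below are stand-ins, not the thesis's implementation:

```python
from collections import Counter
import math

def cosine_tf(query: str, chunk: str) -> float:
    """Cosine similarity between simple term-frequency vectors.
    (A stand-in for the embedding model a real RAG pipeline would use.)"""
    q, c = Counter(query.lower().split()), Counter(chunk.lower().split())
    dot = sum(q[t] * c[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in c.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question; these are placed
    in the LLM prompt so answers are grounded in the provided documents."""
    return sorted(chunks, key=lambda ch: cosine_tf(question, ch),
                  reverse=True)[:k]

chunks = ["nominal capacity 50 kWh",
          "warranty terms and conditions",
          "battery chemistry NMC"]
print(retrieve("what is the battery capacity", chunks))
```

Only the retrieved chunks reach the model, which is what constrains its answers to the source documents and reduces hallucination.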
|
48 |
Parameter Efficiency in Fine-Tuning Pretrained Large Language Models for Downstream Tasks. Dorairaj, Jonathan. January 2024
This thesis investigates Parameter-Efficient Fine-Tuning (PEFT) methods, specifically Low-Rank Adaptation (LoRA) (Hu et al. 2021) and Adapters (Houlsby et al. 2019), using the General Language Understanding Evaluation (GLUE) dataset (Wang et al. 2019). The primary focus is to evaluate the effectiveness and efficiency of these methods in fine-tuning pre-trained language models. Additionally, we introduce a novel application by applying the methodology from Yang et al. 2024 to the adapter module weights. We utilize Laplace approximations over both the LoRA (Yang et al. 2024, Daxberger et al. 2022a) and the newly adapted Adapter weights, assessing the Expected Calibration Error (ECE) and Negative Log-Likelihood (NLL). Furthermore, we discuss practical considerations such as training time, memory usage, and storage space implications of these PEFT techniques. The findings provide valuable insights into the trade-offs and benefits of using LoRA and Adapters for fine-tuning in resource-constrained environments.
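LoRA's parameter efficiency comes from freezing the pretrained weight matrix and learning only a low-rank update. A minimal NumPy sketch (dimensions chosen for illustration; this is the general LoRA idea from Hu et al. 2021, not the thesis's experimental setup):

```python
import numpy as np

def lora_forward(x, W, A, B, scale=1.0):
    """LoRA forward pass: y = x @ (W + scale * B @ A).T.
    W (d_out x d_in) is frozen; only the low-rank factors
    A (r x d_in) and B (d_out x r) are trained."""
    return x @ (W + scale * (B @ A)).T

d_in, d_out, r = 768, 768, 8
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, init to 0

full = d_out * d_in                     # params updated by full fine-tuning
lora = r * (d_in + d_out)               # params updated by LoRA
print(f"trainable params: {lora} vs full fine-tuning: {full}")

# With B initialised to zero, the adapted model starts out
# identical to the frozen base model.
x = rng.normal(size=(1, d_in))
assert np.allclose(lora_forward(x, W, A, B), x @ W.T)
```

The same parameter-count arithmetic is what makes storing one adapter per downstream task cheap compared with storing a full fine-tuned copy of the model.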
|
49 |
Comparative Analysis of ChatGPT-4 and Gemini Advanced in Erroneous Code Detection and Correction. Sun, Erik Wen Han; Grace, Yasine. January 2024
This thesis investigates the capabilities of two advanced Large Language Models (LLMs), OpenAI’s ChatGPT-4 and Google’s Gemini Advanced, in the domain of software engineering. While LLMs are widely utilized across various applications, including text summarization and synthesis, their potential for detecting and correcting programming errors has not been thoroughly explored. This study aims to fill this gap by conducting a comprehensive literature search and an experimental comparison of ChatGPT-4 and Gemini Advanced using the QuixBugs and LeetCode benchmark datasets, with a specific focus on the Python and Java programming languages. The research evaluates the models’ abilities to detect and correct bugs using metrics such as Accuracy, Recall, Precision, and F1-score. Experimental results show that ChatGPT-4 consistently outperforms Gemini Advanced in both the detection and correction of bugs. These findings provide valuable insights that could guide further research in the field of LLMs.
|
50 |
Can you trust ChatGPT? A qualitative study of students’ trust in ChatGPT in learning contexts. Härnström, Alexandra; Bergh, Isak Eljas. January 2023
The world's technological development is advancing rapidly, especially when it comes to "smart" machines and algorithms with the ability to adapt to their surroundings. This is partly due to the enormous amount of available data and partly thanks to increased storage capacity. In November 2022, one of the latest AI-based programs was released: the chatbot ChatGPT. This web-based software can engage in real-time conversations with users by answering text-based questions. By quickly, and often accurately, answering users' questions in a human-like and convincing manner, the service generated a lot of attention in a short period of time. Within two months, ChatGPT had over 100 million users. Several studies show that a large number of people lack a general trust in AI. Some studies argue that the responses generated by ChatGPT cannot always be assumed to be completely accurate and should therefore be followed up with extensive fact-checking, as they may otherwise contribute to the spread of false information. Since trust in AI has been shown to be an important part of how well the technology develops and integrates, a lack of trust in services like ChatGPT can be a hindrance to effective usage. Despite the increased productivity observed when companies adopt AI technology, it has not been integrated to the same extent within higher education as an aid for students. By determining the level of trust that students have in ChatGPT in learning contexts, valuable information can be obtained to assist in the integration of such AI technology. However, there is a lack of specific research on students' trust in ChatGPT in learning contexts. Therefore, this study aims to fill this knowledge gap. Our research question is: "What trust do students have in ChatGPT in a learning context?". The study was conducted through semi-structured interviews with eight students who had used ChatGPT in learning contexts. The interviews generated qualitative data that were analyzed using thematic analysis, and the results showed that students' trust in ChatGPT in learning contexts depends on a number of factors. During the analysis, six themes were identified as relevant for answering the research question: • Experiences • Usage • ChatGPT's character • External influences • Organizations • Future trust
|