  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

[en] SUMMARIZATION OF HEALTH SCIENCE PAPERS IN PORTUGUESE / [pt] SUMARIZAÇÃO DE ARTIGOS CIENTÍFICOS EM PORTUGUÊS NO DOMÍNIO DA SAÚDE

DAYSON NYWTON C R DO NASCIMENTO 30 October 2023 (has links)
[en] In this work, we present a study on fine-tuning a pre-trained Large Language Model (LLM) for abstractive summarization of long texts in Portuguese. To do so, we built a corpus gathering 7,450 public Health Sciences papers in Portuguese. We fine-tuned a pre-trained BERT model for Brazilian Portuguese (BERTimbau) on this corpus. Under similar conditions, we also trained a second model based on Long Short-Term Memory (LSTM) from scratch, for comparison purposes. Our evaluation showed that the fine-tuned model achieved higher ROUGE scores, outperforming the LSTM-based model by 30 points in F1-score. The fine-tuned model also stood out in a qualitative evaluation performed by human assessors, to the point of generating the perception that its summaries could have been created by humans, on a collection of documents specific to the Health Sciences domain.
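As an illustration of the approach described above, here is a minimal sketch of how BERTimbau can be tied into an encoder-decoder for abstractive summarization and scored with ROUGE, assuming the Hugging Face transformers and evaluate libraries; the corpus loading and training loop are omitted, and this is not necessarily the thesis's exact setup.

```python
# Sketch: BERTimbau encoder-decoder for Portuguese summarization + ROUGE.
# Assumptions: Hugging Face checkpoint name and this tying scheme; not the
# thesis's exact configuration.
from transformers import AutoTokenizer, EncoderDecoderModel
import evaluate

ckpt = "neuralmind/bert-base-portuguese-cased"  # BERTimbau base checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = EncoderDecoderModel.from_encoder_decoder_pretrained(ckpt, ckpt)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# ... fine-tune with Seq2SeqTrainer on (article, abstract) pairs ...

# Evaluation: ROUGE F-measures between generated and reference abstracts
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["o modelo gera um resumo do artigo"],
    references=["o modelo produz um resumo do artigo científico"],
)
print(scores)  # rouge1 / rouge2 / rougeL
```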
2

ChatGPT: A Good Computer Engineering Student? : An Experiment on its Ability to Answer Programming Questions from Exams

Loubier, Michael January 2023 (has links)
The release of ChatGPT has set new standards for what an artificial intelligence chatbot should be. It has even shown its potential in answering university-level exam questions from different subjects. This research is focused on evaluating its capabilities in programming subjects. To achieve this, coding questions taken from software engineering exams were posed to the AI (N = 23) in an experiment. Statistical analysis was then done to find out how good a student ChatGPT is by analyzing its answers' correctness, degree of completion, diversity of response, speed of response, extraneity, number of errors, length of response, and confidence levels. GPT-3.5 is the version analyzed. The experiment used questions from three different programming subjects. The results showed a 93% rate of correct answer generation, demonstrating the AI's competence. However, it was found that the AI occasionally produces unnecessary lines of code that were not asked for, which were treated as extraneity. The confidence levels given by ChatGPT, which were always high, also did not always align with response quality, which shows the subjectivity of the AI's self-assessment. Answer diversity was also a concern: most answers were written in nearly the same way across repeated runs. Moreover, when the answers did vary, the variation brought much more extraneous code. If ChatGPT were blind-tested on a software engineering exam containing a good number of coding questions, unnecessary lines of code and comments could be what gives it away as an AI. Nonetheless, ChatGPT was found to have great potential as a learning tool. It can offer explanations, debugging help, and coding guidance just as any other tool or person could. It is not perfect, though, so it should be used with caution.
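A minimal sketch of the kind of harness such an experiment implies: posing exam questions to GPT-3.5 through the OpenAI chat-completions API and recording response time and length. The question texts and recorded fields are hypothetical stand-ins; grading against the exam's marking scheme would follow manually.

```python
# Sketch: pose exam questions to GPT-3.5 and log timing/length for analysis.
# The questions and logged fields are illustrative, not the study's own.
import time
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

questions = [
    "Write a C function that reverses a null-terminated string in place.",
    "Implement a Java method that returns the n-th Fibonacci number iteratively.",
]

results = []
for q in questions:
    start = time.time()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": q}],
        temperature=0,  # reduce run-to-run variation when studying answer diversity
    )
    answer = response.choices[0].message.content
    results.append({"question": q,
                    "answer": answer,
                    "seconds": time.time() - start,
                    "length": len(answer)})

# Correctness, extraneity, and error counts are then graded by hand
for r in results:
    print(f"{r['seconds']:.1f}s, {r['length']} chars")
```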
3

Innovating the Study of Self-Regulated Learning: An Exploration through NLP, Generative AI, and LLMs

Gamieldien, Yasir 12 September 2023 (has links)
This dissertation explores the use of natural language processing (NLP) and large language models (LLMs) to analyze student self-regulated learning (SRL) strategies in response to exam wrappers. Exam wrappers are structured reflection activities that prompt students to practice SRL after they get their graded exams back. The dissertation consists of three manuscripts that compare traditional qualitative analysis with NLP-assisted approaches using transformer-based models, including GPT-3.5, a state-of-the-art LLM. The data set comprises 3,800 student responses from an engineering physics course. The first manuscript develops two NLP-assisted codebooks for identifying learning strategies related to SRL in exam wrapper responses and evaluates the agreement between them and traditional qualitative analysis. The second manuscript applies a novel NLP technique called zero-shot learning (ZSL) to classify student responses into the codes developed in the first manuscript and assesses the accuracy of this method by evaluating a subset of the full dataset. The third manuscript identifies the distribution and differences of learning strategies and SRL constructs among students of different exam performance profiles, using the results from the second manuscript. The dissertation demonstrates the potential of NLP and LLMs to enhance qualitative research by providing scalable, robust, and efficient methods for analyzing large corpora of textual data. The dissertation also contributes to the understanding of SRL in engineering education by revealing the common learning strategies, impediments, and SRL constructs that students report they use while preparing for exams in a first-year engineering physics course. The dissertation suggests implications, limitations, and directions for future research on NLP, LLMs, and SRL. / Doctor of Philosophy / This dissertation is about using artificial intelligence (AI) to help researchers and teachers understand how students learn from their exams. Exams are not only a way to measure what students know, but also a chance for students to reflect on how they studied and what they can do better next time. One way that students can reflect is by using exam wrappers, which are short questions that students answer after they get their graded exams back. A type of AI called natural language processing (NLP) is used in this dissertation, which can analyze text and find patterns and meanings in it. This study also uses a powerful AI tool called GPT-3.5, which can generate text and answer questions. The dissertation has three manuscripts that compare the traditional way of analyzing exam wrappers, which is done by hand, with the new way of using NLP and GPT-3.5; evaluate a specific promising NLP method; and use this method to try to gain a deeper understanding of students' self-regulated learning (SRL) while preparing for exams. The data comes from 3,800 exam wrappers from a physics course for engineering students. The first manuscript develops a way of using NLP and GPT-3.5 to find out what learning strategies and goals students talk about in their exam wrappers and compares it to more traditional methods of analysis. The second manuscript tests how accurate a specific NLP technique is in finding these strategies and goals. The third manuscript uses the NLP technique from the second manuscript to look at how different students use different strategies and goals depending on how well they did on the exams.
I found that NLP and GPT-3.5 can help analyze exam wrappers faster and provide nuanced insights compared with manual approaches. The dissertation also shows which learning strategies and goals are most discussed by engineering students as they prepare for exams. The dissertation gives some suggestions, challenges, and ideas for future research on AI and learning from exams.
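A minimal sketch of zero-shot classification (ZSL) as applied here, assuming a natural-language-inference model from Hugging Face; the candidate codes are hypothetical stand-ins for the dissertation's codebook.

```python
# Sketch: zero-shot classification of an exam-wrapper response into SRL codes.
# The code labels below are illustrative, not the dissertation's codebook.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

codes = ["practice problems", "reviewing notes", "seeking help",
         "time management", "test anxiety"]

response = ("Next time I will start the homework earlier and rework "
            "the problems I missed on the last exam.")

# multi_label=True: a response may reflect several strategies at once
result = classifier(response, candidate_labels=codes, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```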
4

From Bytecode to Safety : Decompiling Smart Contracts for Vulnerability Analysis

Darwish, Malek January 2024 (has links)
This thesis investigated the use of Large Language Models (LLMs) for vulnerability analysis of decompiled smart contracts. A controlled experiment was conducted in which an automated system was developed to decompile smart contracts using two decompilers, Dedaub and Heimdall-rs, and subsequently analyze them using three LLMs: OpenAI's GPT-4 and GPT-3.5, as well as Meta's CodeLlama. The study focused on assessing the effectiveness of the LLMs at identifying a range of vulnerabilities. The evaluation method included the collection and comparative analysis of performance metrics such as precision, recall, and F1-score. Our results show the LLM-decompiler pairing of Dedaub and GPT-4 to exhibit impressive detection capabilities across a range of vulnerabilities, while failing to detect some vulnerabilities at which CodeLlama excelled. We demonstrated the potential of LLMs to improve smart contract security and set the stage for future research to further expand on this domain.
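A minimal sketch of the evaluation step described above: scoring one LLM-decompiler pairing's findings against ground-truth labels with precision, recall, and F1. The label vectors are hypothetical.

```python
# Sketch: precision/recall/F1 for vulnerability detection. Each position is
# a flattened (contract, vulnerability-class) pair; 1 = vulnerability present.
# Ground truth vs. one LLM-decompiler pairing's findings; values illustrative.
from sklearn.metrics import precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", zero_division=0)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```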
5

Improving Rainfall Index Insurance: Evaluating Effects of Fine-Scale Data and Interactive Tools in the PRF-RI Program

Ramanujan, Ramaraja 04 June 2024 (has links)
Since its inception, the Pasture, Rangeland, and Forage Rainfall Index (PRF-RI) insurance program has issued a total of $8.8 billion in payouts. Given the program's significance, this thesis investigates methodologies to help improve it. For the first part, we evaluated the impact of finer-scale precipitation data on insurance payouts: we created a simulated scenario in which all parameters are held constant except the rainfall index, computed once with the program's current dataset and once with a finer-scale precipitation dataset, and compared the resulting payout distributions. The analysis for Texas in 2021 revealed that computing the rainfall index from the finer-scale dataset would result in payouts worth $27 million less than under the current dataset. The second part of the research involved the development of two interactive decision-support tools: the "Next-Gen PRF" web tool and the "AgInsurance LLM" chatbot. These tools were designed to help users understand complex insurance parameters and make informed decisions about their insurance policies. User studies for the "Next-Gen PRF" tool measured usability, comprehension, decision-making efficiency, and user experience, showing that it outperforms traditional methods by providing insightful visualizations and detailed descriptions. The findings suggest that using fine-scale precipitation data and advanced decision-support technologies can substantially benefit the PRF-RI program by reducing spatial basis risk and promoting user education, leading to higher user engagement and enrollment. / Master of Science / The Pasture, Rangeland, and Forage Rainfall Index (PRF-RI) program helps farmers manage drought risk. Since it started, it has paid farmers about $8.8 billion. This study looks into ways to improve the program. We first examined whether using rain data at a finer spatial resolution could affect how much money is paid out. For Texas in 2021, we found that using this finer-resolution data could have reduced payouts by $27 million, underscoring the importance of evaluating our proposed change. Additionally, we created two new tools to help farmers understand and choose their insurance options more easily: the "Next-Gen PRF" web tool and the "AgInsurance LLM" chatbot. These tools provide clear visuals and explanations. User studies with these tools show they help users learn more effectively and make more informed decisions compared to existing tools. Overall, our research suggests that finer-resolution precipitation data and these interactive tools can enhance the insurance program by making it easier to engage with, enabling farmers to evaluate whether and how the program can help them manage their weather risk.
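A simplified sketch of rainfall-index payout logic (not the exact RMA formula) showing how identical policy parameters can yield different payouts under two precipitation datasets; the index values are hypothetical.

```python
# Sketch: index-insurance payout triggered when rainfall falls below the
# coverage trigger. Simplified for illustration; not the official PRF-RI
# indemnity calculation. All numbers are hypothetical.
def payout(index: float, coverage: float = 0.90,
           protection: float = 10_000.0) -> float:
    """index: interval rainfall as a fraction of the long-run average
    (1.0 = normal); coverage: trigger level (payouts start below it);
    protection: dollar amount of protection for the interval."""
    shortfall = max(coverage - index, 0.0) / coverage
    return shortfall * protection

# Same policy, two datasets: coarse-grid vs. finer-scale index for one interval
coarse_index, fine_index = 0.72, 0.81
print(payout(coarse_index))  # larger payout under the coarse dataset
print(payout(fine_index))    # smaller payout under the finer-scale dataset
```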
6

Analyzing Large Language Models For Classifying Sexual Harassment Stories With Out-of-Vocabulary Word Substitution

Seung Yeon Paik (18419409) 25 April 2024 (has links)
<p dir="ltr">Sexual harassment is regarded as a serious issue in society, with a particularly negative impact on young children and adolescents. Online sexual harassment has recently gained prominence as a significant number of communications have taken place online. Online sexual harassment can happen anywhere in the world because of the global nature of the internet, which transcends geographical barriers and allows people to communicate electronically. Online sexual harassment can occur in a wide variety of environments such as through work mail or chat apps in the workplace, on social media, in online communities, and in games (Chawki & El Shazly, 2013).<br>However, especially for non-native English speakers, due to cultural differences and language barriers, may vary in their understanding or interpretation of text-based sexual harassment (Welsh, Carr, MacQuarrie, & Huntley, 2006). To bridge this gap, previous studies have proposed large language models to detect and classify online sexual harassment, prompting a need to explore how language models comprehend the nuanced aspects of sexual harassment data. Prior to exploring the role of language models, it is critical to recognize the current gaps in knowledge that these models could potentially address in order to comprehend and interpret the complex nature of sexual harassment.</p><p><br></p><p dir="ltr">The Large Language Model (LLM) has attracted significant attention recently due to its exceptional performance on a broad spectrum of tasks. However, these models are characterized by being very sensitive to input data (Fujita et al., 2022; Wei, Wang, et al., 2022). Thus, the purpose of this study is to examine how various LLMs interpret data that falls under the domain of sexual harassment and how they comprehend it after replacing Out-of-Vocabulary words.</p><p dir="ltr"><br>This research examines the impact of Out-of-Vocabulary words on the performance of LLMs in classifying sexual harassment behaviors in text. The study compares the story classification abilities of cutting-edge LLM, before and after the replacement of Out-of-Vocabulary words. Through this investigation, the study provides insights into the flexibility and contextual awareness of LLMs when managing delicate narratives in the context of sexual harassment stories as well as raises awareness of sensitive social issues.</p>
7

Augmenting Large Language Models with Humor Theory To Understand Puns

Ryan Rony Dsilva (18429846) 25 April 2024 (has links)
<p dir="ltr">This research explores the application of large language models (LLMs) to comprehension of puns. Leveraging the expansive capabilities of LLMs, this study delves into the domain of pun classification by examining it through the prism of two humor theories: the Computational Model of Humor and the Benign Violation theory, which is an extension of the N+V Theory. The computational model posits that for a phrase to qualify as a pun, it must possess both ambiguity and distinctiveness, characterized by a word that can be interpreted in two plausible ways, each interpretation being supported by at least one unique word. On the other hand, the Benign Violation theory posits that puns work by breaching one linguistic rule while conforming to another, thereby creating a "benign violation." By leveraging the capabilities of large language models (LLMs), this research endeavors to scrutinize a curated collection of English language puns. Our aim is to assess the validity and effectiveness of the use of these theoretical frameworks in accurately classifying puns. We undertake controlled experiments on the dataset, selectively removing a condition specific to one theory and then evaluating the puns based on the criteria of the other theory to see how well it classifies the altered inputs. This approach allows us to uncover deeper insights into the processes that facilitate the recognition of puns and to explore the practical implications of applying humor theories. The findings of our experiments, detailed in the subsequent sections, sheds light on how the alteration of specific conditions impacts the ability of the LLMs to accurately classify puns, according to each theory, where each component of the theory does not influence the result to the same extent, thereby contributing to our understanding of humor mechanics through the eyes of LLMs.</p>
8

Automating CPV Classification: A Study of Whether Large Language Models Combined with Word Embeddings Can Solve CPV Categorization of Public Procurements

Andersson, Niklas, Andersson Sjöberg, Hanna January 2024 (has links)
This study explores the use of Large Language Models and word embeddings to automate the categorization of CPV codes in Swedish public procurements. Previous studies have not achieved reliable categorization, but this experiment tests a new method combining the LLMs Mistral and Llama3 with FastText word embeddings. The results show that although the study's solution can correctly identify some CPV main groups, its overall performance is low: 12% of procurements were classified fully correctly, and 35% were classified partially correctly with at least one correct CPV main group found. Improvements are needed in both correctness and precision. The study contributes to the research field by demonstrating the challenges of, and potential solutions for, automated categorization of public procurements. It also proposes future research using larger and more advanced models to address the identified challenges.
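A minimal sketch of one plausible reading of the embedding step: ranking CPV main groups by cosine similarity between FastText sentence vectors, assuming pre-trained Swedish vectors; the CPV descriptions shown are abbreviated and the LLM step is not reproduced.

```python
# Sketch: rank CPV main groups for a tender description via FastText
# similarity. Model file, group descriptions, and the example tender are
# assumptions for illustration.
import numpy as np
import fasttext

model = fasttext.load_model("cc.sv.300.bin")  # pre-trained Swedish vectors

cpv_groups = {
    "45": "Anläggningsarbete",                      # construction work
    "72": "IT-tjänster och programvaruutveckling",  # IT services
    "90": "Avlopps- och avfallshantering",          # sewage and refuse
}

def embed(text: str) -> np.ndarray:
    v = model.get_sentence_vector(text)
    return v / (np.linalg.norm(v) + 1e-9)  # unit-normalize for cosine sim

tender = "Upphandling av systemutveckling och förvaltning av IT-system"
t_vec = embed(tender)

ranked = sorted(cpv_groups.items(),
                key=lambda kv: float(np.dot(t_vec, embed(kv[1]))),
                reverse=True)
print(ranked[0])  # expected: ("72", "IT-tjänster ...")
```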
9

Towards On-Premise Hosted Language Models for Generating Documentation in Programming Projects

Hedlund, Ludvig January 2024 (has links)
Documentation for programming projects varies in both quality and availability. Availability tends to vary more in closed working environments, since fewer developers will read the documentation. Documenting programming projects can be demanding on worker hours and unappreciated among developers: it is a common perception that developers would rather invest time in developing a project than in documenting it, so making the documentation process more efficient would benefit them. To move towards a more automated documentation process, this work generated documentation for repositories, attempting to summarize each repository's use cases and functionality. Two implementations were created to generate documentation using an on-premise hosted large language model (LLM) as a tool. First, the embedded solution processes all available code in a project and creates the documentation from multiple summarizations of files and folders. Second, the RAG solution uses only the most important parts of the code and lets the LLM create the documentation from a smaller subset of the codebase. The results show that generating documentation is possible but unreliable, and the output must be checked by a person with knowledge of the codebase. The embedded solution appears more reliable and produces better results, but is more costly than the RAG solution.
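A minimal sketch of the RAG solution as described: embed code chunks, retrieve the most relevant ones, and ask an on-premise LLM (behind any OpenAI-compatible local server) to draft the documentation. Model names, paths, and the one-file-per-chunk scheme are assumptions.

```python
# Sketch: retrieval-augmented documentation generation against a locally
# hosted LLM. Server URL, model name, and repo path are placeholders.
from pathlib import Path
from sentence_transformers import SentenceTransformer, util
from openai import OpenAI  # works with OpenAI-compatible local servers (e.g. vLLM)

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

# Chunking: one file = one chunk (real systems split large files further)
chunks = [p.read_text() for p in Path("repo/src").rglob("*.py")]
chunk_vecs = embedder.encode(chunks, convert_to_tensor=True)

query = "What are the main use cases and entry points of this project?"
hits = util.semantic_search(embedder.encode(query, convert_to_tensor=True),
                            chunk_vecs, top_k=3)[0]
context = "\n\n".join(chunks[h["corpus_id"]] for h in hits)

draft = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user",
               "content": f"Write a README-style summary of this code:\n{context}"}],
)
print(draft.choices[0].message.content)
```

As the thesis notes, output like this still needs review by someone who knows the codebase before it is published as documentation.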
10

Improving Context Awareness of Transformer Networks using Retrieval-Augmented Generation

Do, Anh, Tran, Saga January 2024 (has links)
The Thermo-Calc software is a key tool in the research process for many materials engineers. However, integrating multiple modules in Thermo-Calc requires the user to write code in a Python-based language, which can be challenging for novice programmers. This project aims to enable the generation of such code from user prompts using existing generative AI models. In particular, we use a retrieval-augmented generation architecture applied to LLaMA and Mistral models. We use Code LLaMA-Instruct models with 7, 13, and 34 billion parameters, and a Mistral-Instruct model with 7 billion parameters; these models are all based on LLaMA 2. We also use a LLaMA 3-Instruct model with 8 billion parameters. All these models are instruction-tuned, which suggests that they can interpret natural language and identify appropriate options for a command-line program such as Python. In our testing, the LLaMA 3-Instruct model performed best, achieving 53% on the industry benchmark HumanEval and 49% on our internal adequacy assessment at pass@1, the expected probability of getting a correct solution when generating a single response. This indicates that the model gets approximately every other answer correct. Due to GPU memory limitations, we had to apply quantisation to run the 13 and 34 billion parameter models. Our results revealed a mismatch between model size and optimal levels of quantisation, indicating that reduced precision adversely affects the performance of these models. Our findings suggest that a properly customised large language model can greatly reduce the coding effort of novice programmers, thereby improving productivity in materials research.
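A minimal sketch of the unbiased pass@k estimator used by HumanEval-style benchmarks (Chen et al., 2021); at k = 1 it reduces to the fraction of correct samples, matching the "approximately every other answer correct" reading. The counts below are hypothetical.

```python
# Sketch: unbiased pass@k estimator, pass@k = 1 - C(n-c, k) / C(n, k),
# computed in the numerically stable product form.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = samples generated per task, c = samples passing the tests,
    k = evaluation budget."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# e.g. 20 samples per task, 10 passing: pass@1 = 0.5
print(pass_at_k(n=20, c=10, k=1))
```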
