1 |
Transforming SDOH Screening: Towards a General Framework for Transformer-based Prediction of Social Determinants of Health
King III, Kenneth Hale, 09 September 2024 (has links)
Social Determinants of Health (SDOH) play a crucial role in healthcare outcomes, yet identifying them from unstructured patient data remains a challenge. This research explores the potential of Large Language Models (LLMs) for automated SDOH identification from patient notes. We propose a simple, general framework for SDOH screening. We leverage existing SDOH datasets, adapting and combining them to create a more comprehensive benchmark for this task, addressing the research gap of limited datasets. Using this benchmark and the proposed framework, we conduct several preliminary experiments exploring and comparing promising LLM system implementations. Our findings highlight the potential of LLMs for automated SDOH screening while emphasizing the need for more robust datasets and evaluation frameworks. / Master of Science / Social Determinants of Health (SDOH) have been shown to significantly impact health outcomes and are seen as a major contributor to global health inequities. However, their use within the healthcare industry is still significantly underemphasized, largely because of the difficulty of manually identifying SDOH factors. While previous works have explored automated approaches to SDOH identification, they lack standardization, data transparency, and robustness, and are largely outdated compared with the latest Artificial Intelligence (AI) approaches. Therefore, in this work we propose a holistic framework for automated SDOH identification. We also present a higher-quality SDOH benchmark, created by merging existing publicly available datasets, standardizing them, and cleaning them of errors. With this benchmark, we then conducted experiments to gain greater insight into performance across different state-of-the-art AI approaches.
Through this work, we contribute a better way to think about automated SDOH screening systems, the first publicly accessible multi-clinic and multi-annotator benchmark, and greater insights into the latest AI approaches for state-of-the-art results.
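At its core, the proposed framework amounts to a per-category screening loop over patient notes. A minimal sketch of that shape, with invented SDOH categories and a keyword stub standing in for the LLM call (the thesis does not prescribe this implementation):

```python
# Illustrative sketch of a simple SDOH screening loop. The categories and the
# keyword-based classify() stub are hypothetical stand-ins for an LLM call.

SDOH_CATEGORIES = {
    "housing": ["homeless", "eviction", "shelter"],
    "food": ["food insecurity", "skips meals"],
    "transportation": ["no car", "missed appointment due to transport"],
}

def classify(note: str, keywords: list[str]) -> bool:
    """Stub: in a real system this would be an LLM prompt per category."""
    text = note.lower()
    return any(kw in text for kw in keywords)

def screen_note(note: str) -> dict[str, bool]:
    """Return one flag per SDOH category for a single patient note."""
    return {cat: classify(note, kws) for cat, kws in SDOH_CATEGORIES.items()}

note = "Patient reports being homeless and frequently skips meals."
flags = screen_note(note)
```

Swapping the stub for a per-category LLM prompt leaves the surrounding framework unchanged, which is what makes the screening loop general.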
|
2 |
“Imagine You’re a Qualitative Researcher”: Exploring the Possibilities and Limitations of Gen-AI for Thematic AnalysisNarkiewicz, Nicole 01 January 2024 (has links) (PDF)
The integration of technology and technological advancements in qualitative research has transformed the research process over the past several decades. Various tools, applications, and devices have expanded opportunities for multimodal field sites and created possibilities for online observations, focus groups, and interviews. As a result of new technologies and innovations, methods of qualitative data collection and analysis have transformed, and methodological approaches have evolved. As new technologies emerge, it is important to understand their impacts on the research process. With the increasing accessibility of large language models, researchers and institutions must carefully assess the implications for qualitative analysis. In this qualitative methodological dissertation, I explore the possibilities and limitations of utilizing generative artificial intelligence (GenAI) to analyze text data. I demonstrate the knowledge required to approach thematic analysis using Copilot Pro in Word (Copilot). I discuss the methodological decisions I encountered while exploring Copilot's features for qualitative analysis and explain my reasoning for choosing to utilize Copilot for this study. Further, I compare the results from a traditionally human-conducted thematic analysis to the codes, categories, and themes generated by Copilot. In the process, I developed the criteria by which I compared the outcomes while also evaluating Copilot's output against the American Educational Research Association's Standards for Reporting on Empirical Social Science Research. The findings provide insights into opportunities and limitations in leveraging GenAI tools in the qualitative research process. Through this methodological study I demonstrate Copilot's capabilities in generating codes inductively, grouping codes into categories, and developing themes.
I argue that researchers need to balance the capabilities of the tool with an understanding of its limitations, particularly concerning time and efficiency, transparency regarding its analytic process, the reliability of its responses, the presentation of its outcomes, and the level of support provided to substantiate its claims.
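One simple, hypothetical way to make such a human-versus-Copilot comparison concrete is to measure the overlap between the two code sets; the codes below are invented for illustration, not taken from the dissertation:

```python
# Jaccard overlap between a human-generated and a GenAI-generated set of
# qualitative codes, on normalized code labels. The code sets are invented.

def jaccard(a: set[str], b: set[str]) -> float:
    """Share of codes the two codebooks have in common."""
    return len(a & b) / len(a | b) if a | b else 1.0

human_codes = {"time pressure", "trust in tool", "transparency"}
ai_codes = {"time pressure", "transparency", "output format"}

overlap = jaccard(human_codes, ai_codes)  # 2 shared codes out of 4 distinct
```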
|
3 |
[en] SUMMARIZATION OF HEALTH SCIENCE PAPERS IN PORTUGUESE / [pt] SUMARIZAÇÃO DE ARTIGOS CIENTÍFICOS EM PORTUGUÊS NO DOMÍNIO DA SAÚDE
Dayson Nywton C. R. do Nascimento, 30 October 2023 (has links)
[pt] Neste trabalho, apresentamos um estudo sobre o fine-tuning de um LLM (Modelo de Linguagem Amplo ou Large Language Model) pré-treinado para a sumarização abstrativa de textos longos em português. Para isso, construímos um corpus contendo uma coleção de 7.450 artigos científicos na área de Ciências da Saúde em português. Utilizamos esse corpus para o fine-tuning do modelo BERT pré-treinado para o português brasileiro (BERTimbau). Em condições semelhantes, também treinamos um segundo modelo baseado em Memória de Longo Prazo e Recorrência (LSTM) do zero, para fins de comparação. Nossa avaliação mostrou que o modelo ajustado obteve pontuações ROUGE mais altas, superando o modelo baseado em LSTM em 30 pontos no F1-score. O fine-tuning do modelo pré-treinado também se destaca em uma avaliação qualitativa feita por avaliadores, a ponto de gerar a percepção de que os resumos gerados poderiam ter sido criados por humanos, em uma coleção de documentos específicos do domínio das Ciências da Saúde. / [en] In this work, we present a study on the fine-tuning of a pre-trained Large Language Model for abstractive summarization of long texts in Portuguese. To do so, we built a corpus gathering a collection of 7,450 public Health Sciences papers in Portuguese. We fine-tuned a pre-trained BERT model for Brazilian Portuguese (BERTimbau) on this corpus. Under similar conditions, we also trained a second model, based on Long Short-Term Memory (LSTM), from scratch for comparison purposes. Our evaluation showed that the fine-tuned model achieved higher ROUGE scores, outperforming the LSTM-based model by 30 points in F1-score. The fine-tuned model also stood out in a qualitative evaluation performed by human assessors, to the point that the generated summaries were perceived as potentially human-written, on a collection of domain-specific Health Sciences documents.
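The ROUGE comparison reported above can be illustrated with a hand-rolled ROUGE-1 F1; real evaluations would use a maintained library, and this toy version only shows what the metric measures:

```python
# Minimal ROUGE-1 F1: unigram-overlap precision and recall between a
# reference summary and a candidate, combined into F1. For illustration only.
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the model summarizes health papers",
                  "the model summarizes papers")
```

Here precision is 4/4 and recall 4/5, so the F1 lands at 8/9; the thesis reports the analogous score gap of 30 F1 points between the fine-tuned and LSTM models.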
|
4 |
Semantic Transformation of Natural-Language Queries into Database Query Languages: Design and Implementation of a Speech-Controlled Interface for the Semantic Transformation of Natural-Language Queries into Database Query Languages, Using OntoChem's SciWalker as an Example
Horstkorte, Garlef, 17 December 2024 (has links)
This bachelor's thesis deals with the development of a software solution for the semantic and syntactic transformation of natural language into database query languages. The goal is to create a user-friendly interface that enables even non-experts to perform complex database queries. During an internship at OntoChem GmbH, a rule-based prototype was first developed that transforms natural-language queries into machine-readable database queries. This approach was then compared with an approach based on Large Language Models (LLMs), such as ChatGPT. Among other criteria, the efficiency, accuracy, reliability, and economic cost of both approaches were examined.
The thesis begins with an introduction to the fundamentals of natural language processing (NLP), rule-based systems, and LLMs. A detailed description of the internship project follows, including the technologies and tools used. The subsequent chapters present, implement, and test the rule-based approach and the LLM approach for transforming natural language into database queries. The comparative analysis shows that the rule-based approach stands out for its high speed and control over data, but is limited in flexibility and accuracy. The LLM approach, by contrast, offers higher accuracy and flexibility in interpreting natural language, but exhibits longer response times and higher operating costs. Finally, recommendations for practice are given and future research directions are outlined, such as combining both approaches or training a dedicated model. The results of this work help improve the interaction between natural language and database systems and offer practical solutions for the semantic transformation of user queries.
Contents:
1 Introduction
1.1 Motivation
1.2 Objectives of the Thesis
1.3 Structure of the Thesis
2 Background and Theoretical Foundations
2.1 Natural Language Processing (NLP)
2.1.1 Fundamentals of NLP
2.1.2 Models and Algorithms
2.1.3 Areas of Application
2.2 Rule-Based Systems
2.2.1 Definition and Mode of Operation
2.2.2 Examples and Applications
2.3 Large Language Models (LLMs)
2.3.1 Mode of Operation and Architecture
2.3.2 Development and Technologies
2.3.3 Training and Data Basis
2.3.4 Areas of Application
2.3.5 Limitations of GPT Models
3 Internship Project at OntoChem GmbH
3.1 Company Overview
3.1.1 Overview and History
3.1.2 Products and Technologies
3.2 Project Description
3.2.1 Project Goal
3.2.2 Task Definition
3.3 Technology Stack and Tools
3.3.1 Programming Language and Environment
3.3.2 Libraries
4 Rule-Based Approach to Transforming Natural Language into Database Queries
4.1 API Design
4.1.1 Methodology and Conception
4.1.2 structFromNaturalSearch
4.1.3 queryFromSearchStructure
4.2 Implementation
4.2.1 Function: SearchStructureFromString
4.2.2 Integration of OC Technologies
4.2.3 Algorithms and Rules
4.2.4 Challenges
5 LLM Approach to Transforming Natural Language into Database Queries
5.1 Introduction to the LLM Approach
5.1.1 Fundamentals
5.1.2 Comparison with Rule-Based Systems
5.2 Prompting in LLMs (e.g. ChatGPT)
5.2.1 Principles of Prompting
5.2.2 Designing Effective Prompts
5.3 Tests and Evaluation
5.3.1 Description of the Tests
5.3.2 Results and Analysis
6 Comparison of the Approaches
6.1 Methodology
6.2 Results
6.3 Discussion
7 Evaluation and Outlook
7.1 Critical Review
7.2 Limitations and Sources of Error
7.3 Conclusion and Implications
7.4 Future Research
Bibliography
List of Figures
Data and Code Index
|
5 |
ChatGPT: A Good Computer Engineering Student? An Experiment on its Ability to Answer Programming Questions from Exams
Loubier, Michael, January 2023 (has links)
The release of ChatGPT has set new standards for what an artificial intelligence chatbot should be. It has even shown potential in answering university-level exam questions across different subjects. This research evaluates its capabilities in programming subjects. To that end, coding questions taken from software engineering exams (N = 23) were posed to the AI in an experiment. Statistical analysis was then performed to assess how strong a student ChatGPT is, examining the correctness, degree of completion, diversity, speed, extraneity, number of errors, length, and confidence level of its answers. GPT-3.5 is the version analyzed. The experiment used questions from three different programming subjects. The results showed a 93% rate of correct answer generation, demonstrating competence. However, the AI occasionally produced unnecessary lines of code that were not asked for, which were treated as extraneity. The confidence levels given by ChatGPT, which were always high, also did not always align with response quality, showing the subjectiveness of the AI's self-assessment. Answer diversity was a further concern: most answers were written in nearly the same way across repetitions, and when answers did vary, they also contained much more extraneous code. If ChatGPT were blind-tested on a software engineering exam containing a good number of coding questions, unnecessary lines of code and comments could be what gives it away as an AI. Nonetheless, ChatGPT was found to have great potential as a learning tool: it can offer explanations, debugging help, and coding guidance just as any other tool or person could. It is not perfect, though, so it should be used with caution.
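The kind of descriptive statistics used in the experiment can be sketched as follows, with invented grading records rather than the study's actual N = 23 results:

```python
# Descriptive statistics over graded answers: correct-answer rate and mean
# count of extraneous code lines. The records below are invented examples.

answers = [
    {"correct": True, "extraneous_lines": 2},
    {"correct": True, "extraneous_lines": 0},
    {"correct": False, "extraneous_lines": 5},
    {"correct": True, "extraneous_lines": 1},
]

n = len(answers)
correct_rate = sum(a["correct"] for a in answers) / n
avg_extraneous = sum(a["extraneous_lines"] for a in answers) / n
```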
|
6 |
Innovating the Study of Self-Regulated Learning: An Exploration through NLP, Generative AI, and LLMs
Gamieldien, Yasir, 12 September 2023 (has links)
This dissertation explores the use of natural language processing (NLP) and large language models (LLMs) to analyze student self-regulated learning (SRL) strategies in response to exam wrappers. Exam wrappers are structured reflection activities that prompt students to practice SRL after they get their graded exams back. The dissertation consists of three manuscripts that compare traditional qualitative analysis with NLP-assisted approaches using transformer-based models including GPT-3.5, a state-of-the-art LLM. The data set comprises 3,800 student responses from an engineering physics course. The first manuscript develops two NLP-assisted codebooks for identifying learning strategies related to SRL in exam wrapper responses and evaluates the agreement between them and traditional qualitative analysis. The second manuscript applies a novel NLP technique called zero-shot learning (ZSL) to classify student responses into the codes developed in the first manuscript and assesses the accuracy of this method by evaluating a subset of the full dataset. The third manuscript identifies the distribution and differences of learning strategies and SRL constructs among students of different exam performance profiles using the results from the second manuscript. The dissertation demonstrates the potential of NLP and LLMs to enhance qualitative research by providing scalable, robust, and efficient methods for analyzing large corpora of textual data. The dissertation also contributes to the understanding of SRL in engineering education by revealing the common learning strategies, impediments, and SRL constructs that students report they use while preparing for exams in a first-year engineering physics course. The dissertation suggests implications, limitations, and directions for future research on NLP, LLMs, and SRL. 
/ Doctor of Philosophy / This dissertation is about using artificial intelligence (AI) to help researchers and teachers understand how students learn from their exams. Exams are not only a way to measure what students know, but also a chance for students to reflect on how they studied and what they can do better next time. One way students can reflect is by using exam wrappers, which are short questions that students answer after they get their graded exams back. This dissertation uses a type of AI called natural language processing (NLP), which can analyze text and find patterns and meanings in it. It also uses a powerful AI tool called GPT-3.5, which can generate text and answer questions. The dissertation has three manuscripts that compare the traditional way of analyzing exam wrappers, which is done by hand, with a new way using NLP and GPT-3.5; evaluate a specific, promising NLP method; and use this method to gain a deeper understanding of students' self-regulated learning (SRL) while preparing for exams. The data comes from 3,800 exam wrappers from a physics course for engineering students. The first manuscript develops a way of using NLP and GPT-3.5 to find out what learning strategies and goals students talk about in their exam wrappers and compares it to more traditional methods of analysis. The second manuscript tests how accurate a specific NLP technique is at finding these strategies and goals. The third manuscript looks at how different students use different strategies and goals depending on how well they did on the exams, using the NLP technique from the second manuscript. I found that NLP and GPT-3.5 can help analyze exam wrappers faster and provide nuanced insights when compared with manual approaches. The dissertation also shows which learning strategies and goals engineering students discuss most as they prepare for exams.
The dissertation gives some suggestions, challenges, and ideas for future research on AI and learning from exams.
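The zero-shot classification idea from the second manuscript can be illustrated with a toy similarity-based classifier; real ZSL as used in the dissertation relies on a pretrained NLI model, and the labels and texts below are invented examples of exam-wrapper strategies:

```python
# Toy zero-shot classification: score a student response against each label
# description by bag-of-words cosine similarity and pick the best label.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def zero_shot(text: str, labels: dict[str, str]) -> str:
    """Assign the label whose description is most similar to the text."""
    bag = Counter(text.lower().split())
    scores = {lab: cosine(bag, Counter(desc.lower().split()))
              for lab, desc in labels.items()}
    return max(scores, key=scores.get)

labels = {
    "practice problems": "solved more practice problems before the exam",
    "time management": "planned study time and started studying earlier",
}
best = zero_shot("I worked through extra practice problems", labels)
```

The appeal of zero-shot methods here is the same as in the dissertation: no labeled exam-wrapper data is needed, only natural-language descriptions of the codes.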
|
7 |
From Bytecode to Safety: Decompiling Smart Contracts for Vulnerability Analysis
Darwish, Malek, January 2024 (has links)
This thesis investigated the use of Large Language Models (LLMs) for vulnerability analysis of decompiled smart contracts. In a controlled experiment, an automated system was developed to decompile smart contracts using two decompilers, Dedaub and Heimdall-rs, and subsequently analyze them using three LLMs: OpenAI's GPT-4 and GPT-3.5, as well as Meta's CodeLlama. The study focuses on assessing how effectively the LLMs identify a range of vulnerabilities. The evaluation included the collection and comparative analysis of performance metrics such as precision, recall, and F1-score. Our results show that the LLM-decompiler pairing of Dedaub and GPT-4 exhibits impressive detection capabilities across a range of vulnerabilities, while failing to detect some vulnerabilities at which CodeLlama excelled. We demonstrate the potential of LLMs to improve smart contract security and set the stage for future research to further expand on this domain.
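The precision, recall, and F1 metrics used in such an evaluation can be computed directly from confusion counts; the counts below are invented, not the thesis's results:

```python
# Precision, recall, and F1 from true-positive, false-positive, and
# false-negative counts of a vulnerability detector. Numbers are invented.

def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# e.g. a detector that found 8 real vulnerabilities, flagged 2 spurious
# ones, and missed 2
p, r, f = prf1(tp=8, fp=2, fn=2)
```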
|
8 |
Analyzing Large Language Models For Classifying Sexual Harassment Stories With Out-of-Vocabulary Word Substitution
Seung Yeon Paik (18419409), 25 April 2024 (has links)
<p dir="ltr">Sexual harassment is regarded as a serious issue in society, with a particularly negative impact on young children and adolescents. Online sexual harassment has recently gained prominence as a significant share of communication has moved online. Because the global nature of the internet transcends geographical barriers and allows people to communicate electronically, online sexual harassment can happen anywhere in the world. It can occur in a wide variety of environments, such as through work mail or chat apps in the workplace, on social media, in online communities, and in games (Chawki & El Shazly, 2013).<br>However, understanding and interpretation of text-based sexual harassment may vary, especially among non-native English speakers, due to cultural differences and language barriers (Welsh, Carr, MacQuarrie, & Huntley, 2006). To bridge this gap, previous studies have proposed large language models to detect and classify online sexual harassment, prompting a need to explore how language models comprehend the nuanced aspects of sexual harassment data. Before exploring the role of language models, it is critical to recognize the current gaps in knowledge that these models could potentially address in order to comprehend and interpret the complex nature of sexual harassment.</p><p><br></p><p dir="ltr">The Large Language Model (LLM) has attracted significant attention recently due to its exceptional performance on a broad spectrum of tasks. However, these models are highly sensitive to input data (Fujita et al., 2022; Wei, Wang, et al., 2022).
Thus, the purpose of this study is to examine how various LLMs interpret data in the domain of sexual harassment and how they comprehend it after Out-of-Vocabulary words are replaced.</p><p dir="ltr"><br>This research examines the impact of Out-of-Vocabulary words on the performance of LLMs in classifying sexual harassment behaviors in text. The study compares the story-classification abilities of cutting-edge LLMs before and after the replacement of Out-of-Vocabulary words. Through this investigation, the study provides insights into the flexibility and contextual awareness of LLMs when managing delicate narratives in the context of sexual harassment stories, and raises awareness of sensitive social issues.</p>
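The Out-of-Vocabulary substitution step can be sketched as a simple vocabulary lookup; the vocabulary, synonym map, and example sentence below are invented stand-ins for the study's actual resources:

```python
# Minimal OOV substitution: words missing from a model's vocabulary are
# replaced with an in-vocabulary synonym (or an unknown-word marker) before
# classification. Vocabulary and synonym map are invented for illustration.

VOCAB = {"he", "sent", "me", "unwanted", "messages", "at", "work"}
SYNONYMS = {"dms": "messages", "workplace": "work"}

def substitute_oov(text: str) -> str:
    out = []
    for word in text.lower().split():
        if word in VOCAB:
            out.append(word)
        else:
            out.append(SYNONYMS.get(word, "[UNK]"))
    return " ".join(out)

clean = substitute_oov("He sent me unwanted dms at workplace")
```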
|
9 |
Augmenting Large Language Models with Humor Theory To Understand Puns
Ryan Rony Dsilva (18429846), 25 April 2024 (has links)
<p dir="ltr">This research explores the application of large language models (LLMs) to the comprehension of puns. Leveraging the expansive capabilities of LLMs, this study examines pun classification through the prism of two humor theories: the Computational Model of Humor and the Benign Violation theory, an extension of the N+V Theory. The computational model posits that for a phrase to qualify as a pun, it must possess both ambiguity and distinctiveness: a word that can be interpreted in two plausible ways, with each interpretation supported by at least one unique word. The Benign Violation theory, on the other hand, posits that puns work by breaching one linguistic rule while conforming to another, thereby creating a "benign violation." Using these models, we scrutinize a curated collection of English-language puns, aiming to assess the validity and effectiveness of these theoretical frameworks in accurately classifying puns. We undertake controlled experiments on the dataset, selectively removing a condition specific to one theory and then evaluating the puns against the criteria of the other theory to see how well it classifies the altered inputs. This approach allows us to uncover deeper insights into the processes that facilitate the recognition of puns and to explore the practical implications of applying humor theories. The findings of our experiments, detailed in the subsequent sections, shed light on how altering specific conditions impacts the ability of LLMs to accurately classify puns according to each theory; the components of a theory do not all influence the result to the same extent. These results contribute to our understanding of humor mechanics through the eyes of LLMs.</p>
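The ambiguity-plus-distinctiveness criterion of the Computational Model of Humor can be turned into a toy check; the sense lexicon below is a hand-made stand-in for a resource such as WordNet, not the study's method:

```python
# Toy check of the pun criterion: a phrase is a pun candidate if some word
# has two senses, each supported by at least one distinct context word in
# the phrase. The miniature sense lexicon is invented for illustration.

SENSES = {  # word -> {sense: supporting context words}
    "interest": {
        "curiosity": {"hobby", "fascinating"},
        "finance": {"bank", "loan", "rate"},
    },
}

def is_pun_candidate(phrase: str) -> bool:
    words = set(phrase.lower().replace(".", "").split())
    for word in words:
        senses = SENSES.get(word, {})
        supported = [s for s, ctx in senses.items() if ctx & words]
        if len(supported) >= 2:  # ambiguity with distinct support
            return True
    return False

result_pun = is_pun_candidate(
    "The bank raised my interest in a fascinating new rate.")
result_plain = is_pun_candidate("The bank raised my interest rate.")
```

In the first phrase, "interest" is supported in both its curiosity sense ("fascinating") and its finance sense ("bank", "rate"); in the second, only the finance sense has support, so the criterion fails.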
|
10 |
Large Language Models for Unsupervised Keyphrase Extraction and Biomedical Data Analytics
Haoran Ding (18825838), 03 September 2024 (links)
<p dir="ltr">Natural Language Processing (NLP), a vital branch of artificial intelligence, is designed to equip computers with the ability to comprehend and manipulate human language, facilitating the extraction and utilization of textual data. NLP plays a crucial role in harnessing the vast quantities of textual data generated daily, facilitating meaningful information extraction. Among the various techniques, keyphrase extraction stands out due to its ability to distill concise information from extensive texts, making it invaluable for summarizing and navigating content efficiently. The process of keyphrase extraction usually begins by generating candidates first and then ranking them to identify the most relevant phrases. Keyphrase extraction can be categorized into supervised and unsupervised approaches. Supervised methods typically achieve higher accuracy as they are trained on labeled data, which allows them to effectively capture and utilize patterns recognized during training. However, the dependency on extensive, well-annotated datasets limits their applicability in scenarios where such data is scarce or costly to obtain. On the other hand, unsupervised methods, while free from the constraints of labeled data, face challenges in capturing deep semantic relationships within text, which can impact their effectiveness. Despite these challenges, unsupervised keyphrase extraction holds significant promise due to its scalability and lower barriers to entry, as it does not require labeled datasets. This approach is increasingly favored for its potential to aid in building extensive knowledge bases from unstructured data, which can be particularly useful in domains where acquiring labeled data is impractical. 
As a result, unsupervised keyphrase extraction is not only a valuable tool for information retrieval but also a pivotal technology for the ongoing expansion of knowledge-driven applications in NLP.</p><p dir="ltr">In this dissertation, we introduce three innovative unsupervised keyphrase extraction methods: AttentionRank, AGRank, and LLMRank. Additionally, we present a method for constructing knowledge graphs from unsupervised keyphrase extraction, leveraging the self-attention mechanism. The first study discusses the AttentionRank model, which utilizes a pre-trained language model to derive underlying importance rankings of candidate phrases through self-attention. This model employs a cross-attention mechanism to assess the semantic relevance between each candidate phrase and the document, enhancing the phrase ranking process. AGRank, detailed in the second study, is a sophisticated graph-based framework that merges deep learning techniques with graph theory. It constructs a candidate phrase graph using mutual attentions from a pre-trained language model. Both global document information and local phrase details are incorporated as enhanced nodes within the graph, and a graph algorithm is applied to rank the candidate phrases. The third study, LLMRank, leverages the strengths of large language models (LLMs) and graph algorithms. It employs LLMs to generate keyphrase candidates and then integrates global information through the text's graphical structures. This process reranks the candidates, significantly improving keyphrase extraction performance. The fourth study explores how self-attention mechanisms can be used to extract keyphrases from medical literature and generate query-related phrase graphs, improving text retrieval visualization. The mutual attentions of medical entities, extracted using a pre-trained model, form the basis of the knowledge graph. 
This, coupled with a specialized retrieval algorithm, allows for the visualization of long-range connections between medical entities while simultaneously displaying the supporting literature. In summary, our exploration of unsupervised keyphrase extraction and biomedical data analysis introduces novel methods and insights in NLP, particularly in information extraction. These contributions are crucial for the efficient processing of large text datasets and suggest avenues for future research and applications.</p>
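The attention-based ranking idea behind AttentionRank can be illustrated with a toy scorer; the attention weights below are invented rather than taken from a real pre-trained model, and real systems aggregate attention across heads and layers:

```python
# Toy attention-based candidate ranking: score each candidate phrase by the
# sum of per-token attention weights toward the document. Weights invented.

attention = {  # token -> made-up attention weight toward the document
    "keyphrase": 0.9, "extraction": 0.8, "the": 0.05, "method": 0.3,
}

def score(phrase: str) -> float:
    """Sum the attention mass carried by a candidate phrase's tokens."""
    return sum(attention.get(tok, 0.0) for tok in phrase.split())

candidates = ["keyphrase extraction", "the method"]
ranked = sorted(candidates, key=score, reverse=True)
```

Graph-based variants such as AGRank would then feed these phrase scores into a candidate graph rather than using them directly.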
|