41

Exploring GPT models as biomedical knowledge bases : By evaluating prompt methods for extracting information from language models pre-trained on scientific articles

Hellberg, Ebba January 2023 (has links)
Scientific findings recorded in the literature continuously guide scientific advancement, but manual approaches to accessing that knowledge are insufficient given the sheer quantity of information and data available. Although pre-trained language models are being explored for their utility as knowledge bases and structured data repositories, research on this application in the biomedical domain is lacking. The aim of this project was therefore to determine how Generative Pre-trained Transformer (GPT) models pre-trained on articles in the biomedical domain can be used to make relevant information more accessible. Several models (BioGPT, BioGPT-Large, and BioMedLM) were evaluated on the task of extracting chemical-protein relations between entities directly from the models through prompting. Prompts were formulated either as natural language text or as an ordered triple, and were provided in different settings (few-shot, one-shot, or zero-shot). Model predictions were evaluated quantitatively as a multiclass classification task using a macro-averaged F1-score. Of the explored methods, the best performance for extracting chemical-protein relations from article abstracts was obtained with a triple-based text prompt on the largest model, BioMedLM, in the few-shot setting, albeit with only a small improvement over the baseline (+0.019 F1). No clear pattern emerged for which prompt setting was favourable in terms of task performance; however, the triple-based prompt was generally more robust than the natural language formulation. The two smaller models underperformed the random baseline (by at best -0.026 and -0.001 F1). The impact of the prompt method was minimal in the smallest model, and the one-shot setting was the least sensitive to the prompt formulation in all models, whereas the differences between prompt methods were more pronounced in the few-shot setting of the larger models (+0.021-0.038 F1).
The results suggest that both the prompting method and the size of the model affect a language model's knowledge-eliciting performance. Admittedly, the models mostly underperformed the baseline, and future work needs to examine how to adapt generative language models to this task. Future research could also investigate what impact automatic prompt-design methods and larger in-domain models have on model performance.
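The two prompt formulations and the macro-averaged F1 evaluation described above can be sketched as follows. This is an illustrative sketch only: the relation labels, the `triple_prompt` helper, and the exact prompt wording are assumptions, not the thesis's actual prompts.

```python
# Hypothetical chemical-protein relation labels (the thesis's label set may differ).
LABELS = ["activator", "inhibitor", "substrate"]

def triple_prompt(chemical, protein, abstract, demos=()):
    """Build an ordered-triple prompt. `demos` is a sequence of
    ((chemical, relation, protein), abstract_text) few-shot examples;
    an empty sequence yields a zero-shot prompt."""
    parts = []
    for (c, r, p), text in demos:                  # few-shot demonstrations
        parts.append(f"{text}\n({c}, {r}, {p})")
    parts.append(f"{abstract}\n({chemical},")      # model continues with the relation
    return "\n\n".join(parts)

def macro_f1(gold, pred, labels=LABELS):
    """Macro-averaged F1 for a multiclass task: the unweighted mean
    of the per-class F1 scores."""
    f1s = []
    for lab in labels:
        tp = sum(g == lab and p == lab for g, p in zip(gold, pred))
        fp = sum(g != lab and p == lab for g, p in zip(gold, pred))
        fn = sum(g == lab and p != lab for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(labels)
```

Because the macro average weights each class equally, rare relation types influence the score as much as frequent ones, which is why small shifts such as +0.019 F1 can matter.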
42

A Neurophysiologically-Inspired Statistical Language Model

Dehdari, Jonathan 02 October 2014 (has links)
No description available.
43

QUERYING DATABASES WITH NATURAL LANGUAGE: THE USE OF LARGE LANGUAGE MODELS FOR TEXT-TO-SQL TASKS

EDUARDO ROGER SILVA NASCIMENTO 23 May 2024 (has links)
The Text-to-SQL task involves generating an SQL query based on a given relational database and a natural language (NL) question. While the leaderboards of well-known benchmarks indicate that Large Language Models (LLMs) excel at this task, they are evaluated on databases with fairly simple schemas. This dissertation first investigates the performance of LLM-based Text-to-SQL models on a complex, openly available database (Mondial) with a large schema and a set of 100 NL questions. Running under GPT-3.5 and GPT-4, the results of this first experiment show that LLM-based tools perform significantly worse than reported in the benchmarks and struggle with schema linking and joins, suggesting that the relational schema may not be suitable for LLMs. The dissertation then proposes using LLM-friendly views and data descriptions to improve accuracy on the Text-to-SQL task. In a second experiment, using the strategy with the best performance, cost, and benefit from the first experiment and another set of 100 questions over a real-world database, the results show that the proposed approach is sufficient to considerably improve the accuracy of the prompt strategy. The work concludes with a discussion of the results obtained and suggests further approaches to simplify the Text-to-SQL task.
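The prompt-assembly step such a Text-to-SQL pipeline relies on can be sketched as below. This is a minimal sketch under stated assumptions: the function name, the prompt wording, and the example schema are illustrative, not the dissertation's actual prompts; the string it produces would be sent to a chat-completion model such as GPT-4.

```python
def build_text_to_sql_prompt(schema_ddl, question, view_descriptions=""):
    """Assemble a Text-to-SQL prompt from the schema DDL, optional
    LLM-friendly view/data descriptions (the dissertation's proposal),
    and the natural language question."""
    parts = [
        "Given the database schema below, write a single SQL query "
        "that answers the question.",
        schema_ddl,
    ]
    if view_descriptions:
        # LLM-friendly descriptions replace raw relational detail the
        # model struggles with (schema linking, joins).
        parts.append("Descriptions of the views and their columns:\n"
                     + view_descriptions)
    parts.append("Question: " + question)
    parts.append("SQL:")
    return "\n\n".join(parts)

# Toy schema in the spirit of Mondial (not its real DDL).
schema = "CREATE TABLE country (name TEXT, code TEXT, population INTEGER);"
prompt = build_text_to_sql_prompt(schema,
                                  "Which country has the largest population?")
```

The design point is that what varies between the two experiments is only the middle section of the prompt: raw schema versus curated views and descriptions.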
44

GENERATING SQL FROM NATURAL LANGUAGE IN FEW-SHOT AND ZERO-SHOT SCENARIOS

Asplund, Liam January 2024 (has links)
Making information stored in databases more accessible to users inexperienced in Structured Query Language (SQL) by converting natural language to SQL queries has long been a prominent research area in both the database and natural language processing (NLP) communities. Numerous approaches have been proposed for this task, such as encoder-decoder frameworks, semantic grammars, and, more recently, large language models (LLMs). When adapting LLMs to generate SQL queries from natural language questions, three notable methods are used: pretraining, transfer learning, and in-context learning (ICL). ICL is particularly advantageous when hardware is limited, time is a concern, and large amounts of task-specific labeled data are nonexistent. This study evaluates two ICL strategies, zero-shot and few-shot, using the Mistral-7B-Instruct LLM. The few-shot scenarios were evaluated using two demonstration-selection techniques: random selection and Jaccard similarity. The zero-shot scenarios served as a baseline for the few-shot scenarios to overcome, which ended as anticipated: few-shot prompting with Jaccard similarity outperformed the other two methods, followed by few-shot prompting with random selection, with the zero-shot scenarios performing worst. Evaluation results based on execution accuracy and exact-matching accuracy confirm that leveraging similarity when selecting demonstration examples enhances the model's knowledge of the database schema and table names used during the inference phase, leading to more accurately generated SQL queries than leveraging diversity in the demonstration examples.
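Jaccard-similarity demonstration selection, as used in the few-shot scenarios above, can be sketched as follows. The whitespace tokenisation and the `(question, sql)` pool format are assumptions for illustration; the study's exact preprocessing may differ.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def select_demonstrations(question, pool, k=3):
    """Pick the k pool examples whose questions are most similar to the
    new question. `pool` is a list of (question, sql) pairs; the chosen
    pairs become the few-shot demonstrations in the prompt."""
    q_tokens = set(question.lower().split())
    ranked = sorted(pool,
                    key=lambda ex: jaccard(q_tokens, set(ex[0].lower().split())),
                    reverse=True)
    return ranked[:k]
```

Selecting by similarity tends to surface demonstrations that mention the same tables and columns as the incoming question, which is the mechanism the study credits for the accuracy gain over random selection.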
45

Fake News Detection : Using a Large Language Model for Accessible Solutions

Jurgell, Fredrik, Borgman, Theodor January 2024 (has links)
This work attempts to create a fake news detection tool using a large language model (LLM), with emphasis on validating the effectiveness of the approach and then making the tooling readily available. The tool combines the current gpt-4-turbo-preview model and its assistant capabilities with simple prompts tailored to different objectives. While tools that detect fake news and simplify the process are not new, insight into how and why they work is not commonly available, most likely due to the monetization of current services. This work therefore builds an open-source platform that others can expand upon, giving insight into the prompts used and providing a baseline for experimentation, further development, or inspiration. Articles that are not willfully written as fake but merely miss key data are, unsurprisingly, very hard to detect; however, common tabloid-style news, which is often shared to provoke an emotional response, shows more promising detection results.
46

Automating Software Development Processes Through Multi-Agent Systems : A Study in LLM-based Software Engineering

Peltomaa Åström, Samuel, Winoy, Simon January 2024 (has links)
In the ever-evolving landscape of software development, the demand for more efficient, scalable, and automated processes is paramount. The advancement of generative AI has unveiled new avenues for innovative approaches to address this demand. This thesis explores one such avenue through the use of multi-agent systems combined with Large Language Models (LLMs) to automate tasks within the development lifecycle. The thesis presents a structure for designing and developing an LLM-based multi-agent application, encompassing agent design principles and strategies for facilitating multi-agent collaboration, and provides insights into the selection of an appropriate agent framework. Furthermore, the thesis showcases the developed application's problem-solving capabilities with quantitative benchmarking results and demonstrates practical implementations through examples of real-world applications. The study demonstrates the potential of LLM-based multi-agent systems to enhance software development efficiency, offering companies a promising and powerful tool for streamlining software engineering workflows.
47

Artificial Intelligence in Healthcare : Benefits and Challenges

O'Gorman, John, Turesson, Lucas January 2024 (has links)
This thesis explores how healthcare staff in Sweden experience the implementation of artificial intelligence (AI) in their work environment. The study pays special attention to the technical and ethical challenges that accompany the introduction of AI technologies in healthcare. Using theoretical frameworks such as the Unified Theory of Acceptance and Use of Technology (UTAUT) and Diffusion of Innovations (DOI), the work provides insights into both the individual and organizational aspects of technology adoption. Data for this qualitative study were collected through semi-structured interviews with healthcare personnel from various offices. The interviews focused on the participants' experiences and perceptions of AI, its usability, and the challenges they face. Thematic analysis was used to identify and analyse recurring themes in the collected data, enabling a deeper understanding of the positive aspects and potential risks of AI in healthcare. The themes identified were the use of AI, relieving healthcare personnel, trust and responsibility, and ethics and concerns related to AI. The results show that while AI offers significant potential to improve both the efficiency and quality of processes and patient care, there is considerable concern about data protection, patient privacy, and the potential risk of job displacement. The study highlights the importance of developing clear guidelines and regulations to address these challenges in an ethical and correct manner. This research contributes to the debate on the role of AI in healthcare and underscores the need for a well-balanced and well-informed approach to technology integration, which is crucial to ensuring both the benefits of innovation and the well-being of patients and staff.
48

Towards Manipulator Task-Oriented Programming: Automating Behavior-Tree Configuration

Yue Cao (18985100) 08 July 2024 (has links)
Task-oriented programming is a way of programming manipulators in terms of high-level tasks instead of explicit motions. It has been a long-standing vision in robotics since its early days. Despite its potential, several challenges have hindered its full realization. This thesis identifies three major challenges, particularly in task specification and the planning-to-execution transition: 1) the absence of natural language integration in system input; 2) the dilemma of continuously developing non-uniform and domain-specific primitive-task libraries; 3) the requirement for much human intervention.

To overcome these difficulties, this thesis introduces a novel approach that integrates natural language inputs, eliminates the reliance on fixed primitive-task libraries, and minimizes human intervention. It adopts the behavior tree, a modular and user-friendly form, as the task representation and advances its usage in task specification and the planning-to-execution transition. The thesis is structured into two parts: Task Specification and Planning-to-Execution Transition.

Task specification explores the use of large language models to generate a behavior tree from an end-user's input. A Phase-Step prompt is designed to enable automatic behavior-tree generation from the end-user's abstract task descriptions in natural language. With the powerful generalizability of large language models, this breaks the dilemma of fixed primitive-task libraries in task generation. A full-process case study demonstrates the proposed approach, and an ablation study evaluates the effectiveness of the Phase-Step prompts. Task specification also proposes behavior-tree embeddings to facilitate retrieval-augmented generation of behavior trees. The integration of behavior-tree embeddings not only eliminates the need for manual prompt configuration but also provides a way to incorporate external domain knowledge into the generation process. Three types of evaluations assess the performance of the behavior-tree embedding method.

The planning-to-execution transition explores how to turn the primitive tasks from task specification into manipulator executions. Two types of primitive tasks are considered separately: point-to-point movement tasks and object-interaction tasks. For point-to-point movement tasks, a behavior-tree reward is proposed to enable reinforcement learning over low-level movement while following the high-level running order of the behavior tree. End-users only need to specify rewards on the primitive tasks over the behavior tree, and the rest of the process is handled automatically; a 2D-space movement simulation justifies the approach. For object-interaction tasks, the planning-to-execution transition uses a large-language-model-based generation approach that takes natural-language-described primitive tasks as input and directly produces task-frame-formalism set-points. Combined with hybrid position/force control systems, this realizes a transition from primitive tasks directly into joint-level execution. Evaluations over a set of 30 primitive tasks were conducted.

Overall, this thesis proposes an approach that advances the behavior tree towards automated task specification and planning-to-execution transitions, opening up new possibilities for building better task-oriented manipulator programming systems.
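The behavior-tree representation the thesis builds on can be illustrated with a minimal sketch (not the thesis's implementation): a Sequence node ticks its children left to right and fails fast, which is the high-level running order that the behavior-tree reward and the LLM-generated trees are defined over. The node names here are hypothetical.

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class Action:
    """Leaf node wrapping a primitive task; `fn` returns True on success."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def tick(self):
        return SUCCESS if self.fn() else FAILURE

class Sequence:
    """Composite node: runs children in order, failing as soon as one fails."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() == FAILURE:
                return FAILURE
        return SUCCESS

# A hypothetical two-step manipulation task as a behavior tree.
tree = Sequence([Action("move_to_cup", lambda: True),
                 Action("grasp_cup", lambda: True)])
```

In the thesis's setting, the LLM's job is to emit a structure like `tree` from a natural language description, and the reinforcement-learning component rewards low-level motions that respect the Sequence's running order.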
49

Arabic text recognition of printed manuscripts : efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processing

Al-Muhtaseb, Husni Abdulghani January 2010 (has links)
Arabic text recognition has not been researched as thoroughly as that of other natural languages, yet the need for automatic Arabic text recognition is clear. In addition to traditional applications like postal address reading, check verification in banks, and office automation, there is large interest in searching scanned documents available on the internet and in searching handwritten manuscripts. Other possible applications are building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, serving as the first phase of text readers for visually impaired people, and understanding filled forms. This research aims to contribute to current work in optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes to advance the performance of state-of-the-art Arabic OCR systems. Statistical and analytical analysis of Arabic text was carried out to estimate the probabilities of occurrence of Arabic characters for use with Hidden Markov Models (HMMs) and other techniques. Since there is no publicly available dataset of printed Arabic text for recognition purposes, it was decided to create one. In addition, a minimal Arabic script is proposed that contains all basic shapes of Arabic letters and provides an efficient representation of Arabic text in terms of effort and time. Based on the success of HMMs for speech and text recognition, their use for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images. In the feature-extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMMs, and a novel set of features, which resulted in high recognition rates for different fonts, was selected.
The developed techniques need no word or character segmentation before the classification phase, as segmentation is a byproduct of recognition. This is perhaps the most advantageous feature of using HMMs for Arabic text, since segmentation tends to produce errors that are propagated to the classification phase. Eight different Arabic fonts were used in the classification phase, with recognition rates ranging from 98% to 99.9% depending on the font. As far as we know, these are new results in their context. Moreover, the proposed technique can be applied to other languages: a proof-of-concept experiment on English characters achieved a recognition rate of 98.9% using the same HMM setup, and the same techniques applied to Bangla characters achieved a recognition rate above 95%. Recognition of printed Arabic text with multiple fonts was also conducted using the same technique, with fonts categorized into different groups, and new high recognition results were achieved. To enhance the recognition rate further, a post-processing module was developed to correct the OCR output through character-level and word-level post-processing; this module increased the accuracy of the recognition rate by more than 1%.
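The bigram statistical language model used for post-processing can be sketched as character-bigram probabilities estimated from a corpus and used to score candidate corrections. The add-one smoothing and the alphabet-size constant here are illustrative assumptions, not details taken from the thesis.

```python
from collections import defaultdict

def train_bigram_lm(corpus_words):
    """Count character bigrams, with '^' and '$' as word-boundary markers."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in corpus_words:
        padded = "^" + word + "$"
        for a, b in zip(padded, padded[1:]):
            counts[a][b] += 1
    return counts

def word_score(word, counts, alphabet_size=30):
    """Product of add-one-smoothed bigram probabilities. A higher score
    marks a more plausible word, so OCR candidate corrections can be
    ranked and the best one chosen during post-processing."""
    padded = "^" + word + "$"
    score = 1.0
    for a, b in zip(padded, padded[1:]):
        total = sum(counts[a].values())
        score *= (counts[a][b] + 1) / (total + alphabet_size)
    return score
```

Given a confusable OCR output, the corrector would generate candidate words and keep the one with the highest `word_score`, which is one standard way a bigram model lifts recognition accuracy by the reported ~1%.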
50

Stream-based statistical machine translation

Levenberg, Abby D. January 2011 (has links)
We investigate a new approach to SMT system training within the streaming model of computation. We develop and test incrementally retrainable models which, given an incoming stream of new data, can efficiently incorporate the stream data online. A naive approach would use an unbounded amount of space; instead, our online SMT system can incorporate information from unbounded incoming streams while maintaining constant space and time. Crucially, we are able to match (or even exceed) the translation performance of comparable systems which are batch-retrained and use unbounded space. Our approach is particularly suited to situations where there are arbitrarily large amounts of new training material that we wish to incorporate efficiently and in small space. The novel contributions of this thesis are:
1. An online, randomised language model that can model unbounded input streams in constant space and time.
2. An incrementally retrainable translation model for both phrase-based and grammar-based systems. The model presented is efficient enough to incorporate novel parallel text at the single-sentence level.
3. Strategies for updating our stream-based language model and translation model which demonstrate how such components can be successfully used in a streaming translation setting, both within a single streaming environment and in the novel situation of having to translate multiple streams.
4. A demonstration that recent data from the stream is beneficial to translation performance.
Our stream-based SMT system is efficient for tackling massive volumes of new training data and offers up new ways of thinking about translating web data and dealing with other natural language streams.
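The constant-space property of an online randomised language model can be illustrated with a count-min sketch, which keeps approximate n-gram counts for an unbounded stream in fixed memory. This is a sketch of the general idea only; the thesis's actual randomised data structure differs, and the width/depth parameters here are arbitrary.

```python
import hashlib

class CountMinSketch:
    """Fixed-size table of depth x width counters; memory never grows
    with the stream, at the cost of approximate (over-)counts."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, ngram, row):
        # One independent-ish hash per row, derived from a salted digest.
        digest = hashlib.blake2b(f"{row}:{ngram}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, ngram):
        """Observe one n-gram from the stream."""
        for row in range(self.depth):
            self.table[row][self._index(ngram, row)] += 1

    def count(self, ngram):
        # Minimum over rows: never an underestimate of the true count.
        return min(self.table[row][self._index(ngram, row)]
                   for row in range(self.depth))
```

The space/accuracy trade-off is the essential point: counts may collide and be overestimated, but the structure answers frequency queries over an unbounded stream in O(width x depth) memory.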
