Global ETD Search

1	[pt] CONSULTANDO BANCOS DE DADOS COM LINGUAGEM NATURAL: O USO DE MODELOS DE LINGUAGEM GRANDES PARA TAREFAS DE TEXTO-PARA-SQL / [en] QUERYING DATABASES WITH NATURAL LANGUAGE: THE USE OF LARGE LANGUAGE MODELS FOR TEXT-TO-SQL TASKS
EDUARDO ROGER SILVA NASCIMENTO, 23 May 2024
The Text-to-SQL task involves generating an SQL query from a given relational database and a Natural Language (NL) question. While the leaderboards of well-known benchmarks indicate that Large Language Models (LLMs) excel at this task, those benchmarks evaluate them on databases with fairly simple schemas. This dissertation first investigates the performance of LLM-based Text-to-SQL models on a complex, openly available database (Mondial) with a large conceptual schema and a set of 100 NL questions. Running under GPT-3.5 and GPT-4, the results of this first experiment show that LLM-based tools perform significantly worse than reported in the benchmarks and that they struggle with schema linking and joins, suggesting that the relational schema may not be suitable for LLMs. The dissertation then proposes using LLM-friendly views and data descriptions to improve accuracy in the Text-to-SQL task. In a second experiment, using the strategy with the best performance and cost-benefit from the previous experiment and another set of 100 questions over a real-world database, the results show that the proposed approach is sufficient to considerably improve the accuracy of the prompt strategy. The work concludes with a discussion of the results obtained and suggests further approaches to simplify the Text-to-SQL task.
[en] LARGE LANGUAGE MODEL [en] GPT [en] LANGCHAIN [en] TEXT-TO-SQL
2	[pt] ASSISTENTE VIRTUAL UTILIZANDO TRANSFORMERS GENERATIVOS PRÉ-TREINADOS NO CONTEXTO DE GERENCIAMENTO DE RESERVATÓRIOS / [en] VIRTUAL ASSISTANT USING PRETRAINED GENERATIVE TRANSFORMERS IN THE CONTEXT OF RESERVOIR MANAGEMENT
MATHEUS MORAES FERREIRA, 18 March 2025
With the growing popularity of Artificial Intelligence techniques, especially those related to Natural Language Processing, we have seen remarkable progress in Large Language Models, of which the Generative Pre-trained Transformer (GPT) is the most notable example. As a result, virtual assistants have been gaining a significant presence in various areas of modern life. This work proposes a methodology for developing an intelligent virtual assistant, based on a generative model, that understands Brazilian Portuguese as well as the specific domain of the Oil and Gas Industry. The assistant can interpret textual commands provided by users and execute the corresponding actions within a corporate system. The methodology is the result of a careful analysis of different available generative models, aiming to identify the one that best suits the requirements of an intelligent virtual assistant in Portuguese. For training, a representative dataset is created with the concepts specific to the system and to the Oil and Gas Industry. A refinement process makes it possible to identify potential flaws and improve the assistant's understanding to ensure accurate and targeted responses. The work also addresses the challenges and inherent limitations of generative models, and proposes strategies to overcome them in order to achieve more precise and secure generations.
[en] NATURAL LANGUAGE PROCESSING [en] GPT [en] MACHINE LEARNING [en] LARGE LANGUAGE MODEL [en] INTELLIGENT VIRTUAL ASSISTANT
