  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A Surprise for Horwich (and Some Advocates of the Fine-Tuning Argument (Which Does Not Include Horwich (as Far as I Know)))

Harker, David 01 November 2012 (has links)
The judgment that a given event is epistemically improbable is necessary but insufficient for us to conclude that the event is surprising. Paul Horwich has argued that surprising events are, in addition, more probable given alternative background assumptions that are not themselves extremely improbable. I argue that Horwich's definition fails to capture important features of surprises and offer an alternative definition that accords better with intuition. An important application of Horwich's analysis has arisen in discussions of fine-tuning arguments. In the second part of the paper I consider the implications for this argument of employing my definition of surprise. I argue that advocates of fine-tuning arguments are not justified in attaching significance to the fact that we are surprised by examples of fine-tuning.
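Read literally, the abstract's gloss of Horwich's proposal can be formalized as follows (a hedged rendering of the abstract's wording only, not Horwich's own notation):

```latex
% E: the event; K: our accepted background assumptions; K': an alternative.
% Per the abstract: E is surprising when it is epistemically improbable and
% some not-extremely-improbable alternative makes it much more probable.
\[
  P(E \mid K) \ll 1
  \quad\text{and}\quad
  \exists\, K' :\; P(E \mid K') \gg P(E \mid K)
  \;\;\text{with}\;\; P(K') \not\ll 1 .
\]
```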
2

Improving Article Summarization by Fine-tuning GPT-3.5

Gillgren, Fredrik January 2024 (has links)
This thesis project aims to improve text summarization in financial applications by fine-tuning Generative Pre-trained Transformer 3.5 (GPT-3.5). Through meticulous training and optimization, the model was adeptly configured to accurately and efficiently condense complex financial reports into concise, informative summaries, specifically designed to support decision-making in professional business environments. Notable improvements were demonstrated in the model's capacity to retain essential financial details while enhancing the readability and contextual relevance of the text, as evidenced by superior ROUGE and BLEU scores when compared to the baseline GPT-3.5 Turbo model. This fine-tuning approach not only underscores GPT-3.5's remarkable adaptability to domain-specific challenges but also marks a significant advancement in the field of automated text summarization within the financial sector. The findings from this research highlight the transformative potential of bespoke NLP solutions, offering data-driven industries the tools to rapidly generate precise and actionable business insights, thus facilitating more informed decision-making processes.
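The abstract does not include the training setup; as a rough illustration, fine-tuning GPT-3.5 Turbo on report/summary pairs is typically done through OpenAI's fine-tuning API. A minimal sketch (the file name, base snapshot, and example fields are assumptions, not details from the thesis):

```python
# Minimal sketch of fine-tuning GPT-3.5 Turbo on report/summary pairs.
# Assumes an OpenAI API key in the environment and a JSONL file where each
# line holds one chat-format example; all names below are illustrative.
import json
from openai import OpenAI

client = OpenAI()

# Each training example pairs a financial report with its reference summary.
examples = [
    {"messages": [
        {"role": "system", "content": "Summarize the financial report concisely."},
        {"role": "user", "content": "<full report text>"},
        {"role": "assistant", "content": "<reference summary>"},
    ]},
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Upload the data and launch a fine-tuning job.
upload = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=upload.id,
    model="gpt-3.5-turbo",  # assumed base snapshot
)
print(job.id)
```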
3

Development of a chatbot for the second technical support line : master's thesis

Гулич, А. С., Gulich, A. S. January 2024 (has links)
The aim of the master's thesis is the development and software implementation of a chatbot for the second line of technical support at Nordic IT, based on the GPT-3.5-turbo language model. The work provides a detailed overview of modern chatbot development tools and reviews the design, classification, advantages and disadvantages, and practical applications of chatbots. The development technologies applied are analyzed, and the choice of optimal methods for training language models is justified. A detailed description is given of the structure of the program modules, the training methods, the evaluation of answer quality, and the bot's integration with Microsoft Teams, and the economic efficiency of the project is calculated. The practical significance of this work lies in the possibility and expediency of using the developed chatbot to improve productivity and quality of service at Nordic IT.
4

[en] SUMMARIZATION OF HEALTH SCIENCE PAPERS IN PORTUGUESE

DAYSON NYWTON C R DO NASCIMENTO 30 October 2023 (has links)
[en] In this work, we present a study on the fine-tuning of a pre-trained Large Language Model (LLM) for abstractive summarization of long texts in Portuguese. To do so, we built a corpus gathering 7,450 public Health Sciences papers in Portuguese. We fine-tuned a pre-trained BERT model for Brazilian Portuguese (BERTimbau) on this corpus. Under similar conditions, we also trained a second model based on Long Short-Term Memory (LSTM) from scratch for comparison purposes. Our evaluation showed that the fine-tuned model achieved higher ROUGE scores, outperforming the LSTM-based model by 30 points in F1-score. The fine-tuned pre-trained model also stands out in a qualitative evaluation performed by assessors, to the point of generating the perception that the generated summaries could have been created by humans for a specific collection of documents in the Health Sciences domain.
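The abstract reports ROUGE F1 differences; as a hedged illustration of how such scores are commonly computed (using Google's rouge-score package, which is an assumption — the thesis does not name its tooling):

```python
# Sketch of computing ROUGE F1 between a reference and a generated summary.
# Uses the rouge-score package (pip install rouge-score); the example texts
# are placeholders, and no Portuguese-specific stemming is assumed here.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"])
reference = "reference summary text"
generated = "model generated summary text"

scores = scorer.score(reference, generated)
for name, result in scores.items():
    # Each result carries precision, recall, and F1 (fmeasure).
    print(f"{name}: F1 = {result.fmeasure:.3f}")
```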
5

Bimodal Automatic Speech Segmentation And Boundary Refinement Techniques

Akdemir, Eren 01 March 2010 (has links) (PDF)
Automatic segmentation of speech is compulsory for building large speech databases to be used in speech processing applications. This study proposes a bimodal automatic speech segmentation system that uses either articulator motion information (AMI) or visual information obtained by a camera in collaboration with auditory information. The presence of the visual modality has been shown to be very beneficial in speech recognition applications, improving the performance and noise robustness of those systems. In this dissertation a significant increase in the performance of the automatic speech segmentation system is achieved by using a bimodal approach. Automatic speech segmentation systems face a tradeoff between precision and the resulting number of gross errors. Boundary refinement techniques are used to increase the precision of these systems without decreasing system performance. Two novel boundary refinement techniques are proposed in this thesis: a hidden Markov model (HMM) based fine-tuning system and an inverse filtering based fine-tuning system. The segment boundaries obtained by the bimodal speech segmentation system are further improved using these techniques. To fulfill these goals, a complete two-stage automatic speech segmentation system is produced and tested on two different databases. A phonetically rich Turkish audiovisual speech database, containing acoustic data and camera recordings of 1600 Turkish sentences uttered by a male speaker, is built from scratch for use in the experiments. The visual features of the recordings are extracted, and manual phonetic alignment of the database is performed to serve as ground truth for the performance tests of the automatic speech segmentation systems.
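The abstract gives no implementation detail for the HMM-based refinement stage; one plausible reading is to decode frame-level states around a coarse boundary and snap the boundary to the decoded state transition. A minimal sketch under that assumption (hmmlearn and the two-state setup are illustrative choices, not the thesis's actual system):

```python
# Hedged sketch: refine a coarse segment boundary by Viterbi-decoding a
# two-state Gaussian HMM over acoustic frames near the initial boundary
# and moving the boundary to the first decoded state change.
import numpy as np
from hmmlearn import hmm

def refine_boundary(frames: np.ndarray, coarse_idx: int, window: int = 20) -> int:
    """frames: (n_frames, n_features) acoustic features; returns refined index."""
    lo = max(0, coarse_idx - window)
    hi = min(len(frames), coarse_idx + window)
    segment = frames[lo:hi]

    # One state per phone on either side of the boundary.
    model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=50)
    model.fit(segment)
    states = model.predict(segment)  # Viterbi state sequence

    # Place the boundary at the first state transition, if any.
    changes = np.where(np.diff(states) != 0)[0]
    return lo + int(changes[0]) + 1 if len(changes) else coarse_idx
```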
6

Automatic fine tuning of cavity filters / Automatisk finjustering av kavitetsfilter

Boyer de la Giroday, Anna January 2016 (has links)
Cavity filters are a necessary component in base stations used for telecommunication. Without these filters it would not be possible for base stations to send and receive signals at the same time. Today these cavity filters require fine-tuning by humans before they can be deployed. This thesis has designed and implemented a neural network that can tune cavity filters. Different types of design parameters have been evaluated, such as neural network architecture, data presentation and data preprocessing. While the results were not comparable to human fine-tuning, it was shown that there was a relationship between the error and the number of weights in the neural network. The thesis also presents some rules of thumb for future designs of neural networks used for filter tuning.
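The abstract does not specify the network; as a hedged sketch of the general idea (a regressor mapping a measured filter response to tuning-screw adjustments; the shapes, library, and synthetic data are assumptions):

```python
# Hedged sketch: learn a mapping from a sampled |S21| response to
# tuning-screw adjustments with a small feed-forward network (scikit-learn).
# All dimensions and the synthetic data below are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_freq_points, n_screws = 128, 6

# Placeholder training data: measured responses -> known screw offsets.
X = rng.normal(size=(1000, n_freq_points))   # detuned |S21| samples
y = rng.normal(size=(1000, n_screws))        # corrective screw turns

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
model.fit(X, y)

# Predict corrections for a newly measured, detuned filter.
adjustments = model.predict(rng.normal(size=(1, n_freq_points)))
print(adjustments.round(2))
```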
7

Marine Habitat Mapping Using Image Enhancement Techniques & Machine Learning

Mureed, Mudasar January 2022 (has links)
The mapping of habitats is the first step in policies that target the environment, as well as in spatial planning and management. Biodiversity plans are always centered around habitats; therefore, constant monitoring of these delicate species in terms of health, changes, and extinction is a must in biodiversity plans. Human activities are constantly growing, resulting in the extinction of land and marine habitats. Land habitats are being destroyed by air pollution and the cutting of forests, while marine habitats are being destroyed by the acidification of ocean waters, industrial waste, and pollution. The author has focused on aquatic habitats in this dissertation, mainly coral reefs. An estimated 27% of coral reef ecosystems have been destroyed, and a further 30% are at risk of being damaged in the coming years. Coral reefs occupy 1% of the ocean floor, and yet they provide a home to 30% of marine organisms. To analyze the health of these aquatic habitats, they need to be assessed through habitat mapping, which shows the geographic distribution of different habitats within a particular area. Marine habitats are typically mapped using camera imagery. The quality of underwater images suffers from the characteristics of the marine environment, resulting in images that are blurry or contain particles covering many parts of the image. To overcome this, underwater image enhancement algorithms are used to preprocess images beforehand. There are many underwater image enhancement algorithms that target different characteristics of the marine environment, but there is no consensus among researchers about a single underwater technique that can be used for any marine dataset. In this dissertation, multiple experiments on seven popular image enhancement techniques were conducted and used to reach a decision about a single underwater approach for all datasets. The datasets include EILAT, EILAT2, RSMAS, and MLC08. Two state-of-the-art deep convolutional neural networks for habitat mapping, DenseNet and MobileNet, were also tested. The best results were achieved by combining Contrast Limited Adaptive Histogram Equalization (CLAHE) as the underwater image enhancement technique with DenseNet as the deep convolutional network.
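As a hedged illustration of the winning enhancement step (OpenCV's CLAHE applied to the lightness channel; the clip limit, tile size, and color-space choice are assumptions, not the dissertation's settings):

```python
# Sketch: apply CLAHE to an underwater image via OpenCV, operating on the
# L channel in LAB space so colors are preserved; parameters are assumed.
import cv2

def enhance_underwater(path: str):
    bgr = cv2.imread(path)                      # OpenCV loads images as BGR
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)

    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    l_eq = clahe.apply(l)                       # equalize lightness only

    return cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)

enhanced = enhance_underwater("reef.jpg")       # hypothetical file name
cv2.imwrite("reef_clahe.jpg", enhanced)
```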
8

[en] UNSUPERVISED MULTI-REVIEW SUMMARIZATION USING FINE-TUNED TRANSFORMER LANGUAGE MODELS

LUCAS ROBERTO DA SILVA 05 July 2021 (has links)
[en] Automatic summarization is the task of generating concise, correct, and factually consistent summaries. The task can be applied to different textual styles, including news, academic publications, and reviews of products or places. This dissertation addresses the summarization of multiple reviews. This type of application stands out for its unsupervised nature and the need to deal with the redundancy of the information present in the reviews. Automatic summarization work is evaluated using the ROUGE metric, which is based on the comparison of n-grams between the reference text and the generated summary. The lack of supervised data motivated the creation of the MeanSum architecture, the first neural network architecture based on an unsupervised model for this task. It is based on an auto-encoder and has been extended by other works, but none explored the effects of using attention mechanisms and auxiliary tasks during training. The present work is divided into two parts. The first deals with an experiment in which we extend the MeanSum architecture, adding attention mechanisms and auxiliary sentiment classification tasks; in the same experiment, we explore synthetic data to adapt supervised models to unsupervised tasks. In the second part, we use the earlier results to carry out a study on fine-tuning pre-trained Transformer language models. The use of these models proved a promising alternative for addressing the unsupervised nature of the problem, outperforming previous works by +4 ROUGE.
9

Aligning language models to code : exploring efficient, temporal, and preference alignment for code generation

Weyssow, Martin 09 1900 (has links)
Pre-trained and large language models (PLMs, LLMs) have had a transformative impact on the artificial intelligence (AI) for software engineering (SE) research field. Through large-scale pre-training on terabytes of natural and programming language data, these models excel in generative coding tasks such as program repair and code generation. Existing approaches to aligning a model's behaviour with specific tasks propose using methods like prompting or fine-tuning to improve its effectiveness. Nevertheless, it remains unclear how to align code PLMs and LLMs to more complex scenarios that extend beyond task effectiveness. We focus on model alignment in three overlooked scenarios for code generation, each addressing a specific objective: optimizing fine-tuning costs, aligning models with new data while retaining previous knowledge, and aligning with user coding preferences or non-functional requirements. We explore these scenarios in three articles, which constitute the main contributions of this thesis. In the first article, we conduct an empirical study on parameter-efficient fine-tuning techniques (PEFTs) for code LLMs in resource-constrained settings. Our study reveals the superiority of PEFTs over few-shot learning, showing that PEFTs like LoRA and QLoRA allow fine-tuning LLMs with up to 33 billion parameters on a single 24GB GPU without compromising task effectiveness. In the second article, we examine the behaviour of code PLMs in a continual fine-tuning setting, where the model acquires new knowledge from sequential domain-specific datasets. Each dataset introduces new data about third-party libraries not seen during pre-training or previous fine-tuning. We demonstrate that sequential fine-tuning leads to catastrophic forgetting and implement replay- and regularization-based continual learning approaches, showcasing their superiority in balancing task effectiveness and knowledge retention. In our third article, we introduce CodeUltraFeedback and CODAL-Bench, a novel dataset and benchmark for aligning code LLMs to user coding preferences or non-functional requirements. Our experiments reveal that tuning LLMs with reinforcement learning techniques like direct preference optimization (DPO) using CodeUltraFeedback results in LLMs better aligned to coding preferences and a substantial improvement in the functional correctness of LLM-generated code.
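As a hedged sketch of the PEFT setup studied in the first article (using HuggingFace's peft library; the base model, rank, and hyperparameters are illustrative assumptions, not the thesis's configuration):

```python
# Sketch: wrap a causal code LLM with LoRA adapters so that only a small
# set of injected low-rank matrices is trained; hyperparameters are assumed.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("bigcode/starcoderbase")  # illustrative base model

lora_config = LoraConfig(
    r=8,                    # low-rank dimension
    lora_alpha=16,          # scaling factor
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # typically a fraction of a percent trainable
```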
10

Radiative alpha capture on carbon-12

Gan, Ling 08 December 2023 (has links) (PDF)
In this thesis, we used Effective Field Theory (EFT) to calculate the radiative alpha capture on 12C. This reaction is considered the "holy grail" of nuclear astrophysics because it determines the relative abundance of 16O and 12C. We considered the E1 transition from the initial p-wave at energies around the Gamow energy EG = 0.3 MeV. The theoretical formula for the cross section is obtained by fitting the EFT parameters to the phase-shift and S-factor data. We find that the Effective Range Expansion (ERE) parameters describing the alpha-wave phase shift are fine-tuned. The shallow bound state and the resonant alpha-wave states are also described.
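For context on the quantities being fitted, the astrophysical S-factor mentioned above is conventionally defined by factoring the Coulomb-barrier suppression out of the cross section (a standard definition, not a formula from the thesis):

```latex
% Standard definition of the astrophysical S-factor: the cross section with
% the steep Coulomb-barrier penetration factored out via the Sommerfeld
% parameter \eta, leaving the slowly varying S(E).
\[
  \sigma(E) \;=\; \frac{S(E)}{E}\, e^{-2\pi\eta},
  \qquad
  \eta \;=\; \frac{Z_1 Z_2 e^2}{\hbar v},
\]
```

Here Z1 and Z2 are the charges of the alpha particle and 12C and v is their relative velocity; the Gamow energy EG quoted above marks where this barrier factor and the thermal Maxwell-Boltzmann tail overlap most strongly.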
