1

Night Setback Identification of District Heating Substations

Gerima, Kassaye January 2021 (has links)
Energy efficiency of district heating systems is of great interest to energy stakeholders. However, it is not uncommon for district heating systems to fall short of their expected performance due to inappropriate operation. Night setback is a control strategy that has been shown to be unsuitable for well-insulated modern buildings in terms of both economy and energy efficiency. Identifying night setback control is therefore vital for district heating companies to manage heat distribution to their customers smoothly. This study aims to automate that identification process. The method used in this thesis is a Convolutional Neural Network (CNN) approach based on transfer learning. A case study of 133 substations in Oslo is used to design a machine learning model that classifies a substation's load series as night setback or non-night setback. The results show that the proposed method classifies the substations with approximately 97% accuracy and a 91% F1-score, indicating high potential for practical deployment in identifying night setback control in district heating substations.
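As a rough illustration of the transfer-learning approach described above, here is a minimal sketch of a binary classifier built on a pretrained image backbone. The backbone choice, input representation (heat-load series rendered as fixed-size images), and shapes are assumptions; the thesis does not specify its exact architecture.

```python
# Sketch: transfer learning for night-setback classification, assuming
# each substation's heat-load series is rendered as a 224x224 image.
import torch
import torch.nn as nn
from torchvision import models

def build_classifier() -> nn.Module:
    # Start from an ImageNet-pretrained backbone and freeze its features.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in backbone.parameters():
        p.requires_grad = False
    # Replace the final layer with a binary head: night setback vs. not.
    backbone.fc = nn.Linear(backbone.fc.in_features, 2)
    return backbone

model = build_classifier()
logits = model(torch.randn(1, 3, 224, 224))  # one dummy "heat-load image"
print(logits.softmax(dim=-1))                # class probabilities
```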
2

Readability Assessment with Pre-Trained Transformer Models : An Investigation with Neural Linguistic Features

Ma, Chuchu January 2022 (has links)
Readability assessment (RA) assigns a score or grade to a document, measuring how difficult the document is to read. RA originated in language education, where it was used to classify reading materials for language learners; it was later applied to many other tasks, such as aiding automatic text simplification. This thesis aims to improve how pre-trained Transformers are used for RA. The motivation is the "pipeline" effect (Tenney et al., 2019) of pre-trained Transformers: lexical, syntactic, and semantic features are best encoded at different layers of a Transformer model. After a preliminary test of a basic RA model resembling previous work, we proposed several enhancements: using a Transformer layer other than the last, concatenating or mixing the outputs of all layers, and using syntax-augmented Transformer layers. We examined these enhanced methods on three datasets: WeeBit, OneStopEnglish, and CommonLit. We observed that the improvements correlated clearly with dataset characteristics. On the OneStopEnglish and CommonLit datasets, we achieved absolute improvements of 1.2% in F1 score and 0.6% in Pearson's correlation coefficient, respectively. We also show that an n-gram frequency-based baseline, which is simple but was not reported in previous works, has superior performance on the classification datasets (WeeBit and OneStopEnglish), prompting further research on vocabulary-based lexical features for RA.
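One of the enhancements above, reading features off a Transformer layer other than the last, might look like the following sketch; the model, the choice of layer 8, and mean pooling are illustrative assumptions, not the thesis's exact configuration.

```python
# Sketch: extracting a non-final Transformer layer as RA features.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased",
                                  output_hidden_states=True)

batch = tok("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    out = model(**batch)

# hidden_states is a tuple: the embedding layer plus one tensor per layer.
layer8 = out.hidden_states[8]        # (batch, seq_len, hidden)
features = layer8.mean(dim=1)        # mean-pool into a document vector
# `features` would then feed a small classification/regression head.
print(features.shape)                # torch.Size([1, 768])
```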
3

A Quantitative Comparison of Pre-Trained Model Registries to Traditional Software Package Registries

Jason Hunter Jones (18430302) 06 May 2024 (has links)
Software Package Registries are an integral part of the Software Supply Chain, acting as collaborative platforms that unite contributors, users, and packages, and streamline package management processes. Much of the engineering work around reusing packages from these platforms deals with the issue of synthesis: combining multiple packages into a new package or downstream project. Recently, researchers have examined registries that specialize in providing Pre-Trained Models (PTMs) to explore the nuances of the PTM Supply Chain. These works suggest that the main engineering challenge of PTM reuse is not synthesis but selection. However, these findings have been primarily qualitative, lacking quantitative evidence of the observed differences. I therefore evaluate the following hypothesis: the prioritization of selection over synthesis in Pre-Trained Model reuse means that the evolution and reuse of Pre-Trained Models differ from traditional software; the evolution of models will be more linear, and the reuse of models will be more centralized.
4

En undersökning av metoder för automatiserad text- och parameterextraktion från PDF-dokument med Natural Language Processing / An investigation of methods for automated text and parameter extraction from PDF documents using Natural Language Processing

Värling, Alexander, Hultgren, Emil January 2024 (has links)
In today's business environment, many organizations aim to automate the extraction of information from invoices, with the goal of handling large volumes of invoices more efficiently. However, challenges arise from the varied structure of invoices: the placement and format of information can differ significantly between invoices, creating complexity and obstacles for automated extraction. These challenges can affect the accuracy and efficiency of the process, making the ability to navigate them crucial for successfully implementing automated invoice-management systems. This work explores four text extraction methods based on optical character recognition, image processing, plain text extraction, and text processing, followed by a comparison between the natural language processing models GPT-3.5 (Generative Pre-trained Transformer) and GPT-4 for parameter extraction from invoices. The models were tested on their ability to extract eight specific fields from PDF documents, and their results were compared. Results are reported with the "Micro F1-score" validation method, a scale from 0 to 1, where 1 represents perfect extraction. The method using GPT-4 proved most successful, yielding a score of 0.98 and error-free extraction in six of eight fields when tested on 19 PDF documents. GPT-3.5 came in second place, showing promising results in four of the eight fields but performing worse on the remainder, for a Micro F1-score of 0.71. Due to the limited amount of data, GPT-3.5 could not reach its full potential, as fine-tuning and validation require larger datasets. Similarly, GPT-4 needs validation on a more comprehensive dataset before conclusions can be drawn about the models' actual performance. Further research is necessary to establish the capability of GPT models with these improvements.
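As a rough sketch of prompt-based parameter extraction of the kind compared above, the following uses the current OpenAI Python client; the eight field names and the prompt wording are hypothetical, since the thesis does not list them.

```python
# Sketch: GPT-4 invoice field extraction via a JSON-only prompt.
# The field names below are hypothetical placeholders.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

FIELDS = ["invoice_number", "invoice_date", "due_date", "seller",
          "buyer", "currency", "net_amount", "total_amount"]

def extract_fields(invoice_text: str) -> dict:
    prompt = (
        "Extract the following fields from the invoice text and answer "
        f"with JSON only, using null for missing values: {FIELDS}\n\n"
        f"Invoice text:\n{invoice_text}"
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output suits extraction
    )
    return json.loads(resp.choices[0].message.content)
```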
5

Parafrasidentifiering med maskinklassificerad data : utvärdering av olika metoder / Paraphrase identification with computer classified paraphrases : An evaluation of different methods

Johansson, Oskar January 2020 (has links)
This work investigates how the language model BERT and a MaLSTM architecture perform at identifying paraphrases in the 'Microsoft Paraphrase Research Corpus' (MPRC) when trained on automatically identified paraphrases from the 'Paraphrase Database' (PPDB). The methods are compared to determine which performs best, and the approach of training on machine-classified data for use on human-classified data is evaluated against other classifications of the same dataset. The sentence pairs used to train the models are drawn from the highest-ranked paraphrases in PPDB, together with non-paraphrases produced by a generation method applied to the same dataset. The results show that BERT is capable of identifying some paraphrases in MPRC, while the MaLSTM architecture fails to do so despite being able to distinguish paraphrases from non-paraphrases during training. Both BERT and MaLSTM performed worse at identifying paraphrases in MPRC than models such as StructBERT, which was trained and evaluated on the same dataset. Reasons why MaLSTM fails at the task are discussed; chiefly, the sentences in the non-paraphrase training pairs are too dissimilar from each other compared to those in MPRC. Finally, the importance of further research into using machine-derived paraphrases in paraphrase-related research is discussed.
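The MaLSTM evaluated above scores a sentence pair by the Manhattan distance between the final hidden states of a siamese LSTM, squashed into (0, 1] via exp(-||h1 - h2||_1). A minimal sketch of that idea, with illustrative vocabulary size and dimensions rather than the thesis's exact configuration:

```python
# Sketch: siamese Manhattan LSTM (MaLSTM) similarity.
import torch
import torch.nn as nn

class MaLSTM(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, ids):
        _, (h, _) = self.lstm(self.embed(ids))
        return h[-1]                         # final hidden state

    def forward(self, ids_a, ids_b):
        h1, h2 = self.encode(ids_a), self.encode(ids_b)
        # Manhattan distance -> similarity in (0, 1]
        return torch.exp(-torch.norm(h1 - h2, p=1, dim=1))

model = MaLSTM()
a = torch.randint(0, 30000, (2, 10))         # two token-id sequences
b = torch.randint(0, 30000, (2, 10))
print(model(a, b))                           # similarity per pair
```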
6

Automatisk Summering av Cybersäkerhetsdiskussioner på Onlineforum : En prototyp för abstraktiv textsummering med en Zero-shot modell / Automatic Summarization of Cybersecurity Discussions on Online Forums : A prototype for abstractive text summarization with a zero-shot model

Ununger, Andreas January 2022 (has links)
The number of cyberattacks is constantly increasing, and with it the number of attack vectors and defense techniques. This means that cybersecurity professionals must spend more and more time keeping up to date with the latest developments in the field, so finding ways to speed up this information gathering is of interest. In this study, a prototype is developed with the goal of automatically summarizing, in a new way, one of the many kinds of news sources in the field: cybersecurity discussions on online forums. The prototype uses abstractive text summarization with the zero-shot model GPT-3. The prototype was evaluated by scoring the generated summaries with SUPERT. The measurements gave a score of 0.269 against the original texts and 0.358 against a dataset cleaned of text unrelated to cybersecurity. From these values, the conclusion is drawn that the development of the prototype was successful.
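A minimal sketch of the zero-shot summarization step in such a prototype follows; the prompt wording is an assumption, and the model name reflects the current OpenAI API rather than the GPT-3 endpoint available at the time of the thesis.

```python
# Sketch: zero-shot abstractive summarization of a forum thread.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def summarize_thread(posts: list[str]) -> str:
    thread = "\n---\n".join(posts)
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Summarize the cybersecurity discussion below "
                       "in three sentences:\n\n" + thread,
        }],
    )
    return resp.choices[0].message.content
```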
7

Predicting the Unpredictable – Using Language Models to Assess Literary Quality

Wu, Yaru January 2023 (has links)
People read for various purposes: learning specific skills, acquiring foreign languages, or simply enjoying the reading experience. That enjoyment may be credited to many aspects, such as the aesthetics of the language, the beauty of rhyme, and the entertainment of being surprised by what will happen next; the last of these is characteristic of fictional narratives and is the main topic of this project. In other words, "good" fiction may entertain readers better by baffling and eluding their expectations, whereas "normal" narratives may contain more cliches and ready-made sentences that are easy to predict. This project therefore examines whether "good" fiction is less predictable than "normal" fiction, the two being operationalized as canonized and non-canonized works.

Predictability can be statistically reflected by the probability of the next word being predicted correctly given the preceding content, measured here with the metric of perplexity. Thanks to recent advances in deep learning, language models based on neural networks with billions of parameters can now be trained on terabytes of text to improve their prediction of unseen text. Generative pre-trained modeling and text generation are therefore combined to estimate the perplexities of canonized and non-canonized literature.

Because the terabytes of text on which advanced models have been trained may already contain the books in the corpus, two series of models are designed to yield unbiased perplexity results: self-trained models and generative pre-trained Transformer-2 (GPT-2) models. Comparing these two groups of results establishes the final architecture of five models used in further experiments.

Alongside perplexity estimation, the perplexity variance is also computed, denoting how predictability varies across fixed-length sequences within each work. Evaluated by this variance, the homogeneity of the two groups of literature can also be examined.

The results from the five models show distinctions in both perplexity values and variances between canonized and non-canonized literature. Moreover, the canonized literature shows higher perplexity values and variances in both median and mean, indicating that it is less predictable and less homogeneous than the non-canonized literature. Perplexity values and variances obviously cannot define literary quality directly, but they signal that perplexity can be an insightful metric in literary quality analysis using natural language processing techniques.
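The perplexity measurement described above can be sketched with an off-the-shelf GPT-2 model; the example text is illustrative.

```python
# Sketch: scoring a text's perplexity with GPT-2.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the mean
        # next-token cross-entropy; perplexity is its exponential.
        loss = model(ids, labels=ids).loss
    return float(torch.exp(loss))

print(perplexity("It was a dark and stormy night."))
```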
8

Generative Language Models for Automated Programming Feedback

Hedberg Segeholm, Lea, Gustafsson, Erik January 2023 (has links)
In recent years, Generative Language Models have exploded into the mainstream with household names like BERT and ChatGPT, showing that text generation has the potential to solve a variety of tasks. As the number of students enrolled in programming classes has increased significantly, providing adequate feedback for everyone has become a pressing logistical issue. In this work, we evaluate the ability of near state-of-the-art Generative Language Models to provide such feedback automatically. Our results show that the latest publicly available model, GPT-3.5, has a significant aptitude for finding errors in code, while the older GPT-3 is noticeably more uneven in its analysis. It is our hope that future, potentially fine-tuned models could help provide early feedback for beginners, significantly alleviating the pressure on instructors.
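A sketch of the kind of automated-feedback prompt such an evaluation might use follows; the rubric wording is an assumption, not the authors' actual prompt.

```python
# Sketch: asking GPT-3.5 for feedback on a student's code submission.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def feedback(student_code: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "You are a programming tutor. Point out any errors "
                       "in this student's code and suggest fixes, without "
                       "rewriting it wholesale:\n\n" + student_code,
        }],
    )
    return resp.choices[0].message.content
```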
9

A Study on Effective Approaches for Exploiting Temporal Information in News Archives / ニュースアーカイブの時制情報活用のための有効な手法に関する研究

Wang, Jiexin 26 September 2022 (has links)
Kyoto University / New-system doctoral program / Doctor of Informatics / 甲第24259号 / 情博第803号 / 新制||情||135 (University Library) / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Masatoshi Yoshikawa; Prof. Keishi Tajima; Prof. Sadao Kurohashi; Program-Specific Assoc. Prof. LIN Donghui / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
10

[en] ASSESSMENT OF FINE-TUNING ON END-TO-END SPEECH RECOGNITION MODELS / [pt] AVALIAÇÃO DE AJUSTE FINO EM MODELOS DE PONTA A PONTA PARA RECONHECIMENTO DE FALA

JONATAS DOS SANTOS GROSMAN 04 November 2022 (has links)
Using representations given by a large pre-trained model has become the primary strategy for reaching the state of the art in the most varied tasks. A recently proposed large pre-trained model, wav2vec 2.0, was seminal for several other works on pre-training large models on speech data. Many models are being pre-trained using the same transformer-based architecture as wav2vec 2.0 and are achieving state-of-the-art results in various speech-related tasks. However, few works have offered further analysis of these models in different fine-tuning scenarios. Our work investigates these models with respect to two aspects. The first is the cross-lingual transferability of the models. Our experiments showed that the size of the data used during pre-training is not as crucial to transferability as its diversity. We noticed that the performance of Indo-European languages is superior to that of non-Indo-European languages in the evaluated models. We saw positive cross-lingual transfer of knowledge using monolingual models; this was noticed for all the languages we used, but was more evident when the pre-training language was more similar to the downstream-task language.
The second aspect we investigated is how well these models perform in data-imbalance scenarios, where there is a more representative subset in the fine-tuning dataset. Our results showed that data imbalance in fine-tuning generally affects the final result of the models, with better performance on the most representative subsets. However, greater variability in the training set favors model performance for a more representative subset, although this greater variability in the data did not favor languages unseen during training. We also observed that the models seem more robust in dealing with gender imbalance than with age or accent. With these findings, we hope to help the scientific community use existing pre-trained models, as well as to assist the pre-training of new models.
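A sketch of the fine-tuning setup analyzed above, wav2vec 2.0 with a CTC head and the convolutional feature encoder frozen as is common practice; the checkpoint name and dummy audio are illustrative.

```python
# Sketch: preparing wav2vec 2.0 for ASR fine-tuning with a CTC head.
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Freeze the convolutional feature encoder; only the transformer
# layers and the CTC head would be updated during fine-tuning.
model.freeze_feature_encoder()

waveform = np.random.randn(16000).astype(np.float32)  # 1 s of dummy 16 kHz audio
inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
print(logits.shape)                                    # (batch, frames, vocab)
```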
