Global ETD Search

1	A Quantitative Comparison of Pre-Trained Model Registries to Traditional Software Package Registries
Jason Hunter Jones (18430302), 06 May 2024
Software Package Registries are an integral part of the Software Supply Chain, acting as collaborative platforms that unite contributors, users, and packages, and streamline package management processes. Much of the engineering work around reusing packages from these platforms deals with the issue of synthesis: combining multiple packages into a new package or downstream project. Recently, researchers have examined registries that specialize in providing Pre-Trained Models (PTMs) to explore the nuances of the PTM Supply Chain. These works suggest that the main engineering challenge of PTM reuse is not synthesis but selection. However, these findings have been primarily qualitative and lack quantitative evidence of the observed differences. I therefore evaluate the following hypothesis:
The prioritization of selection over synthesis in Pre-Trained Model reuse means that the evolution and reuse of Pre-Trained Models differ from those of traditional software. The evolution of models will be more linear, and the reuse of models will be more centralized.
Keywords: Empirical software engineering, software supply chain, Pre-trained models, HuggingFace, Software Package Registries
2	Study of augmentations on historical manuscripts using TrOCR
Meoded, Erez, 08 December 2023
Historical manuscripts are an essential source of original content, but for many reasons they are hard to transcribe automatically into text. This thesis used a state-of-the-art handwritten text recognizer, TrOCR, to recognize a 16th-century manuscript. TrOCR uses a vision transformer to encode the input images and a language transformer to decode the encoded representation into text. We showed that careful image preprocessing and well-designed augmentations can improve the performance of TrOCR. We suggest an ensemble of augmented models to achieve even better performance.
Keywords: TrOCR, Transformer, Ensemble learning, OCR, Handwritten Text Recognition, Deep Neural Networks, Machine Learning, Artificial Intelligence, Huggingface, Python, Artificial Intelligence and Robotics, Data Science
3	Surmize: An Online NLP System for Close-Domain Question-Answering and Summarization
Bergkvist, Alexander; Hedberg, Nils; Rollino, Sebastian; Sagen, Markus, January 2020
The amount of data available to and consumed by people globally is growing. To reduce mental fatigue and make it easier to gain insight into complex, lengthy texts or documents, we have developed a web application to aid in this task. The application allows users to upload documents and ask domain-specific questions about them. A summarized version of each document is presented to the user, which can further facilitate their understanding of the document and guide them towards the types of questions that could be relevant to ask. Our application gives users flexibility in the types of documents that can be processed, is publicly available, stores no user data, and uses state-of-the-art models for its summaries and answers. The result is an application that yields near human-level intuition for answering questions in certain isolated cases, such as Wikipedia and news articles, as well as some scientific texts. The reliability of the application and of its predictions decreases as the complexity of the subject, the number of words in the document, and the grammatical inconsistency of the questions increase. These are all aspects that can be improved further if the application is used in production.
Keywords: Summary, Summarization, Abstractive Summarization, Extractive Summarization, ASUS, ESUS, Question Answering, Question-Answering, QA, QA Model, QA System, Natural Language Processing, NLP, Online NLP System, Machine Learning, ML, Deep Learning, DL, Close-Domain Question Answering, cdQA, Transformer Model, transformer, Transformer, BERT, Watson, Online QA, Online Summary, Online Summarization, Spacy, FastAPI, Surmize, Huggingface, Engineering and Technology
4	Large-Context Question Answering with Cross-Lingual Transfer
Sagen, Markus, January 2021
Models based on the transformer architecture have become some of the most prominent for solving a multitude of natural language processing (NLP) tasks since the architecture's introduction in 2017. However, much research related to the transformer model has focused primarily on achieving high performance, and many problems remain unsolved. Two of the most prominent are the lack of high-performing non-English pre-trained models and the limited number of words most trained models can incorporate in their context. Solving these problems would make NLP models more suitable for real-world applications, improving information retrieval, reading comprehension, and more. All previous research has focused on incorporating long context for English language models. This thesis investigates cross-lingual transferability when training for long context in English only. Such training could make long-context models more accessible for low-resource languages, such as Swedish, since long-context data is hard to find in most languages and training for each language is costly. This could become an efficient method for creating long-context models in other languages without the need for such data in all languages or pre-training from scratch. We extend the models’ context using the training scheme of the Longformer architecture and fine-tune on a question-answering task in several languages. Our evaluation could neither satisfactorily confirm nor refute that transferring long-term context is possible for low-resource languages. We believe that using datasets that require long-context reasoning, such as a multilingual TriviaQA dataset, could demonstrate our hypothesis’s validity.
Keywords: Long-Context Multilingual Model, Longformer, XLM-R Longformer, Long-term Context, Extending Context, Extend Context, Large-Context, Long-Context, Large Context, Long Context, Cross-Lingual, Multi-Lingual, Cross Lingual, Multi Lingual, QA, Question-Answering, Question Answering, Transformer model, Machine Learning, Transfer Learning, SQuAD, Memory, Transfer Learning, Long-Context, Long Context, Efficient, Monolingual, Multilingual, QA model, Language Model, Huggingface, BERT, RoBERTa, XLM-R, mBERT, Multilingual BERT, Efficient Transformers, Reformer, Linformer, Performer, Transformer-XL, Wikitext-103, TriviaQA, HotpotQA, WikiHopQA, VINNOVA, Peltarion, AI, LM, MLM, Deep Learning, Natural Language Processing, NLP, Attention, Transformers, Transfer Learning, Datasets, Computer and Information Sciences
