  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
231

Quantification of DNA Nanoballs Using Image Processing Techniques

Lindberg, Sara January 2023
In gene editing, it is important to identify the number of edited and unedited nucleic acids when developing new therapies and drugs. Countagen is developing a technology for accelerating genomic research; its product, GeneAbacus, is a consumable reagent kit for the quantification of nucleic acids that can be used by CRISPR gene editing researchers. The DNA analyzed with the reagent kit is first extracted in an assay and then targeted with tailored padlock probes. The target region is amplified via RCA and the products collapse into fluorescent DNA nanoballs, which can be analyzed with a fluorescence microscope. Each fluorescent dot in the microscope image corresponds to a single recognition event, making the quantification of the edited and unedited nucleic acids possible. The purpose of this project was to count the number of DNA nanoballs in images from a fluorescence microscope, with a focus on deep learning. To do this, the images were first preprocessed to enhance the image quality and then cropped into small patches, which were manually annotated at image level. The mean value from three annotators was used as the label, and the labelled images were used to train a ResNet with a regression-based approach. PyTorch and the Fastai API were used for training, and the applied method was transfer learning. The network was trained in two stages: first, the newly added layers were trained for feature extraction, and then the pre-trained base model was unfrozen and fine-tuned. To find the positions of the nanoballs in the images, Class Activation Maps (CAMs) and Gradient-weighted Class Activation Maps (Grad-CAMs) were created, and the local maxima were calculated to produce statistics. The best-performing model was a ResNet34 trained with batch size 32 and Huber loss as the loss function.
Model inference showed that the deep learning model counted the nanoballs within the same interval as the observers in 40 of 50 test images. The created CAMs and Grad-CAMs had too low a resolution to find the coordinates of the detected nanoballs. During this project, the nanoballs were only counted in small patches, although the goal was to find nanoballs in a large image. The project was limited by time, and the step of summing the nanoball counts across the patches was not performed. However, the study showed that it is possible to implement and train a deep learning model to count nanoballs in small patches. It also showed that the activation maps had too low a resolution to locate the nanoballs by looking for local maxima. The results showed that the number of patches used as training samples did not greatly affect the model's performance when comparing 300 patches and 450 patches. Manual annotation of nanoballs was difficult because the nanoballs move while the images are taken, which results in blurry nanoballs in some patches. Therefore, the manual annotation should probably be performed by experts to obtain correct labels for training. Further investigation is needed to improve the model and to find the positions of the nanoballs.
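The abstract above names Huber loss as the regression objective and the mean of three annotators' counts as the label. A minimal sketch of both ideas follows, in plain Python rather than the PyTorch and Fastai stack used in the thesis; the function names, the delta value of 1.0 and all counts are assumptions made for illustration, not values from the thesis.

```python
def huber_loss(predicted: float, target: float, delta: float = 1.0) -> float:
    """Huber loss: quadratic for errors up to delta, linear beyond it."""
    error = abs(predicted - target)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)


def mean_huber_loss(predictions, targets, delta: float = 1.0) -> float:
    """Average Huber loss over paired predictions and targets."""
    pairs = list(zip(predictions, targets))
    return sum(huber_loss(p, t, delta) for p, t in pairs) / len(pairs)


# Made-up annotator counts for two patches (hypothetical data).
annotator_counts = [[4, 5, 6], [10, 12, 11]]
# The label is the mean of the three annotators' counts.
labels = [sum(counts) / len(counts) for counts in annotator_counts]
# Made-up model outputs for the same two patches.
predictions = [5.5, 14.0]
loss = mean_huber_loss(predictions, labels)
```

Because the loss grows only linearly for large errors, a patch with a badly mislabelled count pulls on the model less than it would under squared error, which may matter when the annotators themselves disagree.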
232

Using Reinforcement Learning to Correct Soft Errors of Deep Neural Networks / Använda Förstärkningsinlärning för att Upptäcka och Mildra Mjuka Fel i Djupa Neurala Nätverk

Li, Yuhang January 2023
Deep Neural Networks (DNNs) are becoming increasingly important in various aspects of human life, particularly in safety-critical areas such as autonomous driving and aerospace systems. However, soft errors, including bit-flips, can significantly impact the performance of these systems, leading to serious consequences. To ensure the reliability of DNNs, it is essential to guarantee their performance. Many solutions have been proposed to enhance the trustworthiness of DNNs, including traditional methods such as error correcting code (ECC), which can detect and mitigate soft errors but comes at a high cost in redundancy. This thesis proposes a new method of correcting soft errors in DNNs using Deep Reinforcement Learning (DRL) and Transfer Learning (TL). A DRL agent learns to identify the layer-wise critical weights of a DNN. To reduce the training time, TL is used to apply this knowledge when training other layers. The primary objective of the method is to ensure acceptable performance of a DNN by mitigating the impact of errors on it while maintaining low redundancy. As a case study, we tested the proposed approach on a multilayer perceptron (MLP) and ResNet-18. Our results show that the method can save around 25% redundancy compared to the baseline method, ECC, while achieving the same level of performance. With the same redundancy, our approach can boost system performance by up to twice that of conventional methods. By implementing TL, the training time of the MLP is shortened to around 81.11%, and that of ResNet-18 is shortened to around 57.75%.
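To illustrate why some weights are more critical than others under bit-flips, the sketch below flips a single bit in the IEEE 754 float32 encoding of a weight. It is a generic illustration of a soft error, not the thesis's DRL method; the function name `flip_bit` and the example weight of 0.5 are assumptions.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) in the float32 encoding of value."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    flipped = as_int ^ (1 << bit)
    # Reinterpret the modified integer as a float32 again.
    (result,) = struct.unpack("<f", struct.pack("<I", flipped))
    return result


weight = 0.5                            # hypothetical weight value
low_bit_error = flip_bit(weight, 0)     # lowest mantissa bit: tiny change
sign_error = flip_bit(weight, 31)       # sign bit: weight becomes -0.5
exponent_error = flip_bit(weight, 30)   # top exponent bit: weight becomes 2**127
```

A flip in a low mantissa bit barely changes the weight, while a flip in the top exponent bit turns 0.5 into roughly 1.7e38. This asymmetry is one reason a scheme that protects only selected bits or weights can, in principle, use less redundancy than protecting everything uniformly.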
233

Deep Learning Classification and Model Explainability for Prediction of Mental Health Patients Emergency Department Visit / Emergency Department Resource Prediction Using Explainable Deep Learning

Rashidiani, Sajjad January 2022
The rate of Emergency Department (ED) visits due to mental health and drug abuse among children and youth has been increasing for more than a decade and is projected to become the leading cause of ED visits. Identifying high-risk patients well before an ED visit will enable mental health care providers to better predict ED resource utilization, improve their service, and ultimately reduce the risk of a future ED visit. Many studies in the literature have used medical history to predict future hospitalization. However, in mental health care, the medical history of new patients is not always available at the first visit, and it is crucial to identify high-risk patients from the beginning because the drop-out rate in mental health treatment is very high. In this study, a new approach of creating a text representation of questionnaire data for deep learning analysis is proposed. This text representation made it possible to use transfer learning and develop a deep Natural Language Processing (NLP) model that estimates the probability of an ED visit within 6 months among children and youth, using mental health patient-reported outcome measures (PROMs). The proposed method achieved an Area Under the Receiver Operating Characteristic Curve of 0.75 for classification of 6-month ED visits. In addition, a novel method was proposed to identify the words that carry the most information about the outcome of the deep NLP models. This measurement of word information using Entropy Gain increases the explainability of the model by providing insight into the model's attention. Finally, the results of this method were analyzed to explain how the deep NLP model achieved a high classification performance. / Dissertation / Master of Applied Science (MASc) / In this document, an Artificial Intelligence (AI) approach for predicting 6-month Emergency Department (ED) visits is proposed.
In this approach, the questionnaires gathered from children and youth admitted to an outpatient or inpatient clinic are converted to a text representation called Textionnaire. Next, AI is used to analyze the Textionnaire and predict the possibility of a future ED visit. This method was successful about 75% of the time. In addition to the AI solution, an explainability component is introduced to explain how the natural language processing algorithm identifies high-risk patients.
234

Investigating Few-Shot Transfer Learning for Address Parsing : Fine-Tuning Multilingual Pre-Trained Language Models for Low-Resource Address Segmentation / En Undersökning av Överföringsinlärning för Adressavkodning med Få Exempel : Finjustering av För-Tränade Språkmodeller för Låg-Resurs Adress Segmentering

Heimisdóttir, Hrafndís January 2022
Address parsing is the process of splitting an address string into its different address components, such as street name, street number, et cetera. Address parsing has been researched quite extensively and there exist some state-of-the-art address parsing solutions, mostly monolingual. In more recent years, research has emerged that focuses on multinational address parsing, and deep architecture address parsers have been used to achieve state-of-the-art performance on multinational address data. However, training these deep architectures for address parsing requires a rather large amount of address data, which is not always accessible. Generally within Natural Language Processing (NLP), data is difficult to come by, and most of the NLP data available comes from only about 20 of the approximately 7000 languages spoken around the world, the so-called high-resource languages. This also applies to address data, which can be difficult to come by for some of the so-called low-resource languages, for which little or no NLP data exists. To address the lack of address data for some of the less spoken languages of the world, the current project investigates the potential of Few-Shot Learning (FSL) for multinational address parsing. To investigate this, two few-shot transfer learning models are implemented, each consisting of a fine-tuned pre-trained language model (PTLM). The difference between the two models is the PTLM used: the multilingual language models mBERT and XLM-R, respectively. The two PTLMs are fine-tuned using a linear classifier layer and then used as multinational address parsers. The two models are trained and their results are compared with a state-of-the-art multinational address parser, Deepparse, as well as with each other.
Results show that the two models do not outperform Deepparse, but they do show promising results, not too far from what Deepparse achieves on holdout and zero-shot datasets. On a mix of low- and high-resource language address data, both models perform well and achieve an overall F1-score of over 96%. Of the two models, XLM-R achieves significantly better results than mBERT and can therefore be considered the more appropriate PTLM for multinational FSL address parsing. Based on these results, the conclusion is that there is great potential for FSL within the field of multinational address parsing and that general FSL methods can be used and perform well on multinational address parsing tasks.
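The abstract above reports an overall F1-score for address parsing but does not say how it is computed. As a rough illustration only, the sketch below computes per-tag and macro-averaged F1 over token-level tags; the tag names, the example address and the choice of token-level macro averaging are all assumptions, not details taken from the thesis or from Deepparse.

```python
from collections import Counter


def per_tag_f1(gold, predicted):
    """Token-level F1 for each tag, given parallel lists of gold and predicted tags."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted tag was wrong for this token
            fn[g] += 1  # gold tag was missed for this token
    scores = {}
    for tag in set(gold) | set(predicted):
        denominator = 2 * tp[tag] + fp[tag] + fn[tag]
        scores[tag] = 2 * tp[tag] / denominator if denominator else 0.0
    return scores


# Hypothetical address and tag set, for illustration only.
tokens = ["10", "Downing", "Street", "London"]
gold = ["StreetNumber", "StreetName", "StreetName", "Municipality"]
predicted = ["StreetNumber", "StreetName", "Municipality", "Municipality"]

scores = per_tag_f1(gold, predicted)
macro_f1 = sum(scores.values()) / len(scores)
```

Here the parser mislabels "Street" as a municipality, which lowers the F1 of both the street-name tag (a missed token) and the municipality tag (a spurious token).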
235

Monolingual and Cross-Lingual Survey Response Annotation

Zhao, Yahui January 2023
Multilingual natural language processing (NLP) is increasingly recognized for its potential in processing diverse types of text data, including text from social media, reviews, and technical reports. Multilingual language models like mBERT and XLM-RoBERTa (XLM-R) play a pivotal role in multilingual NLP. Notwithstanding their capabilities, the performance of these models largely relies on the availability of annotated training data. This thesis employs the multilingual pre-trained model XLM-R to examine its efficacy in sequence labelling of responses to open-ended questions on democracy across multilingual surveys. Traditional annotation practices have been labour-intensive and time-consuming, with limited attempts at automation. Previous studies often translated multilingual data into English, bypassing the challenges and nuances of native languages. Our study explores automatic multilingual annotation at the token level for democracy survey responses in five languages: Hungarian, Italian, Polish, Russian, and Spanish. The results reveal promising F1 scores, indicating the feasibility of using multilingual models for such tasks. However, the performance of these models is closely tied to the quality and nature of the training set. This research paves the way for future experiments and model adjustments, underscoring the importance of refining training data and optimizing model techniques for enhanced classification accuracy.
236

Detection and Classification of Cancer and Other Noncommunicable Diseases Using Neural Network Models

Gore, Steven Lee 07 1900
Here, we show that training with multiple noncommunicable diseases (NCDs) is both feasible and beneficial to modeling this class of diseases. We first use data from the Cancer Genome Atlas (TCGA) to train a pan-cancer model, and then characterize the information the model has learned about the cancers. In doing this we show that the model has learned concepts that are relevant to the task of cancer classification. We also test the model on datasets derived independently of the TCGA cohort and show that the model is robust to data outside of its training distribution, such as precancerous lesions and metastatic samples. We then use the cancer model as the basis of a transfer learning study in which we retrain it on other, non-cancer NCDs. In doing so we show that NCDs with very different underlying biology contain extractable information relevant to each other, allowing a broader model of NCDs to be developed with existing datasets. We then test the importance of the sample's source tissue in the model and find that the NCD class and tissue source may not be independent in our model. To address this, we use the tissue encodings to create augmented samples. We test how successfully we can use these augmented samples to remove or diminish the importance of tissue source to NCD class by retraining the model. In doing this we make key observations about the nature of concept importance and its usefulness in future neural network explainability efforts.
237

Neural maskinöversättning av gawarbati / Neural machine translation for Gawarbati

Gillholm, Katarina January 2023
Recent neural models have led to huge improvements in machine translation, but performance is still suboptimal for languages without large parallel datasets, so-called low-resource languages. Gawarbati is a small, threatened low-resource language for which only 5000 parallel sentences are available. This thesis uses transfer learning and hyperparameters optimized for small datasets to explore the possibilities and limitations of neural machine translation from Gawarbati to English. Transfer learning, where the parent model was first trained on Hindi-English parallel data, improved results by 1.8 BLEU and 1.3 chrF. Hyperparameters optimized for small datasets increased BLEU by 0.6 but decreased chrF by 1. Combining transfer learning and hyperparameters optimized for small datasets decreased performance by 0.5 BLEU and 2.2 chrF.
The neural models are compared with, and outperform, word-based statistical machine translation and GPT-3. The highest-performing model achieved only 2.8 BLEU and 19 chrF, which illustrates the limitations of machine translation for low-resource languages and the critical need for more data. / VR 2020-01500
238

Développement et validation d’un modèle d’apprentissage machine pour la détection de potentiels donneurs d’organes

Sauthier, Nicolas 08 1900
The organ donation process, although crucial for many patients' survival, does not meet the increasing demand. Its efficiency depends on clinicians identifying potential organ donors. This imperfect step misses between 30% and 60% of potential organ donors, regardless of the country studied. Improving that process is a moral and economic imperative. The main goal of this work was to address that limiting step by developing and validating a predictive model that could automatically detect potential organ donors. Clinical data were extracted for all adult patients hospitalized in the CHUM critical care units between 2012 and 2019. The temporal evolution of 103 types of laboratory analysis and 2 static clinical values was used to develop and test a convolutional neural network (CNN) trained to predict potential organ donors. This model was compared with a non-temporal logistic model as a baseline. The CNN model was then validated in a clinically distinct external population. To improve performance in this external cohort, strategies to fine-tune the network were compared. A total of 19 463 patients, including 397 potential organ donors, were used to develop the model, and 4 669 patients, including 36 potential organ donors, served as the external validation cohort. The CNN model performed better, with an AUROC of 0.966 (95% CI 0.949-0.981), than the logistic model (AUROC of 0.940, 95% CI 0.908-0.969, p=0.014). The CNN model was also superior in specific subpopulations of increased clinical interest. In the external validation cohort, the CNN model's AUROC was 0.820 (0.682-0.948) and could be improved to 0.874 (0.731-0.974) after fine-tuning. This promising model could change potential organ donor detection for the better.
More studies are, however, required to improve the model, notably by adding more types of data, and to validate the model prospectively before routine clinical use.
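The AUROC values reported above can be read as the probability that a randomly chosen potential donor receives a higher model score than a randomly chosen non-donor. The sketch below computes AUROC directly from that pairwise definition; it is a generic illustration with made-up labels and scores, not the evaluation code of the thesis, and the function name `auroc` is an assumption.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Made-up labels (1 = potential donor) and model scores.
example_labels = [0, 0, 1, 0, 1]
example_scores = [0.10, 0.40, 0.35, 0.80, 0.90]
example_auroc = auroc(example_labels, example_scores)
```

The pairwise form is quadratic in the number of patients, so for cohorts of the size described above a rank-based implementation would be the more practical choice; the definition is the same.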
239

Head-to-head Transfer Learning Comparisons made Possible : A Comparative Study of Transfer Learning Methods for Neural Machine Translation of the Baltic Languages

Stenlund, Mathias January 2023
The difficulty of training adequate MT models with data-hungry NMT frameworks for low-resource language pairs has created a need to alleviate the scarcity of sufficiently large parallel corpora. Different transfer learning methods have been introduced as possible solutions to this problem, where a new model for a target task is initialized using parameters learned from some other high-resource task. Many of these methods are claimed to increase the translation quality of NMT systems in some low-resource environments. However, they are often shown to do so using different parent and child language pairs, data sizes, NMT frameworks, and training hyperparameters, which makes comparing them impossible. In this thesis project, three such transfer learning methods are put head-to-head in a controlled environment where the target task is to translate from the under-resourced Baltic languages Lithuanian and Latvian to English. In this controlled environment, the same parent language pairs, data sizes, data domains, transformer framework, and training parameters are used to ensure fair comparisons between the three transfer learning methods. The experiments involve training and testing models using all combinations of transfer learning methods, parent language pairs, and either in-domain or out-domain data, for an extensive study in which different strengths and weaknesses are observed. The results show that Multi-Round Transfer Learning improves the overall translation quality the most but also requires by far the longest training time. The Parameter Freezing method provides a marginally lower overall improvement in translation quality but requires only half the training time, while Trivial Transfer Learning improves quality the least. Both Polish and Russian work well as parents for the Baltic languages, while web-crawled data improves out-domain translations the most.
The results suggest that all three transfer learning methods are effective in a simulated low-resource environment. However, none of them can compete with simply having a larger data set for the target language pair, since none of them overcomes the strong higher-resource baseline.
240

Deep Learning Methods for Recovering Trading Strategies

Emtell, Erik, Spjuth, Oliver January 2022
The first aim of this paper is to determine whether deep learning methods can recover trading strategies based on historical price and volume data, with the scarcity of real data in mind. The second aim is to evaluate the methods in order to produce a deep learning blueprint for strategy extraction. Trading strategies can be built on many different types of data, often combined from different areas. In this paper, we focus on trading strategies based solely on historical price and volume data to limit the scope of the problem. Combinations of different deep learning architectures and methods, such as transfer learning and ensemble methods, were evaluated. The results clearly show that deep learning models can recover relatively complex trading strategies to some extent. Models leveraging transfer learning outperform other models when data is scarce, and ensemble methods improve performance in certain regards. / Bachelor's degree project in electrical engineering 2022, KTH, Stockholm
