About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
231

Investigation of Machine Learning Regression Techniques to Predict Critical Heat Flux

Helmryd Grosfilley, Emil January 2022 (has links)
A unifying model for Critical Heat Flux (CHF) prediction has been elusive for over 60 years. With the release of the data used to build the 2006 Groeneveld Lookup Table (LUT), by far the largest public CHF database available to date, data-driven predictions over a large variable space can be performed. The popularization of machine learning techniques for regression problems provides deeper and more advanced tools for analyzing the data. We compare three machine learning algorithms for predicting the occurrence of CHF in vertical, uniformly heated round tubes. For each selected algorithm (ν-support vector regression, Gaussian process regression, and neural network regression), an optimized hyperparameter set is fitted. The best-performing algorithm is the neural network, which achieves a standard deviation of the predicted/measured ratio three times lower than the LUT, while Gaussian process regression and ν-support vector regression both achieve a standard deviation two times lower. All algorithms significantly outperform the LUT. The neural network model and training methodology are designed to prevent overfitting, which is confirmed by analysis of the predictions. Additionally, a feasibility study of transfer learning and uncertainty quantification is performed to investigate potential future applications.
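The three regression families named above are all available in scikit-learn. The sketch below compares them on synthetic stand-in data (not the Groeneveld CHF database), using the standard deviation of the predicted/measured ratio as the metric, as in the thesis; the feature count, weights, and hyperparameters are illustrative assumptions.

```python
# Hedged sketch: comparing nu-SVR, Gaussian process regression, and neural
# network regression on synthetic data standing in for the CHF database.
import numpy as np
from sklearn.svm import NuSVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 4))   # stand-ins for pressure, mass flux, etc.
y = 5.0 + X @ np.array([2.0, -1.0, 0.5, 1.5]) + 0.05 * rng.normal(size=400)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "nu-SVR": NuSVR(nu=0.5, C=10.0),
    "GPR": GaussianProcessRegressor(alpha=1e-3),
    "NN": MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    ratio = model.predict(X_te) / y_te   # predicted/measured ratio
    print(f"{name}: std of predicted/measured ratio = {ratio.std():.3f}")
```

In the thesis, each algorithm's hyperparameters were optimized separately; here the settings are fixed only to keep the comparison runnable.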
232

An Investigation of Low-Rank Decomposition for Increasing Inference Speed in Deep Neural Networks With Limited Training Data

Wikén, Victor January 2018 (has links)
In this study, to increase the inference speed of convolutional neural networks, the optimization technique low-rank tensor decomposition was implemented and applied to an AlexNet trained to classify dog breeds. Because only a small training set was available, transfer learning was used for the classification task. The purpose of the study is to investigate how effective low-rank tensor decomposition is when the training set is limited. The results, compared to a previous study, indicate a strong relationship between the effects of the tensor decomposition and how much training data is available. A significant speed-up can be obtained in the different convolutional layers using tensor decomposition. However, since the network must be retrained after the decomposition and the dataset is limited, there is a slight decrease in accuracy.
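The core idea behind the speed-up, replacing a large weight tensor with a product of smaller low-rank factors, can be sketched with a truncated SVD. The thesis decomposes 4-D convolution kernels; the 2-D matrix below is a simplified stand-in, and the sizes, rank, and noise level are arbitrary assumptions.

```python
# Minimal low-rank factorization sketch: W (m x n) ~ A (m x r) @ B (r x n).
import numpy as np

rng = np.random.default_rng(0)
# Synthetic near-low-rank weights: rank-16 signal plus small noise.
W = rng.normal(size=(256, 16)) @ rng.normal(size=(16, 192)) \
    + 0.01 * rng.normal(size=(256, 192))
r = 32                                   # chosen truncation rank

U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]                     # m x r factor
B = Vt[:r, :]                            # r x n factor
W_approx = A @ B

params_before = W.size
params_after = A.size + B.size
rel_err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"params: {params_before} -> {params_after}, relative error {rel_err:.4f}")
```

The same trade-off drives the thesis result: fewer multiply-accumulates per layer, at the cost of an approximation error that retraining must compensate for.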
233

Designing a Question Answering System in the Domain of Swedish Technical Consulting Using Deep Learning / Design av ett frågebesvarande system inom svensk konsultverksamhet med användning av djupinlärning

Abrahamsson, Felix January 2018 (has links)
Question Answering systems are greatly sought after in many areas of industry. Unfortunately, as most research in Natural Language Processing is conducted in English, the applicability of such systems to other languages is limited. Moreover, these systems often struggle with long text sequences. This thesis explores the possibility of applying existing models to the Swedish language, in a domain where the syntax and semantics differ greatly from typical Swedish texts. Additionally, the text length may vary arbitrarily. To solve these problems, transfer learning techniques and state-of-the-art Question Answering models are investigated. Furthermore, a novel divide-and-conquer technique for processing long texts is developed. Results show that the transfer learning is only partly successful, but that the system is nevertheless capable of performing reasonably well in the new domain. Furthermore, the system shows a large performance improvement on longer text sequences when the new technique is used.
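The abstract does not detail the divide-and-conquer technique; one plausible reading, sketched below, splits a long text into overlapping windows, answers each window independently, and keeps the highest-confidence result. `answer_window` here is a toy stand-in for the neural QA model, which is not reproduced.

```python
# Hedged sketch of divide-and-conquer QA over long contexts.
def split_windows(tokens, size=100, stride=50):
    # Overlapping windows so answers spanning a boundary are not lost.
    return [tokens[i:i + size]
            for i in range(0, max(1, len(tokens) - size + stride), stride)]

def answer_window(window, question_terms):
    # Toy confidence: fraction of question terms present in the window.
    hits = sum(1 for t in question_terms if t in window)
    return " ".join(window[:5]), hits / len(question_terms)

def answer_long_text(text, question):
    tokens, q_terms = text.split(), question.lower().split()
    return max((answer_window(w, q_terms) for w in split_windows(tokens)),
               key=lambda pair: pair[1])

long_text = "filler " * 300 + "the deadline is friday"
print(answer_long_text(long_text, "deadline friday"))
```

A real system would aggregate model logits rather than term overlap, but the decomposition structure is the same.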
234

Decision Making and Classification for Time Series Data

Yang, Qiwei 16 August 2022 (has links)
No description available.
235

Quantification of DNA Nanoballs Using Image Processing Techniques

Lindberg, Sara January 2023 (has links)
In gene editing, it is important to identify the number of edited and unedited nucleic acids in the development of new therapies and drugs. Countagen is developing a technology for accelerating genomic research; its product, GeneAbacus, is a consumable reagent kit for the quantification of nucleic acids, intended for CRISPR gene-editing researchers. The DNA analyzed with the reagent kit is first extracted in an assay and then targeted with tailored padlock probes. The target region is amplified via rolling circle amplification (RCA), and the products collapse into fluorescent DNA nanoballs that can be analyzed with a fluorescence microscope. Each fluorescent dot in the microscope corresponds to a single recognition event, making quantification of the edited and unedited nucleic acids possible. The purpose of this project was to count the number of DNA nanoballs in images from a fluorescence microscope, with a focus on deep learning. To do this, the images were first preprocessed to enhance image quality and then cropped into small patches, before the patches were manually annotated at image level. The mean value from three annotators was used as the label, and the labelled images were used to train a ResNet with a regression-based approach. PyTorch and the Fastai API were used for training, and the applied method was transfer learning. The network was trained in two stages: first, the newly added layers were trained for feature extraction, and then the pre-trained base model was unfrozen and fine-tuned. To find the positions of the nanoballs in the images, Class Activation Maps (CAMs) and Gradient-weighted Class Activation Maps (Grad-CAMs) were created, and their local maxima were calculated to produce statistics. The best-performing model was a ResNet34 trained with batch size 32 and the Huber loss function.
Model inference showed that the deep learning model counted the nanoballs in the same interval as the observers in 40 of 50 test images. The created CAMs and Grad-CAMs had too low a resolution to find the coordinates of the detected nanoballs. During this project, the nanoballs were only counted in small patches, although the goal was to find nanoballs in a large image. The project was limited by time, and unfortunately the step in which the counts from the different patches were to be summed was not performed. However, the study showed that it is possible to implement and train a deep learning model to count nanoballs in small patches. It also showed that the activation maps had too low a resolution to locate the nanoballs by searching for local maxima. The results showed that the number of patches used as training samples did not greatly impact the model's performance when comparing 300 patches with 450 patches. Manual annotation of nanoballs was difficult since the nanoballs move while the images are taken, which results in blurred nanoballs in some patches. The manual annotation should therefore probably be performed by experts to obtain correct labels for training. Further investigation is needed to improve the model and to find the positions of the nanoballs.
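The local-maxima counting step can be illustrated on a synthetic activation map: each strict maximum of its 3x3 neighbourhood above a threshold counts as one detection. This is a generic sketch, not the thesis code, and the map below is fabricated for illustration.

```python
# Count bright, isolated peaks in a 2-D activation map.
import numpy as np

def count_local_maxima(img, threshold=0.5):
    # Pad with -inf so border pixels can still be strict maxima.
    padded = np.pad(img, 1, mode="constant", constant_values=-np.inf)
    count = 0
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            patch = padded[i:i + 3, j:j + 3]
            is_strict_max = (img[i, j] == patch.max()
                             and (patch == img[i, j]).sum() == 1)
            if img[i, j] >= threshold and is_strict_max:
                count += 1
    return count

img = np.zeros((20, 20))
for (r, c) in [(4, 4), (10, 15), (16, 7)]:   # three synthetic "nanoballs"
    img[r, c] = 1.0
print(count_local_maxima(img))               # -> 3
```

As the abstract notes, this only works when the map's resolution is fine enough to separate neighbouring dots into distinct peaks.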
236

Using Reinforcement Learning to Correct Soft Errors of Deep Neural Networks / Använda Förstärkningsinlärning för att Upptäcka och Mildra Mjuka Fel i Djupa Neurala Nätverk

Li, Yuhang January 2023 (has links)
Deep Neural Networks (DNNs) are becoming increasingly important in many aspects of human life, particularly in safety-critical areas such as autonomous driving and aerospace systems. However, soft errors, including bit flips, can significantly degrade the performance of these systems, leading to serious consequences. Ensuring the reliability of DNNs therefore requires guaranteeing their performance. Many solutions have been proposed to enhance the trustworthiness of DNNs, including traditional methods such as error correcting codes (ECC), which can mitigate and detect soft errors but come at a high cost in redundancy. This thesis proposes a new method for correcting soft errors in DNNs using Deep Reinforcement Learning (DRL) and Transfer Learning (TL). A DRL agent learns to identify the layer-wise critical weights of a DNN; to reduce training time, TL is used to apply this knowledge when training other layers. The primary objective is to maintain acceptable DNN performance by mitigating the impact of errors while keeping redundancy low. As a case study, we tested the proposed approach on a multilayer perceptron (MLP) and ResNet-18. Our results show that the method can save around 25% redundancy compared to the baseline ECC method while achieving the same level of performance; with the same redundancy, it can boost system performance by up to twice that of conventional methods. By using TL, the training time of the MLP is shortened to around 81.11% of the original, and that of ResNet-18 to around 57.75%.
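The fault model behind this work, a single bit flip in a stored weight, can be demonstrated directly on a float32 value: which bit flips determines whether the weight changes negligibly or by orders of magnitude, which is why identifying critical weights matters.

```python
# Flip one bit of a float32 value's IEEE 754 representation.
import struct

def flip_bit(value, bit):
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped

w = 0.5
print(flip_bit(w, 30))   # top exponent bit: the weight explodes
print(flip_bit(w, 0))    # lowest mantissa bit: nearly no change
```

A protection scheme that covers only the high-impact bits of critical weights needs far less redundancy than full ECC over every bit, which is the trade-off the thesis exploits.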
237

Deep Learning Classification and Model Explainability for Prediction of Mental Health Patients Emergency Department Visit / Emergency Department Resource Prediction Using Explainable Deep Learning

Rashidiani, Sajjad January 2022 (has links)
The rate of Emergency Department (ED) visits due to mental health and drug abuse among children and youth has been increasing for more than a decade and is projected to become the leading cause of ED visits. Identifying high-risk patients well before an ED visit will enable mental health care providers to better predict ED resource utilization, improve their service, and ultimately reduce the risk of a future ED visit. Many studies in the literature have used medical history to predict future hospitalization. However, in mental health care the medical history of new patients is not always available from the first visit, and it is crucial to identify high-risk patients from the beginning, as the drop-out rate in mental health treatment is very high. In this study, a new approach to creating a text representation of questionnaire data for deep learning analysis is proposed. This new text representation enabled us to use transfer learning and develop a deep Natural Language Processing (NLP) model that estimates the probability of an ED visit within 6 months among children and youth, using mental health patient-reported outcome measures (PROMs). The proposed method achieved an area under the receiver operating characteristic curve of 0.75 for classification of 6-month ED visits. In addition, a novel method was proposed to identify the words that carry the most information related to the outcome of the deep NLP models. This measurement of word information using entropy gain increases the explainability of the model by providing insight into the model's attention. Finally, the results of this method were analyzed to explain how the deep NLP model achieved high classification performance. / Dissertation / Master of Applied Science (MASc) / In this document, an Artificial Intelligence (AI) approach for predicting 6-month Emergency Department (ED) visits is proposed. In this approach, questionnaires gathered from children and youth admitted to an outpatient or inpatient clinic are converted to a text representation called Textionnaire. Next, AI is used to analyze the Textionnaire and predict the possibility of a future ED visit. This method was successful about 75% of the time. In addition to the AI solution, an explainability component is introduced to explain how the natural language processing algorithm identifies high-risk patients.
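The abstract does not specify the exact Textionnaire format. The sketch below shows one plausible way to serialize questionnaire items into a text string a pre-trained NLP model can consume; the question wording and answers are hypothetical.

```python
# Hypothetical questionnaire-to-text conversion ("Textionnaire"-style).
def to_text(responses):
    # Join each question with its answer into one running text.
    return " ".join(f"{question} {answer}." for question, answer in responses)

survey = [
    ("How often do you feel worried?", "almost every day"),
    ("How well do you sleep?", "poorly"),
]
print(to_text(survey))
```

The appeal of such a representation is that a language model pre-trained on ordinary text can be fine-tuned on it directly, which is what enables the transfer learning described above.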
238

Investigating Few-Shot Transfer Learning for Address Parsing : Fine-Tuning Multilingual Pre-Trained Language Models for Low-Resource Address Segmentation / En Undersökning av Överföringsinlärning för Adressavkodning med Få Exempel : Finjustering av För-Tränade Språkmodeller för Låg-Resurs Adress Segmentering

Heimisdóttir, Hrafndís January 2022 (has links)
Address parsing is the process of splitting an address string into its different components, such as street name, street number, et cetera. Address parsing has been researched quite extensively, and there exist some state-of-the-art address parsing solutions, mostly unilingual. In recent years, research focusing on multinational address parsing has emerged, and deep-architecture address parsers have been used to achieve state-of-the-art performance on multinational address data. However, training these deep architectures requires a rather large amount of address data, which is not always accessible. Within Natural Language Processing (NLP), data is generally difficult to come by, and most of the available NLP data covers only about 20 of the approximately 7000 languages spoken around the world, the so-called high-resource languages. This also applies to address data, which can be difficult to obtain for some of the world's so-called low-resource languages, for which little or no NLP data exists. To address the lack of address data for some of the less spoken languages of the world, this project investigates the potential of Few-Shot Learning (FSL) for multinational address parsing. Two few-shot transfer learning models are implemented, both consisting of a fine-tuned pre-trained language model (PTLM). The difference between the two models is the PTLM used: the multilingual language models mBERT and XLM-R, respectively. The two PTLMs are fine-tuned with a linear classifier layer and then used as multinational address parsers. The two models are trained, and their results are compared with a state-of-the-art multinational address parser, Deepparse, as well as with each other.
Results show that the two models do not outperform Deepparse, but they do show promising results, not far from what Deepparse achieves on holdout and zero-shot datasets. On a mix of low- and high-resource language address data, both models perform well and achieve an overall F1-score above 96%. Of the two models, XLM-R achieves significantly better results than mBERT and can therefore be considered the more appropriate PTLM for multinational FSL address parsing. Based on these results, the conclusion is that there is great potential for FSL within the field of multinational address parsing and that general FSL methods can be used and perform well on multinational address parsing tasks.
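Address parsing reduces to token classification: every token of the address string receives a component label. The toy rule-based tagger below stands in for the fine-tuned PTLM with its linear classifier head; the tag names are illustrative, not the tag set used in the thesis or by Deepparse.

```python
# Token classification view of address parsing, with a toy tagger standing
# in for the fine-tuned multilingual language model.
def parse_address(tokens, tagger):
    return list(zip(tokens, tagger(tokens)))

def toy_tagger(tokens):
    # Hypothetical rules: a leading number is the street number, a trailing
    # number is the postal code, everything else is the street name.
    tags = []
    for i, tok in enumerate(tokens):
        if tok.isdigit():
            tags.append("StreetNumber" if i == 0 else "PostalCode")
        else:
            tags.append("StreetName")
    return tags

print(parse_address("12 Kungsgatan 11122".split(), toy_tagger))
```

In the real system, `tagger` would run the PTLM over the token sequence and apply the linear layer to each token's final hidden state.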
239

Monolingual and Cross-Lingual Survey Response Annotation

Zhao, Yahui January 2023 (has links)
Multilingual natural language processing (NLP) is increasingly recognized for its potential in processing diverse text types, including social media posts, reviews, and technical reports. Multilingual language models such as mBERT and XLM-RoBERTa (XLM-R) play a pivotal role in multilingual NLP. Notwithstanding their capabilities, the performance of these models largely relies on the availability of annotated training data. This thesis employs the multilingual pre-trained model XLM-R and examines its efficacy in sequence labelling of open-ended survey questions on democracy across multilingual surveys. Traditional annotation practices have been labour-intensive and time-consuming, with limited attempts at automation. Previous studies often translated multilingual data into English, bypassing the challenges and nuances of the native languages. Our study explores automatic multilingual annotation at the token level for democracy survey responses in five languages: Hungarian, Italian, Polish, Russian, and Spanish. The results reveal promising F1 scores, indicating the feasibility of using multilingual models for such tasks. However, the performance of these models is closely tied to the quality and nature of the training set. This research paves the way for future experiments and model adjustments, underscoring the importance of refining training data and optimizing model techniques for improved classification accuracy.
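The F1 scores reported for token-level annotation can be computed micro-averaged over tokens. The simplified sketch below ignores an "O" (outside) class and uses made-up labels; the thesis's exact evaluation protocol (for instance, span-level scoring) may differ.

```python
# Micro-averaged token-level F1 for sequence labelling, ignoring "O" tokens.
def token_f1(gold, pred, ignore="O"):
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g != ignore)
    pred_pos = sum(1 for p in pred if p != ignore)
    gold_pos = sum(1 for g in gold if g != ignore)
    precision = tp / pred_pos if pred_pos else 0.0
    recall = tp / gold_pos if gold_pos else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

gold = ["B-ARG", "I-ARG", "O", "B-ARG"]
pred = ["B-ARG", "O",     "O", "B-ARG"]
print(round(token_f1(gold, pred), 3))   # -> 0.8
```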
240

Detection and Classification of Cancer and Other Noncommunicable Diseases Using Neural Network Models

Gore, Steven Lee 07 1900 (has links)
Here, we show that training with multiple noncommunicable diseases (NCDs) is both feasible and beneficial to modeling this class of diseases. We first use data from The Cancer Genome Atlas (TCGA) to train a pan-cancer model, and then characterize the information the model has learned about the cancers. In doing this we show that the model has learned concepts that are relevant to the task of cancer classification. We also test the model on datasets derived independently of the TCGA cohort and show that the model is robust to data outside of its training distribution, such as precancerous lesions and metastatic samples. We then use the cancer model as the basis of a transfer learning study in which we retrain it on other, non-cancer NCDs. In doing so we show that NCDs with very different underlying biology contain extractable information relevant to each other, allowing a broader model of NCDs to be developed with existing datasets. We then test the importance of a sample's source tissue in the model and find that the NCD class and tissue source may not be independent in our model. To address this, we use the tissue encodings to create augmented samples, and we test how successfully these augmented samples can be used to remove or diminish the importance of tissue source to NCD class by retraining the model. In doing this we make key observations about the nature of concept importance and its usefulness in future neural network explainability efforts.
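The augmentation idea can be sketched under the assumption that each sample is an expression vector concatenated with a one-hot tissue encoding: swapping in a different tissue encoding yields an augmented sample whose disease label is unchanged. The layout, shapes, and values below are assumptions for illustration, not the thesis's actual data format.

```python
# Hypothetical tissue-encoding swap for sample augmentation.
import numpy as np

def augment_tissue(sample, n_tissues, new_tissue):
    # Assumed layout: [expression features ... | one-hot tissue encoding].
    expression = sample[:-n_tissues]
    encoding = np.zeros(n_tissues)
    encoding[new_tissue] = 1.0
    return np.concatenate([expression, encoding])

x = np.concatenate([np.array([0.2, 1.4, 0.7]),   # expression features
                    np.array([1.0, 0.0])])       # one-hot: tissue 0
x_aug = augment_tissue(x, n_tissues=2, new_tissue=1)
print(x_aug)   # expression unchanged, tissue encoding now selects tissue 1
```

Retraining on such pairs pushes the model to keep its NCD prediction constant as the tissue encoding varies, which is the decoupling the abstract describes.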
