161

Regularization schemes for transfer learning with convolutional networks

Li, Xuhong 10 September 2019
Transfer learning with deep convolutional neural networks significantly reduces the computation and data overhead of the training process and boosts performance on the target task, compared to training from scratch. However, transfer learning with a deep network may cause the model to forget the knowledge acquired when learning the source task, leading to so-called catastrophic forgetting. Since the efficiency of transfer learning derives from the knowledge acquired on the source task, this knowledge should be preserved during transfer. This thesis addresses this forgetting problem by proposing two regularization schemes that preserve knowledge during transfer. First, we investigate several forms of parameter regularization, all of which explicitly promote the similarity of the final solution with the initial model, based on the L1, L2, and Group-Lasso penalties. We also propose variants that use Fisher information as a metric for measuring the importance of parameters. We validate these parameter regularization approaches on various tasks, including image semantic segmentation and optical flow estimation. The second regularization scheme is based on the theory of optimal transport, which makes it possible to estimate the dissimilarity between two distributions. We build on optimal transport to penalize deviations of high-level representations between the source and target task, with the same objective of preserving knowledge during transfer learning. With a mild increase in computation time during training, this novel regularization approach improves the performance of the target tasks and yields higher accuracy on image classification tasks than the parameter regularization approaches.
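A minimal sketch of the starting-point parameter regularization described above (an L2 penalty pulling the fine-tuned weights back toward the pre-trained source weights), assuming PyTorch; the function and coefficient names are illustrative, not the thesis's actual code:

```python
import torch

def l2_sp_penalty(model, source_params, alpha=0.01):
    """Sum of squared deviations of current weights from the frozen source weights.

    source_params: dict mapping parameter names to copies of the pre-trained tensors.
    A Fisher-weighted variant would multiply each squared deviation by an
    estimate of that parameter's importance on the source task.
    """
    penalty = 0.0
    for name, param in model.named_parameters():
        if name in source_params:
            penalty = penalty + ((param - source_params[name]) ** 2).sum()
    return alpha * penalty

# During fine-tuning: loss = task_loss + l2_sp_penalty(model, source_params)
```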
162

On Transfer Learning Techniques for Machine Learning

Debasmit Das (8314707) 30 April 2020
Recent progress in machine learning has been mainly due to the availability of large amounts of annotated data used for training complex models with deep architectures. Annotating this training data becomes burdensome and creates a major bottleneck in maintaining machine-learning databases. Moreover, these trained models fail to generalize to new categories or new varieties of the same categories, because new categories or varieties have a data distribution different from the training data distribution. To tackle these problems, this thesis proposes to develop a family of transfer-learning techniques that can deal with different training (source) and testing (target) distributions, under the assumption that the availability of annotated data is limited in the testing domain. This is done by using an auxiliary data-abundant source domain from which useful knowledge is transferred and applied to the data-scarce target domain. This transferable knowledge serves as a prior that biases target-domain predictions and prevents the target-domain model from overfitting. Specifically, we explore structural priors that encode relational knowledge between different data entities, which provide a more informative bias than traditional priors. The choice of structural prior depends on the information availability and the similarity between the two domains. Depending on the domain similarity and the information availability, we divide the transfer learning problem into four major categories and propose different structural priors to solve each of these sub-problems.

This thesis first focuses on the unsupervised-domain-adaptation problem, where we propose to minimize domain discrepancy by transforming labeled source-domain data to be close to unlabeled target-domain data. For this problem, the categories remain the same across the two domains, and hence we assume that the structural relationship between the source-domain samples carries over to the target domain. Thus, a graph or hypergraph is constructed as the structural prior from both domains, and a graph/hypergraph matching formulation is used to transform samples in the source domain to be closer to samples in the target domain. An efficient optimization scheme is then proposed to tackle the time and memory inefficiencies associated with the matching problem. The few-shot learning problem is studied next, where we propose to transfer knowledge from source-domain categories containing abundantly labeled data to novel categories in the target domain that contain only a few labeled samples. The knowledge transfer biases the novel-category predictions and prevents the model from overfitting. The knowledge is encoded using a neural-network-based prior that transforms a data sample into its corresponding class prototype. This neural network is trained on the source-domain data and applied to the target-domain data, where it transforms the few-shot samples into the novel-class prototypes for better recognition performance. The few-shot learning problem is then extended to the situation where we do not have access to the source-domain data but only to the source-domain class prototypes. In this limited-information setting, parametric neural-network-based priors would overfit to the source-class prototypes, and hence we seek a non-parametric prior using manifolds. A piecewise-linear manifold is used as a structural prior to fit the source-domain class prototypes. This structure is extended to the target domain, where the novel-class prototypes are found by projecting the few-shot samples onto the manifold. Finally, the zero-shot learning problem is addressed, which is an extreme case of the few-shot learning problem where we do not have any labeled data in the target domain. However, we have high-level information for both the source and target domain categories in the form of semantic descriptors. We learn the relation between the sample space and the semantic space using a regularized neural network, so that classification of the novel categories can be carried out in a common representation space. This same neural network is then used in the target domain to relate the two spaces. If we want to generate data for the novel categories in the target domain, we can use a constrained generative adversarial network instead of a traditional neural network. Thus, we use structural priors like graphs, neural networks, and manifolds to relate data entities like samples, prototypes, and semantics across these transfer learning sub-problems. We explore additional post-processing steps like pseudo-labeling, domain adaptation, and calibration, and enforce algorithmic and architectural constraints to further improve recognition performance. Experimental results on standard transfer-learning image recognition datasets are competitive with previous work. Further experimentation and analyses of these methods provided a better understanding of machine learning as well.
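A minimal sketch of the class-prototype idea from the few-shot setting above: prototypes are mean embeddings of each class's support samples, and a query is assigned to the nearest prototype. Pure NumPy with illustrative names; the thesis's actual prior is a learned network, not this nearest-prototype baseline:

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Mean embedding per class from the few labeled support samples (NumPy arrays)."""
    classes = np.unique(labels)
    return classes, np.stack([embeddings[labels == c].mean(axis=0) for c in classes])

def nearest_prototype(query, prototypes, classes):
    """Assign a query embedding to the class of the closest prototype."""
    dists = np.linalg.norm(prototypes - query, axis=1)
    return classes[np.argmin(dists)]
```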
163

Knowledge transfer for image understanding

Kulkarni, Praveen 23 January 2017
Knowledge transfer is a promising solution for the difficult problem of training deep convolutional neural nets (CNNs) using only small training datasets with high intra-class visual variability. In this thesis work, we explore this paradigm to extend the ability of state-of-the-art CNNs for image classification. First, we propose several effective techniques to reduce the training and test-time computational burden associated with CNNs: (i) using a hybrid method to combine conventional, unsupervised aggregators such as Bag-of-Words (BoW) with CNNs; (ii) introducing novel pooling methods within a CNN framework along with non-linear part-based models. The key contribution lies in a technique able to discover the useful regions per image involved in the pooling of local representations. In addition, we also propose a novel method to learn the structure of weights in deep neural networks. Experiments are run on challenging datasets with comparisons against state-of-the-art methods. The proposed methods are shown to generalize to different visual recognition tasks, such as object, scene, or action classification.
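One plausible reading of the hybrid aggregation idea, sketched in NumPy: L2-normalize a Bag-of-Words histogram and a CNN feature vector, then concatenate them into a single image descriptor. This is an illustrative simplification, not the thesis's actual pipeline:

```python
import numpy as np

def hybrid_descriptor(bow_histogram, cnn_features):
    """Concatenate a BoW histogram with CNN features after L2-normalizing each part."""
    bow = bow_histogram / (np.linalg.norm(bow_histogram) + 1e-12)
    cnn = cnn_features / (np.linalg.norm(cnn_features) + 1e-12)
    return np.concatenate([bow, cnn])
```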
164

Implications of Conversational AI on Humanoid Robots

Soudamalla, Sharath Kumar 09 October 2020
Humanizing Technologies GmbH develops intelligent software for the humanoid robots from SoftBank Robotics. The main objective of this thesis is to develop and deploy conversational artificial intelligence software for humanoid robots using deep learning techniques. Developing conversational agents using machine learning or artificial intelligence is an intriguing problem in natural language processing, and considerable research and experimentation is being conducted in this area. Currently, most chatbots are developed with rule-based programming and cannot hold a conversation that replicates real human interaction. This thesis addresses the issue by developing a deep-learning conversational AI based on sequence-to-sequence models, attention mechanisms, transfer learning, active learning, and beam-search decoding, which emulates human-like conversation. The complete end-to-end conversational AI software is designed, implemented, and deployed in this thesis work according to the conceptual specifications. The research objectives are successfully accomplished, and the results of the proposed concept are discussed in detail.
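A minimal sketch of the beam-search decoding mentioned above, in plain Python: at each step the decoder keeps only the `beam_width` highest-scoring partial sequences. The `step_fn` interface is an illustrative assumption, not the thesis's implementation:

```python
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Keep the beam_width most probable partial sequences at each decoding step.

    step_fn(sequence) -> list of (token, probability) candidates for the next token.
    """
    beams = [([start_token], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == end_token:
                candidates.append((seq, score))  # finished beams carry over unchanged
                continue
            for token, prob in step_fn(seq):
                candidates.append((seq + [token], score + math.log(prob)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]  # best-scoring sequence
```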
165

Improving Biometric Log Detection with Partitioning and Filtering of the Search Space

Rajabli, Nijat January 2021
Tracking tree logs from the harvesting site to the processing site is a legal requirement for timber-based industries, for social and economic reasons. Biometric tree-log detection systems use images of the logs to track them by checking whether a given log image matches any of the logs registered in the system. However, as the number of registered logs in the database increases, the number of pairwise comparisons, and consequently the search time, increases proportionally. A growing search space degrades the accuracy and the response time of matching queries and slows down the tracking process, costing time and resources. This work introduces database filtering and partitioning approaches based on discriminative log-end features to reduce the search space of biometric log identification algorithms. In this study, 252 unique log images are used to train and test models for extracting features from the log images and to filter and cluster a database of logs. Experiments are carried out to show the end-to-end accuracy and speed-up impact of the individual approaches as well as combinations thereof. The findings indicate that the proposed approaches are suited to speeding up tree log identification systems, and highlight further opportunities in this field.
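A minimal sketch of the partitioning idea, assuming scikit-learn and NumPy: cluster the gallery of log-end features offline, then compare a query only against the logs in its own cluster. The names and the choice of k-means are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_partitions(gallery_features, n_partitions=8):
    """Cluster the log database offline so each query searches only one partition."""
    return KMeans(n_clusters=n_partitions, n_init=10).fit(gallery_features)

def identify(query, km, gallery_features, gallery_ids):
    """Match a query log-end feature against its cluster only (gallery_ids: NumPy array)."""
    cluster = km.predict(query[None, :])[0]
    mask = km.labels_ == cluster
    dists = np.linalg.norm(gallery_features[mask] - query, axis=1)
    return gallery_ids[mask][np.argmin(dists)]
```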
166

Sentiment Analysis of Financial News with Supervised Learning

Syeda, Farha Shazmeen January 2020
Financial data in banks are unstructured and complicated, and analyzing these texts manually is challenging because of the small amount of labeled training data for financial text. Moreover, financial text uses the language of the economic domain, where a general-purpose model is not efficient. In this thesis, data was collected from MFN (Modular Finance) financial news, scraped and persisted in a database, and price indices were collected from a Bloomberg terminal. A comprehensive study and tests are conducted to find state-of-the-art results for classifying the sentiments, using traditional classifiers like Naive Bayes and transfer learning models like BERT and FinBERT. FinBERT outperforms the Naive Bayes and BERT classifiers. Time-series indices for the sentiments are built, and their correlations with the price indices are calculated using Pearson correlation. The Augmented Dickey-Fuller (ADF) test is used to check whether both time series are stationary. Finally, a Granger causality hypothesis test determines whether the sentiment time series helps predict price. The results show a significant correlation and causal relation between sentiments and price.
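A minimal sketch of the statistical pipeline described above, assuming pandas and statsmodels; the series names are illustrative:

```python
import pandas as pd
from statsmodels.tsa.stattools import adfuller, grangercausalitytests

def sentiment_price_analysis(sentiment: pd.Series, price: pd.Series, max_lag=5):
    """ADF stationarity checks, Pearson correlation, and a Granger causality test."""
    for name, series in [("sentiment", sentiment), ("price", price)]:
        # A small ADF p-value suggests the series is stationary; otherwise difference it.
        print(f"ADF p-value for {name}: {adfuller(series.dropna())[1]:.4f}")
    print("Pearson correlation:", sentiment.corr(price))
    # Tests whether lagged values of the second column help predict the first.
    data = pd.concat([price, sentiment], axis=1).dropna()
    grangercausalitytests(data, maxlag=max_lag)
```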
167

Detecting gastrointestinal abnormalities with binary classification of the Kvasir-Capsule dataset : A TensorFlow deep learning study

Hollstensson, Mathias January 2022
The early discovery of gastrointestinal (GI) disorders can significantly decrease the fatality rate of severe afflictions. Video capsule endoscopy (VCE) is a technique that produces an eight-hour recording of the GI tract, which needs to be manually reviewed. This has led to demand for AI-based solutions, but the lack of labeled data has been a major obstacle. In 2020 the Kvasir-Capsule dataset was produced; it is the largest labeled dataset of GI abnormalities to date, but challenges still exist. The dataset suffers from unbalanced and very similar data created from labeled video frames. To avoid specialization to the specific data, the creators of the dataset constructed an official split whose use for testing is encouraged. This study evaluates the use of transfer learning, data augmentation, and binary classification to detect GI abnormalities. The performance of machine learning (ML) classification is explored with and without official split-based testing. The performance evaluation focuses specifically on achieving a low rate of false negatives, on the proposition that the most important aspect of an automated detection system for GI abnormalities is a low miss rate of possibly lethal abnormalities. The results from the controlled experiments conducted in this study clearly show the importance of using official split-based testing: the difference in performance between a model trained and tested on the same set and a model that uses official split-based testing is significant. Without official split-based testing, the model will not produce reliable and generalizable results. When using official split-based testing, the performance improves over the initial baseline presented with the Kvasir-Capsule set. Some experiments in the study achieved a false-negative rate as low as 1.56%, but at the cost of lowered performance on the normal class.
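A minimal sketch of the false-negative-oriented evaluation, in NumPy: sweep the decision threshold of a binary classifier and keep the highest threshold whose false-negative rate stays under a target. This is an illustrative reading of the trade-off described above, not the thesis's exact procedure:

```python
import numpy as np

def pick_threshold(probs, labels, max_fn_rate=0.02):
    """Highest decision threshold whose false-negative rate stays below max_fn_rate.

    probs: predicted probability of 'abnormal'; labels: 1 = abnormal, 0 = normal.
    """
    best = 0.0
    for t in np.linspace(0.0, 1.0, 101):
        preds = (probs >= t).astype(int)
        fn = np.sum((preds == 0) & (labels == 1))
        fn_rate = fn / max(np.sum(labels == 1), 1)
        if fn_rate <= max_fn_rate:
            best = t  # higher thresholds cut false positives on the normal class
    return best
```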
168

Measuring Porosity in Ceramic Coating using Convolutional Neural Networks and Semantic Segmentation

Isaksson, Filip January 2022
Ceramic materials contain several defects, one of which is porosity. At the time of writing, porosity measurement is a manual and time-consuming process performed by a human operator. With advances in deep learning for computer vision, this thesis explores to what degree convolutional neural networks and semantic segmentation can reliably measure porosity from microscope images. Combining classical image processing techniques with deep learning, images were automatically labeled and then used for training semantic segmentation neural networks leveraging transfer learning. Deep learning-based methods were more robust and could more reliably identify porosity in a larger variety of images than solely relying on classical image processing techniques.
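A minimal sketch of the measurement step implied above: once a segmentation network labels each pixel, porosity reduces to the fraction of pixels in the pore class. NumPy, with an illustrative class-id convention:

```python
import numpy as np

def porosity_percentage(mask: np.ndarray, pore_class: int = 1) -> float:
    """Porosity as the percentage of pixels the segmentation model labeled as pore.

    mask: 2D array of predicted class ids per pixel (pore_class marks pores).
    """
    return 100.0 * float(np.mean(mask == pore_class))
```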
169

OSPREY: Person Re-Identification in the sport of Padel : Utilizing One-Shot Person Re-identification with locally aware transformers to improve tracking

Svensson, Måns, Hult, Jim January 2022
This thesis is concerned with the topic of person re-identification. Many tracking algorithms today cannot keep track of players who re-enter the scene from different angles and at different times. Therefore, this thesis explores the current literature on the topic and tests a current state-of-the-art model. The person re-identification techniques are applied to Padel games due to the collaboration with PadelPlay AB. The purpose of the thesis is to keep track of players during full matches of Padel with correct identities. To this end, a current state-of-the-art model is applied to an existing tracking algorithm to enhance its capabilities. The purpose is broken down into two research questions: first, how well does an existing person re-identification model perform on Padel matches when it comes to keeping a consistent and accurate identity for each player? Second, how can this model be improved to perform better in the new domain of Padel? To answer these questions, a Padel dataset is created for benchmarking purposes. The state-of-the-art model is tested on the new dataset to see how it handles a new domain, and the same model is retrained on the Padel dataset to answer the second research question. The results show that the state-of-the-art model previously trained on the Market-1501 dataset generalizes well to the Padel dataset and performs close to the new model trained purely on the Padel dataset. Although they perform alike, the model trained on the Padel dataset is slightly better, as seen through both the quantitative and qualitative evaluations. Furthermore, applying re-identification technology to keep track of players yielded significantly better results than conventional solutions such as YOLOv5 with DeepSORT.
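A minimal sketch of the re-identification step described above, in NumPy: compare a detected player's embedding against a gallery of known identities by cosine similarity, treating low-similarity detections as new identities. The names and threshold are illustrative assumptions:

```python
import numpy as np

def reidentify(query_emb, gallery_embs, gallery_ids, threshold=0.7):
    """Match a detected player against known identities by cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q
    best = int(np.argmax(sims))
    # Below the threshold, treat the detection as a previously unseen identity.
    return gallery_ids[best] if sims[best] >= threshold else None
```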
170

Multi-task regression QSAR/QSPR prediction utilizing text-based Transformer Neural Network and single-task using feature-based models

Dimitriadis, Spyridon January 2021
With the recent advances of machine learning in cheminformatics, the drug discovery process has been accelerated, providing a high impact in the field of medicine and public health. Molecular property and activity prediction are key elements in the early stages of drug discovery, helping prioritize the experiments and reduce the experimental work. In this thesis, a novel approach for multi-task regression using a text-based Transformer model is introduced and thoroughly explored for training on a number of properties or activities simultaneously. This multi-task regression with a Transformer-based model is inspired by the field of Natural Language Processing (NLP) and uses prefix tokens to distinguish between tasks. To investigate our architecture, two data categories are used: 133 biological activities from the ExCAPE database and three physical chemistry properties from the MoleculeNet benchmark datasets. The Transformer model consists of an embedding layer with positional encoding, a number of encoder layers, and a feedforward neural network (FNN) that turns it into a regression problem. The molecules are represented as strings of characters using the Simplified Molecular-Input Line-Entry System (SMILES), which is a 'chemistry language' with its own syntax. In addition, the effect of transfer learning is explored by experimenting with two pretrained Transformer models, pretrained on 1.5 million and on 100 million molecules. The text-based Transformer models are compared with a feature-based Support Vector Regression (SVR) with the Tanimoto kernel, where the input molecules are encoded as Extended Connectivity Fingerprints (ECFP), which are calculated features. The results show that transfer learning is crucial for improving the performance on both property and activity predictions. On bioactivity tasks, the larger Transformer pretrained on 100 million molecules achieved performance comparable to the feature-based SVR model; however, overall SVR performed better on the majority of the bioactivity tasks. On the other hand, on physicochemistry property tasks, the larger pretrained Transformer outperformed SVR on all three tasks. In conclusion, the multi-task regression architecture with the prefix token had comparable performance with the traditional feature-based approach for predicting different molecular properties or activities. Lastly, using the larger pretrained models trained on a wide chemical space can play a key role in improving the performance of Transformer models on these tasks.
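A minimal sketch of the prefix-token mechanism described above, in plain Python: prepending a task-specific token to the SMILES string lets a single Transformer condition its regression output on the task. The token format is an illustrative assumption:

```python
def encode_for_task(smiles: str, task: str) -> str:
    """Prepend a task-specific prefix token so one Transformer serves many regression tasks."""
    return f"<{task}> {smiles}"

# e.g. encode_for_task("CCO", "logP") -> "<logP> CCO"
```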
