Global ETD Search

101	Multilingual Neural Machine Translation for Low Resource Languages Lakew, Surafel Melaku 20 April 2020
Machine Translation (MT) is the task of mapping a source language to a target language. The recent introduction of neural MT (NMT) has shown promising results for high-resource languages but performs poorly in low-resource language (LRL) settings. Furthermore, the vast majority of the 7,000+ languages around the world do not have parallel data, creating a zero-resource language (ZRL) scenario. In this thesis, we present our approach to improving NMT for LRLs and ZRLs, leveraging multilingual NMT (M-NMT) modeling, an approach that allows building a single NMT model to translate across multiple source and target languages. The thesis i) analyzes the effectiveness of M-NMT for LRL and ZRL translation tasks, spanning two NMT modeling architectures (Recurrent and Transformer), ii) presents a self-learning approach for improving the zero-shot translation directions of ZRLs, iii) proposes a dynamic transfer-learning approach from a pre-trained (parent) model to an LRL (child) model by tailoring to the vocabulary entries of the latter, iv) extends M-NMT to translate from a source language to specific language varieties (e.g. dialects), and finally, v) proposes an approach that can control the verbosity of an NMT model output. Our experimental findings show the effectiveness of the proposed approaches in improving NMT of LRLs and ZRLs.
102	Low-Resource Natural Language Understanding in Task-Oriented Dialogue Louvan, Samuel 11 March 2022
Task-oriented dialogue (ToD) systems need to interpret the user's input to understand the user's needs (intent) and the corresponding relevant information (slots). This process is performed by a Natural Language Understanding (NLU) component, which maps the text utterance into a semantic frame representation and involves two subtasks: intent classification (text classification) and slot filling (sequence tagging). Typically, new domains and languages are regularly added to the system to support more functionalities. Collecting domain-specific data and performing fine-grained annotation of large amounts of data every time a new domain or language is introduced can be expensive. Thus, developing an NLU model that generalizes well across domains and languages with less labeled data (low-resource) is crucial and remains challenging. This thesis focuses on investigating transfer learning and data augmentation methods for low-resource NLU in ToD. Our first contribution is a study of the potential of non-conversational text as a source for transfer. Most transfer learning approaches assume labeled conversational data as the source task and adapt the NLU model to the target task. We show that leveraging similar tasks from non-conversational text improves performance on target slot filling tasks through multi-task learning in low-resource settings. Second, we propose a set of lightweight augmentation methods that apply data transformations at the token and sentence levels through slot value substitution and syntactic manipulation. Despite their simplicity, these methods perform comparably to deep learning-based augmentation models and are effective on NLU tasks in six languages. Third, we investigate the effectiveness of domain adaptive pre-training for zero-shot cross-lingual NLU.
In terms of overall performance, continued pre-training in English is effective across languages. This result indicates that the domain knowledge learned in English is transferable to other languages. In addition to that, domain similarity is essential. We show that intermediate pre-training data that is more similar – in terms of data distribution – to the target dataset yields better performance.
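The slot value substitution mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `substitute_slot_values`, the BIO tag scheme, and the toy utterance and slot dictionary are all assumptions made for illustration. The sketch replaces each labeled slot span with a different value of the same slot type and rebuilds the tags so they stay aligned with the tokens.

```python
import random


def substitute_slot_values(tokens, tags, slot_values, rng):
    """Replace every BIO-tagged slot span with another value of the same slot.

    Hypothetical helper for illustration only. `slot_values` maps a slot name
    to a list of candidate surface strings.
    """
    new_tokens, new_tags = [], []
    i = 0
    while i < len(tokens):
        tag = tags[i]
        if tag.startswith("B-"):
            slot = tag[2:]
            # Find the end of the span that starts at position i.
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + slot:
                j += 1
            original = " ".join(tokens[i:j])
            # Prefer a value that differs from the original one.
            candidates = [v for v in slot_values.get(slot, []) if v != original]
            replacement = rng.choice(candidates) if candidates else original
            words = replacement.split()
            new_tokens.extend(words)
            new_tags.extend(["B-" + slot] + ["I-" + slot] * (len(words) - 1))
            i = j
        else:
            # Tokens outside a slot span are copied unchanged.
            new_tokens.append(tokens[i])
            new_tags.append(tag)
            i += 1
    return new_tokens, new_tags


# Toy example (assumed data, not from the thesis).
tokens = "book a flight to new york".split()
tags = ["O", "O", "O", "O", "B-city", "I-city"]
slot_values = {"city": ["new york", "boston"]}

aug_tokens, aug_tags = substitute_slot_values(tokens, tags, slot_values, random.Random(0))
print(aug_tokens)  # ['book', 'a', 'flight', 'to', 'boston']
print(aug_tags)    # ['O', 'O', 'O', 'O', 'B-city']
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which matters when comparing augmented and non-augmented training runs.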
103	Improving Machine Learning Models through Transfer Learning for Time Series Forecasting in Radial-Axial Ring Rolling (original title: Verbesserung von maschinellen Lernmodellen durch Transferlernen zur Zeitreihenprognose im Radial-Axial Ringwalzen) Seitz, Johannes, Wang, Qinwen, Moser, Tobias, Brosius, Alexander, Kuhlenkötter, Bernd 28 November 2023
The application of machine learning (ML) methods in production engineering has risen sharply in the era of Industry 4.0. Data availability in particular is fundamental here and is a prerequisite for the successful implementation of an ML application. If the quantity or quality of the data is insufficient for a given problem, techniques such as data augmentation, the use of synthetic data, and transfer learning from similar datasets can help. In this work, the concept of transfer learning is applied to radial-axial ring rolling (German abbreviation: RAW) and demonstrated on the time series forecasting of the outer diameter over the process time. Radial-axial ring rolling is a hot forming process used for the production of seamless rings.
104	Deep Transferable Intelligence for Wearable Big Data Pattern Detection Gangadharan, Kiirthanaa 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Biomechanical Big Data is of great significance to precision health applications, among which we take special interest in Physical Activity Detection (PAD). In this study, we have performed extensive research on deep learning-based PAD from biomechanical big data, focusing on the challenges raised by the need for real-time edge inference. First, considering that there are many places where the motion sensors can be placed, we have thoroughly compared and analyzed the effect of sensor location on deep learning-based PAD performance. We have further compared the differences among six sensor channels (3-axis accelerometer and 3-axis gyroscope). Second, we have selected the optimal sensor and the optimal sensor channel, which not only provides sensor usage suggestions but also enables ultra-low-power applications on the edge. Third, we have investigated innovative methods to minimize the training effort of the deep learning model, leveraging the transfer learning strategy. More specifically, we propose to pre-train a transferable deep learning model using the data from other subjects and then fine-tune the model using limited data from the target user. In this way, we have found that, for the single-channel case, transfer learning can effectively increase the deep model's performance even when the fine-tuning effort is very small. This research, demonstrated by comprehensive experimental evaluation, has shown the potential of ultra-low-power PAD with a minimized sensor stream and minimized training effort. / 2023-06-01
Wearable Computer Machine learning Deep learning Convolutional Neural Network Transfer learning Performance analysis Accuracy ranking
105	Deep Learning-based Domain Adaptation Methodology for Fault Diagnosis of Complex Manufacturing Systems Azamfar, Moslem 28 June 2021
No description available.
Engineering transfer learning fault diagnosis deep learning gearbox ball screw semiconductor
106	Sharing to learn and learning to share : Fitting together metalearning and multi-task learning Upadhyay, Richa January 2023
This thesis focuses on integrating learning paradigms that ‘share to learn,’ i.e., Multitask Learning (MTL), and ‘learn (how) to share,’ i.e., meta learning. MTL involves learning several tasks simultaneously within a shared network structure so that the tasks can mutually benefit each other’s learning. Meta learning, better known as ‘learning to learn,’ is an approach to reducing the amount of time and computation required to learn a novel task by leveraging knowledge accumulated over the course of numerous training episodes of various tasks. The learning process in the human brain is innate and natural. Even before birth, it is capable of learning and memorizing. As a consequence, humans do not learn everything from scratch, and because they are naturally capable of effortlessly transferring their knowledge between tasks, they quickly learn new skills. Humans naturally tend to believe that similar tasks have (somewhat) similar solutions or approaches, so sharing knowledge from a previous activity makes it feasible to learn a new task quickly in a few tries. For instance, the skills acquired while learning to ride a bike are helpful when learning to ride a motorbike, which is, in turn, helpful when learning to drive a car. This natural learning process, which involves sharing information between tasks, has inspired a few research areas in Deep Learning (DL), such as transfer learning, MTL, meta learning, Lifelong Learning (LL), and many more, to create similar neurally-weighted algorithms. These information-sharing algorithms exploit the knowledge gained from one task to improve the performance of another related task. However, they vary in terms of what information they share, when to share, and why to share.
This thesis focuses particularly on MTL and meta learning, and presents a comprehensive explanation of both learning paradigms. A theoretical comparison of both algorithms demonstrates that the strengths of one can outweigh the constraints of the other. Therefore, this work aims to combine MTL and meta learning to attain the best of both worlds. The main contribution of this thesis is Multi-task Meta Learning (MTML), an integration of MTL and meta learning. As gradient-based (or optimization-based) meta learning follows an episodic approach to train a network, we propose multi-task learning episodes to train an MTML network in this work. The basic idea is to train a multi-task model using bi-level meta-optimization so that when a new task is added, it can learn in fewer steps and perform at least as well as traditional single-task learning on the new task. The MTML paradigm is demonstrated on two publicly available datasets, NYU-v2 and Taskonomy, for which four tasks are considered, i.e., semantic segmentation, depth estimation, surface normal estimation, and edge detection. This work presents a comparative empirical analysis of MTML against single-task and multi-task learning, where it is evident that MTML excels for most tasks. The future direction of this work includes developing efficient and autonomous MTL architectures by exploiting the concepts of meta learning. The main goal will be to create a task-adaptive MTL, where meta learning may learn to select layers (or features) from the shared structure for every task because not all tasks require the same high-level, fine-grained features from the shared network. This can be seen as another way of combining MTL and meta learning. It will also introduce modular learning in the multi-task architecture. Furthermore, this work can be extended to include multi-modal multi-task learning, which will help to study the contributions of each input modality to various tasks.
Multi-task learning Meta learning transfer learning knowledge sharing algorithms Computer Systems
107	Generalization and Automation of Machine Learning-Based Intelligent Fault Classification for Rotating Machinery Larocque-Villiers, Justin 29 January 2024
This thesis leverages vibration-based unsupervised learning and deep transfer learning to reduce the manual labour involved in building algorithms that perform intelligent fault detection (IFD) on roller element bearings. A review of theory and literature in the field of IFD is presented, and challenges are discussed. An issue is then introduced: current machine learning models built for IFD show strong performance on a small subset of specific data, but do not generalize to a broader range of applications. Signal processing, machine learning, and transfer learning concepts are then explained and discussed. Time-frequency fingerprinting, as well as feature engineering, is used in conjunction with principal component analysis (PCA) to prepare vibration signals to be clustered by a Gaussian mixture model (GMM). This process allows for the intelligent referral of data towards algorithms that have performed well on similar datasets and favours the re-use of domain-specific tasks. An algorithm is then proposed that promotes generalization in convolutional neural networks (CNNs) and simplifies the hyperparameter tuning process to allow machine learning models to be applied to a broader set of problems. The machine learning process is then automated as much as possible through meta learning and ensemble models: data similarity measurements are used to evaluate the data fit for transfer and propose training guidelines. Throughout the thesis, three open-source bearing fault datasets are used to test and validate the hypotheses. This thesis focuses on developing and adapting current deep learning models to succeed in challenging domains and real-world scenarios, while improving performance with unsupervised learning and transfer learning.
Predictive Maintenance Machine Learning Transfer Learning Sensors Convolutional Neural Networks Signal Processing
108	Knowledge Extraction from Biomedical Literature with Symbolic and Deep Transfer Learning Methods Ramponi, Alan 30 June 2021
The available body of biomedical literature is increasing at a high pace, exceeding the ability of researchers to promptly leverage this knowledge-rich body of information. Despite the outstanding progress in natural language processing (NLP) observed in the past few years, current technological advances in the field mainly concern newswire and web texts, and do not directly translate into good performance on highly specialized domains such as biomedicine, due to linguistic variations at the surface, syntactic and semantic levels. Given the advances in NLP, the challenges the biomedical domain exhibits, and the explosive growth of biomedical knowledge currently being published, in this thesis we contribute to the biomedical NLP field by providing efficient means for extracting semantic relational information from biomedical literature texts. To this end, we make the following contributions towards the real-world adoption of knowledge extraction methods to support biomedicine: (i) we propose a symbolic high-precision biomedical relation extraction approach to reduce the time-consuming manual curation of extracted relational evidence (Chapter 3), (ii) we conduct a thorough cross-domain study to quantify the drop in performance of deep learning methods for biomedical edge detection, shedding light on the importance of linguistic varieties in biomedicine (Chapter 4), and (iii) we propose a fast and accurate end-to-end solution for biomedical event extraction, leveraging sequential transfer learning and multi-task learning, making it a viable approach for real-world large-scale scenarios (Chapter 5). We then outline the conclusions by highlighting challenges and providing future research directions in the field. Settore INF/01 - Informatica
109	Transfer Learning and Attention Mechanisms in a Multimodal Setting Greco, Claudio 13 May 2022
Humans are able to develop a solid knowledge of the world around them: they can leverage information coming from different sources (e.g., language, vision), focus on the most relevant information from the input they receive in a given life situation, and exploit what they have learned before without forgetting it. In the field of Artificial Intelligence and Computational Linguistics, replicating these human abilities in artificial models is a major challenge. Recently, models based on pre-training and on attention mechanisms, namely pre-trained multimodal Transformers, have been developed. They seem to perform tasks surprisingly well compared to other computational models in multiple contexts. They simulate a human-like cognition in that they supposedly rely on previously acquired knowledge (transfer learning) and focus on the most important information (attention mechanisms) of the input. Nevertheless, we still do not know whether these models can deal with multimodal tasks that require merging different types of information simultaneously to be solved, as humans would do. This thesis attempts to fill this crucial gap in our knowledge of multimodal models by investigating the ability of pre-trained Transformers to encode multimodal information, and the ability of attention-based models to remember how to deal with previously solved tasks. With regard to pre-trained Transformers, we focused on their ability to rely on pre-training and on attention while dealing with tasks that require merging information coming from language and vision.
More precisely, we investigate whether pre-trained multimodal Transformers are able to understand the internal structure of a dialogue (e.g., organization of the turns); to effectively solve complex spatial questions that require processing different spatial elements (e.g., regions of the image, proximity between elements, etc.); and to make predictions based on complementary multimodal cues (e.g., guessing the most plausible action by leveraging the content of a sentence and of an image). The results of this thesis indicate that pre-trained Transformers outperform other models. Indeed, they are able to some extent to integrate complementary multimodal information; they manage to pinpoint both the relevant turns in a dialogue and the most important regions in an image. These results suggest that pre-training and attention play a key role in pre-trained Transformers’ encoding. Nevertheless, their way of processing information cannot be considered human-like. Indeed, when compared to humans, they struggle (as non-pre-trained models do) to understand negative answers, to merge spatial information in difficult questions, and to predict actions based on complementary linguistic and visual cues. With regard to attention-based models, we found that these kinds of models tend to forget what they have learned in previously solved tasks. However, training these models on easy tasks before more complex ones seems to mitigate this catastrophic forgetting phenomenon. These results indicate that, at least in this context, attention-based models (and, supposedly, pre-trained Transformers too) are sensitive to task order. Better control of this variable may therefore help multimodal models learn sequentially and continuously as humans do. Settore INF/01 - Informatica
110	Learning Transferable Features for Diagnosis of Breast Cancer from Histopathological Images Al Zorgani, Maisun M., Irfan, Mehmood, Ugail, Hassan 25 March 2022
No / Nowadays, there is no argument that deep learning algorithms provide impressive results in many applications of medical image analysis. However, the data scarcity problem and its consequences are challenges in implementing deep learning for the digital histopathology domain. Deep transfer learning is one of the possible solutions to these challenges. Extracting off-the-shelf features from pre-trained convolutional neural networks (CNNs) is one of the common deep transfer learning approaches. The architecture of the deep CNN plays a significant role in the choice of the optimal transferable features to adopt for classifying cancerous histopathological images. In this study, we have investigated three CNNs pre-trained on the ImageNet dataset (ResNet-50, DenseNet-201 and ShuffleNet) for classifying the Breast Cancer Histopathology (BACH) Challenge 2018 dataset. The deep features extracted from these three models were used to train two machine learning classifiers, namely K-Nearest Neighbour (KNN) and Support Vector Machine (SVM), to classify the breast cancer grades. Four grades of breast cancer are represented in the BACH challenge dataset: normal tissue, benign tumour, in-situ carcinoma and invasive carcinoma. The performance of the target classifiers was evaluated. Our experimental results showed that the off-the-shelf features extracted from the DenseNet-201 model provide the best predictive accuracy with both the SVM and KNN classifiers, yielding image-wise classification accuracies of 93.75% and 88.75% for SVM and KNN, respectively. These results indicate the high robustness of our proposed framework.
Breast cancer Deep transfer learning Machine learning classifier Histopathological image classification
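As a rough illustration of the second stage described in the abstract above (training a simple classifier on features taken from a pre-trained CNN), the sketch below implements a plain K-Nearest Neighbour vote in standard-library Python. It is not the study's code: the two-dimensional vectors are toy placeholders standing in for extracted CNN features, the labels reuse two of the four BACH grade names only for readability, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors.

    Hypothetical helper for illustration; uses Euclidean distance.
    """
    # Pair each training vector's distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest vectors and return the most common.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy placeholder "features" (assumed data, not real CNN activations).
train_features = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
train_labels = ["normal", "normal", "invasive", "invasive"]

print(knn_predict(train_features, train_labels, (0.05, 0.05)))  # normal
print(knn_predict(train_features, train_labels, (0.95, 0.95)))  # invasive
```

In a real pipeline the feature vectors would come from a fixed, pre-trained network applied to each image, so only this lightweight classifier needs to be fitted on the small target dataset.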
