11 |
Méthodes d’ensembles pour l’apprentissage multi-tâche avec des tâches hétérogènes et sans restrictions / Ensemble Methods to Learn Multiple Heterogenous Tasks without Restrictions. Faddoul, Jean-Baptiste, 18 June 2012 (has links)
Learning multiple related tasks jointly by exploiting their underlying shared knowledge can improve the predictive performance on every task compared to learning them individually. In this thesis, we address the problem of multi-task learning (MTL) when the tasks are heterogeneous: they do not share the same label sets (possibly even with different numbers of labels), and they do not require shared examples.
In addition, no prior assumption about the relatedness pattern between tasks is made. Our contribution to multi-task learning lies in the framework of ensemble learning, where the learned function consists of an ensemble of "weak" hypotheses aggregated together by an ensemble learning algorithm (Boosting, Bagging, etc.). We propose two approaches to cope with heterogeneous tasks without making prior assumptions about the relatedness patterns. For each approach, we devise novel multi-task weak hypotheses along with their learning algorithms, and then adapt a boosting algorithm to the multi-task setting. In the first approach, the weak classifiers we consider are 2-level decision stumps for different tasks. A weak classifier assigns a class to each instance on two tasks and abstains on the other tasks. These weak classifiers make it possible to handle dependencies between tasks on the instance space. We introduce different efficient weak learners. We then consider Adaboost with weak classifiers that can abstain, and adapt it to multi-task learning. In an empirical study, we compare the weak learners and study the influence of the number of boosting rounds. In the second approach, we develop the multi-task Adaboost environment with Multi-Task Decision Trees as weak classifiers. We first adapt the well-known decision tree learning algorithm to the multi-task setting, revising the information gain rule for learning decision trees in this setting. We use this feature to develop a novel criterion for learning Multi-Task Decision Trees. The criterion guides the tree construction by learning the decision rules from data of different tasks, representing different degrees of task relatedness. We then modify MT-Adaboost to combine Multi-Task Decision Trees as weak learners. We experimentally validate the advantage of our approaches, reporting results of experiments conducted on several multi-task datasets, including the Enron email set and the Spam Filtering collection.
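As a rough, illustrative sketch of the boosting-with-abstention mechanism this abstract describes (not the thesis's 2-level multi-task stumps: this is a single-task simplification, and all function names are ours), consider one-sided decision stumps that predict on one side of a threshold and abstain elsewhere, combined with confidence-rated AdaBoost-style weights:

```python
import numpy as np

def stump_predict(X, j, thr, side, label):
    """One-sided stump: predicts `label` on its side of the threshold, abstains (0) elsewhere."""
    covered = X[:, j] > thr if side == '>' else X[:, j] <= thr
    return np.where(covered, label, 0.0)

def fit_abstaining_stumps(X, y, n_rounds=5, eps=1e-6):
    """Boost one-sided stumps that may abstain. X: (n, d), y in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            values = np.unique(X[:, j])
            for thr in (values[:-1] + values[1:]) / 2.0:
                for side in ('>', '<='):
                    covered = X[:, j] > thr if side == '>' else X[:, j] <= thr
                    for label in (-1.0, 1.0):
                        w_correct = w[covered & (y == label)].sum()
                        w_wrong = w[covered & (y != label)].sum()
                        score = w_correct - w_wrong
                        if best is None or score > best[0]:
                            best = (score, j, thr, side, label, w_correct, w_wrong)
        _, j, thr, side, label, wc, ww = best
        alpha = 0.5 * np.log((wc + eps) / (ww + eps))  # confidence-rated weight
        ensemble.append((j, thr, side, label, alpha))
        h = stump_predict(X, j, thr, side, label)      # values in {-1, 0, +1}
        w *= np.exp(-alpha * y * h)                    # abstentions (h = 0) leave weights unchanged
        w /= w.sum()
    return ensemble

def predict(ensemble, X):
    F = sum(alpha * stump_predict(X, j, thr, side, label)
            for j, thr, side, label, alpha in ensemble)
    return np.where(F >= 0, 1.0, -1.0)
```

Abstaining predictions leave the corresponding sample weights untouched in the update, which is what lets a weak hypothesis specialize on part of the space (or, in the thesis's setting, on a pair of tasks).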
12 |
A Unified Generative and Discriminative Approach to Automatic Chord Estimation for Music Audio Signals / 音楽音響信号に対する自動コード推定のための生成・識別統合的アプローチ. Wu, Yiming, 24 September 2021 (has links)
Kyoto University / Doctoral degree by coursework (new system) / Doctor of Informatics / Degree No. 23540 / Informatics No. 770 / 新制||情||131 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / Committee: Assoc. Prof. Kazuyoshi Yoshii (chair), Prof. Tatsuya Kawahara, Prof. Ko Nishino, Prof. Hisashi Kashima / Meets Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
13 |
Multi-Task Learning SegNet Architecture for Semantic Segmentation. Sorg, Bradley R., January 2018 (has links)
No description available.
14 |
Incorporating Meta Information for Speech Recognition of Low-resource Language / 低資源言語の音声認識のためのメタ情報の活用. SOKY, KAK, 23 March 2023 (has links)
Kyoto University / Doctoral degree by coursework (new system) / Doctor of Informatics / Degree No. 24729 / Informatics No. 817 / 新制||情||137 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / Committee: Prof. Tatsuya Kawahara (chair), Prof. Sadao Kurohashi, Prof. Shinsuke Mori / Meets Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
15 |
On Kernel-based Multi-Task Learning. Li, Cong, 01 January 2014 (has links)
Multi-Task Learning (MTL) has been an active research area in machine learning for two decades. By training multiple relevant tasks simultaneously with information shared across tasks, it is possible to improve the generalization performance of each task, compared to training each task independently. During the past decade, most MTL research has been based on the Regularization-Loss framework, due to its flexibility in specifying various types of information-sharing strategies, the opportunity it offers to yield kernel-based methods, and its capability to promote sparse feature representations. However, certain limitations exist in both the theoretical and practical aspects of Regularization-Loss-based MTL. Theoretically, previous research on generalization bounds for MTL Hypothesis Spaces (HSs), where the data of all tasks are pre-processed by a (partially) common operator, has been limited in two respects. First, all previous works assumed linearity of the operator, thereby completely excluding kernel-based MTL HSs, for which the operator is potentially non-linear. Secondly, all previous works, rather unnecessarily, assumed that all task weights are constrained within norm-balls of equal radii. The requirement of equal radii leads to significant inflexibility of the relevant HSs, which may cause the generalization performance of the corresponding MTL models to deteriorate. Practically, various algorithms have been developed for kernel-based MTL models, owing to the different characteristics of the formulations. Most of these algorithms are burdensome to develop and end up being quite sophisticated, so practitioners may find them hard to interpret and implement, especially when multiple models are involved. This is even more so when Multi-Task Multiple Kernel Learning (MT-MKL) models are considered. This research largely resolves the above limitations.
Theoretically, a pair of new kernel-based HSs is proposed: one for single-kernel MTL and one for MT-MKL. Unlike previous works, we allow each task weight to be constrained within a norm-ball whose radius is learned during training. By deriving and analyzing the generalization bounds of these two HSs, we show that this flexibility indeed leads to much tighter generalization bounds, which often results in significantly better generalization performance. Based on this observation, two new models are developed, one for single-kernel MTL and one for MT-MKL. From a practical perspective, we propose a general MT-MKL framework that covers most of the prominent MT-MKL approaches, including our new MT-MKL formulation. A general-purpose algorithm is then developed to solve the framework, which can also be employed to train all other models subsumed by this framework. A series of experiments is conducted to assess the merits of the proposed models when trained by the new algorithm. Certain properties of our HSs and formulations are demonstrated, and the advantage of our models in terms of classification accuracy is shown via these experiments.
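The Regularization-Loss framework discussed above can be made concrete with a small sketch. The following is our illustration, not the thesis's formulation: it uses the classic shared-plus-task-specific decomposition w_t = w0 + v_t with fixed penalties, rather than learned norm-ball radii, and all names are ours.

```python
import numpy as np

def fit_mtl_ridge(tasks, lam_shared=0.1, lam_specific=1.0):
    """Jointly fit linear models y ≈ X_t @ (w0 + v_t) for a list of (X_t, y_t) tasks.

    The shared weights w0 couple the tasks; the task-specific v_t are
    penalized more heavily so they only capture per-task deviations.
    """
    T = len(tasks)
    d = tasks[0][0].shape[1]
    rows, ys = [], []
    for t, (X, y) in enumerate(tasks):
        n = X.shape[0]
        block = np.zeros((n, d * (T + 1)))
        block[:, :d] = X                         # columns for shared w0
        block[:, d * (t + 1): d * (t + 2)] = X   # columns for task-specific v_t
        rows.append(block)
        ys.append(y)
    Phi = np.vstack(rows)
    Y = np.concatenate(ys)
    penalty = np.concatenate([np.full(d, lam_shared),
                              np.full(d * T, lam_specific)])
    # Closed-form ridge solution of the joint regularized least-squares problem
    W = np.linalg.solve(Phi.T @ Phi + np.diag(penalty), Phi.T @ Y)
    w0, V = W[:d], W[d:].reshape(T, d)
    return w0, V  # per-task weight vector is w0 + V[t]
```

The heavier penalty on v_t is what pushes common structure into w0; swapping the quadratic penalties for other regularizers yields other members of the Regularization-Loss family.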
16 |
Sharing to learn and learning to share: Fitting together meta learning and multi-task learning. Upadhyay, Richa, January 2023 (has links)
This thesis focuses on integrating learning paradigms that ‘share to learn,’ i.e., Multi-Task Learning (MTL), and ‘learn (how) to share,’ i.e., meta learning. MTL involves learning several tasks simultaneously within a shared network structure so that the tasks can mutually benefit each other’s learning. Meta learning, better known as ‘learning to learn,’ is an approach to reducing the amount of time and computation required to learn a novel task by leveraging knowledge accumulated over the course of numerous training episodes of various tasks. The learning process in the human brain is innate and natural: even before birth, it is capable of learning and memorizing. As a consequence, humans do not learn everything from scratch, and because they are naturally capable of effortlessly transferring their knowledge between tasks, they quickly learn new skills. Humans naturally tend to believe that similar tasks have (somewhat) similar solutions or approaches, so sharing knowledge from a previous activity makes it feasible to learn a new task quickly in a few tries. For instance, the skills acquired while learning to ride a bike are helpful when learning to ride a motorbike, which is, in turn, helpful when learning to drive a car. This natural learning process, which involves sharing information between tasks, has inspired several research areas in Deep Learning (DL), such as transfer learning, MTL, meta learning, and Lifelong Learning (LL), to create similar neurally-weighted algorithms. These information-sharing algorithms exploit the knowledge gained from one task to improve the performance of another related task, but they vary in terms of what information they share, when they share it, and why they share it. This thesis focuses particularly on MTL and meta learning, and presents a comprehensive explanation of both learning paradigms.
A theoretical comparison of both algorithms demonstrates that the strengths of one can outweigh the constraints of the other. Therefore, this work aims to combine MTL and meta learning to attain the best of both worlds. The main contribution of this thesis is Multi-task Meta Learning (MTML), an integration of MTL and meta learning. As gradient-based (or optimization-based) meta learning follows an episodic approach to training a network, we propose multi-task learning episodes to train an MTML network in this work. The basic idea is to train a multi-task model using bi-level meta-optimization so that when a new task is added, it can learn in fewer steps and perform at least as well as traditional single-task learning on the new task. The MTML paradigm is demonstrated on two publicly available datasets, NYU-v2 and the taskonomy dataset, for which four tasks are considered: semantic segmentation, depth estimation, surface normal estimation, and edge detection. This work presents a comparative empirical analysis of MTML against single-task and multi-task learning, where it is evident that MTML excels for most tasks. The future direction of this work includes developing efficient and autonomous MTL architectures by exploiting the concepts of meta learning. The main goal will be to create a task-adaptive MTL, where meta learning may learn to select layers (or features) from the shared structure for every task, because not all tasks require the same high-level, fine-grained features from the shared network. This can be seen as another way of combining MTL and meta learning, and it will also introduce modular learning into the multi-task architecture. Furthermore, this work can be extended to include multi-modal multi-task learning, which will help to study the contributions of each input modality to various tasks.
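Bi-level meta-optimization of the kind MTML builds on can be sketched in a few lines. The toy below is our simplification, not the MTML algorithm: a first-order MAML-style update on scalar linear regression tasks y = a·x, with all names and hyperparameters illustrative. It shows the inner adaptation step nested inside the outer meta-update.

```python
import numpy as np

def fomaml_linear(task_slopes, inner_lr=0.01, meta_lr=0.01, steps=2000, seed=0):
    """First-order MAML on 1-D linear regression tasks y = a * x (model y = w * x)."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for _ in range(steps):
        a = rng.choice(task_slopes)            # sample a training task
        x = rng.uniform(-1, 1, 10)             # support set
        y = a * x
        grad = 2 * np.mean((w * x - y) * x)    # inner-loop gradient on support loss
        w_adapted = w - inner_lr * grad        # inner (task-level) step
        xq = rng.uniform(-1, 1, 10)            # query set
        yq = a * xq
        # first-order approximation: outer gradient evaluated at the adapted weights
        meta_grad = 2 * np.mean((w_adapted * xq - yq) * xq)
        w -= meta_lr * meta_grad               # outer (meta-level) step
    return w

def adapt(w, a, inner_lr=0.01, seed=1):
    """One inner-loop step on a new task with slope a."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, 10)
    y = a * x
    grad = 2 * np.mean((w * x - y) * x)
    return w - inner_lr * grad
```

The full second-order MAML would differentiate through the inner step; the first-order variant shown here simply drops that term, which is a common practical approximation.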
17 |
Transformer Networks for Smart Cities: Framework and Application to Makassar Smart Garden Alleys. DeRieux, Alexander Christian, 09 September 2022 (has links)
Many countries around the world are undergoing massive urbanization campaigns at an unprecedented rate, heralded by promises of economic prosperity and bolstered population health and well-being. Projections indicate that by 2050, nearly 68% of the world populace will reside in these urban environments. However, rapid growth at such an exceptional scale poses unique challenges pertaining to environmental quality and food production, which can negate the effectiveness of the aforementioned boons. As such, there is an emphasis on mitigating these negative effects through the construction of smart and connected communities (S&CC), which integrate both artificial intelligence (AI) and the Internet of Things (IoT). This coupling of intelligent technologies also poses interesting system design challenges pertaining to the fusion of the diverse, heterogeneous datasets available to IoT environments, and the ability to learn multiple S&CC problem sets concurrently. Attention-based Transformer networks are of particular interest given their success in recent years across diverse fields such as natural language processing (NLP), computer vision, time-series regression, and multi-modal data fusion. This raises the question of whether Transformers can be further diversified to leverage fusions of IoT data sources for heterogeneous multi-task learning in S&CC trade spaces. This is the fundamental question that this thesis seeks to answer. Indeed, the key contribution of this thesis is the design and application of Transformer networks for developing AI systems in emerging smart cities. This is executed within a collaborative U.S.-Indonesia effort between Virginia Tech, the University of Colorado Boulder, the Universitas Gadjah Mada, and the Institut Teknologi Bandung, with the goal of growing smart and sustainable garden alleys in Makassar City, Indonesia.
Specifically, a proof-of-concept AI nerve-center is proposed using a backbone of pure-encoder Transformer architectures to learn a diverse set of tasks such as multivariate time-series regression, visual plant disease classification, and image-time-series fusion. To facilitate the data fusion tasks, an effective algorithm is also proposed to synthesize heterogeneous feature sets, such as multivariate time-series and time-correlated images. Moreover, a hyperparameter tuning framework is proposed to standardize and automate model training regimes. Extensive experimentation shows that the proposed Transformer-based systems can handle various input data types via custom sequence embedding techniques, and are naturally suited to learning a diverse set of tasks. Further, the results show that multi-task learners increase both memory and computational efficiency while maintaining performance comparable to both single-task variants and non-Transformer baselines. This demonstrates the flexibility of Transformer networks to learn from a fusion of IoT data sources, their applicability in S&CC trade spaces, and their further potential for deployment on edge computing devices. / Master of Science / Many countries around the world are undergoing massive urbanization campaigns at an unprecedented rate, heralded by promises of economic prosperity and bolstered population health and well-being. Projections indicate that by 2050, nearly 68% of the world populace will reside in these urban environments. However, rapid growth at such an exceptional scale poses unique environmental and food cultivation challenges. Hence, there is a focus on reducing these negative effects by building smart and connected communities (S&CC). The term connected derives from the integration of small, low-cost devices that gather information from the surrounding environment, called the Internet of Things (IoT).
Likewise, smart derives from the integration of artificial intelligence (AI), which is used to make informed decisions based on IoT-collected information. This coupling of intelligent technologies also poses its own unique challenges pertaining to the blending of IoT data with highly diverse characteristics. Of specific interest is the design of AI models that can not only learn from a fusion of this diverse information, but also learn to perform multiple tasks in parallel. Attention-based networks are a relatively new category of AI which learn to focus on, or attend to, the most important portions of an arbitrary data sequence. Transformers are AI models designed with attention as their backbone, and they have been employed with much success in many fields in recent years. This success raises the question of whether Transformers can be further extended to put the smart in S&CC. The overarching goal of this thesis is to design and implement a Transformer-based AI system for emerging smart cities. In particular, this is accomplished within a U.S.-Indonesia collaborative effort with the goal of growing smart and sustainable garden alleys in Makassar City, Indonesia.
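At the core of the pure-encoder architectures mentioned above is scaled dot-product attention; a minimal sketch (ours, independent of the thesis code) is:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, dk) queries, K: (m, dk) keys, V: (m, dv) values -> ((n, dv), (n, m)).

    Each output row is a convex combination of the value rows, weighted by
    the softmax of the query-key similarity scores.
    """
    dk = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(dk)                 # scaled similarities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ V, weights
```

Stacking this operation with per-position feed-forward layers, residual connections, and normalization yields an encoder block of the kind the thesis uses as its backbone.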
18 |
Attention-based Multi-Behavior Sequential Network for E-commerce Recommendation / Rekommendation för uppmärksamhetsbaserat multibeteende sekventiellt nätverk för e-handel. Li, Zilong, January 2022 (has links)
Recommender systems were originally conceived to address the problem of information explosion, helping users find the content they need more efficiently. On an e-commerce platform, users typically interact with items that they are interested in or need in a variety of ways, for example by buying them or browsing their details. These interactions are recorded as time-series information. How to use this sequential information to predict future user behaviors and give efficient, effective recommendations is a very important problem. For content providers, such as merchants on e-commerce platforms, more accurate recommendations mean higher traffic, CTR (click-through rate), and revenue. Therefore, in industry, the CTR model for recommendation systems is a research hotspot. However, in the fine-ranking stage of the recommendation system, existing models have some limitations: no researcher has attempted to predict multiple behaviors of one user simultaneously by processing sequential information. We define this problem as the multi-task sequential recommendation problem. In response, we study the CTR model, sequential recommendation, and multi-task learning. Based on these studies, this thesis proposes AMBSN (Attention-based Multi-Behavior Sequential Network). Specifically, we added a transformer layer, an activation unit, and a multi-task tower to the traditional Embedding&MLP (multi-layer perceptron) model. The transformer layer enables our model to efficiently extract sequential behavior information, the activation unit can capture user interests, and the multi-task tower structure lets the model predict different user behaviors at the same time. We choose the user behavior data from Taobao for recommendation, published on TianChi, as the dataset, and AUC as the evaluation criterion. We compare the performance of AMBSN and some other models on the test set after training.
The final results of the experiment show that our model outperforms several existing models.
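AUC, the evaluation criterion used above, can be computed directly from its pairwise (Mann-Whitney) definition; a small sketch (ours, with ties counted as half a win) is:

```python
def auc_from_scores(scores, labels):
    """Pairwise (Mann-Whitney) AUC. labels in {0, 1}; ties count 0.5.

    AUC is the probability that a randomly chosen positive example is
    scored higher than a randomly chosen negative example.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This quadratic-time form is fine for illustration; production evaluation code typically uses a sort-based O(n log n) equivalent.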
19 |
Modélisation d’un parc de machines pour la surveillance. Application aux composants en centrale nucléaire / Modelling a fleet of machines for their diagnosis. Application to nuclear power plant components. Ankoud, Farah, 12 December 2011 (has links)
This thesis deals with the design of diagnosis systems using data collected on identical machines working under different conditions. We are interested in fault diagnosis methods without an a priori model, and in modelling a fleet of machines using the data collected on all the machines. The problem can thus be formulated as a multi-task learning problem in which models of the different machines are constructed simultaneously, under the assumption that these models share some common parts. In the second chapter, we first consider linear multiple-input/single-output models with known structures. A first approach analyzes the linear regression models obtained from the data of each machine independently of the others in order to identify their common parts; using this knowledge, new models for the machines are then estimated. A second approach identifies the coefficients of the models and their common parts simultaneously. Secondly, the redundancy relations existing among the measured variables are sought directly using PCA. This removes the hypotheses on the knowledge of the model structures and takes into account errors on all the variables, since no distinction is made between input and output variables. In the third chapter, a study of the discernibility of the model outputs is carried out. The problem consists in determining the ranges of variation of the input variables that guarantee discernible model outputs. This set-inversion problem is solved using either boxes circumscribing the different domains or a paving approximation of these domains. Finally, the proposed approaches are applied to simulators of heat exchangers.
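The PCA-based search for redundancy relations described here can be sketched as follows (our illustration, not the thesis's implementation): linear relations among the measured variables correspond to directions with (near-)zero singular value of the centered data matrix.

```python
import numpy as np

def redundancy_relations(X, tol=1e-8):
    """Return orthonormal vectors c such that X_centered @ c ≈ 0.

    Each such c encodes a linear redundancy relation among the measured
    variables, found from the smallest principal directions of the data.
    """
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=True)
    # pad singular values in case there are more variables than samples
    s_full = np.zeros(Vt.shape[0])
    s_full[:len(s)] = s
    return Vt[s_full <= tol * max(s_full.max(), 1.0)]
```

Because no variable is singled out as an output, measurement errors on all variables are treated symmetrically, which is the point made in the abstract above.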
20 |
Apprentissage multi-cibles : théorie et applications / Multi-output learning: theory and applications. Moura, Simon, 17 December 2018 (has links)
In this thesis, we study the problem of learning with multiple outputs related to different tasks, such as classification and ranking. In this line of research, we explore three different axes. First, we propose a theoretical framework that can be used to show the consistency of multi-label learning in the case of classifier chains, where the outputs are homogeneous. Based on this framework, we derive a Rademacher generalization error bound for every classifier in the chain and exhibit dependency factors relating each output to the others. As a result, we introduce multiple strategies for learning classifier chains and for selecting the order of the chain. Still within the homogeneous multi-output framework, we propose a neural-network-based solution for fine-grained sentiment analysis and show the efficiency of the approach. Finally, we propose a framework and an empirical study showing the relevance of learning with multiple tasks even when the outputs are of different types.
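A classifier chain of the kind analyzed in this thesis can be sketched briefly (our illustration, using a plain gradient-descent logistic regression as the base learner; all names are ours): each classifier in the chain receives the input features augmented with the labels that precede it in the chain order.

```python
import numpy as np

def train_logreg(X, y, lr=0.5, epochs=3000):
    """Plain gradient-descent logistic regression; y in {0, 1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_logreg(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(float)

def train_chain(X, Y):
    """Classifier chain: the model for label j sees X plus labels 0..j-1."""
    models, Z = [], X
    for j in range(Y.shape[1]):
        models.append(train_logreg(Z, Y[:, j]))
        Z = np.hstack([Z, Y[:, [j]]])          # true labels at train time
    return models

def predict_chain(models, X):
    Z, preds = X, []
    for w in models:
        yj = predict_logreg(w, Z)
        preds.append(yj)
        Z = np.hstack([Z, yj[:, None]])        # predicted labels at test time
    return np.column_stack(preds)
```

The gap between the true labels used at training time and the predicted labels used at test time is exactly where the dependency factors in the chain's generalization bound come into play, and why the chain order matters.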