Global ETD Search

241	Tools O' the Times: understanding the common properties of species interaction networks across space Strydom, Tanya 11 1900 (has links) The field of network ecology is still limited in its ability to make large-scale, global inferences. This challenge is primarily driven by the difficulty of sampling interactions in the field, which leaves many 'gaps' in the global coverage of data. This thesis takes a 'methods-centric' approach to network ecology. It focuses on developing tools that help fill these data gaps by presenting prediction as an accessible alternative to sampling in the field, and it introduces two 'tools' that are primed for asking questions at global scales. Chapter 1 maps out tools we can use to make network predictions and is driven by the idea that the ability to predict interactions between species with modelling tools is imperative for a more global understanding of ecological networks. This chapter includes a proof of concept (where we show that a simple neural network model can make accurate predictions about species interactions), an assessment of the challenges and opportunities associated with improving interaction predictions, and a conceptual roadmap for the use of predictive models for ecological networks. Chapters 2 and 3 are closely intertwined and focus on the use of graph embedding for network prediction. Essentially, graph embedding allows us to transform a graph (network) into a set of vectors that capture an ecological property of the network. This provides a simple yet powerful abstraction of an interaction network and serves as a way to maximise the information available from species interaction networks. Because graph embedding allows us to 'decode' the information within a network, it is primed as a tool for network prediction, specifically when used in a transfer learning framework. Transfer learning builds on the idea that we can take the knowledge gained from solving a known problem and use it to solve a closely related problem. Here we used the (known) European metaweb to predict a metaweb for Canadian species based on their phylogenetic relatedness. What makes this work particularly exciting is that, despite the low number of species shared between these two regions, we are able to recover most (91%) of the interactions. Chapter 4 delves into the complexity of networks and the different ways we might choose to define complexity. More specifically, we challenge the more traditional structural measures of complexity by presenting SVD entropy as an alternative measure. Taking a physical approach to defining complexity allows us to think about the information contained within a network as opposed to its emergent properties. Interestingly, SVD entropy reveals that bipartite networks are highly complex and do not necessarily conform to the idea that complexity begets stability. Finally, I present the Julia package SpatialBoundaries.jl. This package allows the user to implement the spatial wombling algorithm for data arranged uniformly or randomly across space. Because the spatial wombling algorithm focuses on both the gradient and the direction of change for a given landscape, it can be used both for detecting boundaries in the traditional sense and for a more nuanced look at the direction of changes. This approach could be a beneficial way to think about questions of boundary detection for networks and how network boundaries relate to environmental boundaries. Ecological networks Singular value decomposition Transfer learning Spatial wombling Biology / Biologie (UMI : 0306)
242	Topics on Machine Learning under Imperfect Supervision Yuan, Gan January 2024 (has links) This dissertation comprises several studies addressing supervised learning problems where the supervision is imperfect. Firstly, we investigate margin conditions in active learning. Active learning is characterized by a special mechanism in which the learner can sample freely over the feature space and make the most of a limited labeling budget by querying the most informative labels. Our primary focus is to identify the critical conditions under which certain active learning algorithms can outperform the optimal passive learning minimax rate. Within a non-parametric multi-class classification framework, our results reveal that the uniqueness of Bayes labels across the feature space is the pivotal determinant of the superiority of active learning over passive learning. Secondly, we study the estimation of the central mean subspace (CMS) and its application in transfer learning. We show that a fast parametric convergence rate is achievable by estimating the expected smoothed gradient outer product, for a general class of covariate distributions that includes Gaussian or heavier-tailed distributions. When the link function is a polynomial of degree at most r and the covariates follow the standard Gaussian, we show that the prefactor depends on the ambient dimension d as d^r. Furthermore, we show that in a transfer learning setting, an oracle rate of prediction error, as if the CMS were known, is achievable when the source training data is abundant. Finally, we present an application that uses weak (noisy) labels to address an Individual Tree Crown (ITC) segmentation challenge. Here, the objective is to delineate individual tree crowns within a 3D LiDAR scan of tropical forests, with only 2D noisy manual delineations of crowns on RGB images available as a source of weak supervision. 
We propose a refinement algorithm designed to enhance the performance of existing unsupervised learning methodologies for the ITC segmentation problem. Statistics Computer science Machine learning--Statistical methods Transfer learning (Machine learning) Active learning Bayesian statistical decision theory Crowns (Botany) Gaussian processes
243	MODEL BASED TRANSFER LEARNING ACROSS NANOMANUFACTURING PROCESSES AND BAYESIAN OPTIMIZATION FOR ADVANCED MODELING OF MIXTURE DATA Yueyun Zhang (18183583) 24 June 2024 (has links) Broadly, the focus of this work is on efficient statistical estimation and optimization for experimental data, particularly motivated by nanomanufacturing experiments on the material tellurene. Tellurene is a novel material for transistors with reliable attributes that enhance the performance of electronics (e.g., nanochips). As a solution-grown product, two-dimensional (2D) tellurene can be manufactured through a scalable process at a low cost. There are three main throughlines to this work: data augmentation, optimization, and equality constraints. There are also three distinct methodological projects, each of which addresses a subset of these throughlines. For the first project, I apply transfer learning to the analysis of data from a new tellurene experiment (process B), using the established linear regression model from a prior experiment (process A) in a similar study to combine the information from both experiments. The key to this approach is to incorporate the total equivalent amounts (TEA) of a lurking variable (experimental process changes), expressed in terms of an observed (base) factor that appears in both experimental designs, into the prespecified linear regression model. The results for the experimental data are presented, including the optimal PVP chain length for scaling up production through a larger autoclave size. For the second project, I develop a multi-armed bandit Bayesian optimization (BO) approach to incorporate the equality constraint that comes from a mixture experiment on tellurium nanoproduct and to account for factors with categorical levels. A more complex optimization approach was necessitated by the experimenters' use of a neural network regression model to estimate the response surface. Results are presented on synthetic data to validate the ability of BO to recover the optimal response, and its efficiency is compared to Monte Carlo random sampling to understand the level of experimental design complexity at which BO begins to pay off. The third project examines the potential enhancement of parameter estimation by using synthetic data generated through Generative Adversarial Networks (GANs) to augment experimental data from a mixture experiment with a small to moderate number of runs. Transfer learning shows high promise for aiding tellurene experiments; BO's value increases with the complexity of the experiment; and GANs performed poorly on smaller experiments, introducing bias into parameter estimates. Industrial engineering Tellurium Nanoproduct Tellurene Transfer Learning Lurking Variable Experimental Design Bayesian Optimization Mixture Experiments Monte Carlo Random Sampling Neural Network Regression Generative Adversarial Networks
244	Deep Learning Based Image Segmentation for Tumor Cell Death Characterization Forsberg, Elise, Resare, Alexander January 2024 (has links) This report presents a deep learning based approach for segmenting and characterizing tumor cell deaths using images provided by the Önfelt lab, which contain NK cells and HL60 leukemia cells. We explore the efficiency of convolutional neural networks (CNNs) in distinguishing between live and dead tumor cells, as well as between different classes of cell death. Three CNN architectures (MobileNetV2, ResNet-18, and ResNet-50) were employed, using transfer learning to optimize performance given the limited size of the available datasets. The networks were trained using two loss functions (weighted cross-entropy and generalized dice loss) and two optimizers (adaptive moment estimation (Adam) and stochastic gradient descent with momentum (SGDM)), with performance evaluated on metrics such as mean accuracy, intersection over union (IoU), and BF score. Our results indicate that MobileNetV2 with cross-entropy loss and the Adam optimizer outperformed the other configurations, demonstrating high mean accuracy. Challenges such as class imbalance, annotation bias, and dataset limitations are discussed, alongside potential future directions to enhance model robustness and accuracy. The successful training of networks capable of classifying all identified types of cell death demonstrates the potential of a deep learning approach to identify different types of cell death, both as a tool for analyzing immunotherapeutic strategies and as a way to enhance understanding of NK cell behavior in cancer treatment. Deep Learning Image Segmentation Tumor Cells NK Cells Cell Death Characterization Immunotherapy Data Augmentation Convolutional Neural Networks (CNNs) Transfer Learning ResNet MobileNetV2 Optimization Algorithms Physical Sciences Fysik
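As an illustration of the intersection over union (IoU) metric named in entry 244, the following is a minimal sketch and not the authors' evaluation code. It assumes segmentation masks flattened to lists of integer class labels; the label coding (0 = background, 1 = live cell, 2 = dead cell) and the function names `iou` and `mean_iou` are hypothetical.

```python
def iou(pred, target, cls):
    """IoU for one class between two equally sized, flattened label masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels where both masks agree on this class.
    intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    # Pixels where either mask assigns this class.
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # A class absent from both masks has no defined IoU.
    return intersection / union if union else None


def mean_iou(pred, target, classes):
    """Average the per-class IoU over the classes present in either mask."""
    scores = [iou(pred, target, c) for c in classes]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


# Hypothetical label coding: 0 = background, 1 = live cell, 2 = dead cell.
predicted = [0, 1, 1, 2, 2, 0]
annotated = [0, 1, 2, 2, 2, 0]
print(iou(predicted, annotated, 1))            # 0.5
print(mean_iou(predicted, annotated, [0, 1, 2]))
```

Classes that appear in neither mask are skipped when averaging, which is one common convention; other evaluations count them as zero or one.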
245	Large-Context Question Answering with Cross-Lingual Transfer Sagen, Markus January 2021 (has links) Models based on the transformer architecture have become among the most prominent for solving a multitude of natural language processing (NLP) tasks since the architecture's introduction in 2017. However, much research on the transformer model has focused primarily on achieving high performance, and many problems remain unsolved. Two of the most prominent at present are the lack of high-performing non-English pre-trained models and the limited number of words most trained models can incorporate into their context. Solving these problems would make NLP models more suitable for real-world applications, improving information retrieval, reading comprehension, and more. All previous research has focused on incorporating long context into English language models. This thesis investigates cross-lingual transferability between languages when training for long context in English only. Training long-context models in English only could make long context more accessible in low-resource languages, such as Swedish, since long-context data is hard to find in most languages and training for each language is costly. This could become an efficient method for creating long-context models in other languages without the need for such data in every language or for pre-training from scratch. We extend the models' context using the training scheme of the Longformer architecture and fine-tune on a question-answering task in several languages. Our evaluation could neither confirm nor refute that transferring long-term context is possible for low-resource languages. We believe that using datasets that require long-context reasoning, such as a multilingual TriviaQA dataset, could demonstrate the validity of our hypothesis. 
Long-Context Multilingual Model Longformer XLM-R Longformer Long-term Context Extending Context Extend Context Large-Context Long-Context Large Context Long Context Cross-Lingual Multi-Lingual Cross Lingual Multi Lingual QA Question-Answering Question Answering Transformer model Machine Learning Transfer Learning SQuAD Memory Transfer Learning Long-Context Long Context Efficient Monolingual Multilingual QA model Language Model Huggingface BERT RoBERTa XLM-R mBERT Multilingual BERT Efficient Transformers Reformer Linformer Performer Transformer-XL Wikitext-103 TriviaQA HotpotQA WikiHopQA VINNOVA Peltarion AI LM MLM Deep Learning Natural Language Processing NLP Attention Transformers Transfer Learning Datasets Computer and Information Sciences Data- och informationsvetenskap
246	Duplicate Detection and Text Classification on Simplified Technical English / Dublettdetektion och textklassificering på Förenklad Teknisk Engelska Lund, Max January 2019 (has links) This thesis investigates the most effective way of performing classification of text labels and clustering of duplicate texts in technical documentation written in Simplified Technical English. Pre-trained language models from transformers (BERT) were tested against traditional methods such as tf-idf with cosine similarity (kNN) and SVMs on the classification task. For detecting duplicate texts, vector representations from pre-trained transformer and LSTM models were tested against tf-idf using the density-based clustering algorithms DBSCAN and HDBSCAN. The results show that traditional methods are comparable to pre-trained models for classification, and that using tf-idf vectors with a low distance threshold in DBSCAN is preferable for duplicate detection. NLP CNL transformer models LSTM BERT document embeddings word embeddings text classification text clustering transfer learning machine learning Computer Sciences Datavetenskap (datalogi)
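To make the tf-idf approach in entry 246 concrete, here is a minimal sketch using only the Python standard library. It is not the thesis's pipeline. It swaps DBSCAN for a simple pairwise cosine-distance threshold, and the whitespace tokenizer, the smoothed idf formula, the 0.1 threshold, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse tf-idf vectors (term -> weight) with a smoothed idf."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def near_duplicates(docs, threshold=0.1):
    """List index pairs whose tf-idf cosine distance is at most the threshold."""
    vectors = tfidf_vectors(docs)
    return [
        (i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
        if cosine_distance(vectors[i], vectors[j]) <= threshold
    ]


# Hypothetical sentences in the style of technical documentation.
sentences = [
    "Remove the access panel.",
    "Remove the access panel.",
    "Install the fuel pump.",
]
print(near_duplicates(sentences))  # [(0, 1)]
```

A density-based algorithm such as DBSCAN would additionally group chains of near-identical texts into clusters and mark isolated texts as noise; the pairwise check above only shows the role of the distance threshold.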
247	Multi-object detection and tracking in video sequences / Détection et suivi multi-objets dans des séquences vidéo Mhalla, Ala 04 April 2018 (has links) The work developed in this PhD thesis is focused on video sequence analysis, which consists of three main tasks: object detection, categorization, and tracking. The development of reliable solutions for the analysis of video sequences opens new horizons for several applications such as intelligent transport systems, video surveillance, and robotics. In this thesis, we put forward several contributions to deal with the problems of detecting and tracking multiple objects in video sequences. The proposed frameworks are based on deep learning networks and transfer learning approaches. In a first contribution, we tackle the problem of multi-object detection by putting forward a new transfer learning framework based on the formalism and theory of the Sequential Monte Carlo (SMC) filter to automatically specialize a Deep Convolutional Neural Network (DCNN) detector towards a target scene. The suggested specialization framework is used to transfer knowledge from the source and the target domain to the target scene and to estimate the unknown target distribution as a specialized dataset composed of samples from the target domain. These samples are selected according to their importance weights, which reflect the likelihood that they belong to the target distribution. The resulting specialized dataset allows a specialized DCNN detector to be trained for a target scene without human intervention. In a second contribution, we propose an original multi-object tracking framework based on spatio-temporal strategies (interlacing/inverse interlacing) and an interlaced deep detector, which improves the performance of tracking-by-detection algorithms and helps to track objects in complex videos (occlusion, intersection, strong motion). In a third contribution, we provide an embedded system for traffic surveillance, which integrates an extension of the SMC framework to improve detection accuracy in both day and night conditions and to specialize any DCNN detector for both mobile and stationary cameras. Throughout this report, we provide both quantitative and qualitative results. On several aspects of video sequence analysis, this work outperforms state-of-the-art detection and tracking frameworks. In addition, we have successfully implemented our frameworks in an embedded hardware platform for road traffic safety and monitoring. Artificial intelligence Computer vision Transfer learning Deep learning Multiobject detection Specialization Tracking-by-detection Multi-object tracking Embedded system
248	Extractive Multi-document Summarization of News Articles Grant, Harald January 2019 (has links) Publicly available data grows exponentially through web services and technological advancements. To comprehend large data streams, multi-document summarization (MDS) can be used. This research investigates the area of multi-document summarization. Multiple systems for extractive multi-document summarization are implemented using modern techniques, in the form of the pre-trained BERT language model for word embeddings and sentence classification. These are combined with well-proven techniques, in the form of the TextRank ranking algorithm, the Waterfall architecture, and anti-redundancy filtering. The systems are evaluated on the DUC-2002, 2006, and 2007 datasets using the ROUGE metric. The results show that the BM25 sentence representation implemented in the TextRank model, using the Waterfall architecture and an anti-redundancy technique, outperforms the other implementations and is competitive with other state-of-the-art systems. A cohesive model is derived from the leading system and tried in a user study using a real-world application. The user study is conducted using a real-time news detection application with users from the news domain. The study shows a clear preference for cohesive summaries in extractive multi-document summarization: the cohesive summary is preferred in the majority of cases. NLP extractive summarization multi-document neural embeddings information extraction text-to-text generation textrank BERT attention Transformer transfer learning fine-tuning ROUGE
249	Reinforcement Learning from Demonstration Suay, Halit Bener 25 April 2016 (has links) Off-the-shelf Reinforcement Learning (RL) algorithms suffer from slow learning performance, partly because they are expected to learn a task from scratch merely through an agent's own experience. In this thesis, we show that learning from scratch is a limiting factor for learning performance, and that when prior knowledge is available RL agents can learn a task faster. We evaluate relevant previous work and our own algorithms in various experiments. Our first contribution is the first implementation and evaluation of an existing interactive RL algorithm in a real-world domain with a humanoid robot. Interactive RL had previously been evaluated in a simulated domain, which motivated us to evaluate its practicality on a robot. Our evaluation shows that guidance reduces learning time, and that its positive effects increase with state space size. A natural follow-up question after our first evaluation was how other previous works compare to interactive RL. Our second contribution is an analysis of a user study in which naive human teachers demonstrated a real-world object-catching task with a humanoid robot. We present the first comparison of several previous works in a common real-world domain with a user study. One conclusion of the user study was the high potential of RL despite poor usability due to its slow learning rate. In an effort to improve the learning efficiency of RL learners, our third contribution is a novel human-agent knowledge transfer algorithm. Using demonstrations from three teachers with varying expertise in a simulated domain, we show that regardless of the skill level, human demonstrations can improve the asymptotic performance of an RL agent. As an alternative approach to encoding human knowledge in RL, we investigated the use of reward shaping. 
Our final contributions are Static Inverse Reinforcement Learning Shaping and Dynamic Inverse Reinforcement Learning Shaping algorithms that use human demonstrations for recovering a shaping reward function. Our experiments in simulated domains show that our approach outperforms the state-of-the-art in cumulative reward, learning rate and asymptotic performance. Overall we show that human demonstrators with varying skills can help RL agents to learn tasks more efficiently. robotics robots user study lfd rl rlfd artificial intelligence rule learning machine learning policy learning robot learning from demonstration transfer learning agents robot learning learning from demonstration reward shaping reinforcement learning
250	Weakly supervised learning of deformable part models and convolutional neural networks for object detection / Détection d'objets faiblement supervisée par modèles de pièces déformables et réseaux de neurones convolutionnels Tang, Yuxing 14 December 2016 (has links) In this dissertation we address the problem of weakly supervised object detection, wherein the goal is to recognize and localize objects in weakly labeled images for which object-level annotations are incomplete during training. To this end, we propose two methods that learn two different models for the objects of interest. In our first method, we propose a model that enhances weakly supervised Deformable Part-based Models (DPMs) by emphasizing the importance of the location and size of the initial class-specific root filter. We first compute a candidate pool that represents the potential locations of the object as this root filter estimate, by exploiting a generic objectness measure (region proposals) to combine the most salient regions and "good" region proposals. We then pose the learning of the latent class label of each candidate window as a binary classification problem, training category-specific classifiers that coarsely classify a candidate window as either a target object or a non-target class. Furthermore, we improve detection by incorporating contextual information from image classification scores. Finally, we design a flexible enlarging-and-shrinking post-processing procedure to modify the DPM outputs, which can effectively match the approximate object aspect ratios and further improve final accuracy. Second, we investigate how knowledge about object similarities from both the visual and semantic domains can be transferred to adapt an image classifier into an object detector in a semi-supervised setting on a large-scale database, where only a subset of object categories is annotated with bounding boxes. We propose to transform deep Convolutional Neural Network (CNN)-based image-level classifiers into object detectors by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers into detectors for categories without bounding box annotations. We have evaluated both approaches extensively on several challenging detection benchmarks, e.g., PASCAL VOC, ImageNet ILSVRC, and Microsoft COCO. Both approaches compare favorably to the state of the art and show significant improvement over several other recent weakly supervised detection methods. Object detection Weakly supervised learning Deformable part models Region proposals Deep learning Convolutional neural networks Transfer learning

Search results