1

Visuo-Haptic recognition of daily-life objects: a contribution to the data scarcity problem

Abderrahmane, Zineb 29 November 2018 (has links)
Recognizing surrounding objects is an important skill for the autonomy of robots operating in daily life. Nowadays robots are equipped with sophisticated sensors that imitate the human sense of touch, which allows an object to be recognized from information arising from robot-object physical interaction. Such information can include the object's texture, compliance and material. In this thesis, we exploit haptic data to recognize daily-life objects using machine learning techniques. The main challenge in our work is the difficulty of collecting a sufficient amount of haptic training data for all daily-life objects, due to the continuously growing number of objects and to the effort and time the robot needs to physically interact with each object for data collection. We address this problem by developing a haptic recognition framework capable of Zero-shot, One-shot and Multi-shot learning. We also extend the framework by integrating vision to enhance the robot's recognition performance whenever that sense is available.
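A minimal sketch of the one-/multi-shot side of such a framework, assuming objects are summarized by fixed-length haptic feature vectors (e.g., texture and compliance statistics); the nearest-prototype classifier and the 8-D toy features below are illustrative stand-ins, not the thesis's actual pipeline:

```python
import numpy as np

def class_prototypes(features, labels):
    """Average the available training features per class.
    Works with one example (one-shot) or several (multi-shot)."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(query, protos):
    """Assign the query haptic feature vector to the nearest prototype."""
    names = list(protos)
    dists = [np.linalg.norm(query - protos[c]) for c in names]
    return names[int(np.argmin(dists))]

# Toy usage: 8-D haptic descriptors, two training examples per object.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
y = np.array(["mug", "mug", "sponge", "sponge", "book", "book"])
protos = class_prototypes(X, y)
print(classify(rng.normal(size=8), protos))
```

With one example per class this reduces to one-shot nearest-neighbor matching; the zero-shot case additionally needs side information linking classes, as in the entries below.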
2

Machine learning for wireless signal learning

Smith, Logan 30 April 2021 (has links)
Wireless networks are vulnerable to adversarial devices that spoof the digital identity of valid wireless devices, allowing unauthorized devices access to the network. Instead of validating devices by their digital identity, it is possible to use the unique "physical fingerprint" imparted on a signal by deviations in the wireless hardware. In this thesis, the physical fingerprint was validated by performing classification with complex-valued neural networks (NNs), achieving a high level of accuracy in the process. Additionally, zero-shot learning (ZSL) was implemented to learn discriminant features that separate legitimate from unauthorized devices using outlier detection, and then further separate each unauthorized device into its own cluster. This approach allows 42% of unauthorized devices to be identified as unauthorized and correctly clustered.
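A rough sketch of the detect-then-cluster idea using off-the-shelf scikit-learn components instead of the complex-valued networks used in the thesis; the Gaussian "fingerprint" features and the clustering parameters are invented for illustration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
# Hypothetical RF fingerprint features: one row per received transmission.
legit = rng.normal(0.0, 1.0, size=(200, 16))   # authorized devices (training)
rogue = rng.normal(4.0, 1.0, size=(30, 16))    # unseen, unauthorized devices

# Step 1: flag transmissions that look unlike any authorized device.
detector = IsolationForest(random_state=0).fit(legit)
unseen = np.vstack([legit[:10], rogue])
flags = detector.predict(unseen)               # -1 = outlier / unauthorized
outliers = unseen[flags == -1]

# Step 2: group flagged transmissions so each cluster approximates
# one distinct unauthorized device.
clusters = DBSCAN(eps=2.0, min_samples=3).fit_predict(outliers)
n_devices = len(set(clusters) - {-1})          # -1 is DBSCAN noise
print(len(outliers), "flagged transmissions,", n_devices, "device cluster(s)")
```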
3

Zero-shot Learning for Visual Recognition Problems

Naha, Shujon January 2016 (has links)
In this thesis we discuss different aspects of zero-shot learning and propose solutions for three challenging visual recognition problems: 1) unknown object recognition from images, 2) novel action recognition from videos, and 3) unseen object segmentation. In all three problems we have two disjoint sets of classes: the "known classes", which are used in the training phase, and the "unknown classes", for which there are no training instances. Our proposed approach exploits the available semantic relationships between known and unknown object classes and uses them to transfer appearance models from known to unknown object classes in order to recognize unknown objects. We also propose an approach to recognize novel actions from videos by learning a joint model that links videos and text. Finally, we present a ranking-based approach for zero-shot object segmentation: each unknown object class is represented as a semantic ranking of all the known classes, and this semantic relationship is used to extend the segmentation model of the known classes to unknown-class objects. / October 2016
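The transfer step can be illustrated generically: score an unknown class as a semantic-similarity-weighted combination of the seen-class classifier outputs. This is a sketch of that idea, not the thesis's exact model; the array shapes and the top-k weighting are assumptions:

```python
import numpy as np

def unseen_class_scores(seen_scores, S_seen, S_unseen, k=3):
    """Score unseen classes by combining seen-class classifier outputs,
    weighted by semantic similarity between class embeddings.

    seen_scores: (n_samples, n_seen) classifier scores for seen classes
    S_seen:      (n_seen, d) semantic vectors (e.g., attributes) of seen classes
    S_unseen:    (n_unseen, d) semantic vectors of unseen classes
    """
    # Cosine similarity between every unseen and seen class.
    sn = S_seen / np.linalg.norm(S_seen, axis=1, keepdims=True)
    un = S_unseen / np.linalg.norm(S_unseen, axis=1, keepdims=True)
    sim = un @ sn.T                              # (n_unseen, n_seen)
    # Keep only the k most related seen classes per unseen class.
    for row in sim:
        row[np.argsort(row)[:-k]] = 0.0
    sim /= sim.sum(axis=1, keepdims=True)
    return seen_scores @ sim.T                   # (n_samples, n_unseen)

# Toy demo: 2 seen classes, 1 unseen class, 3-D attribute vectors.
scores = unseen_class_scores(np.array([[0.9, 0.1]]),
                             np.array([[1.0, 0, 0], [0, 1.0, 0]]),
                             np.array([[0.8, 0.2, 0]]), k=2)
print(scores)
```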
4

Zero-shot visual recognition via latent embedding learning

Wang, Qian January 2018 (has links)
Traditional supervised visual recognition methods require a great number of annotated examples for each class of interest. Collecting and annotating visual data (e.g., images and videos) can be laborious, tedious and time-consuming when the number of classes involved is very large. In addition, there are situations where the test instances come from novel classes for which no training examples were available at training time. These issues can be addressed by zero-shot learning (ZSL), an emerging machine learning technique that enables the recognition of novel classes. The key issue in zero-shot visual recognition is the semantic gap between visual and semantic representations. We address this issue in this thesis from three perspectives: visual representations, semantic representations and the learning models. We first propose a novel bidirectional latent embedding framework for zero-shot visual recognition. By learning a latent space from the visual representations and labelling information of the training examples, instances of different classes can be mapped into the latent space while preserving both visual and semantic relatedness, so the semantic gap can be bridged. We conduct experiments on both object and human action recognition benchmarks to validate the effectiveness of the proposed ZSL framework. We then extend ZSL to multi-label scenarios for multi-label zero-shot human action recognition based on weakly annotated video data. A long short-term memory (LSTM) neural network is employed to explore the multiple actions underlying the video data, and a joint latent space is learned by two component embedding models (a visual model and a semantic model) to bridge the semantic gap; the two models are trained alternately to optimize ranking-based objectives. Extensive experiments are carried out on two multi-label human action datasets to evaluate the proposed framework. Finally, we propose alternative semantic representations for human actions to narrow the semantic gap from the representation side: a simple yet effective solution based on the exploration of web data that enhances the semantic representations of human actions. The novel representations are shown to benefit zero-shot human action recognition significantly compared to traditional attributes and word vectors. In summary, we propose novel frameworks for zero-shot visual recognition that narrow and bridge the semantic gap, achieving state-of-the-art performance in different settings on multiple benchmarks.
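As a much simpler stand-in for the bidirectional latent embedding idea, the sketch below learns a single linear map from visual features into the class-embedding space by ridge regression and classifies by nearest class embedding; the closed-form solver and toy data are illustrative assumptions, not the method proposed in the thesis:

```python
import numpy as np

def fit_visual_to_semantic(X, S_labels, lam=1.0):
    """Ridge-regress visual features onto class semantic vectors.
    X: (n, dv) visual features; S_labels: (n, ds) semantic vector of each
    sample's class. Returns W such that X @ W approximates S_labels."""
    dv = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(dv), X.T @ S_labels)

def predict(X, W, S_classes):
    """Nearest class embedding (cosine) in the shared semantic space."""
    P = X @ W
    P /= np.linalg.norm(P, axis=1, keepdims=True)
    C = S_classes / np.linalg.norm(S_classes, axis=1, keepdims=True)
    return np.argmax(P @ C.T, axis=1)

# Toy demo: 4-D visual features, 3-D class embeddings, 5 classes total.
rng = np.random.default_rng(0)
S_classes = rng.normal(size=(5, 3))             # includes unseen classes 3-4
y = rng.integers(0, 3, size=40)                 # train only on classes 0-2
X = S_classes[y] @ rng.normal(size=(3, 4)) + 0.1 * rng.normal(size=(40, 4))
W = fit_visual_to_semantic(X, S_classes[y])
print(predict(X[:5], W, S_classes))
```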
5

Thought Recognition: Predicting and Decoding Brain Activity Using the Zero-Shot Learning Model

Palatucci, Mark M. 25 April 2011 (has links)
Machine learning algorithms have been successfully applied to learning classifiers in many domains such as computer vision, fraud detection, and brain image analysis. Typically, classifiers are trained to predict a class value given a set of labeled training data that includes all possible class values, and sometimes additional unlabeled training data. Little research has been performed on the setting where the possible values for the class variable include values that have been omitted from the training examples. This is an important problem setting, especially in domains where the class variable can take on many values and the cost of obtaining labeled examples for all values is high. We show that the key to addressing this problem is not predicting the held-out classes directly, but rather recognizing the semantic properties of the classes, such as their physical or functional attributes. We formalize this method as zero-shot learning and show that by utilizing semantic knowledge mined from large text corpora and from humans via crowd-sourcing, we can discriminate classes without explicitly collecting examples of those classes for a training set. As a case study, we consider this problem in the context of thought recognition, where the goal is to classify the pattern of brain activity observed from a non-invasive neural recording device. Specifically, we train classifiers to predict a specific concrete noun that a person is thinking about based on an observed image of that person's neural activity. We show that by predicting the semantic properties of the nouns, such as "is it heavy?" and "is it edible?", we can discriminate concrete nouns that people are thinking about, even without explicitly collecting examples of those nouns for a training set. Further, this allows discrimination of certain nouns within the same category with significantly higher accuracy than previous work. In addition to being an important step forward for neural imaging and brain-computer interfaces, we show that the zero-shot learning model has important implications for the broader machine learning community by providing a means for learning algorithms to extrapolate beyond their explicit training set.
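The two-stage recipe described here — first predict semantic attributes from the input, then match the predicted attribute vector against the known attribute vectors of held-out classes — can be sketched as follows; the feature dimensions, attribute set and noun list are hypothetical:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical setup: X_train are neural-activity features; A_train holds
# the attribute vector (e.g., "is it heavy/edible?") of each sample's noun.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 50))
A_train = rng.uniform(0, 1, size=(100, 5))      # 5 semantic attributes

# Stage 1: learn feature -> attribute regressors on the training nouns.
reg = Ridge(alpha=1.0).fit(X_train, A_train)

# Stage 2: classify a new brain image as the held-out noun whose known
# attribute vector is closest to the predicted attributes.
noun_attrs = {"hammer": [1, 0, 0.2, 0.9, 0], "celery": [0, 1, 0.8, 0.1, 1]}
a_hat = reg.predict(rng.normal(size=(1, 50)))[0]
best = min(noun_attrs, key=lambda n: np.linalg.norm(a_hat - noun_attrs[n]))
print(best)
```

Neither "hammer" nor "celery" needs to appear in the training set: only their attribute vectors must be known.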
6

Video2Vec: Learning Semantic Spatio-Temporal Embedding for Video Representations

January 2016 (has links)
abstract: High-level inference tasks in video applications such as recognition, video retrieval, and zero-shot classification have become an active research area in recent years. One fundamental requirement for such applications is to extract high-quality features that preserve high-level information in the videos. Many video feature extraction algorithms have been proposed, such as STIP, HOG3D, and Dense Trajectories. These algorithms are often referred to as "handcrafted" features, as they were deliberately designed based on reasonable considerations; however, they may fail when dealing with high-level tasks or complex-scene videos. Following the success of deep convolutional neural networks (CNNs) in extracting global representations for static images, researchers have applied similar techniques to video content: typical approaches first extract spatial features by processing raw frames with deep convolutional architectures designed for static image classification, and then apply simple averaging, concatenation or classifier-based fusion/pooling to the extracted features. I argue that features extracted in such ways do not carry enough representative information, since videos, unlike images, should be characterized as temporal sequences of semantically coherent visual content and thus need representations that consider both semantic and spatio-temporal information. In this thesis, I propose a novel architecture to learn a semantic spatio-temporal embedding for videos to support high-level video analysis. The proposed method encodes spatial and temporal information separately, employing a deep architecture consisting of two channels of convolutional neural networks (capturing appearance and local motion) followed by corresponding Fully Connected Gated Recurrent Unit (FC-GRU) encoders that capture the longer-term temporal structure of the CNN features. The resulting spatio-temporal representation (a vector) is mapped via a Fully Connected Multilayer Perceptron (FC-MLP) into the word2vec semantic embedding space, leading to a semantic interpretation of the video vector that supports high-level analysis. I evaluate the usefulness and effectiveness of this video representation with experiments on action recognition, zero-shot video classification, and semantic video retrieval (word-to-video) on the UCF101 action recognition dataset. / Dissertation/Thesis / Masters Thesis Computer Science 2016
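A single-channel sketch of this pipeline, assuming per-frame CNN features are already extracted: a GRU encodes temporal structure and an MLP maps the result into a word2vec-sized space for zero-shot scoring. The layer sizes and random stand-in data are assumptions, and the thesis's actual model uses two channels (appearance and motion):

```python
import torch
import torch.nn as nn

class Video2VecSketch(nn.Module):
    """GRU over per-frame CNN features, then an MLP into a word2vec-sized
    space; a rough single-channel stand-in for the two-channel design."""
    def __init__(self, feat_dim=2048, hidden=512, embed_dim=300):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, embed_dim))

    def forward(self, frames):                  # frames: (B, T, feat_dim)
        _, h = self.gru(frames)                 # h: (1, B, hidden)
        return self.mlp(h.squeeze(0))           # (B, embed_dim)

model = Video2VecSketch()
video = torch.randn(2, 16, 2048)                # 2 clips, 16 frames each
emb = model(video)
# Zero-shot scoring: cosine similarity to class-name word vectors.
class_vecs = torch.randn(101, 300)              # e.g., UCF101 label embeddings
scores = torch.cosine_similarity(emb.unsqueeze(1), class_vecs.unsqueeze(0), dim=-1)
print(scores.shape)                             # torch.Size([2, 101])
```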
7

Zero Shot Learning for Visual Object Recognition with Generative Models

January 2020 (has links)
abstract: Visual object recognition has achieved great success with advancements in deep learning technologies, and existing recognition models have reached human-level performance on many recognition tasks. However, these models are data-hungry, and their performance is constrained by the amount of training data. Inspired by the human ability to recognize object categories from textual descriptions of objects and prior visual knowledge, the research community has extensively pursued zero-shot learning, in which machine vision models are trained to recognize object categories that are not observed during training. Zero-shot learning models leverage textual information to transfer visual knowledge from seen object categories in order to recognize unseen object categories. Generative models have recently gained popularity here because they synthesize visual features for unseen classes and thereby convert zero-shot learning into a classical supervised learning problem. These generative models are trained on seen classes and are expected to implicitly transfer knowledge from seen to unseen classes; however, their performance is stymied by overfitting to the seen classes, which leads to substandard performance in generalized zero-shot learning. To address this concern, this dissertation proposes a novel generative model that leverages the semantic relationship between seen and unseen categories and explicitly transfers knowledge from seen to unseen categories. Experiments were conducted on several benchmark datasets to demonstrate the efficacy of the proposed model for both zero-shot learning and generalized zero-shot learning. The dissertation also provides a unique Student-Teacher-based generative model for zero-shot learning and concludes with future research directions in this area. / Dissertation/Thesis / Masters Thesis Computer Science 2020
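The feature-generating recipe can be sketched as below: a conditional generator, assumed already trained on seen classes (the GAN/VAE training objective is omitted), synthesizes features for unseen classes, after which an ordinary supervised classifier is fit. Network sizes and embeddings are illustrative, not the dissertation's model:

```python
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

class FeatureGenerator(nn.Module):
    """Conditional generator: noise + class embedding -> visual feature.
    Training on seen classes (e.g., with a GAN or VAE loss) is omitted."""
    def __init__(self, noise_dim=64, embed_dim=300, feat_dim=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + embed_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim), nn.ReLU())

    def forward(self, z, c):
        return self.net(torch.cat([z, c], dim=1))

G = FeatureGenerator()                          # assume weights already fit
unseen_embeds = torch.randn(10, 300)            # 10 unseen class embeddings

# Synthesize labeled features for unseen classes, turning ZSL into
# ordinary supervised learning.
feats, labels = [], []
for cls, c in enumerate(unseen_embeds):
    z = torch.randn(50, 64)
    feats.append(G(z, c.expand(50, -1)).detach())
    labels += [cls] * 50
clf = LogisticRegression(max_iter=1000).fit(torch.cat(feats).numpy(), labels)
```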
8

Domain-Aware Continual Zero-Shot Learning

Yi, Kai 29 November 2021 (has links)
We introduce Domain-Aware Continual Zero-Shot Learning (DACZSL), the task of sequentially recognizing images of unseen categories in unseen domains. We create DACZSL on top of the DomainNet dataset by dividing it into a sequence of tasks in which classes are incrementally provided on seen domains during training, while evaluation is conducted on unseen domains for both seen and unseen classes. We also propose a novel Domain-Invariant CZSL Network (DIN), which outperforms state-of-the-art baseline models adapted to the DACZSL setting. We adopt a structure-based approach to alleviate forgetting of knowledge from previous tasks, using a small per-task private network in addition to a global shared network. To encourage the private networks to capture domain- and task-specific representations, we train our model with a novel adversarial knowledge-disentanglement setting that makes the global network task-invariant and domain-invariant over all tasks. Our method also learns a class-wise learnable prompt to obtain better class-level text representations, which serve as side information enabling zero-shot prediction of future unseen classes. Our code and benchmarks are available at https://zero-shot-learning.github.io/daczsl.
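A bare-bones sketch of the shared/private structure (the adversarial disentanglement and prompt learning are not shown); module sizes and the task setup are invented for illustration:

```python
import torch
import torch.nn as nn

class SharedPrivateNet(nn.Module):
    """Structure-based continual learner: one global shared encoder plus a
    small private encoder per task; a rough sketch of the shared/private
    split described above."""
    def __init__(self, in_dim=512, hid=256, out=300):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU())
        self.private = nn.ModuleList()          # grows by one per task
        self.head = nn.Linear(2 * hid, out)
        self.in_dim, self.hid = in_dim, hid

    def add_task(self):
        self.private.append(nn.Sequential(nn.Linear(self.in_dim, self.hid),
                                          nn.ReLU()))

    def forward(self, x, task_id):
        h = torch.cat([self.shared(x), self.private[task_id](x)], dim=1)
        return self.head(h)

net = SharedPrivateNet()
net.add_task(); net.add_task()                  # two tasks seen so far
out = net(torch.randn(4, 512), task_id=1)       # route through task 1's branch
print(out.shape)                                # torch.Size([4, 300])
```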
9

VISUAL AND SEMANTIC KNOWLEDGE TRANSFER FOR NOVEL TASKS

Ye, Meng January 2019 (has links)
Data is a critical component in a supervised machine learning system. Many successful applications of learning systems on various tasks are based on a large amount of labeled data. For example, deep convolutional neural networks have surpassed human performance on ImageNet classification, which consists of millions of labeled images. However, one challenge for conventional supervised learning systems is their generalization ability: once a model is trained on a specific dataset, it can only perform the task on the seen classes and cannot be used for novel unseen classes. To make the model work on new classes, one has to collect and label new data and then re-train the model. Collecting data and labeling it is labor-intensive and costly, and in some cases even impossible. Also, there is an enormous number of different tasks in the real world, and it is not practical to create a dataset for each of them. These problems raise the need for Transfer Learning, which aims to use data from a source domain to improve the performance of a model on a target domain, where the two domains have different data or different tasks. One specific case of transfer learning is Zero-Shot Learning. It deals with the situation where the source and target domains have the same data distribution but not the same set of classes. For example, a model given animal images of 'cat' and 'dog' for training will be tested on classifying 'tiger' and 'wolf' images, which it has never seen. Unlike conventional supervised learning, Zero-Shot Learning does not require training data in the target domain to perform classification. This property gives ZSL the potential to be broadly applied in applications where a system is expected to tackle unexpected situations. In this dissertation, we develop algorithms that help a model effectively transfer visual and semantic knowledge learned from a source task to a target task. More specifically, we first develop a model that learns a uniform visual representation of semantic attributes, which helps alleviate the domain-shift problem in Zero-Shot Learning. Second, we develop an ensemble network architecture with a progressive training scheme, which transfers source-domain knowledge to the target domain in an end-to-end manner. Lastly, we move a step beyond ZSL and explore Label-less Classification, which transfers knowledge from pre-trained object detectors into scene classification tasks. Our label-less classification takes advantage of word embeddings trained from unorganized online text, eliminating the need for expert-defined semantic attributes for each class. Through comprehensive experiments, we show that the proposed methods can effectively transfer visual and semantic knowledge between tasks and achieve state-of-the-art performance on standard datasets. / Computer and Information Science
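The label-less classification idea — scoring scene classes by how close their name's word vector lies to the word vectors of detected objects — might look like the following sketch; the toy 3-D "word vectors" and class names are placeholders for real embeddings and detector outputs:

```python
import numpy as np

def label_less_scene_score(detected_objects, scene_classes, word_vec):
    """Score each scene class by the cosine similarity of its name's word
    vector to the mean word vector of objects a pre-trained detector found."""
    def unit(name):
        v = np.asarray(word_vec[name], dtype=float)
        return v / np.linalg.norm(v)
    obj_mean = np.mean([unit(o) for o in detected_objects], axis=0)
    return {s: float(unit(s) @ obj_mean) for s in scene_classes}

# Toy 3-D "word vectors"; real ones would come from e.g. word2vec or GloVe.
wv = {"stove": [1, 0, 0], "pan": [0.9, 0.1, 0],
      "kitchen": [1, 0.1, 0], "bedroom": [0, 1, 0]}
print(label_less_scene_score(["stove", "pan"], ["kitchen", "bedroom"], wv))
```

No scene-labeled training images are needed: the detector's object vocabulary and publicly available word embeddings carry all the supervision.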
10

Improving Zero-Shot Learning via Distribution Embeddings

Chalumuri, Vivek January 2020 (has links)
Zero-Shot Learning (ZSL) for image classification aims to recognize images from novel classes for which we have no training examples. A common approach is to transfer knowledge from seen to unseen classes using auxiliary semantic information about the class labels in the form of class embeddings. Most existing methods represent image features and class embeddings as point vectors, and such vector representations limit the expressivity with which the intra-class variability of image classes can be modeled. In this thesis, we propose three novel ZSL methods that represent image features and class labels as distributions and learn their corresponding parameters as distribution embeddings, so the intra-class variability of image classes is better modeled. The first is a Triplet model, in which image features and class embeddings are projected as Gaussian distributions into a common space and their associations are learned by metric learning. Next is a Triplet-VAE model, in which two VAEs are trained with triplet-based distributional alignment for ZSL. The third is a simple Probabilistic Classifier for ZSL inspired by energy-based models. When evaluated on the common benchmark ZSL datasets, the proposed methods improve over existing state-of-the-art methods in both the traditional ZSL and the more challenging Generalized-ZSL (GZSL) settings.
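A sketch of the ingredients of the first (Triplet) model: an encoder that outputs a Gaussian (mean and standard deviation) per input, and a triplet loss over a closed-form distance between diagonal Gaussians. The 2-Wasserstein distance and all sizes below are assumptions; the thesis's abstract does not commit to this exact formulation:

```python
import torch
import torch.nn as nn

def w2_diag(mu1, sig1, mu2, sig2):
    """Squared 2-Wasserstein distance between diagonal Gaussians:
    ||mu1 - mu2||^2 + ||sig1 - sig2||^2 (closed form for diagonal covariance,
    with sig the per-dimension standard deviations)."""
    return ((mu1 - mu2) ** 2).sum(-1) + ((sig1 - sig2) ** 2).sum(-1)

class GaussianEmbed(nn.Module):
    """Map an input feature to a distribution (mean + std) instead of a point,
    so intra-class variability can be modeled explicitly."""
    def __init__(self, in_dim=2048, out_dim=128):
        super().__init__()
        self.mu = nn.Linear(in_dim, out_dim)
        self.log_sig = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return self.mu(x), self.log_sig(x).exp()

embed = GaussianEmbed()
anchor, pos, neg = (embed(torch.randn(8, 2048)) for _ in range(3))
margin = 1.0
# Pull matching distributions together, push mismatched ones apart.
loss = torch.relu(w2_diag(*anchor, *pos) - w2_diag(*anchor, *neg) + margin).mean()
loss.backward()
```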
