Global ETD Search

1	Hanolistic: A Hierarchical Automatic Image Annotation System Using Holistic Approach Oztimur, Ozge 01 January 2008 Automatic image annotation is the process of assigning keywords to digital images based on their content. In one sense, it is a mapping from visual content information to semantic context information. In this thesis, we propose a novel approach to the automatic image annotation problem, in which annotation is formulated as a multivariate mapping from a set of independent descriptor spaces, representing a whole image, to a set of words, representing class labels. For this purpose, a hierarchical annotation architecture with two layers, named HANOLISTIC (Hierarchical Image Annotation System Using Holistic Approach), is defined. At the first layer, each annotator, called a level-0 annotator, is fed a set of distinct descriptors extracted from the whole image. This lets each annotator represent the image by a different visual property. Because we use the whole image, the problematic segmentation process is avoided. Each annotator is trained under a supervised learning paradigm in which each word is represented by a class label. Note that this approach differs slightly from classical training approaches, in which each data item has a unique label. In the proposed system, since each image has one or more annotating words, we assume that an image can belong to more than one class. The outputs of the level-0 annotators indicate the membership values of the words in the vocabulary for an image. These membership values are then aggregated at the second layer by various rules to obtain the meta-layer annotator. The rules employed in this study involve summation and/or weighted summation of the outputs of the level-0 annotators. Finally, a set of words from the vocabulary is selected based on the ranking of the meta-layer output.
The hierarchical annotation system proposed in this thesis outperforms state-of-the-art annotation systems based on segmental and holistic approaches. The proposed system is examined in depth and compared with other systems in the literature using several performance criteria. QA General 15707
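The meta-layer step described in this abstract (summation or weighted summation of level-0 membership values, followed by ranking) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names `aggregate_memberships` and `top_words`, the descriptor names, the membership scores, and the weights are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def aggregate_memberships(
    level0_outputs: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Combine per-annotator word membership values by (weighted) summation.

    With weights=None every annotator counts equally (plain summation).
    """
    if weights is None:
        weights = [1.0] * len(level0_outputs)
    if len(weights) != len(level0_outputs):
        raise ValueError("need exactly one weight per level-0 annotator")
    combined: Dict[str, float] = {}
    for weight, scores in zip(weights, level0_outputs):
        for word, membership in scores.items():
            combined[word] = combined.get(word, 0.0) + weight * membership
    return combined


def top_words(combined: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring words (ties broken alphabetically)."""
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


# Hypothetical outputs of two level-0 annotators, each trained on one
# whole-image descriptor (the descriptor names are assumptions).
color_scores = {"sky": 0.9, "grass": 0.2, "water": 0.6}
texture_scores = {"sky": 0.3, "grass": 0.8, "water": 0.5}

summed = aggregate_memberships([color_scores, texture_scores])
weighted = aggregate_memberships([color_scores, texture_scores], weights=[1.0, 2.0])

annotation_summed = top_words(summed, 2)
annotation_weighted = top_words(weighted, 2)
```

In this toy case the choice of weights changes which words are selected, which is presumably why the abstract lists both plain and weighted summation among the aggregation rules.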
2	ClarQue: Chatbot Recognizing Ambiguity in the Conversation and Asking Clarifying Questions Mody, Shreeya Himanshu 31 July 2020 Recognizing when we need more information and asking clarifying questions are integral to communication in our day-to-day life. They help us complete our mental model of the world and eliminate confusion. Chatbots need this technique to collaborate meaningfully with humans. We have investigated a process for building an automated system that mimics human communication behavior using knowledge graphs, weights, an ambiguity test, and a response generator. The system takes dialog text as input and, based on the chatbot's knowledge about the world and the user, decides whether it has enough information or requires more. Based on that decision, the chatbot generates dialog output text: an answer if a question was asked, a statement if there are no doubts, or a clarifying question if there is any ambiguity. The effectiveness of these features is supported by an empirical study, which suggests that they are very useful in a chatbot not only for retrieving crucial information but also for keeping the flow and context of the conversation intact. knowledge graph Stanford NLP annotators chatbot clarifying questions ambiguity Physical Sciences and Mathematics
3	Emergsem : une approche d'annotation collaborative et de recherche d'images basée sur les sémantiques émergentes / Emergsem : an approach of collaborative annotation and retrieval of images based on semantics emergent Zomahoun, Damien Esse 05 June 2015 Extracting the semantics of an image is a process that requires deep analysis of the image content. It refers to the interpretation of images from a human point of view. In this case, the semantics of an image may be generic (e.g., a vehicle) or specific (e.g., a bicycle). It consists in extracting single or multiple semantics from an image in order to facilitate its retrieval. These objectives clearly show that semantic extraction is not a new research field. This thesis presents an approach to collaborative annotation and retrieval of images based on emergent semantics. It addresses, first, how annotators can describe and represent image content based on visual information and, second, how image retrieval can be greatly improved by recent techniques such as clustering and recommendation. Achieving these goals requires exploiting implicit image content description tools, the interactions of annotators who describe the semantics of images, and the interactions of users who use the resulting semantics to retrieve images. In this thesis, we focus on Semantic Web tools, in particular ontologies, to produce structured descriptions of images. An ontology represents the objects present in an image and the relationships between them (image scenes). In other words, it formally represents the different types of objects and their relationships.
An ontology encodes the relational structure of concepts that can be used to describe and reason. This makes ontologies eminently suited to many problems, such as the semantic description of images, that require prior knowledge as well as descriptive and normative capacity. The contribution of this thesis focuses on three main points: semantic representation, collaborative semantic annotation, and semantic retrieval of images. Semantic representation provides a tool for capturing the semantics of images; to this end, we propose an application ontology derived from a generic ontology. The collaborative semantic annotation we propose makes the semantics of images emerge from the semantics proposed by a community of annotators. Semantic retrieval searches for images using the semantics provided by collaborative semantic annotation. It is based on two techniques: clustering and recommendation. Clustering groups images similar to the user's query, and recommendation proposes semantics to users based on their static and dynamic profiles. Recommendation consists of three steps: forming the user community, acquiring user profiles, and classifying user profiles with Galois algebra. Experiments were conducted to validate the approaches proposed in this work. Annotation Collaboration Images Recommendations Annotators Ontology Emergent Semantic 004.678
4	Learning from Multiple Knowledge Sources Zhang, Ping January 2013 In supervised learning, it is usually assumed that true labels are readily available from a single annotator or source. However, recent advances in corroborative technology have given rise to situations where the true label of the target is unknown. In such problems, multiple sources or annotators are often available that provide noisy labels of the targets. In these multi-annotator problems, building a classifier in the traditional single-annotator manner, without regard for the annotator properties, may not be effective in general. In recent years, how to make the best use of the labeling information provided by multiple annotators to approximate the hidden true concept has drawn the attention of researchers in machine learning and data mining. In our previous work, we developed a probabilistic method (the MAP-ML algorithm) that iteratively evaluates the different annotators and gives an estimate of the hidden true labels. However, the method assumes that the error rate of each annotator is consistent across all the input data. This is an impractical assumption in many cases, since annotator knowledge can fluctuate considerably depending on the groups of input instances. In this dissertation, one of our proposed methods, the GMM-MAPML algorithm, follows MAP-ML but relaxes the data-independent assumption, i.e., we assume an annotator may not be consistently accurate across the entire feature space. GMM-MAPML uses a Gaussian mixture model (GMM) and the Bayesian information criterion (BIC) to find the best-fitting model to approximate the distribution of the instances. Then the maximum a posteriori (MAP) estimate of the hidden true labels and the maximum-likelihood (ML) estimate of the quality of the multiple annotators at each Gaussian component are computed alternately.
Recent studies show that employing more annotators regardless of their expertise does not necessarily yield the best aggregate performance. In this dissertation, we also propose a novel algorithm that integrates multiple annotators by Aggregating Experts and Filtering Novices, which we call AEFN. AEFN iteratively evaluates annotators, filters out the low-quality annotators, and re-estimates the labels based only on information obtained from the good annotators. The noisy annotations we integrate can come from any combination of human annotators and previously existing machine-based classifiers, and thus AEFN can be applied to many real-world problems. Experiments on emotional speech classification, CASP9 protein disorder prediction, and biomedical text annotation show a significant performance improvement of the proposed methods (i.e., GMM-MAPML and AEFN) over the majority voting baseline and the previous data-independent MAP-ML method. Recent experiments include predicting novel drug indications (i.e., drug repositioning) for both approved drugs and new molecules by integrating multiple chemical, biological or phenotypic data sources. / Computer and Information Science Computer Science Information Science Bioinformatics Crowdsourcing Data-dependent Experts Data Mining Drug Repositioning Machine Learning Multiple Annotators
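The evaluate, filter, and re-estimate loop that this abstract attributes to AEFN, and the majority voting baseline it is compared against, can be illustrated with the toy sketch below. It is not the AEFN algorithm itself: the abstract does not give the scoring model, so this sketch swaps in simple agreement with the current majority vote as the annotator quality measure. The names `majority_vote` and `filter_and_aggregate`, the `min_agreement` threshold, and the example labels are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def majority_vote(labels: Dict[str, List[int]], annotators: Set[str]) -> List[int]:
    """Baseline: per-item majority label among the given annotators.

    Ties are broken in favor of the smallest label value.
    """
    n_items = len(labels[next(iter(annotators))])
    estimates = []
    for i in range(n_items):
        votes = Counter(labels[a][i] for a in annotators)
        label, _ = max(votes.items(), key=lambda kv: (kv[1], -kv[0]))
        estimates.append(label)
    return estimates


def filter_and_aggregate(
    labels: Dict[str, List[int]],
    min_agreement: float,
    max_rounds: int = 10,
) -> Tuple[List[int], Set[str]]:
    """Iteratively drop annotators who disagree too often with the estimate."""
    kept = set(labels)
    estimates = majority_vote(labels, kept)
    for _ in range(max_rounds):
        # Score each remaining annotator by agreement with current estimates.
        agreement = {
            a: sum(x == y for x, y in zip(labels[a], estimates)) / len(estimates)
            for a in kept
        }
        good = {a for a in kept if agreement[a] >= min_agreement}
        if not good or good == kept:
            break  # nothing left to filter, or filtering would remove everyone
        kept = good
        estimates = majority_vote(labels, kept)  # re-estimate from good annotators
    return estimates, kept


# Hypothetical binary labels from four annotators on six items.
noisy_labels = {
    "expert_a": [1, 0, 1, 1, 0, 1],
    "expert_b": [1, 0, 1, 1, 0, 0],
    "expert_c": [1, 0, 1, 0, 0, 1],
    "novice_d": [0, 1, 0, 1, 1, 0],
}

baseline = majority_vote(noisy_labels, set(noisy_labels))
filtered, kept = filter_and_aggregate(noisy_labels, min_agreement=0.6)
```

In this toy data, dropping the low-agreement annotator changes the estimate for the last item, where the four-way vote was tied.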
5	Apprentissage supervisé à partir des multiples annotateurs incertains / Supervised Learning from Multiple Uncertain Annotators Wolley, Chirine 01 December 2014 In supervised learning tasks, obtaining the ground-truth label for each instance of the training dataset can be difficult, time-consuming and/or expensive. With the advent of infrastructures such as the Internet, a growing number of web services offer crowdsourcing as a way to collect a sufficiently large set of labels from internet users. These services make it easy to collect labels from anonymous annotators and thus considerably simplify the process of building labeled datasets. Nonetheless, the main drawback of crowdsourcing services is their lack of control over the annotators, whose levels of expertise can be very heterogeneous, and their inability to verify the accuracy of the labels and the level of expertise of each labeler. Hence, managing the annotators' uncertainty is key to learning from imperfect annotations. This thesis proposes three probabilistic algorithms for learning from multiple uncertain annotators. IGNORE generates a classifier that predicts the label of a new instance and evaluates the performance of each annotator according to their level of uncertainty. X-IGNORE assumes that the performance of the annotators depends both on their uncertainty and on the quality of the data they annotate. Finally, ExpertS addresses the problem of annotator selection when generating the classifier: it eliminates the least-performing annotators and learns the classifier based only on the labels of the good annotators (experts). We conducted a large set of experiments on synthetic and real-world medical data to evaluate our models. The results demonstrate the performance and accuracy of our models compared with previous state-of-the-art solutions in this context. Supervised learning Uncertainty Multiple annotators Properties of labelers Data quality Bayesian analysis EM algorithm 004
