491

Mesure sans référence de la qualité des vidéos haute définition diffusées avec des pertes de transmission / No-Reference Video Quality Assessment of High Definition Video Streams Delivered with Losses

Boujut, Hugo, 24 September 2012
This Ph.D. thesis has two goals: automatically detecting frozen frames in broadcast video, and assessing without reference the quality of broadcast video streams (IP and DVB-T) delivered over lossy networks. The work was carried out in a research project conducted jointly by LaBRI and the Audemat WorldCast Systems company. The first no-reference quality indicator is frozen-frame detection. Although this problem has been well studied in past decades, the challenge here is to embed a detection method in the Audemat GoldenEagle equipment, whose limited computing resources do not allow real-time HD video decoding. Three methods are proposed: MV, based on the compressed stream's motion vectors; DC, based on the DC coefficients of the DCT; and SURF, based on SURF feature points. The MV and DC methods require only partial decoding of the compressed video stream, which allows real-time analysis on the GoldenEagle equipment, and both outperform the frame-difference baseline; however, they apply only to MPEG-2 and H.264 streams, which motivates the third, SURF-based method. As a second step toward a no-reference metric, the thesis studies the visual perception of transmission impairments and proposes a full-reference metric based on saliency maps: the Weighted Mean Squared Error (WMSE), i.e. the MSE weighted by a spatio-temporal saliency map computed on the impaired frame. The role of the saliency map is to distinguish noticeable from unnoticeable transmission impairments, so that each pixel difference in the MSE computation is emphasized or attenuated according to its saliency. Several improvements over the state of the art are brought to the saliency map computation process, in particular new strategies for fusing the spatial and temporal saliency maps. A no-reference metric is then developed: the Weighted Macro-Block Error Rate (WMBER), which relies on the saliency map and on macro-block error detection. Error detection locates the impaired macro-blocks in the frame; since these blocks are concealed with more or less success during decoding, the saliency map provides the perceived impairment strength of each macro-block. Several psycho-visual studies have shown that semantics plays an important role in visual scene perception, and that human faces and text attract gaze the most. A semantic dimension, based on the Viola-Jones face detector, is therefore added to the spatio-temporal saliency model. To predict the subjective Mean Opinion Score (MOS) from objective measures such as WMBER, WMSE, PSNR or SSIM, a supervised learning approach called Similarity Weighted Average (SWA) is used, with several improvements over the original algorithm. For evaluation, a psycho-visual experiment with 50 subjects was carried out, as well as an eye-tracking experiment to measure the accuracy of the saliency map models; both experiments were conducted in collaboration with Ben Gurion University, Israel. The resulting High-Definition video database is being transferred to the COST Qualinet action. WMBER and WMSE performance is compared with reference metrics such as SSIM and PSNR, and the proposed metrics are also tested on a standard-definition video database provided by the IRCCyN research laboratory.
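To make the weighting idea concrete, here is a minimal numpy sketch of a saliency-weighted macro-block error rate. This is not the thesis's exact WMBER: the saliency map and the macro-block error mask are assumed to be given, and the weighting scheme is a simplified assumption.

```python
import numpy as np

def weighted_mb_error_rate(saliency, error_mask):
    """Saliency-weighted macro-block error rate (simplified sketch).

    saliency   : 2D array of per-macro-block saliency weights (>= 0).
    error_mask : 2D boolean array, True where a macro-block was
                 detected as impaired.
    Returns a score in [0, 1]: 0 = no weighted error, 1 = every
    salient macro-block is impaired.
    """
    saliency = np.asarray(saliency, dtype=float)
    total = saliency.sum()
    if total == 0:
        return 0.0
    return float((saliency * error_mask).sum() / total)

# Hypothetical 4x4 grid of macro-blocks: an error on a salient block
# should weigh more than the same error on a low-saliency block.
sal = np.array([[0.1, 0.1, 0.1, 0.1],
                [0.1, 0.9, 0.9, 0.1],
                [0.1, 0.9, 0.9, 0.1],
                [0.1, 0.1, 0.1, 0.1]])
err_center = np.zeros((4, 4), bool); err_center[1, 1] = True
err_corner = np.zeros((4, 4), bool); err_corner[0, 0] = True
print(weighted_mb_error_rate(sal, err_center))  # higher: salient error
print(weighted_mb_error_rate(sal, err_corner))  # lower: barely noticeable
```

The two toy masks show the intended behaviour: a single impaired block scores much higher when it falls in the salient region.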
492

Classification automatique pour la compréhension de la parole : vers des systèmes semi-supervisés et auto-évolutifs / Machine learning applied to speech language understanding : towards semi-supervised and self-evolving systems

Gotab, Pierre, 4 December 2012
Automatic speech understanding lies at the confluence of two broad fields: automatic speech recognition and machine learning. A major problem in this domain is obtaining a corpus large enough to train accurate statistical models. Speech corpora for training understanding models require substantial human involvement, notably for transcription and semantic annotation; their production cost is high, which is why they are available only in limited quantity. This thesis aims primarily at reducing this need for human intervention in two ways: first, by reducing the amount of annotated corpus needed to obtain a model, using semi-supervised learning techniques (Self-Training, Co-Training and Active Learning); and second, by exploiting the answers of the system's end users to improve the understanding model. This last point addresses a second problem faced by automatic speech understanding systems and treated in this thesis: the need to regularly adapt their models to changes in user behaviour or to modifications of the services offered by the system.
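As an illustration of the simplest of the semi-supervised techniques mentioned, self-training, here is a scikit-learn sketch on synthetic data; the dataset, confidence threshold and classifier are assumptions for illustration, not the thesis's setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy setup: only 20 of 500 examples keep their labels (-1 = unlabeled).
X, y = make_classification(n_samples=500, random_state=0)
rng = np.random.default_rng(0)
labels = np.full(500, -1)
known = rng.choice(500, size=20, replace=False)
labels[known] = y[known]

threshold = 0.95  # confidence required to accept a pseudo-label
for _ in range(10):  # a few self-training rounds
    mask = labels != -1
    clf = LogisticRegression(max_iter=1000).fit(X[mask], labels[mask])
    proba = clf.predict_proba(X[~mask])
    confident = proba.max(axis=1) >= threshold
    if not confident.any():
        break  # nothing confident left to add
    idx = np.flatnonzero(~mask)[confident]
    labels[idx] = clf.predict(X[idx])  # adopt confident pseudo-labels

print(f"labeled after self-training: {(labels != -1).sum()} / 500")
```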
493

Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision

Täckström, Oscar, January 2013
Contemporary approaches to natural language processing are predominantly based on statistical machine learning from large amounts of text, which has been manually annotated with the linguistic structure of interest. However, such complete supervision is currently only available for the world's major languages, in a limited number of domains and for a limited range of tasks. As an alternative, this dissertation considers methods for linguistic structure prediction that can make use of incomplete and cross-lingual supervision, with the prospect of making linguistic processing tools more widely available at a lower cost. An overarching theme of this work is the use of structured discriminative latent variable models for learning with indirect and ambiguous supervision; as instantiated, these models admit rich model features while retaining efficient learning and inference properties. The first contribution to this end is a latent-variable model for fine-grained sentiment analysis with coarse-grained indirect supervision. The second is a model for cross-lingual word-cluster induction and the application thereof to cross-lingual model transfer. The third is a method for adapting multi-source discriminative cross-lingual transfer models to target languages, by means of typologically informed selective parameter sharing. The fourth is an ambiguity-aware self- and ensemble-training algorithm, which is applied to target language adaptation and relexicalization of delexicalized cross-lingual transfer parsers. The fifth is a set of sequence-labeling models that combine constraints at the level of tokens and types, and an instantiation of these models for part-of-speech tagging with incomplete cross-lingual and crowdsourced supervision. In addition to these contributions, comprehensive overviews are provided of structured prediction with no or incomplete supervision, as well as of learning in the multilingual and cross-lingual settings. Through careful empirical evaluation, it is established that the proposed methods can be used to create substantially more accurate tools for linguistic processing, compared both to unsupervised methods and to recently proposed cross-lingual methods. The empirical support for this claim is particularly strong in the latter case; our models for syntactic dependency parsing and part-of-speech tagging achieve the best results published to date for a large number of target languages, in the setting where no annotated training data is available in the target language.
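To give one concrete flavour of the token- and type-level constraint idea, the toy sketch below restricts a tagger's per-token predictions to the tags a (possibly crowdsourced) dictionary allows for each word type. It is an illustration under invented data and a bare argmax decoder, not the dissertation's model.

```python
import numpy as np

TAGS = ["NOUN", "VERB", "DET"]
# Type-level constraint: a tag dictionary listing the tags each word
# type is allowed to take (hypothetical entries).
tag_dict = {"the": {"DET"}, "dog": {"NOUN"}, "barks": {"VERB", "NOUN"}}

def constrained_tags(sentence, scores):
    """Pick, per token, the highest-scoring tag the dictionary permits.

    scores: (n_tokens, n_tags) array of model scores, assumed given;
    in the dissertation these come from a discriminative model.
    """
    out = []
    for tok, row in zip(sentence, scores):
        allowed = tag_dict.get(tok, set(TAGS))  # unknown word: unconstrained
        masked = [s if t in allowed else -np.inf for t, s in zip(TAGS, row)]
        out.append(TAGS[int(np.argmax(masked))])
    return out

sent = ["the", "dog", "barks"]
scores = np.array([[0.2, 0.1, 0.1],   # model slightly prefers NOUN for "the"
                   [0.6, 0.3, 0.1],
                   [0.5, 0.4, 0.1]])
# The dictionary forces "the" to DET despite the higher NOUN score.
print(constrained_tags(sent, scores))  # ['DET', 'NOUN', 'NOUN']
```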
494

Agrupamento de dados baseado em comportamento coletivo e auto-organização / Data clustering based on collective behavior and self-organization

Gueleri, Roberto Alves, 18 June 2013
Machine learning consists of concepts and techniques that enable computers to improve their performance with experience, that is, to learn from data. One of its main topics is data clustering, which, as the name suggests, seeks to group data according to their similarity. Despite its relatively simple definition, clustering is a computationally complex task, making exhaustive search for the optimal solution prohibitive; its relevance and its challenges make the field an area of intense research. The class of natural phenomena known as collective behavior has also attracted much interest. This stems from the observation that an organized, global state can arise spontaneously from the local interactions within large groups of individuals, characterizing what is known as self-organization or, more precisely, emergence. The intrinsic challenges and relevance of the subject have motivated its study in many branches of science and engineering, and techniques based on collective behavior are being employed in machine learning tasks, showing promise and gaining considerable attention. The objective of the present work was to develop clustering techniques based on collective behavior. Each item of the dataset corresponds to an individual; local interaction rules are defined, and the individuals are then set to interact with each other, so that the patterns that emerge reflect the patterns originally present in the dataset. Approaches based on a dynamics of energy exchange have been proposed: the data remain fixed in their feature space but carry a certain piece of information, the energy, which is progressively exchanged among them, and groups are established among data that reach similar energy states. This work has also addressed semi-supervised learning, whose task is to label data in partially labeled datasets; in this case, an approach based on the motion of the data themselves through the feature space was adopted. Throughout this work, the aim was not only to propose new learning techniques but above all, through many simulations and illustrations, to show how they behave in different scenarios, in an effort to show where the advantage of using collective dynamics in the design of such techniques lies.
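A minimal sketch of the energy-exchange idea follows, under assumptions: invented well-separated blobs, a kNN interaction graph, and a simple neighbour-averaging rule standing in for the proposed dynamics.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two hypothetical well-separated blobs in 2-D.
X = np.vstack([rng.normal(0, 0.3, (30, 2)),
               rng.normal(3, 0.3, (30, 2))])

# kNN graph: each point interacts only with its k nearest neighbours.
k = 5
d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
nbrs = np.argsort(d, axis=1)[:, 1:k + 1]  # skip self at position 0

# Energy-exchange dynamics: each point repeatedly pulls its energy
# toward the mean energy of its neighbours (a consensus process).
energy = rng.random(len(X))
for _ in range(200):
    energy = 0.5 * energy + 0.5 * energy[nbrs].mean(axis=1)

# On these separated blobs the dynamics settles to one energy level per
# blob, so points with similar energy states form the groups.
groups = np.round(energy, 2)
print(np.unique(groups[:30]), np.unique(groups[30:]))
```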
495

Métodos de validação tradicional e temporal aplicados à avaliação de classificadores de RNAs codificantes e não codificantes / Traditional and time validation methods applied to the evaluation of coding and non-coding RNA classifiers

Sá, Clebiano da Costa, 23 March 2018
Ribonucleic acids (RNAs) can be classified into two main classes: protein-coding and non-coding. Coding RNAs, represented by messenger RNAs (mRNAs), carry the information needed for protein synthesis. Non-coding RNAs (ncRNAs) are not translated into proteins, but are involved in several distinct cellular activities and are associated with various diseases such as heart disease, cancer and psychiatric disorders. The discovery of new ncRNAs and their molecular roles advances knowledge in molecular biology and may also drive the development of new therapies against disease. The identification of ncRNAs is an active area of research, and one of the current methods is the classification of transcribed sequences using pattern recognition systems based on their features. Many classifiers have been developed for this purpose, especially in the last three years; an example is the Coding Potential Calculator (CPC), based on Support Vector Machines (SVM). However, other robust algorithms, such as Random Forest (RF), are also recognized for their potential in classification tasks. The most commonly used method for evaluating these tools has been k-fold cross-validation. An issue not considered in this form of validation is the assumption that the frequency distributions within the database, in terms of sequence classes and other variables, do not change over time. If this assumption does not hold, traditional methods such as cross-validation and hold-out may underestimate classification errors. A validation method that takes into account the constant evolution of databases over time is therefore needed to provide a more realistic performance analysis of these classifiers. In this work we compare two methods for evaluating classifiers: temporal hold-out and traditional (atemporal) hold-out. In addition, we test new classification models built by combining different induction algorithms with features from state-of-the-art classifiers and a new feature set. From the hypothesis tests, we observe that both traditional and temporal hold-out validation tend to underestimate classification errors, that evaluation by temporal validation is more reliable, that classifiers trained with parameters calibrated by temporal validation do not improve classification, and that our Random Forest-based classification model, trained with state-of-the-art classifier features plus a new feature set, provides a significant improvement in discriminating coding from non-coding RNAs. Finally, we highlight the potential of the Random Forest algorithm and of the features used for this classification problem, and we suggest the temporal hold-out validation method to obtain more reliable performance estimates for classifiers of protein-coding and non-coding RNAs.
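The difference between the two validation schemes is easy to show in code. The sketch below uses an invented dataset with simulated drift (an assumption for illustration; the thesis works with RNA sequence databases and their deposit dates):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical dataset: each example has a date, and both the features
# and the class prior drift over time (a simulated assumption).
rng = np.random.default_rng(0)
n = 1000
dates = np.sort(rng.uniform(2000, 2018, n))
drift = (dates - 2000) / 18                       # 0 = old, 1 = recent
X = rng.normal(drift[:, None], 1.0, (n, 5))
y = (rng.random(n) < 0.3 + 0.4 * drift).astype(int)

# Traditional hold-out: a random split that ignores time.
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(random_state=0).fit(Xtr, ytr)
print("traditional hold-out:", accuracy_score(yte, rf.predict(Xte)))

# Temporal hold-out: train on the oldest 70%, test on the newest 30%,
# mimicking deployment on data that arrives after training.
cut = int(0.7 * n)
rf = RandomForestClassifier(random_state=0).fit(X[:cut], y[:cut])
print("temporal hold-out:   ", accuracy_score(y[cut:], rf.predict(X[cut:])))
```

Under drift, the temporal estimate is typically the lower and the more honest of the two, which is the thesis's point.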
496

O algoritmo de aprendizado semi-supervisionado co-training e sua aplicação na rotulação de documentos / The semi-supervised learning algorithm co-training applied to label text documents

Matsubara, Edson Takashi, 26 May 2004
In machine learning, the supervised approach usually requires a large number of labeled training examples to induce accurate classifiers. However, labeling is often performed manually, which makes the process costly and time-consuming. By contrast, unlabeled examples are inexpensive and easy to obtain compared with labeled ones. This is especially true for text classification tasks involving on-line data sources such as web pages, email and scientific papers; text classification is of great practical importance given the massive volume of text available on-line. Semi-supervised learning, a relatively new research area in machine learning, represents a blend of supervised and unsupervised learning, and has the potential to reduce the need for expensive labeled data whenever only a small set of labeled examples is available. This work describes the semi-supervised learning algorithm co-training, which requires each example to be described by two distinct views. It should be observed that the two views required by co-training can easily be obtained from textual documents through pre-processing. In this work, several extensions of the co-training algorithm were implemented. Furthermore, a computational environment for text pre-processing, called PreTexT, was implemented in order to apply co-training to text classification problems. Experimental results were obtained on three datasets: two related to text classification and one to web-page classification. The results, which range from excellent to poor, show that co-training, like other semi-supervised learning algorithms, is affected by modelling assumptions in a rather complicated way.
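A compact sketch of the co-training loop on synthetic data follows; the two views, the base classifier and the confidence threshold are assumptions for illustration, not PreTexT's actual output.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Two views per example (in text classification these could be two
# disjoint bags of words produced during pre-processing).
X, y = make_classification(n_samples=400, n_features=20,
                           n_informative=10, n_redundant=0, random_state=0)
views = [X[:, :10], X[:, 10:]]

labels = np.full(400, -1)
labels[:20] = y[:20]  # only 20 labeled examples to start from

for _ in range(15):  # co-training rounds
    changed = False
    for v in (0, 1):  # each view's classifier labels data for the pool
        mask = labels != -1
        pool = np.flatnonzero(~mask)
        if len(pool) == 0:
            break
        clf = GaussianNB().fit(views[v][mask], labels[mask])
        proba = clf.predict_proba(views[v][pool])
        best = pool[proba.max(axis=1) >= 0.99]  # most confident examples
        if len(best):
            labels[best] = clf.predict(views[v][best])
            changed = True
    if not changed:
        break

pseudo = labels != -1
pseudo[:20] = False  # look only at examples labeled by co-training
print(f"pseudo-labeled: {pseudo.sum()} of 380, "
      f"pseudo-label accuracy: {(labels[pseudo] == y[pseudo]).mean():.2f}")
```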
497

Analyse harmonique sur graphes dirigés et applications : de l'analyse de Fourier aux ondelettes / Harmonic Analysis on directed graphs and applications : From Fourier analysis to wavelets

Sevi, Harry, 22 November 2018
The research conducted in this thesis aims to develop a harmonic analysis for functions defined on the vertices of a directed graph. In the era of the data deluge, much data comes in the form of graphs and of data living on those graphs. In order to analyze and exploit such graph data, we need mathematically and numerically efficient methods. This need has led to the emergence of a new theoretical framework called graph signal processing, whose goal is to extend the fundamental concepts of classical signal processing to graphs. Inspired by the multi-scale nature of graphs and graph data, many multi-scale constructions have been proposed; however, they apply only in the undirected setting. Extending harmonic analysis to directed graphs, although natural, turns out to be complex. We therefore propose a harmonic analysis that uses the random walk operator as the starting point of our framework. First, we propose Fourier-type bases formed by the eigenvectors of the random walk operator, and from these bases we derive a notion of frequency by analyzing the variation of the eigenvectors. This frequency analysis, built from the eigenvector basis of the random walk operator, leads us to multi-scale constructions on directed graphs. More specifically, we propose a wavelet frame construction as well as a decimated wavelet construction on directed graphs. We illustrate our harmonic analysis with various examples to show its efficiency and relevance.
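A small numpy sketch of the starting point described above: the random walk operator of a toy directed graph, and a Fourier-type expansion in its eigenvector basis. Sorting eigenvalues by |1 − λ| is a crude stand-in for the thesis's variation-based frequency notion, and diagonalizability of P is an assumption of the sketch.

```python
import numpy as np

# A small directed cycle with a chord, as a toy directed graph.
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 1, 0, 0]], dtype=float)

# Random walk operator P = D^{-1} A (each row sums to 1).
P = A / A.sum(axis=1, keepdims=True)

# Fourier-type basis: eigenvectors of P. For directed graphs the
# spectrum is in general complex, so we order by |1 - eigenvalue| as a
# rough proxy for low-to-high variation.
vals, vecs = np.linalg.eig(P)
order = np.argsort(np.abs(1 - vals))
vals, vecs = vals[order], vecs[:, order]
print("eigenvalues (low to high variation):", np.round(vals, 3))

# Graph Fourier transform of a vertex signal f: its coefficients in the
# eigenvector basis (valid when P is diagonalizable, as it is here).
f = np.array([1.0, 0.0, 0.0, 0.0])
f_hat = np.linalg.solve(vecs, f)
print("reconstruction error:", np.linalg.norm(vecs @ f_hat - f))
```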
498

Weakly supervised learning of deformable part models and convolutional neural networks for object detection / Détection d'objets faiblement supervisée par modèles de pièces déformables et réseaux de neurones convolutionnels

Tang, Yuxing, 14 December 2016
In this dissertation we address the problem of weakly supervised object detection, where the goal is to recognize and localize objects in weakly-labeled images for which object-level annotations are incomplete during training. To this end, we propose two methods that learn two different models of the objects of interest. In the first method, we enhance the weakly supervised Deformable Part-based Models (DPMs) by emphasizing the importance of the location and size of the initial class-specific root filter. We first compute a candidate pool representing the potential locations of the object, as estimates of this root filter, by exploiting a generic objectness measure (region proposals) to combine the most salient regions with "good" region proposals. We then learn the latent class label of each candidate window as a binary classification problem, training category-specific classifiers to coarsely classify each candidate window as either the target object or a non-target. We further improve detection by incorporating contextual information from image classification scores. Finally, we design a flexible enlarging-and-shrinking post-processing procedure that adjusts the DPM outputs to match approximate object aspect ratios, further improving the final detection accuracy. In the second method, we investigate how knowledge about object similarities, from both the visual and the semantic domains, can be transferred to adapt an image classifier into an object detector in a semi-supervised setting on a large-scale database in which only a subset of the object categories is annotated with the bounding boxes needed to train detectors. We propose to transform image-level classifiers based on deep Convolutional Neural Networks (CNNs) into object detectors by modeling the differences between the two on categories that have both image-level and bounding-box annotations, and transferring this information to convert classifiers into detectors for categories without bounding-box annotations. We have evaluated both approaches extensively on several challenging detection benchmarks, e.g. PASCAL VOC, ImageNet ILSVRC and Microsoft COCO. Both approaches compare favorably with the state of the art and show significant improvement over several other recent weakly supervised detection methods.
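To illustrate the candidate-selection step of the first method, here is a schematic sketch: the per-proposal scores and the image-level score are assumed to be given (in the thesis they come from objectness measures and trained classifiers), and context simply re-weights the proposals.

```python
import numpy as np

def select_candidates(proposals, scores, image_label_score, top_k=3):
    """Pick pseudo ground-truth boxes among region proposals (sketch).

    proposals         : (n, 4) boxes as (x1, y1, x2, y2).
    scores            : per-proposal scores for the target class,
                        assumed given by an objectness/classifier model.
    image_label_score : image-level classification score, used as
                        context to damp detections in unlikely images.
    """
    ctx = scores * image_label_score          # contextual re-weighting
    order = np.argsort(ctx)[::-1][:top_k]     # keep the top candidates
    return proposals[order], ctx[order]

# Hypothetical proposals and scores for one image.
boxes = np.array([[10, 10, 60, 60], [12, 8, 58, 62], [200, 150, 260, 210]])
cls_scores = np.array([0.9, 0.85, 0.2])
top_boxes, top_scores = select_candidates(boxes, cls_scores,
                                          image_label_score=0.8)
print(top_boxes, np.round(top_scores, 2))
```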
499

Obstacle detection and emergency exit sign recognition for autonomous navigation using camera phone

Mohammed, Abdulmalik, January 2017
In this research work, we develop an obstacle detection and emergency exit sign recognition system on a mobile phone by extending the Features from Accelerated Segment Test (FAST) detector with a Harris corner filter. The first step required by many vision-based applications is the detection of objects of interest in an image. Hence, we introduce an emergency exit sign detection method based on a colour histogram: the hue and saturation components of an HSV colour model are processed into features to build a 2D colour histogram, which we backproject to detect emergency exit signs in a captured image as the first task before performing recognition. The classification results show that the 2D histogram is fast and discriminates accurately between objects and background. One of the challenges confronting object recognition methods is the choice of image feature to compute. We therefore present two feature detector and descriptor methods based on the FAST detector with a Harris corner filter. The first method is called Upright FAST-Harris and Binary detector (U-FaHB); the second, Scale Interpolated FAST-Harris and Binary (SIFaHB). In both methods, feature points are extracted using the accelerated segment test detector and a Harris filter that returns the strongest corner points as features; in SIFaHB, feature points are additionally extracted across the image plane and along the scale-space. The modular design of these detectors allows the integration of descriptors of any kind, so we combine them with a binary test descriptor such as BRIEF to compute feature regions. These detectors and the combined descriptor are evaluated on images observed under various geometric and photometric transformations, and their performance is compared with other detectors and descriptors. The results show that our proposed feature detector and descriptor method is fast and performs better than methods such as SIFT, SURF, ORB, BRISK and CenSurE. Based on the potential of the U-FaHB detector and descriptor, we extended it to optical flow computation, in what we term the Nearest-flow method, which can compute flow vectors for use in obstacle detection. As with any new method, we evaluated the Nearest-flow method on real and synthetic image sequences and compared its performance with methods such as Lucas-Kanade, Farneback and SIFT-flow. The results show that our Nearest-flow method is faster to compute and performs better on real scene images than the other methods. In the final part of this research, we demonstrate the application potential of the proposed methods by developing an obstacle detection and exit sign recognition system on a camera phone; the results show that the methods have the potential to solve this vision-based object detection and recognition problem.
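The colour-histogram backprojection step can be sketched directly with OpenCV's calcHist/calcBackProject. The synthetic template and scene below are stand-ins so the example runs without image files; real exit-sign colours and thresholds would differ.

```python
import cv2
import numpy as np

# Synthetic stand-in for an exit-sign template (a green patch) and a
# scene containing one similar region and one distractor.
template = np.zeros((40, 40, 3), np.uint8); template[:] = (0, 180, 0)
scene = np.zeros((120, 160, 3), np.uint8)
scene[30:70, 50:90] = (0, 180, 0)          # "exit sign" region
scene[80:110, 10:40] = (200, 50, 50)       # distractor region

# 2D hue-saturation histogram of the template (hue is 0-180 in OpenCV).
hsv_t = cv2.cvtColor(template, cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv_t], [0, 1], None, [30, 32], [0, 180, 0, 256])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

# Backproject the histogram onto the scene: each pixel receives the
# likelihood of belonging to the template's colour distribution.
hsv_s = cv2.cvtColor(scene, cv2.COLOR_BGR2HSV)
backproj = cv2.calcBackProject([hsv_s], [0, 1], hist, [0, 180, 0, 256], 1)
_, mask = cv2.threshold(backproj, 50, 255, cv2.THRESH_BINARY)
print("pixels flagged as sign-coloured:", int((mask > 0).sum()))
```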
500

Aide au diagnostic du cancer de la prostate par IRM multi-paramétrique : une approche par classification supervisée / Computer-aided diagnosis of prostate cancer using multi-parametric MRI : a supervised learning approach

Niaf, Émilie, 10 December 2012
Prostate cancer is the second leading cause of death among men in France. Multi-parametric MRI is considered the most promising technique for mapping the cancer, opening the way to focal treatments as an alternative to radical prostatectomy. Nevertheless, it remains difficult to interpret and is subject to strong inter- and intra-observer variability, which motivates the development of expert systems to assist radiologists in their diagnosis. We propose an original computer-aided diagnosis (CAD) system that returns a malignancy score for any suspicious region outlined on MR images, which radiologists can use as a second opinion. The CAD performance is evaluated on a clinical database of 30 patients, annotated exhaustively and reliably thanks to the histological ground truth obtained from prostatectomy specimens. We then demonstrate the influence of this system under clinical conditions through a ROC analysis involving 12 radiologists, with and without the tool, and show a significant increase in diagnostic accuracy and rating confidence, and a decrease in inter-expert variability. Building an anatomo-radiological correlation database is a complex and tedious task, so that many studies have no choice but to base their evaluation on the subjective analysis of one experienced radiologist, which is thus bound to contain uncertainties. We propose a new classification scheme, based on the support vector machine (SVM) algorithm, that can account for uncertainty about the class membership (e.g. benign/malignant) of some samples in the training set during the learning step. The results obtained, both on simulated toy examples and on our clinical database, demonstrate the potential of this new algorithm, in particular for CAD applications, but also more generally for any machine learning application relying on a probabilistically labelled dataset.
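One simple way to approximate learning from uncertain labels, sketched below, is to down-weight uncertain training samples via SVC's sample_weight. This is a standard approximation under simulated data, not the thesis's modified SVM objective.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical training set whose labels come with a probability of
# malignancy rather than a hard 0/1 tag (simulated here).
X, y_true = make_classification(n_samples=300, random_state=0)
rng = np.random.default_rng(0)
p_malign = np.clip(y_true + rng.normal(0, 0.25, 300), 0, 1)

# Hard labels from the probabilities, with weights that vanish as a
# sample approaches full uncertainty (p = 0.5) and grow as it becomes
# certain (p near 0 or 1).
y_hard = (p_malign >= 0.5).astype(int)
weights = np.abs(2 * p_malign - 1)

clf = SVC(kernel="rbf").fit(X, y_hard, sample_weight=weights)
print("training accuracy vs. true labels:",
      round(float(clf.score(X, y_true)), 3))
```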
