211

Computational Intelligence Based Classifier Fusion Models for Biomedical Classification Applications

Chen, Xiujuan 27 November 2007 (has links)
The generalization ability of a machine learning algorithm often depends on its initialization, parameter settings, training set, or feature selection. For instance, the performance of an SVM classifier relies largely on whether the selected kernel function suits the real application data. To improve on the performance of individual classifiers, this dissertation proposes classifier fusion models that use computational intelligence techniques to combine different classifiers. The first fusion model, T1FFSVM, combines multiple SVM classifiers by constructing a fuzzy logic system. T1FFSVM can be improved by tuning the fuzzy membership functions (MFs) of the linguistic variables with genetic algorithms; the improved model is called GFFSVM. To better handle the uncertainties in fuzzy MFs and in classification data, T1FFSVM can also be improved by applying type-2 fuzzy logic to construct a type-2 fuzzy classifier fusion model (T2FFSVM). T1FFSVM, GFFSVM, and T2FFSVM use accuracy as the classifier performance measure. Because AUC (the area under an ROC curve) has been shown to be a better classifier performance metric, AUC-based classifier fusion models are also proposed in the dissertation as a comparison study. Experiments on biomedical datasets demonstrate promising performance of the proposed classifier fusion models compared with their individual component classifiers. The proposed models also perform better than many existing classifier fusion methods. The dissertation also uses machine learning and classifier fusion methods to study an interesting phenomenon in biology: how protein structures and sequences are related to each other.
The experiments show that protein segments with similar structures also share similar sequences, which adds new insight to the existing knowledge on the relation between protein sequences and structures: similar sequences share high structure similarity, but similar structures may not share high sequence similarity.
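The contrast between accuracy and AUC as performance measures can be illustrated with a small sketch. This is not the dissertation's code: the function names `auc` and `accuracy`, the 0.5 decision threshold, and the toy scores are assumptions made for illustration. AUC is computed here through its rank interpretation, the probability that a randomly chosen positive example scores higher than a randomly chosen negative one.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of examples classified correctly at a fixed threshold (assumed 0.5)."""
    hits = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)


# Toy classifier outputs (hypothetical values).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(accuracy(scores, labels), auc(scores, labels))
```

Accuracy depends on the chosen threshold, while AUC summarizes the ranking quality over all thresholds, which is one common argument for preferring it when comparing classifiers.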
212

Αναγνώριση ομιλητή / Speaker recognition

Ganchev, Todor 25 June 2007 (has links)
This dissertation deals with speaker recognition in real-world conditions. The main emphasis falls on: (1) evaluation of various speech feature extraction approaches, (2) reduction of the impact of environmental interference on speaker recognition performance, and (3) the study of alternatives to the current state-of-the-art classification techniques. Specifically, within (1), a novel wavelet packet-based speech feature extraction scheme fine-tuned for speaker recognition is proposed. It is derived in an objective manner with respect to speaker recognition performance, in contrast to the state-of-the-art MFCC scheme, which is based on an approximation of human auditory perception. Next, within (2), an advanced noise-robust feature extraction scheme based on MFCC is offered for improving speaker recognition performance in real-world environments. In brief, a model-based noise reduction technique adapted to the specifics of the speaker verification task is incorporated directly into the MFCC computation scheme. This approach demonstrated a significant advantage in real-world, fast-varying environments. Finally, within (3), two novel classifiers, referred to as the Locally Recurrent Probabilistic Neural Network (LR PNN) and the Generalized Locally Recurrent Probabilistic Neural Network (GLR PNN), are introduced. They are hybrids between the Recurrent Neural Network (RNN) and the Probabilistic Neural Network (PNN) and combine the virtues of the generative and discriminative classification approaches. Moreover, these novel neural networks are sensitive to temporal and spatial correlations among consecutive inputs and are therefore capable of exploiting the inter-frame correlations among speech features derived from successive speech frames. The experiments demonstrated that the LR PNN and GLR PNN architectures provide better performance than the original PNN.
213

L’extraction de phrases en relation de traduction dans Wikipédia

Rebout, Lise 06 1900 (has links)
Working with comparable corpora can be useful for enriching bilingual parallel corpora. In such corpora, even if the documents in the target language are not exact translations of those in the source language, one can still find translated words or sentences. The free encyclopedia Wikipedia is a multilingual comparable corpus of several million documents. Our task is to find a general, endogenous method for extracting as many parallel sentences as possible from this source. We work with the English-French language pair, but our method, which uses no external bilingual resources, can be applied to any other language pair. It consists of two steps. The first step detects the article pairs that are most likely to contain translations. This is achieved with a neural network trained on a small data set composed of sentence-aligned articles. The second step selects sentence pairs with another neural network whose outputs are then reinterpreted by a combinatorial optimization algorithm and an extension heuristic. Adding the 560,000 sentence pairs extracted from Wikipedia to the training set of a baseline statistical machine translation system improves the quality of the resulting translations. We make both the aligned data and the extracted corpus available to the scientific community.
214

Stereo vision and LIDAR based Dynamic Occupancy Grid mapping : Application to scenes analysis for Intelligent Vehicles

Li, You 03 December 2013 (has links) (PDF)
Intelligent vehicles require high-performance perception systems. A perception system usually consists of multiple sensors, such as cameras, 2D/3D lidars, or radars. The work presented in this Ph.D. thesis addresses several topics in camera- and lidar-based perception for understanding dynamic scenes in urban environments. It is organized in four parts. In the first part, a stereo-vision-based visual odometry method is proposed after comparing several approaches to image feature detection and feature point association. Based on this comprehensive comparison, a suitable feature detector and feature point association approach are selected to improve the performance of stereo visual odometry. In the second part, independent moving objects are detected and segmented using the results of visual odometry and the U-disparity image. Spatial features are then extracted with a kernel PCA method, and classifiers are trained on these features to recognize common types of moving objects, e.g., pedestrians, vehicles, and cyclists. In the third part, an extrinsic calibration method between a 2D lidar and a stereoscopic system is proposed. The method solves the extrinsic calibration problem by placing a common calibration chessboard in front of the stereoscopic system and the 2D lidar, and by considering the geometric relationship between the cameras of the stereoscopic system. It also integrates sensor noise models and Mahalanobis distance optimization for greater robustness. Finally, dynamic occupancy grid mapping is proposed, based on 3D reconstruction of the environment obtained from stereo vision and lidar data, first separately and then jointly. An improved occupancy grid map is obtained by estimating the pitch angle between the ground plane and the stereoscopic system. The moving object detection and recognition results (from the first and second parts) are incorporated into the occupancy grid map to enrich its semantic content.
All the proposed methods are tested and evaluated with simulated data and with real data acquired by the experimental platform "intelligent vehicle SetCar" of the IRTES-SET laboratory.
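The idea of building an occupancy grid from stereo vision and lidar "separately and then jointly" can be sketched with the textbook log-odds update for a single grid cell. This is only an illustration under stated assumptions, not the thesis's method: the function names `logit` and `fuse_cell`, the uniform prior of 0.5, the independence of the two sensors, and the example probabilities are all hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def fuse_cell(measurements, prior=0.5):
    """Fuse per-sensor occupancy probabilities for one cell.

    Each entry of `measurements` is an inverse-sensor-model probability
    that the cell is occupied (e.g. one from stereo, one from lidar).
    Assumes the measurements are conditionally independent.
    """
    log_odds = logit(prior)
    for p in measurements:
        log_odds += logit(p) - logit(prior)
    # Convert the accumulated log-odds back to a probability.
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


# Hypothetical readings for one cell: stereo says 0.7, lidar says 0.7.
print(fuse_cell([0.7]))       # one sensor alone
print(fuse_cell([0.7, 0.7]))  # both sensors agree, so confidence increases
print(fuse_cell([0.7, 0.3]))  # sensors disagree, so the cell stays uncertain
```

Working in log-odds turns the Bayesian update into a simple sum, which is why it is a common choice when several sensors write into the same grid.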
215

Sistemas inteligentes aplicados em monitoramento de estruturas aeronáuticas. / Intelligent systems applied in monitoring of aeronautical structures.

Luis Antonio Rodrigues Lopes 19 June 2013 (has links)
This work presents the development of intelligent systems applied to the monitoring of aircraft structures, addressing two distinct models. The first is the analysis and classification of ultrasound images of aircraft structures in order to support repair decisions. The scope of the work was defined as a cross section of the wing of a Boeing 707 aircraft. After surface material is removed from areas compromised by corrosion, the thickness is measured over the area of the part. Based on these measurements, Engineering performs a structural analysis, observing the limits set by the maintenance manual, and determines whether a repair is necessary. The second model uses the electromechanical impedance method. A low-cost monitoring system is proposed and applied to an aeronautical aluminum bar with 10 positions for fastening nuts and bolts. The goal of the system is to evaluate, from the impedance curves extracted from the PZT transducer attached to the bar, its ability to classify whether or not the structure is damaged and, if damage exists, to indicate its location and its degree of severity. The following classifiers were used in this work: support vector machine, artificial neural networks, and K nearest neighbors.
216

Modelagem acústica no auxílio ao diagnóstico do funcionamento de motores de usinas termoelétricas. / Acoustic modeling to aid in the diagnosis of the operation of thermoelectric plant motors.

TEIXEIRA JÚNIOR, Adalberto Gomes. 01 May 2018 (has links)
Previous issue date: 2015-07 / Capes / The sound generated by an engine during operation contains information about its state and condition, making it an important source for evaluating the engine's operation without intervening in the equipment. Fault diagnosis is often performed by a human, based on experience gained in the noisy operating environment. Because the operation of an engine is governed by a periodic process, the generated audio signal follows a well-defined pattern, which makes it possible to evaluate the engine's operating condition from that signal. In this context, this research deals with modeling the acoustic signal generated by engines in thermoelectric power plants, using techniques from digital signal processing and artificial intelligence, with the purpose of assisting fault diagnosis and minimizing the human presence in the engine room. The technique is based on the study of the engines' operation and of the acoustic signals they generate, extracting representative signal features in different domains and combining them with machine learning methods to build a multiclassifier that evaluates the engines' operating status. To evaluate the effectiveness of the proposed method, signals extracted from engines of the Borborema Energética S.A. thermoelectric power plant were used, within the REPARAI project (REPair over AiR using Artificial Intelligence, ANEEL code PD-6471-0002/2012). The proposed method achieved an accuracy close to 100%. The approach therefore proved efficient for fault diagnosis, mainly because it is non-invasive and does not require direct contact between the human evaluator and the running engine.
217

Uma análise da aplicação do modelo de Rede Neural RePART em Comitês de classificadores

Santos, Araken de Medeiros 01 February 2008 (has links)
Previous issue date: 2008-02-01 / RePART (Reward/Punishment ART) is a neural model that constitutes a variation of the Fuzzy Artmap model. This network was proposed in order to minimize problems inherent in Artmap-based models, such as the proliferation of categories and misclassification. To that end, RePART makes use of additional mechanisms: an instance counting parameter, a reward/punishment process, and a variable vigilance parameter. The instance counting parameter aims to minimize the misclassification problem, which is a consequence of the sensitivity to noise frequently present in Artmap-based models. The variable vigilance parameter, on the other hand, aims to mitigate the category proliferation problem, decreasing the complexity of the network when it is used in applications with a large number of training patterns. RePART was shown to perform better (higher accuracy and lower complexity) than some Artmap-based models. This work proposes an investigation of the performance of the RePART model in classifier ensembles. Ensembles of different sizes, learning strategies, and structures will be used in this investigation. The results are intended to identify the main advantages and drawbacks of this model when used as a component in classifier ensembles. This can provide a broader foundation for the use of RePART in other pattern recognition applications.
219

Efficient multi-class objet detection with a hierarchy of classes / Détection efficace des objets multi-classes avec une hiérarchie des classes

Odabai Fard, Seyed Hamidreza 20 November 2015 (has links)
In this article, we present a new multi-class detection approach based on a hierarchical traversal of jointly learned classifiers. For greater robustness and speed, we propose to use a tree of object classes. Our detection model is learned by combining ranking and classification constraints in a single optimization problem. Our convex formulation allows a search algorithm to be used to reduce execution time. We evaluated our algorithm on the PASCAL VOC benchmarks (2007 and 2010). Compared with the one-versus-all approach, our method improves performance for 20 classes and achieves a 10x speed-up. / Recent years have witnessed a competition in autonomous navigation for vehicles, boosted by advances in computer vision. On-board cameras are capable of understanding the semantic content of the environment. A core component of such a system is localizing and classifying objects in urban scenes, which calls for multi-class object detection systems. Designing an efficient system of this kind is a challenging and active research area. Such algorithms find applications in autonomous driving, object search in images, and video surveillance. The number of object classes varies depending on the task. Datasets for object detection started out containing a single class, e.g., the popular INRIA Person dataset. Nowadays, datasets are expanding, with more training data and more object classes. This thesis proposes a solution for efficiently learning a multi-class object detector. The task of such a system is to localize all instances of the target object classes in an input image. We distinguish three major efficiency criteria. First, the detection performance measures the accuracy of detection. Second, we strive for low execution times at run-time. Third, we address the scalability of our novel detection framework.
The first two criteria should scale suitably with the number of input classes, and the training algorithm has to take a reasonable amount of time when learning with these larger datasets. Although single-class object detection has improved considerably over the years, it remains a challenge to create algorithms that work well with any number of classes. Most works on this subject extend single-class detectors to work with multiple classes but remain hardly flexible with respect to new object descriptors. Moreover, they do not consider all three criteria at the same time. Others use a more traditional approach, iteratively executing a single-class detector for each target class, which scales linearly in training time and run-time. To tackle these challenges, we present a novel framework in which, for an input patch during detection, the closest class is ranked highest. Background labels are rejected as negative samples. The detection goal is to find the highest-scoring class. To this end, we derive a convex problem formulation that combines ranking and classification constraints. The accuracy of the system is improved by hierarchically arranging the classes into a tree of classifiers. The leaf nodes represent the individual classes, and the intermediate nodes, called super-classes, recursively group these classes together. The super-classes benefit from the shared knowledge of their descendant classes. All these classifiers are learned in a joint optimization problem along with the previously mentioned constraints. The increased number of classifiers is an obstacle to rapid execution. The formulation of the detection goal naturally allows the use of an adapted tree traversal algorithm that progressively searches for the best class while rejecting background samples early in the detection process, which reduces the system's run-time. Our system balances detection performance against speed-up.
We further experimented with feature reduction to decrease the overhead of applying the high-level classifiers in the tree. The framework is independent of the object descriptor used; we implemented the histogram of oriented gradients and the deformable part model, both introduced in [Felzenszwalb et al., 2010a]. The capabilities of our system are demonstrated on two challenging datasets containing different object categories that are not necessarily semantically related. We evaluate both the detection performance with different numbers of classes and the scalability with respect to run-time. Our experiments show that this framework fulfills the requirements of a multi-class object detector and highlight the advantages of structuring class-level knowledge.
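The tree traversal with early background rejection described in this abstract can be sketched as a greedy descent through a tree of classifiers. The sketch below is only an illustration under stated assumptions and is not the thesis's algorithm: the `Node` class, the `linear` and `detect` helpers, the hand-set linear scorers, the class names, and the rejection threshold of 0.0 are all hypothetical, and no joint learning is performed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Node:
    """A classifier in the tree: a super-class (has children) or a leaf class."""
    name: str
    score: Callable[[Sequence[float]], float]
    children: List["Node"] = field(default_factory=list)


def linear(weights, bias=0.0):
    """Build a linear scoring function w.x + b (weights are hand-set here)."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias


def detect(top_level: List[Node], x, reject_below=0.0) -> Optional[str]:
    """Descend the tree, keeping only the best-scoring node at each level.

    Returns the leaf class name, or None if the patch is rejected as
    background as soon as the best score falls below the threshold.
    """
    level = top_level
    best = None
    while level:
        scored = [(node.score(x), node) for node in level]
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score < reject_below:
            return None  # early rejection: deeper classifiers are never evaluated
        level = best.children
    return best.name if best else None


# Hypothetical two-level hierarchy over 2-D features.
tree = [
    Node("vehicle", linear([1.0, 0.0]), [
        Node("car", linear([1.0, 0.0], -1.0)),
        Node("bus", linear([-1.0, 0.0], 1.0)),
    ]),
    Node("person-like", linear([0.0, 1.0]), [
        Node("pedestrian", linear([0.0, 1.0], -2.0)),
        Node("cyclist", linear([1.0, 0.0])),
    ]),
]

print(detect(tree, [2.0, 0.5]))    # descends vehicle, then car
print(detect(tree, [-1.0, -1.0]))  # rejected at the first level
```

With this layout only one branch is evaluated per level, so the number of classifier evaluations grows with the depth and branching factor of the tree instead of with the total number of classes, and background patches typically stop at the first level.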
220

Optimisation de stratégies de fusion pour la reconnaissance de visages 3D.

Ben Soltana, Wael 11 December 2012 (has links)
Face recognition (FR) is a very active research field because of its many applications in computer vision in general and in biometrics in particular. This interest is motivated by several reasons. First, the face is universal. Second, it is the most natural way for human beings to identify one another. Finally, the face as a biometric modality is non-intrusive, which distinguishes it from other biometric modalities such as the iris or the fingerprint. FR also poses significant scientific challenges. First, all human faces have similar configurations. Second, with 2D facial images, which are easy to acquire, the intra-class variation caused by factors such as changes in pose and lighting conditions, variations in facial expression, and aging is much larger than the inter-class variation. With the arrival of 3D acquisition systems capable of capturing the depth of objects, 3D face recognition (3D FR) has emerged as a promising way to address the two problems left unsolved in 2D, namely pose and lighting variations. Indeed, 3D cameras generally deliver 3D face scans together with their aligned texture images. A 3D FR solution can therefore benefit from a judicious fusion of 3D shape information and 2D texture information. Because 3D face scans provide both the facial surfaces for the pure 3D modality and the aligned 2D texture images, the number of fusion possibilities for optimizing the recognition rate is considerable.
Optimizing fusion strategies for better 3D FR is the main objective of the research carried out in this thesis. In the state of the art, various fusion strategies have been proposed for 3D face recognition, ranging from early fusion, which operates at the feature level, to late fusion, which operates on classifier outputs, with many intermediate strategies in between. Among late fusion strategies, we further distinguish parallel, cascade, and multi-level combinations. Since an exhaustive exploration of such a space is impossible, we must resort to heuristic solutions, which form the basic approach of the work in this thesis. Moreover, within the framework of biometric systems, the optimality criteria of fusion strategies remain key questions. A fusion strategy is said to be optimized if it can integrate and take advantage of the different modalities and, more broadly, of the different pieces of information extracted during the recognition process, whatever their level of abstraction and, consequently, of difficulty. To overcome all these difficulties and propose an optimized solution, our approach relies, on the one hand, on learning, which makes it possible to qualify the 2D or 3D experts on training data according to performance criteria such as ERR, and, on the other hand, on heuristic optimization strategies such as simulated annealing, which makes it possible to optimize the mixtures of experts to be fused. [...] / Face recognition (FR) was long one of the motivations of computer vision, but only in recent years has reliable automatic face recognition become a realistic target of biometrics research. This interest is motivated by several reasons.
First, the face is one of the most preferable biometrics for person identification and verification applications, because it is natural, non-intrusive, and socially well accepted. The second reason relates to the challenges encountered in the FR domain: all human faces are similar to each other and hence offer low distinctiveness compared with other biometrics, e.g., fingerprints and irises. Furthermore, when employing facial texture images, intra-class variations due to factors such as illumination and pose changes are usually greater than inter-class ones, preventing 2D face recognition systems from being completely reliable in real conditions. Recent 3D acquisition systems are capable of capturing the shape information of objects. Thus, 3D face recognition (3D FR) has been extensively investigated by the research community to deal with the unsolved issues in 2D face recognition, i.e., illumination and pose changes. Indeed, 3D cameras generally deliver the 3D scans of faces with their aligned texture images. 3D FR can benefit from the fusion of 2D texture and 3D shape information. This Ph.D. thesis is dedicated to the optimization of fusion strategies based on three-dimensional data. However, there are some problems. Since 3D face scans provide both the facial surfaces for the 3D model and 2D texture images, the number of possible fusion methods is high. In the literature, many fusion strategies have been proposed for 3D face recognition. We can roughly classify the fusion strategies into two categories: early fusion and late fusion. Some intermediate strategies such as serial fusion and multi-level fusion have been proposed as well. Meanwhile, the search for an optimal fusion scheme remains extraordinarily complex because of the cardinality of the space of possible fusion strategies, which grows exponentially with the number of competing features and classifiers.
Thus, we require a fusion technique that efficiently manages all these features and classifiers, which constitutes our contribution in this work. In addition, the optimality criteria of fusion strategies remain critical issues. By definition, an optimal fusion strategy is able to integrate and take advantage of different data. To overcome all these difficulties and propose an optimized solution, we adopted the following approach. [...]
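
The abstract above mentions simulated annealing (recuit simulé) as a heuristic for optimizing the mix of experts to be fused, but it is truncated before giving any detail. The following is only a minimal sketch of that general idea and not the thesis's actual method. It assumes weighted-sum late fusion of per-expert similarity scores and assumes the performance criterion named in the abstract refers to the equal error rate (EER). All function names, parameters, and the synthetic scores are hypothetical and chosen for illustration.

```python
import math
import random


def equal_error_rate(genuine, impostor):
    """Approximate EER: scan thresholds and return the error rate at the
    threshold where false accept and false reject rates are closest."""
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def fused_eer(weights, genuine, impostor):
    """EER of the weighted-sum fusion of per-expert scores (an assumed fusion rule)."""
    def fuse(scores):
        return sum(w * s for w, s in zip(weights, scores))
    return equal_error_rate([fuse(g) for g in genuine], [fuse(i) for i in impostor])


def anneal_weights(genuine, impostor, n_experts, steps=300, t0=0.1, seed=0):
    """Simulated annealing over non-negative fusion weights that sum to 1."""
    rng = random.Random(seed)
    current = [1.0 / n_experts] * n_experts      # start from equal weights
    cur_cost = fused_eer(current, genuine, impostor)
    best, best_cost = current[:], cur_cost
    for k in range(steps):
        temp = t0 * (1 - k / steps) + 1e-9       # linear cooling schedule
        cand = [max(0.0, w + rng.gauss(0, 0.1)) for w in current]
        total = sum(cand)
        if total == 0:
            continue                             # degenerate proposal, skip it
        cand = [w / total for w in cand]
        cost = fused_eer(cand, genuine, impostor)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost <= cur_cost or rng.random() < math.exp((cur_cost - cost) / temp):
            current, cur_cost = cand, cost
            if cost < best_cost:
                best, best_cost = cand[:], cost
    return best, best_cost


def make_synthetic_scores(n=60, seed=1):
    """Hypothetical data: expert 0 is informative, expert 1 is pure noise."""
    rng = random.Random(seed)
    genuine = [(rng.gauss(0.7, 0.15), rng.gauss(0.5, 0.3)) for _ in range(n)]
    impostor = [(rng.gauss(0.4, 0.15), rng.gauss(0.5, 0.3)) for _ in range(n)]
    return genuine, impostor
```

Calling `anneal_weights(*make_synthetic_scores(), n_experts=2)` returns a weight vector and its EER on the synthetic scores. Because the search starts from equal weights and keeps the best solution seen, the returned EER is never worse than that of equal-weight fusion on the same data.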
