101

Detecção de faces humanas em imagens coloridas utilizando redes neurais artificiais / Detection of human faces in color images using artificial neural networks

Wellington da Rocha Gouveia 28 January 2010
Finding faces in images is extremely complex: lighting can vary, backgrounds can be highly complex, and objects may partially overlap the face to be located, among other problems. With advances in computer vision, recent image processing and artificial intelligence techniques have been combined to develop more efficient face detection algorithms. This work presents a computer vision methodology that uses MLP (Multilayer Perceptron) neural networks to segment skin color and face texture from the other objects present in an image with a complex background. The resulting image is divided into regions, and the features extracted from each region are fed to another MLP neural network that decides whether the region contains a face. The implemented software was evaluated on two image databases: one with standardized images (the AR database) and another with images acquired from the Internet, containing faces with different skin tones and complex backgrounds. The final results were 83% of faces detected for the Internet image database and 88% for the AR database. The better results for the AR database are attributed to its images being standardized, with no tilted faces and no complex backgrounds. The segmentation step, although it reduces the amount of information the remaining modules must process, was the one that contributed most to the number of false negatives.
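The region step described in the abstract above can be illustrated with a minimal sketch. It assumes the segmentation stage has already produced a binary skin mask, and it uses the fraction of skin pixels per region as a stand-in feature. The function name, the grid layout, and the feature itself are hypothetical; the abstract does not say which features the thesis extracts.

```python
def region_skin_fractions(mask, region_h, region_w):
    """Split a binary skin mask into non-overlapping regions and return
    the fraction of skin pixels in each region, keyed by (row, col).

    Hypothetical feature for illustration; not the thesis's feature set.
    """
    rows, cols = len(mask), len(mask[0])
    fractions = {}
    for top in range(0, rows, region_h):
        for left in range(0, cols, region_w):
            # Collect the pixels of this region, clipping at the borders.
            block = [
                mask[r][c]
                for r in range(top, min(top + region_h, rows))
                for c in range(left, min(left + region_w, cols))
            ]
            key = (top // region_h, left // region_w)
            fractions[key] = sum(block) / len(block)
    return fractions


# Toy 4x4 mask: 1 marks a pixel classified as skin, 0 anything else.
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
fractions = region_skin_fractions(mask, 2, 2)
```

In a pipeline like the one described, a per-region feature vector of this kind would be passed to the second classifier, which decides whether the region contains a face.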
102

Detecção de faces e rastreamento da pose da cabeça / Face detection and head pose tracking

Schramm, Rodrigo 20 March 2009
HP - Hewlett-Packard Brasil Ltda / Video cameras are already part of the new models of human-machine interaction. Through them, the face and the head pose can be detected, providing new features for the user. Applications that have benefited from this kind of feature include video conferencing, educational and entertainment games, driver attention monitoring, and the measurement of attention focus. In this context, this Master's thesis proposes a new model to detect and track head pose from a video sequence captured with a monocular camera. To achieve this goal, two main stages were developed: face detection and head pose tracking. The first stage is the starting point for tracking the pose: the face is detected in frontal pose using a detector based on Haar-like features. In the second stage, after the face has been detected in frontal pose, specific attributes of the face are tracked to estimate the change in head pose.
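The Haar-like features mentioned in the abstract above are differences of rectangle sums, usually computed in constant time from an integral image. The following is a minimal sketch of that idea in plain Python. It shows a single two-rectangle feature on a toy image and is not the detector used in the thesis; all function names are illustrative.

```python
def integral_image(img):
    """Return a (rows+1) x (cols+1) table where ii[r][c] is the sum of
    all pixels above and to the left of position (r, c)."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left-half sum minus right-half sum; width is assumed to be even."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy image with a bright left half and a dark right half.
img = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(img)
feature = two_rect_feature(ii, 0, 0, 3, 4)
```

A large absolute feature value indicates a strong vertical edge inside the window. Detectors of this family evaluate many such features at many positions and scales, which is why the constant-time rectangle sum matters.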
103

Image registration and super-resolution mosaicing

Ye, Getian, Information Technology & Electrical Engineering, Australian Defence Force Academy, UNSW January 2005
This thesis presents new approaches to image registration and super-resolution mosaicing as well as their applications. Firstly, a feature-based image registration method is proposed for a multisensor surveillance system that consists of an optical camera and an infrared camera. By integrating a non-rigid object tracking technique into this method, a novel approach to simultaneous object tracking and multisensor image registration is proposed. Based on the registration and fusion of multisensor information, automatic face detection is greatly improved. Secondly, some extensions of a gradient-based image registration method, called the inverse compositional algorithm, are proposed. These extensions include cumulative multi-image registration and the incorporation of illumination change and lens distortion correction. They are incorporated into the framework of the original algorithm in a consistent manner and efficiency can still be achieved for multi-image registration with illumination and lens distortion correction. Thirdly, new super-resolution mosaicing algorithms are proposed for multiple uncompressed and compressed images. Considering the process of image formation, observation models are introduced to describe the relationship between the super-resolution mosaic image and the uncompressed and compressed low-resolution images. To improve the performance of super-resolution mosaicing, a wavelet-based image interpolation technique and an approach to adaptive determination of the regularization parameter are presented. For compressed images, a spatial-domain algorithm and a transform-domain algorithm are proposed. All the proposed super-resolution mosaicing algorithms are robust against outliers. They can produce super-resolution mosaics and reconstructed super-resolution images with improved subjective quality. Finally, new techniques for super-resolution sprite generation and super-resolution sprite coding are proposed. 
Considering both short-term and long-term motion influences, an object-based image registration method is proposed for handling long image sequences. In order to remove the influence of outliers, a robust technique for super-resolution sprite generation is presented. This technique produces sprite images and reconstructed super-resolution images with high visual quality. Moreover, it provides better reconstructed low-resolution images compared with low-resolution sprite generation techniques. Due to the advantages of the super-resolution sprite, a super-resolution sprite coding technique is also proposed. It achieves high coding efficiency especially at a low bit-rate and produces both decoded low-resolution and super-resolution images with improved subjective quality. Throughout this work, the performance of all the proposed algorithms is evaluated using both synthetic and real image sequences.
104

HUMAN FACE RECOGNITION BASED ON FRACTAL IMAGE CODING

Tan, Teewoon January 2004
Human face recognition is an important area in the field of biometrics. It has been an active area of research for several decades, but still remains a challenging problem because of the complexity of the human face. In this thesis we describe fully automatic solutions that can locate faces and then perform identification and verification. We present a solution for face localisation using eye locations. We derive an efficient representation for the decision hyperplane of linear and nonlinear Support Vector Machines (SVMs). For this we introduce the novel concept of $\rho$ and $\eta$ prototypes. The standard formulation for the decision hyperplane is reformulated and expressed in terms of the two prototypes. Different kernels are treated separately to achieve further classification efficiency and to facilitate its adaptation to operate with the fast Fourier transform to achieve fast eye detection. Using the eye locations, we extract and normalise the face for size and in-plane rotations. Our method produces a more efficient representation of the SVM decision hyperplane than the well-known reduced set methods. As a result, our eye detection subsystem is faster and more accurate. The use of fractals and fractal image coding for object recognition has been proposed and used by others. Fractal codes have been used as features for recognition, but we need to take into account the distance between codes, and to ensure the continuity of the parameters of the code. We use a method based on fractal image coding for recognition, which we call the Fractal Neighbour Distance (FND). The FND relies on the Euclidean metric and the uniqueness of the attractor of a fractal code. An advantage of using the FND over fractal codes as features is that we do not have to worry about the uniqueness of, and distance between, codes. We only require the uniqueness of the attractor, which is already an implied property of a properly generated fractal code. 
Similar methods to the FND have been proposed by others, but what distinguishes our work from the rest is that we investigate the FND in greater detail and use our findings to improve the recognition rate. Our investigations reveal that the FND has some inherent invariance to translation, scale, rotation and changes to illumination. These invariances are image dependent and are affected by fractal encoding parameters. The parameters that have the greatest effect on recognition accuracy are the contrast scaling factor, luminance shift factor and the type of range block partitioning. The contrast scaling factor affects the convergence and eventual convergence rate of a fractal decoding process. We propose a novel method of controlling the convergence rate by altering the contrast scaling factor in a controlled manner, which has not been possible before. This helped us improve the recognition rate because under certain conditions better results are achievable from using a slower rate of convergence. We also investigate the effects of varying the luminance shift factor, and examine three different types of range block partitioning schemes. They are Quad-tree, HV and uniform partitioning. We performed experiments using various face datasets, and the results show that our method indeed performs better than many accepted methods such as eigenfaces. The experiments also show that the FND-based classifier increases the separation between classes. The standard FND is further improved by incorporating the use of localised weights. A local search algorithm is introduced to find a best matching local feature using this locally weighted FND. The scores from a set of these locally weighted FND operations are then combined to obtain a global score, which is used as a measure of the similarity between two face images. Each local FND operation possesses the distortion-invariant properties described above. 
Combined with the search procedure, the method has the potential to be invariant to a larger class of non-linear distortions. We also present a set of locally weighted FNDs that concentrate around the upper part of the face encompassing the eyes and nose. This design was motivated by the fact that the region around the eyes has more information for discrimination. Better performance is achieved by using different sets of weights for identification and verification. For facial verification, performance is further improved by using normalised scores and client-specific thresholding. In this case, our results are competitive with current state-of-the-art methods, and in some cases outperform all those to which they were compared. For facial identification, under some conditions the weighted FND performs better than the standard FND. However, the weighted FND still has its shortcomings when some datasets are used, where its performance is not much better than the standard FND. To alleviate this problem we introduce a voting scheme that operates with normalised versions of the weighted FND. Although there are no improvements at lower matching ranks using this method, there are significant improvements for larger matching ranks. Our methods offer advantages over some well-accepted approaches such as eigenfaces, neural networks and those that use statistical learning theory. Some of the advantages are: new faces can be enrolled without re-training involving the whole database; faces can be removed from the database without the need for re-training; there are inherent invariances to face distortions; it is relatively simple to implement; and it is not model-based so there are no model parameters that need to be tweaked.
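The link between the contrast scaling factor and the convergence rate of fractal decoding, mentioned in the abstract above, can be illustrated with a toy scalar model. The sketch assumes a single affine map x ← s·x + o, where s stands in for the contrast scaling factor and o for the luminance shift. Real fractal decoding applies such maps block by block to whole images, and this is not the FND or the thesis's control method; the function name and parameters are illustrative.

```python
def iterations_to_converge(s, o, x0=0.0, tol=1e-3, max_iter=10_000):
    """Iterate x <- s * x + o until x is within tol of the fixed point.

    For |s| < 1 the map is a contraction with the unique fixed point
    o / (1 - s), the scalar analogue of the attractor of a fractal code.
    Returns the number of iterations used and the final value.
    """
    if abs(s) >= 1:
        raise ValueError("the map is only guaranteed to converge for |s| < 1")
    fixed_point = o / (1 - s)
    x, n = x0, 0
    while abs(x - fixed_point) >= tol and n < max_iter:
        x = s * x + o
        n += 1
    return n, x


# Both maps share the fixed point 20, but contract at different rates.
fast_n, fast_x = iterations_to_converge(s=0.5, o=10.0)
slow_n, slow_x = iterations_to_converge(s=0.9, o=2.0)
```

The error shrinks by a factor of |s| per iteration, so a scaling factor closer to 1 gives a slower approach to the same attractor. This is the sense in which the scaling factor governs the convergence rate in this simplified model.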
105

Entwicklung einer offenen Softwareplattform für Visual Servoing / Development of an open software platform for visual servoing

Sprößig, Sören 29 June 2010
The aim of this Diplom thesis is to create a flexible platform for visual servoing tasks that can cover a wide range of use cases. The core task of the work is to examine different methods of face detection, using the Haar cascade as an example, and of face recognition, using Eigenfaces and Fisherfaces as examples, and to present them with detailed examples. General basic concepts of image processing and already known methods are introduced, and their implementation is described in detail. The insights gained in this way, together with the resulting requirements profile for the platform to be developed, then lead to its realization as a standalone application. A further question examined is how the newly developed software can be made future-proof and, with a view to possible use in practical courses, easy to use. All programs and source code produced during the work are provided on a separate data carrier. A fully functional development environment is included as a virtual machine.
106

Veido atpažinimo algoritmų tyrimas ir įgyvendinimas operacinėje Android sistemoje / Analysis of face recognition algorithms and implementation in Android operating system

Balinskas, Justinas 26 July 2012
The main goal of this Master's thesis is to review face recognition algorithms and analyze their performance. After the review, three face recognition algorithms (eigenfaces, fisherfaces and 2D–DCT+SOM) were chosen, analyzed in detail and implemented in the MATLAB environment, and their various parameters were investigated. Based on the results obtained, the optimal algorithm for implementation in the Android operating system was selected and implemented on the mobile platform. The thesis also reviews and analyzes the problems encountered when porting the eigenfaces algorithm to the Android operating system, and presents proposals for improving the algorithm as well as detailed conclusions. All the objectives were achieved and all the tasks completed. Analysis of face recognition algorithms and implementation in Android operating system. Master's thesis for the Informatics Engineering degree. Vilnius Gediminas Technical University. Vilnius, 2012, 187 p., 49 figures, 6 tables, 74 references, 6 appendices.
107

Automated video-based measurement of eye closure using a remote camera for detecting drowsiness and behavioural microsleeps

Malla, Amol Man January 2008
A device capable of continuously monitoring an individual’s level of alertness in real time is highly desirable for preventing drowsiness- and lapse-related accidents. This thesis presents the development of a non-intrusive and light-insensitive video-based system that uses computer-vision methods to localize the positions of the face, eyes, and eyelids in order to measure the level of eye closure within an image, which, in turn, can be used to identify visible facial signs associated with drowsiness and behavioural microsleeps. The system was developed to be non-intrusive and light-insensitive to make it practical and end-user compliant. To non-intrusively monitor the subject without constraining their movement, the video was collected by placing a camera, a near-infrared (NIR) illumination source, and an NIR-pass optical filter at an eye-to-camera distance of 60 cm from the subject. The NIR-illumination source and filter make the system insensitive to lighting conditions, allowing it to operate in both ambient light and complete darkness without visually distracting the subject. To determine the image characteristics and to quantitatively evaluate the developed methods, reference videos of nine subjects were recorded under four different lighting conditions with the subjects exhibiting several levels of eye closure, head orientations, and eye gaze. For each subject, a set of 66 frontal face reference images was selected and manually annotated with multiple face and eye features. The eye-closure measurement system was developed using a top-down passive feature-detection approach, in which the face region of interest (fROI), eye regions of interest (eROIs), eyes, and eyelid positions were sequentially localized. The fROI was localized using an existing Haar-object detection algorithm. In addition, a Kalman filter was used to stabilize and track the fROI in the video. The left and the right eROIs were localized by scaling the fROI with corresponding proportional anthropometric constants. 
The position of an eye within each eROI was detected by applying a template-matching method in which a pre-formed eye-template image was cross-correlated with the sub-images derived from the eROI. Once the eye position was determined, the positions of the upper and lower eyelids were detected using a vertical integral-projection of the eROI. The detected positions of the eyelids were then used to measure eye closure. The detection of fROI and eROI was very reliable for frontal-face images, which was considered sufficient for an alertness monitoring system as subjects are most likely facing straight ahead when they are drowsy or about to have a microsleep. Estimation of the y-coordinates of the eye, upper eyelid, and lower eyelid positions showed average median errors of 1.7, 1.4, and 2.1 pixels and average 90th percentile (worst-case) errors of 3.2, 2.7, and 6.9 pixels, respectively (1 pixel corresponds to 1.3 mm in the reference images). The average height of a fully open eye in the reference database was 14.2 pixels. The average median and 90th percentile errors of the eye and eyelid detection methods were reasonably low except for the 90th percentile error of the lower eyelid detection method. Poor estimation of the lower eyelid was the primary limitation for accurate eye-closure measurement. The median error of fractional eye-closure (EC) estimation (i.e., the ratio of closed portions of an eye to average height when the eye is fully open) was 0.15, which was sufficient to distinguish between the eyes being fully open, half closed, or fully closed. However, compounding errors in the facial-feature detection methods resulted in a 90th percentile EC estimation error of 0.42, which was too high to reliably determine the extent of eye closure. The eye-closure measurement system was relatively robust to variation in facial features except for spectacles, for which reflections can saturate much of the eye image. 
Therefore, in its current state, the eye-closure measurement system requires further development before it could be used with confidence for monitoring drowsiness and detecting microsleeps.
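The fractional eye-closure (EC) measure defined in the abstract above, the closed portion of the eye relative to the average fully open height, can be written out as a short sketch. It assumes image coordinates in which y grows downward, so the lower eyelid has the larger y value, and it clamps the result to the range 0 to 1. The function name and the clamping are illustrative choices; only the 14.2-pixel average open height comes from the abstract.

```python
def fractional_eye_closure(upper_y, lower_y, open_height):
    """Return the closed fraction of the eye: 0 is fully open, 1 is closed.

    upper_y and lower_y are eyelid y-coordinates in pixels (y grows
    downward); open_height is the average height of a fully open eye.
    """
    if open_height <= 0:
        raise ValueError("open_height must be positive")
    opening = max(0.0, lower_y - upper_y)      # visible eye height
    closure = 1.0 - opening / open_height      # closed portion as a ratio
    return min(1.0, max(0.0, closure))         # clamp against noisy input


AVERAGE_OPEN_HEIGHT_PX = 14.2  # average fully open eye height from the abstract

half_closed = fractional_eye_closure(10.0, 17.1, AVERAGE_OPEN_HEIGHT_PX)
fully_closed = fractional_eye_closure(12.0, 12.0, AVERAGE_OPEN_HEIGHT_PX)
fully_open = fractional_eye_closure(10.0, 24.2, AVERAGE_OPEN_HEIGHT_PX)
```

Because EC divides by a height of about 14 pixels, an eyelid error of a few pixels already shifts EC noticeably, which is consistent with the abstract naming the lower eyelid estimate as the main limitation.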
109

Neural Network Gaze Tracking using Web Camera

Bäck, David January 2006
Gaze tracking means detecting and following the direction in which a person looks. It can be used, for instance, in human-computer interaction. Most existing systems illuminate the eye with IR light, possibly damaging the eye. The motivation of this thesis is to develop a truly non-intrusive gaze tracking system using only a digital camera, e.g. a web camera. The approach is to detect and track different facial features using various image analysis techniques. These features serve as inputs to a neural network, which is trained on a set of predetermined gaze tracking series. The output is the gaze coordinates on the screen. The evaluation uses a measure of accuracy, and the result is an average angular deviation of two to four degrees, depending on the quality of the image sequence. Better and more robust results require higher image quality from the digital camera.
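An angular deviation such as the one reported above can be related to an on-screen error with basic trigonometry. The sketch below assumes the predicted and true gaze points are given in the same length unit as the viewing distance and that the viewer sits roughly perpendicular to the screen. The function name, the units, and the example values are hypothetical; the abstract does not describe how the thesis computed its figure.

```python
import math


def angular_error_deg(predicted, target, viewing_distance):
    """Angle in degrees between the predicted and true gaze points,
    as seen from an eye at viewing_distance in front of the screen.

    predicted and target are (x, y) screen positions in the same unit
    as viewing_distance (for example centimetres).
    """
    if viewing_distance <= 0:
        raise ValueError("viewing_distance must be positive")
    # Euclidean distance between the two points on the screen.
    offset = math.hypot(predicted[0] - target[0], predicted[1] - target[1])
    return math.degrees(math.atan(offset / viewing_distance))


# Hypothetical example: a 5 cm on-screen error viewed from 50 cm.
error_deg = angular_error_deg((3.0, 4.0), (0.0, 0.0), 50.0)
```

Under these assumptions, the same on-screen error corresponds to a smaller angle as the viewer moves further from the screen, so angular deviation is a distance-independent way to compare gaze trackers.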
110

Evidential calibration and fusion of multiple classifiers : application to face blurring / Calibration et fusion évidentielles de classifieurs : application à l'anonymisation de visages

Minary, Pauline 08 December 2017
To improve the overall performance on a classification problem, one line of research is to use several classifiers and fuse their outputs. Some approaches perform this fusion by merging the outputs with a fusion rule. This requires that the outputs first be made comparable, which is usually done with a probabilistic calibration of each classifier. The fusion can also be performed by concatenating the classifier outputs into a vector and applying a joint probabilistic calibration to it. Recently, extensions of the probabilistic calibration of an individual classifier have been proposed using evidence theory, in order to better represent the uncertainties inherent in the calibration process. In the first part of this thesis, this idea is adapted to joint probabilistic calibration techniques, leading to evidential versions. This approach is then compared with the aforementioned ones on classical classification datasets. In the second part, the challenging problem of blurring faces in images, which SNCF needs to address, is tackled. A state-of-the-art method for this problem is to use several face detectors, which return boxes with associated confidence scores, and to combine their outputs using an association step and an evidential calibration. It is shown that reasoning at the pixel level is more interesting than reasoning at the box level and that, among the fusion approaches discussed in the first part, the evidential joint calibration yields the best results. Finally, the case of images coming from videos is considered. To leverage the information contained in videos, a classical tracking algorithm is added to the blurring system.
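Fusion in evidence theory is commonly illustrated with Dempster's rule of combination, which merges two mass functions defined over the same frame. The sketch below applies it to a single pixel with the frame {face, background}, where mass on the whole frame expresses ignorance. The abstract does not state which combination rule the thesis uses, and the mass values here are made up for illustration; producing them by evidential calibration is outside the scope of this sketch.

```python
from itertools import product

FACE = frozenset({"face"})
BACKGROUND = frozenset({"background"})
UNKNOWN = FACE | BACKGROUND  # the whole frame: mass here means ignorance


def dempster_combine(m1, m2):
    """Combine two mass functions (dicts from frozenset to mass) with
    Dempster's rule: multiply masses, keep non-empty intersections, and
    renormalize by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # the two sources contradict each other
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical per-pixel masses from two face detectors.
detector_a = {FACE: 0.6, BACKGROUND: 0.1, UNKNOWN: 0.3}
detector_b = {FACE: 0.5, BACKGROUND: 0.2, UNKNOWN: 0.3}
fused = dempster_combine(detector_a, detector_b)
```

In this example both detectors lean toward "face", so the fused mass on "face" is higher than either input and the mass left on ignorance shrinks. A blurring system could then threshold the fused belief per pixel, though that decision rule is an assumption here rather than something the abstract specifies.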
