Global ETD Search

381	Restauration des images par l'elimination du flou et des occlusions Whyte, Oliver 15 March 2012 (has links) (PDF) This thesis investigates the removal of spatially-variant blur from photographs degraded by camera shake, and the removal of large occluding objects from photographs of popular places. We examine these problems in the case where the photographs are taken with standard consumer cameras, and we have no particular information about the scene being photographed. Most existing deblurring methods model the observed blurry image as the convolution of a sharp image with a uniform blur kernel. However, we show that blur from camera shake is in general mostly due to the 3D rotation of the camera, resulting in a blur that can be significantly non-uniform across the image. We model this blur using a weighted set of camera poses, which induce homographies on the image being captured. The blur in a particular image is parameterised by the set of weights, which provides a compact global descriptor for the blur, analogous to a convolution kernel. This descriptor fully captures the spatially-variant blur at all pixels, and is able to model camera shake more accurately than previous methods. We demonstrate direct estimation of the blur weights from single and multiple blurry images captured by conventional cameras. This permits a sharp image to be recovered from a blurry "shaken" image without any user interaction or additional information about the camera motion. For single image deblurring, we adapt an existing marginalisation-based algorithm and a maximum a posteriori-based algorithm, which are both compatible with our model of spatially-variant blur. In order to reduce the computational cost of our homography-based model, we introduce an efficient approximation based on local uniformity of the blur. 
By grouping pixels into local regions which share a single PSF, we are able to take advantage of fast, frequency domain convolutions to perform the blur computation. We apply this approximation to single image deblurring, obtaining an order of magnitude reduction in computation time with no visible reduction in quality. For deblurring images with saturated pixels, we propose a modification of the forward model to include this non-linearity, and re-derive the Richardson-Lucy algorithm with this new model. To prevent ringing artefacts from propagating in the deblurred image, we propose separate updates for those pixels affected by saturation, and those not affected. This prevents the loss of information caused by clipping from propagating to the rest of the image. In order to remove large occluders from photos, we automatically retrieve a set of exemplar images of the same scene from the Internet, using a visual search engine. We extract multiple homographies between each of these images and the target image to provide pixel correspondences. Finally we combine pixels from several exemplars in a seamless manner to replace the occluded pixels, by solving an energy minimisation problem on a conditional random field. Experimental results are shown on both synthetic images and real photographs captured by consumer cameras or downloaded from the Internet. computer vision deblurring
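The blur model described in entry 381, in which a blurry image is a weighted sum of copies of the sharp image warped by the homographies that the camera poses induce, can be sketched as follows. This is a minimal illustration rather than the thesis's implementation: the function names, the nearest-neighbour sampling, the zero contribution of out-of-image samples, and the toy 3x3 image are all assumptions made for the example.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H (nested lists, row-major)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w


def blur_image(sharp, weights, homographies):
    """Blurry image = sum over k of weights[k] * (sharp warped by homographies[k]).

    Assumptions of this sketch: nearest-neighbour sampling, and samples that
    fall outside the image contribute zero.
    """
    height, width = len(sharp), len(sharp[0])
    blurry = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for w_k, H in zip(weights, homographies):
                sx, sy = apply_homography(H, x, y)
                ix, iy = round(sx), round(sy)
                if 0 <= ix < width and 0 <= iy < height:
                    blurry[y][x] += w_k * sharp[iy][ix]
    return blurry


# Hypothetical toy data: two "camera poses", one of them a one-pixel shift.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
SHIFT_X = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]  # samples one pixel to the right

sharp = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
blurry = blur_image(sharp, [0.5, 0.5], [IDENTITY, SHIFT_X])
```

When every homography is a pure translation, as in the toy data, the weighted sum reduces to an ordinary convolution, which is why the abstract calls the weight vector analogous to a convolution kernel.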
382	Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos Muneeb Ullah, Muhammad 23 October 2012 (has links) (PDF) This thesis addresses the problem of human action recognition in realistic video data, such as movies and online videos. Automatic and accurate recognition of human actions in video is a fascinating capability. The potential applications range from surveillance and robotics to medical diagnosis, content-based video retrieval, and intelligent human-computer interfaces. The task is highly challenging due to the large variations in person appearances, dynamic backgrounds, view-point changes, lighting conditions, action styles and other factors. Statistical video representations based on local space-time features have recently been shown to be successful for action recognition in realistic scenarios. Their success can be attributed to their mild assumptions about the data and their robustness to several variations in the video. Such representations, however, often encode videos as unordered collections of low-level primitives. This thesis extends current methods by developing more discriminative features and integrating additional supervision into Bag-of-Features based video representations, aiming to improve action recognition in unconstrained and challenging video data. We start by evaluating a range of available local space-time feature detectors and descriptors under the standard Bag-of-Features framework. We then propose to improve the basic Bag-of-Features model by integrating additional supervision in the form of non-local region-level information. We further investigate an attribute-based representation, wherein the attributes range from objects (e.g., car, chair, table, etc.) to human poses and actions. We demonstrate that such a representation captures high-level information in video, and provides complementary information to the low-level features. 
We finally propose a novel local representation for human action recognition in video, denoted as Actlets. Actlets are body part detectors undergoing characteristic motion patterns. We train Actlets using a large synthetic video dataset of rendered avatars and demonstrate the advantages of Actlets for action recognition in realistic data. All methods proposed and developed in this thesis represent alternative ways of constructing supervised video representations and demonstrate improvements of human action recognition in realistic settings. computer vision action recognition
383	Optimization convexe pour cosegmentation Joulin, Armand 17 December 2012 (has links) (PDF) Humans and most animals have a natural ability to see the world and understand it effortlessly. The apparent ease with which a human perceives their surroundings suggests that the process involved does not, to some extent, require a high degree of reasoning. This observation suggests that our visual perception of the world can be simulated on a computer. Computer vision is the field of research devoted to the problem of creating a form of visual perception for computers. The first work in this field dates back to the 1950s, but the computing power of the machines of that era did not allow the processing and analysis of the visual data needed to build a virtual visual perception. Only recently have computing power and storage capacity allowed the field to truly emerge. For two decades now, computer vision has made it possible to address practical or industrial problems such as detecting faces, people behaving suspiciously in a crowd, or manufacturing defects on production lines. By contrast, little progress has been made toward a virtual visual perception that is not specific to a given task, and the community still faces fundamental problems. One of these problems is to segment an image or a video into meaningful regions or, in other words, into objects or actions. Scene segmentation is not only natural for humans but also essential for fully understanding one's environment. Unfortunately, it is also extremely difficult to reproduce on a computer. 
One reason is that there is no clear definition of what a "meaningful" region is. Depending on the scene or the situation, a region can have different interpretations. For example, given a scene taking place in the street, distinguishing a pedestrian may be considered important, whereas the pedestrian's clothes do not necessarily seem so. If we now consider a scene taking place during a fashion show, a garment becomes an important element, and hence a meaningful region. In this thesis, we focus on this segmentation problem and approach it from a particular angle in order to avoid this fundamental difficulty. We treat segmentation as a weakly supervised learning problem: instead of segmenting images according to some predefined definition of "meaningful" regions, we develop methods that simultaneously segment a set of images into regions that appear regularly. In other words, we define a "meaningful" region from a statistical point of view: it is a region that appears regularly across the given set of images. To this end, we design models whose scope goes beyond applications to vision. Our approach is rooted in statistical machine learning, whose goal is to design efficient methods for extracting and/or learning recurring patterns in data sets. This field has recently become very popular owing to the growth in the number and size of available databases and the need to process data automatically. In this thesis, we focus on methods designed to discover the "hidden" information in a database from incomplete or nonexistent annotations. 
Finally, our work is also rooted in numerical optimisation, which we use to develop efficient algorithms specially adapted to our problems. In particular, we use and adapt recently developed tools to relax complex combinatorial problems into convex problems, for which the optimal solution is guaranteed to be found using procedures developed in convex optimisation. We also illustrate the quality of our formulations and algorithms on problems drawn from fields other than computer vision. In particular, we show that our work can be used in text classification and in cell biology. computer vision object recognition cosegmentation
384	Alignement élastique d'images pour la reconnaissance d'objet Duchenne, Olivier 29 November 2012 (has links) (PDF) The objective of this thesis is to explore the use of graph matching in object recognition systems. In the continuity of the previously described articles, rather than using descriptors invariant to misalignment, this work directly tries to find explicit correspondences between prototypes and test images, in order to build a robust similarity measure and infer the class of the test images. In chapter 2, we present a method that, given interest points in two images, tries to find correspondences between them. It extends previous graph matching approaches [Leordeanu and Hebert, 2005a] to handle interactions between more than two feature correspondences. This allows us to build a more discriminative and/or more invariant matching method. The main contributions of this chapter are: The introduction of a high-order objective function for hyper-graph matching (Section 2.3.1). The application of the tensor power iteration method to the high-order matching task, combined with a relaxation based on constraints on the row norms of assignment matrices, which is tighter than previous methods (Section 2.3.1). An l1-norm instead of the classical l2-norm relaxation, which provides solutions that are more interpretable but still allows an efficient power iteration algorithm (Section 2.3.5). The design of appropriate similarity measures that can be chosen either to improve the invariance of matching, or to improve the expressivity of the model (Section 2.3.6). The proposed approach has been implemented, and it is compared to state-of-the-art algorithms on both synthetic and real data. As shown by our experiments (Section 2.5), our implementation is, overall, as fast as these methods in spite of the higher complexity of the model, with better accuracy on standard databases. In chapter 3, we build a graph-matching method for object categorization. 
The main contributions of this chapter are: Generalizing [Caputo and Jie, 2009; Wallraven et al., 2003], we propose in Section 3.3 to use the optimum value of the graph-matching problem associated with two images as a (non positive definite) kernel, suitable for SVM classification. We propose in Section 3.4 a novel extension of Ishikawa's method [Ishikawa, 2003] for optimizing MRFs which is orders of magnitude faster than competing algorithms (e.g., [Kim and Grauman, 2010; Kolmogorov and Zabih, 2004; Leordeanu and Hebert, 2005a]) for the grids with a few hundred nodes considered in this article. In turn, this allows us to combine our kernel with SVMs in image classification tasks. We demonstrate in Section 3.5 through experiments with standard benchmarks (Caltech 101, Caltech 256, and Scenes datasets) that our method matches and in some cases exceeds the state of the art for methods using a single type of features. In chapter 4, we introduce our work on object detection that performs fast image alignment. The main contributions of this chapter are: We propose a novel image similarity measure that allows for arbitrary deformations of the image pattern within some given disparity range and can be evaluated very efficiently [Lemire, 2006], with a cost equal to a small constant times that of correlation in a sliding-window mode. Our similarity measure relies on a hierarchical notion of parts based on simple rectangular image primitives and HOG cells [Dalal and Triggs, 2005a], and does not require manual part specification [Felzenszwalb and Huttenlocher, 2005b; Bourdev and Malik, 2009; Felzenszwalb et al., 2010] or automated discovery [Lazebnik et al., 2005; Kushal et al., 2007]. computer vision object recognition image matching
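Entry 384 applies a tensor power iteration to high-order matching. As a rough illustration of the underlying idea, the sketch below runs the ordinary matrix power iteration on a pairwise affinity matrix between candidate correspondences, which is the simpler second-order setting that the abstract says it extends. It is not the thesis's algorithm: the function name, the toy affinity values, and the fixed iteration count are assumptions made for the example.

```python
import math


def power_iteration(M, iterations=100):
    """Approximate the unit-norm principal eigenvector of the square matrix M."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # Multiply by M, then renormalise to unit l2 norm.
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical affinities between three candidate correspondences:
# candidates 0 and 1 agree strongly with each other, candidate 2 with neither.
affinity = [[1.0, 0.9, 0.1],
            [0.9, 1.0, 0.1],
            [0.1, 0.1, 1.0]]
scores = power_iteration(affinity)
```

In this toy case the two mutually consistent candidates receive equal scores that are higher than the score of the inconsistent one; a real matcher would then discretise the scores into an assignment.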
385	Learning Hierarchical Feature Extractors For Image Recognition Boureau, Y-Lan 01 September 2012 (has links) (PDF) Telling cow from sheep is effortless for most animals, but requires much engineering for computers. In this thesis, we seek to tease out basic principles that underlie many recent advances in image recognition. First, we recast many methods into a common unsupervised feature extraction framework based on an alternation of coding steps, which encode the input by comparing it with a collection of reference patterns, and pooling steps, which compute an aggregation statistic summarizing the codes within some region of interest of the image. Within that framework, we conduct extensive comparative evaluations of many coding or pooling operators proposed in the literature. Our results demonstrate a robust superiority of sparse coding (which decomposes an input as a linear combination of a few visual words) and max pooling (which summarizes a set of inputs by their maximum value). We also propose macrofeatures, which import into the popular spatial pyramid framework the joint encoding of nearby features commonly practiced in neural networks, and obtain significantly improved image recognition performance. Next, we analyze the statistical properties of max pooling that underlie its better performance, through a simple theoretical model of feature activation. We then present results of experiments that confirm many predictions of the model. Beyond the pooling operator itself, an important parameter is the set of pools over which the summary statistic is computed. We propose locality in feature configuration space as a natural criterion for devising better pools. Finally, we propose ways to make coding faster and more powerful through fast convolutional feedforward architectures, and examine how to incorporate supervision into feature extraction schemes. 
Overall, our experiments offer insights into what makes current systems work so well, and state-of-the-art results on several image recognition benchmarks. computer vision object recognition feature extraction
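The alternation of coding and pooling steps described in entry 385 can be sketched in a few lines. The example below is only a toy: it uses hard assignment to the nearest codeword as a simple stand-in for the sparse coding that the abstract favours, and the function names, the two-word codebook, and the descriptors are assumptions made for illustration.

```python
def hard_code(descriptor, codebook):
    """One-hot code: 1.0 at the index of the nearest codeword, 0.0 elsewhere."""
    distances = [sum((d - c) ** 2 for d, c in zip(descriptor, word))
                 for word in codebook]
    nearest = distances.index(min(distances))
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]


def average_pool(codes):
    """Summarise a set of codes by their per-dimension mean."""
    return [sum(column) / len(codes) for column in zip(*codes)]


def max_pool(codes):
    """Summarise a set of codes by their per-dimension maximum."""
    return [max(column) for column in zip(*codes)]


# Hypothetical codebook of two reference patterns and four local descriptors
# taken from one region of interest.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1], [0.0, 0.1]]

codes = [hard_code(d, codebook) for d in descriptors]
pooled_average = average_pool(codes)
pooled_max = max_pool(codes)
```

Average pooling records how often each codeword fires in the region, whereas max pooling records only whether it fires at all, so the two summaries differ as soon as codeword frequencies are unequal.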
386	Modeling and visual recognition of human actions and interactions Laptev, Ivan 03 July 2013 (has links) (PDF) This work addresses the problem of recognizing actions and interactions in realistic video settings such as movies and consumer videos. The first contribution of this thesis (Chapters 2 and 4) is concerned with new video representations for action recognition. We introduce local space-time descriptors and demonstrate their potential to classify and localize actions in complex settings while circumventing the difficult intermediate steps of person detection, tracking and human pose estimation. The material on bag-of-features action recognition in Chapter 2 is based on publications [L14, L22, L23] and is related to other work by the author [L6, L7, L8, L11, L12, L13, L16, L21]. The work on object and action localization in Chapter 4 is based on [L9, L10, L13, L15] and relates to [L1, L17, L19, L20]. The second contribution of this thesis is concerned with weakly-supervised action learning. Chapter 3 introduces methods for automatic annotation of action samples in video using readily-available video scripts. It addresses the ambiguity of action expressions in text and the uncertainty of temporal action localization provided by scripts. The material presented in Chapter 3 is based on publications [L4, L14, L18]. Finally, Chapter 5 addresses interactions of people with objects and concerns modeling and recognition of object function. We exploit relations between objects and co-occurring human poses and demonstrate object recognition improvements using automatic pose estimation in challenging videos from YouTube. This part of the thesis is based on the publication [L2] and relates to other work by the author [L3, L5]. computer vision action recognition video analysis
387	Des paysages impossibles : nature, forme et historicité chez W. Wordsworth et S.T. Coleridge Folliot, Laurent 11 December 2010 (has links) (PDF) Often seen as the poet of "nature" par excellence, William Wordsworth is rather the one who took definitive leave of a rich descriptive tradition, since evocations of landscape are far rarer in his work than in that of any of his eighteenth-century predecessors. The present study sets out to attend to this rarefaction, which can also be seen, in terms of aesthetic history, as the moment at which an abstract modernity emerges. Wordsworthian poetry, whose ambition is to refound poetic language and forms through a return to the authenticity of nature, appears inseparably as a break with an essential mode of early English modernity, that of the Georgics. It thereby registers the crisis of representation that affects eighteenth-century optimism and that henceforth makes it impossible to see in the landscape the manifestation of a providential order. English "Romanticism" is what arises where cosmology fails, bearing witness to a fundamental absence from the world. This development is studied here in two stages. We first retrace in detail the trajectory of the early poetry of Wordsworth and Coleridge, to show that the refounding moment of Lyrical Ballads comes at the end of an exhaustion of the forms and topoi that traditionally guaranteed the intelligibility of the cosmos. We then turn to three distinct moments in Wordsworth's poetic maturity [1798, 1802, 1807], which suggest that the return of ideology in his work responds intimately to the radical upheaval from which it draws its inspiration. Romantic poetry Landscape Cosmology Georgic Modernity Poetic forms
388	Equilibrage de charge dynamique sur plates-formes hiérarchiques Quintin, Jean-noël 08 December 2011 (has links) (PDF) The race for ever more computing power, which hardware manufacturers have been running for many years, has changed in character over the past few years: we are now witnessing a genuine democratisation of parallel machines, together with ever-increasing complexity in processor structure. In the long run, it is entirely conceivable that fully heterogeneous architectures, composed of a set of cores connected by a network-on-chip, will reach the general public. Parallelising applications and running them in parallel on these upcoming machines therefore raises many problems. Among these, we focus here on the problem of scheduling a set of tasks on a set of cores, that is, choosing how to assign the work to be done to the available resources. Existing methods fall into two types of algorithms: online and offline. Online algorithms such as work stealing have the advantage of working without any information about the hardware or the duration of tasks, but they generally do not manage communications efficiently. In this thesis, we address online task scheduling on complex platforms where the network can limit performance through congestion. More precisely, we propose new online scheduling algorithms, based on work stealing, that target two different settings. First, we consider applications whose dependency graph is known in advance. Using this information allows us to limit the amount of data transferred and to achieve better performance than the best known offline algorithms. 
Second, we study the optimisations that become possible when the scheduling algorithm knows the topology of the platform. Once again, we show that this information can be exploited to obtain a significant performance gain. Our work thus extends the scope of scheduling algorithms to more complex architectures and may allow better use of tomorrow's machines. Work stealing Load balancing Hierarchical platforms
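Entry 388 builds on work stealing, an online scheduling scheme in which idle workers take tasks from busy ones. The sketch below is a single-threaded, round-based simulation of the plain random-victim variant, without the dependency-graph or topology awareness that the thesis adds. The function name, the round structure, and the choice to execute a stolen task immediately are assumptions made for the example.

```python
import random
from collections import deque


def simulate_work_stealing(task_lists, seed=0):
    """Simulate work stealing and return the tasks executed by each worker.

    Each worker owns a deque of tasks. In every round a worker pops a task
    from the back of its own deque; a worker whose deque is empty picks a
    random victim and steals one task from the front of the victim's deque.
    """
    rng = random.Random(seed)
    queues = [deque(tasks) for tasks in task_lists]
    executed = [[] for _ in task_lists]
    while any(queues):
        for worker, queue in enumerate(queues):
            if queue:
                executed[worker].append(queue.pop())
            else:
                victim = rng.randrange(len(queues))
                if victim != worker and queues[victim]:
                    # In this sketch a stolen task is executed right away.
                    executed[worker].append(queues[victim].popleft())
    return executed


# Hypothetical workload: all eight tasks start on worker 0.
executed = simulate_work_stealing([list(range(8)), [], []])
```

The scheduler needs no knowledge of task durations or hardware, which is the advantage the abstract attributes to online algorithms; the random victim choice is also what makes communication costs hard to control.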
389	Etude expérimentale des instabilités thermoconvectives de Rayleigh-Bénard dans les fluides viscoplastiques Abdelali, Ahmed 13 March 2012 (has links) (PDF) The Rayleigh-Bénard phenomenon corresponds to the unstable state of a horizontal layer of a thermally expansible fluid subjected to a temperature gradient DT. If this gradient exceeds a critical value DTc, convective motion arises within the fluid. For yield-stress fluids, the phenomenon becomes more complex. The yield stress adds to the stabilising forces within the fluid and fundamentally alters mass transfer and heat transfer. Initially, the fluid is at rest; the velocity gradient is therefore zero and the effective viscosity is infinite everywhere. The linear stability approach cannot provide a solution to the flow equations, because buoyancy forces would have to perturb a fluid of infinite viscosity. In this thesis, Rayleigh-Bénard experiments were carried out on Carbopol 940-based solutions that exhibit a yield stress. The experimental setup yielded interesting quantitative and qualitative results. The thermoconvective motions were then filmed using the shadowgraph technique. The nonlinear effect at the onset of convection was observed. [SPI:OTHER] Engineering Sciences/Other Rheology Yield-stress fluids Rayleigh-Bénard convection Nusselt number Thermoconvective structures and patterns Shadowgraphy
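For orientation on the critical temperature difference mentioned in entry 389, the sketch below evaluates the classical Rayleigh number Ra = g * beta * DT * d^3 / (nu * kappa) and inverts it for DTc. It applies to Newtonian fluids only; as the abstract notes, this linear-stability picture does not carry over to yield-stress fluids. The function names are hypothetical, the critical value of about 1708 is the standard result for a layer between two rigid plates, and the fluid properties are round illustrative numbers, not data from the thesis.

```python
def rayleigh_number(g, beta, delta_t, depth, nu, kappa):
    """Rayleigh number of a horizontal fluid layer (SI units).

    g: gravity, beta: thermal expansion coefficient, delta_t: temperature
    difference across the layer, depth: layer thickness, nu: kinematic
    viscosity, kappa: thermal diffusivity.
    """
    return g * beta * delta_t * depth ** 3 / (nu * kappa)


def critical_delta_t(ra_critical, g, beta, depth, nu, kappa):
    """Temperature difference at which Ra reaches ra_critical."""
    return ra_critical * nu * kappa / (g * beta * depth ** 3)


# Illustrative, roughly water-like properties (assumed values).
RA_CRITICAL_RIGID = 1708.0  # Newtonian fluid, rigid top and bottom plates
dt_c = critical_delta_t(RA_CRITICAL_RIGID, g=9.81, beta=2e-4,
                        depth=0.01, nu=1e-6, kappa=1.4e-7)
```

Because the layer depth enters as a cube, halving the depth raises the critical temperature difference eightfold in this Newtonian estimate.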
390	Évaluation de système biométrique El Abed, Mohamad 09 December 2011 (has links) (PDF) Biometric systems are increasingly used to verify or determine the identity of an individual. Given the stakes involved in their use, particularly for applications in electronic commerce, it is especially important to have a methodology for evaluating such systems. The problem addressed in this thesis is the design of a generic methodology for evaluating a biometric system. Three methods are proposed in this thesis: 1) a no-reference quality method for predicting the quality of a biometric sample, 2) a usability method for evaluating user acceptability and satisfaction when using biometric systems, and 3) a security analysis method for measuring the robustness of a biometric system to attacks. EVALUATION PATTERN RECOGNITION (COMPUTER SCIENCE) IMAGE PROCESSING DIGITAL TECHNIQUES CLASSIFICATION
