1 |
Extended Subwindow Search and Pictorial Structures
Gu, Zhiqiang, January 2012
In computer vision, the pictorial structure model represents an object in an image by parts arranged in a deformable configuration. Each part describes the object's local photometric appearance, and the configuration encodes its global geometric layout. This model has been very successful in recent object recognition systems.

We extend the pictorial structure model in three respects. First, for models that contain only a single part, we develop new methods, including regularized subwindow search, nested window search, and twisted window search, that handle richer priors and more flexible shapes. Second, we introduce the notion of a weak pictorial structure, as opposed to a strong one, to characterize a loose geometric layout in a rotationally invariant way. Third, we develop nested models that encode topological inclusion relations between parts in order to represent richer patterns.

We show that all the extended models can be matched to images efficiently using dynamic programming and variants of the generalized distance transform, which computes the lower envelope of transformed cones on a dense image grid. This transform turns out to be important for a wide variety of computer vision tasks and often accelerates the computation at hand by an order of magnitude. We demonstrate improved results in quality, speed, or both in object matching, saliency measurement, online and offline tracking, and object localization and recognition.

Dissertation
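The lower envelope of cones mentioned in the abstract can be illustrated in one dimension. The sketch below is not the thesis's method or any of its variants; it is the standard two-pass distance transform for an L1 (cone-shaped) deformation cost, where each grid position q contributes a cone of height `cost[q]` and the result at p is the lowest cone value there. The function names, the `slope` parameter, and the sample costs are assumptions made for illustration.

```python
def distance_transform_l1(cost, slope=1.0):
    """Return d[p] = min over q of (cost[q] + slope * |p - q|) in O(n)."""
    d = list(cost)
    # Forward pass: let each cone extend to the right.
    for p in range(1, len(d)):
        d[p] = min(d[p], d[p - 1] + slope)
    # Backward pass: let each cone extend to the left.
    for p in range(len(d) - 2, -1, -1):
        d[p] = min(d[p], d[p + 1] + slope)
    return d


def distance_transform_brute_force(cost, slope=1.0):
    """Same quantity computed directly in O(n^2), used only as a check."""
    n = len(cost)
    return [min(cost[q] + slope * abs(p - q) for q in range(n)) for p in range(n)]


# Hypothetical per-position part-appearance costs on a 1D grid.
cost = [4.0, 1.0, 7.0, 7.0, 0.0, 9.0]
envelope = distance_transform_l1(cost)
print(envelope)  # [2.0, 1.0, 2.0, 1.0, 0.0, 1.0]
```

The two linear passes replace the quadratic search over all pairs of positions, which is the kind of saving that makes dynamic programming over part locations practical on a dense grid.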
|
2 |
Visual tracking of articulated and flexible objects
Wesierski, Daniel, 25 March 2013
Humans can visually track objects almost effortlessly. For a computer, however, it is hard to track a fast-moving object under varying illumination and occlusions, in clutter, and with an appearance that varies in camera projective space because of relaxed rigidity or a change in viewpoint. Since a generic, precise, robust, and fast tracker could enable many applications, object tracking has been a fundamental problem of practical importance since the beginnings of computer vision. The first contribution of the thesis is a computationally efficient approach to tracking objects of various shapes and motions. It describes a unifying tracking system that can be configured to track the pose of a deformable object in a low- or high-dimensional state space. The object is decomposed into a chained assembly of segments of multiple parts that are arranged under a hierarchy of tailored spatio-temporal constraints. The robustness and generality of the approach are demonstrated extensively on tracking various flexible and articulated objects. Haar-like features are widely used in tracking. The second contribution of the thesis is a parser of ensembles of Haar-like features that computes them efficiently. The features are decomposed into simpler kernels, possibly shared by subsets of features, thus forming multi-pass convolutions. Discovering and aligning these kernels within and between passes allows forming recursive trees of kernels that require fewer memory operations than the classic computation, producing the same result more efficiently. The approach is validated experimentally on popular examples of Haar-like features.
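For context on what a Haar-like feature computes, here is a minimal sketch of a two-rectangle feature evaluated with an integral image (summed-area table). The abstract does not say what "the classic computation" is, so treating it as the integral-image approach is an assumption, and this is not the thesis's kernel-tree parser. All function names and the sample image are hypothetical.

```python
def integral_image(img):
    """Summed-area table padded with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half of the window; width is assumed even."""
    half = width // 2
    left_sum = box_sum(ii, top, left, height, half)
    right_sum = box_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Hypothetical 3x4 grayscale patch.
img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
feature = haar_two_rect_horizontal(ii, 0, 0, 3, 4)
print(feature)  # -12
```

Each rectangle costs four memory reads regardless of its size, so the number of reads per feature grows with the number of rectangles. That per-feature memory traffic is the quantity the abstract says the shared-kernel decomposition reduces.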
|
3 |
Visual tracking of articulated and flexible objects / Suivi par vision d’objets articulés et flexibles
Wesierski, Daniel, 25 March 2013
Humans can visually track objects almost effortlessly. For a computer, however, it is hard to track a fast-moving object under varying illumination and occlusions, in clutter, and with an appearance that varies in camera projective space because of relaxed rigidity or a change in viewpoint. Since a generic, precise, robust, and fast tracker could enable many applications, object tracking has been a fundamental problem of practical importance since the beginnings of computer vision. The first contribution of the thesis is a computationally efficient approach to tracking objects of various shapes and motions. It describes a unifying tracking system that can be configured to track the pose of a deformable object in a low- or high-dimensional state space. The object is decomposed into a chained assembly of segments of multiple parts that are arranged under a hierarchy of tailored spatio-temporal constraints. The robustness and generality of the approach are demonstrated extensively on tracking various flexible and articulated objects. Haar-like features are widely used in tracking. The second contribution of the thesis is a parser of ensembles of Haar-like features that computes them efficiently. The features are decomposed into simpler kernels, possibly shared by subsets of features, thus forming multi-pass convolutions. Discovering and aligning these kernels within and between passes allows forming recursive trees of kernels that require fewer memory operations than the classic computation, producing the same result more efficiently. The approach is validated experimentally on popular examples of Haar-like features.
|
4 |
3D detection and pose estimation of medical staff in operating rooms using RGB-D images / Détection et estimation 3D de la pose des personnes dans la salle opératoire à partir d'images RGB-D
Kadkhodamohammadi, Abdolrahim, 01 December 2016
In this thesis, we address two problems, person detection and pose estimation in operating rooms (ORs), which are key ingredients in the development of surgical assistance applications. We perceive the OR using compact RGB-D cameras that can be conveniently integrated into the room. These sensors provide complementary information about the scene, which enables us to develop methods that cope with the numerous challenges present in the OR, e.g. clutter, textureless surfaces, and occlusions. We present novel part-based approaches that take advantage of depth, multi-view, and temporal information to construct robust human detection and pose estimation models. Evaluation is performed on new single- and multi-view datasets recorded in operating rooms. We demonstrate very promising results and show that our approaches outperform state-of-the-art methods on this challenging data acquired during real surgeries.
|