Global ETD Search

41	Towards scalable, multi-view urban modeling using structure priors / Vers une modélisation urbaine 3D extensible intégrant des à priori de structure géométrique Bourki, Amine 21 December 2017 (has links) Nous étudions dans cette thèse le problème de reconstruction 3D multi-vue à partir d’une séquence d’images au sol acquises dans des environnements urbains ainsi que la prise en compte d’a priori permettant la préservation de la structure sous-jacente de la géométrie 3D observée, ainsi que le passage à l’échelle de tels processus de reconstruction qui est intrinsèquement délicat dans le contexte de l’imagerie urbaine. Bien que ces deux axes aient été traités de manière extensive dans la littérature, les méthodes de reconstruction 3D structurée souffrent d’une complexité en temps de calculs restreignant significativement leur intérêt. D’autre part, les approches de reconstruction 3D large échelle produisent généralement une géométrie simplifiée, perdant ainsi des éléments de structures qui sont importants dans le contexte urbain. L’objectif de cette thèse est de concilier les avantages des approches de reconstruction 3D structurée à celles des méthodes rapides produisant une géométrie simplifiée. Pour ce faire, nous présentons “Patchwork Stereo”, un framework qui combine stéréoscopie photométrique utilisant une poignée d’images issues de points de vue éloignés, et un nuage de point épars. Notre méthode intègre une analyse simultanée 2D-3D réalisant une extraction robuste de plans 3D ainsi qu’une segmentation d’images top-down structurée et repose sur une optimisation par champs de Markov aléatoires. 
Les contributions présentées sont évaluées via des expériences quantitatives et qualitatives sur des données d’imagerie urbaine complexes illustrant des performances tant quant à la fidélité structurelle des reconstructions 3D que du passage à l’échelle / In this thesis, we address the problem of 3D reconstruction from a sequence of calibrated street-level photographs with a simultaneous focus on scalability and the use of structure priors in Multi-View Stereo (MVS). While both aspects have been studied broadly, existing scalable MVS approaches do not handle well the ubiquitous, yet simple, structural regularities of man-made environments. On the other hand, structure-aware 3D reconstruction methods are slow and scale poorly with the size of the input sequences and/or may even require additional restrictive information. The goal of this thesis is to reconcile scalability and structure awareness within common MVS grounds using soft, generic priors which encourage: (i) piecewise planarity, (ii) alignment of object boundaries with image gradients and (iii) with vanishing directions (VDs), and (iv) object co-planarity. To do so, we present the novel “Patchwork Stereo” framework which integrates photometric stereo from a handful of wide-baseline views and a sparse 3D point cloud combining robust 3D plane extraction and top-down image partitioning from a unified 2D-3D analysis in a principled Markov Random Field energy minimization. We evaluate our contributions quantitatively and qualitatively on challenging urban datasets and illustrate results which are at least on par with state-of-the-art methods in terms of geometric structure, but achieved several orders of magnitude faster, paving the way for photo-realistic city-scale modeling. Reconstruction 3D Multi-Vue A priori de Structure Passage à l'échelle Modélisation Urbaine Multi-View Stereo 3D Reconstruction Structure Priors Scalability Urban Modeling
42	A Formalized Approach to Multi-View Components for Embedded Systems : Applied to Tool Integration, Run-Time Adaptivity and Architecture Exploration Persson, Magnus January 2013 (has links) Development of embedded systems poses an increasing challenge for developers largely due to increasing complexity. Several factors contribute to the complexity challenge: • the number of extra-functional properties applying to embedded systems, such as resource usage, timing effects, safety. • the functionality of embedded systems, to a larger extent than for other software, involves engineers from multiple different disciplines, such as mechanical, control, software, safety, systems and electrical engineers. The multi-disciplinarity causes the development environments to consist of separate data, models and tools. Several engineering paradigms to handle this complexity increase have been suggested, including methodologies focused on architecture, models and components. In systems engineering, a long-standing approach has been to describe the system in several views, each according to a certain viewpoint. By doing so, a divide-and-conquer strategy is applied to system concerns. Unfortunately, it is hard to always find completely independent concerns: there is always some semantic overlap between the different views. Model-based design (MBD) deals with building sound abstractions that can represent a system under design and be used for analysis. Component-based design (CBD) focuses on how to build reusable component models with well-defined composition models. In this thesis, a concept of formalized multi-viewed component models (MVCM) is proposed, which integrates the three above mentioned paradigms. Principles and guidelines for MVCMs are developed. One of the main challenges for the proposition is to provide MVCMs that produce composability both along component boundaries and viewpoint boundaries. 
To accomplish this, the relations between viewpoints need to be explicitly taken into account. Further, the semantic relations between these viewpoints need to be explicitly modeled in order to efficiently ensure that the views are kept consistent. As a main contribution, this thesis presents the formalization of the concepts needed to build such component models. A proper formalization of multi-viewed concerns provides several opportunities. Given suitable tool support, it will be feasible to automate architecture analysis and architecture exploration. The thesis includes a number of case studies that provide insight and feedback to the problem formulation and validate the results. The case studies include a resource-aware reconfigurable middleware, a design of an architecture exploration methodology, and a windshield wiper system. / QC 20130527 view viewpoint architecture component view integration architecture exploration component-based development multi-view modeling model-based design CESAR DySCAS ESPRESSO
43	Joint Utilization Of Local Appearance Descriptors And Semi-local Geometry For Multi-view Object Recognition Soysal, Medeni 01 May 2012 (has links) (PDF) Novel methods of object recognition that form a bridge between today's local feature frameworks and the previous decade's strong but deserted geometric invariance field are presented in this dissertation. The rationale behind this effort is to complement the lowered discriminative capacity of local features with invariant geometric descriptions. Similar to our predecessors, we first start with constrained cases and then extend the applicability of our methods to more general scenarios. The local features approach, on which our methods are established, is reviewed in three parts; namely, detectors, descriptors and the methods of object recognition that employ them. Next, a novel planar object recognition framework that lifts the requirement for exact appearance-based local feature matching is presented. This method enables matching of groups of features by utilizing both appearance information and group geometric descriptions. An under-investigated area, scene logo recognition, is selected for real-life application of this method. Finally, we present a novel method for three-dimensional (3D) object recognition, which utilizes well-known local features in a more efficient way without any reliance on partial or global planarity. Geometrically consistent local features, which form the crucial basis for object recognition, are identified using affine 3D geometric invariants. The utilization of 3D geometric invariants replaces the classical 2D affine transform estimation/verification step, and provides the ability to directly verify 3D geometric consistency. The accuracy and robustness of the proposed method in highly cluttered scenes, with no prior segmentation or post 3D reconstruction requirements, are demonstrated in the experiments.
44	On discriminative semi-supervised incremental learning with a multi-view perspective for image concept modeling Byun, Byungki 17 January 2012 (has links) This dissertation presents the development of a semi-supervised incremental learning framework with a multi-view perspective for image concept modeling. For reliable image concept characterization, having a large number of labeled images is crucial. However, the size of the training set is often limited due to the cost required for generating concept labels associated with objects in a large quantity of images. To address this issue, in this research, we propose to incrementally incorporate unlabeled samples into a learning process to enhance concept models originally learned with a small number of labeled samples. To tackle the sub-optimality problem of conventional techniques, the proposed incremental learning framework selects unlabeled samples based on an expected error reduction function that measures contributions of the unlabeled samples based on their ability to increase the modeling accuracy. To improve the convergence property of the proposed incremental learning framework, we further propose a multi-view learning approach that makes use of multiple features such as color, texture, etc., of images when including unlabeled samples. For robustness to mismatches between training and testing conditions, a discriminative learning algorithm, namely a kernelized maximal-figure-of-merit (kMFoM) learning approach, is also developed. Combining individual techniques, we conduct a set of experiments on various image concept modeling problems, such as handwritten digit recognition, object recognition, and image spam detection to highlight the effectiveness of the proposed framework. Discriminative learning Semi-supervised learning Incremental learning Image modeling Multi-view learning Machine learning Supervised learning (Machine learning) Boosting (Algorithms)
45	Stochastic methods in computational stereo Coffman, Thayne Richard 16 June 2011 (has links) Computational stereo estimates 3D structure by analyzing visual changes between two or more passive images of a scene that are captured from different viewpoints. It is a key enabler for ubiquitous autonomous systems, large-scale surveying, virtual reality, and improved techniques for compression, tracking, and object recognition. The fact that computational stereo is an under-constrained inverse problem causes many challenges. Its computational and memory requirements are high. Typical heuristics and assumptions, used to constrain solutions or reduce computation, prevent treatment of key realities such as reflection, translucency, ambient lighting changes, or moving objects in the scene. As a result, a general solution is lacking. Stochastic models are common in computational stereo, but stochastic algorithms are severely under-represented. In this dissertation I present two stochastic algorithms and demonstrate their advantages over deterministic approaches. I first present the Quality-Efficient Stochastic Sampling (QUESS) approach. QUESS reduces the number of match quality function evaluations needed to estimate dense stereo correspondences. This facilitates the use of complex quality metrics or metrics that take unique values at non-integer disparities. QUESS is shown to outperform two competing approaches, and to have more attractive memory and scaling properties than approaches based on exhaustive sampling. I then present a second novel approach based on the Hough transform and extend it with distributed ray tracing (DRT). DRT is a stochastic anti-aliasing technique common to computer rendering but which has not been used in computational stereo. I demonstrate that the DRT-enhanced approach outperforms the unenhanced approach, a competing variation that uses re-accumulation in the Hough domain, and another baseline approach. 
DRT’s advantages are particularly strong for reduced image resolution and/or reduced accumulator matrix resolution. In support of this second approach, I develop two novel variations of the Hough transform that use DRT, and demonstrate that they outperform competing variations on a traditional line segment detection problem. I generalize these two examples to draw broader conclusions, suggest future work, and call for a deeper exploration by the community. Both practical and academic gaps in the state of the art can be reduced by a renewed exploration of stochastic computational stereo techniques. Computational stereo Stereo vision Multi-view stereo Stochastic approximation Hough transform Distributed ray tracing Autonomous systems Object recognition
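As a rough illustration of the Hough-domain voting that entry 45 builds on, the sketch below accumulates votes for lines in (theta, rho) space and can optionally jitter each point before it votes. The jitter is only a loose stand-in for stochastic sampling and is not the dissertation's DRT-enhanced algorithm; the function names, the accumulator resolution, and the uniform jitter model are all assumptions made for this example.

```python
import math
import random


def hough_lines(points, n_theta=180, n_rho=200, rho_max=100.0,
                samples=1, jitter=0.0, seed=0):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta)."""
    rng = random.Random(seed)
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for _ in range(samples):
            # Hypothetical sub-pixel jitter: uniform within +/- jitter.
            xs = x + rng.uniform(-jitter, jitter)
            ys = y + rng.uniform(-jitter, jitter)
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = xs * math.cos(theta) + ys * math.sin(theta)
                # Map rho in [-rho_max, rho_max) to an accumulator column.
                r = int((rho + rho_max) / (2.0 * rho_max) * n_rho)
                if 0 <= r < n_rho:
                    acc[t][r] += 1
    return acc


def strongest_line(acc, rho_max=100.0):
    """Return (theta, rho, votes) for the fullest accumulator cell."""
    n_theta, n_rho = len(acc), len(acc[0])
    t, r = max(((t, r) for t in range(n_theta) for r in range(n_rho)),
               key=lambda tr: acc[tr[0]][tr[1]])
    theta = math.pi * t / n_theta
    rho = (r + 0.5) * 2.0 * rho_max / n_rho - rho_max  # cell centre
    return theta, rho, acc[t][r]


# Fifty collinear points on the horizontal line y = 20.5.
points = [(float(x), 20.5) for x in range(50)]
theta, rho, votes = strongest_line(hough_lines(points))
```

With jitter left at 0.0 this is the plain deterministic transform; raising samples and jitter spreads each point's votes over neighbouring cells, which is the kind of trade-off the abstract discusses for coarse accumulators.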
46	Compréhension de scènes urbaines par combinaison d'information 2D/3D / Urban scenes understanding by combining 2D/3D information Bauda, Marie-Anne 13 June 2016 (has links) Cette thèse traite du problème de segmentation sémantique d'une séquence d'images calibrées acquises dans un environnement urbain. Ce problème consiste, plus précisément, à partitionner chaque image en régions représentant les objets de la scène (façades, routes, etc.). Ainsi, à chaque région est associée une étiquette sémantique. Dans notre approche, l'étiquetage s'opère via des primitives visuelles de niveau intermédiaire appelés super-pixels, lesquels regroupent des pixels similaires au sens de différents critères proposés dans la littérature, qu'ils soient photométriques (s'appuyant sur les couleurs) ou géométriques (limitant la taille des super-pixels formés). Contrairement à l'état de l'art, où les travaux récents traitant le même problème s'appuient en entrée sur une sur-segmentation initiale sans la remettre en cause, notre idée est de proposer, dans un contexte multi-vues, une nouvelle approche de constructeur de superpixels s'appuyant sur une analyse tridimensionnelle de la scène et, en particulier, de ses structures planes. Pour construire de «meilleurs» superpixels, une mesure de planéité locale, qui quantifie à quel point la zone traitée de l'image correspond à une surface plane de la scène, est introduite. Cette mesure est évaluée à partir d'une rectification homographique entre deux images proches, induites par un plan candidat au support des points 3D associés à la zone traitée. Nous analysons l'apport de la mesure UQI (Universal Quality Image) et montrons qu'elle se compare favorablement aux autres métriques qui ont le potentiel de détecter des structures planes. 
On introduit ensuite un nouvel algorithme de construction de super-pixels, fondé sur l'algorithme SLIC (Simple Linear Iterative Clustering) dont le principe est de regrouper les plus proches voisins au sens d'une distance fusionnant similarités en couleur et en distance, et qui intègre cette mesure de planéité. Ainsi la sur-segmentation obtenue, couplée à la cohérence interimages provenant de la validation de la contrainte de planéité locale de la scène, permet d'attribuer une étiquette à chaque entité et d'obtenir ainsi une segmentation sémantique qui partitionne l'image en objets plans. / This thesis deals with the semantic segmentation problem of a calibrated sequence of images acquired in an urban environment. The problem is, specifically, to partition each image into regions representing the objects in the scene such as facades, roads, etc. Thus, each region is associated with a semantic label. In our approach, the labelling is done through mid-level visual features called super-pixels, which are groups of pixels that are similar according to criteria proposed in the literature, whether photometric (based on colour) or geometric (limiting the size of the super-pixels formed). Unlike the state of the art, where recent work addressing the same problem is based on an initial over-segmentation input without calling it into question, our idea is to offer, in a multi-view environment, another super-pixel construction approach based on a three-dimensional scene analysis and, in particular, an analysis of its planar structures. In order to construct "better" super-pixels, a local flatness measure is introduced which quantifies the extent to which the zone of the image in question corresponds to a planar surface of the scene. This measure is assessed from the homographic rectification between two close images, induced by a candidate plane supporting the 3D points associated with the area concerned. 
We analyze the contribution of the UQI (Universal Quality Index) measure and demonstrate that it compares favorably with other metrics which have the potential to detect planar structures. Subsequently we introduce a new superpixel construction algorithm based on the SLIC (Simple Linear Iterative Clustering) algorithm, whose principle is to group the nearest neighbors in terms of a distance merging similarities in colour and position, and which includes this local planarity measure. The resulting over-segmentation, coupled with the inter-image coherence that follows from validating the scene's local flatness constraint, makes it possible to assign a label to each entity and thus obtain a semantic segmentation which divides the image into planar objects. Segmentation sémantique Superpixels Multi-vues Mesures de cohérence photométrique Planéité et homographie Semantic segmentation Superpixels Multi-view Photo-consistency measure Flatness and homography
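For readers unfamiliar with the UQI named in entry 46, the sketch below computes the index in its usual form, Q = 4 * cov(x, y) * mean(x) * mean(y) / ((var(x) + var(y)) * (mean(x)^2 + mean(y)^2)), on two flattened image patches. It is only a minimal illustration: how the thesis warps one patch onto the other by a plane-induced homography is not shown, and both the function name uqi and the choice to return 0.0 for a zero denominator are assumptions made here.

```python
def uqi(x, y):
    """Universal Quality Index of two equal-length intensity sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("patches must be non-empty and of equal length")
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    denom = (vx + vy) * (mx * mx + my * my)
    if denom == 0.0:
        # Assumed convention for flat, zero-valued patches.
        return 0.0
    return 4.0 * cxy * mx * my / denom


patch = [1.0, 2.0, 3.0, 4.0]
same = uqi(patch, patch)                       # identical patches
flipped = uqi(patch, patch[::-1])              # anti-correlated patches
brighter = uqi(patch, [2.0, 4.0, 6.0, 8.0])    # same structure, other gain
```

The index lies in [-1, 1] and reaches 1 only when the two patches are identical, so a high value between a patch and its rectified counterpart would suggest that the candidate plane explains the observed region.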
47	Détection de personnes pour des systèmes de videosurveillance multi-caméra intelligents / People detection methods for intelligent multi-Camera surveillance systems Mehmood, Muhammad Owais 28 September 2015 (has links) La détection de personnes dans les vidéos est un défi bien connu du domaine de la vision par ordinateur avec un grand nombre d'applications telles que le développement de systèmes de surveillance visuels. Même si les détecteurs monoculaires sont plus simples à mettre en place, ils sont dans l’incapacité de gérer des scènes complexes avec des occultations, une grande densité de personnes ou des scènes avec beaucoup de profondeur de champ menant à une grande variabilité dans la taille des personnes. Dans cette thèse, nous étudions la détection de personnes multi-vues et notamment l'utilisation de cartes d'occupation probabilistes créées en fusionnant les différentes vues grâce à la connaissance de la géométrie du système. La détection à partir de ces cartes d'occupation amène cependant des fausses détections (appelées « fantômes ») dues aux différentes projections. Nous proposons deux nouvelles techniques afin de remédier à ce phénomène et améliorer la détection des personnes. La première utilise une déconvolution par un noyau dont la forme varie spatialement tandis que la seconde est basée sur un principe de validation d’hypothèse. Ces deux approches n'utilisent volontairement pas l'information temporelle qui pourra être réintroduite par la suite dans des algorithmes de suivi. Les deux approches ont été validées dans des conditions difficiles présentant des occultations, une densité de personnes plus ou moins élevée et de fortes variations dans les réponses colorimétriques des caméras. 
Une comparaison avec d'autres méthodes de l’état de l'art a également été menée sur trois bases de données publiques, validant les méthodes proposées pour la surveillance d'une gare et d'un aéroport / People detection is a well-studied open challenge in the field of Computer Vision with applications such as visual surveillance systems. Monocular detectors have limited ability to handle occlusion, clutter, scale, and density. The ubiquitous presence of cameras and computational resources fuels the development of multi-camera detection systems. In this thesis, we study multi-camera people detection; specifically, the use of multi-view probabilistic occupancy maps based on the camera calibration. Occupancy maps allow multi-view geometric fusion of several camera views. Detection with such maps creates several false detections ("ghosts"), and we study how to prune them. Further, we propose two novel techniques in order to improve multi-view detection based on: (a) kernel deconvolution, and (b) occupancy shape modeling. We perform non-temporal, multi-view reasoning in occupancy maps to recover accurate positions of people in challenging conditions such as occlusion, clutter, lighting, and camera variations. We show improvements in people detection across three challenging datasets for visual surveillance, including comparison with state-of-the-art techniques. We show the application of this work in exigent transportation scenarios, i.e. people detection for surveillance at a train station and at an airport. Géométrie multi-Vues Fusion de capteurs Reconnaissance des Formes Détection d'objets Surveillance Multi-View Geometry Sensor Fusion Pattern Recognition Object Detection Surveillance
48	Procedural reconstruction of buildings : towards large scale automatic 3D modeling of urban environments / Reconstruction procédurale de bâtiments : vers l’automatisation à grande échelle de la modélisation 3D d’environnements urbains Simon, Loïc 25 July 2011 (has links) La présente thèse est consacrée à la modélisation 2D et 3D d’environnements urbains à l’aide de représentations structurées et de grammaires de formes. Notre approche consiste à introduire une représentation sémantique de bâtiments, qui encode les contraintes architecturales attendues, et qui soit capable de traiter des exemples complexes en utilisant des grammaires très simples. En outre, nous proposons deux nouveaux algorithmes d’inférence permettant l’analyse grammaticale d’images en utilisant ces grammaires. En premier lieu, un algorithme dit de hill climbing permet d’extraire les règles de grammaire et les paramètres correspondants à partir d’une vue unique d’une façade. Ce concept combine astucieusement les contraintes grammaticales et les propriétés visuelles attendues pour les différents éléments architecturaux. Cependant, afin de pouvoir traiter de cas plus complexes et également d’incorporer de l’information 3D, une deuxième stratégie d’inférence basée sur des algorithmes évolutionnaires a été adoptée pour optimiser un fonction à deux objectifs qui introduit notamment des notions de profondeur. Le système proposé a été évalué tant qualitativement que quantitativement sur un panel de façades de référence toute munies d’annotations, démontrant ainsi sa robustesse face à des situations d’abords difficiles. Grâce à la force du contexte grammatical, des améliorations substantielles ont été démontrées par rapport aux performances des mêmes modèles couplés à des a priori uniquement locaux. 
Par conséquent, notre approche fournit des outils puissants permettant de faire face à la demande croissante en modélisation 3D d’environnements réels à large échelle, grâce à des représentations sémantiques compactes et structurées. Ce travail ouvre par ailleurs un vaste champ de perspectives pour le domaine de l’interprétation d’images / This thesis is devoted to 2D and 3D modeling of urban environments using structured representations and grammars. Our approach introduces a semantic representation for buildings that encodes expected architectural constraints and is able to derive complex instances using fairly simple grammars. Furthermore, we propose two novel inference algorithms to parse images using such grammars. First, a steepest-ascent hill climbing algorithm is used to derive the grammar rules and the corresponding parameters from a single facade view. It combines the grammar constraints with the expected visual properties of the different architectural elements. To address more complex scenarios and incorporate 3D information, a second inference strategy based on evolutionary computational algorithms is adopted to optimize a two-component objective function introducing depth cues. The proposed framework was evaluated qualitatively and quantitatively on a benchmark of annotated facades, demonstrating robustness to challenging situations. Substantial improvement due to the strong grammatical context was shown in comparison to the performance of the same appearance models coupled with local priors. Therefore, our approach provides powerful techniques in response to the increasing demand for large-scale 3D modeling of real environments through compact, structured and semantic representations, while opening new perspectives for image understanding. Modélisation 3D d’architecture Grammaire de formes Reconstruction multi-vues Architectural image-based modeling Shape grammars Multi-view reconstruction
49	Automatic Volume Estimation of Timber from Multi-View Stereo 3D Reconstruction Rundgren, Emil January 2017 (has links) The ability to automatically estimate the volume of timber is becoming increasingly important within the timber industry. The large number of timber trucks arriving each day at Swedish timber terminals reinforces the need for volume estimation performed in real time and on the go as the trucks arrive. This thesis investigates whether a volumetric integration of disparity maps acquired from a Multi-View Stereo (MVS) system is a suitable approach for automatic volume estimation of timber loads. As real-time execution is preferred, efforts were made to provide a scalable method. The proposed method was quantitatively evaluated on datasets containing two geometric objects of known volume. A qualitative comparison to manual volume estimates of timber loads was also made on datasets recorded at a Swedish timber terminal. The proposed method is shown to be both accurate and precise under specific circumstances. However, it is not robust to varying weather conditions, although a more thorough evaluation of this aspect needs to be performed. The method is also parallelizable, which means that future efforts can be made to significantly decrease execution time. 3D Reconstruction Multi-View Stereo Automatic Volume Estimation Signed Distance Function Computer Vision Signal Processing Image Processing
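Entry 49 lists a signed distance function among its keywords. As a hedged sketch of how a volume can be read off such a representation, the code below samples a signed distance function at the centres of a regular voxel grid and sums the volume of the voxels that fall inside the surface. It does not reproduce the thesis's pipeline (fusing disparity maps into the volume is omitted entirely); the names volume_from_sdf and sphere_sdf, the cubic bounding box, and the unit sphere used as test geometry are assumptions for illustration.

```python
import math


def volume_from_sdf(sdf, lo, hi, step):
    """Estimate enclosed volume by counting voxel centres with sdf < 0.

    The cube [lo, hi]^3 is assumed to contain the whole object.
    """
    n = int(round((hi - lo) / step))
    inside = 0
    for i in range(n):
        x = lo + (i + 0.5) * step
        for j in range(n):
            y = lo + (j + 0.5) * step
            for k in range(n):
                z = lo + (k + 0.5) * step
                if sdf(x, y, z) < 0.0:
                    inside += 1
    return inside * step ** 3


def sphere_sdf(x, y, z, radius=1.0):
    """Signed distance to a sphere centred at the origin (negative inside)."""
    return math.sqrt(x * x + y * y + z * z) - radius


estimate = volume_from_sdf(sphere_sdf, lo=-1.2, hi=1.2, step=0.05)
exact = 4.0 / 3.0 * math.pi
```

Each voxel is evaluated independently, so the triple loop is trivially parallelizable, which matches the abstract's remark that execution time could be reduced in future work.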
50	Improving Classification and Attribute Clustering: An Iterative Semi-supervised Approach Seifi, Farid January 2015 (has links) This thesis proposes a novel approach to attribute clustering. It exploits the strength of semi-supervised learning to improve the quality of attribute clustering, particularly when labeled data is limited. The significance of this work derives in part from the broad, and increasingly important, usage of attribute clustering to address outstanding problems within the machine learning community. This form of clustering has also been shown to have strong practical applications, being usable in heavyweight industrial applications. Although researchers have focused on supervised and unsupervised attribute clustering in recent years, semi-supervised attribute clustering has not received substantial attention. In this research, we propose an innovative two-step iterative semi-supervised attribute clustering framework. In each iteration, this new framework uses the result of attribute clustering to improve a classifier. It then uses the classifier to augment the training data used by attribute clustering in the next iteration. This iterative framework outputs an improved classifier and attribute clustering at the same time. It gives more accurate clusters of attributes which better fit the real relations between attributes. In this study we propose two new usages for attribute clustering to improve classification: solving the automatic view definition problem for multi-view learning and improving missing attribute-value handling at induction and prediction time. The application of these two new usages of attribute clustering in our proposed semi-supervised attribute clustering is evaluated using real world data sets from different domains. Attribute Clustering Classification Big Data Semi-supervised Learning Machine Learning Multi-view Learning Missing Attribute-value Handling
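The iterative idea in entry 50, where a classifier's confident predictions enlarge the labeled set used in the next round, can be sketched with plain self-training. The code below is a deliberately simplified stand-in and not the thesis's framework: it has no attribute-clustering step, it uses a one-dimensional nearest-centroid classifier, and the names self_train, margin and max_iter are hypothetical.

```python
def centroids(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums = {}
    for x, y in labeled:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}


def predict(cents, x):
    """Label of the nearest class centroid."""
    return min(cents, key=lambda y: abs(x - cents[y]))


def self_train(labeled, unlabeled, margin=1.0, max_iter=10):
    """Repeatedly pseudo-label confident samples and refit the centroids."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_iter):
        cents = centroids(labeled)
        confident, rest = [], []
        for x in pool:
            dists = sorted(abs(x - c) for c in cents.values())
            # Confident when the nearest centroid wins by at least `margin`.
            if len(dists) < 2 or dists[1] - dists[0] >= margin:
                confident.append((x, predict(cents, x)))
            else:
                rest.append(x)
        if not confident:
            break
        labeled.extend(confident)
        pool = rest
    return centroids(labeled), pool


seed_labels = [(0.0, "a"), (10.0, "b")]
model, leftover = self_train(seed_labels, [1.0, 2.0, 9.0, 8.0, 5.2])
```

Samples near the decision boundary stay unlabeled instead of being forced into a class, which is one simple way to limit the error propagation that iterative semi-supervised schemes are prone to.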
