Global ETD Search

1	Spatio-temporal data interpolation for dynamic scene analysis Kim, Kihwan 06 January 2012 (has links) Analysis and visualization of dynamic scenes is often constrained by the amount of spatio-temporal information available from the environment. In most scenarios, we have to account for incomplete information and sparse motion data, requiring us to employ interpolation and approximation methods to fill for the missing information. Scattered data interpolation and approximation techniques have been widely used for solving the problem of completing surfaces and images with incomplete input data. We introduce approaches for such data interpolation and approximation from limited sensors, into the domain of analyzing and visualizing dynamic scenes. Data from dynamic scenes is subject to constraints due to the spatial layout of the scene and/or the configurations of video cameras in use. Such constraints include: (1) sparsely available cameras observing the scene, (2) limited field of view provided by the cameras in use, (3) incomplete motion at a specific moment, and (4) varying frame rates due to different exposures and resolutions. In this thesis, we establish these forms of incompleteness in the scene, as spatio-temporal uncertainties, and propose solutions for resolving the uncertainties by applying scattered data approximation into a spatio-temporal domain. The main contributions of this research are as follows: First, we provide an efficient framework to visualize large-scale dynamic scenes from distributed static videos. Second, we adopt Radial Basis Function (RBF) interpolation to the spatio-temporal domain to generate global motion tendency. The tendency, represented by a dense flow field, is used to optimally pan and tilt a video camera. Third, we propose a method to represent motion trajectories using stochastic vector fields. Gaussian Process Regression (GPR) is used to generate a dense vector field and the certainty of each vector in the field. The generated stochastic fields are used for recognizing motion patterns under varying frame-rate and incompleteness of the input videos. Fourth, we also show that the stochastic representation of vector field can also be used for modeling global tendency to detect the region of interests in dynamic scenes with camera motion. We evaluate and demonstrate our approaches in several applications for visualizing virtual cities, automating sports broadcasting, and recognizing traffic patterns in surveillance videos. Surveillance application Dynamic scene analysis Scattered data approximation Dynamic scene visualization Multivew analysis Video analysis Scattered data interpolation Computer vision Image processing Interpolation Electronic surveillance Video surveillance
2	The Study of Energy Consumption of Acceleration Structures for Dynamic CPU and GPU Ray Tracing Chang, Chen Hao Jason 08 January 2007 (has links) Battery life has been the slowest growing resource on mobile systems for several decades. Although much work has been done on designing new chips and peripherals that use less energy, there has not been much work on reducing energy consumption by removing energy intensive tasks from graphics algorithms. In our work, we focus on energy consumption of the ray tracing task because it is a resource-intensive, global-illumination algorithm. We focus our effort on ray tracing dynamic scenes, thus we concentrate on identifying the major elements determining the energy consumption of acceleration structures. We believe acceleration structures are critical in reducing energy consumption because they need to be built inexpensively, but must also be complex enough to boost rendering speed. We conducted tests on a Pentium 1.6 GHz laptop with GeForce Go 6800 GPU. In our experiments, we investigated various elements that modify the acceleration structure build algorithm, and we compared the energy usage of CPU and GPU rendering with different acceleration structures. Furthermore, the energy per frame when ray tracing dynamic scenes was gathered and compared to identify the best acceleration structure that provides a good balance between building energy consumption and rendering energy consumption. We found the bounding volume hierarchy to be the best acceleration structure when rendering dynamic scenes with the GPU on our test system. A bounding volume hierarchy is not the most inexpensive structure to build, but it can be rendered cheaply on the GPU while introducing acceptable energy overhead when rebuilding. In addition, we found the fastest algorithm was also the most inexpensive in terms of energy consumption. We propose an energy model based on this finding. Battery Energy Ray Tracing GPU Dynamic Scene Acceleration Structure Computer graphics Energy consumption Ray tracing
3	Motion Capture of Deformable Surfaces in Multi-View Studios Cagniart, Cedric 16 July 2012 (has links) (PDF) In this thesis we address the problem of digitizing the motion of three-dimensional shapes that move and deform in time. These shapes are observed from several points of view with cameras that record the scene's evolution as videos. Using available reconstruction methods, these videos can be converted into a sequence of three-dimensional snapshots that capture the appearance and shape of the objects in the scene. The focus of this thesis is to complement appearance and shape with information on the motion and deformation of objects. In other words, we want to measure the trajectory of every point on the observed surfaces. This is a challenging problem because the captured videos are only sequences of images, and the reconstructed shapes are built independently from each other. While the human brain excels at recreating the illusion of motion from these snapshots, using them to automatically measure motion is still largely an open problem. The majority of prior works on the subject has focused on tracking the performance of one human actor, and used the strong prior knowledge on the articulated nature of human motion to handle the ambiguity and noise inherent to visual data. In contrast, the presented developments consist of generic methods that allow to digitize scenes involving several humans and deformable objects of arbitrary nature. To perform surface tracking as generically as possible, we formulate the problem as the geometric registration of surfaces and deform a reference mesh to fit a sequence of independently reconstructed meshes. We introduce a set of algorithms and numerical tools that integrate into a pipeline whose output is an animated mesh. Our first contribution consists of a generic mesh deformation model and numerical optimization framework that divides the tracked surface into a collection of patches, organizes these patches in a deformation graph and emulates elastic behavior with respect to the reference pose. As a second contribution, we present a probabilistic formulation of deformable surface registration that embeds the inference in an Expectation-Maximization framework that explicitly accounts for the noise and in the acquisition. As a third contribution, we look at how prior knowledge can be used when tracking articulated objects, and compare different deformation model with skeletal-based tracking. The studies reported by this thesis are supported by extensive experiments on various 4D datasets. They show that in spite of weaker assumption on the nature of the tracked objects, the presented ideas allow to process complex scenes involving several arbitrary objects, while robustly handling missing data and relatively large reconstruction artifacts. Deformable surface tracking Multi-view Dynamic scene Deformable registration Expectation-Maximization EM
4	Fusions multimodales pour la recherche d'humains par un robot mobile / Multimodal fusions for human detection by a mobile robot Labourey, Quentin 19 May 2017 (has links) Dans ce travail, nous considérons le cas d'un robot mobile d'intérieur dont l'objectif est de détecter les humains présents dans l'environnement et de se positionner physiquement par rapport à eux, dans le but de mieux percevoir leur état. Pour cela, le robot dispose de différents capteurs (capteur RGB-Depth, microphones, télémètre laser). Des contributions de natures variées ont été effectuées :Classification d'événements sonores en environnement intérieur : La méthode de classification proposée repose sur une taxonomie de petite taille et est destinée à différencier les marqueurs de la présence humaine. L'utilisation de fonctions de croyance permet de prendre en compte l'incertitude de la classification, et de labelliser un son comme « inconnu ».Fusion audiovisuelle pour la détection de locuteurs successifs dans une conversation : Une méthode de détection de locuteurs est proposée dans le cas du robot immobile, placé comme témoin d'une interaction sociale. Elle repose sur une fusion audiovisuelle probabiliste. Cette méthode a été testée sur des vidéos acquises par le robot.Navigation dédiée à la détection d'humains à l'aide d'une fusion multimodale : A partir d'informations provenant des capteurs hétérogènes, le robot cherche des humains de manière autonome dans un environnement connu. Les informations sont fusionnées au sein d'une grille de perception multimodale. Cette grille permet au robot de prendre une décision quant à son prochain déplacement, à l'aide d'un automate reposant sur des niveaux de priorité des informations perçues. Ce système a été implémenté et testé sur un robot Q.bo.Modélisation crédibiliste de l'environnement pour la navigation : La construction de la grille de perception multimodale est améliorée à l'aide d'un mécanisme de fusion reposant sur la théorie des fonctions de croyance. Ceci permet au robot de maintenir une grille « évidentielle » dans le temps comprenant l'information perçue et son incertitude. Ce système a d'abord été évalué en simulation, puis sur le robot Q.bo. / In this work, we consider the case of mobile robot that aims at detecting and positioning itself with respect to humans in its environment. In order to fulfill this mission, the robot is equipped with various sensors (RGB-Depth, microphones, laser telemeter). This thesis contains contributions of various natures:Sound classification in indoor environments: A small taxonomy is proposed in a classification method destined to enable a robot to detect human presence. Uncertainty of classification is taken into account through the use of belief functions, allowing us to label a sound as "unknown".Speaker tracking thanks to audiovisual data fusion: The robot is witness to a social interaction and tracks the successive speakers with probabilistic audiovisual data fusion. The proposed method was tested on videos extracted from the robot's sensors.Navigation dedicated to human detection thanks to a multimodal fusion:} The robot autonomously navigates in a known environment to detect humans thanks to heterogeneous sensors. The data is fused to create a multimodal perception grid. This grid enables the robot to chose its destinations, depending on the priority of perceived information. This system was implemented and tested on a Q.bo robot.Credibilist modelization of the environment for navigation: The creation of the multimodal perception grid is improved by the use of credibilist fusion. This enables the robot to maintain an evidential grid in time, containing the perceived information and its uncertainty. This system was implemented in simulation first, and then on a Q.bo robot. Fusion de donnée capteurs Robot compagnon Perception multimodale Analyse de scène dynamique Multisensor data fusion Companion robot Multimodal perception Dynamic scene analysis 004
5	Motion Capture of Deformable Surfaces in Multi-View Studios / Acquisition de surfaces déformables à partir d'un système multicaméra calibré Cagniart, Cédric 16 July 2012 (has links) Cette thèse traite du suivi temporel de surfaces déformables. Ces surfaces sont observées depuis plusieurs points de vue par des caméras qui capturent l'évolution de la scène et l'enregistrent sous la forme de vidéos. Du fait des progrès récents en reconstruction multi-vue, cet ensemble de vidéos peut être converti en une série de clichés tridimensionnels qui capturent l'apparence et la forme des objets dans la scène. Le problème au coeur des travaux rapportés par cette thèse est de complémenter les informations d'apparence et de forme avec des informations sur les mouvements et les déformations des objets. En d'autres mots, il s'agit de mesurer la trajectoire de chacun des points sur les surfaces observées. Ceci est un problème difficile car les vidéos capturées ne sont que des séquences d'images, et car les formes reconstruites à chaque instant le sont indépendemment les unes des autres. Si le cerveau humain excelle à recréer l'illusion de mouvement à partir de ces clichés, leur utilisation pour la mesure automatisée du mouvement reste une question largement ouverte. La majorité des précédents travaux sur le sujet se sont focalisés sur la capture du mouvement humain et ont bénéficié de la nature articulée de ce mouvement qui pouvait être utilisé comme a-priori dans les calculs. La spécificité des développements présentés ici réside dans la généricité des méthodes qui permettent de capturer des scènes dynamiques plus complexes contenant plusieurs acteurs et différents objets déformables de nature inconnue a priori. Pour suivre les surfaces de la façon la plus générique possible, nous formulons le problème comme celui de l'alignement géométrique de surfaces, et déformons un maillage de référence pour l'aligner avec les maillages indépendemment reconstruits de la séquence. Nous présentons un ensemble d'algorithmes et d'outils numériques intégrés dans une chaîne de traitements dont le résultat est un maillage animé. Notre première contribution est une méthode de déformation de maillage qui divise la surface en une collection de morceaux élémentaires de surfaces que nous nommons patches. Ces patches sont organisés dans un graphe de déformation, et une force est appliquée sur cette structure pour émuler une déformation élastique par rapport à la pose de référence. Comme seconde contribution, nous présentons une formulation probabiliste de l'alignement de surfaces déformables qui modélise explicitement le bruit dans le processus d'acquisition. Pour finir, nous étudions dans quelle mesure les a-prioris sur la nature articulée du mouvement peuvent aider, et comparons différents modèles de déformation à une méthode de suivi de squelette. Les développements rapportés par cette thèse sont validés par de nombreuses expériences sur une variété de séquences. Ces résultats montrent qu'en dépit d'a-prioris moins forts sur les surfaces suivies, les idées présentées permettent de traiter des scènes complexes contenant de multiples objets tout en se comportant de façon robuste vis-a-vis de données fragmentaires et d'erreurs de reconstruction. / In this thesis we address the problem of digitizing the motion of three-dimensional shapes that move and deform in time. These shapes are observed from several points of view with cameras that record the scene's evolution as videos. Using available reconstruction methods, these videos can be converted into a sequence of three-dimensional snapshots that capture the appearance and shape of the objects in the scene. The focus of this thesis is to complement appearance and shape with information on the motion and deformation of objects. In other words, we want to measure the trajectory of every point on the observed surfaces. This is a challenging problem because the captured videos are only sequences of images, and the reconstructed shapes are built independently from each other. While the human brain excels at recreating the illusion of motion from these snapshots, using them to automatically measure motion is still largely an open problem. The majority of prior works on the subject has focused on tracking the performance of one human actor, and used the strong prior knowledge on the articulated nature of human motion to handle the ambiguity and noise inherent to visual data. In contrast, the presented developments consist of generic methods that allow to digitize scenes involving several humans and deformable objects of arbitrary nature. To perform surface tracking as generically as possible, we formulate the problem as the geometric registration of surfaces and deform a reference mesh to fit a sequence of independently reconstructed meshes. We introduce a set of algorithms and numerical tools that integrate into a pipeline whose output is an animated mesh. Our first contribution consists of a generic mesh deformation model and numerical optimization framework that divides the tracked surface into a collection of patches, organizes these patches in a deformation graph and emulates elastic behavior with respect to the reference pose. As a second contribution, we present a probabilistic formulation of deformable surface registration that embeds the inference in an Expectation-Maximization framework that explicitly accounts for the noise and in the acquisition. As a third contribution, we look at how prior knowledge can be used when tracking articulated objects, and compare different deformation model with skeletal-based tracking. The studies reported by this thesis are supported by extensive experiments on various 4D datasets. They show that in spite of weaker assumption on the nature of the tracked objects, the presented ideas allow to process complex scenes involving several arbitrary objects, while robustly handling missing data and relatively large reconstruction artifacts. Suivi de surfaces déformables Multi-vues Scène dynamique Alignement de surfaces Espérance-Maximisation EM Deformable surface tracking Multi-view Dynamic scene Deformable registration Expectation-Maximization EM
6	Real-time Realistic Rendering And High Dynamic Range Image Display And Compression Xu, Ruifeng 01 January 2005 (has links) This dissertation focuses on the many issues that arise from the visual rendering problem. Of primary consideration is light transport simulation, which is known to be computationally expensive. Monte Carlo methods represent a simple and general class of algorithms often used for light transport computation. Unfortunately, the images resulting from Monte Carlo approaches generally suffer from visually unacceptable noise artifacts. The result of any light transport simulation is, by its very nature, an image of high dynamic range (HDR). This leads to the issues of the display of such images on conventional low dynamic range devices and the development of data compression algorithms to store and recover the corresponding large amounts of detail found in HDR images. This dissertation presents our contributions relevant to these issues. Our contributions to high dynamic range image processing include tone mapping and data compression algorithms. This research proposes and shows the efficacy of a novel level set based tone mapping method that preserves visual details in the display of high dynamic range images on low dynamic range display devices. The level set method is used to extract the high frequency information from HDR images. The details are then added to the range compressed low frequency information to reconstruct a visually accurate low dynamic range version of the image. Additional challenges associated with high dynamic range images include the requirements to reduce excessively large amounts of storage and transmission time. To alleviate these problems, this research presents two methods for efficient high dynamic range image data compression. One is based on the classical JPEG compression. It first converts the raw image into RGBE representation, and then sends the color base and common exponent to classical discrete cosine transform based compression and lossless compression, respectively. The other is based on the wavelet transformation. It first transforms the raw image data into the logarithmic domain, then quantizes the logarithmic data into the integer domain, and finally applies the wavelet based JPEG2000 encoder for entropy compression and bit stream truncation to meet the desired bit rate requirement. We believe that these and similar such contributions will make a wide application of high dynamic range images possible. The contributions to light transport simulation include Monte Carlo noise reduction, dynamic object rendering and complex scene rendering. Monte Carlo noise is an inescapable artifact in synthetic images rendered using stochastic algorithm. This dissertation proposes two noise reduction algorithms to obtain high quality synthetic images. The first one models the distribution of noise in the wavelet domain using a Laplacian function, and then suppresses the noise using a Bayesian method. The other extends the bilateral filtering method to reduce all types of Monte Carlo noise in a unified way. All our methods reduce Monte Carlo noise effectively. Rendering of dynamic objects adds more dimension to the expensive light transport simulation issue. This dissertation presents a pre-computation based method. It pre-computes the surface radiance for each basis lighting and animation key frame, and then renders the objects by synthesizing the pre-computed data in real-time. Realistic rendering of complex scenes is computationally expensive. This research proposes a novel 3D space subdivision method, which leads to a new rendering framework. The light is first distributed to each local region to form local light fields, which are then used to illuminate the local scenes. The method allows us to render complex scenes at interactive frame rates. Rendering has important applications in mixed reality. Consistent lighting and shadows between real scenes and virtual scenes are important features of visual integration. The dissertation proposes to render the virtual objects by irradiance rendering using live captured environmental lighting. This research also introduces a virtual shadow generation method that computes shadows cast by virtual objects to the real background. We finally conclude the dissertation by discussing a number of future directions for rendering research, and presenting our proposed approaches. high dynamic range image data compression tone mapping Monte Carlo noise dynamic scene complex scene real-time rendering realistic rendering global illumination Computer Sciences Engineering
7	Immersive Dynamic Scenes for Virtual Reality from a Single RGB-D Camera Lai, Po Kong 26 September 2019 (has links) In this thesis we explore the concepts and components which can be used as individual building blocks for producing immersive virtual reality (VR) content from a single RGB-D sensor. We identify the properties of immersive VR videos and propose a system composed of a foreground/background separator, a dynamic scene re-constructor and a shape completer. We initially explore the foreground/background separator component in the context of video summarization. More specifically, we examined how to extract trajectories of moving objects from video sequences captured with a static camera. We then present a new approach for video summarization via minimization of the spatial-temporal projections of the extracted object trajectories. New evaluation criterion are also presented for video summarization. These concepts of foreground/background separation can then be applied towards VR scene creation by extracting relative objects of interest. We present an approach for the dynamic scene re-constructor component using a single moving RGB-D sensor. By tracking the foreground objects and removing them from the input RGB-D frames we can feed the background only data into existing RGB-D SLAM systems. The result is a static 3D background model where the foreground frames are then super-imposed to produce a coherent scene with dynamic moving foreground objects. We also present a specific method for extracting moving foreground objects from a moving RGB-D camera along with an evaluation dataset with benchmarks. Lastly, the shape completer component takes in a single view depth map of an object as input and "fills in" the occluded portions to produce a complete 3D shape. We present an approach that utilizes a new data minimal representation, the additive depth map, which allows traditional 2D convolutional neural networks to accomplish the task. The additive depth map represents the amount of depth required to transform the input into the "back depth map" which would exist if there was a sensor exactly opposite of the input. We train and benchmark our approach using existing synthetic datasets and also show that it can perform shape completion on real world data without fine-tuning. Our experiments show that our data minimal representation can achieve comparable results to existing state-of-the-art 3D networks while also being able to produce higher resolution outputs. virtual reality immersive virtual reality VR immersive VR computer vision 3D computer vision machine learning 3D machine learning deep learning convolutional neural networks image processing 3D reconstruction scene reconstruction dynamic scene reconstruction shape completion occlusion filling video summarization
8	Représentation dynamique de modèles d'acteurs issus de reconstructions multi-vues / Dynamic representation of actors' models from multi-view reconstructions Blache, Ludovic 20 April 2016 (has links) Les technologies de reconstruction multi-vues permettent de réaliser un clone virtuel d'un acteur à partir d'une simple acquisition vidéo réalisée par un ensemble de caméras à partir de multiples points de vue. Cette approche offre de nouvelles opportunités dans le domaine de la composition de scènes hybrides mélangeant les images réelles et virtuelles. Cette thèse a été réalisée dans le cadre du projet RECOVER 3D dont l'objectif est de développer une chaîne de production TV complète, de l'acquisition jusqu'à la diffusion, autour de la reconstruction multi-vues. Cependant la technologie utilisée dans ce contexte est mal adaptée à la reconstruction de scènes dynamiques. En effet, la performance d'un acteur est reproduite sous la forme d'une séquence d'objets 3D statiques qui correspondent aux poses successives du personnage au cours de la capture vidéo. L'objectif de cette thèse est de développer une méthode pour transformer ces séquences de poses en un modèle animé unique. Les travaux de recherches menés dans ce cadre sont répartis en deux étapes principales. La première a pour but de calculer un champ de déplacements qui décrit les mouvements de l'acteur entre deux poses consécutives. La seconde étape consiste à animer un maillage suivant les trajectoires décrites par le champ de mouvements, de manière à le déplacer vers la pose suivante. En répétant ce processus tout au long la séquence, nous parvenons ainsi à reproduire un maillage animé qui adopte les poses successives de l'acteur. Les résultats obtenus montrent que notre méthode peut générer un modèle temporellement cohérent à partir d'une séquence d'enveloppes visuelles. / 4D multi-view reconstruction technologies are more and more used in media production due to their abilities to produce a virtual clone of an actor from a simple video acquisition performed by a set of multi-viewpoint cameras. This approach is a major advance for the composition of animations which mix virtual and real images, and also offers new possibilities for the rendering of such complex hybrid scenes. The work described in this thesis takes parts in the RECOVER 3D project which aims at developing an innovative industrial framework for TV production, based on multi-view reconstruction, from studio acquisition to broadcasting. The major drawback of the methods used in this context is that they are not adapted to the reconstruction of dynamic scenes. The output are time series which describe the successive poses of the actor, figured as a sequence of static objects. The goal of this thesis is to transform these initial results into a dynamic 3D object where the actor is figured as an animated character. The research detailed in this manuscript presents two main contributions. The first one is centered on the computation of a motion flow which represents the displacements occurring in the reconstructed scene between two consecutive poses. The second one presents a mesh animation process that leads to the animation of a 3D model from one pose to another, following the motion flow. This two-step operation is repeated throughout the entire pose sequence to finally obtain a single animated mesh that matches the evolving shape of the reconstructed actor. Results show that our method is able to produce a temporally consistent mesh animation from various sequences of visual hulls. Reconstruction multi-Vues Animation Déformation de maillage Champ de déplacements As-Rigid-As-Possible Scène dynamique Multi-View reconstruction Animation Mesh deformation Motion flow As-Rigid-As-Possible Dynamic scene 006.6
9	Skládání HDR obrazu pro pohyblivou scénu / HDR Composition for Dynamic Scene Martinů, Lukáš January 2015 (has links) Master's thesis is focused on capturing of low dynamic range images using common devices such as camera and its multiple exposure. The main part of thesis is dedicated to composing these images to HDR image, inclusive sequence of images of static scenes, but also dynamic ones. Next part describes tone mapping used for display HDR image on LDR monitors. Moreover, there is given design and implementation of application solving problems mentioned earlier. In the end, the implemented application is evaluated and the possible continuation of this work is stated.
10	Lightmap Generation and Parameterizationfor Real-Time 3D Infra-Red Scenes Amjad, Meisam 02 August 2019 (has links) No description available. Computer Science lightmap parameterization lightmap generation dynamic scene Real-Time scene Infra-Red 3D infra-Red Scene lightmap for heat signature lightmap for heat sources 3D mesh to UV mesh parameterization

Search results