91

Compact Representations and Multi-cue Integration for Robotics

Söderberg, Robert January 2005 (has links)
<p>This thesis presents methods useful in a bin picking application, such as detection and representation of local features, pose estimation and multi-cue integration.</p><p>The scene tensor is a representation of multiple line or edge segments and was first introduced by Nordberg in [30]. A method for estimating scene tensors from gray-scale images is presented. The method is based on orientation tensors, where the scene tensor can be estimated by correlations of the elements in the orientation tensor with a number of 1<em>D</em> filters. Mechanisms for analyzing the scene tensor are described and an algorithm for detecting interest points and estimating feature parameters is presented. It is shown that the algorithm works on a wide spectrum of images with good results.</p><p>Representations that are invariant with respect to a set of transformations are useful in many applications, such as pose estimation, tracking and wide baseline stereo. The scene tensor itself is not invariant, and three different methods for implementing an invariant representation based on the scene tensor are presented. One is based on a non-linear transformation of the scene tensor and is invariant to perspective transformations. Two versions of a tensor doublet are presented, based on the geometry of two interest points and invariant to translation, rotation and scaling. The tensor doublet is used in a framework for view-centered pose estimation of 3<em>D</em> objects. It is shown that the pose estimation algorithm performs well even when the object is occluded and has a different scale compared to the training situation.</p><p>An industrial implementation of a bin picking application has to cope with several different types of objects. All pose estimation algorithms use some kind of model, and there is yet no model that can cope with all kinds of situations and objects. This thesis presents a method for integrating cues from several pose estimation algorithms to increase system stability. It is also shown that the same framework can be used to increase the accuracy of the system by using cues from several views of the object. An extensive test with several different objects, lighting conditions and backgrounds shows that multi-cue integration makes the system more robust and increases its accuracy.</p><p>Finally, a system for bin picking is presented, built from the previous parts of this thesis. An eye-in-hand setup is used with a standard industrial robot arm. It is shown that the system works in real bin-picking situations, with a positioning error below 1 mm and an orientation error below 1<sup>o</sup> for most of the different situations.</p> / Report code: LiU-TEK-LIC-2005:15.
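The similarity invariance of the tensor doublet described above can be illustrated with a minimal sketch (hypothetical helper names and a 2D simplification, not the thesis implementation): encoding each interest point's local orientation relative to the baseline joining the two points removes the effects of translation, rotation and uniform scaling.

```python
import math

def doublet_descriptor(p1, theta1, p2, theta2):
    """Similarity-invariant descriptor for a pair of interest points.

    p1, p2: (x, y) positions; theta1, theta2: local orientation angles.
    Subtracting the baseline direction phi from each orientation removes
    the effect of translation, rotation and uniform scaling.
    """
    phi = math.atan2(p2[1] - p1[1], p2[0] - p1[0])  # baseline direction
    return ((theta1 - phi) % (2 * math.pi),
            (theta2 - phi) % (2 * math.pi))

def transform(p, angle, scale, t):
    """Apply a similarity transform (rotate, scale, translate) to a 2D point."""
    c, s = math.cos(angle), math.sin(angle)
    return (scale * (c * p[0] - s * p[1]) + t[0],
            scale * (s * p[0] + c * p[1]) + t[1])
```

Transforming both points (and rotating their orientations accordingly) leaves the descriptor unchanged, which is the property the pose estimation framework exploits.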
92

Acquiring 3D Full-body Motion from Noisy and Ambiguous Input

Lou, Hui May 2012 (has links)
Natural human motion is in high demand and widely used in a variety of applications such as video games and virtual reality. However, acquisition of full-body motion remains challenging because the system must be capable of accurately capturing a wide variety of human actions while not requiring a considerable amount of time and skill to assemble. For instance, commercial optical motion capture systems such as Vicon can capture human motion with high accuracy and resolution, but they often require post-processing by experts, which is time-consuming and costly. Microsoft Kinect, despite its high popularity and wide range of applications, does not provide accurate reconstruction of complex movements when significant occlusions occur. This dissertation explores two different approaches that accurately reconstruct full-body human motion from noisy and ambiguous input data captured by commercial motion capture devices. The first approach automatically generates high-quality human motion from noisy data obtained from commercial optical motion capture systems, eliminating the need for post-processing. The second approach accurately captures a wide variety of human motion even under significant occlusions by using color/depth data captured by a single Kinect camera. The common theme that underlies the two approaches is the use of prior knowledge embedded in a pre-recorded motion capture database to reduce the reconstruction ambiguity caused by noisy and ambiguous input and to constrain the solution to lie in the natural motion space. More specifically, the first approach constructs a series of spatial-temporal filter bases from pre-captured human motion data and employs them along with robust statistics techniques to filter motion data corrupted by noise and outliers. The second approach formulates the problem in a Maximum a Posteriori (MAP) framework and generates the most likely pose that explains the observations and is consistent with the patterns embedded in the pre-recorded motion capture database. We demonstrate the effectiveness of our approaches through extensive numerical evaluations on synthetic data and comparisons against results created by commercial motion capture systems. The first approach can effectively denoise a wide variety of noisy motion data, including walking, running, jumping and swimming, while the second approach is shown to be capable of accurately reconstructing a wider range of motions compared with Microsoft Kinect.
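The MAP idea described in this abstract can be illustrated for a single scalar pose parameter, assuming a Gaussian observation likelihood and a Gaussian prior standing in for the motion capture database (all names and numbers are illustrative, not the dissertation's actual model):

```python
def map_estimate(observation, obs_sigma, prior_mean, prior_sigma):
    """MAP estimate for a scalar pose parameter.

    With likelihood N(observation, obs_sigma^2) and prior
    N(prior_mean, prior_sigma^2), the posterior mode is a
    precision-weighted average of observation and prior mean.
    """
    w_obs = 1.0 / obs_sigma ** 2   # precision of the observation
    w_pri = 1.0 / prior_sigma ** 2  # precision of the database prior
    return (w_obs * observation + w_pri * prior_mean) / (w_obs + w_pri)
```

The behavior mirrors the abstract's point: when the observation is unreliable (large `obs_sigma`, e.g. an occluded joint), the estimate is pulled toward the prior learned from pre-recorded motion.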
93

Statistical and geometric methods for visual tracking with occlusion handling and target reacquisition

Lee, Jehoon 17 January 2012 (has links)
Computer vision is the science that studies how machines understand scenes and automatically make decisions based on meaningful information extracted from an image or multi-dimensional data of the scene, much like human vision. One common and well-studied field of computer vision is visual tracking, a challenging and active research area in the computer vision community. Visual tracking is the task of continuously estimating the pose of an object of interest against the background in consecutive frames of an image sequence. It is a ubiquitous task and a fundamental technology of computer vision that provides low-level information used for high-level applications such as visual navigation, human-computer interaction, and surveillance systems. The focus of the research in this thesis is visual tracking and its applications. More specifically, the objective of this research is to design a reliable tracking algorithm for a deformable object that is robust to clutter and capable of occlusion handling and target reacquisition in realistic tracking scenarios, using statistical and geometric methods. To this end, the approaches developed in this thesis make extensive use of region-based active contours and particle filters in a variational framework. In addition, to deal with occlusion and target reacquisition problems, we exploit the benefits of coupling 2D and 3D information of an image and an object. In this thesis, first, we present an approach for tracking a moving object based on 3D range information in stereoscopic temporal imagery by combining particle filtering and geometric active contours. Range information is weighted by the proposed Gaussian weighting scheme to improve the segmentation achieved by active contours. In addition, this work presents an on-line shape learning method based on principal component analysis to reacquire the track of an object in the event that it disappears from the field of view and reappears later. Second, we propose an approach to jointly track a rigid object in a 2D image sequence and estimate its pose in 3D space. In this work, we take advantage of knowledge of a 3D model of the object and employ particle filtering to generate and propagate the translation and rotation parameters in a decoupled manner. Moreover, to continuously track the object in the presence of occlusions, we propose an occlusion detection and handling scheme based on controlling the degree of dependence between predictions and measurements of the system. Third, we introduce a fast level-set-based algorithm applicable to real-time applications. In this algorithm, a contour-based tracker is improved in terms of computational complexity, and the tracker performs real-time curve evolution for detecting multiple windows. Lastly, we deal with rapid human motion in the context of object segmentation and visual tracking. Specifically, we introduce a model-free and marker-less approach for human body tracking based on a dynamic color model and geometric information of a human body from a monocular video sequence. The contributions of this thesis are summarized as follows:
1. A reliable algorithm to track deformable objects in a sequence consisting of 3D range data, combining particle filtering and statistics-based active contour models.
2. An effective handling scheme, based on the object's 2D shape information, for the challenging situation in which the tracked object disappears completely from the image domain during tracking.
3. A robust 2D-3D pose tracking algorithm using a 3D shape prior and particle filters on SE(3).
4. An occlusion handling scheme based on the degree of trust between predictions and measurements of the tracking system, controlled in an online fashion.
5. Fast level-set-based active contour models applicable to real-time object detection.
6. A model-free and marker-less approach for tracking rapid human motion based on a dynamic color model and geometric information of a human body.
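The particle filtering that this thesis combines with active contours can be sketched in its simplest bootstrap form for a one-dimensional state (the motion and observation noise parameters are illustrative, not those of the thesis):

```python
import math
import random

def particle_filter_step(particles, weights, observation,
                         motion_sigma=0.1, obs_sigma=0.5):
    """One predict-update-resample cycle of a bootstrap particle filter.

    particles: list of scalar state hypotheses; weights: their weights.
    Returns resampled particles with uniform weights.
    """
    # Predict: propagate each particle through a random-walk motion model.
    predicted = [p + random.gauss(0.0, motion_sigma) for p in particles]
    # Update: reweight by a Gaussian observation likelihood.
    new_w = [w * math.exp(-0.5 * ((observation - p) / obs_sigma) ** 2)
             for p, w in zip(predicted, weights)]
    total = sum(new_w)
    if total == 0.0:  # all likelihoods underflowed: fall back to uniform
        new_w = [1.0 / len(predicted)] * len(predicted)
    else:
        new_w = [w / total for w in new_w]
    # Resample: draw particles proportionally to their weights.
    resampled = random.choices(predicted, weights=new_w, k=len(predicted))
    return resampled, [1.0 / len(resampled)] * len(resampled)
```

After a few cycles the particle cloud concentrates around states consistent with the observations, which is what makes the filter suitable for propagating pose hypotheses.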
94

Visual Perception of Objects and their Parts in Artificial Systems

Schoeler, Markus 12 October 2015 (has links)
No description available.
95

Detection and pose estimation of instances of a rigid object for robotic bin-picking

Brégier, Romain 11 June 2018 (has links)
Visual object detection and estimation of object poses -- i.e. position and orientation for a rigid object -- is of utmost interest for automatic scene understanding. In this thesis, we address this topic through the bin-picking scenario, in which instances of a rigid object have to be automatically detected and localized in bulk, so that a robot can manipulate them for various industrial tasks such as machine feeding, assembling and packing. To this end, we propose a novel method for object detection and pose estimation from an input depth image, based on the aggregation of local predictions through a Hough forest technique, compatible with the industrial constraints of cycle time, performance and ease of use. Overcoming the limitations of existing approaches, which assume objects have no proper symmetries, we develop a theoretical and practical framework enabling us to consider any physically admissible rigid object, thanks to a novel definition of the notion of pose and an associated distance quantifying the length of the smallest displacement between two poses. This framework provides tools to handle poses efficiently in operations such as pose averaging or neighborhood queries, and is based on rigorous mathematical developments. Evaluation benchmarks used in the literature are not very representative of our application scenario and suffer from some intrinsic limitations, so we formalize an evaluation methodology suited to scenes in which many object instances, partially occluded and in arbitrary poses, may be present. We apply this methodology to real and synthetic data and demonstrate the soundness of our approach compared to the state of the art.
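The idea of a pose distance tolerant to object symmetries can be sketched as follows, using a plain geodesic rotation distance minimized over a discrete symmetry group (a simplification: the thesis's metric also accounts for the object's geometry and translation):

```python
import math

def rot_z(a):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_T(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def geodesic_angle(R1, R2):
    """Angle of the relative rotation R1^T R2, in radians."""
    R = mat_mul(mat_T(R1), R2)
    tr = R[0][0] + R[1][1] + R[2][2]
    return math.acos(max(-1.0, min(1.0, (tr - 1.0) / 2.0)))

def sym_distance(R1, R2, symmetries):
    """Rotation distance modulo the object's proper symmetry group:
    the smallest displacement over all symmetry-equivalent poses."""
    return min(geodesic_angle(mat_mul(R1, S), R2) for S in symmetries)
```

For an object with a 2-fold symmetry about z, `symmetries = [rot_z(0), rot_z(math.pi)]`, and two poses differing by a half-turn about that axis are correctly reported as identical.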
96

Detection of object position

Baáš, Filip January 2019 (has links)
This Master’s thesis deals with object pose estimation using a monocular camera. An object is considered to be any rigid entity with a fixed shape and strong edges, ideally textureless. The object position in this work is represented by a transformation matrix, which describes the object’s translation and rotation relative to a world coordinate system. The first chapter is dedicated to the theory of geometric transformations and the intrinsic and extrinsic parameters of a camera. It also describes the Chamfer matching detection algorithm, which is used in this work. The second chapter describes the development tools used in this work. The third, fourth and fifth chapters are dedicated to the practical realization of the work’s goal and the achieved results. The last chapter describes the created application, which performs pose estimation of a known object in a scene.
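The transformation-matrix representation of pose mentioned above can be sketched with a minimal 4x4 homogeneous transform (rotation about z plus translation; the helper names and values are illustrative, not the thesis's code):

```python
import math

def make_pose(angle_z, t):
    """4x4 homogeneous transform: rotation about z, then translation t."""
    c, s = math.cos(angle_z), math.sin(angle_z)
    return [[c, -s, 0.0, t[0]],
            [s,  c, 0.0, t[1]],
            [0.0, 0.0, 1.0, t[2]],
            [0.0, 0.0, 0.0, 1.0]]

def apply(T, p):
    """Map a 3D point (tuple) through the homogeneous transform."""
    ph = p + (1.0,)
    return tuple(sum(T[i][j] * ph[j] for j in range(4)) for i in range(3))

def invert(T):
    """Invert a rigid transform: R^T on the block, -R^T t on the translation."""
    R = [[T[j][i] for j in range(3)] for i in range(3)]  # transposed rotation
    t = [-sum(R[i][j] * T[j][3] for j in range(3)) for i in range(3)]
    return [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]
```

The inverse transform maps camera (or world) coordinates back into the object's frame, which is the direction needed when comparing an estimated pose against a model.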
97

Evaluation of 3D motion capture data from a deep neural network combined with a biomechanical model

Rydén, Anna, Martinsson, Amanda January 2021 (has links)
Motion capture has in recent years grown in interest in many fields, from the game industry to sports analysis. The need for reflective markers and expensive multi-camera systems limits adoption, since such systems are costly and time-consuming to use. One solution to this could be a deep neural network trained to extract 3D joint estimates from a 2D video captured with a smartphone. This master thesis project has investigated the accuracy of a trained convolutional neural network, MargiPose, that estimates 25 joint positions in 3D from a 2D video, against a gold-standard multi-camera Vicon system. The project has also investigated whether the data from the deep neural network can be connected to a biomechanical modelling software, AnyBody, for further analysis. The final intention of this project was to analyze how accurate such a combination could be in golf swing analysis. The accuracy of the deep neural network has been evaluated with three parameters: marker position, angular velocity and kinetic energy for different segments of the human body. MargiPose delivers results with high accuracy (Mean Per Joint Position Error (MPJPE) = 1.52 cm) for simpler movements, but for a more advanced motion such as a golf swing, MargiPose achieves lower accuracy in marker distance (MPJPE = 3.47 cm). The mean difference in angular velocity shows that MargiPose has difficulties following segments that are occluded or have greater motion, such as the wrists in a golf swing, where they both move fast and are occluded by other body segments. The conclusion of this research is that it is possible to connect data from a trained CNN with a biomechanical modelling software. The accuracy of the network is highly dependent on the intended use of the data. For the purpose of golf swing analysis, this could be a cost-effective solution that enables motion analysis for professionals as well as interested beginners. MargiPose shows high accuracy when evaluating simple movements. However, when using it with the intention of analyzing a golf swing in a biomechanical modelling software, the outcome might be beyond the bounds of reliable results.
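The MPJPE metric used in the evaluation above can be sketched as follows, assuming predicted and reference joints are already expressed in a common coordinate frame (no alignment step shown):

```python
import math

def mpjpe(predicted, ground_truth):
    """Mean Per Joint Position Error: the average Euclidean distance
    between corresponding predicted and reference 3D joint positions."""
    assert len(predicted) == len(ground_truth)
    dists = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(dists) / len(dists)
```

For example, two joints off by 1 cm and 2 cm respectively give an MPJPE of 1.5 cm, matching the per-movement figures quoted in the abstract.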
98

Learning to Predict Dense Correspondences for 6D Pose Estimation

Brachmann, Eric 17 January 2018 (has links)
Object pose estimation is an important problem in computer vision with applications in robotics, augmented reality and many other areas. An established strategy for object pose estimation consists of, firstly, finding correspondences between the image and the object’s reference frame, and, secondly, estimating the pose from outlier-free correspondences using Random Sample Consensus (RANSAC). The first step, namely finding correspondences, is difficult because object appearance varies depending on perspective, lighting and many other factors. Traditionally, correspondences have been established using handcrafted methods like sparse feature pipelines. In this thesis, we introduce a dense correspondence representation for objects, called object coordinates, which can be learned. By learning object coordinates, our pose estimation pipeline adapts to various aspects of the task at hand. It works well for diverse object types, from small objects to entire rooms, varying object attributes, like textured or texture-less objects, and different input modalities, like RGB-D or RGB images. The concept of object coordinates allows us to easily model and exploit uncertainty as part of the pipeline such that even repeating structures or areas with little texture can contribute to a good solution. Although we can train object coordinate predictors independent of the full pipeline and achieve good results, training the pipeline in an end-to-end fashion is desirable. It enables the object coordinate predictor to adapt its output to the specificities of following steps in the pose estimation pipeline. Unfortunately, the RANSAC component of the pipeline is non-differentiable which prohibits end-to-end training. Adopting techniques from reinforcement learning, we introduce Differentiable Sample Consensus (DSAC), a formulation of RANSAC which allows us to train the pose estimation pipeline in an end-to-end fashion by minimizing the expectation of the final pose error.
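The RANSAC step described above can be illustrated on a toy problem: estimating a 2D translation from correspondences contaminated with outliers. The sketch uses the classic hard hypothesis selection; DSAC's differentiable relaxation replaces the argmax with an expectation and is not shown here.

```python
import math
import random

def ransac_translation(corr, iters=100, inlier_thresh=0.1, rng=None):
    """Estimate a 2D translation (dx, dy) from correspondences
    [(src, dst), ...] containing outliers.

    A single correspondence is a minimal sample for a translation:
    propose hypotheses from random samples, score each by its inlier
    count, and keep the best-supported hypothesis.
    """
    rng = rng or random.Random(0)
    best, best_inliers = None, -1
    for _ in range(iters):
        (sx, sy), (dx_, dy_) = rng.choice(corr)
        hyp = (dx_ - sx, dy_ - sy)  # minimal-sample hypothesis
        inliers = sum(
            1 for (ax, ay), (bx, by) in corr
            if math.hypot(bx - (ax + hyp[0]), by - (ay + hyp[1])) < inlier_thresh)
        if inliers > best_inliers:
            best, best_inliers = hyp, inliers
    return best, best_inliers
```

Hypotheses proposed from outlier correspondences gather almost no support, so the consensus step recovers the true motion despite the contamination, which is the robustness property the pipeline relies on.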
99

Hypothesis Generation for Object Pose Estimation: From Local Sampling to Global Reasoning

Michel, Frank 14 February 2019 (has links)
Pose estimation has been studied since the early days of computer vision. The task of object pose estimation is to determine the transformation that maps an object from its inherent coordinate system into the camera-centric coordinate system. This transformation describes the translation of the object relative to the camera and the orientation of the object in three-dimensional space. Knowledge of an object's pose is a key ingredient in many application scenarios like robotic grasping, augmented reality, autonomous navigation and surveillance. A general estimation pipeline consists of the following four steps: extraction of distinctive points, creation of a hypothesis pool, hypothesis verification and, finally, hypothesis refinement. In this work, we focus on the hypothesis generation process. We show that it is beneficial to utilize geometric knowledge in this process. We address the problem of hypothesis generation for articulated objects. Instead of considering each object part individually, we model the object as a kinematic chain. This enables us to use the inner-part relationships when sampling pose hypotheses, so that we need only K correspondences for objects consisting of K parts. We show that applying geometric knowledge about part relationships improves estimation accuracy under severe self-occlusion and low-quality correspondence predictions. In an extension, we employ global reasoning within the hypothesis generation process instead of sampling 6D pose hypotheses locally. To this end, we formulate a Conditional Random Field (CRF) that operates on the image as a whole, inferring those pixels that are consistent with the 6D pose. Within the CRF we use a strong geometric check that is able to assess the quality of correspondence pairs. We show that our global geometric check improves the accuracy of pose estimation under heavy occlusion.
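The kinematic-chain idea can be illustrated with a minimal planar forward-kinematics sketch (a hypothetical 2D simplification, not the thesis's 6D formulation): because each part's pose is composed from its parent's, one relative angle per part determines the whole chain, which is why one correspondence per part suffices to anchor a hypothesis.

```python
import math

def chain_joint_positions(angles, lengths):
    """2D forward kinematics of a kinematic chain.

    angles: joint angles, each relative to the parent link;
    lengths: link lengths. Returns the joint positions from the
    base to the end effector, composing each part's pose from
    its parent's.
    """
    x, y, phi = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for a, l in zip(angles, lengths):
        phi += a  # accumulate the relative angle along the chain
        x += l * math.cos(phi)
        y += l * math.sin(phi)
        positions.append((x, y))
    return positions
```

For a two-link chain with unit lengths and relative angles (0, 90°), the end effector lands at (1, 1): the second part's pose is fully constrained by its parent plus one parameter.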
100

Pose Estimation using Implicit Functions and Uncertainty in 3D

Blomstedt, Frida January 2023 (has links)
Human pose estimation in 3D is a large field within computer vision, with many application areas. A common approach is to first estimate the pose in 2D, resulting in a confidence heatmap, and then estimate the 3D pose using the most likely estimates in 2D. This may, however, cause problems in cases where pose estimates are more uncertain and the estimate of one point is far from the true position, for example when a limb is occluded. This thesis adapts the method Neural Radiance Fields (NeRF) to 2D confidence heatmaps in order to create an implicit representation of the uncertainty in 3D, thus attempting to make use of as much information in 2D as possible. The adapted method was evaluated on the Human3.6M dataset, and results show that it outperforms a simple triangulation baseline, especially when the estimation in 2D is far from the true pose.
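The 2D confidence heatmaps mentioned above can be reduced to expected positions with a soft-argmax, which, unlike a hard argmax, retains some of the distribution's uncertainty; a minimal sketch (not the thesis's NeRF-based method):

```python
def soft_argmax_2d(heatmap):
    """Expected (x, y) position under a non-negative 2D confidence map.

    Unlike a hard argmax, a multimodal or flat map pulls the estimate
    toward the distribution's mean, reflecting the uncertainty instead
    of committing to a single (possibly wrong) peak.
    """
    total = sum(sum(row) for row in heatmap)
    ex = sum(x * v for row in heatmap for x, v in enumerate(row)) / total
    ey = sum(y * v for y, row in enumerate(heatmap) for v in row) / total
    return ex, ey
```

A map with two equal peaks yields their midpoint rather than an arbitrary one of the two, illustrating why keeping the full heatmap, as the thesis does, preserves information that a hard 2D decision would discard.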
