About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Pose Estimation and Calibration Algorithms for Vision and Inertial Sensors

Hol, Jeroen Diederik January 2008 (has links)
This thesis deals with estimating position and orientation in real-time, using measurements from vision and inertial sensors. A system has been developed to solve this problem in unprepared environments, assuming that a map or scene model is available. Compared to ‘camera-only’ systems, the combination of the complementary sensors yields an accurate and robust system which can handle periods with uninformative or no vision data and reduces the need for high-frequency vision updates.

The system achieves real-time pose estimation by fusing vision and inertial sensors using the framework of nonlinear state estimation, for which state-space models have been developed. The performance of the system has been evaluated using an augmented reality application where the output from the system is used to superimpose virtual graphics on the live video stream. Furthermore, experiments have been performed where an industrial robot providing ground-truth data is used to move the sensor unit. In both cases the system performed well.

Calibration of the relative position and orientation of the camera and the inertial sensor turns out to be essential for proper operation of the system. A new and easy-to-use algorithm for estimating these quantities has been developed using a gray-box system identification approach. Experimental results show that the algorithm works well in practice.
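The fusion described above lends itself to a standard predict/correct structure: high-rate IMU samples drive the prediction, low-rate vision fixes drive the correction. The following is a minimal sketch of such a loosely coupled loop, assuming a linear constant-acceleration model, a position-only vision measurement, and made-up noise levels; Hol's actual state-space models (which include orientation) are considerably richer.

```python
import numpy as np

# Minimal loosely-coupled EKF sketch: IMU acceleration drives the
# prediction step, a vision-based position fix drives the correction.
# State x = [position (3), velocity (3)]; orientation is omitted and
# all noise levels are illustrative assumptions.

def predict(x, P, acc, dt, q=1e-2):
    """Propagate state with a measured acceleration (process noise q)."""
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)
    x = F @ x + np.concatenate([0.5 * dt**2 * acc, dt * acc])
    Q = q * np.eye(6)
    return x, F @ P @ F.T + Q

def correct(x, P, z_pos, r=1e-3):
    """Fuse a 3D position measurement from the vision system."""
    H = np.hstack([np.eye(3), np.zeros((3, 3))])
    S = H @ P @ H.T + r * np.eye(3)
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z_pos - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P

x, P = np.zeros(6), np.eye(6)
for k in range(100):
    acc = np.array([0.1, 0.0, 0.0])   # IMU sample (high rate, 100 Hz)
    x, P = predict(x, P, acc, dt=0.01)
    if k % 10 == 0:                   # vision fix (low rate, 10 Hz)
        z = x[:3] + 0.01 * np.random.randn(3)
        x, P = correct(x, P, z)
```

The rate mismatch in the loop is the point of the design: the filter stays usable through the "periods with uninformative or no vision data" mentioned above, because the IMU keeps the prediction step running.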
42

Face Detection and Pose Estimation using Triplet Invariants / Ansiktsdetektering med hjälp av triplet-invarianter

Isaksson, Marcus January 2002 (has links)
Face detection and pose estimation are two widely studied problems - mainly because of their use as subcomponents in important applications, e.g. face recognition. In this thesis I investigate a new approach to the general problem of object detection and pose estimation and apply it to faces. Face detection can be considered a special case of this general problem, but is complicated by the fact that faces are non-rigid objects. The basis of the new approach is the use of scale and orientation invariant feature structures - feature triplets - extracted from the image, as well as a biologically inspired associative structure which maps from feature triplets to desired responses (position, pose, etc.). The feature triplets are constructed from curvature features in the image and coded in a way to represent distances between major facial features (eyes, nose and mouth). The final system has been evaluated on different sets of face images.
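As an illustration of the kind of invariant structure the abstract describes, the sketch below computes descriptors from all point triplets that are unchanged under translation, rotation, and uniform scaling. The interior-angle invariants and the feature points themselves are assumptions chosen for illustration; the thesis builds its triplets from curvature features and codes inter-feature distances, which this toy version does not reproduce.

```python
import numpy as np
from itertools import combinations

# Illustrative sketch: build scale- and rotation-invariant descriptors
# from triplets of detected 2D feature points.

def triplet_invariants(points):
    descriptors = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        # Side lengths of the triangle formed by the triplet.
        sides = np.array([np.linalg.norm(b - c),
                          np.linalg.norm(a - c),
                          np.linalg.norm(a - b)])
        if sides.min() < 1e-9:
            continue  # degenerate triplet
        s0, s1, s2 = sides
        # Interior angles via the law of cosines: invariant to
        # translation, rotation, and uniform scaling.
        angles = np.arccos(np.clip([
            (s1**2 + s2**2 - s0**2) / (2 * s1 * s2),
            (s0**2 + s2**2 - s1**2) / (2 * s0 * s2),
            (s0**2 + s1**2 - s2**2) / (2 * s0 * s1)], -1.0, 1.0))
        descriptors.append(np.sort(angles))  # order-independent coding
    return np.array(descriptors)

pts = np.random.rand(6, 2) * 100   # stand-in for detected features
print(triplet_invariants(pts).shape)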
43

A Comparative Study On Pose Estimation Algorithms Using Visual Data

Cetinkaya, Guven 01 February 2012 (has links) (PDF)
Computation of the position and orientation of an object with respect to a camera from its images is called the pose estimation problem. Pose estimation is one of the major problems in computer vision, robotics and photogrammetry. Object tracking, object recognition and self-localization of robots are typical examples of the use of pose estimation. Determining the pose of an object from its projections requires the 3D model of the object in its own reference system, the camera parameters and the 2D image of the object. Most pose estimation algorithms require the correspondences between the 3D model points of the object and the 2D image points. In this study, four well-known pose estimation algorithms requiring the 2D-3D correspondences to be known a priori, namely Orthogonal Iterations, POSIT, DLT and Efficient PnP, are compared. Moreover, two other well-known algorithms that solve the correspondence and pose problems simultaneously, Soft-POSIT and Blind-PnP, are also compared in the scope of this thesis study. In the first step of the simulations, synthetic data is formed using a realistic motion scenario and the algorithms are compared using this data. In the next step, real images captured by a calibrated camera for an object with a known 3D model are exploited. The simulation results indicate that the POSIT algorithm performs the best among the algorithms requiring point correspondences. Another result obtained from the experiments is that the Soft-POSIT algorithm can be considered to perform better than the Blind-PnP algorithm.
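For readers who want to reproduce part of such a comparison, OpenCV exposes two correspondence-based solvers in this family: EPnP and an iterative Levenberg-Marquardt refinement. POSIT, DLT, Soft-POSIT and Blind-PnP are not available under those names in modern OpenCV, so the sketch below covers only part of the study's lineup; the intrinsics and the ground-truth pose are made-up values.

```python
import numpy as np
import cv2

# 2D-3D correspondence-based pose estimation: given model points,
# their image projections, and the camera intrinsics, recover the pose.

K = np.array([[800., 0., 320.],
              [0., 800., 240.],
              [0., 0., 1.]])                       # assumed intrinsics
obj = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
                [0, 0, 1], [1, 0, 1]], dtype=np.float64)

# Synthesize image points from a known ground-truth pose.
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.2, -0.1, 5.0])
img, _ = cv2.projectPoints(obj, rvec_gt, tvec_gt, K, None)

for name, flag in [("EPnP", cv2.SOLVEPNP_EPNP),
                   ("Iterative", cv2.SOLVEPNP_ITERATIVE)]:
    ok, rvec, tvec = cv2.solvePnP(obj, img, K, None, flags=flag)
    print(name, rvec.ravel(), tvec.ravel())
```

Comparing the recovered `rvec`/`tvec` against the ground truth, optionally after perturbing the image points with noise, mirrors the synthetic-data step of the thesis in miniature.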
44

Statistical methods for 2D image segmentation and 3D pose estimation

Sandhu, Romeil Singh 26 October 2010 (has links)
The field of computer vision focuses on the goal of developing techniques to exploit and extract information from underlying data that may represent images or other multidimensional data. In particular, two well-studied problems in computer vision are the fundamental tasks of 2D image segmentation and 3D pose estimation from a 2D scene. In this thesis, we first introduce two novel methodologies that solve 2D image segmentation and 3D pose estimation independently. Then, by leveraging the advantages of certain techniques from each problem, we couple both tasks in a variational and non-rigid manner through a single energy functional. Thus, the three theoretical components and contributions of this thesis are as follows: Firstly, a new distribution metric for 2D image segmentation is introduced. This is employed within the geometric active contour (GAC) framework. Secondly, a novel particle filtering approach is proposed for the problem of estimating the pose of two point sets that differ by a rigid body transformation. Thirdly, the two techniques of image segmentation and pose estimation are coupled in a single energy functional for a class of 3D rigid objects. After laying the groundwork and presenting these contributions, we then turn to their applicability to real-world problems such as visual tracking. In particular, we present an example where we develop a novel tracking scheme for 3D laser RADAR imagery. However, we should mention that the proposed contributions are solutions for general imaging problems and can therefore also be applied to medical imaging problems such as extracting the prostate from MRI imagery.
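A minimal sketch of the particle-filtering idea for rigid point-set pose, reduced to 2D for brevity: particles over (theta, tx, ty) are weighted by nearest-neighbor alignment error and resampled. Every parameter here is an illustrative assumption; the thesis formulates the problem for 3D point sets within a more principled framework.

```python
import numpy as np

# Toy particle filter for the pose of two point sets that differ by a
# rigid 2D transformation. State: (theta, tx, ty).

rng = np.random.default_rng(0)
model = rng.random((30, 2))
theta_gt = 0.4
R_gt = np.array([[np.cos(theta_gt), -np.sin(theta_gt)],
                 [np.sin(theta_gt),  np.cos(theta_gt)]])
scene = model @ R_gt.T + np.array([0.5, -0.3])

def error(state):
    th, tx, ty = state
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    moved = model @ R.T + [tx, ty]
    # Sum of nearest-neighbor distances from moved model to scene.
    d = np.linalg.norm(moved[:, None] - scene[None], axis=2)
    return d.min(axis=1).sum()

N = 500
particles = rng.uniform([-np.pi, -1, -1], [np.pi, 1, 1], size=(N, 3))
for _ in range(20):
    w = np.exp(-5.0 * np.array([error(p) for p in particles]))
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)                          # resample
    particles = particles[idx] + rng.normal(0, 0.02, (N, 3))  # diffuse

best = particles[np.argmin([error(p) for p in particles])]
print("estimated (theta, tx, ty):", best)
```

Unlike gradient-based registration, the weighted-particle representation can hold several pose hypotheses at once, which is the property that motivates using a particle filter for this problem.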
46

Channel-Coded Feature Maps for Computer Vision and Machine Learning

Jonsson, Erik January 2008 (has links)
This thesis is about channel-coded feature maps applied in view-based object recognition, tracking, and machine learning. A channel-coded feature map is a soft histogram of joint spatial pixel positions and image feature values. Typical useful features include local orientation and color. Using these features, each channel measures the co-occurrence of a certain orientation and color at a certain position in an image or image patch. Channel-coded feature maps can be seen as a generalization of the SIFT descriptor, with the options of including more features and replacing the linear interpolation between bins by a more general basis function.

The general idea of channel coding originates from a model of how information might be represented in the human brain. For example, different neurons tend to be sensitive to different orientations of local structures in the visual input. The sensitivity profiles tend to be smooth, such that one neuron is maximally activated by a certain orientation, with a gradually decaying activity as the input is rotated.

This thesis extends previous work on using channel-coding ideas within computer vision and machine learning. By differentiating the channel-coded feature maps with respect to transformations of the underlying image, a method for image registration and tracking is constructed. By using piecewise polynomial basis functions, the channel coding can be computed more efficiently, and a general encoding method for N-dimensional feature spaces is presented.

Furthermore, I argue for using channel-coded feature maps in view-based pose estimation, where a continuous pose parameter is estimated from a query image given a number of training views with known pose. The optimization of position, rotation and scale of the object in the image plane is then included in the optimization problem, leading to a simultaneous tracking and pose estimation algorithm. Apart from objects and poses, the thesis examines the use of channel coding in connection with Bayesian networks. The goal here is to avoid the hard discretizations usually required when Markov random fields are used on intrinsically continuous signals like depth for stereo vision or color values in image restoration.

Channel coding has previously been used to design machine learning algorithms that are robust to outliers, ambiguities, and discontinuities in the training data. This is obtained by finding a linear mapping between channel-coded input and output values. This thesis extends this method with an incremental version and identifies and analyzes a key feature of the method: that it is able to handle a learning situation where the correspondence structure between the input and output space is not completely known. In contrast to a traditional supervised learning setting, the training examples are groups of unordered input-output points, where the correspondence structure within each group is unknown. This behavior is studied theoretically, and the effect of outliers and convergence properties are analyzed.

All presented methods have been evaluated experimentally. The work has been conducted within the cognitive systems research project COSPAL funded by EC FP6, and much of this content has been put to use in the final COSPAL demonstrator system.
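A toy sketch of a channel-coded feature map follows, assuming smooth cos^2 basis functions and made-up channel counts (the thesis also treats piecewise polynomial bases); orientation wrap-around is ignored here for simplicity.

```python
import numpy as np

# Channel-coded feature map: a soft histogram over joint
# (x, y, orientation) using smooth basis functions instead of hard bins.

def cos2_basis(d, width=1.5):
    """Smooth channel profile: cos^2 kernel, zero beyond `width`."""
    d = np.abs(d)
    return np.where(d < width, np.cos(np.pi * d / (2 * width))**2, 0.0)

def channel_encode(values, centers):
    """Soft-assign each value to all channels (rows: values)."""
    return cos2_basis(values[:, None] - centers[None, :])

# Toy "image patch": random pixel positions and local orientations
# standing in for real measured features.
rng = np.random.default_rng(1)
xs, ys = rng.uniform(0, 1, 200), rng.uniform(0, 1, 200)
ori = rng.uniform(0, np.pi, 200)

cx = np.linspace(0, 1, 4)          # spatial channels in x
cy = np.linspace(0, 1, 4)          # spatial channels in y
co = np.linspace(0, np.pi, 6)      # orientation channels

# Outer product of channel activations, summed over pixels:
# a 4 x 4 x 6 channel-coded feature map.
Ex = channel_encode(xs, cx)
Ey = channel_encode(ys, cy)
Eo = channel_encode(ori, co)
ccfm = np.einsum('ni,nj,nk->ijk', Ex, Ey, Eo)
print(ccfm.shape)   # (4, 4, 6)
```

Replacing `cos2_basis` with linear interpolation between neighboring centers recovers a SIFT-like descriptor, which is the generalization relationship the abstract points out.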
48

Steps towards the object semantic hierarchy

Xu, Changhai, 1977- 17 November 2011 (has links)
An intelligent robot must be able to perceive and reason robustly about its world in terms of objects, among other foundational concepts. The robot can draw on rich data for object perception from continuous sensory input, in contrast to the usual formulation that focuses on objects in isolated still images. Additionally, the robot needs multiple object representations to deal with different tasks and/or different classes of objects. We propose the Object Semantic Hierarchy (OSH), which consists of multiple representations with different ontologies. The OSH factors the problems of object perception so that intermediate states of knowledge about an object have natural representations, with relatively easy transitions from less structured to more structured representations. Each layer in the hierarchy builds an explanation of the sensory input stream, in terms of a stochastic model consisting of a deterministic model and an unexplained "noise" term. Each layer is constructed by identifying new invariants from the previous layer. In the final model, the scene is explained in terms of constant background and object models, and low-dimensional dynamic poses of the observer and objects.

The OSH contains two types of layers: the Object Layers and the Model Layers. The Object Layers describe how the static background and each foreground object are individuated, and the Model Layers describe how the model for the static background or each foreground object evolves from less structured to more structured representations. Each object or background model contains the following layers: (1) 2D object in 2D space (2D2D): a set of constant 2D object views, and the time-variant 2D object poses; (2) 2D object in 3D space (2D3D): a collection of constant 2D components, with their individual time-variant 3D poses; and (3) 3D object in 3D space (3D3D): the same collection of constant 2D components but with invariant relations among their 3D poses, and the time-variant 3D pose of the object as a whole.

In building 2D2D object models, a fundamental problem is to segment out foreground objects in the pixel-level sensory input from the background environment, where motion information is an important cue to perform the segmentation. Traditional approaches for moving object segmentation usually appeal to motion analysis on pure image information without exploiting the robot's motor signals. We observe, however, that the background motion (from the robot's egocentric view) has stronger correlation to the robot's motor signals than the motion of foreground objects. Based on this observation, we propose a novel approach to segmenting moving objects by learning homography and fundamental matrices from motor signals.

In building 2D3D and 3D3D object models, estimating camera motion parameters plays a key role. We propose a novel method for camera motion estimation that takes advantage of both planar features and point features and fuses constraints from both homography and essential matrices in a single probabilistic framework. Using planar features greatly improves estimation accuracy over using point features only, and with the help of point features, the solution ambiguity from a planar feature is resolved. Compared to the two classic approaches that apply the constraint of either the homography or the essential matrix, the proposed method gives more accurate estimation results and avoids the drawbacks of the two approaches.
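The two classic constraints contrasted in the last paragraph can be sketched with standard OpenCV calls. The point matches and intrinsics below are placeholders; the thesis' actual contribution, fusing both constraints in one probabilistic framework, is not reproduced here.

```python
import numpy as np
import cv2

# The two classic two-view constraints: a homography for planar
# features and an essential matrix for general point features.

K = np.array([[700., 0., 320.],
              [0., 700., 240.],
              [0., 0., 1.]])                       # assumed intrinsics
pts1 = (np.random.rand(50, 2) * 640).astype(np.float32)
# Placeholder "matches": a shifted, slightly noisy copy of pts1.
pts2 = pts1 + np.float32([5.0, 0.0]) + np.random.rand(50, 2).astype(np.float32)

# Planar constraint: robustly fit a homography with RANSAC.
H, h_mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, 3.0)

# Point constraint: essential matrix, then decompose into R, t.
E, e_mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
print(R, t.ravel())
```

Each constraint fails where the other is strong: the homography is exact only for coplanar points, while the essential matrix becomes ambiguous for (near-)planar scenes, which is why the thesis argues for fusing both.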
49

MONOCULAR POSE ESTIMATION AND SHAPE RECONSTRUCTION OF QUASI-ARTICULATED OBJECTS WITH CONSUMER DEPTH CAMERA

Ye, Mao 01 January 2014 (has links)
Quasi-articulated objects, such as human beings, are among the most commonly seen objects in our daily lives. Extensive research has been dedicated to 3D shape reconstruction and motion analysis for this type of object for decades, a major motivation being their wide range of applications, such as in entertainment, surveillance and health care. Most existing studies relied on one or more regular video cameras. In recent years, commodity depth sensors have become more and more widely available, and the geometric measurements they deliver provide significantly valuable information for these tasks. In this dissertation, we propose three algorithms for monocular pose estimation and shape reconstruction of quasi-articulated objects using a single commodity depth sensor. These three algorithms achieve shape reconstruction with increasing levels of granularity and personalization. We then further develop a method for highly detailed shape reconstruction based on our pose estimation techniques.

Our first algorithm takes advantage of a motion database acquired with an active marker-based motion capture system. This method combines pose detection through nearest-neighbor search with pose refinement via non-rigid point cloud registration. It is capable of accommodating different body sizes and achieves more than twice the accuracy of a previous state of the art on a publicly available dataset. This algorithm performs frame-by-frame estimation and is therefore less prone to tracking failure. Nonetheless, it does not guarantee temporal consistency of either the skeletal structure or the shape, which can be problematic for some applications. To address this problem, we develop a real-time model-based approach for quasi-articulated pose and 3D shape estimation based on the Iterative Closest Point (ICP) principle with several novel constraints that are critical for the monocular scenario. In this algorithm, we further propose a novel method for automatic body size estimation that enables it to accommodate different subjects. Due to its local-search nature, the ICP-based method can become trapped in local minima in the case of complex and fast motions. To address this issue, we explore the potential of using a statistical model for soft point-correspondence association. Towards this end, we propose a unified framework based on a Gaussian Mixture Model for joint pose and shape estimation of quasi-articulated objects. This method achieves state-of-the-art performance on various publicly available datasets.

Based on our pose estimation techniques, we then develop a novel framework that achieves highly detailed shape reconstruction by only requiring the user to move naturally in front of a single depth sensor. Our experiments demonstrate reconstructed shapes with rich geometric details for various subjects with different apparel.

Last but not least, we explore the applicability of our methods in two real-world applications. First, we combine our ICP-based method with cloth simulation techniques for virtual try-on; our system delivers the first promising 3D-based virtual clothing system. Second, we explore the possibility of extending our pose estimation algorithms to assist physical therapists in identifying patients' movement dysfunctions that are related to injuries. Our preliminary experiments have demonstrated promising results in comparison with a gold-standard active marker-based commercial system.

Throughout the dissertation, we develop various state-of-the-art algorithms for pose estimation and shape reconstruction of quasi-articulated objects by leveraging the geometric information from depth sensors. We also demonstrate their great potential for different real-world applications.
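As a reference point for the model-based tracking step described above, here is a minimal rigid-ICP sketch: alternate nearest-neighbor correspondence with a closed-form (SVD/Kabsch) rigid update. The articulation handling, monocular constraints and body-size estimation from the dissertation are omitted, and the correspondence search is brute force.

```python
import numpy as np

# Minimal rigid ICP between two 3D point clouds.

def icp(src, dst, iters=30):
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = src @ R.T + t
        # Nearest neighbor in dst for each moved source point.
        nn = dst[np.argmin(np.linalg.norm(
            moved[:, None] - dst[None], axis=2), axis=1)]
        # Closed-form rigid alignment of the current correspondences
        # (Kabsch algorithm), with a reflection correction.
        ms, mn = moved.mean(0), nn.mean(0)
        U, _, Vt = np.linalg.svd((moved - ms).T @ (nn - mn))
        D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])
        dR = Vt.T @ D @ U.T
        R, t = dR @ R, dR @ (t - ms) + mn   # compose incremental update
    return R, t

src = np.random.rand(100, 3)
ang = 0.3
Rg = np.array([[np.cos(ang), -np.sin(ang), 0],
               [np.sin(ang),  np.cos(ang), 0],
               [0, 0, 1]])
dst = src @ Rg.T + [0.1, 0.0, 0.2]
R, t = icp(src, dst)
print(R, t)
```

The local-search failure mode mentioned in the abstract is visible here: the nearest-neighbor step only ever refines the current pose, so a bad initialization converges to a wrong local minimum, which is what motivates the soft, GMM-based correspondences.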
50

IMU-baserad skattning av verktygets position och orientering hos industrirobot / IMU-based Robot Tool Pose Estimation

Norén, Johan January 2014 (has links)
Industrial robots are a well-established part of modern automation and production. Their uses are many and include repetitive tasks and tasks that can be hazardous to humans, such as painting, spot welding and material handling.

One problem in robotics is to accurately estimate the position and orientation of the robot's tool (end effector). This thesis aims to present estimation methods for this task based on measurements from an Inertial Measurement Unit (IMU) mounted on the end effector of the robot. An IMU is a combination unit typically containing accelerometers and gyroscopes; it measures acceleration and rotational speed based on the inertia of bodies.

The thesis presents three methods for position and orientation estimation: one based exclusively on IMU data, dead reckoning, and two filters that combine IMU data with the robot kinematics and measured motor angles, an extended Kalman filter (EKF) and a complementary filter (CF). Results for the estimation methods are shown for experimental data from a high-performance IMU together with an industrial robot with six degrees of freedom.
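A sketch of the pure dead-reckoning baseline follows, assuming an ideal bias-free IMU and a first-order orientation update; the EKF and complementary-filter corrections from the motor angles are what keep the real system from drifting, since pure integration like this diverges within seconds.

```python
import numpy as np

# Strapdown dead reckoning: integrate gyroscope rates into orientation,
# rotate accelerometer samples to the world frame, remove gravity, and
# double-integrate. Bias handling and filter corrections are omitted.

def skew(w):
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def dead_reckon(gyro, acc, dt, g=np.array([0, 0, -9.81])):
    R = np.eye(3)                  # body-to-world rotation
    p, v = np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, acc):
        # First-order orientation update, R <- R (I + [w]_x dt);
        # a real implementation would re-orthonormalize R.
        R = R @ (np.eye(3) + skew(w) * dt)
        a_world = R @ a + g        # specific force -> world acceleration
        p = p + v * dt + 0.5 * a_world * dt**2
        v = v + a_world * dt
    return R, p

# 1 s of synthetic samples at 100 Hz: slow yaw, stationary sensor
# (the accelerometer reads only the reaction to gravity).
gyro = np.tile([0.0, 0.0, 0.1], (100, 1))
acc = np.tile([0.0, 0.0, 9.81], (100, 1))
print(dead_reckon(gyro, acc, dt=0.01)[1])
```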
