Global ETD Search

11	Generating Synthetic Data for Evaluation and Improvement of Deep 6D Pose Estimation Löfgren, Tobias, Jonsson, Daniel January 2020 The task of 6D pose estimation with deep learning is to train networks to determine, from an image of an object, the rotation and translation of that object. Impressive results have recently been shown in deep learning based 6D pose estimation. However, many current solutions rely on real-world data for training, which, as opposed to synthetic data, requires time-consuming annotation. In this thesis, we introduce a pipeline for generating synthetic ground truth data for deep 6D pose estimation, where annotation is done automatically. With a 3D CAD model, we use Blender to render 2D images of the model from different viewpoints. We also create all other relevant data needed for pose estimation, e.g., the poses of an object, mask images and 3D keypoints on the object. Using this pipeline, it is possible to adjust different settings to reduce the domain gap between synthetic data and real-world data and get better pose estimation results. Such settings could be changing the method of extracting 3D keypoints and varying the scale of the object or the light settings in the scene. The network used to test the performance of training on our synthetic data is PVNet, which achieves state-of-the-art results for 6D pose estimation. This architecture learns to find 2D keypoints of the object in the image, as well as 2D–3D keypoint correspondences. With these correspondences, the Perspective-n-Point (PnP) algorithm is used to extract a pose. We evaluate the pose estimation of the different settings on the synthetic data and compare these results to other state-of-the-art work. We find that using only real-world data for training is worse than using a combination of synthetic and real-world data. Several other findings are that varying scale and lighting, in addition to adding random background images to the rendered images, improves results. Four different novel keypoint selection methods are introduced in this work and tested against methods used in previous work. We observe that our methods achieve similar or better results. Finally, we use the best possible settings from the synthetic data pipeline, but with memory limitations on the amount of training data. We are close to state-of-the-art results, and could get closer with more data. Pose Estimation 6D Pose Estimation Synthetic Data Deep learning Signal Processing Signalbehandling
12	Pose Estimation for Gesture Recovery in Occluded Television Videos Pham, Kyle 26 August 2022 No description available. Computer Science
13	3D POSE ESTIMATION IN THE CONTEXT OF GRIP POSITION FOR PHRI Norman, Jacob January 2021 For human-robot interaction with the intent to grip a human arm, it is necessary that the ideal gripping location can be identified. In this work, the gripping location is situated on the arm, and thus it can be extracted using the positions of the wrist and elbow joints. To achieve this, human pose estimation is proposed, as there exist robust methods that work both in and outside of lab environments. One such example is OpenPose, which, thanks to the COCO and MPII datasets, has recorded impressive results in a variety of different scenarios in real time. However, most of the images in these datasets are taken from a camera mounted at chest height and show people who are, in the majority of the images, oriented upright. This presents the potential problem that prone humans, which are the primary focus of this project, cannot be detected, especially if seen from an angle that makes the human appear upside down in the camera frame. To remedy this, two different approaches were tested, both aimed at creating a rotation-invariant 2D pose estimation method. The first approach rotates the COCO training data in an attempt to create a model that can find humans regardless of their orientation in the image. The second approach adds a RotationNet as a preprocessing step to correctly orient the images so that OpenPose can be used to estimate the 2D pose before the resulting skeletons are rotated back. 3D pose estimation human pose estimation pose estimation rotation-invariant Computer Sciences Datavetenskap (datalogi)
14	Concept Design and Testing of a GPS-less System for Autonomous Shovel-Truck Spotting OWENS, BRETT 29 January 2013 Haul truck drivers frequently have difficulties spotting beside shovels. This is typically due to a combination of reduced visibility and poor mining conditions. Based on first-hand data collected from the Goldstrike Open Pit, it was learned that, on average, 9% of all spotting actions required corrective movements to facilitate loading. This thesis investigates an automated solution to haul truck spotting that does not rely on the use of the satellite global positioning system (GPS), since GPS can perform unreliably. This thesis proposes that if spotting were automated, a significant decrease in cycle times could result. Using conventional algorithms and techniques from the field of mobile robotics, vehicle pose estimation and control algorithms were designed to enable autonomous shovel-truck spotting. The developed algorithms were verified by using both simulation and field testing with real hardware. Tests were performed in analog conditions on an automation-ready Kubota RTV 900 utility truck. When initiated from a representative pose, the RTV successfully spotted to the desired location (within 1 m) in 95% of the conducted trials. The results demonstrate that the proposed approach is a strong candidate for an auto-spot system. / Thesis (Master, Mining Engineering) -- Queen's University, 2013-01-28 09:49:20.584 mobile robotics GPS pose estimation autonomous vehicles spotting times
15	A Ladar-Based Pose Estimation Algorithm for Determining Relative Motion of a Spacecraft for Autonomous Rendezvous and Dock Fenton, Ronald Christopher 01 May 2008 Future autonomous space missions will require autonomous rendezvous and docking operations. The servicing spacecraft must be able to determine the relative 6 degree-of-freedom (6 DOF) motion between the vehicle and the target spacecraft. One method to determine the relative 6 DOF position and attitude is with 3D ladar imaging. Ladar sensor systems can capture close-proximity range images of the target spacecraft, producing 3D point cloud data sets. These sequentially collected point cloud data sets were then registered with one another using a point correspondence-less variant of the Iterative Closest Points (ICP) algorithm to determine the relative 6 DOF displacements. Simulation experiments were performed and indicated that the mean-squared error (MSE), angular error, mean, and standard deviations for position and orientation estimates did not vary as a function of position and attitude and met most minimum angular and translational error requirements for rendezvous and dock. Furthermore, the computational times required by this algorithm were comparable to those of previously reported point-to-point and point-to-plane ICP variants for single iterations when the initialization was already performed. ladar ICP pose estimation rendezvous and dock Electrical and Electronics
16	Robust Real-Time Estimation of Region Displacements in Video Sequences Skoglund, Johan January 2007 The possibility of using real-time computer vision in video sequences gives a system many opportunities to interact with its environment. Possible forms of interaction are, e.g., augmented reality, as in the MATRIS project, where the purpose is to add new objects into the video sequence, or surveillance, where the purpose is to find abnormal events. The increase in computer speed in recent years has simplified this process, and it is now possible to use at least some of the more advanced computer vision algorithms that are available. The computational speed of computers is, however, still a problem: an efficient real-time system requires efficient code and methods. This thesis deals with both problems; one part is about efficient implementations using single instruction multiple data (SIMD) instructions and one part is about robust tracking. An efficient real-time system requires efficient implementations of the computer vision methods used. Efficient implementations require knowledge about the CPU and the possibilities it offers. In this thesis, one method called SIMD is explained. SIMD is useful when the same operation is applied to multiple data, which is usually the case in computer vision, where the same operation is executed on each pixel. Following the position of a feature or object in a video sequence is called tracking. Tracking can be used for a number of applications. The application in this thesis is to use tracking for pose estimation. One way to do tracking is to cut out a small region around the feature, creating a patch, and find the position of this patch in the other frames. To find the position, a measure of the difference between the patch and the image in a given position is used. This thesis thoroughly investigates the sum of absolute difference (SAD) error measure. The investigation involves different ways to improve the robustness and to decrease the average error. One method to estimate the average error, the covariance of the position error, is proposed. An estimate of the average error is needed when different measurements are combined. Finally, a system for camera pose estimation is presented. The computer vision part of this system is based on the results in this thesis. This presentation also contains a discussion of the results of this system. / Report code: LIU-TEK-LIC-2007:5. The report code in the thesis is incorrect. tracking subpixel real time covariance pose estimation Image analysis Bildanalys
17	Towards Man-Machine Interfaces: Combining Top-down Constraints with Bottom-up Learning in Facial Analysis Kumar, Vinay P. 01 September 2002 This thesis proposes a methodology for the design of man-machine interfaces by combining top-down and bottom-up processes in vision. From a computational perspective, we propose that the scientific-cognitive question of combining top-down and bottom-up knowledge is similar to the engineering question of labeling a training set in a supervised learning problem. We investigate these questions in the realm of facial analysis. We propose the use of a linear morphable model (LMM) for representing top-down structure and use it to model various facial variations such as mouth shapes and expression, the pose of faces and visual speech (visemes). We apply a supervised learning method based on support vector machine (SVM) regression for estimating the parameters of LMMs directly from pixel-based representations of faces. We combine these methods for designing new, more self-contained systems for recognizing facial expressions, estimating facial pose and for recognizing visemes. AI Facial Expression Recognition Pose Estimation Viseme Recognition SVM
18	Local Features for Range and Vision-Based Robotic Automation Viksten, Fredrik January 2010 Robotic automation has been a part of state-of-the-art manufacturing for many decades. Robotic manipulators are used for tasks such as welding, painting, and pick and place. Robotic manipulators are quite flexible and adaptable to new tasks, but a typical robot-based production cell requires extensive specification of the robot motion and construction of tools and fixtures for material handling. This incurs a large cost in both time and money. The task of a vision system in this setting is to simplify the control and guidance of the robot and to reduce the need for supporting material handling machinery. This dissertation examines the performance and properties of the current state-of-the-art local features within the setting of object pose estimation. This is done through an extensive set of experiments replicating various potential problems to which a vision system in a robotic cell could be subjected. The dissertation presents new local features which are shown to increase the performance of object pose estimation. A new local descriptor details how to use log-polar sampled image patches for truly rotation-invariant matching. This representation is also extended to use a scale-space interest point detector, which in turn makes it very competitive in our experiments. A number of variations of already available descriptors are constructed, resulting in new and competitive features, among them a scale-space based Patch-duplet. In this dissertation a successful vision-based object pose estimation system is extended for multi-cue integration, yielding increased robustness and accuracy. Robustness is increased through algorithmic multi-cue integration, combining the individual strengths of multiple local features. Increased accuracy is achieved by utilizing manipulator movement and applying temporal multi-cue integration. This is implemented using a real flexible robotic manipulator arm. Besides the work done on local features for ordinary image data, a number of local features for range data have also been developed. This dissertation describes the theory behind the scene tensor and its application to the problem of object pose estimation. The scene tensor is a fourth order tensor representation using projective geometry. It is shown how to use the scene tensor as a detector as well as how to apply it to the task of object pose estimation. The object pose estimation system is extended to work with 3D data. A novel way of handling sampling of range data when constructing a detector is discussed. A volume rasterization method is presented and the classic Harris detector is adapted to it. Finally, a novel region detector, called Maximally Robust Range Regions, is presented. All developed detectors are compared in a detector repeatability test. Local features object pose estimation range data Image analysis Bildanalys
19	Pose estimation of a VTOL UAV using IMU, Camera and GPS / Position- och orienteringsskattning av en VTOL UAV med IMU, Kamera och GPS Bodesund, Fredrik January 2010 When an autonomous vehicle has a mission to perform, it is of high importance that the robot has good knowledge of its position. Without good knowledge of the position, it will not be able to navigate properly, and the data that it gathers, which could be of importance for the mission, might not be usable. A helicopter could, for example, be used to collect laser data from the terrain beneath it, which could be used to produce a 3D map of the terrain. If the knowledge of the position and orientation of the helicopter is poor, then the collected laser data will be useless, since it is unknown what the laser actually measures. A successful solution to position and orientation (pose) estimation of an autonomous helicopter, using an inertial measurement unit (IMU), a camera and a GPS, is proposed in this thesis. The problem is to estimate the unknown pose using sensors that measure different physical attributes and give readings containing noise. An extended Kalman filter (EKF) solution to the simultaneous localisation and mapping (SLAM) problem is used to fuse data from the different sensors and estimate the pose of the robot. The scale invariant feature transform (SIFT) is used for feature extraction and the unified inverse depth parametrisation (UIDP) model is used to parametrise the landmarks. The orientation of the robot is described by quaternions. To be able to evaluate the performance of the filter, an ABB industrial robot has been used as reference. The pose of the end tool of the robot is known with high accuracy and gives a good ground truth against which the estimates can be evaluated. The results show that the algorithm performs well and that the pose is estimated with good accuracy. Pose Estimation SLAM Unified Inverse Depth Parametrisation Signal processing Signalbehandling
20	Indoor 3D Mapping using Kinect / Kartering av inomhusmiljöer med Kinect Bengtsson, Morgan January 2014 In recent years several depth cameras have emerged on the consumer market, creating many interesting possibilities for both professional and recreational usage. One example of such a camera is the Microsoft Kinect sensor, originally used with the Microsoft Xbox 360 game console. In this master thesis a system is presented that utilizes this device to create a 3D reconstruction of an indoor environment that is as accurate as possible. The major novelty of the presented system is the data structure, based on signed distance fields (SDF) and voxel octrees, used to represent the observed environment. Kinect mapping sparse voxel octree signed distance function pose estimation
