221

Learning to Predict Dense Correspondences for 6D Pose Estimation

Brachmann, Eric 17 January 2018
Object pose estimation is an important problem in computer vision with applications in robotics, augmented reality and many other areas. An established strategy for object pose estimation consists of, firstly, finding correspondences between the image and the object’s reference frame, and, secondly, estimating the pose from outlier-free correspondences using Random Sample Consensus (RANSAC). The first step, namely finding correspondences, is difficult because object appearance varies depending on perspective, lighting and many other factors. Traditionally, correspondences have been established using handcrafted methods like sparse feature pipelines. In this thesis, we introduce a dense correspondence representation for objects, called object coordinates, which can be learned. By learning object coordinates, our pose estimation pipeline adapts to various aspects of the task at hand. It works well for diverse object types, from small objects to entire rooms, varying object attributes, like textured or texture-less objects, and different input modalities, like RGB-D or RGB images. The concept of object coordinates allows us to easily model and exploit uncertainty as part of the pipeline such that even repeating structures or areas with little texture can contribute to a good solution. Although we can train object coordinate predictors independent of the full pipeline and achieve good results, training the pipeline in an end-to-end fashion is desirable. It enables the object coordinate predictor to adapt its output to the specificities of following steps in the pose estimation pipeline. Unfortunately, the RANSAC component of the pipeline is non-differentiable which prohibits end-to-end training. Adopting techniques from reinforcement learning, we introduce Differentiable Sample Consensus (DSAC), a formulation of RANSAC which allows us to train the pose estimation pipeline in an end-to-end fashion by minimizing the expectation of the final pose error.
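The idea of minimizing the expectation of the final pose error can be illustrated with a minimal sketch. It assumes that a hypothesis is selected with softmax probabilities derived from its score, which is one way to make the selection step probabilistic. The function names, scores, and losses below are hypothetical and are not taken from the thesis.

```python
import math

def softmax(scores):
    """Turn hypothesis scores into a selection probability distribution."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expected_pose_loss(scores, losses):
    """Expected pose error when a hypothesis is drawn with softmax probabilities."""
    probs = softmax(scores)
    return sum(p * l for p, l in zip(probs, losses))

def hard_selection_loss(scores, losses):
    """Pose error of argmax selection, which gives no gradient with respect to the scores."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return losses[best]

# Hypothetical values: three pose hypotheses with their scores and pose errors.
scores = [2.0, 1.0, 0.1]
losses = [0.5, 3.0, 10.0]
print(expected_pose_loss(scores, losses))
print(hard_selection_loss(scores, losses))
```

Because the expectation is a smooth function of the scores, it can in principle be minimized with gradient-based training, whereas the argmax variant cannot.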
222

Hypothesis Generation for Object Pose Estimation From local sampling to global reasoning

Michel, Frank 14 February 2019
Pose estimation has been studied since the early days of computer vision. The task of object pose estimation is to determine the transformation that maps an object from its inherent coordinate system into the camera-centric coordinate system. This transformation describes the translation of the object relative to the camera and the orientation of the object in three-dimensional space. Knowledge of an object's pose is a key ingredient in many application scenarios such as robotic grasping, augmented reality, autonomous navigation and surveillance. A general estimation pipeline consists of the following four steps: extraction of distinctive points, creation of a hypothesis pool, hypothesis verification and, finally, hypothesis refinement. In this work, we focus on the hypothesis generation process. We show that it is beneficial to utilize geometric knowledge in this process. We address the problem of hypothesis generation for articulated objects. Instead of considering each object part individually, we model the object as a kinematic chain. This enables us to use the relationships between parts when sampling pose hypotheses. As a result, we only need K correspondences for objects consisting of K parts. We show that applying geometric knowledge about part relationships improves estimation accuracy under severe self-occlusion and low-quality correspondence predictions. In an extension, we employ global reasoning within the hypothesis generation process instead of sampling 6D pose hypotheses locally. To this end, we formulate a conditional random field (CRF) that operates on the image as a whole and infers those pixels that are consistent with the 6D pose. Within the CRF we use a strong geometric check that is able to assess the quality of correspondence pairs. We show that our global geometric check improves the accuracy of pose estimation under heavy occlusion.
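The transformation described in this abstract, a rotation plus a translation that maps object coordinates into camera coordinates, can be sketched as follows. This is only an illustration of a rigid transform; the helper names and the example values are assumptions and do not come from the thesis.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix for a rotation about the z axis (one example of an orientation)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def object_to_camera(point, rotation, translation):
    """Map a 3D point from the object frame to the camera frame: x_cam = R * x_obj + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]

# Hypothetical pose: object rotated by 90 degrees about z and placed 2 units in front of the camera.
pose_rotation = rotation_z(math.pi / 2)
pose_translation = [0.0, 0.0, 2.0]
print(object_to_camera([1.0, 0.0, 0.0], pose_rotation, pose_translation))
```

A 6D pose is exactly this pair: three degrees of freedom for the rotation and three for the translation.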
223

Pose Estimation using Implicit Functions and Uncertainty in 3D

Blomstedt, Frida January 2023
Human pose estimation in 3D is a large area within computer vision, with many application areas. A common approach is to first estimate the pose in 2D, resulting in a confidence heatmap, and then estimate the 3D pose using the most likely estimations in 2D. This may, however, cause problems in cases where pose estimates are more uncertain and the estimation of one point is far from the true position, for example when a limb is occluded. This thesis adapts the method Neural Radiance Fields (NeRF) to 2D confidence heatmaps in order to create an implicit representation of the uncertainty in 3D, thus attempting to make use of as much information in 2D as possible. The adapted method was evaluated on the Human3.6M dataset, and results show that this method outperforms a simple triangulation baseline, especially when the estimation in 2D is far from the true pose.
224

3D-Reconstruction of the Common Murre / 3D-Rekonstruering av Sillgrissla

Hägerlind, Johannes January 2023
Automatic 3D reconstruction of birds can aid researchers in studying their behavior. Recently there has been an attempt to reconstruct a variety of birds from single-view images. However, the common murre's appearance differs from that of the birds studied so far. Moreover, recent studies have focused on side views. This thesis studies the 3D reconstruction of the common murre from single-view, top-view images. A template mesh is first optimized to fit a 3D scan. The result is then used to optimize a species-specific mean from side-view images annotated with keypoints and silhouettes. The resulting mean mesh is used to initialize the optimization for top-down images. Using a mask loss, a pose prior loss, and a bone length loss that uses a mean vector from the side-view images improves the 3D reconstruction as rated by humans. Furthermore, the intersection over union (IoU) and the percentage of correct keypoints (PCK), although used by other authors, are insufficient in a single-view, top-view setting.
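For readers unfamiliar with the two metrics named in this abstract, the following minimal sketch shows how IoU and PCK are commonly computed. The absolute distance threshold for PCK and the convention for two empty masks are assumptions made for the example; the thesis may use a different normalization.

```python
import math

def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0

def pck(predicted, ground_truth, threshold):
    """Fraction of predicted 2D keypoints within `threshold` of their ground-truth position."""
    correct = sum(
        1 for p, g in zip(predicted, ground_truth) if math.dist(p, g) <= threshold
    )
    return correct / len(ground_truth)

# Hypothetical data: a 2x2 mask pair and two keypoints.
print(mask_iou([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
print(pck([(0, 0), (10, 10)], [(0, 1), (0, 0)], threshold=2.0))
```

Both metrics are evaluated in the image plane, which is one plausible reason they can miss errors along the viewing direction when only a single view is available.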
225

Polarimetric Imagery for Object Pose Estimation

Siefring, Matthew D. 15 May 2023
No description available.
226

Scanning Laser Registration and Structural Energy Density Based Active Structural Acoustic Control

Manwill, Daniel Alan 17 December 2010
To simplify the measurement of energy-based structural metrics, a general registration process for the scanning laser Doppler vibrometer (SLDV) has been developed. Existing registration techniques, also known as pose estimation or position registration, suffer from mathematical complexity, instrument specificity, and the need for correct optimization initialization. These difficulties have been addressed through development of a general linear laser model and hybrid registration algorithm. These are applicable to any SLDV and allow the registration problem to be solved using straightforward mathematics. Additionally, the hybrid registration algorithm eliminates the need for correct optimization initialization by separating the optimization process from solution selection. The effectiveness of this approach is demonstrated through simulated application and by validation measurements performed on a specially prepared pipe. To increase understanding of the relationships between structural energy metrics and the acoustic response, the use of structural energy density (SED) in active structural acoustic control (ASAC) has also been studied. A genetic algorithm and other simulations were used to determine achievable reduction in acoustic radiation, characterize control system design, and compare SED-based control with the simpler velocity-based control. Using optimized sensor and actuator placements at optimally excited modal frequencies, attenuation of net acoustic intensity was proportional to attenuation of SED. At modal and non-modal frequencies, optimal SED-based ASAC system design is guided by establishing general symmetry between the structural disturbing force and the SED sensor and control actuator. Using fixed sensor and actuator placement, SED-based control has been found to provide superior performance to single-point velocity control and performance comparable to two-point velocity control.
Its greatest strength is that it rarely causes unwanted amplifications of large amplitude when properly designed. Genetic algorithm simulations of SED-based ASAC indicated that optimal control effectiveness is obtained when sensors and actuators function in more than one role. For example, an actuator can be placed to simultaneously reduce structural vibration amplitude and reshape the response such that it radiates less efficiently. These principles can be applied to the design of any type of ASAC system.
227

Geometric Invariance In The Analysis Of Human Motion In Video Data

Shen, Yuping 01 January 2009
Human motion analysis is one of the major problems in computer vision research. It deals with the study of the motion of the human body in video data from different aspects, ranging from the tracking of body parts and the reconstruction of 3D human body configuration to higher-level interpretation of human actions and activities in image sequences. When human motion is observed through a video camera, it is perspectively distorted and may appear totally different from different viewpoints. It is therefore highly challenging to establish correct relationships between human motions across video sequences with different camera settings. In this work, we investigate the geometric invariance in the motion of the human body, which is critical to accurately understanding human motion in video data regardless of variations in camera parameters and viewpoints. In human action analysis, the representation of human action is a very important issue, and it usually determines the nature of the solutions, including their limits in resolving the problem. Unlike existing research that studies human motion as a whole 2D/3D object or a sequence of postures, we study human motion as a sequence of body pose transitions. We also decompose a human body pose further into a number of body point triplets, and break down a pose transition into the transition of a set of body point triplets. In this way the study of the complex non-rigid motion of the human body is reduced to that of the motion of rigid body point triplets, i.e. a collection of planes in motion. As a result, projective geometry and linear algebra can be applied to explore the geometric invariance in human motion. Based on this formulation, we have discovered the fundamental ratio invariant and the eigenvalue equality invariant in human motion. We also propose solutions based on these geometric invariants to the problems of view-invariant recognition of human postures and actions, as well as analysis of human motion styles. 
These invariants and their applicability have been validated by experimental results that support their effectiveness in understanding human motion under various camera parameters and viewpoints.
228

Take the Lead: Toward a Virtual Video Dance Partner

Farris, Ty 01 August 2021
My work focuses on taking a single person as input and predicting the intentional movement of one dance partner based on the other dance partner's movement. Human pose estimation has been applied to dance and computer vision, but many existing applications focus on a single individual or multiple individuals performing. Currently there are very few works that focus specifically on dance couples combined with pose prediction. This thesis is applicable to the entertainment and gaming industry, where a virtual dance partner can be used to train people to dance. Many existing interactive or virtual dance partners require a motion capture system, multiple cameras or a robot, which makes them expensive. This thesis does not use a motion capture system and combines OpenPose with swing dance YouTube videos to create a virtual dance partner. By taking in the current dancer's moves as input, the system predicts the dance partner's corresponding moves in the video frames. In order to create a virtual dance partner, datasets that contain information about the skeleton keypoints are necessary to predict a dance partner's pose. There are existing dance datasets for specific types of dance, but most of these datasets do not cover swing dance, and those that do include swing have a limited number of videos. The contribution of this thesis is a large swing dataset that contains three different types of swing dance: East Coast, Lindy Hop and West Coast. I also provide a basic framework to extend the work to create a real-time and interactive dance partner.
229

Reinforcement learning for robotic manipulation / Reinforcement learning för manipulering med robot

Arnekvist, Isac January 2017
Reinforcement learning was recently used successfully for real-world robotic manipulation tasks, without the need for human demonstration, using a normalized advantage function (NAF) algorithm. Limitations on the shape of the advantage function, however, raise doubts about what kind of policies can be learned using this method. For similar tasks, convolutional neural networks have been used for pose estimation from images taken with fixed-position cameras. For some applications, however, a fixed camera position might not be a valid assumption. It was also shown that the quality of policies for robotic tasks deteriorates severely with small camera offsets. This thesis investigates the use of NAF for a pushing task with clear multimodal properties. The results are compared with those of a deterministic policy with minimal constraints on the Q-function surface. Methods for pose estimation using convolutional neural networks are further investigated, especially with regard to randomly placed cameras with unknown offsets. By defining the coordinate frame of objects with respect to some visible feature, it is hypothesized that relative pose estimation can be accomplished even when the camera is not fixed and the offset is unknown. NAF is successfully implemented to solve a simple reaching task on a real robotic system where data collection is distributed over several robots, and learning is done on a separate server. Using NAF to learn a pushing task fails to converge to a good policy, both on the real robots and in simulation. Deep deterministic policy gradient (DDPG) is instead used in simulation and successfully learns to solve the task. The learned policy is then applied on the real robots and succeeds in solving the task in the real setting as well. Pose estimation from fixed-position camera images is learned and the policy is still able to solve the task using these estimates. 
By defining a coordinate frame from an object visible to the camera, in this case the robot arm, a neural network learns to regress the pushable object's pose in this frame without the assumption of a fixed camera. However, the predictions were too imprecise to be used for solving the pushing task. Further modifications to this approach could, however, prove to be a feasible solution for randomly placed cameras with unknown poses. / Reinforcement learning has recently been used successfully to teach non-simulated robots tasks using a normalized advantage function (NAF) algorithm, without the use of human demonstrations. However, restrictions on the function surfaces used may prove problematic for generalization to other tasks. For pose estimation in similar contexts, convolutional neural networks have been used with images from a camera in a fixed position. In some applications, however, the camera cannot be guaranteed to keep a fixed position, and studies have shown that the quality of policies deteriorates severely when the camera is moved.   This thesis investigates the use of NAF to learn a pushing task with clear multimodal properties. The results are compared with the use of a deterministic policy with minimal restrictions on the Q-function surface. Furthermore, the use of convolutional neural networks for pose estimation is investigated, particularly with regard to randomly placed cameras with unknown placement. By defining the coordinate frame of objects relative to a visible reference object, relative pose estimation is believed to be feasible even when the camera is movable and its displacement is unknown. In this thesis, NAF is applied successfully to simpler problems where data collection is distributed over several robots and learning takes place on a central server. When applied to the pushing task, however, NAF fails, both when training on real robots and in simulation. 
Deep deterministic policy gradient (DDPG) is instead applied to the problem and successfully learns to solve it in simulation. The learned policy is then applied successfully on real robots. Pose estimation using a fixed camera is also implemented successfully. By defining a coordinate system from an object in the image with a known position, in this case the robot arm, the positions of other objects can be described in this coordinate frame with the help of neural networks. However, the precision proves to be too low to be applied on robots. The results nevertheless show that this method, with further extensions and modifications, could solve the problem.
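The shape limitation mentioned in this abstract can be made concrete with a small sketch. In NAF as commonly formulated, the advantage is a negative quadratic around a single action mu, so the Q-function has exactly one peak per state. The scalar-action simplification, the grid search, and all numeric values below are assumptions for illustration only.

```python
def naf_advantage(action, mu, p):
    """Quadratic advantage A(a) = -0.5 * p * (a - mu)^2 with p > 0 (scalar-action case)."""
    return -0.5 * p * (action - mu) ** 2

def naf_q_value(action, value, mu, p):
    """Q(s, a) = V(s) + A(s, a) for one fixed state."""
    return value + naf_advantage(action, mu, p)

# Hypothetical state: V = 1.0, greedy action mu = 0.5, curvature p = 2.0.
candidate_actions = [i / 10 for i in range(-20, 21)]
best_action = max(
    candidate_actions,
    key=lambda a: naf_q_value(a, value=1.0, mu=0.5, p=2.0),
)
print(best_action)
```

Because the quadratic has a single maximum at mu, a task in which two clearly different actions are both good (a multimodal Q-function) cannot be represented exactly in this form.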
230

Cooperative Navigation of Autonomous Vehicles in Challenging Environments

Forsgren, Brendon Peter 18 September 2023
As the capabilities of autonomous systems have increased so has interest in utilizing teams of autonomous systems to accomplish tasks more efficiently. This dissertation takes steps toward enabling the cooperation of unmanned systems in scenarios that are challenging, such as GPS-denied or perceptually aliased environments. This work begins by developing a cooperative navigation framework that is scalable in the number of agents, robust against communication latency or dropout, and requires little a priori information. Additionally, this framework is designed to be easily adopted by existing single-agent systems with minimal changes to existing software and software architectures. All systems in the framework are validated through Monte Carlo simulations. The second part of this dissertation focuses on making cooperative navigation robust in challenging environments. This work first focuses on enabling a more robust version of pose graph SLAM, called cycle-based pose graph optimization, to be run in real-time by implementing and validating an algorithm to incrementally approximate a minimum cycle basis. A new algorithm is proposed that is tailored to multi-agent systems by approximating the cycle basis of two graphs that have been joined. These algorithms are validated through extensive simulation and hardware experiments. The last part of this dissertation focuses on scenarios where perceptual aliasing and incorrect or unknown data association are present. This work presents a unification of the framework of consistency maximization, and extends the concept of pairwise consistency to group consistency. This work shows that by using group consistency, low-degree-of-freedom measurements can be rejected in high-outlier regimes if the measurements do not fit the distribution of other measurements. The efficacy of this method is verified extensively using both simulation and hardware experiments.
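The notion of pairwise consistency used in this abstract can be sketched as a search for the largest set of measurements in which every pair agrees. The brute-force search, the scalar measurements, and the tolerance-based check below are simplifying assumptions; they do not reproduce the dissertation's algorithms or its group-consistency extension.

```python
from itertools import combinations

def largest_consistent_set(measurements, is_consistent):
    """Return indices of the largest subset in which every pair passes `is_consistent`.

    Brute force over subsets, so only suitable for a handful of measurements.
    """
    n = len(measurements)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(
                is_consistent(measurements[i], measurements[j])
                for i, j in combinations(subset, 2)
            ):
                return list(subset)
    return []

# Hypothetical scalar measurements; the third one is an outlier.
readings = [1.0, 1.1, 5.0, 0.9]
inliers = largest_consistent_set(readings, lambda a, b: abs(a - b) <= 0.3)
print(inliers)
```

Measurements outside the returned set are rejected as outliers. A group-consistency check would test three or more measurements jointly instead of two at a time.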
