71

Reconstructing 3D Humans From Visual Data

Zheng, Ce 01 January 2023 (has links) (PDF)
Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose and shape of the entire human body. HPE and HMR have gained significant attention due to their applications in areas such as digital human avatar modeling, AI coaching, and virtual reality [135]. However, HPE and HMR come with notable challenges, including intricate body articulation, occlusion, depth ambiguity, and the limited availability of annotated 3D data. Despite the progress made so far, the research community continues to strive for robust, accurate, and efficient solutions in HPE and HMR. This dissertation tackles various challenges in these domains. The initial focus is on video-based HPE, where we propose a transformer architecture named PoseFormer [136] to capture the spatial relationships between body joints and the temporal correlations across frames. This approach effectively harnesses the comprehensive connectivity and expressive power of transformers, leading to improved pose estimation accuracy in video sequences. Building upon this, the dissertation addresses the heavy computational and memory burden associated with image-based HMR. Our proposed Feature Map-based Transformer method (FeatER [133]) and Pooling Attention Transformer method (POTTER [130]) demonstrate superior performance while significantly reducing computational and memory requirements compared to existing state-of-the-art techniques. Furthermore, a diffusion-based framework (DiffMesh [134]) is proposed for reconstructing high-quality human mesh outputs from input video sequences. These achievements provide practical and efficient solutions that cater to the demands of real-world applications in HPE and HMR. Together, these contributions advance the fields of HPE and HMR, bringing us closer to accurate and efficient solutions for understanding humans in visual content.
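Below is a minimal sketch, not the dissertation's implementation, of the spatial-temporal transformer design that PoseFormer describes: attention first across body joints within each frame, then across frames. All tensor shapes, dimensions, and layer choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpatialTemporalBlock(nn.Module):
    def __init__(self, num_joints=17, dim=32, heads=4):
        super().__init__()
        self.joint_embed = nn.Linear(2, dim)  # one token per 2D joint
        self.spatial = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.temporal = nn.TransformerEncoderLayer(num_joints * dim, heads,
                                                   batch_first=True)
        self.head = nn.Linear(num_joints * dim, num_joints * 3)

    def forward(self, x):                       # x: (batch, frames, joints, 2)
        b, f, j, _ = x.shape
        tokens = self.joint_embed(x)            # (b, f, j, dim)
        tokens = self.spatial(tokens.reshape(b * f, j, -1))  # attend over joints
        frames = tokens.reshape(b, f, -1)       # one token per frame
        frames = self.temporal(frames)          # attend over frames
        return self.head(frames.mean(dim=1)).reshape(b, j, 3)

pose_2d = torch.randn(1, 9, 17, 2)              # a 9-frame window of 2D joints
pose_3d = SpatialTemporalBlock()(pose_2d)       # (1, 17, 3) estimated 3D pose
```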
72

Rigorous Model of Panoramic Cameras

Shin, Sung Woong 31 March 2003 (has links)
No description available.
73

Deep Learning for estimation of fingertip location in 3-dimensional point clouds : An investigation of deep learning models for estimating fingertips in a 3D point cloud and its predictive uncertainty

Hölscher, Phillip January 2021 (has links)
Sensor technology is developing rapidly and, consequently, the generation of point cloud data is constantly increasing. Since the recent release of PointNet, it is possible to process this unordered 3-dimensional data directly in a neural network. The company TLT Screen AB, which develops cutting-edge tracking technology, seeks to optimize the localization of the fingertips of a hand in a point cloud. To do so, the identification of relevant 3D neural network models for modeling hands and detecting fingertips in various hand orientations is essential. Hand PointNet processes point clouds of hands directly and generates estimates of fixed points (joints), including fingertips. This model was therefore selected to optimize the localization of fingertips for TLT Screen AB and forms the subject of this research. The model has advantages over conventional convolutional neural networks (CNNs). First, in contrast to a 2D CNN, Hand PointNet can use the full 3-dimensional spatial information. Compared to a 3D CNN, it also avoids unnecessarily voluminous data and enables more efficient learning. The model was trained and evaluated on the public MSRA Hand dataset. In contrast to previously published work, the main object of this investigation is the estimation of only 5 joints, the fingertips. The behavior of the model under a reduction from the usual 21 joints to 11 and then to only 5 is examined. It is found that reducing the number of joints increases the mean error of the estimated joints. Furthermore, the distribution of the residuals for the fingertip estimates is found to be less dense. Applying MC dropout to study the prediction uncertainty for the fingertips shows that the uncertainty increases as the number of joints decreases. Finally, the results show that the uncertainty is greatest for the prediction of the thumb tip; starting from the thumb tip, the uncertainty of the estimates decreases with each additional fingertip.
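As an illustration of the MC dropout procedure described above, the following sketch keeps dropout active at inference and reads the spread of repeated predictions as uncertainty. The simple regressor is a stand-in, not the Hand PointNet architecture; all shapes are toy assumptions.

```python
import torch
import torch.nn as nn

# Stand-in regressor: flattened point cloud in, 5 fingertip joints (xyz) out
model = nn.Sequential(nn.Linear(3 * 1024, 256), nn.ReLU(),
                      nn.Dropout(p=0.2),
                      nn.Linear(256, 5 * 3))

def mc_dropout_predict(model, points, passes=50):
    model.train()                                # keep dropout active at test time
    with torch.no_grad():
        preds = torch.stack([model(points) for _ in range(passes)])
    return preds.mean(0), preds.std(0)           # estimate and its uncertainty

cloud = torch.randn(1, 3 * 1024)                 # flattened toy point cloud
mean, std = mc_dropout_predict(model, cloud)
print(std.reshape(5, 3).norm(dim=1))             # per-fingertip uncertainty
```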
74

Learned structural and temporal context for dynamic 3D pose optimization and tracking

Patel, Mahir 30 September 2022 (has links)
Accurate 3D tracking of animals from video recordings is critical for many behavioral studies. However, other than for humans, there is a lack of publicly available datasets of videos of animals that the computer vision community could use for model development. Furthermore, due to occlusion and the uncontrollable nature of the animals, existing pose estimation models suffer from inadequate precision. People rely on biomechanical expertise to design mathematical models to optimize poses to mitigate this issue at the cost of generalization. We propose OptiPose, a generalizable attention-based deep learning pose optimization model, as a part of a post-processing pipeline for refining 3D poses estimated by pre-existing systems. Our experiments show how OptiPose is highly robust to noise and occlusion and can be used to optimize pose sequences provided by state-of-the-art models for animal pose estimation. Furthermore, we will make Rodent3D, a multimodal (RGB, Thermal, and Depth) dataset for rats, publicly available.
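The following is a speculative sketch, under stated assumptions, of attention-based pose-sequence refinement in the spirit described above. It is not the OptiPose architecture, just a transformer encoder that maps a noisy 3D pose sequence to a refined one; joint count, dimensions, and depth are all illustrative.

```python
import torch
import torch.nn as nn

class PoseRefiner(nn.Module):
    def __init__(self, num_joints=12, dim=64):
        super().__init__()
        self.inp = nn.Linear(num_joints * 3, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(dim, num_joints * 3)

    def forward(self, seq):             # seq: (batch, frames, joints * 3)
        return self.out(self.encoder(self.inp(seq)))

noisy = torch.randn(1, 30, 12 * 3)      # 30 frames of a noisy 3D pose track
refined = PoseRefiner()(noisy)          # denoised sequence, same shape
```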
75

3-D Face Modeling from a 2-D Image with Shape and Head Pose Estimation

Oyini Mbouna, Ralph January 2014 (has links)
This paper presents 3-D face modeling with head pose and depth information estimated from a 2-D query face image. Many recent approaches to 3-D face modeling are based on a 3-D morphable model that separately encodes the shape and texture in a parameterized model. The model parameters are often obtained by applying statistical analysis to a set of scanned 3-D faces. Such approaches depend on the number and quality of scanned 3-D faces, which are difficult to obtain and computationally intensive to process. To overcome the limitations of 3-D morphable models, several modeling techniques from 2-D images have been proposed. We propose a novel framework for depth estimation from a single 2-D image with an arbitrary pose. The proposed scheme uses a set of facial features in a query face image and a reference 3-D face model to estimate the head pose angles of the face. The depth information of the subject at each feature point is represented by the depth information of the reference 3-D face model multiplied by a vector of scale factors. We use the positions of a set of facial feature points on the query 2-D image to deform the reference dense face model into a person-specific 3-D face by minimizing an objective function, defined as the feature disparity between the facial features in the face image and the corresponding 3-D facial features on the rotated reference model projected onto 2-D space. The pose and depth parameters are iteratively refined until stopping criteria are reached. The proposed method requires only a face image of arbitrary pose to reconstruct the corresponding dense 3-D face model with texture. Experimental results on the USF Human-ID and Pointing'04 databases show that the proposed approach is effective in estimating depth and head pose information from a single 2-D image. / Electrical and Computer Engineering
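A simplified sketch of the kind of objective described above, with toy data and an orthographic projection standing in for the actual camera model: head pose angles and per-feature depth scales are jointly refined so that the rotated, projected reference landmarks match the observed 2D features. Landmark counts and the optimizer are assumptions.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

ref_3d = np.random.rand(68, 3)          # reference model landmarks (toy data)
obs_2d = ref_3d[:, :2].copy()           # observed 2D features (toy data)

def residuals(params, ref_3d, obs_2d):
    angles, scales = params[:3], params[3:]
    pts = ref_3d.copy()
    pts[:, 2] *= scales                 # depth = reference depth * scale factor
    rotated = Rotation.from_euler("xyz", angles).apply(pts)
    return (rotated[:, :2] - obs_2d).ravel()   # orthographic feature disparity

x0 = np.concatenate([np.zeros(3), np.ones(68)])  # neutral pose, unit scales
fit = least_squares(residuals, x0, args=(ref_3d, obs_2d))
pose_angles, depth_scales = fit.x[:3], fit.x[3:]
```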
76

Design of Viewpoint-Equivariant Networks to Improve Human Pose Estimation

Garau, Nicola 31 May 2022 (has links)
Human pose estimation (HPE) is an ever-growing research field, with an increasing number of publications in computer vision and deep learning, and it covers a multitude of practical scenarios, from sports to entertainment and from surveillance to medical applications. Despite the impressive results that can be obtained with HPE, many problems still need to be tackled in real-world applications. Most of the issues are linked to poor or completely wrong pose detections that stem from the inability of the network to model the viewpoint. This thesis shows how designing viewpoint-equivariant neural networks can lead to substantial improvements in the field of human pose estimation, both in terms of state-of-the-art results and better real-world applications. By jointly learning how to build hierarchical human body poses together with the observer viewpoint, a network can learn to generalise its predictions to previously unseen viewpoints. As a result, the amount of training data needed can be drastically reduced, leading to faster, more efficient training and to more robust and interpretable real-world applications.
77

Gappy POD and Temporal Correspondence for Lizard Motion Estimation

Kurdila, Hannah Robertshaw 20 June 2018 (has links)
With the maturity of conventional industrial robots, there has been increasing interest in designing robots that emulate realistic animal motions. This discipline requires careful and systematic investigation of a wide range of animal motions, from biped to quadruped, and even to the serpentine motion of centipedes, millipedes, and snakes. Collecting optical motion capture data of such complex animal motions can be complicated for several reasons: detailed subject tracking often requires many high-quality cameras, and self-occlusion, loss of focus, and contrast variations challenge any imaging experiment. The problem of self-occlusion is especially pronounced for animals. In this thesis, we walk through the process of collecting motion capture data of a running lizard. In our collected raw video footage, it is difficult to make temporal correspondences using interpolation methods because of prolonged blurriness, occlusion, or the limited field of view of our cameras. To work around this, we first build a model data set by making our best guess of the points' locations through these corruptions. We then randomly occlude the data, use Gappy POD to repair it, and measure how closely the result resembles the initial set, culminating in a test case where we simulate the actual corruptions seen in the raw video footage. / Master of Science
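The Gappy POD repair step described above can be illustrated compactly: fit POD coefficients to the surviving entries of a corrupted snapshot by least squares, then reconstruct the missing ones. The data here is synthetic and the basis size is an arbitrary choice, not the thesis's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
complete = rng.standard_normal((60, 200))        # 60 snapshots x 200 coordinates
U = np.linalg.svd(complete.T, full_matrices=False)[0][:, :10]  # POD basis

snapshot = complete[0]                           # a snapshot to corrupt
mask = rng.random(200) > 0.3                     # True where data survives
gappy = np.where(mask, snapshot, 0.0)            # "eclipsed" snapshot

# Least-squares POD coefficients using only the observed entries,
# then reconstruct the full snapshot, gaps included.
coeffs, *_ = np.linalg.lstsq(U[mask], gappy[mask], rcond=None)
repaired = U @ coeffs

print(np.abs(repaired[~mask] - snapshot[~mask]).mean())  # gap-fill error
```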
78

Infrared Light-Based Data Association and Pose Estimation for Aircraft Landing in Urban Environments

Akagi, David 10 June 2024 (has links) (PDF)
In this thesis we explore an infrared light-based approach to the problem of pose estimation during aircraft landing in urban environments where GPS is unreliable or unavailable. We introduce a novel fiducial constellation composed of sparse infrared lights that incorporates projective invariant properties in its design to allow for robust recognition and association from arbitrary camera perspectives. We propose a pose estimation pipeline capable of producing high accuracy pose measurements at real-time rates from monocular infrared camera views of the fiducial constellation, and present as part of that pipeline a data association method that is able to robustly identify and associate individual constellation points in the presence of clutter and occlusions. We demonstrate the accuracy and efficiency of this pose estimation approach on real-world data obtained from multiple flight tests, and show that we can obtain decimeter-level accuracy from distances of over 100 m from the constellation. To achieve greater robustness to the potentially large number of outlier infrared detections that can arise in urban environments, we also explore learning-based approaches to the outlier rejection and data association problems. By formulating the problem of camera image data association as a 2D point cloud analysis, we can apply deep learning methods designed for 3D point cloud segmentation to achieve robust, high-accuracy associations at constant real-time speeds on infrared images with high outlier-to-inlier ratios. We again demonstrate the efficiency of our learning-based approach on both synthetic and real-world data, and compare the results and limitations of this method to our first-principles-based data association approach.
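As one example of the projective-invariant idea mentioned above (the thesis's specific constellation design may use a different invariant), the cross-ratio of four collinear points is preserved under any projective camera and can therefore be used to key point identities across views:

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross-ratio (AC * BD) / (BC * AD) of four collinear 2D points.
    Conventions vary; this is one common form."""
    dist = lambda p, q: np.linalg.norm(np.asarray(p) - np.asarray(q))
    return (dist(a, c) * dist(b, d)) / (dist(b, c) * dist(a, d))

line = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
print(cross_ratio(*line))   # same value in any projective view of these points
```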
79

MORP: Monocular Orientation Regression Pipeline

Gunderson, Jacob 01 June 2024 (has links) (PDF)
Orientation estimation of objects plays a pivotal role in robotics, self-driving cars, and augmented reality. Beyond mere position, accurately determining the orientation of objects is essential for constructing precise models of the physical world. While 2D object detection has made significant strides, the field of orientation estimation still faces several challenges. Our research addresses these hurdles by proposing an efficient pipeline which facilitates rapid creation of labeled training data and enables direct regression of object orientation from a single image. We start by creating a digital twin of a physical object using an iPhone, followed by generating synthetic images using the Unity game engine and domain randomization. Our deep learning model, trained exclusively on these synthetic images, demonstrates promising results in estimating the orientations of common objects. Notably, our model achieves a median geodesic distance error of 3.9 degrees and operates at a brisk 15 frames per second.
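The geodesic distance error quoted above is the angle of the relative rotation between the predicted and ground-truth orientations, theta = arccos((trace(R_gt^T R_pred) - 1) / 2). A short sketch of the metric:

```python
import numpy as np

def geodesic_deg(R_gt, R_pred):
    """Geodesic distance in degrees between two rotation matrices."""
    rel = R_gt.T @ R_pred
    cos = np.clip((np.trace(rel) - 1.0) / 2.0, -1.0, 1.0)  # guard rounding
    return np.degrees(np.arccos(cos))

Rz = lambda t: np.array([[np.cos(t), -np.sin(t), 0],
                         [np.sin(t),  np.cos(t), 0],
                         [0,          0,         1]])
print(geodesic_deg(np.eye(3), Rz(np.radians(3.9))))  # ~3.9 degrees
```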
80

6DOF MAGNETIC TRACKING AND ITS APPLICATION TO HUMAN GAIT ANALYSIS

Ravi Abhishek Shankar (18855049) 28 June 2024 (has links)
<p dir="ltr">There is growing research in analyzing human gait in the context of various applications. This has been aided by the improvement in sensing technologies and computation power. A complex motor skill that it is, gait has found its use in medicine for diagnosing different neurological ailments and injuries. In sports, gait can be used to provide feedback to the player/athlete to improve his/her skill and to prevent injuries. In biometrics, gait can be used to identify and authenticate individuals. This can be easier to scale to perform biometrics of individuals in large crowds compared to conventional biometric methods. In the field of Human Computer Interaction (HCI), gait can be an additional input that could be provided to be used in applications such as video games. Gait analysis has also been used for Human Activity Recognition (HAR) for purposes such as personal fitness, elderly care and rehabilitation. </p><p dir="ltr">The current state-of-the-art methods for gait analysis involves non-wearable technology due to its superior performance. The sophistication afforded in non-wearable technologies, such as cameras, is better able to capture gait information as compared to wearables. However, non-wearable systems are expensive, not scalable and typically, inaccessible to the general public. These systems sometimes need to be set up in specialized clinical facilities by experts. On the other hand, wearables offer scalability and convenience but are not able to match the performance of non-wearables. So the current work is a step in the direction to bridge the gap between the performance of non-wearable systems and the convenience of wearables. </p><p dir="ltr">A magnetic tracking system is developed to be applied for gait analysis. The system performs position and orientation tracking, i.e. 6 degrees of freedom or 6DoF tracking. One or more tracker modules, called Rx modules, is tracked with respect to a module called the Tx module. The Tx module mainly consists of a magnetic field generating coil, Inertial Measurement Unit (IMU) and magnetometer. The Rx module mainly consists of a tri-axis sensing coil, IMU and magnetometer. The system is minimally intrusive, works with Non-Line-of-Sight (NLoS) condition, low power consuming, compact and light weight. </p><p dir="ltr">The magnetic tracking system has been applied to the task of Human Activity Recognition (HAR) in this work as a proof-of-concept. The tracking system was worn by participants, and 4 activities - walking, walking with weight, marching and jogging - were performed. The Tx module was worn on the waist and the Rx modules were placed on the feet. To compare magnetic tracking with the most commonly used wearable sensors - IMUs + magnetometer - the same system was used to provide IMU and magnetometer data for the same 4 activities. The gait data was processed by 2 commonly used deep learning models - Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM). The magnetic tracking system shows an overall accuracy of 92\% compared to 86.69\% of the IMU + magnetometer system. Moreover, an accuracy improvement of 8\% is seen with the magnetic tracking system in differentiating between the walking and walking with weight activities, which are very similar in nature. This goes to show the improvement in gait information that 6DoF tracking brings, that manifests as increased classification accuracy. This increase in gait information will have a profound impact in other applications of gait analysis as well.</p>
