Global ETD Search

1	A Human Kinetic Dataset and a Hybrid Model for 3D Human Pose Estimation Wang, Jianquan 12 November 2020 (has links) Human pose estimation represents the skeleton of a person in color or depth images to improve a machine’s understanding of human movement. 3D human pose estimation uses a three-dimensional skeleton to represent the human body posture, which is more stereoscopic than a two-dimensional skeleton. Therefore, 3D human pose estimation can enable machines to play a role in physical education and health recovery, reducing labor costs and the risk of disease transmission. However, the existing datasets for 3D pose estimation do not involve fast motions that would cause optical blur for a monocular camera but would allow the subjects’ limbs to move in a more extensive range of angles. The existing models cannot guarantee both real-time performance and high accuracy, which are essential in physical education and health recovery applications. To improve real-time performance, researchers have tried to minimize the size of the model and have studied more efficient deployment methods. To improve accuracy, researchers have tried to use heat maps or point clouds to represent features, but this increases the difficulty of model deployment. To address the lack of datasets that include fast movements and easy-to-deploy models, we present a human kinetic dataset called the Kivi dataset and a hybrid model that combines the benefits of a heat map-based model and an end-to-end model for 3D human pose estimation. We describe the process of data collection and cleaning in this thesis. Our proposed Kivi dataset contains large-scale movements of humans. In the dataset, 18 joint points represent the human skeleton. We collected data from 12 people, and each person performed 38 sets of actions. Therefore, each frame of data has a corresponding person and action label. We design a preliminary model and propose an improved model to infer 3D human poses in real time. 
When validating our method on the Invariant Top-View (ITOP) dataset, we found that compared with the initial model, our improved model improves the mAP@10cm by 29%. When testing on the Kivi dataset, our improved model improves the mAP@10cm by 15.74% compared to the preliminary model. Our improved model can reach 65.89 frames per second (FPS) on the TensorRT platform. Human pose estimation Kinetic dataset
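As a rough illustration of the mAP@10cm figure quoted in this abstract: on ITOP-style benchmarks the metric is commonly computed as the fraction of predicted joints that lie within 10 cm of the ground truth. The sketch below assumes that definition and coordinates in metres. The function name and sample data are hypothetical and are not taken from the thesis.

```python
import math

def map_at_threshold(predicted, ground_truth, threshold_m=0.10):
    """Fraction of predicted joints within threshold_m of the ground truth.

    predicted / ground_truth: lists of frames, each frame a list of
    (x, y, z) joint positions in metres.
    """
    hits = 0
    total = 0
    for pred_frame, gt_frame in zip(predicted, ground_truth):
        for pred_joint, gt_joint in zip(pred_frame, gt_frame):
            total += 1
            if math.dist(pred_joint, gt_joint) <= threshold_m:
                hits += 1
    return hits / total if total else 0.0

# Hypothetical single frame with two joints: one is 5 cm off, the other 30 cm off.
gt = [[(0.0, 0.0, 2.0), (0.0, 0.5, 2.0)]]
pred = [[(0.05, 0.0, 2.0), (0.0, 0.5, 2.3)]]
score = map_at_threshold(pred, gt)
```

When every joint appears in the same number of frames, this overall fraction equals the mean of the per-joint detection rates.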
2	3D reconstruction of a catheter path from a single view X-ray sequence Weng, Ji Yao January 2003 (has links) Thesis digitized by the Direction des bibliothèques de l'Université de Montréal. Catheter Pose estimation Single view Sequence IVUS
3	A Deep 3D Object Pose Estimation Framework for Robots with RGB-D Sensors Wagh, Ameya Yatindra 24 April 2019 (has links) Object detection and pose estimation have widely been done using template matching techniques. However, these algorithms are sensitive to outliers and occlusions, and have high latency due to their iterative nature. Recent research in computer vision and deep learning has shown great improvements in the robustness of these algorithms. However, one of the major drawbacks of these algorithms is that they are specific to the objects. Moreover, the estimation of pose depends significantly on their RGB image features. As these algorithms are trained on large, meticulously labeled datasets of objects' ground-truth poses, it is difficult to re-train them for real-world applications. To overcome this problem, we propose a two-stage pipeline of convolutional neural networks which uses RGB images to localize objects in 2D space and depth images to estimate a 6DoF pose. Thus the pose estimation network learns only the geometric features of the object and is not biased by its color features. We evaluate the performance of this framework on the LINEMOD dataset, which is widely used to benchmark object pose estimation frameworks. We found the results to be comparable with state-of-the-art algorithms using RGB-D images. Secondly, to show the transferability of the proposed pipeline, we implement it on the ATLAS robot for a pick-and-place experiment. As the distribution of images in the LINEMOD dataset differs from that of the images captured by the MultiSense sensor on ATLAS, we generate a synthetic dataset from very few real-world images captured by the MultiSense sensor. We use this dataset to train just the object detection networks used in the ATLAS robot experiment. Atlas robots pose estimation semantic segmentation
4	An Improved Path Integration Mechanism Using Neural Fields Which Implement A Biologically Plausible Analogue To A Kalman Filter Connors, Warren Anthoney 22 February 2013 (has links) Interaction with the world is necessary for both animals and robots to complete tasks. This interaction requires a sense of self, or the orientation of the robot or animal with respect to the world. Creating and maintaining this model is easy for animals but can be difficult for robots due to uncertainties in the world, sensing, and movement of the robot. This estimation difficulty is increased in sensory-deprived environments, where no external inputs are available to correct the estimate. Therefore, self-generated cues of movement are needed, such as vestibular input in an animal, or accelerometer input in a robot. In spite of the difficulties, animals can easily maintain this model. This leads to the question of whether we can learn from nature by examining the biological mechanisms for pose estimation in animals. Previous work has shown that neural fields coupled with a mechanism for updating the estimate can be used to maintain a pose estimate through a sustained area of activity called a packet. Analysis of this mechanism, however, has shown conditions where the field can provide unexpected results or break down due to high accelerations input into the field. This analysis illustrates the challenges of controlling the activity packet size under strong inputs, and a limited speed capability using the existing mechanism. As a result, a novel weight combination method is proposed to provide higher speed and increased robustness. The result is an increase of over two times the existing speed capability, and resistance of the field to breaking down under strong rotational inputs. This updated neural field model provides a method for maintaining a stable pose estimate. 
To show this, a novel comparison between the proposed neural field model and the Kalman filter is considered, resulting in comparable performance in pose prediction. This work shows that an updated neural field model provides a biologically plausible pose prediction model using Bayesian inference, providing a biological analogue to a Kalman filter.
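For readers unfamiliar with the Kalman filter used as the comparison baseline in this abstract, the following is a minimal scalar sketch of its predict and update steps for a heading estimate driven by an angular-rate input. It is not the thesis's model: the function names, noise values, and the omission of angle wrapping are all illustrative assumptions.

```python
def kalman_predict(x, p, rate, dt, q):
    """Propagate heading x (rad) and its variance p using an angular-rate input.

    q is the process noise variance added per step (assumed value).
    """
    return x + rate * dt, p + q

def kalman_update(x, p, z, r):
    """Correct the estimate with a direct heading measurement z of variance r."""
    k = p / (p + r)  # Kalman gain: how much to trust the measurement
    return x + k * (z - x), (1.0 - k) * p

# Hypothetical run: start uncertain, integrate one rate sample, then correct.
x, p = 0.0, 1.0
x, p = kalman_predict(x, p, rate=0.5, dt=1.0, q=0.01)
x, p = kalman_update(x, p, z=0.6, r=0.1)
```

Without the update step the variance only grows, which mirrors the sensory-deprived case described in the abstract, where no external input is available to correct the estimate.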
5	Recognition using tagged objects Soh, Ling Min January 2000 (has links) This thesis describes a method for the recognition of objects in an unconstrained environment with widely ranging illumination, imaged from unknown viewpoints against a complicated background. The general problem is simplified by placing specially designed patterns on the object that allow us to solve the pose determination problem easily. There are several key components involved in the proposed recognition approach. They include pattern detection, pose estimation, model acquisition and matching, and searching and indexing the model database. Other crucial issues pertaining to the individual components of the recognition system, such as the choice of the pattern, the reliability and accuracy of the pattern detector, pose estimator and matching, and the speed of the overall system, are addressed. After establishing the methodological framework, experiments are carried out on a wide range of both synthetic and real data to illustrate the validity and usefulness of the proposed methods. The principal contribution of this research is a methodology for Tagged Object Recognition (TOR) in unconstrained conditions. A robust pattern (calibration chart) detector is developed for off-the-shelf use. To empirically assess the effectiveness of the pattern detector and the pose estimator under various scenarios, simulated data generated using a graphics rendering process is used. This simulated data provides ground truth which is difficult to obtain in projected images. Using the ground truth, the detection error, which is usually ignored, can be analysed. For model matching, the Chamfer matching algorithm is modified to get a more reliable matching score. The technique facilitates reliable Tagged Object Recognition (TOR). Finally, the results of extensive quantitative and qualitative tests are presented that show the plausibility of practical use of Tagged Object Recognition (TOR). 
The features characterising the enabling technology developed are the ability to a) recognise an object which is tagged with the calibration chart, b) establish camera position with respect to a landmark and c) test any camera calibration and 3D pose estimation routines, thus facilitating future research and applications in mobile robot navigation, 3D reconstruction and stereo vision. 621.3994
6	Digital Twin Coaching for Edge Computing Using Deep Learning Based 2D Pose Estimation Gámez Díaz, Rogelio 15 April 2021 (has links) In these challenging times caused by COVID-19, technology that leverages the potential of Artificial Intelligence can help people cope with the pandemic, for example people looking to perform physical exercises while in quarantine. We also find another opportunity in the widespread adoption of mobile smart devices, making complex Artificial Intelligence (AI) models accessible to the average user. Taking advantage of this situation, we propose a Smart Coaching experience on the Edge with our Digital Twin Coaching (DTC) architecture. Since the general population is advised to work from home, sedentarism has become prevalent. Coaching is a positive force in exercising, but keeping physical distance while exercising is a significant problem. Therefore, a Smart Coach can help in this scenario as it involves using smart devices instead of direct communication with another person. Some researchers have worked on Smart Coaching, but their systems often involve complex devices such as RGB-Depth cameras, making them cumbersome to use. Our approach is one of the first to focus on everyday smart devices, like smartphones, to solve this problem. Digital Twin Coaching can be defined as a virtual system designed to help people improve in a specific field and is a powerful tool if combined with edge technology. The DTC architecture has six characteristics that we try to fulfill: adaptability, compatibility, flexibility, portability, security, and privacy. We collected training data of 10 subjects using a 2D pose estimation model to train our models since there was no dataset of Coach-Trainee videos. To effectively use this information, the most critical pre-processing step was synchronization. This step synchronizes the coach and the trainee’s poses to overcome the trainee's action lag while performing the routine in real-time. 
We trained a light neural network called “Pose Inference Neural Network” (PINN) to serve as a fine-tuning mechanism. We improved the generalist 2D pose estimation model with this trained neural network while keeping the time complexity relatively unaffected. We also propose an Angular Pose Representation to compare the trainee's and coach's stances in a way that accounts for differences in people's body proportions. For the PINN model, we use Random Search Optimization to find the best configuration. The configurations tested included using 1, 2, 3, 4, 5, and 10 layers. We chose the 2-Layer Neural Network (2-LNN) configuration because it was the fastest to train and predict while providing a fair tradeoff between performance and resource consumption. Using frame synchronization in pre-processing, we improved the test loss (Mean Squared Error) by 76% while training with the 2-LNN. The PINN improved the R2 score of the PoseNet model by at least 15% and at most 93% depending on the configuration. Our approach only added 4 seconds (roughly 2% of the total time) to the total processing time on average. Finally, the usability test results showed that our Proof of Concept application, DTCoach, was considered easy to learn and convenient to use. At the same time, some participants mentioned that they would like to have more features and improved clarity to be more invested in using the app frequently. We hope DTCoach can help people stay more active, especially in quarantine, as the application can serve as a motivator. Since it can be run on modern smartphones, it can quickly be adopted by many people. Digital Twin Pose Estimation Deep Learning E-coaching
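The abstract does not spell out the Angular Pose Representation, so the following is only a sketch of the general idea under an assumption: if a stance is described by the angles at joints instead of raw keypoint coordinates, two people with different limb lengths holding the same pose produce the same values. The function name and keypoints are hypothetical.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("degenerate limb: coincident keypoints")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

# Hypothetical shoulder, elbow, wrist keypoints for two people with
# different arm lengths holding the same right-angle elbow pose.
coach_elbow = joint_angle((0, 0), (1, 0), (1, 1))
trainee_elbow = joint_angle((0, 0), (2, 0), (2, 3))
```

Both calls give the same elbow angle even though the limb lengths differ, which is the property that makes an angle-based comparison insensitive to body proportions.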
7	Performance Enhancements of the Spin-Image Pose Estimation Algorithm Gerlach, Adam R. 12 April 2010 (has links) No description available. Aerospace Materials pose estimation spin-image gpu
8	Pose Estimation for Gesture Recovery in Occluded Television Videos Pham, Kyle 26 August 2022 (has links) No description available. Computer Science
9	3D POSE ESTIMATION IN THE CONTEXT OF GRIP POSITION FOR PHRI Norman, Jacob January 2021 (has links) For human-robot interaction with the intent to grip a human arm, it is necessary that the ideal gripping location can be identified. In this work, the gripping location is situated on the arm and thus it can be extracted using the position of the wrist and elbow joints. To achieve this, human pose estimation is proposed, as there exist robust methods that work both in and outside of lab environments. One such example is OpenPose, which, thanks to the COCO and MPII datasets, has recorded impressive results in a variety of scenarios in real time. However, most of the images in these datasets are taken from a camera mounted at chest height, and the people in most of the images are oriented upright. This presents the potential problem that prone humans, which are the primary focus of this project, cannot be detected, especially if seen from an angle that makes the human appear upside down in the camera frame. To remedy this, two different approaches were tested, both aimed at creating a rotation-invariant 2D pose estimation method. The first method rotates the COCO training data in an attempt to create a model that can find humans regardless of orientation in the image. The second approach adds a RotationNet as a preprocessing step to correctly orient the images so that OpenPose can be used to estimate the 2D pose before rotating back the resulting skeletons. 3D pose estimation human pose estimation pose estimation rotation-invariant Computer Sciences
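The last step of the second approach, rotating the estimated skeleton back into the original image frame, can be sketched as below. This assumes the correction was a pure rotation about the image centre with no change of canvas size; the function name, keypoints, and angle are hypothetical, and no RotationNet or OpenPose call is shown.

```python
import math

def rotate_points(points, angle_deg, center):
    """Rotate 2D points by angle_deg about center.

    The visual direction of a positive angle depends on whether the
    y axis points up or down; the inverse is always the negated angle.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + cos_t * dx - sin_t * dy,
                        cy + sin_t * dx + cos_t * dy))
    return rotated

# Hypothetical wrist and elbow keypoints in the original image frame.
skeleton = [(100.0, 50.0), (120.0, 80.0)]
center = (160.0, 120.0)
# Stand-in for keypoints detected on the image after a 180 degree correction.
upright = rotate_points(skeleton, 180.0, center)
# Undo the correction to express the skeleton in the original frame again.
restored = rotate_points(upright, -180.0, center)
```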
10	Concept Design and Testing of a GPS-less System for Autonomous Shovel-Truck Spotting OWENS, BRETT 29 January 2013 (has links) Haul truck drivers frequently have difficulties spotting beside shovels. This is typically a combination of reduced visibility and poor mining conditions. Based on first-hand data collected from the Goldstrike Open Pit, it was learned that, on average, 9% of all spotting actions required corrective movements to facilitate loading. This thesis investigates an automated solution to haul truck spotting that does not rely on the use of the satellite global positioning system (GPS), since GPS can perform unreliably. This thesis proposes that if spotting was automated, a significant decrease in cycle times could result. Using conventional algorithms and techniques from the field of mobile robotics, vehicle pose estimation and control algorithms were designed to enable autonomous shovel-truck spotting. The developed algorithms were verified by using both simulation and field testing with real hardware. Tests were performed in analog conditions on an automation-ready Kubota RTV 900 utility truck. When initiated from a representative pose, the RTV successfully spotted to the desired location (within 1 m) in 95% of the conducted trials. The results demonstrate that the proposed approach is a strong candidate for an auto-spot system. / Thesis (Master, Mining Engineering) -- Queen's University, 2013-01-28 09:49:20.584 mobile robotics GPS pose estimation autonomous vehicles spotting times
