221

Facial Feature Tracking and Head Pose Tracking as Input for Platform Games

Andersson, Anders Tobias January 2016 (has links)
Modern facial feature tracking techniques can automatically extract and accurately track multiple facial landmark points from faces in video streams in real time. Facial landmark points are defined as points distributed on a face with respect to certain facial features, such as eye corners and the face contour. This opens up the possibility of using facial feature movements as a hands-free human-computer interaction technique. These alternatives to traditional input devices can give a more interesting gaming experience. They also allow more intuitive controls and can possibly give greater access to computers and video game consoles for certain disabled users who have difficulties using their arms and/or fingers. This research explores using facial feature tracking to control a character's movements in a platform game. The aim is to interpret facial feature tracker data and convert facial feature movements into game input controls. The facial feature input is compared with other hands-free input methods, as well as traditional keyboard input. The other hands-free input methods that are explored are head pose estimation and a hybrid of facial feature tracking and head pose estimation. Head pose estimation is a method in which the application extracts the angles at which the user's head is tilted. The hybrid input method utilises both head pose estimation and facial feature tracking. The input methods are evaluated by user performance and subjective ratings from voluntary participants playing a platform game with each input method. Performance is measured by the time, the number of jumps and the number of turns it takes for a user to complete a platform level. Jumping is an essential part of platform games: to reach the goal, the player has to jump between platforms, and an inefficient input method might make this a difficult task. Turning is the action of changing the direction of the player character from facing left to facing right or vice versa.
This measurement is intended to pick up difficulties in controlling the character's movements: if the player makes many turns, it is an indication that the input method makes it difficult to control the character's movements efficiently. The results suggest that keyboard input is the most effective input method, while it is also the least entertaining of the input methods. There is no significant difference in performance between facial feature input and head pose input. The hybrid input method has the best overall results of the alternative input methods: it achieved significantly better performance than the head pose and facial feature input methods, while its results showed no statistically significant difference from the keyboard input method. Keywords: Computer Vision, Facial Feature Tracking, Head Pose Tracking, Game Control / Modern techniques can automatically extract and accurately track multiple landmarks from faces in video streams. Facial landmarks are defined as points placed on the face along facial features such as the eyes or the face contour. This opens up the possibility of using facial feature movements as a technique for hands-free human-computer interaction. These alternatives to traditional keyboards and game controllers can be used to make computers and game consoles more accessible to certain users with motor impairments. This thesis explores the usefulness of facial feature tracking for controlling a character in a platform game. The aim is to interpret data from a facial feature tracking application and translate the facial feature movements into game controller input. The facial feature input is compared with head pose estimation input, a hybrid of facial feature tracking and head pose estimation, and traditional keyboard controls. Head pose estimation is a technique in which the application extracts the angles at which the user's head is tilted. The hybrid method uses both facial feature tracking and head pose estimation. The input methods are examined by measuring efficiency in terms of time, number of jumps and number of turns, together with subjective ratings from voluntary test users playing a platform game with the different input methods. Jumping is important in a platform game: to reach the goal, the player must jump between platforms, and an inefficient input method can make this difficult. A turn is when the player character changes direction from facing right to facing left or vice versa. A high number of turns may indicate that it is difficult to control the player character's movements efficiently. The results suggest that keyboard input is the most effective method for controlling platform games, while it received the lowest ratings for how much fun the user had while playing. There was no statistically significant difference between head pose input and facial feature input. The hybrid of facial feature input and head pose input achieved the best overall results among the alternative input methods. Keywords: Computer Vision, Facial Feature Tracking, Head Tracking, Game Input
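A minimal sketch of how head-pose angles could be mapped to discrete platform-game actions. The thresholds, field names, and action names here are illustrative assumptions, not taken from the thesis:

```python
# Hypothetical mapping from head-pose angles to game controls.
# Threshold values and the HeadPose fields are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class HeadPose:
    yaw: float    # degrees; positive = head turned right
    pitch: float  # degrees; positive = head tilted up


def pose_to_controls(pose, turn_threshold=10.0, jump_threshold=15.0):
    """Convert head-pose angles into a set of discrete game actions."""
    actions = set()
    if pose.yaw > turn_threshold:
        actions.add("move_right")
    elif pose.yaw < -turn_threshold:
        actions.add("move_left")
    if pose.pitch > jump_threshold:
        actions.add("jump")
    return actions
```

A real implementation would also smooth the angle stream over several frames to avoid jittery turns, which matters for the turn-count measure described above.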
222

Evaluation of 3D motion capture data from a deep neural network combined with a biomechanical model

Rydén, Anna, Martinsson, Amanda January 2021 (has links)
Motion capture has in recent years attracted growing interest in many fields, from the game industry to sports analysis. The need for reflective markers and expensive multi-camera systems limits adoption, since they are costly and time-consuming. One solution could be a deep neural network trained to extract 3D joint estimates from a 2D video captured with a smartphone. This master's thesis project has investigated the accuracy of a trained convolutional neural network, MargiPose, which estimates 25 joint positions in 3D from a 2D video, against a gold-standard multi-camera Vicon system. The project has also investigated whether the data from the deep neural network can be connected to a biomechanical modelling software, AnyBody, for further analysis. The final intention of this project was to analyze how accurate such a combination could be in golf swing analysis. The accuracy of the deep neural network has been evaluated with three parameters: marker position, angular velocity and kinetic energy for different segments of the human body. MargiPose delivers results with high accuracy (Mean Per Joint Position Error (MPJPE) = 1.52 cm) for a simpler movement, but for a more advanced motion such as a golf swing it achieves lower accuracy in marker distance (MPJPE = 3.47 cm). The mean difference in angular velocity shows that MargiPose has difficulty following segments that are occluded or move quickly, such as the wrists in a golf swing, where they both move fast and are occluded by other body segments. The conclusion of this research is that it is possible to connect data from a trained CNN to a biomechanical modelling software. The accuracy of the network is highly dependent on the intended use of the data. For the purpose of golf swing analysis, this could be a cost-effective solution that enables motion analysis for professionals as well as interested beginners. MargiPose shows high accuracy when evaluating simple movements. However, when using it with the intention of analyzing a golf swing in a biomechanical modelling software, the outcome might be beyond the bounds of reliable results.
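The MPJPE figures quoted above are mean Euclidean distances between predicted and ground-truth joint positions. A minimal sketch of the metric (an illustrative implementation, not the thesis's code):

```python
import math


def mpjpe(pred, gt):
    """Mean Per Joint Position Error: the mean Euclidean distance between
    predicted and ground-truth 3D joint positions, in the input's units."""
    assert len(pred) == len(gt) and pred, "joint lists must match and be non-empty"
    total = 0.0
    for p, g in zip(pred, gt):
        total += math.dist(p, g)  # Euclidean distance between 3D points
    return total / len(pred)
```

With centimetre inputs this returns centimetres, matching the 1.52 cm and 3.47 cm figures reported in the abstract.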
223

Learning to Predict Dense Correspondences for 6D Pose Estimation

Brachmann, Eric 17 January 2018 (has links)
Object pose estimation is an important problem in computer vision with applications in robotics, augmented reality and many other areas. An established strategy for object pose estimation consists of, firstly, finding correspondences between the image and the object’s reference frame, and, secondly, estimating the pose from outlier-free correspondences using Random Sample Consensus (RANSAC). The first step, namely finding correspondences, is difficult because object appearance varies depending on perspective, lighting and many other factors. Traditionally, correspondences have been established using handcrafted methods like sparse feature pipelines. In this thesis, we introduce a dense correspondence representation for objects, called object coordinates, which can be learned. By learning object coordinates, our pose estimation pipeline adapts to various aspects of the task at hand. It works well for diverse object types, from small objects to entire rooms, varying object attributes, like textured or texture-less objects, and different input modalities, like RGB-D or RGB images. The concept of object coordinates allows us to easily model and exploit uncertainty as part of the pipeline such that even repeating structures or areas with little texture can contribute to a good solution. Although we can train object coordinate predictors independent of the full pipeline and achieve good results, training the pipeline in an end-to-end fashion is desirable. It enables the object coordinate predictor to adapt its output to the specificities of following steps in the pose estimation pipeline. Unfortunately, the RANSAC component of the pipeline is non-differentiable which prohibits end-to-end training. Adopting techniques from reinforcement learning, we introduce Differentiable Sample Consensus (DSAC), a formulation of RANSAC which allows us to train the pose estimation pipeline in an end-to-end fashion by minimizing the expectation of the final pose error.
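The hypothesize-and-verify loop at the core of RANSAC can be sketched on a toy line-fitting problem. This is a generic illustration of the sample-consensus idea, not the thesis's pose pipeline, and all names here are made up:

```python
import random


def ransac_line(points, iters=200, inlier_tol=0.1, seed=0):
    """Minimal RANSAC: repeatedly fit a line y = a*x + b to a random
    2-point sample and keep the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # skip degenerate vertical samples for simplicity
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = sum(1 for (x, y) in points if abs(y - (a * x + b)) < inlier_tol)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers
```

DSAC's contribution is replacing the hard, non-differentiable argmax over hypotheses in a loop like this with a probabilistic selection whose expected loss can be differentiated.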
224

Hypothesis Generation for Object Pose Estimation: From Local Sampling to Global Reasoning

Michel, Frank 14 February 2019 (has links)
Pose estimation has been studied since the early days of computer vision. The task of object pose estimation is to determine the transformation that maps an object from its inherent coordinate system into the camera-centric coordinate system. This transformation describes the translation of the object relative to the camera and the orientation of the object in three-dimensional space. Knowledge of an object's pose is a key ingredient in many application scenarios such as robotic grasping, augmented reality, autonomous navigation and surveillance. A general estimation pipeline consists of the following four steps: extraction of distinctive points, creation of a hypothesis pool, hypothesis verification and, finally, hypothesis refinement. In this work, we focus on the hypothesis generation process. We show that it is beneficial to utilize geometric knowledge in this process. We address the problem of hypothesis generation for articulated objects. Instead of considering each object part individually, we model the object as a kinematic chain. This enables us to use the inter-part relationships when sampling pose hypotheses, so that we need only K correspondences for objects consisting of K parts. We show that applying geometric knowledge about part relationships improves estimation accuracy under severe self-occlusion and low-quality correspondence predictions. In an extension, we employ global reasoning within the hypothesis generation process instead of sampling 6D pose hypotheses locally. We formulate a Conditional Random Field (CRF) that operates on the image as a whole, inferring those pixels that are consistent with the 6D pose. Within the CRF we use a strong geometric check that is able to assess the quality of correspondence pairs. We show that our global geometric check improves the accuracy of pose estimation under heavy occlusion.
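Why a smaller minimal sample helps: the standard RANSAC iteration-count formula shows how quickly the number of required hypotheses grows with the minimal sample size. This is a generic sample-consensus calculation, not taken from the thesis:

```python
import math


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Number of RANSAC iterations needed to draw at least one all-inlier
    minimal sample with the given confidence (standard RANSAC formula)."""
    fail_prob = 1.0 - confidence
    good_sample = inlier_ratio ** sample_size  # P(one sample is all inliers)
    return math.ceil(math.log(fail_prob) / math.log(1.0 - good_sample))
```

At a 50% inlier ratio, a minimal sample of 3 correspondences needs 35 iterations while a sample of 6 needs 293, which is the intuition behind sampling only K correspondences for a K-part kinematic chain.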
225

Pose Estimation using Implicit Functions and Uncertainty in 3D

Blomstedt, Frida January 2023 (has links)
Human pose estimation in 3D is a large field within computer vision, with many application areas. A common approach is to first estimate the pose in 2D, producing a confidence heatmap, and then estimate the 3D pose from the most likely 2D estimates. This may, however, cause problems in cases where pose estimates are more uncertain and the estimate of a point is far from the true position, for example when a limb is occluded. This thesis adapts the Neural Radiance Fields (NeRF) method to 2D confidence heatmaps in order to create an implicit representation of the uncertainty in 3D, thus attempting to make use of as much of the 2D information as possible. The adapted method was evaluated on the Human3.6M dataset, and results show that it outperforms a simple triangulation baseline, especially when the 2D estimate is far from the true pose.
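One simple way to use the whole confidence heatmap rather than only its peak is a soft argmax, sketched here. This is an illustrative sketch of the underlying idea; the thesis's NeRF-based method is considerably more involved:

```python
def soft_argmax_2d(heatmap):
    """Expected (x, y) position under a 2D confidence heatmap.
    Unlike a hard argmax, every cell contributes in proportion to its
    confidence, so spread-out uncertainty shifts the estimate."""
    total = sum(sum(row) for row in heatmap)
    ex = ey = 0.0
    for y, row in enumerate(heatmap):
        for x, v in enumerate(row):
            ex += x * v / total
            ey += y * v / total
    return ex, ey
```

For a bimodal heatmap (e.g. an occluded limb with two plausible positions) the expectation falls between the modes, which is exactly the situation where a hard argmax can be far from the true position.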
226

3D-Reconstruction of the Common Murre / 3D-Rekonstruering av Sillgrissla

Hägerlind, Johannes January 2023 (has links)
Automatic 3D reconstruction of birds can aid researchers in studying their behavior. Recently there have been attempts to reconstruct a variety of birds from single-view images. However, the common murre's appearance differs from the birds that have been studied, and recent studies have focused on side views. This thesis studies 3D reconstruction of the common murre from single top-view images. A template mesh is first optimized to fit a 3D scan, and the result is used to optimize a species-specific mean shape from side-view images annotated with keypoints and silhouettes. The resulting mean mesh is then used to initialize the optimization for top-down images. Using a mask loss, a pose prior loss, and a bone length loss based on a mean vector from the side-view images improves the 3D reconstruction as rated by humans. Furthermore, the intersection over union (IoU) and percentage of correct keypoints (PCK) metrics, although used by other authors, are insufficient in a single-view top-view setting.
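The PCK metric mentioned above counts a predicted keypoint as correct if it falls within a threshold distance of the ground truth, usually a fraction of some reference length. A minimal sketch, where the reference length and the alpha fraction are illustrative assumptions:

```python
import math


def pck(pred, gt, ref_length, alpha=0.1):
    """Percentage of Correct Keypoints: a predicted 2D keypoint counts as
    correct if it lies within alpha * ref_length of its ground truth."""
    threshold = alpha * ref_length
    correct = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return correct / len(gt)
```

A single scalar like this hides direction and depth errors, which is one reason such metrics can be insufficient in a top-view setting where errors along the viewing axis are invisible in 2D.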
227

Polarimetric Imagery for Object Pose Estimation

Siefring, Matthew D. 15 May 2023 (has links)
No description available.
228

Scanning Laser Registration and Structural Energy Density Based Active Structural Acoustic Control

Manwill, Daniel Alan 17 December 2010 (has links) (PDF)
To simplify the measurement of energy-based structural metrics, a general registration process for the scanning laser Doppler vibrometer (SLDV) has been developed. Existing registration techniques, also known as pose estimation or position registration, suffer from mathematical complexity, instrument specificity, and the need for correct optimization initialization. These difficulties have been addressed through development of a general linear laser model and a hybrid registration algorithm. These are applicable to any SLDV and allow the registration problem to be solved using straightforward mathematics. Additionally, the hybrid registration algorithm eliminates the need for correct optimization initialization by separating the optimization process from solution selection. The effectiveness of this approach is demonstrated through simulated application and by validation measurements performed on a specially prepared pipe. To increase understanding of the relationships between structural energy metrics and the acoustic response, the use of structural energy density (SED) in active structural acoustic control (ASAC) has also been studied. A genetic algorithm and other simulations were used to determine achievable reduction in acoustic radiation, characterize control system design, and compare SED-based control with the simpler velocity-based control. Using optimized sensor and actuator placements at optimally excited modal frequencies, attenuation of net acoustic intensity was proportional to attenuation of SED. At modal and non-modal frequencies, optimal SED-based ASAC system design is guided by establishing general symmetry between the structural disturbing force and the SED sensor and control actuator. Using fixed sensor and actuator placement, SED-based control has been found to provide superior performance to single-point velocity control and comparable performance to two-point velocity control.
Its greatest strength is that it rarely causes unwanted amplifications of large amplitude when properly designed. Genetic algorithm simulations of SED-based ASAC indicated that optimal control effectiveness is obtained when sensors and actuators function in more than one role. For example, an actuator can be placed to simultaneously reduce structural vibration amplitude and reshape the response such that it radiates less efficiently. These principles can be applied to the design of any type of ASAC system.
229

Geometric Invariance In The Analysis Of Human Motion In Video Data

Shen, Yuping 01 January 2009 (has links)
Human motion analysis is one of the major problems in computer vision research. It deals with the study of the motion of the human body in video data from different aspects, ranging from the tracking of body parts and reconstruction of 3D human body configuration to higher-level interpretation of human actions and activities in image sequences. When human motion is observed through a video camera, it is perspectively distorted and may appear totally different from different viewpoints. It is therefore highly challenging to establish correct relationships between human motions across video sequences with different camera settings. In this work, we investigate geometric invariance in the motion of the human body, which is critical for accurately understanding human motion in video data regardless of variations in camera parameters and viewpoints. In human action analysis, the representation of human action is a very important issue, and it usually determines the nature of the solutions, including their limits in resolving the problem. Unlike existing research that studies human motion as a whole 2D/3D object or a sequence of postures, we study human motion as a sequence of body pose transitions. We further decompose a human body pose into a number of body point triplets, and break down a pose transition into the transitions of a set of body point triplets. In this way the study of the complex non-rigid motion of the human body is reduced to that of the motion of rigid body point triplets, i.e. a collection of planes in motion. As a result, projective geometry and linear algebra can be applied to explore geometric invariance in human motion. Based on this formulation, we have discovered the fundamental ratio invariant and the eigenvalue equality invariant in human motion. We also propose solutions based on these geometric invariants to the problems of view-invariant recognition of human postures and actions, as well as analysis of human motion styles. These invariants and their applicability have been validated by experimental results supporting their effectiveness in understanding human motion under various camera parameters and viewpoints.
230

Take the Lead: Toward a Virtual Video Dance Partner

Farris, Ty 01 August 2021 (has links) (PDF)
My work focuses on taking a single person as input and predicting the intentional movement of one dance partner based on the other dance partner's movement. Human pose estimation has been applied to dance and computer vision, but many existing applications focus on a single individual or multiple individuals performing. Currently there are very few works that focus specifically on dance couples combined with pose prediction. This thesis is applicable to the entertainment and gaming industry by training people to dance with a virtual dance partner. Many existing interactive or virtual dance partners require a motion capture system, multiple cameras, or a robot, all of which are expensive. This thesis does not use a motion capture system; instead it combines OpenPose with swing dance YouTube videos to create a virtual dance partner. Taking the current dancer's moves as input, the system predicts the dance partner's corresponding moves in the video frames. In order to create a virtual dance partner, datasets that contain skeleton keypoint information are necessary to predict a dance partner's pose. There are existing dance datasets for specific types of dance, but these datasets do not cover swing dance, and the dance datasets that do include swing have a limited number of videos. The contribution of this thesis is a large swing dataset that contains three different types of swing dance: East Coast, Lindy Hop and West Coast. I also provide a basic framework to extend the work to create a real-time, interactive dance partner.
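OpenPose writes per-frame JSON with a flat [x, y, confidence] list per detected person. A minimal parser sketch for that structure (illustrative; the field names follow OpenPose's documented output format, and any real pipeline would add error handling):

```python
import json


def load_pose_keypoints(json_text):
    """Parse OpenPose-style output JSON into per-person lists of
    (x, y, confidence) keypoints. OpenPose stores keypoints as a flat
    [x1, y1, c1, x2, y2, c2, ...] list under each entry in "people"."""
    data = json.loads(json_text)
    people = []
    for person in data.get("people", []):
        flat = person["pose_keypoints_2d"]
        # Regroup the flat list into (x, y, confidence) triples.
        people.append([tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)])
    return people
```

For a dance-couple dataset, a per-frame record would pair two such keypoint lists, one per partner, so a model can learn to map one partner's sequence to the other's.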
