51

Multi-person Pose Estimation in Soccer Videos with Convolutional Neural Networks

Skyttner, Axel January 2018
Pose estimation is the problem of detecting the poses of people in images; multi-person pose estimation extends it to detecting the poses of several people in the same image. This thesis investigates multi-person pose estimation by applying the associative embedding method to images from soccer videos. Three models are compared: a pre-trained model, a fine-tuned model, and a model extended to handle image sequences. The pre-trained model performed well on soccer images, and the fine-tuned model performed better than the pre-trained model. The image-sequence model performed on par with the fine-tuned model but not better. The thesis concludes that the associative embedding model is a feasible option for pose estimation in soccer videos and merits further research.
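As a rough illustration of the grouping idea behind associative embedding, the sketch below assigns joint detections to people by comparing scalar tag values. It is not the thesis's implementation: the `Detection` layout, the `group_by_tag` helper, and the `max_tag_gap` threshold are all hypothetical names and values chosen for the example, and a real network would also supply detection scores.

```python
from typing import Dict, List, Tuple

# Hypothetical detection record: (joint_name, x, y, tag).
# `tag` stands in for the scalar embedding a network would predict per joint.
Detection = Tuple[str, float, float, float]


def group_by_tag(
    detections: List[Detection], max_tag_gap: float = 0.5
) -> List[Dict[str, Tuple[float, float]]]:
    """Greedily group joint detections into people by tag similarity."""
    people: List[dict] = []  # each entry: {"tags": [...], "joints": {...}}
    for name, x, y, tag in detections:
        best = None
        best_gap = max_tag_gap
        for person in people:
            if name in person["joints"]:
                continue  # this person already has that joint
            mean_tag = sum(person["tags"]) / len(person["tags"])
            gap = abs(tag - mean_tag)
            if gap < best_gap:
                best, best_gap = person, gap
        if best is None:
            # No existing person has a close enough tag: start a new one.
            best = {"tags": [], "joints": {}}
            people.append(best)
        best["tags"].append(tag)
        best["joints"][name] = (x, y)
    return [p["joints"] for p in people]


# Two players whose joints carry clearly separated tag values (made-up data).
example = [
    ("head", 10.0, 5.0, 0.1),
    ("head", 50.0, 6.0, 2.0),
    ("knee", 12.0, 40.0, 0.2),
    ("knee", 52.0, 41.0, 1.9),
]
players = group_by_tag(example)
```

The greedy pass is the simplest possible choice; it assumes tags of different people are well separated, which crowded soccer scenes may not guarantee.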
52

KUMONEKOTABI L'art de la relation, la relation comme oeuvre d'art : Japon, emprunts, possibles et nuances comme dispositifs formels pour une figure photographique entre paysage, animal et détour. Passages. / Kumonekotabi The art of relationship – Relationship as a work of art : Japan, quotation, possibilities and nuances as visual configuration for a photographic figure between landscape, animal and detour. Passages.

Druet, Lucile 30 October 2014
Think about pictures to think about the world. Between unity, slivers and nuances, Kumonekotabi is a research work balanced between visual forms and written theory, both working as spaces where a photographic figure maintains a simple yet bound relationship with Japan. Built on visual devices such as folds, frames and series, the pictures construct a homemade but also performative logic that displays the captured body in its relationship with Japan, its dreams and its anchors. The written part is an analysis of those same pictures, bringing out the stories as well as the strengths and intertexts that inhabit their territories. Thus, at once theoretical, poetic and artistic, Kumonekotabi is a fan-like dissertation, intertwining visual and written experiences with Japanese culture. It is an opening and a closing, a creative interface, a space between slowness and speed, maturation and sudden flashes. A phenomenon that, through these entanglements, gives us tools to understand the relevance of this ephemeral archipelago created by the pictures and their simultaneity. In short, Kumonekotabi is a work that uses pictures and their interpretations to consider a certain idea of limit and gap, of contact and distance. It is an atmospheric work set in the dialogue between the art of relationship and relationship as a work of art, arriving at the idea that thinking about pictures allows us to understand a certain part of our world, and vice versa.
53

Estimation of Human Poses Categories and Physical Object Properties from Motion Trajectories

Fathollahi Ghezelghieh, Mona 22 June 2017
Despite the impressive advancements in people detection and tracking, safety is still a key barrier to the deployment of autonomous vehicles in urban environments [1]. For example, with non-autonomous vehicles there is implicit communication between people crossing the street and the driver, which lets pedestrians confirm that the driver has understood their intent. It is therefore crucial for an autonomous car to infer the future intent of a pedestrian quickly. We believe that human body orientation with respect to the camera can help the car's intelligent unit anticipate the future movement of pedestrians. To further improve pedestrian safety, it is important to recognize whether pedestrians are distracted, carrying a baby, or pushing a shopping cart. Estimating the fine-grained 3D pose, i.e. the (x,y,z)-coordinates of the body joints, therefore provides additional information for the decision-making units of driverless cars. In this dissertation, we propose a deep learning-based solution to classify categorized body orientation in still images. We also propose an efficient framework, based on our body orientation classification scheme, to estimate human 3D pose in monocular RGB images. Furthermore, we use the dynamics of human motion to infer body orientation in image sequences. To achieve this, we employ a recurrent neural network model to estimate continuous body orientation from the trajectories of body joints in the image plane. The proposed body orientation and 3D pose estimation frameworks are tested on the largest 3D pose estimation benchmark, Human3.6m (both in still images and video), and we demonstrate the efficacy of our approach by benchmarking it against state-of-the-art approaches. Another critical feature of a self-driving car is obstacle avoidance. In current prototypes the car either stops or changes lane, even if this causes other traffic disruptions. However, there are situations when it is preferable to collide with the object, for example a foam box, rather than take an action that could result in a much more serious accident than the collision itself. In this dissertation, for the first time, we present a novel method to discriminate between physical properties of such objects, such as bounciness and elasticity, based on their motion characteristics. The proposed algorithm is tested on synthetic data, and, as a proof of concept, its effectiveness is demonstrated on a limited set of real-world data.
54

Height Estimation of a Blimp Unmanned Aerial Vehicle Using Inertial Measurement Unit and Infrared Camera

Villeneuve, Hubert January 2017
Increasing demands in areas such as security, surveillance, search and rescue, and communication have promoted the research and development of unmanned aerial vehicles (UAVs), as such technologies can replace manned flights in dangerous or unfavorable conditions. Lighter-than-air UAVs such as blimps can carry higher payloads and stay in the air longer than typical heavier-than-air UAVs such as aeroplanes or quadrotors. One purpose of this thesis is to develop a sensor suite for estimating the position and orientation of a blimp UAV, currently in development, with respect to a reference point, enabling safer landing procedures with minimal on-board sensors. While the existing low-cost sensor package, including an inertial measurement unit (IMU) and a Global Positioning System (GPS) module, could be sufficient to estimate the pose of the blimp to a certain extent, the GPS module is not as precise in the short term, especially for altitude. The proposed system combines GPS and inertial data with information from a ground-based infrared (IR) camera. Image frames are processed to identify three IR LEDs located on the UAV, and each LED coordinate is estimated using a Perspective-n-Point (PnP) algorithm. The results from the PnP algorithm are then fused with the GPS, accelerometer and gyroscope measurements using an Extended Kalman Filter (EKF) to obtain a more accurate estimate of the position and orientation. Tests were conducted on a simulated blimp using the experimental avionics.
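To make the fusion step concrete, here is a deliberately simplified sketch: a linear Kalman filter over height and vertical speed only, which is much narrower than the full-pose EKF the abstract describes. It assumes the accelerometer reading is already gravity-compensated vertical acceleration and that the camera/PnP stage delivers a height measurement; the function names, noise values `q` and `r`, and the initial covariance are illustrative assumptions, not values from the thesis.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, float]  # (height in m, vertical speed in m/s)
Cov = List[List[float]]      # 2x2 covariance matrix


def predict(state: State, P: Cov, accel: float, dt: float, q: float) -> Tuple[State, Cov]:
    """Propagate the state with accelerometer input; F = [[1, dt], [0, 1]]."""
    h, v = state
    h_new = h + v * dt + 0.5 * accel * dt * dt
    v_new = v + accel * dt
    p00, p01, p10, p11 = P[0][0], P[0][1], P[1][0], P[1][1]
    # P <- F P F^T + q * I (assumed simple process-noise model)
    n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
    n01 = p01 + dt * p11
    n10 = p10 + dt * p11
    n11 = p11 + q
    return (h_new, v_new), [[n00, n01], [n10, n11]]


def update(state: State, P: Cov, z: float, r: float) -> Tuple[State, Cov]:
    """Correct with a height measurement z (e.g. from the camera); H = [1, 0]."""
    h, v = state
    s = P[0][0] + r                    # innovation variance
    k0, k1 = P[0][0] / s, P[1][0] / s  # Kalman gain
    y = z - h                          # innovation
    new_P = [
        [(1.0 - k0) * P[0][0], (1.0 - k0) * P[0][1]],
        [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]],
    ]
    return (h + k0 * y, v + k1 * y), new_P


def track_height(accels: Sequence[float], camera_heights: Sequence[float],
                 dt: float = 0.1, q: float = 1e-4, r: float = 0.01) -> State:
    """Run predict/update over paired accelerometer and camera samples."""
    state: State = (0.0, 0.0)             # deliberately poor initial guess
    P: Cov = [[100.0, 0.0], [0.0, 100.0]]  # large initial uncertainty
    for a, z in zip(accels, camera_heights):
        state, P = predict(state, P, a, dt, q)
        state, P = update(state, P, z, r)
    return state


# A hovering blimp at 10 m: zero acceleration, constant camera height.
height, speed = track_height([0.0] * 50, [10.0] * 50)
```

An EKF follows the same predict/update cycle but linearizes a nonlinear motion or measurement model at each step, which is needed once orientation enters the state.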
55

VECTOR REPRESENTATION TO ENHANCE POSE ESTIMATION FROM RGB IMAGES

Zongcheng Chu (8791457) 03 May 2020
Head pose estimation is an essential task in computer vision. Existing research on pose estimation from RGB images mainly uses either Euler angles or quaternions to predict pose. However, both Euler angle- and quaternion-based approaches encounter the problem of discontinuity when describing three-dimensional rotations. This issue makes learning visual patterns more difficult for the convolutional neural network (CNN), which in turn compromises estimation performance. To solve this problem, we introduce TriNet, a novel method based on three vectors converted from the three Euler angles (roll, pitch, yaw). The orthogonality of the three vectors enables us to implement a complementary multi-loss function, which effectively reduces the prediction error. Our method achieves state-of-the-art performance on the AFLW2000, AFW and BIWI datasets. We also extend our work to general object pose estimation and show results in the experiments section.
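The discontinuity argument can be illustrated with a small sketch that converts Euler angles into three orthogonal unit vectors, taken here as the columns of a rotation matrix. The axis order R = Rz(yaw) · Ry(pitch) · Rx(roll) is an assumption for the example; the abstract does not state which convention TriNet uses, and `euler_to_vectors` is a hypothetical helper name.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def euler_to_vectors(roll: float, pitch: float, yaw: float) -> List[Vec3]:
    """Return the three columns of R = Rz(yaw) @ Ry(pitch) @ Rx(roll).

    Angles are in radians. The ZYX order is an assumed convention.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    R = [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]
    # Column j of R is the j-th body axis expressed in the reference frame.
    return [(R[0][j], R[1][j], R[2][j]) for j in range(3)]


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


# Two nearly identical head poses whose yaw values sit on either side of
# the +/-180 degree wrap-around: far apart as angles, close as vectors.
pose_a = euler_to_vectors(0.0, 0.0, math.radians(179.0))
pose_b = euler_to_vectors(0.0, 0.0, math.radians(-179.0))
similarity = dot(pose_a[0], pose_b[0])
```

Here the yaw labels differ by 358 degrees while the corresponding vectors are almost parallel, which is the kind of label jump a vector target avoids.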
56

Generating 3D Scenes From Single RGB Images in Real-Time Using Neural Networks

Grundberg, Måns, Altintas, Viktor January 2021
The ability to reconstruct 3D scenes of environments is of great interest in a number of fields such as autonomous driving, surveillance, and virtual reality. However, traditional methods often rely on multiple cameras or sensor-based depth measurements to accurately reconstruct 3D scenes. In this thesis we propose an alternative, deep learning-based approach to 3D scene reconstruction for objects of interest, using nothing but single RGB images. We evaluate our approach using the Deep Object Pose Estimation (DOPE) neural network for object detection and pose estimation, and the NVIDIA Deep learning Dataset Synthesizer for synthetic data generation. Using two unique objects, our results indicate that it is possible to reconstruct 3D scenes from single RGB images with an error margin of a few centimeters.
57

Vehicle-pedestrian interaction using naturalistic driving video through tractography of relative positions and pedestrian pose estimation

Mueid, Rifat M. 11 April 2017
Indiana University-Purdue University Indianapolis (IUPUI) / Research on robust Pre-Collision Systems (PCS) requires new techniques that allow a better understanding of the vehicle-pedestrian dynamic relationship and that can predict future pedestrian movements. Our research analyzed videos from the Transportation Active Safety Institute (TASI) 110-Car naturalistic driving dataset to extract two dynamic pedestrian semantic features. The dataset consists of videos recorded with forward-facing cameras from 110 cars over a year in all weather and illumination conditions. This research focuses on potential-conflict situations, where a collision may happen if neither driver nor pedestrian takes avoidance action. We used 1000 such 15-second videos to find vehicle-pedestrian relative dynamic trajectories and the pose of pedestrians. Adaptive structural local appearance model and particle filter methods were implemented and modified to track the pedestrians more accurately. We developed a new algorithm to compute the Focus of Expansion (FoE) automatically. Automatically detected FoE height data have a correlation of 0.98 with carefully hand-clicked human annotations. We obtained correct tractography results for over 82% of the videos. For pose estimation, we used a flexible mixture model to capture co-occurrence between pedestrian body segments. Building on an existing single-frame human pose estimation model, we introduced Kalman filtering and temporal movement reduction techniques to produce stable stick-figure videos of the pedestrian's dynamic motion. We were able to reduce frame-to-frame pixel offset by 86% compared to the single-frame method. These tractographs and pose estimation data were used as features to train a neural network for classifying ‘potential conflict’ and ‘no potential conflict’ situations. Training of the network achieved 91.2% true label accuracy and 8.8% false label accuracy. Finally, the trained network was used to assess the probability of collision over time for the 15-second videos, which generates a spike when there is a ‘potential conflict’ situation. We also tested our method with TASI mannequin crash data. With the crash data we were able to get a danger spike for 70% of the videos. The research enables new analysis of potential-conflict pedestrian cases with 2D tractography data and stick-figure pose representation of pedestrians, which provides significant insight into the vehicle-pedestrian dynamics that are critical for safe autonomous driving and transportation safety innovations.
58

Contributions on 3D Human Computer-Interaction using Deep approaches

Castro-Vargas, John Alejandro 16 March 2023
There are many challenges facing society today, both socially and industrially. Whether the aim is to improve productivity in factories or to improve people's quality of life in their homes, technological advances in robotics and computing have led to solutions to many problems in modern society. These areas are of great interest and in constant development, especially in societies with a relatively ageing population. In this thesis, we address different challenges in which robotics, artificial intelligence and computer vision are used as tools to propose solutions oriented to home assistance. These challenges can be organised into three main groups: “Grasping Challenges”, where we address the problem of performing robot grasping in domestic environments; “Hand Interaction Challenges”, where we address the detection of static and dynamic hand gestures using approaches based on deep learning and geometric learning; and finally, “Human Behaviour Recognition”, where, using a machine learning model based on hyperbolic geometry, we seek to group the actions performed in a video sequence.
59

Using Pitch Tipping for Baseball Pitch Prediction

Ishii, Brian 01 June 2021
Data analytics and technology have changed baseball as we know it. From the increase in defensive shifts to teams using cameras in the outfield to steal signs, teams will try anything to win. One way to gain an edge in baseball is to figure out what pitches a pitcher will throw. Pitch prediction is a popular task given all the data that baseball provides. Most methods use situational data such as the ball and strike count. In this paper, we try a different method of predicting pitch type by looking only at the pitcher's pose in the set position. We do this to find a pitcher's tell, or "tip". In baseball, a pitcher who is tipping their pitches is doing something that gives away what they will throw. This could be because the pitcher changes the grip on the ball only for some pitches, or something as small as a different flex in their wrist. Professional baseball players study pitchers before they pitch the ball to try to pick up on these tips. If a tip is found, the batters have a significant advantage over the pitcher. Our paper uses pose estimation and object detection to predict the pitch type based on the pitcher's set position before throwing the ball. Given a successful model, we can extract the important features, or the potential tip, from the data. Then we can try to predict the pitches ourselves, as a batter would. We tested this method on three pitchers: Tyler Glasnow, Yu Darvish, and Stephen Strasburg. Our results demonstrate that when we predict pitch type at 70% accuracy, we can reasonably extract useful features. However, finding a useful tip from these features still requires manual observation.
60

Exploring the Feasibility of Machine Learning Techniques in Recognizing Complex Human Activities

Hu, Shengnan 01 January 2023
This dissertation introduces several technical innovations that improve the ability of machine learning models to recognize a wide range of complex human activities. As human sensor data becomes more abundant, the need to develop algorithms for understanding and interpreting complex human actions has become increasingly important. Our research focuses on three key areas: multi-agent activity recognition, multi-person pose estimation, and multimodal fusion. To tackle the problem of monitoring coordinated team activities from spatio-temporal traces, we introduce a new framework that incorporates field of view data to predict team performance. Our framework uses Spatial Temporal Graph Convolutional Networks (ST-GCN) and recurrent neural network layers to capture and model the dynamic spatial relationships between agents. The second part of the dissertation addresses the problem of multi-person pose estimation (MPPE) from video data. Our proposed technique (Language Assisted Multi-person Pose estimation) leverages text representations from multimodal foundation models to learn a visual representation that is more robust to occlusion. By infusing semantic information into pose estimation, our approach enables precise estimations, even in cluttered scenes. The final part of the dissertation examines the problem of fusing multimodal physiological input from cardiovascular and gaze tracking sensors to exploit the complementary nature of these modalities. When dealing with multimodal features, uncovering the correlations between different modalities is as crucial as identifying effective unimodal features. This dissertation introduces a hybrid multimodal tensor fusion network that is effective at learning both unimodal and bimodal dynamics. The outcomes of this dissertation contribute to advancing the field of complex human activity recognition by addressing the challenges associated with multi-agent activity recognition, multi-person pose estimation, and multimodal fusion. The proposed innovations have potential applications in various domains, including video surveillance, human-robot interaction, sports analysis, and healthcare monitoring. By developing intelligent systems capable of accurately recognizing complex human activities, this research paves the way for improved safety, efficiency, and decision-making in a wide range of real-world applications.
