231 |
Geometric Invariance In The Analysis Of Human Motion In Video Data
Shen, Yuping 01 January 2009
Human motion analysis is one of the major problems in computer vision research. It deals with the study of the motion of the human body in video data from different aspects, ranging from the tracking of body parts and the reconstruction of 3D human body configuration to higher-level interpretation of human actions and activities in image sequences. When human motion is observed through a video camera, it is perspectively distorted and may appear totally different from different viewpoints. It is therefore highly challenging to establish correct relationships between human motions across video sequences with different camera settings. In this work, we investigate the geometric invariance in the motion of the human body, which is critical for accurately understanding human motion in video data regardless of variations in camera parameters and viewpoints. In human action analysis, the representation of human action is a very important issue, and it usually determines the nature of the solutions, including their limits in resolving the problem. Unlike existing research that studies human motion as a whole 2D/3D object or a sequence of postures, we study human motion as a sequence of body pose transitions. We also decompose a human body pose further into a number of body point triplets, and break down a pose transition into the transition of a set of body point triplets. In this way, the study of the complex non-rigid motion of the human body is reduced to that of the motion of rigid body point triplets, i.e. a collection of planes in motion. As a result, projective geometry and linear algebra can be applied to explore the geometric invariance in human motion. Based on this formulation, we have discovered the fundamental ratio invariant and the eigenvalue equality invariant in human motion. We also propose solutions based on these geometric invariants to the problems of view-invariant recognition of human postures and actions, as well as the analysis of human motion styles.
These invariants and their applicability have been validated by experimental results that support their effectiveness in understanding human motion under various camera parameters and viewpoints.
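The decomposition of a pose into body point triplets can be illustrated with a minimal sketch. The joint names, the pixel coordinates and the collinearity filter below are assumptions made for illustration; the abstract does not say which joints or triplets the thesis uses, and the invariants themselves are not implemented here.

```python
from itertools import combinations

# Hypothetical 2D body keypoints for one pose: joint name -> (x, y) in pixels.
pose = {
    "head": (50.0, 10.0),
    "l_shoulder": (40.0, 30.0),
    "r_shoulder": (60.0, 30.0),
    "l_hip": (45.0, 70.0),
    "r_hip": (55.0, 70.0),
}


def body_point_triplets(pose):
    """Return every unordered triplet of joint names in the pose."""
    return list(combinations(sorted(pose), 3))


def triplet_is_degenerate(pose, triplet, eps=1e-9):
    """True when the three image points are (nearly) collinear.

    Assumption for this sketch: a collinear triplet is skipped because
    three collinear points do not span a triangle in the image.
    """
    (x1, y1), (x2, y2), (x3, y3) = (pose[name] for name in triplet)
    twice_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    return abs(twice_area) < eps


triplets = body_point_triplets(pose)
usable = [t for t in triplets if not triplet_is_degenerate(pose, t)]
print(len(triplets), "triplets,", len(usable), "non-degenerate")
```

With n tracked joints this yields n(n-1)(n-2)/6 triplets, so a pose transition between two frames would be described by the motion of each triplet rather than by the body as a whole.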
|
232 |
Take the Lead: Toward a Virtual Video Dance Partner
Farris, Ty 01 August 2021
My work takes the movement of one dance partner as input and predicts the intentional movement of the other partner. Human pose estimation has been applied to dance and computer vision, but many existing applications focus on a single individual or on multiple individuals performing. Currently, very few works focus specifically on dance couples combined with pose prediction. This thesis is applicable to the entertainment and gaming industry, where people could be trained to dance with a virtual dance partner.
Many existing interactive or virtual dance partners require a motion capture system, multiple cameras or a robot, which makes them expensive. This thesis does not use a motion capture system; instead it combines OpenPose with swing dance YouTube videos to create a virtual dance partner. By taking in the current dancer's moves as input, the system predicts the dance partner's corresponding moves in the video frames.
In order to create a virtual dance partner, datasets that contain information about the skeleton keypoints are necessary to predict a dance partner's pose. There are existing dance datasets for a specific type of dance, but these datasets do not cover swing dance. Furthermore, the dance datasets that do include swing have a limited number of videos. The contribution of this thesis is a large swing dataset that contains three different types of swing dance: East Coast, Lindy Hop and West Coast. I also provide a basic framework to extend the work to create a real-time and interactive dance partner.
|
233 |
Reinforcement learning for robotic manipulation / Reinforcement learning för manipulering med robot
Arnekvist, Isac January 2017
Reinforcement learning was recently used successfully for real-world robotic manipulation tasks, without the need for human demonstration, using a normalized advantage function (NAF) algorithm. Limitations on the shape of the advantage function, however, raise doubts about what kinds of policies can be learned using this method. For similar tasks, convolutional neural networks have been used for pose estimation from images taken with fixed-position cameras. For some applications, however, a fixed camera might not be a valid assumption. It has also been shown that the quality of policies for robotic tasks deteriorates severely under small camera offsets. This thesis investigates the use of NAF for a pushing task with clear multimodal properties. The results are compared with those of a deterministic policy with minimal constraints on the Q-function surface. Methods for pose estimation using convolutional neural networks are further investigated, especially with regard to randomly placed cameras with unknown offsets. By defining the coordinate frame of objects with respect to some visible feature, it is hypothesized that relative pose estimation can be accomplished even when the camera is not fixed and the offset is unknown. NAF is successfully implemented to solve a simple reaching task on a real robotic system where data collection is distributed over several robots and learning is done on a separate server. Using NAF to learn a pushing task fails to converge to a good policy, both on the real robots and in simulation. Deep deterministic policy gradient (DDPG) is instead used in simulation and successfully learns to solve the task. The learned policy is then applied on the real robots and solves the task in the real setting as well. Pose estimation from fixed-position camera images is learned, and the policy is still able to solve the task using these estimates.
By defining a coordinate frame from an object visible to the camera, in this case the robot arm, a neural network learns to regress the pushable object's pose in this frame without the assumption of a fixed camera. However, the predictions were too imprecise to be used for solving the pushing task. Further modifications to this approach could, however, prove to be a feasible solution for randomly placed cameras with unknown poses. / [Swedish abstract, translated] Reinforcement learning has recently been used successfully to teach non-simulated robots tasks with the help of a normalized advantage function (NAF) algorithm, without using human demonstrations. Restrictions on the function surfaces used may, however, prove problematic for generalization to other tasks. For pose estimation, convolutional neural networks have been used in similar contexts with images from a camera at a constant position. In some applications, however, the camera cannot be guaranteed to hold a constant position, and studies have shown that the quality of policies degrades sharply when the camera is moved. This thesis investigates the use of NAF to learn a pushing task with clear multimodal properties. The results are compared with the use of a deterministic policy with minimal restrictions on the Q-function surface. The use of convolutional neural networks for pose estimation is further investigated, especially with regard to randomly placed cameras with unknown placement. By defining the coordinate frame for objects relative to a visible reference object, relative pose estimation is believed to be feasible even when the camera is mobile and the displacement is unknown. In this thesis NAF is applied successfully to simpler problems where data collection is distributed over several robots and learning takes place on a central server. When applied to the pushing task, however, NAF fails, both when training on real robots and in simulation.
Deep deterministic policy gradient (DDPG) is instead applied to the problem and successfully learns to solve it in simulation. The learned policy is then applied successfully on real robots. Pose estimation using a fixed camera is also implemented successfully. By defining a coordinate system from an object in the image with a known position, in this case the robot arm, the positions of other objects can be described in this coordinate frame with the help of neural networks. However, the precision proves too low to be applied on robots. The results nevertheless show that this method, with further extensions and modifications, could solve the problem.
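The idea of expressing an object's pose relative to a visible reference, so that the result does not depend on where the camera sits, can be sketched with planar rigid transforms. This is a geometric illustration only, under the assumption of exact 2D poses `(x, y, theta)`; the function names and numbers are hypothetical, and the thesis regresses such poses with a neural network rather than computing them from known camera-frame poses.

```python
import math


def compose(a, b):
    """Compose planar rigid transforms: apply b, then a. Each is (x, y, theta)."""
    ax, ay, at = a
    bx, by, bt = b
    c, s = math.cos(at), math.sin(at)
    return (ax + c * bx - s * by, ay + s * bx + c * by, at + bt)


def invert(a):
    """Inverse of a planar rigid transform (x, y, theta)."""
    ax, ay, at = a
    c, s = math.cos(at), math.sin(at)
    return (-(c * ax + s * ay), s * ax - c * ay, -at)


def relative_pose(ref_in_cam, obj_in_cam):
    """Pose of the object expressed in the reference object's frame."""
    return compose(invert(ref_in_cam), obj_in_cam)


# Hypothetical camera-frame poses of the robot arm (reference) and the object.
ref = (1.0, 2.0, 0.3)
obj = (2.0, 1.0, -0.5)
rel_before = relative_pose(ref, obj)

# Move the camera by an unknown offset: both poses change in the camera frame...
offset = (0.4, -0.2, 1.1)
rel_after = relative_pose(compose(offset, ref), compose(offset, obj))

# ...but the object's pose in the reference frame is unchanged.
print(rel_before)
print(rel_after)
```

The camera offset cancels algebraically, which is why a pose defined relative to the robot arm remains meaningful for a randomly placed camera, provided the estimates are precise enough.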
|
234 |
Cooperative Navigation of Autonomous Vehicles in Challenging Environments
Forsgren, Brendon Peter 18 September 2023
As the capabilities of autonomous systems have increased so has interest in utilizing teams of autonomous systems to accomplish tasks more efficiently. This dissertation takes steps toward enabling the cooperation of unmanned systems in scenarios that are challenging, such as GPS-denied or perceptually aliased environments. This work begins by developing a cooperative navigation framework that is scalable in the number of agents, robust against communication latency or dropout, and requires little a priori information. Additionally, this framework is designed to be easily adopted by existing single-agent systems with minimal changes to existing software and software architectures. All systems in the framework are validated through Monte Carlo simulations. The second part of this dissertation focuses on making cooperative navigation robust in challenging environments. This work first focuses on enabling a more robust version of pose graph SLAM, called cycle-based pose graph optimization, to be run in real-time by implementing and validating an algorithm to incrementally approximate a minimum cycle basis. A new algorithm is proposed that is tailored to multi-agent systems by approximating the cycle basis of two graphs that have been joined. These algorithms are validated through extensive simulation and hardware experiments. The last part of this dissertation focuses on scenarios where perceptual aliasing and incorrect or unknown data association are present. This work presents a unification of the framework of consistency maximization, and extends the concept of pairwise consistency to group consistency. This work shows that by using group consistency, low-degree-of-freedom measurements can be rejected in high-outlier regimes if the measurements do not fit the distribution of other measurements. The efficacy of this method is verified extensively using both simulation and hardware experiments.
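To make the notion of a cycle basis of joined pose graphs more concrete, the sketch below counts how many independent cycles a pose graph has before and after two agents' graphs are connected. It only computes the size of a cycle basis (edges minus nodes plus connected components); it is not the incremental minimum-cycle-basis approximation proposed in the dissertation, and the graphs and node numbering are hypothetical.

```python
def cycle_basis_size(num_nodes, edges):
    """Number of independent cycles: |E| - |V| + (connected components)."""
    parent = list(range(num_nodes))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = num_nodes
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return len(edges) - num_nodes + components


# Agent A: poses 0..3 with odometry edges and one loop closure (3, 0).
edges_a = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Agent B: poses 4..6 with odometry edges only.
edges_b = [(4, 5), (5, 6)]
# Two inter-agent measurements join the graphs.
edges_joined = edges_a + edges_b + [(1, 4), (3, 6)]

print(cycle_basis_size(7, edges_a + edges_b))  # cycles before joining
print(cycle_basis_size(7, edges_joined))       # cycles after joining
```

In this example the first inter-agent edge only connects the two graphs, while the second one closes a new cycle that spans both agents.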
|
235 |
Fast Recognition and Pose Estimation for the Purpose of Bin-Picking Robotics
Lonsberry, Alexander J. January 2011
No description available.
|
236 |
Model-Based Human Pose Estimation with Spatio-Temporal Inferencing
Zhu, Youding 15 July 2009
No description available.
|
237 |
Spot the Pain: Exploring the Application of Skeleton Pose Estimation for Automated Pain Assessment
Hjelm Gardner, Angelica January 2022
Automated pain assessment is emerging as an essential part of pain management in areas such as healthcare, rehabilitation, sports and fitness. These automated systems are based on machine learning applications and can provide reliable, objective and cost-effective benefits. To enable an automated approach, at least one channel of sensory input, known as modality, must be available to the system. So far, most studies of automated pain assessment have focused on facial expressions or physiological signals, and although body gestures are considered to be indicators of pain, not much attention has been paid to this modality. Using skeleton pose estimation, we can model body gestures and investigate how body movement information affects pain assessment performance in different approaches. In this study, we explored approaches to pain assessment using skeleton pose estimation for three objectives: pain recognition, pain intensity estimation, and pain area classification. Because pain is a complex experience and is often expressed across multiple modalities, we analysed both unimodal approaches using only body data and bimodal approaches using skeleton pose estimation with facial expressions and head pose. In our experiments, we trained models based on two deep learning architectures: a hybrid CNN-BiLSTM and a recurrent CNN (RCNN), on a real-world dataset consisting of video recordings of people performing an overhead deep squat exercise. We also investigated bimodal fusion of body and face modalities in three different strategies: early fusion, late fusion and ensemble learning. Although our results are still preliminary, they show promising indications and possible future improvements. The best performance was obtained with ensemble for pain recognition (AUC 0.71), unimodal body CNN-BiLSTM for pain intensity estimation (AUC 0.75) and late fusion of body and face modalities using RCNN for pain area classification (AUC 0.75). 
Our experimental results demonstrate the feasibility of using skeleton pose estimation to represent body modality, the importance of incorporating body movements into automated pain assessment, and the exploration of the previously understudied assessment objective of localising pain areas in the body.
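Late fusion of the body and face modalities can be illustrated as combining the class probabilities that two separately trained models output. The weighted average used below is an assumption chosen for simplicity; the abstract does not state which fusion rule the thesis applies, and the probability values are made up.

```python
def late_fusion(body_probs, face_probs, body_weight=0.5):
    """Fuse per-class probabilities from two unimodal models by weighted averaging."""
    if len(body_probs) != len(face_probs):
        raise ValueError("both models must score the same classes")
    fused = [
        body_weight * b + (1.0 - body_weight) * f
        for b, f in zip(body_probs, face_probs)
    ]
    total = sum(fused)
    return [p / total for p in fused]  # renormalise so the scores sum to 1


# Hypothetical outputs for the classes (pain, no pain).
body = [0.7, 0.3]
face = [0.4, 0.6]
fused = late_fusion(body, face)
predicted = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted)
```

Early fusion would instead concatenate the body and face features before a single model sees them, so the two strategies differ in where the modalities are combined, not in what data they use.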
|
238 |
LEARNING GRASP POLICIES FOR MODULAR END-EFFECTORS OF MOBILE MANIPULATION PLATFORMS IN CLUTTERED ENVIRONMENTS
Juncheng Li (18418974) 22 April 2024
This dissertation presents the findings and research conducted during my Ph.D. study, which focuses on developing grasp policies for modular end-effectors on mobile manipulation platforms operating in cluttered environments. The primary objective of this research is to enhance the performance and accuracy of robotic manipulation systems in complex, real-world scenarios. The work has potential implications for various domains, including the rapidly growing Industry 4.0 and the advancement of autonomous systems in space habitats.
The dissertation offers a comprehensive literature review, emphasizing the challenges faced by mobile manipulation platforms in cluttered environments and the state-of-the-art techniques for grasping and manipulation. It showcases the development and evaluation of a Modular End-Effector System (MEES) for mobile manipulation platforms, which includes the investigation of object 6D pose estimation techniques, the generation of a deep learning-based grasping dataset for MEES, the development of a suction cup gripper grasping policy (Sim-Suction), the development of a two-finger grasping policy (Sim-Grasp), and the integration of Modular End-Effector System grasping policy (Sim-MEES). The proposed methodology integrates hardware designs, control algorithms, data-driven methods, and large language models to facilitate adaptive grasping strategies that consider the unique constraints and requirements of cluttered environments.
Furthermore, the dissertation discusses future research directions, such as further investigating the Modular End-Effector System grasping policy. This Ph.D. study aims to contribute to the advancement of robotic manipulation technology, ultimately enabling more versatile and robust mobile manipulation platforms capable of effectively interacting with complex environments.
|
239 |
Human Pose and Action Recognition using Negative Space Analysis
Janse Van Vuuren, Michaella 12 1900
This thesis proposes a novel approach to extracting pose information from image sequences. Current state of the art techniques focus exclusively on the image space occupied by the body for pose and action recognition. The method proposed here, however, focuses on the negative spaces: the areas surrounding the individual. This has resulted in the colour-coded negative space approach, an image preprocessing step that circumvents the need for complicated model fitting or template matching methods. The approach can be described as follows: negative spaces surrounding the human silhouette are extracted using horizontal and vertical scanning processes. These negative space areas are more numerous, and undergo more radical changes in shape than the single area occupied by the figure of the person performing an action. The colour-coded negative space representation is formed using the four binary images produced by the scanning processes. Features are then extracted from the colour-coded images. These are based on the percentage of area occupied by distinct coloured regions as well as the bounding box proportions. Pose clusters are identified using feedback from an independent action set. Subsequent images are classified using a simple Euclidean distance measure. An image sequence is thus temporally segmented into its corresponding pose representations. Action recognition simply becomes the detection of a temporally ordered sequence of poses that characterises the action. The method is purely vision-based, utilising monocular images with no need for body markers or special clothing. Two datasets were constructed using several actors performing different poses and actions. Some of these actions included actors waving their arms, sitting down or kicking a leg. These actions were recorded against a monochrome background to simplify the segmentation of the actors from the background. The actions were then recorded on DV cam and digitised into a data base. 
The silhouette images from these actions were isolated and placed in a frame or bounding box. The next step was to highlight the negative spaces using a directional scanning method. This scanning method colour-codes the negative spaces of each action. It became immediately apparent that very distinctive colour patterns formed for different actions. To emphasise the action, different colours were allocated to the negative spaces surrounding the silhouette. For example, the space between the legs of an actor standing in a T-pose with legs apart would be allocated yellow, while the spaces below the arms were allocated different shades of green. The space surrounding the head would be different shades of purple. During an action in which the actor moves one leg up in a kicking fashion, the yellow area would increase. Conversely, when the actor brings his legs together, the yellow area filling the negative space would decrease substantially. It also became apparent that these coloured negative spaces are interdependent and influence each other during the course of an action. For example, when an actor lifts one of his legs, increasing the yellow-coded negative space, the green space between that leg and the arm decreases. This interrelationship between colours holds true for all poses and actions presented in this thesis. In terms of pose recognition, it is significant that these colour-coded negative spaces, and the way they change during an action or a movement, are substantial and instantly recognisable. Compare, for example, looking at someone lifting an arm with seeing a vast negative space changing shape. In a controlled research environment, several actors were instructed to perform a number of different actions. After colour-coding the negative spaces, it became apparent that every action can be recognised by a unique colour-coded pattern.
The challenge is to ascribe a numerical representation, a mathematical quantification, that extracts the essence of what is so visually apparent. The essence of pose recognition and its measurability lies in the relationship between the colours in these negative spaces and how they affect each other during a pose or an action. The simplest way of measuring this relationship is to calculate the percentage of each colour present during an action. These calculated percentages become the basis of pose and action recognition. Plotting these percentages on a graph confirms that the essence of these different actions and poses can in fact be captured and recognised. Despite variations in these traces caused by time differences, personal appearance and mannerisms, what emerged is a clear, recognisable pattern that can be matched to an action or to different parts of an action. Actors might lift their left leg, some slightly higher than others, some more slowly than others, and these variations in colour percentages would be recorded as a trace, but there would be very specific stages during the action where the traces correspond, making the action recognisable. In conclusion, using negative space as a tool in human pose and tracking recognition presents an exciting research avenue because it is influenced less by variations such as differences in personal appearance and changes in the angle of observation. This approach is also simple and does not rely on complicated models and templates.
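The colour-percentage features and the Euclidean-distance classification described in this abstract can be sketched as follows. The tiny labelled frame, the colour names and the two pose clusters are hypothetical, and the sketch omits the bounding box proportions and the scanning process that produces the colour-coded image in the thesis.

```python
import math
from collections import Counter


def colour_percentages(labels, colours):
    """Percentage of the frame covered by each colour-coded negative space."""
    counts = Counter(pixel for row in labels for pixel in row)
    total = sum(counts.values())
    return [100.0 * counts[c] / total for c in colours]


def nearest_pose(features, pose_clusters):
    """Assign the feature vector to the closest pose cluster (Euclidean distance)."""
    return min(pose_clusters, key=lambda name: math.dist(features, pose_clusters[name]))


COLOURS = ("yellow", "green", "purple")

# Hypothetical 2x4 colour-coded frame; "body" marks silhouette pixels.
frame = [
    ["yellow", "yellow", "body", "green"],
    ["yellow", "body", "body", "green"],
]

# Hypothetical cluster centres in the same feature space.
pose_clusters = {
    "legs_apart": [40.0, 20.0, 0.0],
    "legs_together": [5.0, 30.0, 0.0],
}

features = colour_percentages(frame, COLOURS)
print(features, nearest_pose(features, pose_clusters))
```

Classifying every frame this way turns an image sequence into a sequence of pose labels, and an action can then be detected as a particular temporal order of those labels.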
|
240 |
El proceso de animación de personajes en los largometrajes 3D contemporáneos
Sanz Mariscal, Alberto 24 July 2023
[ES, translated] This thesis focuses on 3D animation and, more specifically, on character animation for an acting shot in a CGI feature film.
Animation has held an important place in the world of entertainment from the beginning of the 20th century until now, above all in recent decades, when its presence in short films, advertisements, video games, television series and films has filled our cinemas and our homes with a great variety of stories, fantastic characters and graphic symbols that form part of our most deeply rooted popular culture.
As a 3D animator, you work and grow as a professional from one project to the next, but you do not ask yourself how much you know. You are only concerned with doing your job as well as possible and standing out. When I began to combine my professional work with teaching, I realised that I was unable to explain to students how to learn animation properly, because animating very well is one thing, but knowing how to explain how you do it is quite another. This research set me the challenge of compiling all the knowledge I have acquired through books, documentaries, teachers and colleagues, and organising it so that it would be easy to understand and accessible to anyone who wants to learn animation or improve as a 3D animator. / [CA, translated] This thesis focuses on 3D animation and, more specifically, on character animation to create an acting shot in a CGI feature film.
Animation has held an important place in the world of entertainment from the beginning of the 20th century until now, above all in recent decades, when its presence in short films, advertisements, video games, television series and films has filled our cinemas and our homes with a great variety of stories, fantastic characters and graphic symbols that form part of our most deeply rooted popular culture.
As a 3D animator, you work and grow as a professional from one project to the next, but you do not ask yourself how much you know. You are only concerned with doing your job as well as possible and standing out. When I began to combine my professional work with teaching, I realised that I was unable to explain to students how to learn animation properly, since animating very well is one thing, but knowing how to explain how you do it is quite another. This research set me the challenge of compiling all the knowledge I have acquired through books, documentaries, teachers and colleagues, and organising it so that it would be easy to understand and accessible to anyone who wants to learn animation or improve as a 3D animator. / [EN] This thesis focuses on 3D animation and, more specifically, on the animation of characters for an acting shot in a CGI feature film.
Animation has taken an important place in the world of entertainment from the beginning of the 20th century until now; especially in recent decades, where its presence in short films, commercials, video games, TV series and movies has filled our cinemas and our homes with a wide variety of stories, fantastic characters and graphic symbols that are part of our deep-rooted popular culture.
As a 3D animator, you work and grow from one project to the next as a professional, but you don't think about how much you know. You are only concerned about doing your work the best you can and standing out in your field. When I started to balance my professional career with teaching, I realised that I was not able to explain to my students how to learn animation in a proper way, because animating very well is not the same as explaining how to do it. This research posed the challenge of compiling all the knowledge I had acquired through books, documentaries, teachers and colleagues, and organising it in such a way that it would be easy to understand and accessible to anyone who wanted to learn animation or improve as a 3D animator. / Sanz Mariscal, A. (2023). El proceso de animación de personajes en los largometrajes 3D contemporáneos [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/195345
|