21

Neural representations of environmental features in retrosplenial cortex and 3-dimensional reconstruction of animal pose

Carstensen, Lucas 10 February 2024 (has links)
The behavior of animals is often complex and requires them to interact with their surroundings. Within the brain, there are specialized neural systems in place to create and store representations of space. In order to effectively utilize these spatial mappings, there must be coordination between sub-cortical systems, which are responsible for allocentric spatial processing, and cortical regions, which handle sensory processing in egocentric coordinates. The retrosplenial cortex (RSC) is a candidate for facilitating the transformation between these coordinate frames, as it is anatomically located between the hippocampus and sensory cortical areas and exhibits both egocentric and allocentric spatial responses. The first experiment explored the firing properties of neurons in the retrosplenial cortex in response to boundaries defined in egocentric coordinates. Based on previous research conducted in our lab, which showed that cells in the striatum respond to such boundaries, rats were implanted with electrodes to record the activity of retrosplenial neurons while they roamed freely in an open field. The response properties of these neurons were analyzed as the arena was expanded, its shape altered, and boundaries were added and removed. A subgroup of neurons, referred to as Egocentric Boundary Cells, showed increased or decreased firing when the rat was at a specific distance and direction from any of the arena's boundaries. These cells showed no preference for any particular boundary, and their firing was not biased by the animal's angular velocity or turning behavior, suggesting that they respond to boundaries in a general manner, regardless of the features of the boundary or the animal's behavior. Building on experiment one, the second experiment examined the behavior of retrosplenial neurons in rats during open field exploration when alterations were made to the environment. In three separate sessions, a subset of the recorded neurons showed either an increase or a decrease in mean firing rate throughout the session in which the environment was altered, then returned to prior levels when the environment was restored to a familiar configuration. These alterations included the introduction of an object, rotation of boundaries, expansion of boundaries, changes in the arena's geometry, and removal of boundaries. Similar proportions of neurons exhibited increases or decreases in firing rate for all experimental manipulations. Furthermore, the majority of retrosplenial neurons showed strong speed sensitivity. Some neurons increased their firing rate as speed increased, others decreased it, and some had a specific speed at which their firing rate peaked. These results support the idea that the RSC is involved in contextual and memory processing and scene processing, as well as in transmitting information about self-movement to downstream regions. The third experiment analyzed the different poses of rats as they moved through open field arenas, using simultaneous high-resolution thermal, depth, and RGB cameras. These recordings were organized into a new open-source dataset called Rodent3D. The three-dimensional posture of the animal was reconstructed from the two-dimensional videos using a model called OptiPose. We investigated aspects of the environment where the animals spent the most time looking, such as boundaries and objects, and the frequency and duration of behaviors such as rearing and changes in heading.
Finally, we discuss the significance of our model and the potential uses of our unique dataset for the fields of neuroscience and computer vision, as well as future research plans. These experiments demonstrated that the retrosplenial cortex is a vital region for spatial processing, with a particular emphasis on egocentric responses. We show that aspects of the environment, such as boundaries, the presence of objects, and changes to local features, induce multiple changes in the spatial firing properties of neurons. We also provide a novel open-source model and dataset that offer an innovative tool for more rigorous behavioral analysis across many disciplines. Altogether, these results suggest that the RSC serves as a hub for egocentric spatial processing and the coordination of spatial reference frames in support of spatial memory and navigation.
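The egocentric boundary analysis described above lends itself to a compact illustration. The following is a minimal sketch, not the authors' analysis code: the arena size, frame rate, binning, and variable names are all assumptions. It computes an egocentric boundary rate map from tracked positions, headings, and spike counts by ray-casting from the animal to the walls of a square arena and binning firing by egocentric distance and bearing.

```python
# Sketch of an egocentric boundary rate map (hypothetical inputs: positions in
# cm, headings in radians, spike counts per video frame, a square arena).
import numpy as np

def egocentric_boundary_ratemap(pos, heading, spikes, arena=100.0,
                                n_dist=20, n_ang=36, dt=1 / 30):
    """pos: (T,2) x-y positions; heading: (T,) allocentric headings;
    spikes: (T,) spike counts per frame. Returns rate (Hz) per egocentric bin."""
    angles = np.linspace(-np.pi, np.pi, n_ang, endpoint=False)
    occ = np.zeros((n_dist, n_ang))   # occupancy time per bin
    spk = np.zeros((n_dist, n_ang))   # spike count per bin
    dist_edges = np.linspace(0, arena / 2, n_dist + 1)
    for t in range(len(pos)):
        for j, a in enumerate(angles):
            phi = heading[t] + a                 # egocentric bearing -> allocentric
            dx, dy = np.cos(phi), np.sin(phi)
            ds = []                              # distances to each wall along the ray
            if dx > 0: ds.append((arena - pos[t, 0]) / dx)
            if dx < 0: ds.append(-pos[t, 0] / dx)
            if dy > 0: ds.append((arena - pos[t, 1]) / dy)
            if dy < 0: ds.append(-pos[t, 1] / dy)
            k = np.searchsorted(dist_edges, min(ds)) - 1
            if 0 <= k < n_dist:
                occ[k, j] += dt
                spk[k, j] += spikes[t]
    with np.errstate(invalid="ignore", divide="ignore"):
        return spk / occ
```

An Egocentric Boundary Cell of the kind reported above would show a single firing field in this map, at its preferred wall distance and bearing, regardless of which wall produced it.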
22

Difference-Based Temporal Module for Monocular Category-Level 6 DoF Object Pose Tracking

Chen, Zishen 22 January 2024 (has links)
Monocular 6DoF pose tracking has many applications in augmented reality, robotics, and other areas, and with the rise of deep learning, new approaches such as category-level models have proven successful. The temporal information in sequential data is essential for both online and offline tasks, and it can help boost the quality of predictions in the presence of unexpected disturbances such as occlusion and vibration. In 2D object detection and tracking, substantial research has leveraged temporal information to improve model performance. Nevertheless, lifting this temporal processing to 3D space is challenging because of the ambiguity of the visual data. In this thesis, we propose a method to calculate the temporal difference of points and pixels under the assumption that the K nearest points share similar features. Features extracted from these differences are learned to weight the relevant points in the temporal sequence and aggregate them to support the current frame's prediction. We propose a novel difference-based temporal module that incorporates both RGB and 3D point data in a temporal sequence. This module can be easily integrated with any category-level 6DoF pose tracking model that uses RGB and 3D points as input. We evaluate the module on two state-of-the-art category-level 6D pose tracking models, and the results show that it increases the models' accuracy and robustness in complex scenarios.
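As a concrete illustration of the K-nearest-point idea described above, here is a minimal sketch, not the thesis implementation: in the actual module the difference features feed a learned network, for which a softmax over difference magnitude stands in here, and all names are illustrative.

```python
# Aggregate temporal support from a previous frame's points into the current
# frame's per-point features, weighting neighbours by feature difference.
import numpy as np

def knn_temporal_aggregate(cur_pts, cur_feat, prev_pts, prev_feat, k=4):
    """cur_pts: (N,3), prev_pts: (M,3); cur_feat: (N,C), prev_feat: (M,C),
    all float arrays. Returns (N,C) temporally augmented features."""
    out = np.empty_like(cur_feat)
    for i in range(len(cur_pts)):
        # K nearest previous-frame points to the current point
        d2 = np.sum((prev_pts - cur_pts[i]) ** 2, axis=1)
        nn = np.argsort(d2)[:k]
        diff = prev_feat[nn] - cur_feat[i]        # temporal difference features
        # stand-in for the learned weighting: similar features get more weight
        w = np.exp(-np.linalg.norm(diff, axis=1))
        w /= w.sum()
        support = (w[:, None] * prev_feat[nn]).sum(axis=0)
        out[i] = cur_feat[i] + support            # aggregate temporal support
    return out
```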
23

Performance Enhancements of the Spin-Image Pose Estimation Algorithm

Gerlach, Adam R. 12 April 2010 (has links)
No description available.
24

Improving the kinematic control of robots with computer vision

Fallon, J. Barry 06 June 2008 (has links)
This dissertation describes the development and application of a computer vision system for improving the performance of robots. The vision-based approach determines position and orientation (pose) parameters more directly than conventional approaches that are based on kinematics and joint feedback. Traditional robot control systems rely on kinematic models, measured joint variables, knowledge of objects in the workspace, and the calibrated robot base pose to correctly position and orient a tool. Since this conventional approach involves a large number of parameters, unacceptable pose errors may accumulate. In contrast, the vision system approach uses images from a tool-mounted camera and geometric knowledge of objects in the workspace to accurately track and determine the end-effector pose. This approach is advantageous because the camera directly observes the parameters of interest (position and orientation of the robot tool with respect to the work-piece) during the positioning process. The vision approach is verified and its utility demonstrated by increasing the automation and accuracy of computer controlled robots used in the nuclear service industry. The overall solution strategy involves tracking and pose determination. Tracking is used as a coarse positioner and to verify the toolhead position prior to performing crucial servicing operations. Pose determination is used to calibrate the base location of the robot, verify the tool pose for insertions, and compute a precise correction if necessary. The major contributions of this work lie within its comprehensive treatment, which begins with theoretical modeling and follows through to the details of application. Specific contributions are made in the areas of robotics, image processing, calibration, tracking, pose determination, kinematic control strategies, and nuclear service operations. Performance results from laboratory experiments and actual field testing have been encouraging. The vision-based strategy offers robustness to the conventional error stackups encountered in robotics and promises to improve the accuracy, flexibility, and cost of both specialized and general-purpose robotic systems. / Ph. D.
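The advantage of observing the parameters of interest directly shows up in the frame arithmetic involved: one composition of two camera-measured transforms replaces the long kinematic chain. A minimal sketch follows, with illustrative matrix names not taken from the dissertation.

```python
# Tool pose in the workpiece frame from two camera observations.
import numpy as np

def tool_in_workpiece(T_cam_workpiece, T_cam_tool):
    """T_cam_workpiece, T_cam_tool: 4x4 homogeneous transforms mapping
    workpiece and tool coordinates, respectively, into the camera frame.
    Returns the tool pose expressed in the workpiece frame."""
    return np.linalg.inv(T_cam_workpiece) @ T_cam_tool
```

Because both transforms come from the tool-mounted camera's pose determination, errors do not stack up across joint measurements, kinematic model parameters, and base calibration the way they do in the conventional approach.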
25

Synthèse de vues pour l’initialisation de pose / Viewpoint synthesis for pose initialisation

Rolin, Pierre 08 March 2017 (has links)
Localisation is a central problem of computer vision with numerous applications such as robotics or augmented reality. In this thesis we consider the problem of pose initialisation, that is, pose computation without prior knowledge of the camera position. We are interested in pose computation from a single image and a point cloud reconstructed from a set of images. As we have no prior knowledge of the camera position, pose estimation relies entirely on finding correspondences between the image and the model. The search for these correspondences is a difficult problem because of its high combinatorial complexity. It can fail if the image is very different from the ones used to construct the model, in particular when there is a large viewpoint change between them.
This thesis proposes an approach that makes matching possible in such difficult scenarios. It consists in synthesising locally the appearance of the scene from virtual viewpoints and adding descriptors extracted from these synthetic views to the model. Because the scene model is a point cloud, the synthesis is not a 3D rendering but a local 2D transform of existing observations of the scene. The following contributions are made. We study different transform models and show that homographic transformations are the best suited for this application. We define a method to position the virtual viewpoints with respect to a planar segmentation of the scene model. We ensure time efficiency by synthesising only useful views, i.e. views that are far from the existing ones and do not overlap. Furthermore, we verify that the synthesised surface is visible from the virtual viewpoint, to avoid producing aberrant views due to occlusions. Finally, we propose a robust and time-efficient method to search for image-model correspondences. It uses geometric cues in a guided matching framework to efficiently identify sets of correct correspondences. Experimental results show that the proposed approach makes pose computation possible in situations where standard methods fail. In general, the precision and repeatability of the computed poses are significantly improved by the use of view synthesis. We also show that, by making image-model matching easier, the method reduces pose computation time.
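The homographic transform model selected above has a closed form for a locally planar patch. The sketch below illustrates it under stated assumptions (calibrated cameras, a patch with known plane normal and depth in the source camera frame); it is the standard plane-induced homography, not the author's code.

```python
# Warp an existing observation of a planar patch to a virtual viewpoint using
# the plane-induced homography H = K2 (R - t n^T / d) K1^{-1}.
import numpy as np
import cv2

def synthesize_view(img, K1, K2, R, t, n, d):
    """img: source view of a (locally) planar patch.
    K1, K2: 3x3 intrinsics of the real and virtual cameras.
    R, t: rotation/translation from the real to the virtual camera frame.
    n, d: plane normal and distance, expressed in the real camera frame."""
    H = K2 @ (R - np.outer(t, n) / d) @ np.linalg.inv(K1)
    h, w = img.shape[:2]
    return cv2.warpPerspective(img, H, (w, h))
```

Descriptors extracted from such warped views can then be attached to the corresponding 3D points, which is what makes matching under large viewpoint change possible.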
26

Kan man identifiera de karaktäristiska dragen hos stop-motion och applicera dem inom 3D-animation? / Is it possible to identify the characteristic features of stop-motion and apply these in 3D-animation?

Dahlin Jansson, Emelie January 2013 (has links)
For any animator working in either stop-motion or 3D animation, knowledge of the earliest animation techniques is essential, for it is on these principles that today's animation builds such vibrant characters. With the extreme development 3D animation has undergone over the last twenty years, it is easy to lose track of the old ways, but this thesis hopes to show that it is these old ways that will take animation to its next stage. My aim is to find out whether it is possible to identify the characteristic features of stop-motion and apply them to 3D animation with the purpose of creating a more expressive style of animation. To this end, the method of this thesis is divided into two parts: production and perception. First, the two work methods, straight ahead and pose to pose, are analyzed and their biggest differences identified. With this information, two basic walk-cycle animations are created, one with the straight-ahead method and the other with pose to pose. These animations are kept as simple as possible so that all focus can be directed at the feel of the movement. In the perception part, a survey is conducted in which five participants observe the two animations and answer a questionnaire about how they perceive the differences: which of the two they prefer, and which they find most interesting and pleasant. The participants report perceiving a difference, and the majority prefer the animation made with the stop-motion work method, feeling that it shows more personality and appears more natural. However, the result is relatively even. My conclusion is that to achieve an animation that appeals to a larger group, a combination of the two work methods is optimal.
27

Automatic Pose and Position Estimation by Using Spiral Codes

Albayrak, Aras January 2014 (has links)
This master's thesis covers the implementation of synthesis and detection of spiral symbols, and the estimation of pan/tilt angle and position by means of camera calibration. The focus, however, is on the latter: the estimation of localization parameters. Spiral symbols are used both to give an object an identity and to locate it. Due to the spiral symbol's characteristic shape, we can use the generalized structure tensor (GST) algorithm, which is particularly efficient at detecting different members of the spiral family. Once we detect spirals, we know their position and identity parameters within an a priori known geometric configuration (on a sheet of paper). In turn, this information can be used to estimate the 3D position and orientation of the object to which the spirals are attached, using a camera calibration method.

This thesis provides an insight into how automatic detection of spirals attached to a sheet of paper, and from this, automatic deduction of the sheet's position and pose parameters, can be achieved with a network camera. The GST algorithm detects spirals efficiently with respect to both detection performance and computational resources, because it uses a spiral image model well adapted to the spirals' spatial frequency characteristics. We report how detection is affected by the zoom parameters of the network camera, as well as by GST parameters such as filter size. After all spiral centres are located and identified with respect to their twist/bending parameter, a flexible camera calibration technique proposed by Zhengyou Zhang, implemented in Matlab within the present study, is applied. The performance of the position and pose estimation in 3D is reported. The main conclusion is that we obtain reasonable surface-angle estimates for images taken by a WLAN network camera under varying conditions, such as different illumination and distances.
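To illustrate the calibration step, here is a hedged sketch using OpenCV's implementation of Zhang's planar-target method (the thesis itself uses a Matlab implementation). The grid layout, intrinsics, and poses are hypothetical, and the detected spiral centres are simulated by projecting a known planar grid through a synthetic camera rather than by running the GST detector.

```python
# Zhang-style calibration from planar points: the detected spiral centres play
# the role of the planar calibration target.
import numpy as np
import cv2

# Known planar layout of spiral centres (z = 0), e.g. a 4x3 grid, 5 cm apart.
obj = np.array([[x * 5.0, y * 5.0, 0.0] for y in range(3) for x in range(4)],
               dtype=np.float32)
K_true = np.array([[600, 0, 320], [0, 600, 240], [0, 0, 1]], np.float64)

# Simulate detections in several views of the sheet at different tilts.
obj_pts, img_pts = [], []
for rx in (0.1, 0.3, -0.2, 0.25, -0.15):
    rvec = np.array([rx, 0.2, 0.0])          # sheet orientation (axis-angle)
    tvec = np.array([0.0, 0.0, 50.0])        # sheet position
    proj, _ = cv2.projectPoints(obj, rvec, tvec, K_true, None)
    obj_pts.append(obj)
    img_pts.append(proj.astype(np.float32))

ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, (640, 480), None, None)
# rvecs/tvecs recover the sheet's pan/tilt and position in each view.
```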
28

Generating Synthetic Data for Evaluation and Improvement of Deep 6D Pose Estimation

Löfgren, Tobias, Jonsson, Daniel January 2020 (has links)
The task of 6D pose estimation with deep learning is to train networks to determine, from an image of an object, the rotation and translation of the object. Impressive results have recently been shown in deep learning based 6D pose estimation. However, many current solutions rely on real-world data when training, which, as opposed to synthetic data, requires time-consuming annotation. In this thesis, we introduce a pipeline for generating synthetic ground-truth data for deep 6D pose estimation, where annotation is done automatically. With a 3D CAD model, we use Blender to render 2D images of the model from different viewpoints. We also create all other relevant data needed for pose estimation, e.g., the poses of an object, mask images, and 3D keypoints on the object. Using this pipeline, it is possible to adjust different settings to reduce the domain gap between synthetic data and real-world data and get better pose estimation results. Such settings include changing the method of extracting 3D keypoints and varying the scale of the object or the light settings in the scene.

The network used to test the performance of training on our synthetic data is PVNet, which achieves state-of-the-art results for 6D pose estimation. This architecture learns to find 2D keypoints of the object in the image, as well as 2D–3D keypoint correspondences. With these correspondences, the Perspective-n-Point (PnP) algorithm is used to extract a pose. We evaluate the pose estimation of the different settings on the synthetic data and compare these results to other state-of-the-art work. We find that using only real-world data for training is worse than using a combination of synthetic and real-world data. Several other findings are that varying scale and lighting, in addition to adding random background images to the rendered images, improves results. Four different novel keypoint selection methods are introduced in this work and tried against methods used in previous work. We observe that our methods achieve similar or better results. Finally, we use the best possible settings from the synthetic data pipeline, but with memory limitations on the amount of training data. We come close to state-of-the-art results, and could get closer with more data.
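The final PnP step mentioned above can be illustrated in a few lines: given 2D detections of known 3D keypoints, a standard solver recovers the 6D pose. The keypoints, pose, and intrinsics below are hypothetical stand-ins for the network's outputs and the dataset calibration; this is a sketch, not the thesis pipeline.

```python
# Recover a 6D pose from 2D-3D keypoint correspondences with PnP. The 2D
# detections are simulated by projecting through a known pose; in practice
# they would come from the keypoint network.
import numpy as np
import cv2

obj_kps = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                    [1, 1, 0], [1, 0, 1]], dtype=np.float64)  # 3D keypoints
K = np.array([[600, 0, 320], [0, 600, 240], [0, 0, 1]], np.float64)

rvec_true = np.array([0.1, -0.2, 0.05])   # ground-truth pose (axis-angle)
tvec_true = np.array([0.0, 0.0, 4.0])
img_kps, _ = cv2.projectPoints(obj_kps, rvec_true, tvec_true, K, None)

ok, rvec, tvec = cv2.solvePnP(obj_kps, img_kps, K, None)
# rvec (axis-angle rotation) and tvec (translation) form the 6D pose estimate,
# which should match rvec_true/tvec_true up to numerical error.
```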
29

Pose Estimation for Gesture Recovery in Occluded Television Videos

Pham, Kyle 26 August 2022 (has links)
No description available.
30

Social-pose : Human Trajectory Prediction using Input Pose

Gao, Yang January 2022 (has links)
In this work, we study the benefits of predicting human trajectories using human body poses instead of solely their x-y locations in time. We propose 'Social-pose', an attention-based pose encoder that encodes the poses of all humans in the scene and their social relations. Our method can be used as a plugin to any existing trajectory predictor. We explore the advantages of using 2D versus 3D poses, as well as a limited set of poses. We also investigate the attention map to find out which frames of poses are critical to improving human trajectory prediction. We have run extensive experiments on state-of-the-art models (based on LSTMs, GANs, and transformers) and show improvements over all of them on synthetic (Joint Track Auto) and real (Human3.6M and Pedestrians and Cyclists in Road Traffic) datasets.
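For illustration, here is a hedged sketch of what an attention-based pose encoder of this kind might look like. The dimensions, class name, and architecture details are assumptions, not the thesis implementation: each person's pose is embedded, multi-head attention models the social relations across people, and the resulting features can be concatenated to an existing trajectory predictor's input.

```python
# Minimal attention-based pose encoder over all people in a scene.
import torch
import torch.nn as nn

class PoseEncoder(nn.Module):
    def __init__(self, n_joints=17, joint_dim=2, d_model=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_joints * joint_dim, d_model)  # flatten a pose
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, poses):
        """poses: (batch, n_people, n_joints, joint_dim) for one time step.
        Returns (batch, n_people, d_model) socially attended pose features."""
        x = self.embed(poses.flatten(2))   # (batch, n_people, d_model)
        out, _ = self.attn(x, x, x)        # attention across people in the scene
        return out

# Usage: features = PoseEncoder()(torch.randn(8, 5, 17, 2))
```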
