311

Camera Pose Estimation from Lines using Direct Linear Transformation

Přibyl, Bronislav Unknown Date (has links)
This doctoral thesis deals with camera pose estimation from correspondences of 3D and 2D lines, i.e. the Perspective-n-Line (PnL) problem. Attention is focused on scenarios with a large number of lines, which can be solved efficiently by methods exploiting a linear formulation of PnL. Until now, only methods working with correspondences of 3D points and 2D lines were known. Based on this observation, two new methods built on the Direct Linear Transformation (DLT) algorithm were proposed: the DLT-Plücker-Lines method, working with 3D line / 2D line correspondences, and the DLT-Combined-Lines method, working with both 3D point / 2D line and 3D line / 2D line correspondences. In the latter case, the redundant 3D information is exploited to reduce the minimum number of required line correspondences to 5 and to improve the accuracy of the method. The proposed methods were thoroughly tested under various conditions, including simulated and real data, and compared to the state-of-the-art PnL methods. The DLT-Combined-Lines method achieves results better than or comparable to the state-of-the-art methods while being considerably fast. This doctoral thesis also introduces a unifying framework for describing DLT-based camera pose estimation methods; both proposed methods are defined within this framework.
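As background to the linear machinery these methods build on: a 3D point X lying on a 3D line projects onto the corresponding image line l, so l^T (P X) = 0 — one homogeneous linear equation in the twelve entries of the 3x4 projection matrix P. The following is a minimal numpy sketch of this point-to-line DLT (an illustration of the baseline formulation, not the thesis's DLT-Plücker-Lines or DLT-Combined-Lines implementation):

```python
import numpy as np

def dlt_pose_from_point_line(X, l):
    """Estimate a 3x4 projection matrix P (up to scale) from 3D-point /
    2D-line correspondences: each pair satisfies l_i^T (P X_i) = 0,
    one linear equation in the 12 entries of P.

    X: (n, 4) homogeneous 3D points lying on the 3D lines.
    l: (n, 3) homogeneous image lines (n >= 11 in general position).
    """
    # kron(l_i, X_i) has l_j * X_k at position (j, k), matching the
    # row-major vectorisation of P.
    A = np.stack([np.kron(li, Xi) for Xi, li in zip(X, l)])
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)  # null vector of A
```

With noise-free synthetic data the null space of A is one-dimensional, so the estimate matches the true P up to a global scale and sign.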
312

Runway Visual Identification

Mudrík, Samuel January 2013 (has links)
This thesis deals with the use of optical signals in aircraft navigation systems. The solution is based on the creation of a runway identification system that works with footage from an onboard camera.
313

Terrain Mapping for Autonomous Vehicles

Pedreira Carabel, Carlos Javier January 2015 (has links)
Autonomous vehicles are at the forefront of the automotive industry today, in the pursuit of safer and more efficient transportation systems. One of the main requirements for any autonomous vehicle is awareness of its position and of obstacles along its path. The current project addresses the pose and terrain mapping problem by integrating a visual odometry method with a mapping technique. An RGB-D camera, the Microsoft Kinect v2, was chosen as the sensor for capturing information from the environment. It was connected to an Intel mini-PC for real-time processing. Both pieces of hardware were mounted on board a four-wheeled research concept vehicle (RCV) to test the feasibility of the solution at outdoor locations. The Robot Operating System (ROS) was used as the development environment, with C++ as the programming language. The visual odometry strategy consisted of a frame registration algorithm called Adaptive Iterative Closest Keypoint (AICK), based on Iterative Closest Point (ICP) and using Oriented FAST and Rotated BRIEF (ORB) as the image keypoint extractor. A grid-based, rolling-window local costmap was implemented to obtain a two-dimensional representation of the obstacles close to the vehicle within a predefined area, in order to enable further path-planning applications. Experiments were performed both offline and in real time to test the system in indoor and outdoor scenarios. The results confirmed the viability of the designed framework for tracking the pose of the camera and detecting objects in indoor environments. Outdoor environments, however, exposed the limitations of the RGB-D sensor, making the current system configuration unfeasible for outdoor purposes.
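AICK interleaves keypoint association with alignment; the closed-form least-squares rigid alignment at the core of ICP-style registration can be sketched as follows (a generic Kabsch/SVD sketch, not the thesis implementation):

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that dst ~ R @ src + t, given matched
    (n, 3) point sets -- the closed-form step inside each ICP iteration."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

A full ICP loop would alternate this step with re-association of closest points (or, in AICK's case, closest keypoints) until convergence.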
314

Black Queer TV: Reparative Viewing and the Sociopolitical Questions of Our Now

Spears, Tobias L. 11 May 2022 (has links)
No description available.
315

Movement Estimation with SLAM through Multimodal Sensor Fusion

Cedervall Lamin, Jimmy January 2024 (has links)
In the field of robotics and self-navigation, Simultaneous Localization and Mapping (SLAM) is a technique for estimating poses while concurrently building a map of the environment. Robotics applications often rely on various sensors for pose estimation, including cameras, inertial measurement units (IMUs), and more. Traditional discrete SLAM, using stereo camera pairs and IMUs, faces challenges such as time offsets between sensors. One solution to this issue is to use continuous-time models for pose estimation. This thesis explores and implements a continuous-time SLAM system, investigating the advantages of multi-modal sensor fusion over discrete stereo vision models. The findings indicate that incorporating an IMU into the system enhances pose estimation, providing greater robustness and accuracy than relying solely on visual SLAM. Furthermore, leveraging the continuous model's derivatives and smoothness allows decent pose estimation from fewer measurements, reducing the required computational resources.
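As a toy illustration of why fusing inertial and visual estimates helps (this is not the thesis's continuous-time estimator): combining two independent Gaussian estimates of the same quantity always yields a fused variance no larger than either input variance.

```python
def fuse_gaussian(mu_a, var_a, mu_b, var_b):
    """Fuse two independent Gaussian estimates of the same scalar
    (e.g. a visual and an inertial pose estimate along one axis).
    Returns the product-of-Gaussians mean and variance."""
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)
    mu = var * (mu_a / var_a + mu_b / var_b)
    return mu, var
```

For example, fusing two equally confident estimates at 0.0 and 2.0 (variance 1.0 each) gives the midpoint 1.0 with variance 0.5 — the same information gain that motivates adding IMU measurements to a visual pipeline.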
316

Research and Application of 6D Pose Estimation for Mobile 3D Cameras

Ruichao, Qian January 2022 (has links)
This work addresses deep-learning-based 6 Degree-of-Freedom (DoF) pose estimation using the 3D cameras of an iPhone 13 Pro. The task of pose estimation is to estimate the spatial rotation and translation of an object given its 2D or 3D images. During pose estimation network training, a common way to expand the training dataset is to generate synthetic images, which requires a 3D mesh of the target object. Although several well-known datasets provide 3D object files, obtaining a mesh is still a problem when one wants to work with a custom real-world object, and typical 3D scanners are designed mainly for industrial use and are usually expensive. In this project we investigated whether the 3D cameras on Apple devices can replace industrial 3D scanners in the pose estimation pipeline, and what might influence the results during scanning. For data synthesis, we introduced a pose sampling method that samples uniformly on a sphere. Random transformations and background images from the SUN2012 dataset are applied, and the synthetic images are rendered with Blender. We picked five test objects with different sizes and surfaces. Each object was scanned by both the front TrueDepth camera and the rear Light Detection and Ranging (LiDAR) camera using the '3d Scanner App' on iOS. The network we used is based on PVNet, which uses a pixel-wise voting scheme to find 2D keypoints in RGB images and uncertainty-driven Perspective-n-Point (PnP) to compute the pose. We obtained both quantitative and qualitative results for each instance: i) the TrueDepth camera outperforms the LiDAR camera in most scenarios; ii) when an object has a less reflective surface and high-contrast texture, the advantage of TrueDepth is more pronounced. We also picked three baseline objects from the Linemod dataset. Although the average accuracy is lower than in the original paper, the performance on our baseline instances shows a trend similar to the original paper's results. In conclusion, we showed that the 3D cameras on the iPhone are capable of supporting the pose estimation pipeline.
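The abstract mentions sampling poses uniformly on a sphere but does not specify the construction; a common choice for near-uniform viewpoint sampling is the Fibonacci lattice, sketched here under that assumption:

```python
import numpy as np

def fibonacci_sphere(n):
    """Near-uniform unit-sphere sampling via the Fibonacci lattice,
    usable as camera viewpoints when rendering synthetic training images."""
    i = np.arange(n)
    golden = (1.0 + 5.0 ** 0.5) / 2.0
    z = 1.0 - (2.0 * i + 1.0) / n          # evenly spaced heights
    theta = 2.0 * np.pi * i / golden       # golden-angle longitude steps
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)
```

Each returned point is a unit vector; a camera placed at distance d along it, looking at the origin, yields one synthetic viewpoint.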
317

Analyzing Lower Limb Motion Capture with Smartphone : Possible improvements using machine learning / Analys av rörelsefångst för nedre extremiteterna med smartphone : Möjliga förbättringar med hjälp av maskininlärning

Brink, Anton January 2024 (has links)
Human motion analysis (HMA) can play a crucial role in sports and healthcare by providing unique insights into movement mechanics in the form of objective measurements and quantitative data. Traditional state-of-the-art marker-based techniques, despite their accuracy, come with financial and logistical barriers and are restricted to laboratory settings. Markerless systems offer much improved affordability and portability and can potentially be used outside of laboratories; however, these advantages come at a significant cost in accuracy. This thesis attempts to address the challenge of democratizing HMA by leveraging recent advances in smartphone technology and machine learning.

This thesis evaluates two modalities of performing markerless HMA — a single smartphone using Apple ARKit, and a multi-smartphone setup using OpenCap — and compares both to a state-of-the-art multi-camera marker-based system from Vicon. Additionally, it presents and evaluates two approaches to improving the single-smartphone modality: a Gaussian Process Regression (GPR) model and a Long Short-Term Memory (LSTM) neural network, each trained to refine the single-smartphone data so that it aligns with the marker-based result. Specific movements were recorded simultaneously with all three modalities on 13 subjects to build a dataset. From this, GPR and LSTM models were trained and applied to refine the single-camera data. Lower-limb joint angles and joint centers were evaluated across the modalities and analyzed for potential use in real-world applications. The findings are promising: both the GPR and LSTM models improve the accuracy of Apple ARKit, and OpenCap provides accurate and consistent results. It is nonetheless important to acknowledge limitations regarding demographic diversity and how real-world environmental factors may influence application. This thesis contributes to the efforts in narrowing the gap between marker-based HMA methods and more accessible solutions.
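The thesis's GPR model and training data are not reproduced here; as an illustrative sketch, a minimal numpy Gaussian process regression with an RBF kernel shows how a mapping from smartphone-derived measurements to marker-based ones could be learned:

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between (n, d) and (m, d) inputs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gpr_fit_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-4):
    """Posterior mean of a zero-mean GP at X_test, given noisy
    observations y_train at X_train."""
    K = rbf_kernel(X_train, X_train, length_scale)
    K += noise * np.eye(len(X_train))        # observation noise / jitter
    alpha = np.linalg.solve(K, y_train)
    return rbf_kernel(X_test, X_train, length_scale) @ alpha
```

In the thesis's setting, X_train would be smartphone-derived joint angles and y_train the simultaneously recorded marker-based values; the fitted posterior mean then refines new smartphone measurements.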
318

Generation and Optimization of Local Shape Descriptors for Point Matching in 3-D Surfaces

Taati, Babak 01 September 2009 (has links)
We formulate Local Shape Descriptor selection for model-based object recognition in range data as an optimization problem and offer a platform that facilitates a solution. The goal of object recognition is to identify and localize objects of interest in an image. Recognition is often performed in three phases: point matching, where correspondences are established between points on the 3-D surfaces of the models and the range image; hypothesis generation, where rough alignments are found between the image and the visible models; and pose refinement, where the accuracy of the initial alignments is improved. The overall efficiency and reliability of a recognition system is highly influenced by the effectiveness of the point matching phase. Local Shape Descriptors establish point correspondences by encapsulating local shape, such that similarity between two descriptors indicates geometric similarity between their respective neighbourhoods. We present a generalized platform for constructing local shape descriptors that subsumes a large class of existing methods and allows tuning descriptors to the geometry of specific models and to sensor characteristics. Our descriptors, termed Variable-Dimensional Local Shape Descriptors, are constructed as multivariate observations of several local properties and are represented as histograms. The optimal set of properties, which maximizes the performance of a recognition system, depends on the geometry of the objects of interest and the noise characteristics of the range image acquisition devices, and is selected by pre-processing the models and sample training images. Experimental analysis confirms the superiority of optimized descriptors over generic ones in recognition tasks on LIDAR and dense stereo range images. / Thesis (Ph.D, Electrical & Computer Engineering) -- Queen's University, 2009-09-01 11:07:32.084
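As an illustrative stand-in for the multi-property histogram descriptors described above (not the thesis's Variable-Dimensional descriptors), here is a single-property descriptor that histograms neighbour distances within a support radius:

```python
import numpy as np

def radial_histogram_descriptor(cloud, center, radius=1.0, bins=8):
    """Normalised histogram of neighbour distances within `radius` of
    `center` in an (n, 3) point cloud -- a one-property example of the
    histogram-of-local-properties descriptor construction."""
    d = np.linalg.norm(cloud - center, axis=1)
    d = d[(d > 0) & (d <= radius)]           # exclude the point itself
    hist, _ = np.histogram(d, bins=bins, range=(0.0, radius))
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(float)
```

Two such descriptors can then be compared with any histogram distance (L2, chi-square, etc.) to score candidate point correspondences; the thesis's contribution is choosing *which* local properties to histogram for a given model and sensor.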
319

Asynchronous Event-Feature Detection and Tracking for SLAM Initialization

Ta, Tai January 2024 (has links)
Traditional cameras are most commonly used in visual SLAM to provide visual information about the scene and positional information about the camera motion. However, under varying illumination and rapid camera movement, the visual quality captured by traditional cameras degrades, limiting the applicability of visual SLAM in challenging environments such as search-and-rescue situations. The emerging event camera has been shown to overcome these limitations thanks to its superior temporal resolution and wider dynamic range, opening up new areas of application and research for event-based SLAM. In this thesis, several asynchronous feature detectors and trackers are used to initialize SLAM from event camera data. To assess the pose estimation accuracy of the different feature detectors and trackers, initialization performance was evaluated on datasets captured in various environments. Furthermore, two different methods of aligning corner events were compared on the same datasets. Results show that, apart from slight variation in the number of accepted initializations, the alignment methods show no overall difference on any metric. The highest overall performance among the event-based trackers for initialization is achieved by HASTE, with mostly high pose accuracy and a high number of accepted initializations, although its performance degrades in featureless scenes. CET, on the other hand, mostly shows lower performance than HASTE.
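The thesis's specific detectors and trackers are not reproduced here; as background, an event camera emits asynchronous (x, y, t, polarity) tuples, and one common intermediate representation for corner detection on such streams is an exponentially decaying time surface, sketched below:

```python
import numpy as np

def time_surface(events, shape, t_ref, tau=0.05):
    """Exponentially decaying time surface: each pixel holds
    exp(-(t_ref - t_last) / tau), where t_last is the timestamp of the
    most recent event at that pixel; pixels with no events are 0.
    events: iterable of (x, y, t) with t <= t_ref, in time order."""
    last = np.full(shape, -np.inf)
    for x, y, t in events:
        last[y, x] = t
    return np.exp((last - t_ref) / tau)      # exp(-inf) -> 0 for empty pixels
```

Recent activity thus appears bright and old activity fades, which is what corner detectors on event data typically exploit.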
320

Non-linear dimensionality reduction and sparse representation models for facial analysis

Zhang, Yuyao 20 February 2014 (has links)
Face analysis techniques commonly require a proper representation of images by means of dimensionality reduction leading to embedded manifolds, which aims at capturing relevant characteristics of the signals. In this thesis, we first provide a comprehensive survey of the state of the art in embedded manifold models. Then, we introduce a novel non-linear embedding method, Kernel Similarity Principal Component Analysis (KS-PCA), into Active Appearance Models (AAMs), in order to model face appearance under variable illumination. The proposed algorithm outperforms the traditional linear PCA transform in capturing the salient features generated by different illuminations, and reconstructs the illuminated faces with high accuracy. We also consider the problem of automatically classifying human face poses from views with varying illumination, as well as occlusion and noise. Based on sparse representation methods, we propose two dictionary-learning frameworks for this pose classification problem. The first is Adaptive Sparse Representation pose Classification (ASRC). It trains the dictionary via a linear model, Incremental Principal Component Analysis (Incremental PCA), which tends to decrease the intra-class redundancy that may affect classification performance, while keeping the inter-class redundancy that is critical for sparse representation. The second is the Dictionary-Learning Sparse Representation model (DLSR), which learns the dictionary so that the classification criterion is incorporated directly into the training objective; this goal is achieved with the K-SVD algorithm. In a series of experiments, we show the performance of the two dictionary-learning methods, based respectively on a linear transform and on a sparse representation model. Finally, we propose a novel Dictionary Learning framework for Illumination Normalization (DL-IN), based on sparse representation over coupled dictionaries. The dictionary pairs are jointly optimized from pairs of normally and irregularly illuminated face images. We further use a Gaussian Mixture Model (GMM) to enhance the framework's capability to model data with complex distributions: the GMM adapts each component model to a subset of the samples and then fuses them together. Experimental results demonstrate the effectiveness of sparsity as a prior for patch-based illumination normalization of face images.
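K-SVD alternates a sparse-coding step with a dictionary-update step; the sparse-coding step is commonly Orthogonal Matching Pursuit (OMP), sketched here in minimal form (an illustration, not the thesis code):

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: greedy k-sparse code of y over a
    dictionary D whose columns are unit-norm atoms. This is the
    sparse-coding stage that K-SVD alternates with its dictionary update."""
    residual, support = y.astype(float).copy(), []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))  # best-correlated atom
        if j not in support:
            support.append(j)
        # re-fit all selected atoms jointly (the "orthogonal" part)
        x, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ x
    coef[support] = x
    return coef
```

In a dictionary-learning loop, the codes produced by OMP for all training signals are fixed while each dictionary atom is updated in turn (via a rank-1 SVD in K-SVD), and the two steps repeat until convergence.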
