Global ETD Search

331	Isomorphic Visualization and Understanding of the Commutativity of Multiplication: from multiplication of whole numbers to multiplication of fractions Malaty, George 16 March 2012 (has links) No description available. info:eu-repo/classification/ddc/510 ddc:510
332	Learning Sampling-Based 6D Object Pose Estimation Krull, Alexander 31 August 2018 (has links) The task of 6D object pose estimation, i.e. of estimating an object position (three degrees of freedom) and orientation (three degrees of freedom) from images is an essential building block of many modern applications, such as robotic grasping, autonomous driving, or augmented reality. Automatic pose estimation systems have to overcome a variety of visual ambiguities, including texture-less objects, clutter, and occlusion. Since many applications demand real time performance the efficient use of computational resources is an additional challenge. In this thesis, we will take a probabilistic stance on trying to overcome said issues. We build on a highly successful automatic pose estimation framework based on predicting pixel-wise correspondences between the camera coordinate system and the local coordinate system of the object. These dense correspondences are used to generate a pool of hypotheses, which in turn serve as a starting point in a final search procedure. We will present three systems that each use probabilistic modeling and sampling to improve upon different aspects of the framework. The goal of the first system, System I, is to enable pose tracking, i.e. estimating the pose of an object in a sequence of frames instead of a single image. By including information from previous frames tracking systems can resolve many visual ambiguities and reduce computation time. System I is a particle filter (PF) approach. The PF represents its belief about the pose in each frame by propagating a set of samples through time. Our system uses the process of hypothesis generation from the original framework as part of a proposal distribution that efficiently concentrates samples in the appropriate areas. In System II, we focus on the problem of evaluating the quality of pose hypotheses. This task plays an essential role in the final search procedure of the original framework. We use a convolutional neural network (CNN) to assess the quality of an hypothesis by comparing rendered and observed images. To train the CNN we view it as part of an energy-based probability distribution in pose space. This probabilistic perspective allows us to train the system under the maximum likelihood paradigm. We use a sampling approach to approximate the required gradients. The resulting system for pose estimation yields superior results in particular for highly occluded objects. In System III, we take the idea of machine learning a step further. Instead of learning to predict an hypothesis quality measure, to be used in a search procedure, we present a way of learning the search procedure itself. We train a reinforcement learning (RL) agent, termed PoseAgent, to steer the search process and make optimal use of a given computational budget. PoseAgent dynamically decides which hypothesis should be refined next, and which one should ultimately be output as final estimate. Since the search procedure includes discrete non-differentiable choices, training of the system via gradient descent is not easily possible. To solve the problem, we model behavior of PoseAgent as non-deterministic stochastic policy, which is ultimately governed by a CNN. This allows us to use a sampling-based stochastic policy gradient training procedure. We believe that some of the ideas developed in this thesis, such as the sampling-driven probabilistically motivated training of a CNN for the comparison of images or the search procedure implemented by PoseAgent have the potential to be applied in fields beyond pose estimation as well. info:eu-repo/classification/ddc/004 ddc:004
333	Through the Blur with Deep Learning : A Comparative Study Assessing Robustness in Visual Odometry Techniques Berglund, Alexander January 2023 (has links) In this thesis, the robustness of deep learning techniques in the field of visual odometry is investigated, with a specific focus on the impact of motion blur. A comparative study is conducted, evaluating the performance of state-of-the-art deep convolutional neural network methods, namely DF-VO and DytanVO, against ORB-SLAM3, a well-established non-deep-learning technique for visual simultaneous localization and mapping. The objective is to quantitatively assess the performance of these models as a function of motion blur. The evaluation is carried out on a custom synthetic dataset, which simulates a camera navigating through a forest environment. The dataset includes trajectories with varying degrees of motion blur, caused by camera translation, and optionally, pitch and yaw rotational noise. The results demonstrate that deep learning-based methods maintained robust performance despite the challenging conditions presented in the test data, while excessive blur lead to tracking failures in the geometric model. This suggests that the ability of deep neural network architectures to automatically learn hierarchical feature representations and capture complex, abstract features may enhance the robustness of deep learning-based visual odometry techniques in challenging conditions, compared to their geometric counterparts. artificial intelligence AI machine learning ML deep learning DL computer vision neural networks NN convolutional neural networks CNN visual odometry VO robustness motion blur AirForestry localization navigation ego-motion pose estimation SLAM DF-VO DytanVO ORB-SLAM3 artificiell intelligens maskininlärning datorseende Computer Sciences Datavetenskap (datalogi)
334	Modulating Depth Map Features to Estimate 3D Human Pose via Multi-Task Variational Autoencoders / Modulerande djupkartfunktioner för att uppskatta människans ställning i 3D med multi-task-variationsautoenkoder Moerman, Kobe January 2023 (has links) Human pose estimation (HPE) constitutes a fundamental problem within the domain of computer vision, finding applications in diverse fields like motion analysis and human-computer interaction. This paper introduces innovative methodologies aimed at enhancing the accuracy and robustness of 3D joint estimation. Through the integration of Variational Autoencoders (VAEs), pertinent information is extracted from depth maps, even in the presence of inevitable image-capturing inconsistencies. This concept is enhanced through the introduction of noise to the body or specific regions surrounding key joints. The deliberate introduction of noise to these areas enables the VAE to acquire a robust representation that captures authentic pose-related patterns. Moreover, the introduction of a localised mask as a constraint in the loss function ensures the model predominantly relies on pose-related cues while disregarding potential confounding factors that may hinder the compact representation of accurate human pose information. Delving into the latent space modulation further, a novel model architecture is devised, joining a VAE and fully connected network into a multi-task joint training objective. In this framework, the VAE and regressor harmoniously influence the latent representations for accurate joint detection and localisation. By combining the multi-task model with the loss function constraint, this study attains results that compete with state-of-the-art techniques. These findings underscore the significance of leveraging latent space modulation and customised loss functions to address challenging human poses. Additionally, these novel methodologies pave the way for future explorations and provide prospects for advancing HPE. Subsequent research endeavours may optimising these techniques, evaluating their performance across diverse datasets, and exploring potential extensions to unravel further insights and advancements in the field. / Human pose estimation (HPE) är ett grundläggande problem inom datorseende och används inom områden som rörelseanalys och människa-datorinteraktion. I detta arbete introduceras innovativa metoder som syftar till att förbättra noggrannheten och robustheten i 3D-leduppskattning. Genom att integrera variationsautokodare (eng. variational autoencoder, VAE) extraheras relevant information från djupkartor, trots närvaro av inkonsekventa avvikelser i bilden. Dessa avvikelser förstärks genom att applicera brus på kroppen eller på specifika regioner som omger viktiga leder. Det avsiktliga införandet av brus i dessa områden gör det möjligt för VAE att lära sig en robust representation som fångar autentiska poseringsrelaterade mönster. Dessutom införs en lokaliserad mask som en begränsning i förlustfunktionen, vilket säkerställer att modellen främst förlitar sig på poseringsrelaterade signaler samtidigt som potentiella störande faktorer som hindrar den kompakta representationen av korrekt mänsklig poseringsinformation bortses ifrån. Genom att fördjupa sig ytterligare i den latenta rumsmoduleringen har en ny modellarkitektur tagits fram som förenar en VAE och ett fullständigt anslutet nätverk i en fleruppgiftsmodell. I detta ramverk påverkar VAE och det fullständigt ansluta nätverket de latenta representationerna på ett harmoniskt sätt för att uppnå korrekt leddetektering och lokalisering. Genom att kombinera fleruppgiftsmodellen med förlustfunktionsbegränsningen uppnår denna studie resultat som konkurrerar med toppmoderna tekniker. Dessa resultat understryker betydelsen av att utnyttja latent rymdmodulering och anpassade förlustfunktioner för att hantera utmanande mänskliga poser. Dessutom banar dessa nya metoder väg för framtida utveckling inom uppskattning av HPE. Efterföljande forskningsinsatser kan optimera dessa tekniker, utvärdera deras prestanda över olika datamängder och utforska potentiella tillägg för att avslöja ytterligare insikter och framsteg inom området. 3D pose estimation Joint landmarks Variational autoencoder Multi-task model Loss discrimination Latent-space modulation Depth map 3D-positionsuppskattning Gemensamma landmärken Variationell autoencoder Multitask-modell Förlustdiskriminering Latent-space-modulering Djupkarta Computer Sciences Datavetenskap (datalogi) Computer Engineering Datorteknik Computer and Information Sciences Data- och informationsvetenskap
335	Structureless Camera Motion Estimation of Unordered Omnidirectional Images Sastuba, Mark 08 August 2022 (has links) This work aims at providing a novel camera motion estimation pipeline from large collections of unordered omnidirectional images. In oder to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, allowing to incorporate any central camera type. For each camera an unprojection lookup is generated from intrinsics, which is called P2S-map (Pixel-to-Sphere-map), mapping pixels to their corresponding positions on the unit sphere. Consequently the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections with less distortion effects as they are known from cartography. Using P2S-maps from camera calibration and world map projection allows to convert omnidirectional camera images to an appropriate world map projection in oder to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion) - which handles unordered image collections - with the efficiency of PGO (Pose Graph Optimization), which is used as back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3d feature locations (structure), which becomes computationally expensive for large-scale scenarios. On the contrary PGO solves for camera poses (motion) from measured transformations between cameras, maintaining optimization managable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from scaled two-view transformations and solved by PGO to obtain camera motion efficiently even for large image collections. Obtained results can be used as input data to provide initial pose estimates for further 3d reconstruction purposes e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors. The entire camera motion estimation pipeline does not need to generate a sparse 3d structure of the captured environment and thus is called SCME (Structureless Camera Motion Estimation).:1 Introduction 1.1 Motivation 1.1.1 Increasing Interest of Image-Based 3D Reconstruction 1.1.2 Underground Environments as Challenging Scenario 1.1.3 Improved Mobile Camera Systems for Full Omnidirectional Imaging 1.2 Issues 1.2.1 Directional versus Omnidirectional Image Acquisition 1.2.2 Structure from Motion versus Visual Simultaneous Localization and Mapping 1.3 Contribution 1.4 Structure of this Work 2 Related Work 2.1 Visual Simultaneous Localization and Mapping 2.1.1 Visual Odometry 2.1.2 Pose Graph Optimization 2.2 Structure from Motion 2.2.1 Bundle Adjustment 2.2.2 Structureless Bundle Adjustment 2.3 Corresponding Issues 2.4 Proposed Reconstruction Pipeline 3 Cameras and Pixel-to-Sphere Mappings with P2S-Maps 3.1 Types 3.2 Models 3.2.1 Unified Camera Model 3.2.2 Polynomal Camera Model 3.2.3 Spherical Camera Model 3.3 P2S-Maps - Mapping onto Unit Sphere via Lookup Table 3.3.1 Lookup Table as Color Image 3.3.2 Lookup Interpolation 3.3.3 Depth Data Conversion 4 Calibration 4.1 Overview of Proposed Calibration Pipeline 4.2 Target Detection 4.3 Intrinsic Calibration 4.3.1 Selected Examples 4.4 Extrinsic Calibration 4.4.1 3D-2D Pose Estimation 4.4.2 2D-2D Pose Estimation 4.4.3 Pose Optimization 4.4.4 Uncertainty Estimation 4.4.5 PoseGraph Representation 4.4.6 Bundle Adjustment 4.4.7 Selected Examples 5 Full Omnidirectional Image Projections 5.1 Panoramic Image Stitching 5.2 World Map Projections 5.3 World Map Projection Generator for P2S-Maps 5.4 Conversion between Projections based on P2S-Maps 5.4.1 Proposed Workflow 5.4.2 Data Storage Format 5.4.3 Real World Example 6 Relations between Two Camera Spheres 6.1 Forward and Backward Projection 6.2 Triangulation 6.2.1 Linear Least Squares Method 6.2.2 Alternative Midpoint Method 6.3 Epipolar Geometry 6.4 Transformation Recovery from Essential Matrix 6.4.1 Cheirality 6.4.2 Standard Procedure 6.4.3 Simplified Procedure 6.4.4 Improved Procedure 6.5 Two-View Estimation 6.5.1 Evaluation Strategy 6.5.2 Error Metric 6.5.3 Evaluation of Estimation Algorithms 6.5.4 Concluding Remarks 6.6 Two-View Optimization 6.6.1 Epipolar-Based Error Distances 6.6.2 Projection-Based Error Distances 6.6.3 Comparison between Error Distances 6.7 Two-View Translation Scaling 6.7.1 Linear Least Squares Estimation 6.7.2 Non-Linear Least Squares Optimization 6.7.3 Comparison between Initial and Optimized Scaling Factor 6.8 Homography to Identify Degeneracies 6.8.1 Homography for Spherical Cameras 6.8.2 Homography Estimation 6.8.3 Homography Optimization 6.8.4 Homography and Pure Rotation 6.8.5 Homography in Epipolar Geometry 7 Relations between Three Camera Spheres 7.1 Three View Geometry 7.2 Crossing Epipolar Planes Geometry 7.3 Trifocal Geometry 7.4 Relation between Trifocal, Three-View and Crossing Epipolar Planes 7.5 Translation Ratio between Up-To-Scale Two-View Transformations 7.5.1 Structureless Determination Approaches 7.5.2 Structure-Based Determination Approaches 7.5.3 Comparison between Proposed Approaches 8 Pose Graphs 8.1 Optimization Principle 8.2 Solvers 8.2.1 Additional Graph Solvers 8.2.2 False Loop Closure Detection 8.3 Pose Graph Generation 8.3.1 Generation of Synthetic Pose Graph Data 8.3.2 Optimization of Synthetic Pose Graph Data 9 Structureless Camera Motion Estimation 9.1 SCME Pipeline 9.2 Determination of Two-View Translation Scale Factors 9.3 Integration of Depth Data 9.4 Integration of Extrinsic Camera Constraints 10 Camera Motion Estimation Results 10.1 Directional Camera Images 10.2 Omnidirectional Camera Images 11 Conclusion 11.1 Summary 11.2 Outlook and Future Work Appendices A.1 Additional Extrinsic Calibration Results A.2 Linear Least Squares Scaling A.3 Proof Rank Deficiency A.4 Alternative Derivation Midpoint Method A.5 Simplification of Depth Calculation A.6 Relation between Epipolar and Circumferential Constraint A.7 Covariance Estimation A.8 Uncertainty Estimation from Epipolar Geometry A.9 Two-View Scaling Factor Estimation: Uncertainty Estimation A.10 Two-View Scaling Factor Optimization: Uncertainty Estimation A.11 Depth from Adjoining Two-View Geometries A.12 Alternative Three-View Derivation A.12.1 Second Derivation Approach A.12.2 Third Derivation Approach A.13 Relation between Trifocal Geometry and Alternative Midpoint Method A.14 Additional Pose Graph Generation Examples A.15 Pose Graph Solver Settings A.16 Additional Pose Graph Optimization Examples Bibliography info:eu-repo/classification/ddc/004 ddc:004 Mobiler Roboter Kamera Panoramafotografie Bergwerk Dreidimensionale Rekonstruktion SLAM-Verfahren
336	Crime Detection From Pre-crime Video Analysis Sedat Kilic (18363729) 03 June 2024 (has links) <p dir="ltr">his research investigates the detection of pre-crime events, specifically targeting behaviors indicative of shoplifting, through the advanced analysis of CCTV video data. The study introduces an innovative approach that leverages augmented human pose and emotion information within individual frames, combined with the extraction of activity information across subsequent frames, to enhance the identification of potential shoplifting actions before they occur. Utilizing a diverse set of models including 3D Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), Recurrent Neural Networks (RNNs), and a specially developed transformer architecture, the research systematically explores the impact of integrating additional contextual information into video analysis.</p><p dir="ltr">By augmenting frame-level video data with detailed pose and emotion insights, and focusing on the temporal dynamics between frames, our methodology aims to capture the nuanced behavioral patterns that precede shoplifting events. The comprehensive experimental evaluation of our models across different configurations reveals a significant improvement in the accuracy of pre-crime detection. The findings underscore the crucial role of combining visual features with augmented data and the importance of analyzing activity patterns over time for a deeper understanding of pre-shoplifting behaviors.</p><p dir="ltr">The study’s contributions are multifaceted, including a detailed examination of pre-crime frames, strategic augmentation of video data with added contextual information, the creation of a novel transformer architecture customized for pre-crime analysis, and an extensive evaluation of various computational models to improve predictive accuracy.</p> Computer vision Image and video coding Image processing Pattern recognition Video processing crime detection video analysis augmented information pose estimation emotion estimation optical flow deep learning pre-crime video analysis video understanding anomaly detection contextual information shoplifting prevention crime prevention vision transformer transformer generative AI
337	Určení pozice kamery v reálném čase pro rozšířenou realitou / Real-time camera pose estimation for augmented reality Szentandrási, István Unknown Date (has links) Definované markery tvoří základ určování polohy kamery pro velké množství aplikací s rozšířenou realitou, v případě že jsou přísné požadavky na rychlost a robustnost. Tato práce popisuje účinnou metodu pro určení pózy kamery pomocí Uniformního pole markerů a několik realistických aplikací na bázi popsané metody. Metoda je velice výpočetně levná a poskytuje spolehlivou detekci pro několik výpočetních platforem, včetně běžných chytrých telefonů. Markery jako část zobrazené informace na monitorech jsou použité v této práci pro určení relativní orientaci mezi poskytovatelem obsahu a užívatelským zařízením, sloužícím pro výběr prvků užívatelského rozhraní při interakci a migraci úkolů. Ve filmařském průmyslu poskytuje popsaná metoda pro zjištění polohy kamery jako součást klíčovaní pozadí filmářům živý náhled virtuální scény. Výsledky ukazují, že popsaná metoda pro detekci pole markerů má srovnatelnou úspěšnost a přesnost v porovnání s ostatními metodami na bázi markerů a je několikrát rýchlejší. Aplikace zahrnuté v této práci podle výsledků testů jsou životaschopné - rychlejší a levnější - alternativy k existujícím řešením.
338	Návrh a Aplikace Dvourozměrných Vizuálních Markerů pro Speciální Účely / Design and Applications of Special-Purpose Two-Dimensional Visual Markers Zachariáš, Michal Unknown Date (has links) Současné vizuální markerové systémy mají jednu zásadní nevýhodu oproti tzv. markerless přístupům - pohyb kamery je omezen na oblast pokrytou markery. V každém snímku musí být marker dostatečně velký, aby jej bylo možné identifikovat a vypočítat pozici a rotaci kamery. Zároveň musí být dostatečně malý, aby se celý (nebo alespoň jeho podstatná část) vešel do záběru kamery. Avšak tyto požadavky jsou protichůdné. Tato práce nabízí řešení tohoto problému za pomoci konceptu Marker Fields. Jde o strukturu, jejíž přítomnost je možné v obraze kamery snadno detekovat a identifikovat část, na kterou se kamera právě dívá, a to na základě jakékoli (malé) podoblasti s definovanou velikostí. Aby bylo možné podoblasti identifikovat zblízka i zdálky, nejsou od sebe odděleny, ale do velké míry se překrývají. V této práci jsou vysvětleny různé implementace konceptu marker fields, spolu s jejich zamýšleným použitím a výhodami a nevýhodami. Jako důkaz použitelnosti marker fields v reálném světě, se druhá největší část této práce věnuje popisu jejich reálných aplikací.
339	Analýza a optimalizace datové komunikace pro telemetrické systémy v energetice / Analysis and Optimization of Data Communication for Telemetric Systems in Energy Fujdiak, Radek January 2017 (has links) Telemetry system, Optimisation, Sensoric networks, Smart Grid, Internet of Things, Sensors, Information security, Cryptography, Cryptography algorithms, Cryptosystem, Confidentiality, Integrity, Authentication, Data freshness, Non-Repudiation.

Search results