  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
181

Through the Blur with Deep Learning : A Comparative Study Assessing Robustness in Visual Odometry Techniques

Berglund, Alexander January 2023
This thesis investigates the robustness of deep learning techniques in visual odometry, with a specific focus on the impact of motion blur. A comparative study evaluates two state-of-the-art deep convolutional neural network methods, DF-VO and DytanVO, against ORB-SLAM3, a well-established non-deep-learning technique for visual simultaneous localization and mapping. The objective is to quantitatively assess the performance of these models as a function of motion blur. The evaluation is carried out on a custom synthetic dataset that simulates a camera navigating through a forest environment. The dataset includes trajectories with varying degrees of motion blur, caused by camera translation and, optionally, pitch and yaw rotational noise. The results demonstrate that the deep learning-based methods maintained robust performance despite the challenging conditions presented in the test data, while excessive blur led to tracking failures in the geometric model. This suggests that the ability of deep neural network architectures to automatically learn hierarchical feature representations and capture complex, abstract features may make deep learning-based visual odometry techniques more robust in challenging conditions than their geometric counterparts.
182

Modulating Depth Map Features to Estimate 3D Human Pose via Multi-Task Variational Autoencoders / Modulerande djupkartfunktioner för att uppskatta människans ställning i 3D med multi-task-variationsautoenkoder

Moerman, Kobe January 2023
Human pose estimation (HPE) constitutes a fundamental problem within the domain of computer vision, finding applications in diverse fields like motion analysis and human-computer interaction. This paper introduces innovative methodologies aimed at enhancing the accuracy and robustness of 3D joint estimation. Through the integration of Variational Autoencoders (VAEs), pertinent information is extracted from depth maps, even in the presence of inevitable image-capturing inconsistencies. This concept is enhanced through the introduction of noise to the body or specific regions surrounding key joints. The deliberate introduction of noise to these areas enables the VAE to acquire a robust representation that captures authentic pose-related patterns. Moreover, the introduction of a localised mask as a constraint in the loss function ensures the model predominantly relies on pose-related cues while disregarding potential confounding factors that may hinder the compact representation of accurate human pose information. Delving further into latent space modulation, a novel model architecture is devised, joining a VAE and a fully connected network into a multi-task joint training objective. In this framework, the VAE and regressor harmoniously influence the latent representations for accurate joint detection and localisation. By combining the multi-task model with the loss function constraint, this study attains results that compete with state-of-the-art techniques. These findings underscore the significance of leveraging latent space modulation and customised loss functions to address challenging human poses. Additionally, these novel methodologies pave the way for future explorations and provide prospects for advancing HPE. Subsequent research endeavours may optimise these techniques, evaluate their performance across diverse datasets, and explore potential extensions to unravel further insights and advancements in the field.
/ Human pose estimation (HPE) is a fundamental problem in computer vision and is used in areas such as motion analysis and human-computer interaction. This work introduces innovative methods aimed at improving the accuracy and robustness of 3D joint estimation. By integrating variational autoencoders (VAEs), relevant information is extracted from depth maps despite the presence of inconsistent deviations in the image. These deviations are amplified by applying noise to the body or to specific regions surrounding key joints. The deliberate introduction of noise in these areas enables the VAE to learn a robust representation that captures authentic pose-related patterns. In addition, a localised mask is introduced as a constraint in the loss function, which ensures that the model relies mainly on pose-related cues while disregarding potential confounding factors that hinder the compact representation of accurate human pose information. Delving further into latent space modulation, a new model architecture has been developed that combines a VAE and a fully connected network in a multi-task model. In this framework, the VAE and the fully connected network influence the latent representations in a harmonious way to achieve accurate joint detection and localisation. By combining the multi-task model with the loss function constraint, this study achieves results that compete with state-of-the-art techniques. These results underline the importance of exploiting latent space modulation and customised loss functions to handle challenging human poses. Moreover, these new methods pave the way for future development in HPE. Subsequent research efforts may optimise these techniques, evaluate their performance across different datasets, and explore potential extensions to reveal further insights and advances in the field.
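
The abstract above describes a multi-task objective that combines a VAE with a joint regressor and constrains the loss with a localised mask. One plausible reading of that objective is sketched below. The function name, the weights `beta` and `lam`, the squared-error terms, and the way the mask enters the reconstruction term are all assumptions made for illustration; the thesis may define its loss differently.

```python
import numpy as np

def multitask_vae_loss(depth, recon, mask, mu, logvar,
                       joints_true, joints_pred, beta=1.0, lam=1.0):
    """Hypothetical combined loss: masked reconstruction + KL + joint regression."""
    depth = np.asarray(depth, dtype=float)
    recon = np.asarray(recon, dtype=float)
    mask = np.asarray(mask, dtype=float)
    mu = np.asarray(mu, dtype=float)
    logvar = np.asarray(logvar, dtype=float)
    joints_true = np.asarray(joints_true, dtype=float)
    joints_pred = np.asarray(joints_pred, dtype=float)

    # Masked reconstruction error: only pixels where mask == 1 contribute
    # (assumed form of the "localised mask" constraint).
    recon_loss = np.sum(mask * (depth - recon) ** 2) / max(mask.sum(), 1.0)

    # KL divergence between the diagonal Gaussian N(mu, exp(logvar))
    # and a standard normal prior.
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))

    # Mean squared error on the regressed 3D joint positions.
    pose_loss = np.mean((joints_true - joints_pred) ** 2)

    return recon_loss + beta * kl + lam * pose_loss
```

Because the reconstruction and regression terms share the latent code, gradients from both would shape the same representation during training, which is the intuition behind the joint objective.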
183

Crime Detection From Pre-crime Video Analysis

Sedat Kilic (18363729) 03 June 2024
This research investigates the detection of pre-crime events, specifically targeting behaviors indicative of shoplifting, through the advanced analysis of CCTV video data. The study introduces an innovative approach that leverages augmented human pose and emotion information within individual frames, combined with the extraction of activity information across subsequent frames, to enhance the identification of potential shoplifting actions before they occur. Utilizing a diverse set of models including 3D Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), Recurrent Neural Networks (RNNs), and a specially developed transformer architecture, the research systematically explores the impact of integrating additional contextual information into video analysis.

By augmenting frame-level video data with detailed pose and emotion insights, and focusing on the temporal dynamics between frames, our methodology aims to capture the nuanced behavioral patterns that precede shoplifting events. The comprehensive experimental evaluation of our models across different configurations reveals a significant improvement in the accuracy of pre-crime detection. The findings underscore the crucial role of combining visual features with augmented data and the importance of analyzing activity patterns over time for a deeper understanding of pre-shoplifting behaviors.

The study’s contributions are multifaceted, including a detailed examination of pre-crime frames, strategic augmentation of video data with added contextual information, the creation of a novel transformer architecture customized for pre-crime analysis, and an extensive evaluation of various computational models to improve predictive accuracy.
184

Generation and Optimization of Local Shape Descriptors for Point Matching in 3-D Surfaces

Taati, BABAK 01 September 2009
We formulate Local Shape Descriptor selection for model-based object recognition in range data as an optimization problem and offer a platform that facilitates a solution. The goal of object recognition is to identify and localize objects of interest in an image. Recognition is often performed in three phases: point matching, where correspondences are established between points on the 3-D surfaces of the models and the range image; hypothesis generation, where rough alignments are found between the image and the visible models; and pose refinement, where the accuracy of the initial alignments is improved. The overall efficiency and reliability of a recognition system are highly influenced by the effectiveness of the point matching phase. Local Shape Descriptors establish point correspondences by encapsulating local shape, such that similarity between two descriptors indicates geometric similarity between their respective neighbourhoods. We present a generalized platform for constructing local shape descriptors that subsumes a large class of existing methods and allows descriptors to be tuned to the geometry of specific models and to sensor characteristics. Our descriptors, termed Variable-Dimensional Local Shape Descriptors, are constructed as multivariate observations of several local properties and are represented as histograms. The optimal set of properties, which maximizes the performance of a recognition system, depends on the geometry of the objects of interest and the noise characteristics of range image acquisition devices, and is selected by pre-processing the models and sample training images. Experimental analysis confirms the superiority of optimized descriptors over generic ones in recognition tasks in LIDAR and dense stereo range images. / Thesis (Ph.D, Electrical & Computer Engineering) -- Queen's University, 2009-09-01 11:07:32.084
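
As a rough illustration of the histogram-based descriptors described in this abstract, the sketch below builds a descriptor as a normalised multi-dimensional histogram of per-neighbour properties and matches descriptors by nearest Euclidean distance. The function names, the example properties, the bin count, the value range, and the distance measure are assumptions chosen for illustration; they are not taken from the thesis.

```python
import numpy as np

def histogram_descriptor(properties, bins=4, value_range=(0.0, 1.0)):
    """Build a descriptor from an (n_neighbours, n_properties) array.

    Each row holds several local properties measured at one neighbouring
    point (hypothetical examples: normal angle, distance, curvature),
    each assumed to be scaled into value_range.
    """
    properties = np.asarray(properties, dtype=float)
    n_props = properties.shape[1]
    hist, _ = np.histogramdd(properties, bins=bins,
                             range=[value_range] * n_props)
    total = hist.sum()
    # Normalise so neighbourhoods with different point counts are comparable.
    return (hist / total).ravel() if total > 0 else hist.ravel()

def match_descriptors(model_desc, scene_desc):
    """For each scene descriptor, return the index of the closest model descriptor."""
    model_desc = np.asarray(model_desc, dtype=float)
    scene_desc = np.asarray(scene_desc, dtype=float)
    # Pairwise Euclidean distances, shape (n_scene, n_model).
    diffs = scene_desc[:, None, :] - model_desc[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.argmin(axis=1)
```

Choosing which properties to include changes the histogram's dimensionality, which is one way to read the "variable-dimensional" aspect: the selection step would search over property subsets and keep the one that gives the best matching performance on training data.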
185

Určení pozice kamery v reálném čase pro rozšířenou realitou / Real-time camera pose estimation for augmented reality

Szentandrási, István Unknown Date
Predefined markers form the basis of camera pose estimation for a large number of augmented reality applications when there are strict requirements on speed and robustness. This thesis describes an efficient method for camera pose estimation using Uniform Marker Fields and several realistic applications based on the described method. The method is computationally very cheap and provides reliable detection on several computing platforms, including ordinary smartphones. In this thesis, markers displayed as part of the information shown on monitors are used to determine the relative orientation between a content provider and a user device, which serves to select user interface elements during interaction and task migration. In the film industry, the described camera pose estimation method, as part of background keying, gives filmmakers a live preview of the virtual scene. The results show that the described marker field detection method has a success rate and accuracy comparable to other marker-based methods and is several times faster. According to the test results, the applications included in this thesis are viable alternatives, faster and cheaper, to existing solutions.
186

Návrh a Aplikace Dvourozměrných Vizuálních Markerů pro Speciální Účely / Design and Applications of Special-Purpose Two-Dimensional Visual Markers

Zachariáš, Michal Unknown Date
Current visual marker systems have one fundamental disadvantage compared with so-called markerless approaches: camera movement is restricted to the area covered by markers. In every frame, the marker must be large enough to be identified and for the camera position and rotation to be computed. At the same time, it must be small enough that all of it (or at least a substantial part) fits into the camera's field of view. These requirements, however, conflict. This thesis offers a solution to this problem using the concept of Marker Fields. A marker field is a structure whose presence in the camera image is easy to detect and in which the part the camera is currently looking at can be identified from any (small) sub-region of a defined size. So that sub-regions can be identified from both near and far, they are not separated from each other but overlap to a large extent. This thesis explains various implementations of the marker field concept, together with their intended uses, advantages, and disadvantages. As evidence of the usability of marker fields in the real world, the second largest part of this thesis is devoted to describing their real-world applications.
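
The core property of a marker field, that any overlapping sub-region of a fixed size identifies its own location, can be illustrated with a toy grid of symbols. The sketch below indexes every k x k window of a grid and looks a window up again. It is a simplified assumption-laden illustration: the function names are hypothetical, and it ignores rotation, perspective, image-level detection, and whatever encoding the thesis actually uses.

```python
import numpy as np

def _key(window):
    # Hashable representation of a window's symbols, row by row.
    return tuple(np.asarray(window).ravel().tolist())

def build_window_index(field, k):
    """Map every overlapping k x k window of the field to its top-left (row, col).

    Raises ValueError if two windows are identical, since a position
    lookup would then be ambiguous.
    """
    field = np.asarray(field)
    rows, cols = field.shape
    index = {}
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            key = _key(field[r:r + k, c:c + k])
            if key in index:
                raise ValueError(
                    f"windows at {index[key]} and {(r, c)} are identical")
            index[key] = (r, c)
    return index

def locate(index, window):
    """Return the (row, col) of an observed window, or None if it is unknown."""
    return index.get(_key(window))
```

In this toy setting, a camera that sees only a small patch of the grid can still recover where it is looking, because neighbouring windows share cells yet remain distinguishable from one another.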
187

Asynchronous Event-Feature Detection and Tracking for SLAM Initialization

Ta, Tai January 2024
Traditional cameras are most commonly used in visual SLAM to provide visual information about the scene and positional information about the camera motion. However, in the presence of varying illumination and rapid camera movement, the visual quality captured by traditional cameras diminishes. This limits the applicability of visual SLAM in challenging environments such as search and rescue situations. The emerging event camera has been shown to overcome these limitations through its superior temporal resolution and wider dynamic range, opening up new areas of application and research for event-based SLAM. In this thesis, several asynchronous feature detectors and trackers are used to initialize SLAM from event camera data. To compare pose estimation accuracy across the different feature detectors and trackers, initialization performance was evaluated on datasets captured in various environments. Furthermore, two different methods for aligning corner events were evaluated on the datasets to assess the difference. Results show that, apart from slight variation in the number of accepted initializations, the alignment methods show no overall difference in any metric. Among the event-based trackers, HASTE achieves the highest overall initialization performance, with mostly high pose accuracy and a high number of accepted initializations. However, its performance degrades in featureless scenes. CET, on the other hand, shows mostly lower performance than HASTE.
