
Exploring Deep generative models for Structured Object Generation and Complex Scenes Manipulation

Ardino, Pierfrancesco 28 April 2023 (has links)
The availability of powerful GPUs and the consequent development of deep neural networks have brought remarkable results in video game level generation, image-to-image translation, video-to-video translation, image inpainting, and video generation. Nonetheless, in conditional or constrained settings, unconditioned generative models still suffer because they offer little to no control over the generated output. This leads to problems in some scenarios, such as structured object generation or multimedia manipulation. In the former, unconstrained GANs fail to generate objects that must satisfy hard constraints (e.g., molecules must be chemically valid or game levels must be playable). In the latter, the manipulation of complex scenes is a challenging and unsolved task, since these scenes are composed of objects and backgrounds of different classes. In this thesis, we focus on these two scenarios and propose different techniques to improve deep generative models. First, we introduce Constrained Adversarial Networks (CANs), an extension of GANs in which the constraints are embedded into the model during training. Then we focus on developing novel deep learning models to alter complex urban scenes. In particular, we aim to alter the scene by: i) studying how to better leverage semantic and instance segmentation to model its content and structure; ii) modifying, inserting and/or removing specific object instances coherently with the scene semantics; iii) generating coherent and realistic videos in which users can alter an object's position.
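The core CAN idea, folding constraint satisfaction into the training objective, can be caricatured in a few lines. The following is a toy sketch, not the thesis's formulation: the function name, the even-number "constraint", and the penalty weight are all invented for illustration, and a real CAN would backpropagate a differentiable constraint term through the generator.

```python
def constrained_generator_loss(adv_loss, samples, satisfies, weight=1.0):
    """Toy CAN-style objective (hypothetical formulation): the usual
    adversarial loss plus a penalty proportional to the fraction of
    generated samples that violate the hard constraint."""
    violation_rate = sum(1 for s in samples if not satisfies(s)) / len(samples)
    return adv_loss + weight * violation_rate
```

For instance, under an "even numbers only" constraint, a batch `[2, 4, 5, 6]` has one violator, so a weight of 2.0 adds 0.5 to an adversarial loss of 0.5.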

Radar and Camera Fusion in Intelligent Transportation System

Ding, Bao Ming January 2023 (has links)
Modern smart cities often consist of a vast array of all-purpose traffic monitoring systems to understand city status, help reduce traffic congestion, and enforce traffic laws. It is critical for these systems to robustly and effectively detect and classify road objects. The majority of current traffic monitoring solutions consist of single RGB cameras. While cost-effective, these RGB cameras can fail in adverse weather or under poor lighting conditions. This thesis explores the viability of fusing an mmWave radar with an RGB camera to increase performance and make the system robust in any operating condition. The thesis discusses the fusion device's design, build, and sensor selection process. Next, it proposes the fusion device's processing pipeline, consisting of a novel radar object detection and classification algorithm, state-of-the-art camera processing algorithms, and a practical fusion algorithm that fuses the results from the camera and the radar. The proposed radar detection algorithm includes a novel clustering algorithm based on DBSCAN and a feature-based object classifier; the proposed algorithms show higher accuracy than the baseline. The camera processing algorithms include YOLOv5 and StrongSORT, which are pre-trained on their respective datasets and show high accuracy without the need for transfer learning. Finally, the practical fusion algorithm fuses the information from the radar and the camera at the decision level, where camera results are matched with radar results based on probability. The fusion allows the device to combine the high data-association accuracy of the camera sensor with the additional measured states of the radar system to form a better understanding of the observed objects. / Thesis / Master of Applied Science (MASc)
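Since the radar pipeline builds its detector on a DBSCAN-derived clustering step, a minimal pure-Python DBSCAN over 2-D radar returns illustrates the grouping logic. This is a generic sketch under assumed semantics (Euclidean distance, a neighbourhood count that includes the point itself), not the thesis's modified algorithm.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over 2-D points (x, y); returns one label per
    point, -1 for noise. Radar detections falling in the same dense
    region are grouped into one object candidate."""
    n = len(points)
    labels = [None] * n  # None = unvisited

    def neighbours(i):
        # Indices within eps of point i (includes i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise upgraded to border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: expand cluster
                queue.extend(nbrs)
    return labels
```

Two well-separated groups of three returns plus one stray return yield clusters 0 and 1 and a single noise label.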

TOWARDS IMPROVED REPRESENTATIONS ON HUMAN ACTIVITY UNDERSTANDING

Hyung-gun Chi (17543172) 04 December 2023 (has links)
Human action recognition stands as a cornerstone in the domain of computer vision, with its utility spanning emergency response, sign language interpretation, and the burgeoning fields of augmented and virtual reality. The transition from conventional video-based recognition to skeleton-based methodologies has been a transformative shift, offering a robust alternative that is less susceptible to environmental noise and more focused on the dynamics of human movement.

This body of work encapsulates the evolution of action recognition, emphasizing the pivotal role of Graph Convolution Network (GCN) based approaches, particularly through the innovative InfoGCN framework. InfoGCN has set a new precedent in the field by introducing an information bottleneck-based learning objective, a self-attention graph convolution module, and a multi-modal representation of the human skeleton. These advancements have collectively elevated the accuracy and efficiency of action recognition systems.

Addressing the prevalent challenge of occlusions, particularly in single-camera setups, the Pose Relation Transformer (PORT) framework has been introduced. Inspired by the principles of Masked Language Modeling in natural language processing, PORT refines the detection of occluded joints, thereby enhancing the reliability of pose estimation under visually obstructive conditions.

Building upon the foundations laid by InfoGCN, the Skeleton ODE framework has been developed for online action recognition, enabling real-time inference without the need for complete action observation. By integrating Neural Ordinary Differential Equations, Skeleton ODE facilitates the prediction of future movements, thus reducing latency and paving the way for real-time applications.

The implications of this research are vast, indicating a future where real-time, efficient, and accurate human action recognition systems could significantly impact various sectors, including healthcare, autonomous vehicles, and interactive technologies. Future research directions point towards the integration of multi-modal data, the application of transfer learning for enhanced generalization, the optimization of models for edge computing, and the ethical deployment of action recognition technologies. The potential for these systems to contribute to healthcare, particularly in patient monitoring and disease detection, underscores the need for continued interdisciplinary collaboration and innovation.
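The Skeleton ODE idea of forecasting motion by integrating a learned dynamics function can be illustrated with the simplest possible solver. In this hedged sketch, `dynamics` stands in for a trained neural network over a flattened joint-coordinate vector, and forward Euler stands in for whatever solver the framework actually uses.

```python
def euler_forecast(state, dynamics, dt, steps):
    """Forward-Euler integration of a (hypothetical) learned dynamics
    function, in the spirit of Neural ODE-based motion forecasting:
    x(t + dt) = x(t) + dt * f(x(t))."""
    trajectory = [state]
    for _ in range(steps):
        x = trajectory[-1]
        # One Euler step per predicted future frame.
        trajectory.append([xi + dt * di for xi, di in zip(x, dynamics(x))])
    return trajectory
```

Because the dynamics function is evaluated incrementally, frames can be predicted before the full action has been observed, which is the property the framework exploits for online recognition.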

Navigation eines mobilen Roboters durch ebene Innenräume

Buchmann, Lennart 07 February 2023 (has links)
The appraisal, trading, and collecting of art objects does not take place only in the analog world. The company 4ARTechnologies develops software solutions for the digital collection management of physical and digital art. Using applications on mobile devices, users can register and authenticate their paintings and periodically create precise condition reports. Because of human limitations, creating condition reports leads to problems in handling the application, so the process is to be automated with the help of a mobile robot. The goal of this work is the development of a navigation system for a mobile robot. It should solve the following problem: localization of a painting, collision-free approach, and horizontally centered positioning in front of it. The target platform of this software is the mobile operating system iOS. Several methods, including the navigation of mobile robots and the computer-aided recognition of images, were examined for the solution. The navigation software uses feature matching from the OpenCV library to find the destination. Relative localization methods such as pose tracking and odometry are used to estimate the robot's own position. The environment and the movement of the robot are shown on a topological map. Obstacles are bypassed using the implemented BUG3 algorithm. Contents: 1. Introduction 1.1. Problem description and thematic scope 1.2. Robot setup 1.3. Constraints and requirements 2. Theoretical foundations 2.1. Robotics 2.1.1. Mobile robotics 2.2. Navigation 2.2.1. Localization 2.2.2. Mapping 2.2.3. SLAM 2.2.4. Path finding 2.2.5. Augmented reality 2.3. Computer vision 2.3.1. OpenCV 2.3.2. Template recognition 2.3.3. Template-based matching 2.3.4. Feature-based matching 3. Practical implementation 3.1. Program flow of the navigation 3.1.1. Connecting to the robot 3.1.2. Initial exploration 3.1.3. Localization and approach 3.1.4. Collision avoidance 3.1.5. Goal approach and positioning 4. Tests 4.1. Interfering factors 5. Conclusion and outlook 5.1. Conclusion 5.2. Outlook

Evaluation under Real-world Distribution Shifts

Alhamoud, Kumail 07 1900 (has links)
Recent advancements in empirical and certified robustness have shown promising results in developing reliable and deployable Deep Neural Networks (DNNs). However, most evaluations of DNN robustness have focused on testing models on images from the same distribution they were trained on. In real-world scenarios, DNNs may encounter dynamic environments with significant distribution shifts. This thesis aims to investigate the interplay between empirical and certified adversarial robustness and domain generalization. We take the first step by training robust models on multiple domains and evaluating their accuracy and robustness on an unseen domain. Our findings reveal that: (1) both empirical and certified robustness exhibit generalization to unseen domains, and (2) the level of generalizability does not correlate strongly with the visual similarity of inputs, as measured by the Fréchet Inception Distance (FID) between source and target domains. Furthermore, we extend our study to a real-world medical application, where we demonstrate that adversarial augmentation significantly enhances robustness generalization while minimally affecting accuracy on clean data. This research sheds light on the importance of evaluating DNNs under real-world distribution shifts and highlights the potential of adversarial augmentation in improving robustness in practical applications.
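One common form of adversarial augmentation is the Fast Gradient Sign Method (FGSM): nudge each input in the direction that most increases the loss and train on the perturbed result. The sketch below uses a numerical gradient on a plain Python function purely for illustration; the thesis's experiments would differentiate through a DNN, and `fgsm_perturb` is an invented helper name.

```python
def fgsm_perturb(x, loss, eps, h=1e-5):
    """FGSM sketch: move input x by eps in the sign of the loss
    gradient, estimated here by central finite differences."""
    def sign(v):
        return (v > 0) - (v < 0)

    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((loss(xp) - loss(xm)) / (2 * h))
    # Each coordinate moves by exactly eps toward higher loss.
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]
```

For the toy loss `v[0]**2 + v[1]**2`, the point `[1.0, -2.0]` is pushed away from the origin to roughly `[1.1, -2.1]`, i.e. toward higher loss, which is the behaviour adversarial training exploits.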

An automated vision system using a fast 2-dimensional moment invariants algorithm

Zakaria, Marwan F. January 1987 (has links)
No description available.

Facial image processing in computer vision

Yap, M.H., Ugail, Hassan 20 March 2022 (has links)
The application of computer vision to face processing remains an important research field. The aim of this chapter is to provide an up-to-date review of the research efforts of computer vision scientists in facial image processing, especially in the entertainment industry, surveillance, and other human-computer interaction applications. More specifically, this chapter reviews and demonstrates techniques of visible facial analysis regardless of specific application area. First, the chapter makes a thorough survey and comparison of face detection techniques and provides some demonstrations of the effect of computer vision algorithms and colour segmentation on face images. Then it reviews facial expression recognition from the psychological aspect (the Facial Action Coding System, FACS) and from the computer animation aspect (the MPEG-4 standard). The chapter also discusses two popular facial feature detection techniques, Gabor-feature-based boosted classifiers and Active Appearance Models, and demonstrates their performance on our in-house dataset. Finally, the chapter concludes with the remaining challenges and future research directions of facial image processing. © 2011, IGI Global.

Fragment Association Matching Enhancement (FAME) on a Video Tracker

Johnson, Andrew 23 May 2014 (has links)
No description available.

3-D Scene Reconstruction from Line Correspondences between Multiple Views

Linger, Michael 16 December 2014 (has links)
No description available.

THREE-DIMENSIONAL OBJECT RECONSTRUCTION FROM RANGE IMAGES

LI, XIAOKUN January 2004 (has links)
No description available.
