  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Multiple Session 3D Reconstruction using RGB-D Cameras / 3D-rekonstruktion med RGB-D kamera över multipla sessioner

Widebäck West, Nikolaus January 2014 (has links)
In this thesis we study the problem of multi-session dense RGB-D SLAM for 3D reconstruction. Multi-session reconstruction can allow users to capture parts of an object that could not easily be captured in one session, due for instance to poor accessibility or user mistakes. We first present a thorough overview of single-session dense RGB-D SLAM and describe the multi-session problem as a loosening of the incremental camera movement and static scene assumptions commonly held in the single-session case. We then implement and evaluate several variations on a system for two-session reconstruction as an extension of a single-session dense RGB-D SLAM system. The extension from one to several sessions is divided into registering separate sessions into a single reference frame, re-optimizing the camera trajectories, and fusing the data together to generate a final 3D model. Registration is done by matching reconstructed models from the separate sessions using one of two adaptations of a 3D object detection pipeline. The registration pipelines are evaluated with many different sub-steps on a challenging dataset, and it is found that robust registration can be achieved using the proposed methods on scenes without degenerate shape symmetry. In particular, we find that using plane matches between two sessions as constraints for as much of the registration pipeline as possible improves results. Several strategies for re-optimizing camera trajectories using data from both sessions are implemented and evaluated. These strategies are based on re-tracking the camera poses from all sessions together, and then optionally optimizing over the full problem as represented on a pose graph. Camera tracking is done by incrementally building and tracking against a TSDF volume, from which a final 3D mesh model is extracted. The whole system is qualitatively evaluated against a realistic dataset for multi-session reconstruction.
It is concluded that the overall approach succeeds in reconstructing objects from several sessions, but that finer-grained registration methods would be required to achieve multi-session reconstructions that are indistinguishable from single-session results in terms of reconstruction quality.
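The TSDF fusion step mentioned above can be sketched as a running weighted average per voxel. The following is a minimal illustration under assumed names and an assumed truncation distance; the thesis' actual volume integration is more involved:

```python
import numpy as np

def update_tsdf(tsdf, weights, sdf_obs, trunc=0.05):
    """Fuse one frame's signed-distance observations into a TSDF volume
    using the standard running weighted-average update (a sketch)."""
    # Normalised truncated distances in [-1, 1]
    d = np.clip(sdf_obs / trunc, -1.0, 1.0)
    # Voxels far behind the observed surface carry no information
    valid = sdf_obs > -trunc
    w_new = weights + valid                  # one vote per valid observation
    fused = np.where(valid,
                     (tsdf * weights + d * valid) / np.maximum(w_new, 1),
                     tsdf)
    return fused, w_new
```

In a full system this update runs per frame over all voxels that project into the depth image, and the final mesh is extracted from the fused volume (e.g. by marching cubes).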
12

Určení směru pohledu / Gaze Detection

Caha, Miloš January 2010 (has links)
The main objective of this work is to design and implement an algorithm that determines gaze direction, or more precisely head orientation. Specifically, the system finds a face in a video stream and then detects points suitable for estimating the view direction of the tracked person. The estimate is obtained from the transformation the key points undergo during head movement. To improve accuracy, calibration frames are used: they establish the key-point transformations for a set of known view directions. The main result is an application that determines, for the tracked person, the head's deflection from the straight-ahead position in both the horizontal and vertical directions. The output contains not only the direction of the deflection but also its magnitude.
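The calibration-frame idea can be illustrated with a deliberately simplified sketch: estimate yaw by comparing the mean horizontal keypoint shift in the current frame against the shift observed at one known calibration angle. The linear model and all names here are assumptions for illustration, not the thesis' algorithm:

```python
import numpy as np

def estimate_deflection(ref_pts, cur_pts, calib_pts, calib_angle_deg):
    """Estimate head yaw by linear interpolation against one calibration
    frame (a hypothetical simplification of the calibration-frame idea).
    Points are (N, 2) arrays of keypoint coordinates."""
    cur_shift = np.mean(cur_pts[:, 0] - ref_pts[:, 0])      # mean horizontal shift now
    calib_shift = np.mean(calib_pts[:, 0] - ref_pts[:, 0])  # shift at known angle
    return calib_angle_deg * cur_shift / calib_shift
```

A real system would use several calibration frames per axis and a transformation model richer than a uniform horizontal shift.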
13

Re-recognition of vehicles for enhanced insights on road traffic

Asefaw, Aron January 2023 (has links)
This study investigates the performance of two keypoint detection algorithms, SIFT and LoFTR, for vehicle re-recognition on a 2+1 road in Täby, utilizing three different methods: proportion of matches, "gates" based on the values of the features, and Support Vector Machines (SVM). Data was collected from four strategically placed cameras, with a subset of the data manually annotated and divided into training, validation, and testing sets to minimize overfitting and ensure generalization. The F1-score was used as the primary metric to evaluate the performance of the various methods. Results indicate that LoFTR outperforms SIFT across all methods, with the SVM method demonstrating the best performance and adaptability. The findings have practical implications in security, traffic management, and intelligent transportation systems, and suggest directions for future research in real-time implementation and generalization across varied camera placements.
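The proportion-of-matches score can be sketched with a plain nearest-neighbour ratio test over descriptor vectors. This is a generic stand-in; the study's exact matching setup is not specified here:

```python
import numpy as np

def match_proportion(desc_a, desc_b, ratio=0.75):
    """Fraction of desc_a keypoints that find a Lowe-ratio-test match
    in desc_b (one simple re-recognition score)."""
    good = 0
    for d in desc_a:
        dists = np.linalg.norm(desc_b - d, axis=1)
        if len(dists) >= 2:
            i, j = np.argsort(dists)[:2]          # two nearest neighbours
            if dists[i] < ratio * dists[j]:       # unambiguous match only
                good += 1
    return good / len(desc_a)
```

Two detections of the same vehicle should yield a high proportion; unrelated vehicles a low one, so a threshold (or an SVM over such scores) decides re-recognition.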
14

Etude de la confusion des descripteurs locaux de points d'intérêt : application à la mise en correspondance d'images de documents / Study of keypoints and local features confusion : document images matching scenario

Royer, Emilien 24 October 2017 (has links)
Ce travail s’inscrit dans une tentative de liaison entre la communauté classique de la Vision par ordinateur et la communauté du traitement d’images de documents, analyse et reconnaissance (DAR). Plus particulièrement, nous abordons la question des détecteurs de points d’intérêt et des descripteurs locaux dans une image. Ceux-ci ayant été conçus pour des images issues du monde réel, ils ne sont pas adaptés aux problématiques issues du document, dont les images présentent des caractéristiques visuelles différentes. Notre approche se base sur la résolution du problème de la confusion entre les descripteurs, ceux-ci perdant leur pouvoir discriminant. Notre principale contribution est un algorithme de réduction de la confusion potentiellement présente dans un ensemble de vecteurs caractéristiques d’une même image, ceci par une approche probabiliste en filtrant les vecteurs fortement confusifs. Une telle conception nous permet d’appliquer des algorithmes d’extraction de descripteurs sans avoir à les modifier, ce qui constitue une passerelle entre ces deux mondes. / This work attempts to establish a bridge between the field of classical computer vision and document analysis and recognition (DAR). Specifically, we tackle the issue of keypoint detection and the associated local features computed in the image. These are not well suited to document images, since they were designed for real-world images, which have different visual characteristics. Our approach is based on reducing the confusion between feature vectors, since these usually lose their discriminative power on document images. Our main contribution is an algorithm that reduces the confusion between local features by filtering out the ones that present a high risk of confusion, using tools from probability theory. Such a method allows us to apply feature extraction algorithms without having to modify them, thus establishing a bridge between these two worlds.
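The confusion-filtering idea can be illustrated with a much simpler distance-threshold variant: discard any descriptor whose nearest neighbour *within the same image* is too close, since such a descriptor is likely to be matched ambiguously. This is a hypothetical stand-in for the thesis' probabilistic filtering:

```python
import numpy as np

def filter_confusing(desc, min_sep=1.0):
    """Return indices of descriptors whose nearest same-image neighbour
    is at least min_sep away (i.e. the non-confusing ones)."""
    keep = []
    for i, d in enumerate(desc):
        dists = np.linalg.norm(desc - d, axis=1)
        dists[i] = np.inf                       # ignore self-distance
        if dists.min() >= min_sep:
            keep.append(i)
    return keep
```

On document images, where repeated glyphs produce many near-identical descriptors, such a filter removes exactly the vectors that would otherwise cause spurious cross-image matches.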
15

Ανάπτυξη τεχνικών αντιστοίχισης εικόνων με χρήση σημείων κλειδιών

Γράψα, Ιωάννα 17 September 2012 (has links)
Ένα σημαντικό πρόβλημα είναι η αντιστοίχιση εικόνων με σκοπό τη δημιουργία πανοράματος. Στην παρούσα εργασία έχουν χρησιμοποιηθεί αλγόριθμοι που βασίζονται στη χρήση σημείων κλειδιών. Αρχικά στην εργασία βρίσκονται σημεία κλειδιά για κάθε εικόνα που μένουν ανεπηρέαστα από τις αναμενόμενες παραμορφώσεις με την βοήθεια του αλγορίθμου SIFT (Scale Invariant Feature Transform). Έχοντας τελειώσει αυτή τη διαδικασία για όλες τις εικόνες, προσπαθούμε να βρούμε το πρώτο ζευγάρι εικόνων που θα ενωθεί. Για να δούμε αν δύο εικόνες μπορούν να ενωθούν, ακολουθεί ταίριασμα των σημείων κλειδιών τους. Όταν ένα αρχικό σετ αντίστοιχων χαρακτηριστικών έχει υπολογιστεί, πρέπει να βρεθεί ένα σετ που θα παράγει υψηλής ακρίβειας αντιστοίχιση. Αυτό το πετυχαίνουμε με τον αλγόριθμο RANSAC, μέσω του οποίου βρίσκουμε το γεωμετρικό μετασχηματισμό ανάμεσα στις δύο εικόνες, ομογραφία στην περίπτωσή μας. Αν ο αριθμός των κοινών σημείων κλειδιών είναι επαρκής, δηλαδή ταιριάζουν οι εικόνες, ακολουθεί η ένωσή τους. Αν απλώς ενώσουμε τις εικόνες, τότε θα έχουμε σίγουρα κάποια προβλήματα, όπως το ότι οι ενώσεις των δύο εικόνων θα είναι πολύ εμφανείς. Γι’ αυτό, για την εξάλειψη αυτού του προβλήματος, χρησιμοποιούμε τη μέθοδο των Λαπλασιανών πυραμίδων. Επαναλαμβάνεται η παραπάνω διαδικασία μέχρι να δημιουργηθεί το τελικό πανόραμα παίρνοντας κάθε φορά σαν αρχική την τελευταία εικόνα που φτιάξαμε στην προηγούμενη φάση. / Stitching multiple images together to create high-resolution panoramas is one of the most popular consumer applications of image registration and blending. In this work, feature-based registration algorithms have been used. The first step is to extract distinctive features from every image that are invariant to image scale and rotation, using the SIFT (Scale Invariant Feature Transform) algorithm. After that, we try to find the first pair of images to stitch. To check whether two images can be stitched, we match their keypoints (the results from SIFT).
Once an initial set of feature correspondences has been computed, we need to find the subset that will produce a high-accuracy alignment. The solution to this problem is RANdom SAmple Consensus (RANSAC). Using RANSAC we estimate the motion model between the two images (a homography in our case). If there are enough corresponding points, the two images are stitched. A naive stitch, however, leaves clearly visible seams, so the method of Laplacian pyramids is used to blend them away. The above procedure is repeated, each time taking the panorama built in the previous step as the initial image, until the final panorama is complete.
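The RANSAC stage can be sketched with a pure-translation motion model in place of the homography, which keeps the example short while showing the hypothesise-and-count-inliers loop that makes RANSAC robust to wrong correspondences:

```python
import numpy as np

def ransac_translation(src, dst, iters=100, thresh=1.0, seed=0):
    """RANSAC with a pure-translation motion model (a stand-in for the
    homography estimated in the work): repeatedly hypothesise a shift
    from one correspondence and keep the shift with the most inliers."""
    rng = np.random.default_rng(seed)
    best_shift, best_inliers = None, -1
    for _ in range(iters):
        k = rng.integers(len(src))                 # minimal sample: 1 point
        shift = dst[k] - src[k]
        residuals = np.linalg.norm(dst - (src + shift), axis=1)
        inliers = int(np.sum(residuals < thresh))
        if inliers > best_inliers:
            best_shift, best_inliers = shift, inliers
    return best_shift, best_inliers
```

For a homography the minimal sample is 4 correspondences instead of 1, but the hypothesise/score/keep-best structure is identical.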
16

Image Recognition Techniques for Optical Head Mounted Displays

Kondreddy, Mahendra 30 January 2017 (has links)
The evolution of technology has led research into new wearable devices such as smart glasses, which provide new visualization techniques. Augmented Reality is an advanced technology that can significantly ease the execution of complex operations. Augmented Reality combines the virtual and the real, giving the user new tools for transferring knowledge in several environments and for several processes. This thesis explores the development of an Android-based image recognition application. Feature point detectors and descriptors are used, as they deal well with correspondence problems. The best image recognition technique for the smart glasses is chosen based on the time taken to retrieve results and the amount of power consumed in the process. Since smart glasses are equipped with limited resources, the selected approach should demand little computation so that device operation remains uninterrupted. An effective and efficient method for detecting and recognizing safety signs in images is selected. The ubiquitous SIFT and SURF feature detectors consume more time, are computationally complex, and require high-end hardware for processing. Binary descriptors are considered instead, as they are lightweight and can support low-power devices much more effectively. A comparative analysis is performed on binary descriptors such as BRIEF, ORB, AKAZE, and FREAK on the smart glasses, based on their performance and the requirements. ORB is the most efficient of the binary descriptors and proved the most effective for the smart glasses in terms of time measurements and low power consumption.
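Part of why binary descriptors suit low-power devices is that matching reduces to Hamming distance on packed bytes, which is cheap compared with the Euclidean distances used for SIFT/SURF vectors. A brute-force matching sketch (illustrative only, operating on ORB-style packed `uint8` descriptors):

```python
import numpy as np

def hamming_match(desc_a, desc_b):
    """Nearest-neighbour matching of binary descriptors by Hamming
    distance: XOR the byte strings, count differing bits, take argmin.
    desc_a is (n, k) uint8, desc_b is (m, k) uint8; returns, for each
    row of desc_a, the index of its closest row in desc_b."""
    diff_bits = np.unpackbits(desc_a[:, None, :] ^ desc_b[None, :, :], axis=2)
    return diff_bits.sum(axis=2).argmin(axis=1)
```

On hardware with a popcount instruction this inner loop becomes a handful of cycles per descriptor pair, which is the efficiency argument made for ORB above.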
17

Casamento de modelos baseado em projeções radiais e circulares invariante a pontos de vista. / Viewpoint invariant template matching based on radial and circular projection.

Pérez López, Guillermo Angel 23 November 2015 (has links)
Este trabalho aborda o problema de casamento entre duas imagens. Casamento de imagens pode ser do tipo casamento de modelos (template matching) ou casamento de pontos-chaves (keypoint matching). Estes algoritmos localizam uma região da primeira imagem numa segunda imagem. Nosso grupo desenvolveu dois algoritmos de casamento de modelos invariantes por rotação, escala e translação denominados Ciratefi (Circular, radial and template matching filter) e Forapro (Fourier coefficients of radial and circular projection). As características positivas destes algoritmos são a invariância a mudanças de brilho/contraste e robustez a padrões repetitivos. Na primeira parte desta tese, tornamos Ciratefi invariante a transformações afins, obtendo Aciratefi (Affine-ciratefi). Construímos um banco de imagens para comparar este algoritmo com Asift (Affine-scale invariant feature transform) e Aforapro (Affine-forapro). Asift é considerado atualmente o melhor algoritmo de casamento de imagens invariante afim, e Aforapro foi proposto em nossa dissertação de mestrado. Nossos resultados sugerem que Aciratefi supera Asift na presença combinada de padrões repetitivos, mudanças de brilho/contraste e mudanças de pontos de vista. Na segunda parte desta tese, construímos um algoritmo para filtrar casamentos de pontos-chaves, baseado num conceito que denominamos de coerência geométrica. Aplicamos esta filtragem no bem-conhecido algoritmo Sift (scale invariant feature transform), base do Asift. Avaliamos a nossa proposta no banco de imagens de Mikolajczyk. As taxas de erro obtidas são significativamente menores que as do Sift original. / This work deals with image matching. Image matching can be template matching or keypoint matching. These algorithms search for a region of the first image in a second image.
Our group has developed two template matching algorithms invariant to rotation, scale and translation, called Ciratefi (Circular, radial and template matching filter) and Forapro (Fourier coefficients of radial and circular projection). The positive characteristics of Ciratefi and Forapro are invariance to brightness/contrast changes and robustness to repetitive patterns. In the first part of this work, we make Ciratefi invariant to affine transformations, obtaining Aciratefi (Affine-ciratefi). We built an image dataset to compare Aciratefi with Asift (Affine-scale invariant feature transform) and Aforapro (Affine-forapro). Asift is currently considered the best affine-invariant image matching algorithm, and Aforapro was proposed in our master's thesis. Our results suggest that Aciratefi overcomes Asift in the combined presence of repetitive patterns, brightness/contrast changes and viewpoint changes. In the second part of this work, we filter keypoint matchings based on a concept that we call geometric coherence. We apply this filtering to the well-known algorithm Sift (scale invariant feature transform), the basis of Asift. We evaluate our proposal on the Mikolajczyk image database. The error rates obtained are significantly lower than those of the original Sift.
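The circular-projection feature underlying Ciratefi can be sketched as the mean grey level on rings around a pixel: a rotation-invariant signature, since rotating the image permutes samples along each ring without changing the ring means. This is a nearest-pixel-sampling simplification, not the actual filter:

```python
import numpy as np

def circular_projection(img, cx, cy, radii, samples=64):
    """Mean grey level on circles of the given radii around (cx, cy),
    sampled at `samples` angles with nearest-pixel lookup (a sketch of
    the rotation-invariant circular projection used by Ciratefi)."""
    angles = np.linspace(0, 2 * np.pi, samples, endpoint=False)
    feats = []
    for r in radii:
        xs = np.clip(np.round(cx + r * np.cos(angles)).astype(int), 0, img.shape[1] - 1)
        ys = np.clip(np.round(cy + r * np.sin(angles)).astype(int), 0, img.shape[0] - 1)
        feats.append(img[ys, xs].mean())
    return np.array(feats)
```

Ciratefi uses such ring signatures as a cheap first filter: only pixels whose circular projections resemble the template's are passed to the more expensive radial and template matching stages.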
19

Natural scene classification, annotation and retrieval : developing different approaches for semantic scene modelling based on Bag of Visual Words

Alqasrawi, Yousef T. N. January 2012 (has links)
With the availability of inexpensive hardware and software, digital imaging has become an important medium of communication in our daily lives. A huge number of digital images are being collected, made available through the internet, and stored in various fields such as personal image collections, medical imaging, and digital arts. It is therefore important that images are stored, searched, and accessed efficiently. The bag of visual words (BOW) model, which represents images by local invariant features computed at interest point locations, has become a standard choice for many computer vision tasks. Based on this promising model, this thesis investigates three main problems: natural scene classification, annotation and retrieval. Given an image, the task is to design a system that can determine which class the image belongs to (classification), which semantic concepts it contains (annotation), and which images are most similar to it (retrieval). This thesis contributes to scene classification by proposing a weighting approach, named the keypoints density-based weighting method (KDW), to control the fusion of colour information and bag of visual words on a spatial pyramid layout in a unified framework. Different configurations of BOW, integrated visual vocabularies and multiple image descriptors are investigated and analyzed. The proposed approaches are extensively evaluated on three well-known scene classification datasets with 6, 8 and 15 scene categories using 10-fold cross validation. The second contribution of this thesis, the scene annotation task, explores whether the integrated visual vocabularies generated for scene classification can be used to model the local semantic information of natural scenes.
In this direction, image annotation is treated as a classification problem: images are partitioned into a fixed 10x10 grid, and each block, represented by BOW and different image descriptors, is classified into one of the predefined semantic classes. An image is then represented by the percentage of each semantic concept detected in it. Experimental results on 6 scene categories demonstrate the effectiveness of the proposed approach. Finally, this thesis further explores, with extensive experimental work, the use of different configurations of the BOW model for natural scene retrieval.
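The BOW representation used throughout can be sketched as nearest-word assignment followed by a normalised histogram. Vocabulary learning (typically k-means over training descriptors) is assumed to have been done elsewhere; names are illustrative:

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Bag of visual words: assign each local descriptor to its nearest
    vocabulary word (squared Euclidean distance) and return the
    normalised word-count histogram representing the image."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()
```

In the annotation setting described above, the same computation is applied per 10x10 grid block rather than per whole image, and each block's histogram is fed to a semantic-class classifier.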
20

Lokalizace mobilního robota v prostředí / Localisation of Mobile Robot in the Environment

Urban, Daniel January 2018 (has links)
This diploma thesis deals with the problem of localising a mobile robot in its environment from current 2D and 3D sensor data and previous records. The work focuses on detecting places the robot has previously visited. The implemented system is suitable for loop detection and uses Gestalt 3D descriptors. The output of the system is the set of corresponding positions at which the robot has already been. The functionality of the system has been tested and evaluated on LiDAR data.
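Loop detection over stored place descriptors can be sketched as a thresholded nearest-neighbour search: compare the current place's descriptor against all previously recorded ones and report a revisit when something is close enough. The descriptor contents and threshold here are illustrative, not the Gestalt pipeline itself:

```python
import numpy as np

def detect_loop(history, current, thresh=0.5):
    """Return the index of the earlier place whose descriptor best
    matches `current`, or -1 if no stored descriptor is within
    `thresh` (i.e. no loop detected)."""
    if len(history) == 0:
        return -1
    dists = np.linalg.norm(np.asarray(history) - current, axis=1)
    i = int(dists.argmin())
    return i if dists[i] < thresh else -1
```

A detected index identifies the position at which the robot was previously located, which a SLAM back end can then use as a loop-closure constraint.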
