1 |
[en] COMPUTATIONAL INTELLIGENCE TECHNIQUES FOR VISUAL SELF-LOCALIZATION AND MAPPING OF MOBILE ROBOTS / [pt] LOCALIZAÇÃO E MAPEAMENTO DE ROBÔS MÓVEIS UTILIZANDO INTELIGÊNCIA E VISÃO COMPUTACIONAL
NILTON CESAR ANCHAYHUA ARESTEGUI 18 October 2017
This thesis presents a study of computational intelligence algorithms for the autonomous control of mobile robots. Intelligent control systems are developed and implemented for a mobile robot built at the Robotics Laboratory of PUC-Rio, based on a modification of the ER1 robot. The experiments consist of two stages: a first stage of 2-D simulation using the Player-Stage software, in which the navigation algorithms were developed using computational intelligence techniques; and a second stage in which the algorithms were implemented on the real robot. The navigation techniques are based on computational intelligence algorithms, namely neural networks, fuzzy logic and support vector machines (SVM); to give the mobile robot visual support, a computer vision technique called Scale Invariant Feature Transform (SIFT) was implemented. Together these algorithms form an embedded system that gives the mobile robot autonomous control. The algorithms achieved their objective in simulation, but on the real robot clear differences from the simulation appeared, owing to the processing time required by the microprocessor.
|
2 |
Real-time Hand Gesture Detection and Recognition for Human Computer Interaction
Dardas, Nasser Hasan Abdel-Qader 08 November 2012
This thesis focuses on bare hand gesture recognition by proposing a new architecture to solve the problem of real-time vision-based hand detection, tracking, and gesture recognition for interaction with an application via hand gestures. The first stage of our system allows detecting and tracking a bare hand in a cluttered background using face subtraction, skin detection and contour comparison. The second stage allows recognizing hand gestures using bag-of-features and multi-class Support Vector Machine (SVM) algorithms. Finally, a grammar has been developed to generate gesture commands for application control.
Our hand gesture recognition system consists of two steps: offline training and online testing. In the training stage, keypoints are extracted from every training image using the Scale-Invariant Feature Transform (SIFT); after K-means clustering, a vector quantization technique maps the keypoints of each training image into a histogram vector of fixed dimension (bag-of-words). This histogram is the input vector used to train a multi-class SVM classifier. In the testing stage, the hand is detected in every frame captured from a webcam using the detection algorithm described above. Keypoints are then extracted from the small image containing the detected hand posture and fed into the cluster model, which maps them into a bag-of-words vector; this vector is passed to the multi-class SVM classifier to recognize the hand gesture.
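The vector quantization step in this paragraph can be illustrated with a minimal sketch. It assumes the K-means cluster centres are already available and uses tiny 2-D stand-ins for SIFT descriptors; the function names, the sample values and the normalisation of the histogram are illustrative assumptions, not details taken from the thesis.

```python
import math

def nearest_centre(descriptor, centres):
    """Index of the cluster centre closest (Euclidean) to one descriptor."""
    return min(range(len(centres)), key=lambda k: math.dist(descriptor, centres[k]))

def bag_of_words(descriptors, centres):
    """Normalised histogram of visual-word counts for one image."""
    counts = [0] * len(centres)
    for d in descriptors:
        counts[nearest_centre(d, centres)] += 1
    total = sum(counts) or 1  # avoid dividing by zero for an image with no keypoints
    return [c / total for c in counts]

# Hypothetical 2-D "descriptors" (real SIFT descriptors have 128 dimensions).
centres = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 9.5), (0.2, 0.1), (0.5, 9.0)]
histogram = bag_of_words(descriptors, centres)
```

Whatever the number of keypoints in an image, the histogram has one bin per cluster centre, which is what lets it serve as a fixed-length input vector for a classifier such as an SVM.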
Another hand gesture recognition system was proposed using Principal Component Analysis (PCA). The most significant eigenvectors and the weights of the training images are determined. In the testing stage, the hand posture is detected in every frame using the same detection algorithm. The small image containing the detected hand is then projected onto the most significant eigenvectors of the training images to form its test weights. Finally, the gesture is recognized by finding the training image whose weights have the minimum Euclidean distance to the test weights.
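The projection and nearest-weights steps can be sketched as follows. This is a minimal illustration that assumes the eigenvectors have already been computed; the 4-pixel images, the orthonormal basis, the mean subtraction and the gesture labels are all hypothetical choices made for the example.

```python
import math

def project(image, mean, eigenvectors):
    """Weights of a flattened image in the eigenvector basis."""
    centred = [p - m for p, m in zip(image, mean)]
    return [sum(c * e for c, e in zip(centred, vec)) for vec in eigenvectors]

def recognise(test_image, mean, eigenvectors, train_weights, labels):
    """Label of the training image whose weights are closest to the test weights."""
    w = project(test_image, mean, eigenvectors)
    best = min(range(len(train_weights)), key=lambda i: math.dist(w, train_weights[i]))
    return labels[best]

# Hypothetical 4-pixel images and an assumed orthonormal basis of two eigenvectors.
mean = [0.5, 0.5, 0.5, 0.5]
eigenvectors = [[0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5]]
train_images = {"fist": [1, 1, 0, 0], "palm": [1, 0, 1, 0]}
labels = list(train_images)
train_weights = [project(img, mean, eigenvectors) for img in train_images.values()]
result = recognise([0.9, 0.8, 0.1, 0.2], mean, eigenvectors, train_weights, labels)
```

Here the test image is a noisy version of the "fist" pattern, so its weights land closest to the "fist" training weights.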
Two applications of gesture-based interaction with a 3D gaming virtual environment were implemented. The first, an exertion videogame, uses a stationary bicycle as one of the main inputs for game playing; the user controls left-right movement and shooting actions in the game with a set of hand gesture commands. In the second game, the user controls and directs a helicopter over a city with a set of hand gesture commands.
|
4 |
Real-time Hand Gesture Detection and Recognition for Human Computer Interaction
Dardas, Nasser Hasan Abdel-Qader January 2012
|
5 |
Ανάπτυξη τεχνικών αντιστοίχισης εικόνων με χρήση σημείων κλειδιών / Development of image matching techniques using keypoints
Γράψα, Ιωάννα 17 September 2012
Stitching multiple images together to create high-resolution panoramas is one of the most popular consumer applications of image registration and blending. In this work, feature-based registration algorithms built on keypoints are used.
The first step is to extract from every image distinctive keypoints that are invariant to image scale and rotation, using the SIFT (Scale Invariant Feature Transform) algorithm. Once this has been done for all images, we look for the first pair of images to stitch. To check whether two images can be stitched, we match their SIFT keypoints. Once an initial set of feature correspondences has been computed, we need to find the subset that will produce a high-accuracy alignment. This is done with RANdom SAmple Consensus (RANSAC), which yields the geometric transformation between the two images, here a homography. If the number of corresponding keypoints is sufficient, that is, if the images match, we stitch them. Simply joining the images leaves visible seams, so the images are blended using Laplacian pyramids. The procedure is repeated until the final panorama is complete, each time taking the panorama produced in the previous step as the initial image.
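The RANSAC step can be sketched in a few lines. To keep the example short, the sketch below swaps the homography used in the thesis for a much simpler model, a pure 2-D translation; the sample-score-refit loop is the same, but the function name, threshold, iteration count and match coordinates are all hypothetical.

```python
import random

def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2-D translation from putative matches, ignoring outliers.

    matches: list of ((x1, y1), (x2, y2)) keypoint pairs.
    Returns (dx, dy, inlier_indices).
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.choice(matches)   # minimal sample: one match
        dx, dy = x2 - x1, y2 - y1                  # candidate model
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if (ax + dx - bx) ** 2 + (ay + dy - by) ** 2 <= threshold ** 2
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier displacements.
    n = len(best_inliers)
    dx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    dy = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return dx, dy, best_inliers

# Hypothetical matches: four agree on a shift of (100, 5); two are wrong.
matches = [
    ((10, 20), (110, 25)), ((30, 40), (130, 45)), ((50, 10), (150, 15)),
    ((70, 80), (170, 85)), ((20, 20), (300, 90)), ((60, 60), (5, 200)),
]
dx, dy, inliers = ransac_translation(matches)
```

The size of the consensus set is also a natural measure of whether two images overlap at all: a pair with too few inliers would not be stitched. For a homography, the minimal sample would be four matches instead of one.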
|
6 |
Descripteurs locaux pour l'imagerie radar et applications / Local features for SAR images and applications
Dellinger, Flora 01 July 2014
We study here the value of local features for optical and SAR satellite images. These features, because of their invariances and their compact representation, are well suited to the comparison of satellite images acquired under different conditions. While they are easy to apply to optical images, they offer limited performance on SAR images because of the strong multiplicative noise of such images.
We propose here an original feature for the comparison of SAR images. This algorithm, called SAR-SIFT, relies on the same structure as the SIFT algorithm (detection of keypoints and extraction of features) and offers better performance on SAR images. To adapt these steps to multiplicative noise, we have developed a differential operator, the Gradient by Ratio, which yields a gradient magnitude and orientation that are robust to this type of noise. This operator allows us to modify the steps of the SIFT algorithm. We also present two remote sensing applications based on local features. First, we estimate a global transformation between two SAR images with the help of SAR-SIFT. The estimation is carried out with a RANSAC algorithm, using the matched keypoints as tie points. Finally, we have conducted a prospective study on the use of local features for change detection in remote sensing. The proposed method compares the densities of matched keypoints to the densities of detected keypoints in order to highlight changed areas.
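The idea behind a ratio-based gradient can be illustrated with a small sketch. It is not the operator as defined in the thesis: it assumes plain box-window means and takes the logarithm of the ratio of means on opposite sides of a pixel, and all names and values are hypothetical. The point it shows is that a ratio, unlike a difference, is unchanged when the whole neighbourhood is multiplied by a constant.

```python
import math

def window_mean(img, rows, cols):
    """Mean intensity over the given row and column ranges."""
    values = [img[r][c] for r in rows for c in cols]
    return sum(values) / len(values)

def gradient_by_ratio(img, r, c, w=1):
    """Log-ratio gradient at interior pixel (r, c) using box windows of half-width w."""
    rows = range(r - w, r + w + 1)
    cols = range(c - w, c + w + 1)
    left = window_mean(img, rows, range(c - w, c))
    right = window_mean(img, rows, range(c + 1, c + w + 1))
    up = window_mean(img, range(r - w, r), cols)
    down = window_mean(img, range(r + 1, r + w + 1), cols)
    gx = math.log(right / left)   # horizontal component
    gy = math.log(down / up)      # vertical component
    return math.hypot(gx, gy), math.atan2(gy, gx)

# Hypothetical 5x5 intensity image with a vertical edge (strictly positive values).
image = [[10, 10, 10, 40, 40] for _ in range(5)]
brighter = [[3 * v for v in row] for row in image]   # same scene, 3x the intensity

mag, ori = gradient_by_ratio(image, 2, 2)
mag_b, ori_b = gradient_by_ratio(brighter, 2, 2)
```

Both calls return the same magnitude, log(4), and the same orientation, whereas a difference-based gradient would be three times larger on the brighter image.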
|
7 |
Use of Coherent Point Drift in computer vision applications
Saravi, Sara January 2013
This thesis presents the novel use of Coherent Point Drift (CPD) to improve the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images, rigid and non-rigid point set registration, which differ in the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations - or affine transforms - provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence distance between two point sets at the same time, without requiring an a priori declaration of the transformation model used. The first part of this thesis focuses on speaker identification in video conferencing. A real-time, audio-coupled video-based approach is presented, which focuses more on the video analysis side than on the audio analysis, which is known to be prone to errors. CPD is used for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform. The second part of the thesis focuses on multi-exposure and multi-focus image fusion with compensation for camera shake. The Scale Invariant Feature Transform (SIFT) is first used to detect keypoints in the images being fused. This point set is then reduced to remove outliers using RANSAC (RANdom SAmple Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts.
The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products currently on the market, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition (VMMR) in CCTV video footage. CPD is used to remove the skew of detected vehicles, as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approach angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results show that the proposed system achieves an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
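The way Coherent Point Drift estimates correspondences softly, rather than committing to hard matches, can be sketched with its expectation step. The sketch follows the standard published formulation of CPD (Gaussian components centred on the moving points plus a uniform outlier term), not any code from the thesis; the point sets, variance and outlier weight are hypothetical.

```python
import math

def cpd_posteriors(X, Y, sigma2, w=0.1):
    """P[m][n]: probability that target point X[n] was generated by moving point Y[m].

    w is the assumed weight of the uniform outlier component (0 <= w < 1).
    """
    N, M, D = len(X), len(Y), len(X[0])
    outlier = (2 * math.pi * sigma2) ** (D / 2) * (w / (1 - w)) * (M / N)
    P = [[0.0] * N for _ in range(M)]
    for n, x in enumerate(X):
        g = [math.exp(-math.dist(x, y) ** 2 / (2 * sigma2)) for y in Y]
        denom = sum(g) + outlier
        for m in range(M):
            P[m][n] = g[m] / denom
    return P

# Hypothetical point sets: Y is X shifted slightly.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Y = [(0.1, 0.0), (1.1, 0.0), (0.1, 1.0)]
P = cpd_posteriors(X, Y, sigma2=0.05)
best_match = [max(range(len(Y)), key=lambda m: P[m][n]) for n in range(len(X))]
```

In the full algorithm this step alternates with a maximisation step that updates the transformation applied to `Y` and the variance `sigma2`; only the correspondence half is shown here.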
|
8 |
Signal Processing Algorithms For Digital Image Forensics
Prasad, S 02 1900
The availability of digital cameras in various forms and of user-friendly image editing software has enabled people to create and manipulate digital images easily. While image editing can be used to enhance the quality of images, it can also be used to tamper with them for malicious purposes. In this context, it is important to question the originality of digital images. Digital image forensics deals with the development of algorithms and systems to detect tampering in digital images. This thesis presents some simple algorithms which can be used to detect tampering in digital images. Out of the various kinds of image forgeries possible, the discussion is restricted to photo compositing (photomontage) and copy-paste forgeries.
While creating a photomontage, it is very likely that one of the images needs to be resampled, and hence there will be an inconsistency in some of its underlying characteristics. Detection of resampling in an image therefore gives a clue as to whether the image has been tampered with. Two pixel-domain techniques to detect resampling have been presented. The first exploits the periodic zeros that occur in the second differences due to interpolation during resampling. It requires a special condition on the resampling factor to be met. The second technique is based on the periodic zero-crossings that occur in the second differences, and does not require any special condition on the resampling factor. It has been noted that this is an important property of resampling, and hence the robustness of this technique against mild counter-attacks such as JPEG compression and additive noise has been studied. This property has been used repeatedly throughout this thesis.
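The periodic zeros in the second difference are easy to reproduce in one dimension. The sketch below assumes the simplest case, linear interpolation by a factor of two, where every inserted sample is the average of its two neighbours and its second difference is therefore exactly zero; the pixel values and function names are hypothetical.

```python
def upsample_linear_2x(signal):
    """Insert the midpoint between every pair of neighbouring samples."""
    out = []
    for a, b in zip(signal, signal[1:]):
        out += [a, (a + b) / 2]
    out.append(signal[-1])
    return out

def second_difference(signal):
    """Discrete second difference x[n-1] - 2*x[n] + x[n+1] for interior samples."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1] for i in range(1, len(signal) - 1)]

original = [3, 8, 2, 9, 4, 7]               # hypothetical row of pixel values
resampled = upsample_linear_2x(original)    # 11 samples
d2 = second_difference(resampled)           # d2[k] corresponds to resampled[k + 1]
zero_positions = [k for k, v in enumerate(d2) if v == 0]
```

The zeros fall on every second position of `d2`, a periodicity that the second difference of the original row does not show.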
It is a well-known fact that interpolation is essentially low-pass filtering. In a photomontage image, which consists of resampled and non-resampled portions, there will be an inconsistency in the high-frequency content of the image. This can be demonstrated by simple high-pass filtering of the image. This fact has also been exploited to detect photomontaging. One approach involves performing a block-wise DCT and reconstructing the image using only a few high-frequency coefficients. Another elegant approach is to decompose the image using wavelets and reconstruct the image using only the diagonal detail coefficients. In both cases, mere visual inspection will reveal the forgery.
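The loss of high-frequency content under interpolation can be seen with a one-dimensional toy example. The sketch uses the filter [-1, 2, -1] as a stand-in for "simple high-pass filtering" and 2x linear interpolation as the resampling; neither choice is taken from the thesis, and the signal is hypothetical.

```python
def high_pass_energy(signal):
    """Mean squared response of the high-pass filter [-1, 2, -1]."""
    responses = [2 * signal[i] - signal[i - 1] - signal[i + 1] for i in range(1, len(signal) - 1)]
    return sum(r * r for r in responses) / len(responses)

# Hypothetical 1-D "image row": a fine texture, and the same texture
# after 2x linear interpolation (as would happen to a resampled insert).
texture = [0, 1, 0, 1, 0, 1, 0, 1, 0]
interpolated = []
for a, b in zip(texture, texture[1:]):
    interpolated += [a, (a + b) / 2]
interpolated.append(texture[-1])

energy_original = high_pass_energy(texture)
energy_resampled = high_pass_energy(interpolated)
```

The interpolated row retains only a fraction of the high-pass energy of the original texture, which is the kind of mismatch that makes a resampled region stand out from its surroundings.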
The second part of the thesis is related to tamper detection in colour filter array (CFA) interpolated images. Digital cameras employ Bayer filters to efficiently capture the RGB components of an image. The outputs of the Bayer filter are sub-sampled versions of the R, G and B components, which are completed using demosaicing algorithms. It has been shown that demosaicing of the colour components is equivalent to resampling the image by a factor of two. Hence, CFA interpolated images contain periodic zero-crossings in their second differences. The presence of periodic zero-crossings has been demonstrated experimentally in images captured using four digital cameras of different brands. When such an image is tampered with, these periodic zero-crossings are destroyed and hence the tampering can be detected. The utility of zero-crossings in detecting various kinds of forgeries on CFA interpolated images has been discussed.
The next part of the thesis presents a technique to detect copy-paste forgery in images. Generally, when an object or a portion of an image has to be erased, the easiest way to do it is to copy a portion of the background from the same image and paste it over the object. In such a case, there are two pixel-wise identical regions in the same image, which, when detected, can serve as a clue to tampering. The use of the Scale-Invariant Feature Transform (SIFT) in detecting this kind of forgery has been studied. Certain modifications that can be made to the image in order to get SIFT working effectively have also been proposed.
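One way to picture keypoint-based copy-paste detection is to match an image's descriptors against each other and flag near-identical descriptors found at clearly separated locations. The sketch below is a generic illustration of that idea, not the method of the thesis; the thresholds, the function name and the 3-D stand-in descriptors are hypothetical.

```python
import math

def find_cloned_pairs(keypoints, max_descriptor_dist=0.1, min_pixel_dist=10.0):
    """Pairs of keypoints with near-identical descriptors at distinct locations.

    keypoints: list of ((x, y), descriptor) tuples from a single image.
    """
    pairs = []
    for i in range(len(keypoints)):
        pi, di = keypoints[i]
        for j in range(i + 1, len(keypoints)):
            pj, dj = keypoints[j]
            similar = math.dist(di, dj) <= max_descriptor_dist
            far_apart = math.dist(pi, pj) >= min_pixel_dist
            if similar and far_apart:
                pairs.append((i, j))
    return pairs

# Hypothetical keypoints with 3-D descriptors (real SIFT descriptors have 128 dimensions).
keypoints = [
    ((20, 30), (0.9, 0.1, 0.3)),     # background patch
    ((120, 35), (0.9, 0.1, 0.3)),    # same patch pasted elsewhere
    ((60, 80), (0.2, 0.7, 0.5)),     # unrelated structure
    ((22, 31), (0.88, 0.12, 0.3)),   # near-duplicate right next to the first keypoint
]
suspicious = find_cloned_pairs(keypoints)
```

The minimum pixel distance keeps adjacent keypoints on the same structure from being reported against each other, while both keypoints in the source patch are still paired with the pasted copy.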
Throughout the thesis, the importance of human intervention in making the final decision about the authenticity of an image has been highlighted and it has been concluded that the techniques presented in the thesis can effectively help the decision making process.
|