1 |
Compact features for mobile visual search = Descritores compactos para busca visual em dispositivos móveis. Hidalgo Flores, Paul Joseph, 1986-, 2015
Advisor: Eduardo Alves do Valle Junior / Master's dissertation - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
Abstract: Mobile Visual Search (MVS) applications have become possible thanks to the computational power and multiple sensors of current mobile devices (smartphones, tablets). In addition, the state of the art in content-based image retrieval (CBIR) has reached a maturity that allows these tasks to be performed efficiently. In this dissertation, we present a study of the major techniques in CBIR. An extensive survey of the literature was carried out, covering both the most common gradient-based descriptors and the recently proposed binary and compact descriptors. As a result of a comparative analysis of the main techniques in the context of MVS, we present the most appropriate alternatives for use in such applications / Master's / Engenharia de Computação / Mestre em Engenharia Elétrica
|
2 |
Um descritor de imagens baseado em particionamento extremo para busca em bases grandes e heterogêneas = An image descriptor based on extreme partitioning for search in large and heterogeneous databases. Vidal, Márcio Luiz Assis, 25 October 2013
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Abstract: In this thesis we propose a new image descriptor that addresses the problem of image search in large and heterogeneous databases. The approach uses extreme partitioning to obtain visual properties of images, which are then converted into a textual description. Once the textual description is generated, traditional text-based information retrieval techniques can be applied. The key point of the proposed work is scalability, given that text-based search techniques can handle databases with millions of documents. We carried out experiments to confirm the viability of our proposal. The results showed that our technique reaches higher precision levels than other content-based image retrieval techniques on a database with more than 100,000 images.
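The core idea above, turning per-partition visual properties into text tokens so that an ordinary inverted index does the retrieval, can be sketched as follows. This is not the thesis's actual descriptor: the grid partitioning, token format, quantization levels, and overlap scoring are all illustrative assumptions.

```python
def cell_tokens(image, grid=4, levels=4):
    """Partition `image` (a 2-D list of gray values 0-255) into grid x grid
    cells and emit one token per cell encoding its quantized mean intensity."""
    h, w = len(image), len(image[0])
    tokens = []
    for gy in range(grid):
        for gx in range(grid):
            ys = range(gy * h // grid, (gy + 1) * h // grid)
            xs = range(gx * w // grid, (gx + 1) * w // grid)
            vals = [image[y][x] for y in ys for x in xs]
            bucket = (sum(vals) // max(len(vals), 1)) * levels // 256
            tokens.append(f"c{gy}_{gx}_v{bucket}")  # e.g. "c0_3_v2"
    return tokens

class InvertedIndex:
    def __init__(self):
        self.postings = {}  # token -> set of document ids

    def add(self, doc_id, tokens):
        for t in tokens:
            self.postings.setdefault(t, set()).add(doc_id)

    def query(self, tokens):
        """Rank documents by how many query tokens they share."""
        scores = {}
        for t in tokens:
            for d in self.postings.get(t, ()):
                scores[d] = scores.get(d, 0) + 1
        return sorted(scores.items(), key=lambda kv: -kv[1])

flat = [[100] * 8 for _ in range(8)]             # uniform image
split = [[0] * 4 + [255] * 4 for _ in range(8)]  # half dark, half bright
idx = InvertedIndex()
idx.add("flat", cell_tokens(flat))
idx.add("split", cell_tokens(split))
best, _ = idx.query(cell_tokens(flat))[0]
print(best)  # the uniform image matches itself best -> "flat"
```

Once the image is in token form, any scalable text engine can replace the toy index, which is exactly what makes the approach attractive for million-document collections.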
|
3 |
Localisation temps-réel d'un robot par vision monoculaire et fusion multicapteurs / Real-time robot location by monocular vision and multi-sensor fusion. Charmette, Baptiste, 14 December 2012
Abstract: This dissertation presents a vision-based localization system for a mobile robot in an urban context. The robot is first driven manually to record a learning image sequence. These images are then processed offline to build a 3D map of the area. The vehicle can then be driven in the area, either automatically or manually, and the images seen by the camera are used to compute its position in the map. In contrast to previous works, the trajectory can differ from the learning sequence: the algorithm keeps localizing in spite of significant viewpoint changes from the learning images. To do so, the features are modeled as locally planar patches whose orientation is known. While the vehicle is moving, its position is predicted and the patches are warped to model the viewpoint change, so that matching them with points in the current image is eased because their appearances are almost the same. After matching, the 3D positions of the patches associated with 2D points on the image are used to compute the robot position. The patch warp is computationally expensive; to achieve real-time performance, the algorithm was implemented on a GPU architecture, with many improvements made using the tools the GPU provides. To make the pose prediction as precise as possible, a motion model of the robot was developed that uses, in addition to the vision-based localization, information from odometric sensors. Experiments with this prediction model show that the system is more robust, especially in case of image loss. Finally, many experiments in real conditions are described at the end of the dissertation; a differential GPS reference is used to evaluate the localization accuracy.
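The patch-warping step described above can be illustrated with a homography: for a locally planar patch of known orientation, a predicted camera motion induces a 3x3 homography that maps patch points between views. A minimal sketch, assuming the homography H is already given (here a pure in-plane rotation, chosen so the result is easy to check):

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2-D points through a 3x3 homography (homogeneous coordinates)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous
    mapped = (H @ pts_h.T).T
    return mapped[:, :2] / mapped[:, 2:3]             # back to Euclidean

# A pure in-plane rotation of 90 degrees about the origin.
theta = np.pi / 2
H = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])

corners = np.array([[1.0, 0.0], [0.0, 1.0]])
print(np.round(apply_homography(H, corners), 6))
# (1,0) maps to (0,1) and (0,1) maps to (-1,0)
```

In the full system, H would be derived from the predicted robot pose and the patch plane; warping the stored patch with it makes template matching against the live image nearly appearance-invariant.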
|
4 |
Identificação de manipulações de cópia-colagem em imagens digitais / Copy-move forgery identification in digital images. Silva, Ewerton Almeida, 1988-, 07 December 2012
Advisor: Anderson de Rezende Rocha / Master's dissertation - Universidade Estadual de Campinas, Instituto de Computação
Abstract: In this work, we investigate two approaches to copy-move forgery detection in digital images. The first relies on the Generalized PatchMatch algorithm [4], which finds patch correspondences in one or more images. We apply Generalized PatchMatch to a suspicious image to obtain, for each of its patches, a set of similar patches based on histogram distances; we then check each patch's correspondences to decide whether they are portions of a duplicated region. The second approach, our main contribution, is based on a voting and multiscale analysis process. Given a suspicious image, we extract interest points robust to scale and rotation transformations and find possible correspondences among them. We group the corresponding points into regions under geometric constraints, such as the physical distance and the inclination of the line between points. We then construct a multiscale pyramid representing the image scale space and, at each scale, examine the groups using a descriptor robust to rotation, scaling, and compression. This process shrinks the search space for duplicated regions and yields a detection map per scale. The final decision is a vote among all maps: a pixel is considered part of a manipulation if it is marked as such in the majority of the pyramid scales. We validate both methods on a dataset we built, comprising 108 original and highly realistic clonings, and compare them with state-of-the-art methods on the same dataset / Master's / Ciência da Computação / Mestre em Ciência da Computação
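Both approaches above search for near-duplicate regions inside a single image. A drastically simplified sketch of that underlying idea, using exact block matching rather than PatchMatch histograms or multiscale voting, looks like this (block size and the toy image are illustrative):

```python
def find_duplicate_blocks(image, block=2):
    """Return pairs of top-left corners whose block x block pixel blocks
    are identical: candidate cloned regions (exact matching only)."""
    h, w = len(image), len(image[0])
    seen = {}    # block contents -> first position where they occurred
    pairs = []
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            key = tuple(image[y + dy][x + dx]
                        for dy in range(block) for dx in range(block))
            if key in seen:
                pairs.append((seen[key], (y, x)))
            else:
                seen[key] = (y, x)
    return pairs

# A 4x4 image where the 2x2 block at (0,0) has been cloned to (2,2).
img = [[ 1,  2, 10, 11],
       [ 3,  4, 12, 13],
       [14, 15,  1,  2],
       [16, 17,  3,  4]]
print(find_duplicate_blocks(img))  # [((0, 0), (2, 2))]
```

Real detectors replace exact equality with robust descriptors and distance thresholds, precisely because cloned regions are usually rescaled, rotated, or recompressed before saving.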
|
5 |
Image matching using rotating filters / Mise en correspondance d'images avec des filtres tournants. Venkatrayappa, Darshan, 04 December 2015
Abstract: Nowadays, computer vision algorithms abound in applications related to video surveillance, 3D reconstruction, autonomous vehicles, medical imaging, etc. Image/object matching and detection is an integral step in many of these algorithms. The most common methods for image/object matching and detection are based on local image descriptors: interest points in the image are first detected, then image features are extracted from the neighbourhood of each interest point, and finally the image descriptor is constructed. In this thesis, we present contributions to the field of image feature matching using rotating half filters. We follow three approaches: first, we present a new low bit-rate descriptor and a cascade matching strategy, integrated on a video platform. Second, we construct a new local image patch descriptor by embedding the response of rotating half filters in the Histogram of Oriented Gradients (HOG) framework. Finally, we propose a new approach to descriptor construction based on second-order image statistics. All three approaches provide interesting and promising results, outperforming state-of-the-art descriptors. Keywords: rotating half filters, local image descriptor, image matching, Histogram of Oriented Gradients (HOG), Difference of Gaussians (DoG).
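The HOG framework that the second contribution builds on reduces, at its core, to a magnitude-weighted histogram of gradient orientations over a patch. The sketch below computes that core with plain image gradients; the thesis's rotating half filters are not reproduced here, and the bin count is an illustrative choice.

```python
import numpy as np

def orientation_histogram(patch, bins=8):
    """Histogram of gradient orientations weighted by gradient magnitude,
    L2-normalized: the basic ingredient of HOG-style descriptors."""
    gy, gx = np.gradient(patch.astype(float))     # per-axis finite differences
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)   # orientation in [0, 2*pi)
    idx = (ang / (2 * np.pi) * bins).astype(int) % bins
    hist = np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

# A vertical edge: the gradient points along +x, so bin 0 dominates.
patch = np.tile([0, 0, 255, 255], (4, 1))
h = orientation_histogram(patch)
print(h.argmax())  # 0
```

A full descriptor concatenates such histograms over a grid of cells with block normalization; replacing the plain gradient with half-filter responses is where the thesis departs from standard HOG.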
|
6 |
Ré-identification de personnes à partir des séquences vidéo / Person re-identification from video sequences. Ibn Khedher, Mohamed, 01 July 2014
Abstract: This thesis addresses, in a video surveillance context, the problem of human re-identification through a network of cameras with non-overlapping fields of view. Re-identification is the task of determining whether a person leaving the field of one camera reappears in another. It is particularly difficult because a person's appearance changes significantly across cameras due to various factors. We propose to exploit the complementarity of a person's appearance and style of movement, leading to a description that is more robust to these complexity factors. This is a new approach, as re-identification has usually been treated by appearance methods only. The major contributions concern person description and feature matching. We study two re-identification scenarios: simple and complex. In the simple scenario, we study the feasibility of two approaches: a biometric approach based on gait, and an appearance approach based on spatial interest points (IPs) and color features. In the complex scenario, we propose to fuse appearance and motion features: motion is described by spatio-temporal IPs and appearance by spatial IPs. For feature matching, we use sparse representation as a local matching method between IPs. The fusion scheme computes a weighted sum of the IPs' votes and then applies the majority-vote rule. We also carry out an error analysis to identify the sources of error in our system and the most promising directions for improvement.
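The fusion scheme described above, a weighted sum of interest-point votes per channel followed by picking the majority winner, can be sketched as follows. The channel weights, vote counts, and identity labels are illustrative assumptions, not values from the thesis.

```python
def fuse_votes(appearance_votes, motion_votes, w_app=0.6, w_mot=0.4):
    """Combine per-candidate vote counts from two feature channels by a
    weighted sum, then return the candidate with the highest fused score."""
    scores = {}
    for cand, v in appearance_votes.items():
        scores[cand] = scores.get(cand, 0.0) + w_app * v
    for cand, v in motion_votes.items():
        scores[cand] = scores.get(cand, 0.0) + w_mot * v
    return max(scores, key=scores.get)

# Appearance IPs favour "id_7", motion IPs favour "id_3"; the weighted
# sum (6.8 vs 6.0) settles the disagreement in favour of "id_7".
print(fuse_votes({"id_7": 10, "id_3": 4}, {"id_3": 9, "id_7": 2}))  # id_7
```

In the actual system each vote would come from a sparse-representation match between interest points; only the aggregation step is shown here.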
|
7 |
Reconhecimento de texto e rastreamento de objetos 2D/3D / Text recognition and 2D/3D object tracking. Minetto, Rodrigo, 1983-, 2012
Advisors: Jorge Stolfi, Neucimar Jerônimo Leite / Doctoral thesis - Universidade Estadual de Campinas, Instituto de Computação
Abstract: In this thesis we address three computer vision problems: (1) the detection and recognition of flat text objects in images of real scenes; (2) the tracking of such text objects in a digital video; and (3) the tracking of an arbitrary three-dimensional rigid object with known markings in a digital video. For each problem we developed innovative algorithms that are at least as accurate and robust as other state-of-the-art algorithms. Specifically, for text classification we developed (and extensively evaluated) a new HOG-based descriptor specialized for Roman script, which we call T-HOG, and showed its value as a post-filter for an existing text detector (SNOOPERTEXT). We also improved the SNOOPERTEXT algorithm by using a multi-scale technique to handle widely different letter sizes while limiting the algorithm's sensitivity to various artifacts. For text tracking, we describe four basic ways of combining a text detector and a text tracker, and we developed a specific tracker based on a particle filter that exploits the T-HOG recognizer. For rigid object tracking we developed a new accurate and robust algorithm (AFFTRACK) that combines the KLT feature tracker with an improved camera calibration procedure. We extensively tested our algorithms on several benchmarks well known in the literature, and created publicly available benchmarks for the evaluation of text detection and tracking and rigid object tracking algorithms / Doctorate / Ciência da Computação / Doutor em Ciência da Computação
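The particle filter underlying the text tracker above follows the generic predict / weight / resample cycle. A minimal 1-D sketch of that cycle, with an illustrative likelihood and motion model in place of the T-HOG score and text dynamics, looks like this:

```python
import random

def particle_filter_step(particles, measure, motion_std=1.0):
    """One predict / weight / resample cycle of a 1-D particle filter."""
    # Predict: diffuse particles according to the motion model.
    particles = [p + random.gauss(0, motion_std) for p in particles]
    # Weight: score each particle against the current measurement.
    weights = [measure(p) for p in particles]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Resample: draw a new population proportionally to the weights.
    return random.choices(particles, weights=weights, k=len(particles))

random.seed(0)
target = 5.0                                         # true (hidden) state
measure = lambda p: 1.0 / (1.0 + (p - target) ** 2)  # stand-in likelihood
parts = [random.uniform(-10, 10) for _ in range(500)]
for _ in range(10):
    parts = particle_filter_step(parts, measure)
est = sum(parts) / len(parts)
print(f"estimate after 10 steps: {est:.2f}")  # concentrates near the target
```

In the tracker itself, each particle would be a candidate text-region pose and the likelihood would come from the T-HOG response rather than this synthetic score.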
|