  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Contributions to 3D object recognition and 3D hand pose estimation using deep learning techniques

Gomez-Donoso, Francisco 18 September 2020
In this thesis, two rapidly growing fields within artificial intelligence are studied. The first part of the document covers 3D object recognition methods. Object recognition in general is about giving an intelligent system the ability to understand which objects appear in its input data. Any robot, from industrial robots to social robots, could benefit from such a capability to improve its performance and carry out high-level tasks. In fact, this topic has been studied extensively, and some state-of-the-art object recognition methods outperform humans in terms of accuracy. Nonetheless, these methods are image-based; that is, they focus on recognizing visual features. This can be a problem in some contexts, because some objects look like other, different objects. For instance, a social robot might recognize a face in a picture, or an intelligent car might recognize a pedestrian on a billboard. A potential solution to this issue is to involve three-dimensional data, so that the systems focus on topological rather than visual features. Thus, in this thesis, a study of 3D object recognition methods is carried out. The approaches proposed in this document, which take advantage of deep learning methods, take point clouds as input and are able to provide the correct category. We evaluated the proposals on a range of public challenges, datasets and real-life data with high success. The second part of the thesis is about hand pose estimation. This is also an interesting topic, which focuses on providing the hand's kinematics. A range of systems, from human-computer interaction and virtual reality to social robots, could benefit from such a capability, for instance to interface with a computer and control it through seamless hand gestures, or to interact with a social robot that is able to understand human non-verbal communication. Thus, in the present document, hand pose estimation approaches are proposed.
It is worth noting that the proposals take color images as input and are able to provide the 2D and 3D hand pose in the image plane and Euclidean coordinate frames. Specifically, the hand poses are encoded as a collection of points that represent the joints of a hand, so that the full hand pose can be easily reconstructed from them. The methods are evaluated on custom and public datasets, and integrated into a robotic hand teleoperation application with great success.
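To illustrate how a hand pose encoded as a set of joint points can live in both a Euclidean frame and the image plane, the following is a minimal sketch of a pinhole projection. It is not the estimation method of the thesis, which the abstract describes only as deep-learning based. The function name `project_joints`, the camera intrinsics (`fx`, `fy`, `cx`, `cy`) and the example joint coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_joints(joints_3d: List[Point3D], fx: float, fy: float,
                   cx: float, cy: float) -> List[Point2D]:
    """Project 3D joints (camera frame, z pointing forward) to pixel coordinates.

    Uses the standard pinhole model; the intrinsics are hypothetical values
    supplied by the caller, not parameters taken from the thesis.
    """
    pixels = []
    for x, y, z in joints_3d:
        if z <= 0:
            raise ValueError("joint must lie in front of the camera (z > 0)")
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical two-joint "hand": wrist and index fingertip, in metres.
hand = [(0.0, 0.0, 0.5), (0.05, -0.1, 0.5)]
print(project_joints(hand, fx=600.0, fy=600.0, cx=320.0, cy=240.0))
```

A full hand would simply carry more joints in the list; the projection treats each joint independently.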
12

Object registration in semi-cluttered and partial-occluded scenes for augmented reality

Gao, Q.H., Wan, Tao Ruan, Tang, W., Chen, L. 26 November 2018
This paper proposes a stable and accurate object registration pipeline for markerless augmented reality applications. We present two novel algorithms for object recognition and matching that improve the accuracy of the model-to-scene transformation via point cloud fusion. Whilst the first algorithm deals effectively with simple scenes with few object occlusions, the second algorithm handles cluttered scenes with partial occlusions for robust real-time object recognition and matching. The computational framework includes a locally supported Gaussian weight function to enable repeatable detection of 3D descriptors. We apply bilateral filtering and outlier removal to preserve the edges of the point cloud and remove interference points in order to increase matching accuracy. Extensive experiments have been carried out to compare the proposed algorithms with four of the most commonly used methods. Results show improved performance of the algorithms in terms of computational speed, camera tracking and object matching errors in semi-cluttered and partially occluded scenes. / Shanxi Natural Science and Technology Foundation of China, grant numbers 2016JZ026 and 2016KW-043.
13

Deep-learning Approaches to Object Recognition from 3D Data

Chen, Zhiang 30 August 2017
No description available.
14

Tecnologia para o reconhecimento do formato de objetos tri-dimensionais. / Three dimensional shape recognition technology.

Gonzaga, Adilson 05 July 1991
We present in this work a new method for three-dimensional shape recognition. Traditional computer vision systems use two-dimensional TV camera images, which are rich in the detail needed for human vision. In most industrial robotics applications this excess of detail is unnecessary, and traditional classification algorithms spend a lot of time processing it. For the present work we developed a dedicated recognition system, which deflects a laser beam over an object and digitizes the luminance of the reflected beam point by point over the surface. The intensity of the reflected beam is proportional to the distance from the surface point to the observer. Using this technique it was possible to establish features to classify various objects. These features are the slope of each polyhedral face, the boundary type and the presence of inner edges. For each object the features are labeled, and the classification algorithm searches a previously established "knowledge base" for the object description. The object to be recognized is placed on a rotating table, which can be turned to supply a new view for the classification. The recognition system used a He-Ne laser, and the reflected signal was captured by a phototransistor. A microcomputer controls the system operation, and the object is recognized in real time. The recognizable objects were simple regular polyhedra, namely: 1 triangular-base prism, 1 cube, 1 triangular-base pyramid and 1 rectangular-base pyramid. The mathematical treatment used is intended to validate the proposed technology and can be extended in future work to other objects, such as those with curved surfaces.
15

3d Object Recognition From Range Images

Izciler, Fatih 01 September 2012
Recognizing generic objects from single- or multi-view range images is a popular contemporary problem in 3D object recognition, driven by the development of scanning devices such as laser range scanners. This problem is vital to current and future vision systems that perform shape-based matching and classification of objects in an arbitrary scene. Despite improvements in scanners, range scans still have imperfections such as holes or unconnected parts. This study aims at proposing and comparing algorithms that match a range image to complete 3D models in a target database. The study started with a baseline algorithm that uses a statistical representation of 3D shapes based on 4D geometric features, namely SURFLET-Pair relations. The feature describes the geometrical relation of a surface-point pair and reflects local and global characteristics of the object. In search of a solution to the problem, another algorithm is considered that interprets SURFLET-Pairs as in the baseline algorithm, in which histograms of the features are used. Moreover, two other methods are proposed by applying 2D space-filling curves to range images and 4D space-filling curves to histograms of SURFLET-Pairs. Wavelet transforms are used for filtering purposes in these algorithms. These methods are intended to be compact, robust, independent of a global coordinate frame and descriptive enough to distinguish the queries' categories. The baseline and proposed algorithms are implemented on a database in which range scans of real objects with imperfections are the queries and generic 3D objects from various categories form the target dataset.
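The 4D feature of a surface-point pair mentioned in the abstract can be sketched as follows. This is a minimal sketch assuming the commonly cited surflet-pair formulation (a local frame built from the first point's normal and the connecting line, giving two angular terms, one cosine term and the point distance); the abstract does not give the exact variant, binning or origin-selection rule used in the thesis, so those are left out. The function name `surflet_pair_feature` and the helper names are hypothetical, and normals are assumed to be unit length.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def surflet_pair_feature(p1, n1, p2, n2):
    """Return (alpha, beta, gamma, delta) for two oriented points.

    p1 is treated as the origin of the pair; which point plays that role
    is a convention the caller must fix (assumption, not shown here).
    """
    d = _sub(p2, p1)
    dist = math.sqrt(_dot(d, d))
    u = n1                                   # first frame axis: origin normal
    v_raw = _cross(d, u)
    v_len = math.sqrt(_dot(v_raw, v_raw))
    if dist == 0.0 or v_len == 0.0:
        raise ValueError("degenerate pair: coincident points or normal parallel to the connecting line")
    v = tuple(c / v_len for c in v_raw)      # second axis, perpendicular to u and d
    w = _cross(u, v)                         # third axis completes the frame
    alpha = math.atan2(_dot(w, n2), _dot(u, n2))
    beta = _dot(v, n2)
    gamma = _dot(u, d) / dist
    return alpha, beta, gamma, dist


# Two points on a flat patch, one unit apart, both normals pointing up.
print(surflet_pair_feature((0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                           (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```

Because every term is computed in a frame attached to the point pair, the feature does not depend on a global coordinate frame, which matches the property the abstract asks of the descriptors. A shape signature would then be a histogram over many sampled pairs.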
16

Representations and matching techniques for 3D free-form object and face recognition

Mian, Ajmal Saeed January 2007
[Truncated abstract] The aim of visual recognition is to identify objects in a scene and estimate their pose. Object recognition from 2D images is sensitive to illumination, pose, clutter and occlusions. Object recognition from range data on the other hand does not suffer from these limitations. An important paradigm of recognition is model-based whereby 3D models of objects are constructed offline and saved in a database, using a suitable representation. During online recognition, a similar representation of a scene is matched with the database for recognizing objects present in the scene . . . The tensor representation is extended to automatic and pose invariant 3D face recognition. As the face is a non-rigid object, expressions can significantly change its 3D shape. Therefore, the last part of this thesis investigates representations and matching techniques for automatic 3D face recognition which are robust to facial expressions. A number of novelties are proposed in this area along with their extensive experimental validation using the largest available 3D face database. These novelties include a region-based matching algorithm for 3D face recognition, a 2D and 3D multimodal hybrid face recognition algorithm, fully automatic 3D nose ridge detection, fully automatic normalization of 3D and 2D faces, a low cost rejection classifier based on a novel Spherical Face Representation, and finally, automatic segmentation of the expression insensitive regions of a face.
18

Uma proposta de estruturação e integração de processamento de cores em sistemas artificiais de visão. / A proposal for structuration and integration of color processing in artificial vision systems.

Moreira, Jander 05 July 1999
This thesis describes an approach to color information processing in the biologically inspired artificial vision system named Cyvis-1. Considering that most of the current literature on image segmentation deals with gray-level images, the use of color information in segmentation remains an underexplored area, which has motivated this research. This work defines the color subsystem within the underlying philosophy of Cyvis-1, whose main principles include hierarchy, modularity, processing specialization, multilevel integration, effective representation of visual information, and high-level knowledge integration. The color subsystem is introduced according to this framework, with a proposal of a segmentation technique for color images based on self-organizing maps, which classify the image pixels. The number of classes in the image is determined through an unsupervised clustering approach, reducing the need for human intervention. This segmentation technique produces maps of the regions found, from which an edge map is derived. Another main topic of this work is a comparative study of the edge maps produced by several edge-oriented segmentation techniques. A reference edge map is used as the standard segmentation, to which the other edge maps are compared. The analysis is carried out by means of local attributes (local gray-level and color contrasts). As a consequence of the comparison, a combined edge map is also proposed, based on the conditional selection of techniques according to their local performance.
Finally, the integration of the two topics above is proposed, characterized by the design of the color subsystem of Cyvis-1 together with the modules for image acquisition, shape analysis and polyhedral object recognition. In this context, integration with the stereo subsystem is accomplished, allowing the computation of the three-dimensional data needed for object recognition. Assessment and validation of the three proposals were carried out, providing the means for analyzing their efficiency and limitations.
20

Inexact graph matching : application to 2D and 3D Pattern Recognition / Appariement inexact de graphes : application à la reconnaissance de formes 2D et 3D

Madi, Kamel 13 December 2016
Graphs are powerful mathematical modeling tools used in various fields of computer science, in particular in pattern recognition. Graph matching is the main operation in graph-based pattern recognition. Finding solutions to the graph matching problem that ensure optimality in terms of accuracy and time complexity is a difficult and topical research challenge. In this thesis, we investigate this problem in two fields: 2D and 3D pattern recognition.
Firstly, we address the problem of geometric graph matching and its applications to 2D pattern recognition. The recognition of Kites (archaeological structures) in satellite images is the main application considered in this first part. We present a complete graph-based framework for Kite recognition in satellite images, with two main contributions. The first is an automatic process for extracting Kites from real images and transforming them into graphs, together with a process for randomly generating synthetic Kite graphs. These allow us to construct a benchmark of Kite graphs (real and synthetic) structured in different levels of deformation. The second contribution of this part is a new graph similarity measure adapted to geometric graphs and consequently to Kite graphs. The proposed approach combines graph invariants with a geometric graph edit distance computation. Secondly, we address the problem of recognizing deformable 3D objects represented by graphs, i.e., triangular tessellations. We propose a new decomposition of triangular tessellations into a set of substructures that we call triangle-stars. Based on this decomposition, we propose a new graph matching algorithm to measure the distance between triangular tessellations. The proposed algorithm ensures a minimum number of disjoint triangle-stars, offers a better similarity measure by covering a larger neighbourhood, and uses a set of descriptors that are invariant, or at least tolerant, to the most common deformations. Finally, we propose a more general graph matching approach founded on a new formalization based on the stable marriage problem. The proposed approach is optimal in terms of execution time, i.e., the time complexity is quadratic, O(n²), and flexible in terms of applicability (2D and 3D). It relies on a decomposition into substructures followed by a matching of these substructures using the stable marriage algorithm.
The analysis of the time complexity of the proposed algorithms and the extensive experiments conducted on Kite graph data sets (real and synthetic) and standard data sets (2D and 3D) attest to the effectiveness, high performance and accuracy of the proposed approaches and show that they are extensible and quite general.
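The stable-marriage formalization mentioned in the abstract can be illustrated with the classic Gale-Shapley procedure applied to substructures of two graphs. This is only a sketch under stated assumptions: the abstract does not say how substructures are compared or how the final graph distance is aggregated, so the dissimilarity matrix `cost`, the function names `stable_match` and `graph_distance`, and the use of a plain sum are all hypothetical. Both graphs are assumed to have the same number of substructures.

```python
from typing import List


def stable_match(cost: List[List[float]]) -> List[int]:
    """Gale-Shapley matching between substructures of graph A and graph B.

    cost[i][j] is an assumed dissimilarity between substructure i of A and
    substructure j of B (square matrix). Returns match[i] = j.
    """
    n = len(cost)
    # Each A-substructure ranks the B-substructures by increasing cost.
    prefs = [sorted(range(n), key=row.__getitem__) for row in cost]
    next_choice = [0] * n          # next B candidate each A item will propose to
    partner_of_b = [-1] * n        # current A partner of each B item
    free = list(range(n))
    while free:
        i = free.pop()
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = partner_of_b[j]
        if current == -1:
            partner_of_b[j] = i
        elif cost[i][j] < cost[current][j]:
            partner_of_b[j] = i    # j prefers the closer substructure
            free.append(current)
        else:
            free.append(i)         # rejected; i proposes again later
    match = [-1] * n
    for j, i in enumerate(partner_of_b):
        match[i] = j
    return match


def graph_distance(cost: List[List[float]]) -> float:
    """Sum of matched costs, used here as a hypothetical graph distance."""
    return sum(cost[i][j] for i, j in enumerate(stable_match(cost)))


cost = [[1.0, 4.0],
        [2.0, 3.0]]
print(stable_match(cost), graph_distance(cost))
```

In this sketch the proposal loop makes at most n² proposals, in line with the quadratic bound quoted in the abstract; building the sorted preference lists adds a logarithmic factor here, which a real implementation might avoid.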
