301

Reconnaissance perceptuelle des objets d’Intérêt : application à l’interprétation des activités instrumentales de la vie quotidienne pour les études de démence / Perceptual object of interest recognition : application to the interpretation of instrumental activities of daily living for dementia studies

Buso, Vincent 30 November 2015
The rationale and motivation of this PhD thesis is the diagnosis, assessment, maintenance and promotion of self-independence of people with dementia in their Instrumental Activities of Daily Living (IADLs). In this context a strong focus is placed on the task of automatically recognizing IADLs. Egocentric video analysis (cameras worn by a person) has recently gained much interest regarding this goal. Indeed, recent studies have demonstrated how crucial the recognition of active objects (manipulated or observed by the person wearing the camera) is for the activity recognition task, and egocentric videos present the advantage of a strong differentiation between active and passive objects (those associated with the background). One recent approach towards finding active elements in a scene is the incorporation of visual saliency into object recognition paradigms. Modeling the selective process of human perception of visual scenes represents an efficient way to drive the scene analysis towards particular areas considered of interest, or salient, which, in egocentric videos, strongly correspond to the locations of objects of interest. The objective of this thesis is to design an object recognition system that relies on visual saliency maps to provide more precise object representations that are robust against background clutter and therefore improve the recognition of active objects for the IADLs recognition task. This PhD thesis is conducted in the framework of the Dem@care European project.
Regarding the vast field of visual saliency modeling, we investigate and propose contributions in both the Bottom-up (gaze driven by stimuli) and the Top-down (gaze driven by semantics) areas, aiming to enhance the particular task of active object recognition in egocentric video content. Our first contribution on Bottom-up models originates from the fact that observers of a video are normally attracted by a central stimulus (the center of an image). This biological phenomenon is known as central bias. In egocentric videos, however, this hypothesis does not always hold. We study saliency models with non-central-bias geometric cues. The proposed visual saliency models are trained on recorded eye fixations of observers and incorporated into spatio-temporal saliency models. When compared to state-of-the-art visual saliency models, the ones we present show promising results, as they highlight the necessity of a non-centered geometric saliency cue in this type of video. For our Top-down contribution, we present a probabilistic visual attention model for manipulated object recognition in egocentric video content. Although arms often occlude objects and are usually considered a burden for many vision systems, they become an asset in our approach: we extract both global and local features describing their geometric layout and pose, as well as the objects being manipulated. We integrate this information in a probabilistic generative model, provide update equations that automatically compute the model parameters optimizing the likelihood of the data, and design a method to generate maps of visual attention that are later used in an object-recognition framework. This task-driven assessment reveals that the proposed method outperforms the state of the art in object recognition for egocentric video content. [...]
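The non-central geometric bias the abstract describes can be illustrated with a minimal sketch: instead of a Gaussian prior fixed at the image center, a 2D Gaussian is fitted to recorded eye fixations, so its peak follows the data. The function name and all values below are illustrative, not the thesis's actual model.

```python
import numpy as np

def fit_geometric_bias(fixations, height, width):
    """Fit a 2D Gaussian prior to recorded eye fixations.

    Unlike a classical center-bias prior, the mean and covariance are
    estimated from the data, so the peak may lie off-center -- the
    situation observed in egocentric video.
    """
    fixations = np.asarray(fixations, dtype=float)  # rows of (row, col)
    mean = fixations.mean(axis=0)
    cov = np.cov(fixations, rowvar=False) + 1e-6 * np.eye(2)
    inv_cov = np.linalg.inv(cov)

    rows, cols = np.mgrid[0:height, 0:width]
    d = np.stack([rows - mean[0], cols - mean[1]], axis=-1)
    # Mahalanobis distance of every pixel to the fitted mean
    m = np.einsum('...i,ij,...j->...', d, inv_cov, d)
    bias = np.exp(-0.5 * m)
    return bias / bias.max()  # normalized saliency prior in [0, 1]
```

With fixations clustered away from the image center, the resulting prior peaks off-center, which is exactly what a fixed central-bias model cannot capture.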
302

Espermina reverte o dano de memória induzido por lipopolissacarídeo em camundongos / Spermine reverses lipopolysaccharide-induced memory deficit in mice

Frühauf, Pâmella Karina Santana 21 August 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / Neuroinflammation is a neuropathological finding in a number of neurodegenerative diseases. Intraperitoneal injection of lipopolysaccharide (LPS) induces neuroinflammation and memory deficit. Spermine and spermidine are endogenous polyamines that physiologically modulate the N-methyl-D-aspartate (NMDA) receptor in mammals by binding to the polyamine-binding site of the receptor. Since polyamines improve memory in cognitive tasks, we tested whether post-training administration of spermine reverses the memory deficits induced by LPS in the object recognition task in mice. While spermine (1 mg/kg, i.p.) increased the discrimination score in the novel object recognition task, ifenprodil (10 mg/kg, i.p.), a noncompetitive antagonist of GluN2B-containing NMDA receptors, decreased it. Spermine, at a dose that did not alter memory (0.3 mg/kg, i.p.), reversed the cognitive impairment induced by LPS (250 μg/kg, i.p.). Ifenprodil (0.3 mg/kg, i.p.) reversed the protective effect of spermine against LPS-induced memory deficits in the novel object recognition task. However, spermine failed to reverse the LPS-induced increase of cortical and hippocampal cytokine levels. The results indicate that spermine protects against LPS-induced memory deficits in mice by mechanisms other than decreasing LPS-induced cytokine production; the protection involves the polyamine-binding site at the NMDA receptor rather than anti-inflammatory mechanisms.
303

Sistema de visión computacional estereoscópico aplicado a un robot cilíndrico accionado neumáticamente / Stereoscopic computer vision system applied to a pneumatically actuated cylindrical robot

Ramirez Montecinos, Daniela Elisa January 2017
In the industrial area, robots are an important part of the technological resources available to perform manipulation tasks in manufacturing, assembly, the transportation of dangerous waste, and a variety of other applications. Specialized computer vision systems have entered the market to solve problems that other technologies have been unable to address. This document analyzes a stereo vision system that is used to provide the center of mass of an object in three dimensions. This kind of application is built with two or more cameras aligned along the same axis, which makes it possible to measure the depth of a point in space. The stereoscopic system described measures the position of an object by combining 2D recognition, in which the coordinates of the center of mass are computed using image moments, with the disparity found by comparing two images, one from the left camera and one from the right. This turns the system into a 3D viewfinder of reality, emulating the human eyes, which are capable of distinguishing depth with good precision. The proposed stereo vision system is integrated into a 5-degree-of-freedom pneumatic robot, which can be programmed using the GRAFCET method by means of commercial software. The cameras are mounted in the lateral plane of the robot to ensure that all the pieces in the robot's work area can be observed. For the implementation, an algorithm for recognition and position measurement is developed using open-source tools in C++. This ensures that the system can remain as open as possible once it is integrated with the robot. The work is validated by taking samples of the objects to be manipulated and generating robot trajectories to see whether the object can be manipulated by the robot's end effector. The results show that it is possible to manipulate pieces in a visually crowded space with acceptable precision. However, the precision reached does not allow the robot to perform tasks that require higher accuracy, such as the assembly of small parts in manufacturing or welding applications.
304

Uma proposta de estruturação e integração de processamento de cores em sistemas artificiais de visão. / A proposal for structuration and integration of color processing in artificial vision systems.

Jander Moreira 05 July 1999
This thesis describes an approach to color information processing in the biologically inspired artificial vision system named Cyvis-1. Considering that most of the current literature on image segmentation deals with gray-level images, color information remains an incipient area, which has motivated this research. This work defines the color subsystem within the underlying philosophy of Cyvis-1, whose main principles include hierarchy, modularity, processing specialization, multilevel integration, effective representation of visual information, and high-level knowledge integration. The color subsystem is then introduced according to this framework, with a proposal of a segmentation technique based on self-organizing maps. The number of regions in the image is determined through an unsupervised clustering approach, so no human interaction is needed. This segmentation technique produces a region-oriented representation of the classes, which is used to derive an edge map. Another main topic in this work is a comparative study of the edge maps produced by several edge-oriented segmentation techniques. A reference edge map is used as the standard segmentation against which the other edge maps are compared. The analysis is carried out by means of local attributes (local gray-level and color contrasts). As a consequence of the comparison, a combination edge map is also proposed, based on the conditional selection of techniques according to the local attributes. Finally, the integration of the two topics above is proposed, characterized by the design of the color subsystem of Cyvis-1 together with the modules for image acquisition, shape analysis and polyhedral object recognition. In this context, integration with the stereo subsystem is accomplished, allowing the computation of the three-dimensional data needed for object recognition. Assessment and validation of the three proposals were carried out, providing the means for analyzing their efficiency and limitations.
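The kind of self-organizing-map pixel clustering described above can be sketched with a toy 1-D SOM over RGB values; the node count, learning rate and decay schedule below are invented for illustration and are not the thesis's parameters.

```python
import numpy as np

def som_color_clusters(pixels, n_nodes=8, epochs=5, lr=0.5, seed=0):
    """Cluster RGB pixels with a tiny 1-D self-organizing map.

    Each node's weight vector drifts toward a dominant color; labeling
    every pixel with its best-matching node yields a class map from
    which region and edge maps can then be derived.
    """
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 1.0, size=(n_nodes, 3))      # node weight vectors
    for epoch in range(epochs):
        # neighborhood radius shrinks over the epochs
        sigma = (n_nodes / 2) * 0.1 ** (epoch / max(epochs - 1, 1))
        for p in rng.permutation(len(pixels)):
            x = pixels[p]
            bmu = np.argmin(((w - x) ** 2).sum(axis=1))  # best matching unit
            h = np.exp(-((np.arange(n_nodes) - bmu) ** 2) / (2 * sigma ** 2))
            w += lr * h[:, None] * (x - w)               # pull BMU neighborhood
    labels = np.argmin(((pixels[:, None, :] - w[None]) ** 2).sum(-1), axis=1)
    return w, labels
```

Counting the distinct winning nodes after training gives an unsupervised estimate of the number of color classes, in the spirit of the proposal.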
305

Efeito da condição claro/escuro e da intensidade luminosa na aprendizagem e memória de trabalho de camundongos Swiss / Effect of the light/dark condition and of light intensity on learning and working memory of Swiss mice

Ramos, Shayenne Elizianne 28 February 2011
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / In an organism, the circadian system modulates many physiological and behavioral processes, such as learning and memory. Additionally, melatonin has also been shown to modulate these cognitive processes. However, different lighting programs are used with production animals, and melatonin has been used as a food supplement and in the treatment of circadian rhythm disturbances. Therefore, more studies are needed on the effects caused by manipulations of the circadian rhythm and by melatonin. Because both are directly related to lighting, this study aimed to investigate the effect of different light/dark conditions in the stock room and of light intensity during behavioral tests on the learning and working memory of Swiss mice. In the stock room, Swiss mice were kept on a 12:12 light/dark cycle (C), in constant light (L) or in constant darkness (E). They were tested in the Lashley III maze and in the object recognition task under 500 or 0 lx illumination, resulting in six treatments (C500, C0, L500, L0, E500 and E0). There was no significant difference between the 12:12 light/dark cycle, constant light and constant darkness conditions in the stock room, nor between the tests conducted at 500 and 0 lx. Only animals kept in constant darkness and tested at 0 lx (treatment E0) had impaired learning and working memory, demonstrated by slower learning in the Lashley III maze and non-recognition of the object in the object recognition task.
306

Real-time Object Recognition on a GPU

Pettersson, Johan January 2007
Shape-based matching (SBM) is a known method for 2D object recognition that is rather robust against illumination variations, noise, clutter and partial occlusion. The objects to be recognized can be translated, rotated and scaled. The translation of an object is determined by evaluating a similarity measure for all possible positions (similar to cross-correlation). The similarity measure is based on dot products between normalized gradient directions in edges. Rotation and scale are determined by evaluating all possible combinations, spanning a huge search space. A resolution pyramid is used to form a heuristic for the search, which then attains real-time performance. In standard SBM, a model consisting of normalized edge gradient directions is constructed for every combination of rotation and scale. We avoid this by using (bilinear) interpolation in the search gradient map, which greatly reduces the amount of storage required. SBM is highly parallelizable by nature, and with our suggested improvements it becomes well suited for running on a GPU. This has been implemented and tested, and the results clearly outperform those of our reference CPU implementation (by factors in the hundreds). It is also very scalable and readily benefits from future devices. Extensive evaluation material and tools for evaluating object recognition algorithms have been developed, and the implementation is evaluated and compared to two commercial 2D object recognition solutions. The results show that the method is very powerful when dealing with the distortions listed above and competes well with the commercial alternatives.
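The similarity measure described above, dot products of normalized gradient directions accumulated over the model's edge points, can be sketched in a few lines. The arrays and function name are illustrative, and the pyramid and interpolation speedups are omitted.

```python
import numpy as np

def sbm_similarity(model_dirs, search_dirs, offsets, position):
    """Shape-based matching score at one search position.

    model_dirs:  (k, 2) unit gradient direction vectors of the model edges
    offsets:     (k, 2) integer pixel offsets of those edges from the origin
    search_dirs: (H, W, 2) unit gradient directions of the search image
    The score is the mean dot product of corresponding unit vectors: 1.0
    means a perfect match, and the measure is robust to illumination
    changes because only directions (not magnitudes) are compared.
    """
    r, c = position
    pts = offsets + np.array([r, c])
    dots = (model_dirs * search_dirs[pts[:, 0], pts[:, 1]]).sum(axis=1)
    return dots.mean()
```

In full SBM this score is evaluated over all candidate positions (and rotations/scales), with the resolution pyramid pruning most of the search space.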
307

Computational Models of Perceptual Space : From Simple Features to Complex Shapes

Pramod, R T January 2014
Dissimilarity plays a very important role in object recognition, but finding the perceptual dissimilarity between objects is non-trivial, since it is not equivalent to pixel dissimilarity (for example, two white-noise images appear very similar even though they have different intensity values at every corresponding pixel). Visual search, however, allows us to reliably measure the perceptual dissimilarity between a pair of objects: when the target object is dissimilar to the distracters, visual search is easy, and it is difficult otherwise. Even though we can measure perceptual dissimilarity between objects, we still do not understand either the underlying mechanisms or the visual features involved in the computation of dissimilarities. In this thesis, I have explored perceptual dissimilarity in two studies: by looking at known simple features and understanding how they combine, and by using computational models to understand or discover complex features. In the first study, we looked at how the dissimilarity between two simple objects with known features can be predicted from the dissimilarities between individual features. Specifically, we investigated how search for targets differing in multiple features (intensity, length, orientation) from the distracters is related to searches for targets differing in each of the individual features. We found that multiple-feature dissimilarities could be predicted as a linear combination of individual-feature dissimilarities. We also demonstrated for the first time that the aspect ratio of an object emerges as a novel feature in visual search. This work has been published in the Journal of Vision (Pramod & Arun, 2014). Having established in the first study that simple features combine linearly, we devised a second study to investigate dissimilarities between complex shapes. Since shape is one of the salient and complex features in object representation, we chose silhouettes of animals and abstract objects to explore the nature of dissimilarity computations. We conducted visual search on humans using pairs of these silhouettes to obtain an estimate of perceptual dissimilarity. We then used various computational models of shape representation (such as Fourier descriptors, curvature scale space and the HMAX model) to see how well they can predict the observed dissimilarities. Many of these computational models were able to predict the perceptual dissimilarities of a large number of object pairs; however, we also observed many cases where they failed. The manuscript related to this study is under preparation.
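The linear-combination finding can be illustrated numerically: if multi-feature search dissimilarities are a weighted sum of single-feature dissimilarities, the weights are recoverable by ordinary least squares. The numbers below are synthetic, not the study's data.

```python
import numpy as np

# Each row is one search condition; columns hold the single-feature
# dissimilarities (intensity, length, orientation) for that condition.
single = np.array([
    [0.8, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.3],
    [0.8, 0.5, 0.0],
    [0.8, 0.0, 0.3],
    [0.0, 0.5, 0.3],
])
true_w = np.array([1.0, 0.7, 1.2])   # hypothetical per-feature weights
observed = single @ true_w           # noiseless multi-feature dissimilarities

# Fit the weights from the observed dissimilarities by least squares
w, *_ = np.linalg.lstsq(single, observed, rcond=None)
predicted = single @ w
```

With real (noisy) search data the fit is no longer exact, and the quality of `predicted` against `observed` is what tests the linearity hypothesis.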
308

Simultaneous real-time object recognition and pose estimation for artificial systems operating in dynamic environments

Van Wyk, Frans Pieter January 2013
Recent advances in technology have increased awareness of the necessity for automated systems in people's everyday lives. Artificial systems are more frequently being introduced into environments previously thought too perilous for humans to operate in. Some robots can be used to extract potentially hazardous materials from sites inaccessible to humans, while others are being developed to aid humans with laborious tasks. A crucial aspect of all artificial systems is the manner in which they interact with their immediate surroundings. Developing such a deceptively simple aspect has proven to be significantly challenging, as it entails not only the methods through which the system perceives its environment, but also its ability to perform critical tasks. These undertakings often involve the coordination of numerous subsystems, each performing its own complex duty. To complicate matters further, it is becoming increasingly important for these artificial systems to perform their tasks in real time. The task of object recognition is typically described as the process of retrieving the object in a database that is most similar to an unknown, or query, object. Pose estimation, on the other hand, involves estimating the position and orientation of an object in three-dimensional space, as seen from an observer's viewpoint. These two tasks are regarded as vital to many computer vision techniques and regularly serve as input to more complex perception algorithms. An approach is presented which regards the object recognition and pose estimation procedures as mutually dependent. The core idea is that dissimilar objects might appear similar when observed from certain viewpoints. A feature-based conceptualisation, which makes use of a database, is implemented and used to perform simultaneous object recognition and pose estimation. The design incorporates data compression techniques, originally suggested by the image-processing community, to facilitate fast processing of large databases. System performance is quantified primarily on object recognition, pose estimation and execution-time characteristics. These aspects are investigated under ideal conditions by exploiting three-dimensional models of the relevant objects. The performance of the system is also analysed for practical scenarios by acquiring input data from a structured-light implementation, which resembles the data obtained from many commercial range scanners. Practical experiments indicate that the system was capable of performing simultaneous object recognition and pose estimation in approximately 230 ms once a novel object had been sensed. An average object recognition accuracy of approximately 73% was achieved. The pose estimation results were reasonable but prompted further research. The results are comparable to what has been achieved using other suggested approaches such as Viewpoint Feature Histograms and Spin Images. / Dissertation (MEng)--University of Pretoria, 2013. / Electrical, Electronic and Computer Engineering
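The joint recognition/pose idea can be sketched under stated assumptions: a database stores one compressed feature vector per (object, viewpoint) pair, so a nearest-neighbour query returns an object identity and a pose hypothesis simultaneously, with PCA standing in for the compression step. All names and data here are illustrative, not the dissertation's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a toy database: 3 objects x 4 poses, each with a 64-D descriptor.
entries = []                      # (object_id, pose_deg) per database row
features = []
for obj in range(3):
    proto = rng.normal(size=64)   # stand-in for a per-object view descriptor
    for pose in (0, 90, 180, 270):
        entries.append((obj, pose))
        features.append(proto + 0.1 * pose / 270 * rng.normal(size=64))
F = np.array(features)

# Compression: keep the top principal components of the database.
mean = F.mean(axis=0)
U, S, Vt = np.linalg.svd(F - mean, full_matrices=False)
basis = Vt[:8]                                 # 64-D -> 8-D projection
coded = (F - mean) @ basis.T

def recognise(query):
    """Return (object id, pose) of the nearest compressed database entry."""
    q = (query - mean) @ basis.T               # compress the query too
    idx = np.argmin(((coded - q) ** 2).sum(axis=1))
    return entries[idx]

obj_id, pose = recognise(F[5] + 0.01 * rng.normal(size=64))
```

Because each database row couples an identity with a viewpoint, the single lookup yields both answers at once; with these toy descriptors the identity is reliable while pose disambiguation would need richer view-dependent features.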
309

Learning objects model and context for recognition and localisation / Apprentissage de modèles et contextes d'objets pour la reconnaissance et la localisation

Manfredi, Guido 18 September 2015
This thesis addresses the problems of modelling, recognition, localisation and use of context for object manipulation by a robot. The modelling process breaks down into four components: the real system, the sensor data, the properties to reproduce and the model. By specifying each of these components, it is possible to define a modelling process adapted to the problem at hand, object manipulation by a robot. This analysis leads to the adoption of local texture descriptors for modelling. Modelling based on local texture descriptors has been addressed in many works on Structure from Motion (SfM) and Simultaneous Localisation and Mapping (SLAM). Existing methods include Bundler, RoboEarth and 123DCatch. Yet none of these methods has gained widespread adoption. Indeed, implementing a similar approach shows that these tools are hard to use even for expert users and that they produce highly complex models. This complexity serves to make a model robust to viewpoint changes. There are two ways for a model to be robust: the multiple-views paradigm and the strong-descriptors paradigm. In the multiple-views paradigm, the model is built from a large number of viewpoints of the object. The strong-descriptors paradigm relies on descriptors that resist viewpoint changes. The experiments carried out show that strong descriptors allow a small number of views to be used, which results in a simple model. Such simple models do not cover every existing viewpoint, but the blind spots can be compensated by the fact that the robot is mobile and can adopt several viewpoints. 
Building on simple models, it is possible to define modelling methods based on images alone, which can be retrieved from the Internet. As an illustration, starting from a product name, images can be retrieved fully automatically from online stores, and the desired objects can then be modelled and localised. Even with simpler modelling, real-world cases where many objects must be taken into account raise problems of storing and processing such a mass of data. These break down into a complexity problem, since many models must be processed quickly, and an ambiguity problem, since models may resemble one another. The impact of both problems can be reduced by using contextual information. Context is any information that does not come from the object itself and that helps recognition. Two types of context are addressed here: the place and the surrounding objects. Some objects are found in particular places. Knowing these place/object links makes it possible to shorten the list of candidate objects that may appear in a given place. Moreover, the place/object link can be learned automatically by a robot that models and then explores an environment. The learned information can then be fused with the current visual information to improve recognition. As for surrounding objects, an object often appears alongside other objects, for example a mouse and a keyboard. Knowing how frequently an object appears with other objects makes it possible to shorten the candidate list during recognition. A Markov Logic Network is particularly well suited to fusing this kind of data. This thesis demonstrates the synergy of robotics and context for object modelling, recognition and localisation. 
/ This Thesis addresses the modeling, recognition, localization and use of context for object manipulation by a robot. We start by presenting the modeling process and its components: the real system, the sensors' data, the properties to reproduce and the model. We show how, by specifying each of them, one can define a modeling process adapted to the problem at hand, namely object manipulation by a robot. This analysis leads us to the adoption of local textured descriptors for object modeling. Modeling with local textured descriptors is not a new concept; it is the subject of many Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM) works. Existing methods include Bundler, the RoboEarth modeler and 123DCatch. Still, no method has gained widespread adoption. By implementing a similar approach, we show that they are hard to use even for expert users and produce highly complex models. Such complexity is necessary to guarantee the robustness of the model to viewpoint changes. There are two ways to handle the problem: the multiple-views paradigm and the robust-features paradigm. The multiple-views paradigm advocates using a large number of views of the object. The robust-features paradigm relies on features able to resist large viewpoint changes. We present a set of experiments to provide an insight into the right balance between both. By varying the number of views and using different features, we show that small and fast models can provide robustness to viewpoint changes up to bounded blind spots, which can be handled by robotic means. We propose four different methods to build simple models from images only, with as little a priori information as possible. The first one applies to planar or piecewise-planar objects and relies on homographies for localization. The second approach is applicable to objects with simple geometry, such as cylinders or spheres, but requires many measurements of the object. 
The third method requires a calibrated 3D sensor but no additional information. The fourth technique needs no a priori information at all. We apply this last method to autonomous modeling of grocery objects. From images automatically retrieved from a grocery store website, we build a model which allows recognition and localization for tracking. Even with light models, real situations call for numerous object models to be stored and processed. This raises the problems of complexity, processing multiple models quickly, and ambiguity, distinguishing similar objects. We propose to solve both problems by using contextual information. Contextual information is any information helping the recognition which is not directly provided by sensors. We focus on two contextual cues: the place and the surrounding objects. Some objects are mainly found in particular places. By knowing the current place, one can restrict the number of possible identities for a given object. We propose a method to autonomously explore a previously labeled environment and establish a correspondence between objects and places. This information can then be used in a cascade combining simple visual descriptors and context. This experiment shows that, for some objects, recognition can be achieved with as few as two simple features and the location as context. The objects surrounding a given object can also be used as context. Objects like a keyboard, a mouse and a monitor are often close together. We use qualitative spatial descriptors to describe the position of objects with respect to their neighbors. Using a Markov Logic Network, we learn patterns in object layout. This information can then be used to recognize an object when the surrounding objects are already identified. This Thesis stresses the good match between robotics, context and object recognition.
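The place-as-context idea above, restricting the candidate identities of an object from learned place/object co-occurrences before fusing with visual evidence, can be sketched with a simple weighted-prior scheme. This is an illustrative toy, not the thesis's cascade or its Markov Logic Network; every object name, place and probability below is invented for the example.

```python
# Illustrative sketch: a learned place -> object co-occurrence table
# (assumed values) is fused with ambiguous visual similarity scores to
# restrict and rank candidate object identities.

PLACE_PRIORS = {
    "kitchen": {"mug": 0.5, "kettle": 0.3, "mouse": 0.05},
    "office":  {"mug": 0.2, "keyboard": 0.4, "mouse": 0.4},
}

def fuse(place, visual_scores):
    """Weight visual similarity scores by the place prior, then renormalise."""
    prior = PLACE_PRIORS[place]
    fused = {obj: score * prior.get(obj, 0.01)  # small floor for unseen pairs
             for obj, score in visual_scores.items()}
    total = sum(fused.values())
    return {obj: v / total for obj, v in fused.items()}

# Visually ambiguous evidence: the detector cannot tell mug from mouse.
scores = {"mug": 0.5, "mouse": 0.5}
for place in ("office", "kitchen"):
    fused = fuse(place, scores)
    print(place, max(fused, key=fused.get))
# office favours "mouse", kitchen favours "mug"
```

The same ambiguous detection is resolved differently depending on where the robot currently is, which is exactly the candidate-restriction effect the abstract describes.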
310

Detekce objektu ve videosekvencích / Object Detection in Video Sequences

Šebela, Miroslav January 2010 (has links)
The thesis consists of three parts: a theoretical description of digital image processing, optical character recognition, and the design of a system for car licence plate recognition (LPR) in an image or video sequence. The theoretical part describes image representation, smoothing and methods used for blob segmentation, and proposes two methods for optical character recognition (OCR). The practical part finds a solution and designs a procedure for an LPR system including OCR. The design contains image pre-processing, blob segmentation, object detection based on object properties, and OCR. The proposed solution uses grayscale transformation, histogram processing, thresholding, connected components, and region recognition based on region pattern and properties. Also implemented is an optical licence plate recognition method in which the acquired values are compared with a database used to manage the entry of vehicles into premises.
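The blob-segmentation step described above, going from a thresholded binary image to separate candidate character regions, can be sketched as a plain 4-connected component labelling pass. This is a minimal illustration, not the thesis implementation; the toy binary image is invented.

```python
# Minimal sketch of connected-component labelling (4-connectivity) on a
# binary image, the step that separates thresholded pixels into blobs
# such as individual licence-plate characters.
from collections import deque

def connected_components(img):
    """Label 4-connected foreground blobs; return (count, label image)."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1 and labels[y][x] == 0:
                count += 1                      # start a new blob
                labels[y][x] = count
                q = deque([(y, x)])
                while q:                        # flood-fill the blob
                    cy, cx = q.popleft()
                    for ny, nx in ((cy-1, cx), (cy+1, cx),
                                   (cy, cx-1), (cy, cx+1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            q.append((ny, nx))
    return count, labels

# Toy thresholded image with two separate blobs (candidate characters).
image = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
n, _ = connected_components(image)
print(n)  # 2
```

Each labelled blob would then be filtered by its properties (size, aspect ratio) and passed to the OCR stage, as the abstract outlines.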
