  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Computational neuroscience of natural scene processing in the ventral visual pathway

Tromans, James Matthew January 2012 (has links)
Neural responses in the primate ventral visual system become more complex in the later stages of the pathway. For example, not only do neurons in IT cortex respond to complete objects, they also learn to respond invariantly with respect to the viewing angle and the location of an object. These types of neural responses have helped guide past research with VisNet, a computational model of the primate ventral visual pathway that self-organises during learning. In particular, previous research has focussed on presenting one object at a time to the model during training, and has placed emphasis on the transform-invariant response properties that the output neurons of the model consequently develop. This doctoral thesis extends previous VisNet research and investigates the performance of the model under a range of more challenging and ecologically valid training paradigms, for example when multiple objects are presented to the network during training, or when objects partially occlude one another. The different mechanisms that help output neurons develop object-selective, transform-invariant responses during learning are proposed and explored. Such mechanisms include the statistical decoupling of objects through multiple object pairings, and the separation of object representations by independent motion. Consideration is also given to the heterogeneous response properties of neurons that develop during learning: for example, although IT neurons demonstrate a number of differing invariances, they also convey spatial information and view-specific information about the objects presented on the retina. An updated, scaled-up version of the VisNet model, with a significantly larger retina, is introduced in order to explore these heterogeneous neural response properties.
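VisNet-style models are commonly trained with a trace learning rule, in which the Hebbian update is driven by a temporal trace of recent postsynaptic activity, so that successive transforms of the same object become bound to the same output neurons. A minimal sketch of one such update, assuming a rate-coded neuron and illustrative constants (the exact rule and parameters used in the thesis may differ):

```python
import numpy as np

def trace_update(w, x, y_trace_prev, y, eta=0.1, trace=0.8):
    """One trace-rule step: the weight change is driven by a running
    trace of recent postsynaptic activity rather than the current
    response alone, binding successive views of the same object."""
    y_trace = trace * y_trace_prev + (1.0 - trace) * y  # temporal trace of activity
    w = w + eta * y_trace * x                           # Hebbian update gated by the trace
    w = w / np.linalg.norm(w)                           # normalise to keep weights bounded
    return w, y_trace
```

Presenting successive transforms of one object in sequence keeps the trace high, so all of them strengthen the same synapses.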
132

Zero-shot Learning for Visual Recognition Problems

Naha, Shujon January 2016 (has links)
In this thesis we discuss different aspects of zero-shot learning and propose solutions for three challenging visual recognition problems: 1) unknown object recognition from images, 2) novel action recognition from videos, and 3) unseen object segmentation. In all three problems we have two different sets of classes: the “known classes”, which are used in the training phase, and the “unknown classes”, for which there are no training instances. Our proposed approach exploits the available semantic relationships between known and unknown object classes and uses them to transfer appearance models from known object classes to unknown object classes in order to recognize unknown objects. We also propose an approach to recognize novel actions from videos by learning a joint model that links videos and text. Finally, we present a ranking-based approach for zero-shot object segmentation. We represent each unknown object class as a semantic ranking of all the known classes and use this semantic relationship to extend the segmentation model of known classes to segment unknown-class objects. / October 2016
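The semantic-transfer idea can be illustrated in miniature: an unseen class is expressed through its semantic similarity to the known classes, and known-class classifier scores are combined accordingly. Everything below — embeddings, class names and scores — is hypothetical; the thesis's actual features and models are richer:

```python
import numpy as np

# Hypothetical semantic embeddings for known and unseen classes.
known = {"cat": np.array([0.9, 0.1]), "car": np.array([0.1, 0.9])}
unseen = {"tiger": np.array([0.8, 0.2])}  # semantically close to "cat"

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def zero_shot_score(unseen_emb, known_scores):
    """Score an unseen class by weighting known-class classifier
    outputs with the semantic similarity between class embeddings."""
    weights = {k: cosine(unseen_emb, v) for k, v in known.items()}
    return sum(weights[k] * known_scores[k] for k in known)

# Hypothetical known-class classifier outputs for one test image.
scores = {"cat": 0.9, "car": 0.05}
print(zero_shot_score(unseen["tiger"], scores))  # high score: tiger is likely present
```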
133

The anatomical and functional correlates of category-specificity

Thomas, R. M. January 2004 (has links)
The dramatic effects of brain damage can provide some of the most interesting insights into the nature of normal cognitive performance. In recent years a number of neuropsychological studies have reported a particular form of cognitive impairment in which patients have problems recognising objects from one category but remain able to recognise those from others. The most frequent ‘category-specific’ pattern is an impairment in identifying living things compared to nonliving things. The reverse pattern of dissociation, i.e., an impairment in recognising and naming nonliving things relative to living things, has been reported, albeit much less frequently. The objective of the work carried out in this thesis was to investigate the organising principles and anatomical correlates of stored knowledge for categories of living and nonliving things. Three complementary cognitive neuropsychological research techniques were employed to assess how, and where, this knowledge is represented in the brain: (i) studies of normal (neurologically intact) subjects, (ii) case-studies of neurologically impaired patients with selective deficits in object recognition, and (iii) studies of the anatomical correlates of stored knowledge for living and nonliving things in the brain using magnetoencephalography (MEG). The main empirical findings showed that semantic knowledge about living and nonliving things is principally encoded in terms of sensory and functional features, respectively. Two case-study chapters present evidence supporting the view that category-specific impairments can arise from damage to a pre-semantic system, rather than, as is often assumed, a semantic one. In the MEG study, rather than finding evidence for the involvement of specific brain areas for different object categories, it appeared that a non-differentiated neural system was involved when subjects named and categorised living and nonliving things.
134

Depth-adaptive methodologies for 3D image categorization

Kounalakis, Tsampikos January 2015 (has links)
Image classification is an active topic of computer vision research. This topic deals with the learning of patterns in order to allow efficient classification of visual information. However, most research efforts have focused on 2D image classification. In recent years, advances in 3D imaging have enabled the development of new applications and provided new research directions. In this thesis, we present methodologies and techniques for image classification using 3D image data. We conducted our research focusing on the attributes and limitations of depth information and its possible uses. This research led us to the development of depth feature extraction methodologies that contribute to the representation of images, thus enhancing recognition efficiency. We propose a new classification algorithm that adapts to the needs of image representations by implementing a scale-based decision that exploits discriminative parts of the representations. Learning from the design of existing image representation methods, we introduce our own, which describes each image by its depicted content, providing a more discriminative image representation. We also propose a dictionary learning method that exploits the relations among training features by assessing the similarity of features originating from similar context regions. Finally, we present our research on deep learning algorithms combined with data and techniques used in 3D imaging. Our novel methods provide state-of-the-art results, thus contributing to the research of 3D image classification.
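As a toy illustration of turning depth data into a fixed-length representation, a normalised depth histogram can serve as a baseline depth descriptor (the thesis's depth features are more elaborate; the bin count and depth range below are illustrative):

```python
import numpy as np

def depth_histogram(depth_map, bins=8, max_depth=4.0):
    """A simple depth descriptor: a normalised histogram of depth
    values (in metres), clipped to [0, max_depth]. Illustrative only;
    real depth features encode structure, not just value frequency."""
    h, _ = np.histogram(np.clip(depth_map, 0, max_depth),
                        bins=bins, range=(0, max_depth))
    return h / h.sum()  # normalise so descriptors are comparable across image sizes
```

A classifier can then be trained on such fixed-length vectors regardless of the original depth-map resolution.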
135

Color-based classification of circular markers for the identification of experimental units

Narjala, Lakshmi January 1900 (has links)
Master of Science / Department of Computing and Information Sciences / Daniel Andresen / The purpose of this project is to analyze the growth of plants under certain lighting conditions. In order to ensure ideal lighting for all plants under demanding conditions, such as lack of optimal light due to shadowing, side-wall reflections, or overlapping of plants, pots are rotated manually in an irregular fashion. To keep track of the position of these plants over time, a marking system is used for each tray of 16 plants; these markers are unique for each tray. High-definition surveillance cameras placed above the plants capture plant images periodically. These images then undergo image processing, which identifies and recognizes the plants from the identification markers placed within each tray and thereby derives statistics about the growth of the plants. Hence the computing part of this project is about extracting the identity of a plant through image processing, which involves object and color recognition. Fiji, an image processing tool, is used for object recognition, and the Python image module called “Image” is used for color recognition. Object recognition accurately locates the position of the circular objects and measures their size and shape. Color recognition identifies the pixel values of these circular objects. Finally, the code corresponding to three-element groups of these circular units is fetched and stored. This code gives the identity of the tray and, therefore, of each plant. The timestamp stored with each plant image, along with the code fetched through image processing, is used to track the location of a plant in the plant chamber through time.
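The colour-recognition step described above can be sketched in plain Python: average the pixel values inside an already-located circular marker and match the mean against a reference palette. The palette and helper names here are hypothetical, not the project's actual code:

```python
def mean_color_in_circle(pixels, cx, cy, r):
    """Average RGB over pixels inside a circle of radius r centred at
    (cx, cy). `pixels` is a row-major list of rows of (R, G, B) tuples."""
    acc, n = [0, 0, 0], 0
    for y, row in enumerate(pixels):
        for x, (pr, pg, pb) in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                acc[0] += pr; acc[1] += pg; acc[2] += pb
                n += 1
    return tuple(c / n for c in acc)

# Hypothetical reference palette for the marker colours.
REFERENCE = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}

def classify_marker(pixels, cx, cy, r):
    """Name the reference colour closest (squared distance) to the mean."""
    mean = mean_color_in_circle(pixels, cx, cy, r)
    return min(REFERENCE,
               key=lambda name: sum((m - c) ** 2
                                    for m, c in zip(mean, REFERENCE[name])))
```

Three such classifications per marker group would then yield the tray code mentioned in the abstract.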
136

Androidapplikation för digitalisering av formulär : Minimering av inlärningstid, kostnad och felsannolikhet / Android application for digitizing forms : minimizing learning time, cost and error probability

Fahlén, Erik January 2018 (has links)
This study was performed by creating an Android application that uses custom object recognition to scan and digitalize a series of checkbox forms, for example to correct multiple-choice questions or compile surveys in a spreadsheet. The purpose of this study was to see which dataset and hardware, using the machine learning library TensorFlow, were cheapest, most price-worthy, sufficiently reliable and fastest. A dataset of filled-in example forms with annotated checkboxes was created and used in the learning process. The model used for object recognition was the Single Shot MultiBox Detector, MobileNet version, because it can detect multiple objects in the same image and does not have high hardware requirements, making it suited for phones. The learning process was run in Google Cloud's Machine Learning Engine with different image resolutions and cloud configurations. After the learning process on the cloud, the finished TensorFlow model was converted to a TensorFlow Lite model, which is the form used on phones. The TensorFlow Lite model was used in the compilation of the Android application so that the object recognition could work. The Android application worked and could recognize the inputs in the checkbox form. Different image resolutions and cloud configurations during the learning process gave different results with respect to which was fastest and cheapest. In the end, the conclusion was that Google's hardware configuration STANDARD_1 was 20% faster than BASIC, while BASIC was 91% cheaper and more price-worthy with this dataset. / Denna studie genomfördes genom att skapa en fungerande androidapplikation som använder sig av en anpassad objektigenkänning för att skanna och digitalisera en serie av kryssruteformulär, exempelvis för att rätta flervalsfrågor eller sammanställa enkäter i ett kalkylark.
Syftet med undersökningen var att se vilka datauppsättningar och hårdvara med maskininlärningsbiblioteket TensorFlow som var billigast, mest prisvärd, tillräckligt tillförlitlig och snabbast. En datauppsättning av ifyllda exempelformulär med klassificerade kryssrutor skapades och användes i inlärningsprocessen. Modellen som användes för objektigenkänningen blev Single Shot MultiBox Detector, version MobileNet, för att denna kan känna igen flera objekt i samma bild samt att den inte har lika höga hårdvarukrav, vilket gör den anpassad för mobiltelefoner. Inlärningsprocessen utfördes i Google Clouds Machine Learning Engine med olika bildupplösningar och molnkonfigurationer. Efter inlärningsprocessen på molnet konverterades den färdiga TensorFlow-modellen till en TensorFlow Lite-modell som används i mobiltelefoner. TensorFlow Lite-modellen användes i kompileringen av androidapplikationen för att objektigenkänningen skulle fungera. Androidapplikationen fungerade och kunde känna igen alla inmatningar i kryssruteformuläret. Olika bildupplösningar och molnkonfigurationer under inlärningsprocessen gav olika resultat när det gäller vilken som var snabbast eller billigast. I slutändan drogs slutsatsen att Googles hårdvaruuppsättning STANDARD_1 var 20% snabbare än BASIC, som var 91% billigare och mest prisvärd med denna datauppsättning.
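The concluding cost comparison can be made concrete. Taking only the two reported figures — STANDARD_1 being 20% faster and BASIC 91% cheaper — a speed-per-cost measure shows why BASIC comes out as the more price-worthy option (illustrative arithmetic under one common reading of "20% faster"; no other figures are from the thesis):

```python
# Normalise to BASIC: cost 1.0, time 1.0 (arbitrary units).
basic_cost, basic_time = 1.0, 1.0
std1_cost = basic_cost / (1 - 0.91)   # BASIC is 91% cheaper than STANDARD_1
std1_time = basic_time / 1.2          # STANDARD_1 is 20% faster (1.2x throughput)

def value(cost, time):
    """Throughput per unit cost: higher means more price-worthy."""
    return (1 / time) / cost

print(value(basic_cost, basic_time))  # → 1.0 (BASIC)
print(value(std1_cost, std1_time))    # ≈ 0.108 (STANDARD_1)
```

Under these assumptions BASIC delivers roughly nine times more training throughput per unit of money, matching the thesis's conclusion.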
137

Feature extraction from 3D point clouds / Extração de atributos robustos a partir de nuvens de pontos 3D

Przewodowski Filho, Carlos André Braile 13 March 2018 (has links)
Computer vision is a research field in which images are the main object of study. One of its categories of problems is shape description. Object classification is an important example of an application using shape descriptors. Classically, these processes were performed on 2D images. With the large-scale development of new technologies and the affordable price of equipment that generates 3D images, computer vision has adapted to this new scenario, extending the classic 2D methods to 3D. It is important to highlight, however, that 2D methods are mostly dependent on variations of illumination and color, while 3D sensors provide depth, structure/3D shape and topological information beyond color. Thus, different shape descriptors and robust attribute extraction methods were studied, from which new attribute extraction methods based on 3D data have been proposed and described. The results obtained on well-known public datasets demonstrate their efficiency and that they compete with other state-of-the-art methods in this area: the RPHSD (a method proposed in this dissertation) achieved 85.4% accuracy on the University of Washington RGB-D dataset, the second-best accuracy on this dataset; the COMSD (another proposed method) achieved 82.3% accuracy, ranking seventh; and the CNSD (another proposed method) ranked ninth. The RPHSD and COMSD methods also have relatively small processing complexity, so they achieve high accuracy with low computing time. / Visão computacional é uma área de pesquisa em que as imagens são o principal objeto de estudo. Um dos problemas abordados é o da descrição de formatos (em inglês, shapes). Classificação de objetos é um importante exemplo de aplicação que usa descritores de shapes. Classicamente, esses processos eram realizados em imagens 2D.
Com o desenvolvimento em larga escala de novas tecnologias e o barateamento dos equipamentos que geram imagens 3D, a visão computacional se adaptou para este novo cenário, expandindo os métodos 2D clássicos para 3D. Entretanto, estes métodos são, majoritariamente, dependentes da variação de iluminação e de cor, enquanto os sensores 3D fornecem informações de profundidade, shape 3D e topologia, além da cor. Assim, foram estudados diferentes métodos de classificação de objetos e extração de atributos robustos, a partir dos quais são propostos e descritos novos métodos de extração de atributos a partir de dados 3D. Os resultados obtidos utilizando bases de dados 3D públicas conhecidas demonstraram a eficiência dos métodos propostos e que os mesmos competem com outros métodos no estado-da-arte: o RPHSD (um dos métodos propostos) atingiu 85,4% de acurácia, sendo a segunda maior acurácia neste banco de dados; o COMSD (outro método proposto) atingiu 82,3% de acurácia, posicionando-se na sétima posição do ranking; e o CNSD (outro método proposto), em nono lugar. Além disso, os métodos RPHSD e COMSD têm uma complexidade de processamento relativamente baixa, atingindo uma alta acurácia com um pequeno tempo de processamento.
138

Avaliação de um método baseado em máquinas de suporte vetorial de múltiplos núcleos e retificação de imagens para classificação de objetos em imagens onidirecionais. / Assessment of a method based on multiple kernel support vector machines and images unwrapping for the classification of objects in omnidirectional images.

Amaral, Fábio Rodrigo 18 October 2010 (has links)
Apesar da popularidade das câmeras onidirecionais aplicadas à robótica móvel e da importância do reconhecimento de objetos no universo mais amplo da robótica e da visão computacional, é difícil encontrar trabalhos que relacionem ambos na literatura especializada. Este trabalho visa avaliar um método para classificação de objetos em imagens onidirecionais, analisando sua eficácia e eficiência para ser aplicado em tarefas de auto-localização e mapeamento de ambientes feitas por robôs móveis. Tal método é construído a partir de um classificador de objetos, implementado através de máquinas de suporte vetorial, estendidas para a utilização de Aprendizagem de Múltiplos Núcleos. Também na construção deste método, uma etapa de retificação é aplicada às imagens onidirecionais, de modo a aproximá-las das imagens convencionais, com as quais o classificador utilizado já demonstrou bons resultados. A abordagem de Múltiplos Núcleos se faz necessária para possibilitar a aplicação de três tipos distintos de detectores de características em imagens, ponderando, para cada classe, a importância de cada uma das características em sua descrição. Resultados experimentais atestam a viabilidade de tal proposta. / Despite the popularity of omnidirectional cameras used in mobile robotics, and the importance of object recognition in the broader universe of robotics and computer vision, it is difficult to find works that relate both in the literature. This work evaluates a method for object classification in omnidirectional images, assessing its effectiveness and efficiency when applied to self-localization and environment-mapping tasks performed by mobile robots. The method is based on a support vector machine object classifier extended with multiple kernel learning. Furthermore, an unwrapping step is applied to the omnidirectional images to make them similar to perspective images, on which the classifier used has already shown good results.
The Multiple Kernels approach is necessary to allow the use of three distinct types of feature detectors in omnidirectional images by considering, for each class, the importance of each feature in the description. Experimental results demonstrate the feasibility of such a proposal.
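The multiple-kernel idea combines one kernel per feature type into a weighted sum, so each class can weight its descriptors differently. A toy sketch with RBF kernels and hand-fixed weights (MKL learns the weights per class; the feature channels below are made up):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gram matrix of the Gaussian RBF kernel between rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def combined_kernel(feats_x, feats_y, betas):
    """Weighted sum of one kernel per feature type (MKL-style combination)."""
    return sum(b * rbf_kernel(Fx, Fy)
               for b, Fx, Fy in zip(betas, feats_x, feats_y))

# Two hypothetical feature channels (e.g. colour and texture descriptors).
rng = np.random.default_rng(0)
X_color, X_tex = rng.random((4, 3)), rng.random((4, 5))
K = combined_kernel([X_color, X_tex], [X_color, X_tex], betas=[0.7, 0.3])
```

The combined `K` can be handed to any kernel SVM solver; because the weights sum to one, it remains a valid positive semi-definite kernel.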
139

Zero-shot visual recognition via latent embedding learning

Wang, Qian January 2018 (has links)
Traditional supervised visual recognition methods require a great number of annotated examples for each class concerned. The collection and annotation of visual data (e.g., images and videos) can be laborious, tedious and time-consuming when the number of classes involved is very large. In addition, there are situations where the test instances are from novel classes for which no training examples are available in the training stage. These issues can be addressed by zero-shot learning (ZSL), an emerging machine learning technique enabling the recognition of novel classes. The key issue in zero-shot visual recognition is the semantic gap between visual and semantic representations. We address this issue in this thesis from three different perspectives: visual representations, semantic representations and the learning models. We first propose a novel bidirectional latent embedding framework for zero-shot visual recognition. By learning a latent space from the visual representations and labelling information of the training examples, instances of different classes can be mapped into the latent space while preserving both visual and semantic relatedness, so that the semantic gap can be bridged. We conduct experiments on both object and human action recognition benchmarks to validate the effectiveness of the proposed ZSL framework. We then extend ZSL to multi-label scenarios for multi-label zero-shot human action recognition based on weakly annotated video data. We employ a long short-term memory (LSTM) neural network to explore the multiple actions underlying the video data. A joint latent space is learned by two component models (i.e. the visual model and the semantic model) to bridge the semantic gap. The two component embedding models are trained alternately to optimise ranking-based objectives. Extensive experiments are carried out on two multi-label human action datasets to evaluate the proposed framework.
Finally, we propose alternative semantic representations for human actions, narrowing the semantic gap from the perspective of semantic representation. A simple yet effective solution based on the exploration of web data is investigated to enhance the semantic representations of human actions. The novel semantic representations are shown to benefit zero-shot human action recognition significantly compared to traditional attributes and word vectors. In summary, we propose novel frameworks for zero-shot visual recognition that narrow and bridge the semantic gap, and achieve state-of-the-art performance in different settings on multiple benchmarks.
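At test time, a latent-embedding ZSL model reduces to nearest-neighbour search in the shared space: project the visual feature and every unseen class's semantic vector, then pick the closest class. A toy sketch with identity projections standing in for the learned mappings (all names and vectors below are illustrative):

```python
import numpy as np

def zsl_predict(visual_feat, class_embs, W_vis, W_sem):
    """Map the visual feature and class embeddings into a shared latent
    space and return the label of the nearest class prototype."""
    z = W_vis @ visual_feat
    best, best_d = None, np.inf
    for label, emb in class_embs.items():
        d = np.linalg.norm(z - W_sem @ emb)  # distance in the latent space
        if d < best_d:
            best, best_d = label, d
    return best

# Hypothetical 2-D latent space; identity projections for illustration only
# (the thesis learns W_vis and W_sem from training data).
W = np.eye(2)
classes = {"jump": np.array([1.0, 0.0]), "run": np.array([0.0, 1.0])}
print(zsl_predict(np.array([0.9, 0.1]), classes, W, W))  # → jump
```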
140

Participação da via NTS-PGi-LC-hipocampo (núcleo do trato solitário-núcleo paragigantocelular-locus coeruleus-hipocampo) na consolidação da memória de reconhecimento de objetos / Involvement of the NTS-PGi-LC-hippocampus pathway (nucleus of the solitary tract–paragigantocellularis nucleus–locus coeruleus–hippocampus) in the consolidation of object recognition memory

Carpes, Pâmela Billig Mello January 2010 (has links)
Existem crescentes evidências sobre a contribuição da liberação de noradrenalina (NA) central na consolidação das memórias. Teoricamente, o Núcleo do Trato Solitário (NTS) recebe informações e diversos estímulos periféricos, que são então projetados ao Núcleo Paragigantocelular (PGi). Este, por sua vez, utiliza neurotransmissores, predominantemente excitatórios, para influenciar a ativação do Locus Coeruleus (LC). Então, o LC envia projeções noradrenérgicas ao hipocampo e à amígdala, influenciando os processos mnemônicos. Aqui nós demonstramos que a inibição pelo muscimol do NTS, PGi ou LC até 3 horas após o treino na tarefa de reconhecimento de objetos (RO) impede a consolidação da memória medida 24 h após o treino. Adicionalmente, a infusão de timolol, um antagonista de receptores β-adrenérgicos, na região CA1 do hipocampo também impede a consolidação deste tipo de memória. A infusão de NA na região CA1 do hipocampo não altera a retenção da memória, mas reverte o prejuízo causado pela inibição do NTS, PGi ou LC. A infusão de NMDA no LC após a inibição do NTS ou PGi também reverte essa amnésia. Concomitantemente, verificamos que a inibição do NTS, PGi ou LC bloqueia o aumento da expressão do fator neurotrófico derivado do cérebro (BDNF, do inglês brain-derived neurotrophic factor) que ocorre na região CA1 do hipocampo 120 min após o treino na tarefa de reconhecimento de objetos. Também a infusão de NA na região CA1 do hipocampo após a inibição do NTS, PGi ou LC, ou de NMDA no LC após a inibição do NTS ou PGi, promove novamente o aumento do BDNF 120 min após o treino no RO. Com isso conclui-se que a ativação da via NTS-PGi-LC-hipocampo é necessária para que ocorra a consolidação da memória de RO, na qual o BDNF hipocampal desempenha um papel. / There is evidence of the contribution of brain noradrenaline (NA) release to memory consolidation.
The Nucleus of the Solitary Tract (NTS) receives information originated by peripheral stimuli and projects to the Paragigantocellularis Nucleus (PGi), which influences the Locus Coeruleus (LC) through excitatory neurotransmitters. The LC sends noradrenergic projections to the hippocampus and amygdala, influencing memory processes. Here we show that inhibition by muscimol of the NTS, PGi or LC up to 3 h after object recognition training impairs the consolidation of the memory measured 24 h later. Additionally, the infusion of timolol, a β-adrenergic receptor antagonist, into the CA1 region of the hippocampus also impairs consolidation of this type of memory. The infusion of NA into the CA1 region of the hippocampus does not alter memory consolidation of this task, but reverts the deleterious effect of NTS, PGi or LC inhibition. The infusion of NMDA in the LC after inhibition of the NTS or PGi also reverts the amnesia. Concomitantly, the inhibition of the NTS, PGi or LC blocks the increase of brain-derived neurotrophic factor (BDNF) expression in CA1 that occurs 120 min after training in the object recognition task. Further, the infusion of NA in CA1 after inhibition of the NTS, PGi or LC, or of NMDA in the LC after inhibition of the NTS or PGi, restores the BDNF increase seen 120 min after object recognition training. Thus, it is concluded that the activation of the NTS-PGi-LC-hippocampus pathway is necessary for consolidation of object recognition memory, and hippocampal BDNF is involved in this process.
