  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
241

The Role of Contextual Associations in the Selection of Objects

Sulman, Noah Patrick 01 January 2011
This paper describes a sequence of experiments addressing basic questions about the control of visual attention and the relationship between attention and object recognition. This work reviews compelling findings addressing attentional control on the basis of high-level perceptual properties. In five experiments observers were presented with a rapid sequence of object photographs and instructed to either detect or selectively encode a verbally cued object category. When these object categories (e.g. "baseball") were preceded by contextual images associated with a given object category (e.g. "baseball diamond"), observers were less likely to accurately report information about the target item. This effect obtained with both detection and discrimination measures. This evidence of attentional capture is particularly strong because associated contexts typically enhance object detection or discrimination, whereas here they harmed performance. These findings demonstrate that observers use relatively abstract and elaborated representations when selecting visual objects on the basis of category. Further, even when observers attempt to ignore depictions of associated contexts these images engage perceptual processing. That is, while participants were able to determine the target of their search categorically, they had relatively little control over the specific types of representations and information employed when performing an object search task. After reviewing these five experiments, conclusions regarding the use of object-context association knowledge in vision are addressed.
242

Stochastic methods in computational stereo

Coffman, Thayne Richard 16 June 2011
Computational stereo estimates 3D structure by analyzing visual changes between two or more passive images of a scene that are captured from different viewpoints. It is a key enabler for ubiquitous autonomous systems, large-scale surveying, virtual reality, and improved techniques for compression, tracking, and object recognition. The fact that computational stereo is an under-constrained inverse problem causes many challenges. Its computational and memory requirements are high. Typical heuristics and assumptions, used to constrain solutions or reduce computation, prevent treatment of key realities such as reflection, translucency, ambient lighting changes, or moving objects in the scene. As a result, a general solution is lacking. Stochastic models are common in computational stereo, but stochastic algorithms are severely under-represented. In this dissertation I present two stochastic algorithms and demonstrate their advantages over deterministic approaches. I first present the Quality-Efficient Stochastic Sampling (QUESS) approach. QUESS reduces the number of match quality function evaluations needed to estimate dense stereo correspondences. This facilitates the use of complex quality metrics or metrics that take unique values at non-integer disparities. QUESS is shown to outperform two competing approaches, and to have more attractive memory and scaling properties than approaches based on exhaustive sampling. I then present a second novel approach based on the Hough transform and extend it with distributed ray tracing (DRT). DRT is a stochastic anti-aliasing technique common to computer rendering but which has not been used in computational stereo. I demonstrate that the DRT-enhanced approach outperforms the unenhanced approach, a competing variation that uses re-accumulation in the Hough domain, and another baseline approach. DRT’s advantages are particularly strong for reduced image resolution and/or reduced accumulator matrix resolution. 
In support of this second approach, I develop two novel variations of the Hough transform that use DRT, and demonstrate that they outperform competing variations on a traditional line segment detection problem. I generalize these two examples to draw broader conclusions, suggest future work, and call for a deeper exploration by the community. Both practical and academic gaps in the state of the art can be reduced by a renewed exploration of stochastic computational stereo techniques.
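The core idea behind stochastic sampling of a match-quality function can be illustrated with a minimal sketch. This is not the author's QUESS algorithm: the quality function, disparity range, and sampling schedule are all invented for the example. The point is that random continuous-valued candidates, followed by a short local refinement, can land on a quality peak at a non-integer disparity that an exhaustive integer-disparity sweep cannot express.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical smooth match-quality function whose peak sits at a
# non-integer disparity; a stand-in for a real photometric metric.
TRUE_D = 7.3
def quality(d):
    return -((d - TRUE_D) ** 2)

# Conventional baseline: exhaustive search over integer disparities.
ints = np.arange(16)
best_int = ints[np.argmax(quality(ints))]

# Stochastic alternative: random continuous-valued candidates plus a
# short local refinement, so the estimate is not locked to the grid.
cands = rng.uniform(0, 15, size=16)
best = cands[np.argmax(quality(cands))]
for _ in range(20):
    trial = np.clip(best + rng.normal(scale=0.5, size=8), 0, 15)
    t = trial[np.argmax(quality(trial))]
    if quality(t) > quality(best):
        best = t

print(best_int)  # 7: the integer grid cannot express the true 7.3
print(best)      # should land close to 7.3
```

With a real stereo pair, `quality` would be an expensive photometric comparison, which is exactly why reducing the number of its evaluations matters.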
243

Visual Perception of Objects and their Parts in Artificial Systems

Schoeler, Markus 12 October 2015
No description available.
244

Delineating the Neural Circuitry Underlying Crossmodal Object Recognition in Rats

Reid, James 15 September 2011
Previous research has indicated that the perirhinal cortex (PRh) and posterior parietal cortex (PPC) functionally interact to mediate crossmodal object representations in rats; however, it remains to be seen whether other cortical regions contribute to this cognitive function. The prefrontal cortex (PFC) has been widely implicated in crossmodal tasks and might underlie either a unified multimodal or amodal representation, or a comparison mechanism that allows object information to be integrated across sensory modalities. The hippocampus (HPC), with its extensive polymodal inputs, is also a strong candidate and has been implicated in some aspects of object recognition. A series of lesion-based experiments assessed the roles of the HPC, the PFC, and PFC subregions [medial prefrontal cortex (mPFC) and orbitofrontal cortex (OFC)], revealing functional dissociations between these brain regions using two versions of crossmodal object recognition: (1) spontaneous crossmodal matching (CMM), which requires rats to compare a stored tactile object representation with visually presented objects to discriminate between novel and familiar stimuli; and (2) crossmodal object association (CMA), in which simultaneous pre-exposure to the tactile and visual elements of an object enhances CMM performance across long retention delays. Notably, while inclusive PFC lesions impaired both the CMM and CMA tasks, selective OFC lesions disrupted only CMM, whereas selective mPFC damage did not impair performance on either task. Furthermore, HPC lesions had no impact on either task. Thus, the PFC and OFC play a selective role in crossmodal object recognition, but the exact contributions and interactions of these regions will require further research to elucidate. / Natural Sciences and Engineering Research Council of Canada (NSERC)
245

Automatic Urban Modelling using Mobile Urban LIDAR Data

Ioannou, Yani Andrew 01 March 2010
Recent advances in Light Detection and Ranging (LIDAR) technology and integration have resulted in vehicle-borne platforms for urban LIDAR scanning, such as Terrapoint Inc.'s TITAN system. Such technology has led to an explosion in ground LIDAR data. The large size of mobile urban LIDAR data sets, and the ease with which they may now be collected, has shifted the bottleneck of creating abstract urban models for Geographical Information Systems (GIS) from data collection to data processing. While turning such data into useful models has traditionally relied on human analysis, this is no longer practical. This thesis outlines a methodology for automatically recovering the information needed to create abstract urban models from mobile urban LIDAR data using computer vision methods. As an integral part of the methodology, a novel scale-based interest operator, the Difference of Normals, is introduced; it is efficient enough to process large datasets while accurately isolating objects of interest in the scene according to real-world parameters. Finally, a novel localized object recognition algorithm, Local Potential Well Space Embedding, is introduced, derived from a proven global method for object recognition (Potential Well Space Embedding). The object recognition phase of our methodology is discussed with these two algorithms as a focus. / Thesis (Master, Computing) -- Queen's University, 2010-03-01
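The Difference of Normals idea can be sketched in a few lines: estimate a unit normal at each point under a small and a large support radius and take half their difference, so that points where the two scales disagree (edges, small structures) score highly. This is a simplified illustration on a synthetic step-edge scene, not the thesis implementation, which would use spatial indexing rather than brute-force neighbour search.

```python
import numpy as np

def pca_normals(points, radius):
    """Per-point unit normals via PCA over a fixed-radius neighbourhood."""
    out = np.zeros_like(points)
    for i, p in enumerate(points):
        nbrs = points[np.linalg.norm(points - p, axis=1) < radius]
        if len(nbrs) < 3:
            continue  # too few neighbours to fit a plane
        _, vecs = np.linalg.eigh(np.cov((nbrs - nbrs.mean(axis=0)).T))
        nrm = vecs[:, 0]                       # smallest-eigenvalue direction
        out[i] = nrm if nrm[2] >= 0 else -nrm  # orient consistently upward
    return out

rng = np.random.default_rng(0)
# Synthetic scene: a flat ground plane with a 0.15-unit step edge at x = 0.
xy = rng.uniform(-1, 1, size=(2000, 2))
z = np.where(xy[:, 0] > 0, 0.15, 0.0) + rng.normal(scale=0.005, size=2000)
pts = np.column_stack([xy, z])

# Difference of Normals: half the difference between normals estimated at a
# small and a large support radius; large magnitude where the scales disagree.
don = 0.5 * (pca_normals(pts, 0.1) - pca_normals(pts, 0.4))
mag = np.linalg.norm(don, axis=1)

near_edge = np.abs(pts[:, 0]) < 0.1
print(mag[near_edge].mean(), mag[~near_edge].mean())  # edge should respond more
```

Far from the edge, both radii see the same plane and the operator stays near zero; near the edge only the large-radius normal tilts, which is what makes the operator scale-selective.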
246

Convex Optimization for Cosegmentation

Joulin, Armand 17 December 2012
Humans and most animals have a natural ability to see the world and to understand it without effort. The apparent ease with which a human perceives his surroundings suggests that, to some extent, the process involved does not require a high degree of reasoning. This observation suggests that our visual perception of the world can be simulated on a computer. Computer vision is the field of research devoted to the problem of creating a form of visual perception for computers. The first work in this field dates back to the 1950s, but the computing power of the machines of that era was insufficient to process and analyse the visual data needed to build a virtual visual perception. Only recently have computing power and storage capacity allowed this field to truly emerge. Over the past two decades, computer vision has answered practical and industrial problems such as detecting faces, spotting people behaving suspiciously in a crowd, or finding manufacturing defects on production lines. By contrast, little progress has been made towards a virtual visual perception that is not specific to a given task, and the community still faces fundamental problems. One of these problems is segmenting an image or a video into meaningful regions, in other words into objects or actions. Scene segmentation is not only natural for humans but also essential to fully understand one's environment. Unfortunately, it is also extremely difficult to reproduce on a computer. One reason is that there is no clear definition of what a "meaningful" region is. Indeed, depending on the scene or the situation, a region can have different interpretations.
For example, given a street scene, one may consider that distinguishing a pedestrian is important in that situation, whereas his clothes do not necessarily seem to be. If we now consider a scene taking place during a fashion show, a garment becomes an important element, hence a meaningful region. In this thesis, we concentrate on this segmentation problem and approach it from a particular angle in order to avoid this fundamental difficulty. We treat segmentation as a weakly supervised learning problem: instead of segmenting images according to some predefined definition of "meaningful" regions, we develop methods that simultaneously segment a set of images into regions that appear regularly. In other words, we define a "meaningful" region from a statistical point of view: it is a region that appears regularly across the given set of images. To this end, we design models whose scope extends beyond vision applications. Our approach is rooted in statistical machine learning, whose goal is to design efficient methods for extracting and/or learning recurring patterns in data sets. This field has recently gained great popularity owing to the growing number and size of available databases and the need to process data automatically. In this thesis, we concentrate on methods designed to discover the "hidden" information in a database from incomplete or missing annotations. Finally, our work is also rooted in numerical optimization, in order to design efficient algorithms specially adapted to our problems.
In particular, we use and adapt recently developed tools to relax complex combinatorial problems into convex ones for which the optimal solution is guaranteed to be found using procedures developed in convex optimization. We also illustrate the quality of our formulations and algorithms on problems drawn from fields other than computer vision. In particular, we show that our work can be applied to text classification and to cell biology.
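The relaxation strategy this abstract describes, turning a combinatorial labeling problem into a convex one whose global optimum is guaranteed, can be shown on a toy example. This is not the thesis's cosegmentation model: it is a hypothetical two-label problem on a chain of six "pixels", where a convex quadratic over the box [0,1]^n replaces the intractable search over {0,1}^n.

```python
import numpy as np

# Path-graph Laplacian over six "pixels": the quadratic x^T L x is convex
# (L is positive semidefinite) and penalises label disagreement between
# neighbours.
n = 6
L = np.zeros((n, n))
for i in range(n - 1):
    L[i, i] += 1.0
    L[i + 1, i + 1] += 1.0
    L[i, i + 1] -= 1.0
    L[i + 1, i] -= 1.0

# Unary scores: negative prefers label 0, positive prefers label 1.
c = np.array([-2.0, -2.0, -1.0, 1.0, 2.0, 2.0])

# Combinatorial problem: minimise x^T L x - c^T x over x in {0,1}^n.
# Convex relaxation: optimise over the box [0,1]^n by projected gradient,
# which provably reaches the relaxation's global optimum.
x = np.full(n, 0.5)
for _ in range(500):
    x = np.clip(x - 0.1 * (2.0 * L @ x - c), 0.0, 1.0)

labels = (x > 0.5).astype(int)
print(labels)  # [0 0 0 1 1 1]
```

Thresholding the fractional optimum at 0.5 recovers a labeling; in general the relaxation is not tight, but its global optimum is always attainable, which is the guarantee the abstract refers to.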
247

Elastic Image Alignment for Object Recognition

Duchenne, Olivier 29 November 2012
The objective of this thesis is to explore the use of graph matching in object recognition systems. In the continuity of the previously described articles, rather than using descriptors invariant to misalignment, this work directly tries to find explicit correspondences between prototypes and test images, in order to build a robust similarity measure and infer the class of the test images. In chapter 2, we present a method that, given interest points in two images, tries to find correspondences between them. It extends previous graph matching approaches [Leordeanu and Hebert, 2005a] to handle interactions between more than two feature correspondences, which allows us to build a more discriminative and/or more invariant matching method. The main contributions of this chapter are: the introduction of a high-order objective function for hyper-graph matching (Section 2.3.1); the application of the tensor power iteration method to the high-order matching task, combined with a relaxation based on constraints on the row norms of assignment matrices, which is tighter than previous methods (Section 2.3.1); an l1-norm relaxation instead of the classical l2-norm relaxation, which provides solutions that are more interpretable but still allows an efficient power iteration algorithm (Section 2.3.5); and the design of appropriate similarity measures that can be chosen either to improve the invariance of matching or to improve the expressivity of the model (Section 2.3.6). The proposed approach has been implemented and compared to state-of-the-art algorithms on both synthetic and real data. As shown by our experiments (Section 2.5), our implementation is, overall, as fast as these methods in spite of the higher complexity of the model, with better accuracy on standard databases. In chapter 3, we build a graph-matching method for object categorization.
The main contributions of this chapter are the following. Generalizing [Caputo and Jie, 2009; Wallraven et al., 2003], we propose in Section 3.3 to use the optimum value of the graph-matching problem associated with two images as a (non positive definite) kernel, suitable for SVM classification. We propose in Section 3.4 a novel extension of Ishikawa's method [Ishikawa, 2003] for optimizing MRFs which is orders of magnitude faster than competing algorithms (e.g., [Kim and Grauman, 2010; Kolmogorov and Zabih, 2004; Leordeanu and Hebert, 2005a]) for the grids with a few hundred nodes considered in this article; in turn, this allows us to combine our kernel with SVMs in image classification tasks. We demonstrate in Section 3.5, through experiments with standard benchmarks (Caltech 101, Caltech 256, and Scenes datasets), that our method matches and in some cases exceeds the state of the art for methods using a single type of features. In chapter 4, we introduce our work on object detection that performs fast image alignment. The main contributions of this chapter are the following. We propose a novel image similarity measure that allows for arbitrary deformations of the image pattern within some given disparity range and can be evaluated very efficiently [Lemire, 2006], with a cost equal to a small constant times that of correlation in a sliding-window mode. Our similarity measure relies on a hierarchical notion of parts based on simple rectangular image primitives and HOG cells [Dalal and Triggs, 2005a], and does not require manual part specification [Felzenszwalb and Huttenlocher, 2005b; Bourdev and Malik, 2009; Felzenszwalb et al., 2010] or automated discovery [Lazebnik et al., 2005; Kushal et al., 2007].
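A minimal sketch of the second-order spectral matching baseline that this work extends to hyper-graphs [Leordeanu and Hebert, 2005a]: build a pairwise affinity matrix over candidate correspondences, extract its principal eigenvector by power iteration, and discretize greedily under one-to-one constraints. The point sets, affinity bandwidth, and greedy rounding here are illustrative choices, not the thesis's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two point sets related by a rigid motion; the true matching is i -> i.
P = rng.uniform(0, 1, size=(5, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Q = P @ R.T + np.array([2.0, -1.0])

n = len(P)
cand = [(i, j) for i in range(n) for j in range(n)]  # candidate assignments

# Pairwise affinity: two assignments are compatible if they preserve
# inter-point distances; conflicting assignments get zero affinity.
M = np.zeros((n * n, n * n))
for a, (i, j) in enumerate(cand):
    for b, (k, l) in enumerate(cand):
        if i == k or j == l:
            continue
        d = abs(np.linalg.norm(P[i] - P[k]) - np.linalg.norm(Q[j] - Q[l]))
        M[a, b] = np.exp(-d ** 2 / 0.01)

# Power iteration: the principal eigenvector acts as a soft assignment.
v = np.ones(n * n)
for _ in range(50):
    v = M @ v
    v /= np.linalg.norm(v)

# Greedy discretisation under one-to-one constraints.
match, used = {}, set()
for a in np.argsort(-v):
    i, j = cand[a]
    if i not in match and j not in used:
        match[i] = j
        used.add(j)

print(match)  # a rigid motion preserves distances, so i -> i should win
```

The high-order extension described in chapter 2 replaces the matrix `M` with a tensor over triples of assignments and applies tensor power iteration instead.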
248

Learning Hierarchical Feature Extractors For Image Recognition

Boureau, Y-Lan 01 September 2012
Telling a cow from a sheep is effortless for most animals, but requires much engineering for computers. In this thesis, we seek to tease out basic principles that underlie many recent advances in image recognition. First, we recast many methods into a common unsupervised feature extraction framework based on an alternation of coding steps, which encode the input by comparing it with a collection of reference patterns, and pooling steps, which compute an aggregation statistic summarizing the codes within some region of interest of the image. Within that framework, we conduct extensive comparative evaluations of many coding or pooling operators proposed in the literature. Our results demonstrate a robust superiority of sparse coding (which decomposes an input as a linear combination of a few visual words) and max pooling (which summarizes a set of inputs by their maximum value). We also propose macrofeatures, which import into the popular spatial pyramid framework the joint encoding of nearby features commonly practiced in neural networks, and obtain significantly improved image recognition performance. Next, we analyze the statistical properties of max pooling that underlie its better performance, through a simple theoretical model of feature activation. We then present results of experiments that confirm many predictions of the model. Beyond the pooling operator itself, an important parameter is the set of pools over which the summary statistic is computed. We propose locality in feature configuration space as a natural criterion for devising better pools. Finally, we propose ways to make coding faster and more powerful through fast convolutional feedforward architectures, and examine how to incorporate supervision into feature extraction schemes. Overall, our experiments offer insights into what makes current systems work so well, and state-of-the-art results on several image recognition benchmarks.
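The coding/pooling alternation described in this abstract can be made concrete with the simplest instance of each step: hard-assignment coding against a codebook, followed by average or max pooling over a region. The codebook and descriptors below are random stand-ins; the thesis's preferred combination, sparse coding with max pooling, would replace the one-hot encoding with a sparse linear decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# A codebook of K reference patterns ("visual words") and a set of local
# descriptors extracted from one region of interest (random stand-ins).
K, D = 4, 8
codebook = rng.normal(size=(K, D))
descriptors = rng.normal(size=(10, D))

# Coding step (hard assignment): encode each descriptor as a one-hot
# vector marking its nearest codeword.
d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
codes = np.zeros((len(descriptors), K))
codes[np.arange(len(descriptors)), d2.argmin(axis=1)] = 1.0

# Pooling step: summarise the codes over the region with one statistic.
avg_pool = codes.mean(axis=0)  # histogram of word frequencies
max_pool = codes.max(axis=0)   # 1 if a word occurs anywhere in the region

print(avg_pool, max_pool)
```

Average pooling of one-hot codes recovers the classical bag-of-words histogram; max pooling only records whether a word occurs at all, which is the property the thesis analyzes theoretically.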
249

Automated Construction Progress Tracking using 3D Sensing Technologies

Turkan, Yelda 05 April 2012
Accurate and frequent construction progress tracking provides critical input data for project systems such as cost and schedule control as well as billing. Unfortunately, conventional progress tracking is labor intensive, sometimes subject to negotiation, and often driven by arcane rules. Attempts to improve progress tracking have recently focused mainly on automation, using technologies such as 3D imaging, the Global Positioning System (GPS), Ultra Wide Band (UWB) indoor locating, hand-held computers, voice recognition, and wireless networks in various combinations. Three-dimensional (3D) imaging technologies, such as 3D laser scanners (LADARs) and photogrammetry, have shown great potential for saving time and cost in recording project 3D status and thus for supporting some categories of progress tracking. Although laser scanners in particular and 3D imaging in general are being investigated and used in multiple applications in the construction industry, their full potential has not yet been achieved, perhaps because commercial software packages are still too complicated and time consuming for processing scanned data. Methods have, however, been developed for the automated, efficient and effective recognition of project 3D BIM objects in site laser scans. This thesis presents a novel system that combines 3D object recognition technology with schedule information into a combined 4D object-based construction progress tracking system. The performance of the system is investigated on a comprehensive field database acquired during the construction of a steel reinforced concrete structure, the Engineering V Building at the University of Waterloo. It demonstrates a degree of accuracy that meets or exceeds typical manual performance. However, earned value tracking is the most commonly used method in the industry.
That is why the object-based automated progress tracking system is further explored and combined with earned value theory into an earned-value-based automated progress tracking system. Nevertheless, both of these systems are focused on permanent structure objects only, not secondary or temporary ones. In the last part of the thesis, several approaches are proposed for tracking secondary and temporary objects in concrete construction. It is concluded that accurate tracking of structural building project progress is possible by combining a priori 4D project models with 3D object recognition using the algorithms developed and presented in this thesis.
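Combining object recognition output with earned value theory reduces, at its core, to simple accounting: the budgeted cost of the objects a scan confirms as built is the earned value, which is compared against the planned value implied by the schedule. A hypothetical sketch follows; all object names and costs are invented for illustration.

```python
# Hypothetical BIM objects with budgeted costs, the set of object IDs
# recognised in the latest scan, and the set the schedule says should
# exist by now (all names and figures invented for illustration).
budget = {"col_A1": 5000.0, "col_A2": 5000.0,
          "beam_B1": 8000.0, "slab_L2": 20000.0}
recognised = {"col_A1", "col_A2", "beam_B1"}                 # from 3D recognition
planned_by_now = {"col_A1", "col_A2", "beam_B1", "slab_L2"}  # from the schedule

earned_value = sum(budget[o] for o in recognised)
planned_value = sum(budget[o] for o in planned_by_now)
schedule_variance = earned_value - planned_value
percent_complete = earned_value / sum(budget.values())

print(earned_value)       # 18000.0
print(schedule_variance)  # -20000.0: the slab is behind schedule
print(round(percent_complete, 3))  # 0.474
```

The hard part, which the thesis addresses, is producing the `recognised` set reliably from noisy site laser scans; the earned value arithmetic itself is straightforward once that set exists.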
250

Representations and matching techniques for 3D free-form object and face recognition

Mian, Ajmal Saeed January 2007
[Truncated abstract] The aim of visual recognition is to identify objects in a scene and estimate their pose. Object recognition from 2D images is sensitive to illumination, pose, clutter and occlusions. Object recognition from range data, on the other hand, does not suffer from these limitations. An important paradigm is model-based recognition, whereby 3D models of objects are constructed offline and saved in a database, using a suitable representation. During online recognition, a similar representation of a scene is matched with the database for recognizing objects present in the scene . . . The tensor representation is extended to automatic and pose-invariant 3D face recognition. As the face is a non-rigid object, expressions can significantly change its 3D shape. Therefore, the last part of this thesis investigates representations and matching techniques for automatic 3D face recognition that are robust to facial expressions. A number of novelties are proposed in this area along with their extensive experimental validation using the largest available 3D face database. These novelties include a region-based matching algorithm for 3D face recognition, a 2D and 3D multimodal hybrid face recognition algorithm, fully automatic 3D nose ridge detection, fully automatic normalization of 3D and 2D faces, a low cost rejection classifier based on a novel Spherical Face Representation, and finally, automatic segmentation of the expression insensitive regions of a face.
