What can your computer recognize? Chemical and facial pattern recognition through the use of Eigen Analysis Method / Giordano, Anthony J. January 2007
Senior Honors thesis--Regis University, Denver, Colo., 2007. / Title from PDF title page (viewed on June 26, 2007). Includes bibliographical references.
08 August 2013
A new background subtraction algorithm is proposed based on a subspace model. The key components of the algorithm are a novel method for initializing the subspace and a robust update framework for continuously learning and improving the model. Unlike traditional subspace techniques, the proposed approach does not require supervised or lengthy training data upfront; instead, it is bootstrapped from a single background frame and exploits spatial information in place of temporal data to generate pixel statistics for the model. The update framework allows the model to be updated intelligently and re-initialized when the algorithm determines that this is required. Experimental results indicate that the proposed subspace algorithm outperformed traditional subspace approaches and was comparable to, and sometimes better than, leading pixel-based techniques on several standard background subtraction data sets. / Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2013-08-07 15:42:26.205
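As a rough illustration of the bootstrapping idea only (not the subspace model itself, whose details the abstract does not give), the sketch below derives per-pixel statistics from the spatial neighborhood of a single background frame and uses them to flag foreground pixels in a later frame. The function names, the neighborhood radius, and the threshold parameters `k` and `floor` are assumptions made for this example.

```python
from statistics import mean, pstdev

def bootstrap_model(frame, radius=1):
    """Build per-pixel (mean, std) from the spatial neighborhood of ONE frame.

    Spatial neighbors stand in for the temporal samples a conventional
    model would collect over many frames (assumed simplification).
    """
    h, w = len(frame), len(frame[0])
    model = []
    for y in range(h):
        row = []
        for x in range(w):
            patch = [frame[j][i]
                     for j in range(max(0, y - radius), min(h, y + radius + 1))
                     for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append((mean(patch), pstdev(patch)))
        model.append(row)
    return model

def foreground_mask(model, frame, k=3.0, floor=5.0):
    """Mark a pixel as foreground when it deviates from its bootstrapped mean
    by more than k standard deviations plus a small noise floor."""
    return [[abs(frame[y][x] - m) > k * s + floor
             for x, (m, s) in enumerate(row)]
            for y, row in enumerate(model)]

# Usage: a flat background frame, then a frame with one bright intruding pixel.
background = [[50] * 4 for _ in range(4)]
current = [row[:] for row in background]
current[1][2] = 200
mask = foreground_mask(bootstrap_model(background), current)
```

A full implementation in the spirit of the abstract would replace the per-pixel statistics with a learned subspace and add the update and re-initialization logic.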
The past decade has seen a growing interest in computer stereo vision: the recovery of the depth map of a scene from two-dimensional images. The main problem of computer stereo is in establishing correspondence between features or regions in two or more images. This is referred to as the correspondence problem. One way to reduce the difficulty of the above problem is to constrain the camera modeling. Conventional stereo systems use two or more cameras, which are positioned in space at a uniform distance from the scene. These systems use epipolar geometry for their camera modeling, in order to curb the search space to be one-dimensional - along epipolar lines. Following Jain's approach, this thesis exploits a non-conventional camera modeling: the cameras are positioned in space one behind the other, such that their optical axes are collinear (hence the name coaxial stereo), and their distance apart is known. This approach complies with a simple case of epipolar geometry which further reduces the magnitude of the correspondence problem. The displacement of the projection of a stationary point occurs along a radial line, and depends only on its spatial depth and the distance between the cameras. Thus, to simplify (significantly) the recovery of depth from disparity, complex logarithmic mapping is applied to the original images. The logarithmic part of the transformation introduces great distortion to the image's resolution. Therefore, to minimize this distortion, it is applied to the features used in the matching process. The search for matching features is conducted along radial lines. Following Mokhtarian and Mackworth's approach, a scale-space image is constructed for each radial line by smoothing its intensity profile with a Gaussian filter, and finding zero-crossings in the second derivative at varying scale levels. Scale-space images of corresponding radial lines are then matched, based on a modified uniform cost algorithm. 
The matching algorithm is written with generality in mind. As a consequence, it can easily be adapted to other stereoscopic systems. Some new results on the structure of scale-space images of one-dimensional functions are presented. / Science, Faculty of / Computer Science, Department of / Graduate
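The claim that radial displacement depends only on depth and camera separation can be checked with a small sketch. It assumes an ideal pinhole model with the same focal length for both cameras, a front camera at depth Z from the point, and a rear camera at depth Z + d; under those assumptions the radii are r_front = fR/Z and r_rear = fR/(Z + d), so the difference of their logarithms is log((Z + d)/Z) and the lateral offset R cancels. The function names are hypothetical.

```python
import math

def project_radius(lateral, depth, focal=1.0):
    """Image-plane radius of a point at a given lateral offset and depth
    (ideal pinhole camera; assumed model)."""
    return focal * lateral / depth

def depth_from_log_disparity(r_front, r_rear, baseline):
    """Recover depth (measured from the front camera) from the shift between
    the two radii in the logarithmic domain.

    shift = log(r_front) - log(r_rear) = log(1 + baseline / Z),
    hence Z = baseline / (exp(shift) - 1).
    """
    shift = math.log(r_front) - math.log(r_rear)
    return baseline / math.expm1(shift)

# Usage: a point 0.4 units off-axis, 5 units in front of the front camera,
# with the rear camera 0.5 units further back along the same optical axis.
r1 = project_radius(0.4, 5.0)
r2 = project_radius(0.4, 5.0 + 0.5)
estimated_depth = depth_from_log_disparity(r1, r2, baseline=0.5)
```

The point must lie off the optical axis (non-zero radius) for the logarithm to be defined, which is one reason the image centre needs separate handling in a log-mapped representation.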
A new form of parallelism, distributed bit-parallelism, is introduced. A distributed bit-parallel organization distributes each bit of a data item to a different processor. Bit-parallelism allows computation that is sub-linear with word size for such operations as integer addition, arithmetic shifts, and data moves. The implications of bit-parallelism for system architecture are analyzed. An implementation of a bit-parallel architecture based on a mesh with bypass network is presented. The performance of bit-parallel algorithms on this architecture is analyzed and found to be several times faster than bit-serial algorithms. The application of the architecture to low level vision algorithms is discussed. / Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate
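The abstract does not say which addition algorithm the thesis uses. One well-known way to obtain addition in a number of steps that grows sub-linearly with word size is a parallel-prefix carry computation (Kogge-Stone style), simulated below with one list slot standing in for each per-bit processor. The function name and the sequential simulation are assumptions for illustration; on real hardware the inner loop over bit positions would run concurrently.

```python
def prefix_add(a, b, width=8):
    """Add two unsigned integers modulo 2**width using a parallel-prefix
    carry computation. Returns (sum, number_of_parallel_steps)."""
    # Each "processor" i holds one bit of each operand.
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    generate = [x & y for x, y in zip(a_bits, b_bits)]
    propagate = [x ^ y for x, y in zip(a_bits, b_bits)]
    half_sum = propagate[:]          # per-bit sum before carries are applied

    steps = 0
    dist = 1
    while dist < width:
        # Every position combines with the position `dist` bits below it;
        # all positions update at once, so this counts as one parallel step.
        new_g, new_p = generate[:], propagate[:]
        for i in range(dist, width):
            new_g[i] = generate[i] | (propagate[i] & generate[i - dist])
            new_p[i] = propagate[i] & propagate[i - dist]
        generate, propagate = new_g, new_p
        dist *= 2
        steps += 1

    # generate[i] is now the carry out of bit i; shift by one for carry-in.
    carry_in = [0] + generate[:-1]
    total = 0
    for i in range(width):
        total |= (half_sum[i] ^ carry_in[i]) << i
    return total, steps

# Usage: an 8-bit add completes in 3 parallel steps instead of 8 ripple steps.
result, steps = prefix_add(100, 55, width=8)
```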
Van Niekerk, Graeme Neill
20 November 2014
M.Sc. (Computer Science) / This dissertation studies the field of computer vision and various fields related to it. Organic vision is investigated through a study of the organic focusing device and the visual cortex in humans, including from a psychological point of view. Various network models emulating the neural networks of the human visual cortex, as well as its component networks, are investigated. Recent work in the area of neural networks and computer vision is also mentioned. The mathematical theory and techniques used in image formation and image processing are studied. The field of artificial intelligence and its relation to the computer vision problem is examined, together with a discussion of numerous application systems that have been developed. Existing industrial applications of computer vision are studied, and systems developed for this purpose are mentioned. The use of parallel architectures and multiresolution systems for computer vision applications is investigated. Finally, formal language theory and automata are discussed in terms of their relevance to computer vision. The discussion centers on the recognition of two- and three-dimensional structures by various automata in the two dimensions. From this study, a formal model for the recognition of three-dimensional digital structures is proposed and informally defined. It will be the aim of further study to fully develop and implement this model.
31 May 2022
Human pose estimation (HPE) is an ever-growing research field, with an increasing number of publications in the computer vision and deep learning fields and it covers a multitude of practical scenarios, from sports to entertainment and from surveillance to medical applications. Despite the impressive results that can be obtained with HPE, there are still many problems that need to be tackled when dealing with real-world applications. Most of the issues are linked to a poor or completely wrong detection of the pose that emerges from the inability of the network to model the viewpoint. This thesis shows how designing viewpoint-equivariant neural networks can lead to substantial improvements in the field of human pose estimation, both in terms of state-of-the-art results and better real-world applications. By jointly learning how to build hierarchical human body poses together with the observer viewpoint, a network can learn to generalise its predictions when dealing with previously unseen viewpoints. As a result, the amount of training data needed can be drastically reduced, simultaneously leading to faster and more efficient training and more robust and interpretable real-world applications.
09 January 2009
The combination of low-cost imaging chips and high-performance, multicore, embedded processors heralds a new era in portable vision systems. Early vision algorithms have the potential for highly data-parallel, integer execution. However, an implementation must operate within the constraints of embedded systems, including low clock rate, low-power operation, and limited memory. This dissertation explores new approaches to adapt novel pixel-based vision algorithms for tomorrow's multicore embedded processors. It presents:
- An adaptive, multimodal background modeling technique called Multimodal Mean that achieves high accuracy and frame rate performance with limited memory and a slow-clock, energy-efficient, integer processing core.
- A new workload partitioning technique to optimize the execution of early vision algorithms on multicore systems.
- A novel data transfer technique called cat-tail DMA that provides globally-ordered, non-blocking data transfers on a multicore system.
By using efficient data representations, Multimodal Mean provides accuracy comparable to the widely used Mixture of Gaussians (MoG) multimodal method. However, it achieves a 6.2x improvement in performance and uses 18% less storage than MoG when executing on a representative embedded platform. When this algorithm is adapted to a multicore execution environment, the new workload partitioning technique demonstrates an improvement in execution times of 25% with only a 125 ms system reaction time. It also reduces the overall number of data transfers by 50%. Finally, the cat-tail buffering technique reduces the data-transfer latency between execution cores and main memory by 32.8% over the baseline technique when executing Multimodal Mean. This technique performs data transfers concurrently with code execution on individual cores, while maintaining global ordering through low-overhead scheduling to prevent collisions.
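To make the idea of an integer-only multimodal background model concrete, here is a simplified single-pixel sketch in that spirit. It is not the dissertation's exact algorithm: the cell layout (running sum and count), the matching rule, the replacement policy, the decimation step, and all parameter values are assumptions chosen for illustration.

```python
class PixelModel:
    """Integer-only multimodal running-mean model for one grayscale pixel
    (assumed simplification; parameters are hypothetical)."""

    def __init__(self, max_cells=4, match_threshold=30, min_count=3):
        self.max_cells = max_cells
        self.match_threshold = match_threshold
        self.min_count = min_count
        self.cells = []  # each cell is [sum_of_values, count]

    def update(self, value):
        """Fold `value` into the model; return True if it looks like background."""
        for cell in self.cells:
            total, count = cell
            # |value - total/count| <= threshold, rewritten without division.
            if abs(value * count - total) <= self.match_threshold * count:
                cell[0] += value
                cell[1] += 1
                return cell[1] > self.min_count
        # No mode matched: start a new one, evicting the weakest if full.
        if len(self.cells) >= self.max_cells:
            self.cells.remove(min(self.cells, key=lambda c: c[1]))
        self.cells.append([value, 1])
        return False

    def decimate(self):
        """Halve sums and counts with shifts so old evidence fades and the
        integers stay bounded; cells whose count reaches zero are dropped."""
        self.cells = [[t >> 1, c >> 1] for t, c in self.cells if c >> 1 > 0]

# Usage: a stable background near 100, then a similar and a very different value.
pixel = PixelModel()
for _ in range(10):
    pixel.update(100)
is_background_similar = pixel.update(102)
is_background_outlier = pixel.update(200)
```

Keeping sums and counts rather than floating-point means and variances is what lets a model like this run on an integer core with a small memory footprint.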
Thesis (Ph. D.)--Virginia Polytechnic Institute and State University, 1991. / Vita. Abstract. Includes bibliographical references (leaves 184-191). Also available via the Internet.
Taylor W. Hubbard (5930666)
17 January 2019
The thesis covers the creation and testing of a low-cost, modular system for sorting pegs used in products by Lafayette Instruments. The system is designed to check peg dimensions using computer vision, sorting out nonconforming parts and counting conforming ones. Conforming parts are separated into bins of predetermined quantities so that they do not need manual counting. The developed system will save engineers and technicians at Lafayette Instruments many hours of manually sorting and counting the roughly 160,000 pegs a year. The system will be able to sort and count at a speed comparable to a human operator while achieving an overall average accuracy of 95% or higher.
On the design and implementation of decision-theoretic, interactive, and vision-driven mobile robots / Elinas, Pantelis 05 1900
We present a framework for the design and implementation of visually-guided, interactive, mobile robots. Essential to the framework's robust performance is our behavior-based robot control architecture enhanced with a state of the art decision-theoretic planner that takes into account the temporal characteristics of robot actions and allows us to achieve principled coordination of complex subtasks implemented as robot behaviors/skills. We study two different models of the decision theoretic layer: Multiply Sectioned Markov Decision Processes (MSMDPs) under the assumption that the world state is fully observable by the agent, and Partially Observable Markov Decision Processes (POMDPs) that remove the latter assumption and allow us to model the uncertainty in sensor measurements. The MSMDP model utilizes a divide-and-conquer approach for solving problems with millions of states using concurrent actions. For solving large POMDPs, we present heuristics that improve the computational efficiency of the point-based value iteration algorithm while tackling the problem of multi-step actions using Dynamic Bayesian Networks. In addition, we describe a state-of-the-art simultaneous localization and mapping algorithm for robots equipped with stereo vision. We first present the Monte-Carlo algorithm sigmaMCL for robot localization in 3D using natural landmarks identified by their appearance in images. Secondly, we extend sigmaMCL and develop the sigmaSLAM algorithm for solving the simultaneous localization and mapping problem for visually-guided, mobile robots. We demonstrate our real-time algorithm mapping large, indoor environments in the presence of large changes in illumination, image blurring and dynamic objects. Finally, we demonstrate empirically the applicability of our framework for developing interactive, mobile robots capable of completing complex tasks with the aid of a human companion. 
We present an award winning robot waiter for serving hors d'oeuvres at receptions and a robot for delivering verbal messages among inhabitants of an office-like environment.
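The way a POMDP models uncertainty in sensor measurements comes down to maintaining a belief (a probability distribution over states) and revising it after each action and observation. The sketch below shows the standard Bayes-filter belief update, b'(s') proportional to O(o | s', a) times the sum over s of T(s' | s, a) b(s). The two-state example, the action name, and the probability tables are hypothetical and are not taken from the thesis.

```python
def belief_update(belief, transition, observation, action, obs):
    """Standard POMDP belief update.

    transition[action][s][s2]  = P(s2 | s, action)
    observation[action][s2][o] = P(o | s2, action)
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [sum(transition[action][s][s2] * belief[s] for s in range(n))
                 for s2 in range(n)]
    # Correction step: weight by the likelihood of the observation.
    unnormalized = [observation[action][s2][obs] * predicted[s2]
                    for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]

# Usage: two hidden states, a sensing action that does not change the state,
# and a sensor that reports the true state 85% of the time (assumed numbers).
T = {"sense": [[1.0, 0.0], [0.0, 1.0]]}
O = {"sense": [[0.85, 0.15], [0.15, 0.85]]}
posterior = belief_update([0.5, 0.5], T, O, "sense", obs=0)
```

Exact updates like this scale with the square of the number of states, which is why point-based approximations such as the value iteration heuristics mentioned above matter for large problems.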