691

Virtual image sensors to track human activity in a smart house

Tun, Min Han January 2007 (has links)
With the advancement of computer technology, demand for more accurate and intelligent monitoring systems has also risen. Applications of computer vision and video analysis range from industrial inspection to surveillance. Object detection and segmentation are the first and most fundamental tasks in the analysis of dynamic scenes. Traditionally, detection and segmentation are done through temporal differencing or statistical modelling methods. One of the most widely used background modelling and segmentation algorithms is the Mixture of Gaussians method developed by Stauffer and Grimson (1999). During the past decade many such algorithms have been developed, ranging from parametric to non-parametric. Many of them utilise pixel intensities to model the background, but some use texture properties such as Local Binary Patterns. These algorithms function quite well under normal environmental conditions and each has its own set of advantages and shortcomings. However, they share two drawbacks. The first is the stationary object problem: when moving objects become stationary, they are merged into the background. The second is the problem of light changes: when rapid illumination changes occur in the environment, these background modelling algorithms produce large areas of false positives. / These algorithms are capable of adapting to the change; however, the quality of the segmentation is very poor during the adaptation phase. In this thesis, a framework to suppress these false positives is introduced. Image properties such as edges and textures are utilised to reduce the number of false positives during the adaptation phase. The framework is built on the idea of sequential pattern recognition. In any background modelling algorithm, the importance of multiple image features as well as different spatial scales cannot be overlooked. Failure to attend to these two factors makes it difficult to detect and reduce false alarms caused by rapid light change and other conditions. The use of edge features in false alarm suppression is also explored. Edges are somewhat more resistant to environmental changes in video scenes: the assumption is that regardless of environmental changes, such as illumination change, the edges of objects should remain the same. The edge-based approach is tested on several videos containing rapid light changes and shows promising results. Texture is then used to analyse video images and remove false alarm regions. A texture gradient approach and Laws Texture Energy Measures are used to find and remove false positives; it is found that the Laws Texture Energy Measures perform better than the gradient approach. The results of using edges, texture and different combinations of the two in false positive suppression are also presented in this work. The false positive suppression framework is applied to a smart house scenario that uses cameras to model "virtual sensors" which detect the interactions of occupants with devices. Results show that the accuracy of the virtual sensors, compared against ground truth, is improved.
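As a reference point for the baseline this thesis builds on (not its suppression framework), here is a minimal sketch of Mixture-of-Gaussians background subtraction using OpenCV's MOG2 implementation, which extends the Stauffer-Grimson mixture approach; the video path is a placeholder.

```python
import cv2

# Open a video and apply a Mixture-of-Gaussians background subtractor.
# "input.avi" is a placeholder path.
cap = cv2.VideoCapture("input.avi")
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Each pixel is classified against its per-pixel Gaussian mixture;
    # foreground pixels come back as 255, detected shadows as 127.
    mask = subtractor.apply(frame)
    # Rapid illumination changes show up as large false-positive regions
    # in this mask, which is the failure mode the thesis framework targets.
    cv2.imshow("foreground", mask)
    if cv2.waitKey(30) & 0xFF == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```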
692

Creation and spatial partitioning of mip-mappable geometry images

Domanski, Luke, University of Western Sydney, College of Health and Science, School of Computing and Mathematics January 2007 (has links)
A Geometry Image (GIM) describes a regular polygonal surface mesh using a standard 2D image format without the need for explicit connectivity information. Like many regular or semi-regular surface representations, GIMs lend themselves well to a number of processing tasks performed in computer graphics. It has been suggested that GIMs could provide improvements within real-time rendering pipelines through straightforward localised surface processing and simple mip-map based level-of-detail. The simplicity of such algorithms in the case of GIMs makes them highly amenable to user-transparent implementation in graphics hardware or programming libraries, shifting implementation responsibility away from the application programmer and reducing the processing load on the CPU. However, these topics have received limited attention in the literature. This thesis examines a number of issues regarding mip-mapping and localised processing of GIMs. In particular, it focuses on the creation of mip-mappable multi-chart GIMs and on how to spatially partition and cull GIMs such that mip-mapping and localised processing can be performed effectively. These are important processing tasks that occur before rendering takes place, but they influence how mip-mapping and localised processing can be implemented and utilised during rendering. Solutions that consider such influences are therefore likely to facilitate simple and effective algorithms for mip-mapping and localised processing of GIMs that are amenable to hardware implementation. The topics discussed in this thesis will form a basis for future work on low-level geometric mip-mapping and localised processing of GIMs in real-time graphics pipelines. With respect to creating mip-mappable GIMs, the thesis presents a method for automatic generation of polycube parameter domains and surface mappings that can be used to create multi-chart GIMs with square or rectangular charts. As will be discussed, these GIMs provide particular advantages for mip-mapping compared to multi-chart GIMs with irregularly shaped charts. The method casts the polycube generation problem as a coarse topology-preserving voxelisation of the surface that simultaneously aligns the surface with the voxel set boundary. This process produces both the polycube and an initial surface-to-polycube mapping. Theorems for piecewise construction of well-composed voxel sets are also presented; these facilitate a piecewise implementation of the polycube generation algorithm and support the topological guarantees it provides. The method improves on previous methods for polycube generation, which require significant user interaction. For spatial partitioning of GIMs, the thesis introduces the concept of locality masks: bit masks that partition parameter space. The method stores a 2D bit mask at each spatial node which identifies the set of GIM samples inside the node's spatial range. Like GIMs, locality masks support mip-mapping through simple image down-sampling. By unifying the masks of all nodes that pass a spatial query, processing can be performed globally on the unculled set of primitives rather than on a node-by-node basis, promoting a more optimised order in which to perform localised processing. Locality masks are also well suited to compression and provide a bandwidth-efficient method of transferring a list of indexed rendering primitives. The locality masks method is compared to other methods of partitioning and culling GIMs, and their suitability for rendering and other tasks is analysed. / Doctor of Philosophy (PhD)
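The thesis's own data structures are not reproduced here; the following is a small NumPy sketch of the locality-mask idea as described above, with all sizes and node layouts chosen purely for illustration: a per-node bit mask over GIM parameter space, a union across nodes that pass a spatial query, and mip-mapping by plain image down-sampling.

```python
import numpy as np

def downsample_mask(mask: np.ndarray) -> np.ndarray:
    """Halve a locality mask: a coarse sample is kept if any of the
    four fine samples it covers was inside the node's spatial range."""
    h, w = mask.shape
    blocks = mask.reshape(h // 2, 2, w // 2, 2)
    return blocks.any(axis=(1, 3))

# Illustrative example: two nodes of a spatial partition, each storing a
# bit mask over an 8x8 GIM's parameter space.
node_a = np.zeros((8, 8), dtype=bool); node_a[:4, :4] = True
node_b = np.zeros((8, 8), dtype=bool); node_b[4:, 2:6] = True

# Masks of nodes that pass a spatial query (e.g. frustum culling) are
# unified, so processing runs once over the combined unculled sample set
# rather than node by node.
visible = node_a | node_b

# Mip-mapping the mask is simple image down-sampling, mirroring the GIM.
coarser = downsample_mask(visible)
print(visible.astype(int))
print(coarser.astype(int))
```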
693

Machine Vision as the Primary Sensory Input for Mobile, Autonomous Robots

Lovell, Nathan, N/A January 2006 (has links)
Image analysis, and its application to sensory input (computer vision), is a fairly mature field, so it is surprising that its techniques are not extensively used in robotic applications. The reason is that, traditionally, robots have been used in controlled environments where sophisticated computer vision was not necessary, for example in car manufacturing. As the field of robotics has moved toward providing general-purpose robots that must function in the real world, it has become necessary to provide robots with robust sensors capable of understanding the complex world around them. However, when researchers apply techniques previously studied in the image analysis literature to robotics, several difficult problems emerge. In this thesis we examine four areas in which improvement is needed before work in image analysis can be applied directly to real-time, general-purpose computer vision applications: the computational complexity of image analysis algorithms, robustness to dynamic and unpredictable visual conditions, independence from domain-specific knowledge in object recognition, and the development of debugging facilities. This thesis makes several innovative contributions in each area. We argue that, although each area is distinct, improvement must be made in all four before vision can be utilised as the primary sensory input for mobile, autonomous robotic applications. In the first area, the computational complexity of image analysis algorithms, we note the dependence of a large number of high-level processing routines on a small number of low-level algorithms; therefore, improvement to a small set of highly utilised algorithms will yield benefits in a large number of applications. In this thesis we examine the common tasks of image segmentation, edge and straight-line detection, and vectorisation. In the second area, robustness to dynamic and unpredictable conditions, we examine how vision systems can be made more tolerant of changes of illumination in the visual scene. We examine the classical image segmentation task and present a method for illumination independence that builds on our work from the first area. The third area is the reliance on domain-specific knowledge in object recognition. Many current systems depend on a large amount of hard-coded domain-specific knowledge to understand the world around them. This makes such a system hard to modify, even for slight changes in the environment, and very difficult to apply in a different context entirely. We present an XML-based language, the XML Object Definition (XOD) language, as a solution to this problem. The language is largely descriptive rather than imperative: instead of describing how to locate objects within each image, the developer simply describes the properties of the objects. The final area is the development of support tools. Vision system programming is extremely difficult because large amounts of data are handled at a very fast rate, and if the system is running on an embedded device (such as a robot) then locating defects in the code is a time-consuming and frustrating task. Development-support tools exist for specific applications; we present a general-purpose development-support tool for embedded, real-time vision systems. The primary case study for this research is robotic soccer, in the international RoboCup Four-Legged League. We utilise all of the research in this thesis to provide the first illumination-independent object recognition system for RoboCup. Furthermore, we illustrate the flexibility of our system by applying it to several other tasks and to marked changes in the visual environment for RoboCup itself.
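The thesis's optimised algorithms are not reproduced here; as a generic illustration of two of the low-level routines it names (edge detection, and straight-line detection with vectorised output), a short OpenCV sketch with a placeholder input image:

```python
import cv2
import numpy as np

# Generic edge and straight-line detection: two low-level routines on
# which many high-level vision tasks depend. "frame.png" is a placeholder.
img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

edges = cv2.Canny(img, 50, 150)  # binary edge map
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                        threshold=60, minLineLength=30, maxLineGap=5)

# Each detected segment is a vectorised line: two endpoints, not pixels.
for (x1, y1, x2, y2) in (lines.reshape(-1, 4) if lines is not None else []):
    print(f"segment ({x1},{y1}) -> ({x2},{y2})")
```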
694

Reconstructing 3D geometry from multiple images via inverse rendering.

Bastian, John William January 2008 (has links)
An image is a two-dimensional representation of the three-dimensional world. Recovering the information which is lost in the process of image formation is one of the fundamental problems in computer vision. One approach to this problem involves generating and evaluating a succession of surface hypotheses, with the best hypothesis selected as the final estimate. The fitness of each hypothesis can be evaluated by comparing the reference images against synthetic images of the hypothesised surface rendered with the reference cameras. An infinite number of surfaces can recreate any set of reference images, so many approaches to the reconstruction problem recover the largest surface from this set. In contrast, the approach we present here accommodates prior structural information about the scene, thereby reducing ambiguity and finding a reconstruction which reflects the requirements of the user. The user describes structural information by defining a set of primitives and relating them by parameterised transformations. The reconstruction problem then becomes one of estimating the parameter values that transform the primitives such that the hypothesised surface best recreates the reference images. Two appearance-based likelihoods which measure the hypothesised surface against the reference images are described. The first likelihood compares each reference image against an image synthesised from the same viewpoint by rendering a projection of a second image onto the surface. The second likelihood finds the 'optimal' surface texture given the hypothesised scene configuration. Not only does this process maximise photo-consistency with respect to all reference images, but it prohibits incorrect reconstructions by allowing the use of prior information about occlusion. The second likelihood is able to reconstruct scenes in cases where the first is biased. / http://proxy.library.adelaide.edu.au/login?url= http://library.adelaide.edu.au/cgi-bin/Pwebrecon.cgi?BBID=1330993 / Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2008
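A minimal sketch of the appearance-based evaluation loop described above, assuming a hypothetical `render` function that stands in for synthesising the parameterised primitives from a reference viewpoint; it is not the thesis's implementation:

```python
import numpy as np

def photo_consistency(params, reference_images, cameras, render):
    """Score a surface hypothesis by how well its renderings recreate
    the reference images (sum of squared differences; lower is better).

    `render(params, camera)` is a hypothetical stand-in for synthesising
    an image of the transformed primitives from a reference viewpoint.
    """
    error = 0.0
    for image, camera in zip(reference_images, cameras):
        synthetic = render(params, camera)
        error += np.sum((image.astype(float) - synthetic.astype(float)) ** 2)
    return error

# A hypothesis search would then minimise this error over the parameters
# of the transformations relating the user-defined primitives.
```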
695

An extended Mumford-Shah model and improved region merging algorithm for image segmentation

Tao, Trevor January 2005 (has links)
In this thesis we extend the Mumford-Shah model and propose a new region merging algorithm for image segmentation. The segmentation problem is to determine an optimal partition of an image into constituent regions such that individual regions are internally homogeneous and adjacent regions have contrasting properties. By optimal, we mean a partition that minimizes a particular energy functional. In region merging, the image is initially divided into a very fine grid, with each pixel being a separate region. Regions are then recursively merged until it is no longer possible to decrease the energy functional. In 1994, Koepfler, Lopez and Morel developed a region merging algorithm for segmenting an image. They consider the piecewise constant Mumford-Shah model, where the energy functional consists of two terms, accuracy versus complexity, with the trade-off controlled by a scale parameter. They show that one can efficiently generate a hierarchy of segmentations from coarse to fine. This algorithm is complemented by a sound theoretical analysis of the piecewise constant model, due to Morel and Solimini. The primary motivation for extending the Mumford-Shah model stems from the fact that this model is only suitable for "cartoon" images, where each region is uncontaminated by any form of noise. Other shortcomings also need to be addressed. In the algorithm of Koepfler et al., it is difficult to determine the order in which the regions are merged, and a "schedule" is required in order to determine the number and fineness of segmentations in the hierarchy. Both of these difficulties hamper the theoretical analysis of Koepfler's algorithm. There is no definite method for selecting the "optimal" value of the scale parameter itself. Furthermore, the mathematical analysis is not well understood for more complex models. None of these issues are convincingly answered in the literature. This thesis aims to address the above shortcomings by introducing new techniques for region merging algorithms and a better understanding of the theoretical analysis of both the mathematics and the algorithm's performance. A review of general segmentation techniques is provided early in the thesis. Also discussed are the development of an "extended" model to account for white noise contamination of images, and an improvement of Koepfler's original algorithm which eliminates the need for a schedule. The work of Morel and Solimini is generalized to the extended model. Also considered are an application to textured images and the issue of selecting the value of the scale parameter. / Thesis (Ph.D.)--School of Mathematical Sciences, 2005.
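For reference, the piecewise constant Mumford-Shah functional that region merging minimises, in standard notation (the thesis's extended model adds noise handling not shown here):

```latex
\[
E(u, K) \;=\; \sum_{i} \int_{R_i} \bigl(g(x) - c_i\bigr)^2 \, dx \;+\; \lambda \, \ell(K)
\]
```

Here g is the observed image, the R_i are the regions separated by the boundary set K, c_i is the mean of g over R_i, l(K) is the total boundary length, and lambda is the scale parameter trading accuracy against complexity. A merge of two adjacent regions is accepted when it decreases E: it always shortens the boundary term, so it succeeds whenever the resulting increase in the fidelity term is small enough.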
696

Adaptive motion analysis in uncalibrated monocular sequences / Analyse adaptative du mouvement dans des séquences monoculaires non calibrées

Lingrand, Diane 22 July 1999 (has links) (PDF)
Set within the robotic vision framework, this thesis focuses on the analysis of uncalibrated monocular video sequences, taking into account singular physical cases (camera models, internal camera parameter evolution, object displacements, scene structure) that lead to specific equations. Singularities may allow us to retrieve more motion or structure elements than the general equations do; moreover, numerical precision improves as the number of parameters decreases. Thus, the detection and proper handling of the geometric and kinematic properties of these singular cases are fundamental. The complete study of all singular cases in a pair of images, or in a video sequence, is computationally intractable and requires an appropriately adapted algorithm. The implementation of the theoretical study is based on the Argès robotic system, which is able to determine the specific movement from a pair of images and compute the related parameters.
697

A Single-Camera Gaze Tracker using Controlled Infrared Illumination

Wallenberg, Marcus January 2009 (has links)
Gaze tracking is the estimation of the point in space a person is “looking at”. It is widely used in both diagnostic and interactive applications, such as visual attention studies and human-computer interaction. The most common commercial solutions for gaze tracking today use a combination of infrared illumination and one or more cameras. These commercial solutions are reliable and accurate, but often expensive. The aim of this thesis is to construct a simple single-camera gaze tracker from off-the-shelf components. The method used for gaze tracking is based on infrared illumination and a schematic model of the human eye: the user’s gaze point is estimated from images of the reflections of specific light sources on the surfaces of the eye. The software and hardware components are evaluated both separately and as a whole system. Accuracy is measured as spatial and angular deviation; the result is an average accuracy of approximately one degree on synthetic data, and 0.24 to 1.5 degrees on real images, at a range of 600 mm.
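The thesis's schematic eye model is not reproduced here; as a loosely related single-camera baseline, here is a sketch of the common pupil-centre/corneal-reflection approach, where the pupil-to-glint vector is mapped to gaze coordinates through a polynomial fitted on calibration points. All data below is synthetic and purely illustrative.

```python
import numpy as np

def features(v):
    # Second-order polynomial features of a pupil-to-glint vector (x, y).
    x, y = v
    return np.array([1.0, x, y, x * y, x * x, y * y])

def fit_mapping(pupil_glint_vectors, screen_points):
    """Least-squares fit of the polynomial gaze mapping from calibration
    data (measured vectors paired with known fixation targets)."""
    A = np.array([features(v) for v in pupil_glint_vectors])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(screen_points), rcond=None)
    return coeffs  # shape (6, 2): one column per screen coordinate

def estimate_gaze(coeffs, v):
    return features(v) @ coeffs

# Synthetic calibration: nine fixation targets and noisy measured vectors.
rng = np.random.default_rng(0)
targets = np.array([[x, y] for x in (0, 300, 600) for y in (0, 200, 400)], float)
vectors = targets / 100.0 + rng.normal(scale=0.02, size=targets.shape)

coeffs = fit_mapping(vectors, targets)
print(estimate_gaze(coeffs, vectors[4]))  # should be near (300, 200)
```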
698

Object Recognition with Cluster Matching

Lennartsson, Mattias January 2009 (has links)
Within this thesis an algorithm for object recognition called Cluster Matching has been developed, implemented and evaluated. Image information is sampled at arbitrary sample points, rather than at interest points, and local image features are extracted. These sample points serve as a compact representation of the image data and can quickly be searched for previously known objects. The algorithm is evaluated on a test set of images, and the results are surprisingly reliable and time-efficient.
699

Saliency Maps using Channel Representations / Saliency-kartor utifrån kanalrepresentationer

Tuttle, Alexander January 2010 (has links)
In this thesis an algorithm for producing saliency maps, together with an algorithm for detecting salient regions based on the saliency map, was developed. The saliency values are computed as center-surround differences, and a local descriptor called the region p-channel is used to represent the center and surround respectively. An integral image representation called the integral p-channel is used to speed up extraction of the local descriptor for any given image region. The center-surround difference is calculated as either a histogram or a p-channel dissimilarity.

Ground truth was collected using human subjects, and the algorithm’s ability to detect salient regions was evaluated against this ground truth. The algorithm was also compared to another saliency algorithm.

Two different center-surround interpretations are tested, as well as several p-channel and histogram dissimilarity measures. The results show that for all tested settings the best-performing dissimilarity measure is the so-called diffusion distance. The performance comparison showed that the algorithm developed in this thesis outperforms the algorithm against which it was compared, with respect to both region detection and saliency ranking of regions. It can be concluded that the algorithm shows promising results, and further investigation is recommended. A list of suggested approaches for further research is provided.
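The region p-channel descriptor is not reproduced here; the following is a simplified sketch of the center-surround histogram-dissimilarity idea, using a chi-squared distance in place of the diffusion distance the thesis found best. All window sizes and data are illustrative.

```python
import numpy as np

def histogram(region, bins=16):
    # Normalised grey-level histogram of an image region.
    h, _ = np.histogram(region, bins=bins, range=(0, 256))
    return h / max(h.sum(), 1)

def chi2(p, q, eps=1e-9):
    # Chi-squared histogram dissimilarity (a stand-in for the
    # diffusion distance the thesis found to perform best).
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

def saliency(img, cy, cx, r_in=8, r_out=24):
    """Center-surround saliency at one pixel: dissimilarity between the
    center patch histogram and the surrounding window's histogram (the
    center is included in the surround here, as a simplification)."""
    outer = img[max(cy - r_out, 0):cy + r_out, max(cx - r_out, 0):cx + r_out]
    inner = img[max(cy - r_in, 0):cy + r_in, max(cx - r_in, 0):cx + r_in]
    return chi2(histogram(inner), histogram(outer))

# Illustrative image: a bright square on dark noise should score high at
# its center and low in a flat noise region.
rng = np.random.default_rng(1)
img = rng.integers(0, 40, size=(128, 128)).astype(np.uint8)
img[56:72, 56:72] = 220
print(saliency(img, 64, 64), saliency(img, 20, 20))
```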
700

Visual-inertial tracking using Optical Flow measurements

Larsson, Olof January 2010 (has links)
Visual-inertial tracking is a well-known technique for tracking a combination of a camera and an inertial measurement unit (IMU). An issue with the straightforward approach is the need for known 3D points. To bypass this, 2D information can be used, without recovering depth, to estimate the position and orientation (pose) of the camera. This Master's thesis investigates the feasibility of using Optical Flow (OF) measurements and indicates the benefits of this approach.

The 2D information is added using OF measurements. OF describes the visual flow of interest points in the image plane. Without the need to estimate the depth of these points, the computational complexity is reduced. With the increased 2D information, the amount of 3D information required for the pose estimate decreases.

The use of 2D points for pose estimation has been verified with experimental data gathered by a real camera/IMU system. Several data sequences containing different trajectories are used to estimate the pose. It is shown that OF measurements can be used to improve visual-inertial tracking with a reduced need for 3D-point registrations.
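A minimal sketch of producing OF measurements of the kind fused with the IMU, using OpenCV's pyramidal Lucas-Kanade tracker; the sensor-fusion filter itself is not shown and the video path is a placeholder.

```python
import cv2

# Pyramidal Lucas-Kanade tracking of interest points between frames,
# yielding 2D flow vectors. "seq.avi" is a placeholder path.
cap = cv2.VideoCapture("seq.avi")
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=7)

while points is not None and len(points) > 0:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_points, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, points, None)
    kept = status.ravel() == 1
    # Each surviving point gives a 2D flow vector; no depth is recovered,
    # which is what keeps the measurement model computationally cheap.
    flow = (new_points - points)[kept].reshape(-1, 2)
    if len(flow):
        print("mean flow:", flow.mean(axis=0))
    prev_gray, points = gray, new_points[kept].reshape(-1, 1, 2)

cap.release()
```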
