161

Confidence measures for disparity estimates from energy neuron populations

Tsang, Kong Chau. January 2007
Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2007. Includes bibliographical references. Also available in electronic version.
162

Efficient recursive factorization methods for determining structure from motion

Li, Yanhua. January 2000
Bibliography: leaves 100-110. This thesis addresses the structure from motion problem in computer vision.
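For context, the factorization family this thesis extends can be sketched compactly. The following Python fragment is a minimal batch Tomasi-Kanade-style factorization, shown only as an illustration of the general idea; the thesis's recursive, more efficient variants are not reproduced here.

```python
# A minimal batch factorization sketch (Tomasi-Kanade style). Illustrative
# only: the thesis's recursive factorization methods are not shown.
import numpy as np

def factorize(W):
    """Factor a 2F x P measurement matrix of P points tracked over F frames
    into camera motion (2F x 3) and structure (3 x P), up to an affine
    ambiguity (a metric upgrade step is omitted)."""
    # Centre each row: subtracting the per-frame centroid removes translation.
    W = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Under orthographic projection the registered matrix has rank 3.
    M = U[:, :3] * np.sqrt(s[:3])          # motion
    S = np.sqrt(s[:3])[:, None] * Vt[:3]   # structure
    return M, S
```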
163

Multi-cue visual tracking: feature learning and fusion

Lan, Xiangyuan. 10 August 2016
As an important and active research topic in the computer vision community, visual tracking is a key component in many applications ranging from video surveillance and robotics to human-computer interaction. In this thesis, we propose new appearance models based on multiple visual cues and address several research issues in feature learning and fusion for visual tracking. Feature extraction and feature fusion are the two key modules for constructing an appearance model of the tracked target from multiple visual cues. Feature extraction aims to extract informative features for visual representation of the tracked target, and many hand-crafted feature descriptors capturing different types of visual information have been developed. However, since large appearance variations, e.g. occlusion and illumination changes, may occur during tracking, the target samples may be contaminated or corrupted, and the extracted raw features may then fail to capture the intrinsic properties of the target appearance. Besides, without explicitly imposing discriminability, the extracted features may suffer from background distraction. To extract uncontaminated, discriminative features from multiple visual cues, this thesis proposes a novel robust joint discriminative feature learning framework which is capable of (1) simultaneously and optimally removing corrupted features and learning reliable classifiers, and (2) exploiting both the consistent and the feature-specific discriminative information of multiple features. In this way, the features and classifiers learned from potentially corrupted tracking samples can be better utilized for target representation and foreground/background discrimination.

As shown by the Data Processing Inequality, fusion at the feature level preserves more information than fusion at the classifier level. In addition, not all visual cues/features are reliable, so combining all of them may not achieve better tracking performance; it is more reasonable to dynamically select and fuse multiple visual cues. Based on these considerations, this thesis proposes a novel joint sparse representation model in which feature selection, fusion, and representation are performed optimally in a unified framework. By taking advantage of sparse representation, unreliable features are detected and removed while reliable features are fused at the feature level for target representation. To capture the non-linear similarity of features, the model is further extended to perform feature fusion in kernel space. Experimental results demonstrate the effectiveness of the proposed model.

Since different visual cues extracted from the same object should share some commonalities in their representations, while each feature should also retain some diversity reflecting its complementary role in appearance modeling, another important problem in feature fusion is how to learn the commonality and diversity in the fused representations of multiple visual cues so as to enhance tracking accuracy. Different from existing multi-cue sparse trackers, which consider only the commonalities among the sparsity patterns of multiple visual cues, this thesis proposes a novel multiple sparse representation model for multi-cue visual tracking which jointly exploits the underlying commonalities and diversities of different visual cues by decomposing multiple sparsity patterns. Moreover, this thesis introduces a novel online multiple metric learning method to efficiently and adaptively incorporate the appearance proximity constraint, which ensures that the learned commonalities of multiple visual cues are more representative. Experimental results on tracking benchmark videos and other challenging videos show that the proposed tracker achieves better performance than existing sparsity-based trackers and other state-of-the-art trackers.
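To make the joint sparse representation idea concrete, the following Python sketch codes each cue's observation against a per-cue template dictionary under a shared row-sparsity (l2,1) penalty, so unreliable templates are suppressed jointly across cues. The dictionary shapes, iteration count and regularisation weight are illustrative assumptions, not the thesis's exact model.

```python
# A hedged sketch of joint sparse coding across visual cues via proximal
# gradient descent with an l2,1 (row-sparsity) penalty. Parameter values
# are illustrative assumptions.
import numpy as np

def joint_sparse_code(dicts, obs, lam=0.1, iters=200):
    """dicts: list of (d_k x n) template matrices, one per visual cue.
    obs: list of d_k observation vectors. Returns an (n x K) coefficient
    matrix whose rows are jointly sparse across the K cues."""
    n, K = dicts[0].shape[1], len(dicts)
    C = np.zeros((n, K))
    # Lipschitz constant of the smooth term: largest squared spectral norm.
    L = max(np.linalg.norm(D, 2) ** 2 for D in dicts)
    for _ in range(iters):
        # Gradient step on each cue's reconstruction error.
        G = np.stack([D.T @ (D @ C[:, k] - x)
                      for k, (D, x) in enumerate(zip(dicts, obs))], axis=1)
        C = C - G / L
        # Proximal step for the l2,1 penalty: shrink whole rows, so a
        # template is kept or discarded for all cues at once.
        norms = np.linalg.norm(C, axis=1, keepdims=True)
        C = np.maximum(0.0, 1.0 - (lam / L) / np.maximum(norms, 1e-12)) * C
    return C
```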
164

Active robot vision and its use in object recognition

Hoad, Paul. January 1994
Object recognition has been one of the main areas of research into computer vision over the last 20-30 years. Until recently, most of this research was performed on scenes taken using static monocular, binocular or even trinocular cameras. It is believed, however, that by adding the ability to move the look point and concentrate on a region of interest, a more robust and efficient method of vision can be achieved. Recent studies into providing human-like vision systems for a more active approach to vision have led to the development of a number of robot-controlled vision systems. In this thesis the development of one such system at the University of Surrey, the stereo robot head "Getafix", is described. The design, construction and development of the head and its control system have been undertaken as part of this project, with the aim of improving current vision tasks, in particular object recognition. The design of the control systems, kinematics and control software of the stereo robot head is discussed. A number of simple commissioning experiments are also shown, using the concepts of the robot control developed herein. Camera lens control and calibration are also described. A review of classical primitive-based object recognition systems is given and the development of a novel generic cylindrical object recognition strategy is shown. The use of this knowledge source is demonstrated with other vision processes of colour and stereo. The work on the cylinder recognition strategy and the stereo robot head is finally combined within an active vision framework. A purposive active vision strategy is used to detect cylindrical structures that would otherwise be undetectable by the cylindrical object detection algorithm alone.
165

Primitive extraction via gathering evidence of global parameterised models

Aguado Guadarrama, Alberto Sergio. January 1996
The extraction of geometric primitives from images is a fundamental task in computer vision. The objective of shape extraction is to find the position and recognise descriptive features of objects (such as size and rotation) for scene analysis and interpretation. The Hough transform is an established technique for extracting geometric shapes, based on the duality between the points on a curve and their parameters. This technique has been developed for extracting simple geometric shapes such as lines, circles and ellipses, as well as arbitrary shapes represented in non-analytic tabular form. The main drawback of the Hough transform is its computational requirement: memory space and processing time grow exponentially as the number of parameters used to represent a primitive increases. For this reason most research on the Hough transform has focused on reducing the computational burden of extracting simple geometric shapes. This thesis presents two novel techniques based on the Hough transform approach, one for ellipse extraction and the other for arbitrary shape extraction. The ellipse extraction technique confronts the primary problems of the Hough transform, namely storage and computational load, by considering the angular changes in the position vector function of the points on an ellipse. These changes are expressed in terms of sets of points and gradient directions to obtain simplified mappings which split the five-dimensional parameter space required for ellipse extraction into two two-dimensional spaces and one one-dimensional space. The new technique for arbitrary shape extraction uses an analytic representation of arbitrary shapes. This representation extends the applicability of the Hough transform from lines and quadratic forms, such as circles and ellipses, to arbitrary shapes, avoiding the discretisation problems inherent in current (tabular) approaches. The analytic representation of shapes is based on the Fourier expansion of a curve, and the extraction process is formulated by including this representation in a novel, general definition of the Hough transform. In the development of this technique several strategies for parameter reduction are implemented and evaluated.
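As a concrete illustration of parameter-space voting, the following Python sketch implements a basic Hough transform for circles of known radius; the thesis's decomposition of the five-dimensional ellipse space is a far more economical scheme and is not reproduced here.

```python
# A minimal Hough transform sketch for circles of known radius, shown only
# to make the accumulator-voting idea concrete.
import numpy as np

def hough_circle(edge_points, radius, shape):
    """edge_points: (N, 2) array of (row, col) edge pixels.
    Votes in a 2-D accumulator over candidate circle centres."""
    acc = np.zeros(shape, dtype=np.int32)
    thetas = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
    for r, c in edge_points:
        # Each edge point votes for every centre at distance `radius`.
        rows = np.round(r - radius * np.sin(thetas)).astype(int)
        cols = np.round(c - radius * np.cos(thetas)).astype(int)
        ok = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])
        np.add.at(acc, (rows[ok], cols[ok]), 1)
    return np.unravel_index(acc.argmax(), acc.shape)  # best centre estimate
```

Using the gradient direction at each edge point, as the thesis does for ellipses, would restrict each point's votes to a short arc rather than a full circle of candidate centres, cutting both storage and processing time.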
166

Novel techniques for image texture classification

Chen, Yan Qiu. January 1995
Texture plays an increasingly important role in computer vision. It has found wide application in remote sensing, medical diagnosis, quality control, food inspection and so forth. This thesis investigates the problem of classifying texture in digital images, following the convention of splitting the problem into feature extraction and classification. Texture feature descriptions considered in this thesis include Liu's features, features from the Fourier transform using geometrical regions, the Statistical Gray-Level Dependency Matrix, and the Statistical Feature Matrix. Classification techniques considered include the K-Nearest Neighbour Rule and the Error Back-Propagation method. Novel techniques developed during the author's Ph.D. study include (1) a Generating Shrinking Algorithm that builds a three-layer feed-forward network to classify arbitrary patterns with guaranteed convergence and known generalisation behaviour, (2) a set of Statistical Geometrical Features for texture analysis based on the statistics of the geometrical properties of connected regions in a sequence of binary images obtained from a texture image, and (3) a neural implementation of the K-Nearest Neighbour Rule that can complete a classification task within 2K clock cycles. Experimental evaluation using the entire Brodatz texture database shows that (1) the Statistical Geometrical Features give the best performance for all the considered classifiers, (2) the Generating Shrinking Algorithm outperforms the Error Back-Propagation method, with the K-Nearest Neighbour Rule's performance comparable to that of the Generating Shrinking Algorithm, and (3) the combination of the Statistical Geometrical Features with the Generating Shrinking Algorithm constitutes one of the best texture classification systems considered.
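The idea behind the Statistical Geometrical Features can be illustrated in simplified form: binarize the texture at a sweep of thresholds, count connected regions in each binary image, and summarise the counts with simple statistics. The following Python sketch (assuming scipy is available) shows only this skeleton; the thesis's exact feature set, including its irregularity measures, is not reproduced.

```python
# A simplified sketch of Statistical-Geometrical-Feature-style extraction.
# The actual feature definitions in the thesis are richer than this.
import numpy as np
from scipy import ndimage

def sgf_sketch(image, levels=32):
    """image: 2-D grey-level array. Returns a small texture feature vector."""
    counts = []
    for t in np.linspace(image.min(), image.max(), levels):
        binary = image >= t
        _, n1 = ndimage.label(binary)    # connected regions of 1-pixels
        _, n0 = ndimage.label(~binary)   # connected regions of 0-pixels
        counts.append((n1, n0))
    counts = np.asarray(counts, dtype=float)
    # Summary statistics over the threshold sweep act as texture features.
    return np.concatenate([counts.max(0), counts.mean(0), counts.std(0)])
```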
167

Attentive visual tracking and trajectory estimation for dynamic scene segmentation

Roberts, Jonathan Michael. January 1994
Intelligent Co-Pilot Systems (ICPSs) offer the next challenge in vehicle-highway automation. The key to ICPSs is the detection of moving objects (other vehicles) from a moving observer using a visual sensor. The aim of the work presented in this thesis was to design and implement a feature detection and tracking strategy capable of tracking image features independently, in parallel, and in real time, and to cluster/segment features using the inherent temporal information contained within feature trajectories. Most images contain areas that are of little or no interest to vision tasks. An attentive, data-driven approach to feature detection and tracking is therefore proposed, which aims to increase efficiency by focusing attention onto relevant regions of the image likely to contain scene structure. This attentive algorithm lends itself naturally to parallelisation, and results from a parallel implementation are presented. A scene may be segmented into independently moving objects based on the assumption that features belonging to the same object move in an identical way in three dimensions (this assumes objects are rigid). A model for scene segmentation is proposed that uses information contained within feature trajectories to cluster, or group, features into independently moving objects. This information includes image-plane position, time-to-collision of a feature with the image-plane, and the type of motion observed. The Multiple Model Adaptive Estimator (MMAE) algorithm is extended to cope with constituent filters with different states (MMAE2), in an attempt to accurately estimate the time-to-collision of a feature and provide a reliable idea of the type of motion observed (in the form of a model belief measure). Finally, poor state initialisation is identified as a likely prime cause of poor Extended Kalman Filter (EKF) performance (and hence poor MMAE2 performance) when using high-order models. The idea of the neurofuzzy-initialised EKF (NF-EKF) is introduced, which attempts to reduce the time for an EKF to converge by improving the accuracy of the EKF's initial state estimates.
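As an illustration of the kind of constituent filter an MMAE bank runs in parallel, the following Python sketch implements one predict/update cycle of a constant-velocity Kalman filter for a single image feature. The noise parameters are illustrative assumptions, and the MMAE2 belief computation and neurofuzzy initialisation are not shown.

```python
# A minimal constant-velocity Kalman filter step for one tracked feature.
# Process/measurement noise values (q, r) are illustrative assumptions.
import numpy as np

def kf_step(x, P, z, dt=1.0, q=1e-2, r=1.0):
    """x: state [u, v, du, dv]; P: 4x4 covariance; z: measured [u, v]."""
    F = np.eye(4)                    # constant-velocity transition model
    F[0, 2] = dt
    F[1, 3] = dt
    H = np.zeros((2, 4))             # we observe image position only
    H[0, 0] = 1.0
    H[1, 1] = 1.0
    Q, R = q * np.eye(4), r * np.eye(2)
    # Predict.
    x, P = F @ x, F @ P @ F.T + Q
    # Update with the new measurement.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P
```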
168

Automated detection of photogrammetric pipe features

Szabo, Jason Leslie. 16 March 2018
This dissertation presents original computer vision algorithms to automate the identification of piping and photogrammetric piping features in individual digital images of industrial installations. Automatic identification of the pixel regions associated with piping is the core original element of this work and is accomplished through a re-representation of image information (light intensity versus position) in a light intensity versus gradient orientation data space. This work is based on the physics of scene illumination/reflectance and evaluates pixel regions in a hierarchy of data abstractions to identify pipe regions without needing specific information about pipe edges, illumination, or reflectance characteristics. The synthesis of correlated information used in this image segmentation algorithm provides a robust technique to identify potential pipe pixel regions in real images. An additional unique element of this work is a pipe edge identification methodology, which uses the information from this light intensity versus gradient orientation data space to localize the pipe edge search space (in both a pixel position and gradient orientation sense). This localization provides a very specific, perspective-independent, self-adaptive pipe edge filter. Pipe edges identified in this manner are then incorporated into a robust region-joining algorithm to address the issue of region fragmentation (caused by occluding components and shadows). Automated photogrammetric feature identification is also demonstrated by algorithmically recognizing the intersection of orthogonal pipe sections (with piping-code-acceptable diameter ratios) as potential T-junctions or 90-degree elbows. As pipe intersections, these image points are located with sub-pixel resolution even though they cannot be identified by simple inspection. The computer vision algorithms of this dissertation are robust, physics-based methods, applicable to the identification of piping and photogrammetric pipe features in real-world images of industrial installations: they are perspective independent, albedo independent, and unaffected by inter-reflections. Automating these operator-driven input tasks will improve the accuracy, ease of use, and cost-effectiveness of applying existing photogrammetric programs to the significant industrial problem of generating as-built piping drawings.
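The re-representation at the heart of this work can be sketched simply: map each pixel to a point in a (light intensity, gradient orientation) space and histogram the result. The following Python fragment shows that step only; the hierarchy of data abstractions used to pick pipe regions out of this space is the dissertation's contribution and is not reproduced.

```python
# A hedged sketch of re-representing an image in an intensity-versus-
# gradient-orientation data space; bin counts are illustrative assumptions.
import numpy as np

def intensity_orientation_space(image, bins=(64, 36)):
    """image: 2-D float array. Returns a 2-D histogram over
    (intensity, gradient orientation) and the bin edges."""
    gy, gx = np.gradient(image.astype(float))
    orientation = np.arctan2(gy, gx)   # radians in [-pi, pi]
    hist, iedges, oedges = np.histogram2d(
        image.ravel(), orientation.ravel(), bins=bins,
        range=[[image.min(), image.max()], [-np.pi, np.pi]])
    return hist, iedges, oedges
```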
169

Control issues in high level vision

Remagnino, Paolo. January 1993
Vision entails complex processes to sense, interpret and reason about the external world. The performance of such processes in a dynamic environment needs to be regulated by flexible and reliable control mechanisms. This thesis is concerned with aspects of control in high-level vision. The study of control problems in vision defines a research area which has only recently received adequate attention. Classification criteria such as scope of application, knowledge representation, control structure and communication have been chosen to establish means of comparison between existing vision systems. Control problems have become of great topical interest as a result of the basic ideas of the active vision paradigm. The proponents of active vision suggest that robust solutions to vision problems arise when sensing and analysis are controlled (i.e. purposively adjusted) to exploit both data and available knowledge (temporal context). The work reported in this thesis follows the basic tenets of active vision. It is directed at the study of control of sensor gaze, scene interpretation and visual strategy monitoring.

Control of the visual sensor is an important aspect of active vision. A vision system must be able to establish its orientation with respect to the partially known environment and have control strategies for selecting targets to be viewed. In this thesis algorithms are implemented for establishing the vision system's pose relative to prestored environment landmarks and for directing gaze to points defined by objects in an established scene model. Particular emphasis has been placed on accounting for and propagating estimation errors arising from both measured image data and inaccuracy of stored scene knowledge. In order to minimise the effect of such errors a hierarchical scene model has been adopted, with contextually related objects grouped together. Object positions are described relative to locally determined landmarks, which keeps the size of errors within tolerable bounds.

The scene interpretation module takes image descriptions in terms of low-level features and produces a symbolic description of the scene in terms of known object classes and their attributes. The construction of the scene model is an incremental process, achieved by means of several knowledge sources independently controlled by separate modules. The scene interpreter has been carefully structured and operates in a loop of perception controlled by high-level commands delivered from the system supervisor module. The individual scene interpreter modules operate under local control and are instructed as to what visual task to perform, where to look in the scene and what subset of data to use. The module processing takes into account the existing partial scene interpretation. These mechanisms embody the concepts of spatial focus of attention and exploitation of temporal context. Robust scene interpretation is achieved via temporal integration of the interpretation.

The element of control concerned with visual strategy monitoring is at the system supervisor level. The supervisor takes a user-given task and decides the best strategy to follow in order to satisfy it. This may involve interrogation of existing knowledge or the initiation of new data collection and analysis. In the case of new analysis the supervisor has to express the task as a set of achievable visual tasks, which are then encoded into a control word and passed to the scene interpreter. The vocabulary of the scene supervisor includes tasks such as general scene exploration, the finding of a specific object, the monitoring of a specified object, and the description of attributes of single objects or relationships between two or more objects. The supervisor has to schedule sub-tasks in such a way as to achieve a good solution to the given problem. A considerable number of experiments, using real and synthetic data, demonstrate the advantages of the proposed approach by means of the current implementation (written in C and in the rule-based system CLIPS).
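The supervisor-to-interpreter control flow can be pictured with a short schematic sketch. In the following Python fragment the task names and control-word fields are hypothetical illustrations (the original system is written in C and CLIPS); it is meant only to show how a user task is decomposed into visual sub-tasks encoded as control words.

```python
# A schematic, hypothetical sketch of supervisor-to-interpreter control.
# Task names and control-word fields are assumptions, not the original design.
from dataclasses import dataclass

@dataclass
class ControlWord:
    visual_task: str            # e.g. "explore", "find", "monitor", "describe"
    region_of_interest: tuple   # where in the scene the interpreter should look
    data_subset: str            # which subset of image data to use

def supervisor(user_task, interpreter):
    # Decompose the user task into achievable visual sub-tasks (hypothetical
    # mapping), then hand each to the scene interpreter as a control word.
    plan = {
        "explore scene": [ControlWord("explore", (0, 0, 1, 1), "all")],
        "find object":   [ControlWord("explore", (0, 0, 1, 1), "all"),
                          ControlWord("find", (0, 0, 1, 1), "colour")],
    }.get(user_task, [])
    for word in plan:
        interpreter(word)       # one pass of the loop of perception
```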
170

Object recognition by region matching using relaxation with relational constraints

Ahmadyfard, Alireza. January 2003
Our objective in this thesis is to develop a method for establishing an object recognition system based on the matching of image regions. A region is segmented from the image based on the colour homogeneity of its pixels. The method can be applied to a number of computer vision applications, such as object recognition (in general) and image retrieval. The motivation for using regions as image primitives is that they can be represented invariantly to a group of geometric transformations, and regions are stable under scaling. We model each object of interest in our database using a single frontal image. The recognition task is to determine the presence of object(s) of interest in scene images. We propose a novel method for affine invariant representation of image regions in the form of an Attributed Relational Graph (ARG). To make image regions comparable for matching, we project each region to an affine invariant space and describe it using a set of unary measurements. The distinctiveness of these features is enhanced by describing the relation between a region and its neighbours. We limit ourselves to low-order relations, binary relations, to minimise the combinatorial complexity of both feature extraction and model matching, and to maximise the probability of the features being observed. We propose two sets of binary measurements: geometric relations between pairs of regions, and the colour profile on the line connecting the centroids of regions. We demonstrate that the former measurements are very discriminative when the shape of segmented regions is informative; however, they are susceptible to distortion of region boundaries as a result of severe geometric transformations. In contrast, the colour-profile binary measurements are very robust. Using this representation we construct a graph representing the regions in the scene image, referred to as the scene graph. Similarly, a graph containing the regions of all object models is constructed and referred to as the model graph. We cast object recognition as the problem of matching the scene graph against the model graph. We adopt the probabilistic relaxation labelling technique for this problem, modified to cope better with image segmentation errors. The implemented algorithm is evaluated under affine transformation, occlusion, illumination change and cluttered scenes. Good recognition performance is reported, even under severe scaling and in cluttered scenes. Key words: Region Matching, Object Recognition, Relaxation Labelling, Affine Invariant.
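The generic relaxation labelling update that this matching method adapts can be written compactly. The following Python sketch performs one support-and-renormalise iteration over label probabilities; the thesis's modifications for coping with segmentation errors are not included.

```python
# A minimal sketch of one probabilistic relaxation labelling iteration.
# P and R shapes are illustrative; compatibilities would be derived from
# the unary and binary region measurements described above.
import numpy as np

def relaxation_step(P, R):
    """P: (n, m) — probability of each of m model labels for n scene regions.
    R: (n, n, m, m) — compatibility of region i taking label a while
    region j takes label b."""
    # Support for each (region, label) from all neighbours and their labels.
    Q = np.einsum('ijab,jb->ia', R, P)
    P_new = P * Q
    # Renormalise each region's label distribution.
    return P_new / np.maximum(P_new.sum(axis=1, keepdims=True), 1e-12)
```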
