331
Alternative approaches to optophonic mappings. Capp, Michael. January 2000.
This thesis presents a number of modifications to a blind aid, known as the video optophone, which enable a blind user to interpret their local environment more readily for enhanced mobility and navigation. Versions of this form of blind aid are generally difficult both to use and to interpret, and are therefore inadequate for safe mobility. The reason for this severe problem lies in the complexity and excessive bandwidth of the optophonic output after the conversion from scene to sound. The work herein describes a number of modifications that can be applied to the current optophonic process to make more efficient use of the limited bandwidth provided by the auditory system when converting scene images to sound. Various image processing and stereo techniques have been employed to artificially emulate the human visual system through the use of depth maps that successfully fade out relatively unimportant image features, whilst emphasising the more significant regions such as nearby obstacles. A series of experiments was designed to test these various modifications to the optophonic mapping by studying important factors of mobility and subject response during everyday activities. The devised system, labelled DeLIA for the Detection, Location, Identification, and Avoidance (or Action) of obstacles, provided a means for gathering statistical data on users' interpretation of the optophonic output. An analysis of this data demonstrated a significant improvement when using the stereo cartooning technique, developed as part of this work, over the more conventional plain image as an input to an optophonic mapping from scene to sound. Lastly, conclusions were drawn from the results, which indicated that the use of a stereo depth map as an input to a video optophone would improve its usefulness as an aid to general mobility. For the purposes of detecting and determining text or similar detail, either a plain unmodified image or some form of edge (depth) image was found to produce the best results.
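To make the bandwidth argument concrete, the sketch below shows one way a depth-weighted optophonic mapping could be realised: a left-to-right column scan in which row position sets frequency, pixel brightness sets amplitude, and the depth map fades out distant features. The scan scheme, frequency range, and linear depth weighting are illustrative assumptions, not the exact mapping evaluated in the thesis.

```python
import numpy as np

def optophonic_mapping(image, depth, sample_rate=11025, col_duration=0.05,
                       f_lo=500.0, f_hi=5000.0, max_range=5.0):
    """Illustrative scene-to-sound mapping (assumed column-scan convention).

    image : 2D array of brightness values in [0, 1], shape (rows, cols)
    depth : 2D array of ranges in metres, same shape; nearer pixels are emphasised
    Returns a 1D audio signal: left-to-right scan, row -> frequency, brightness -> amplitude.
    """
    rows, cols = image.shape
    # Top image rows map to high frequencies, bottom rows to low frequencies.
    freqs = np.linspace(f_hi, f_lo, rows)
    t = np.arange(int(sample_rate * col_duration)) / sample_rate
    # Fade out distant features: weight falls from 1 (at 0 m) to 0 (at max_range).
    weight = np.clip(1.0 - depth / max_range, 0.0, 1.0)
    audio = []
    for c in range(cols):
        amps = image[:, c] * weight[:, c]   # depth-emphasised brightness of this column
        column_tone = (amps[:, None] * np.sin(2 * np.pi * freqs[:, None] * t)).sum(axis=0)
        audio.append(column_tone)
    signal = np.concatenate(audio)
    peak = np.abs(signal).max()
    return signal / peak if peak > 0 else signal
```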
332
Computer recognition of occluded curved line drawings. Adler, Mark Ronald. January 1978.
A computer program has been designed to interpret scenes from PEANUTS cartoons, viewing each scene as a two-dimensional representation of an event in the three-dimensional world. Characters are identified by name, their orientation and body position are described, and their relationship to other objects in the scene is indicated. This research is seen as an investigation of the problems in recognising flexible non-geometric objects which are subject to self-occlusion as well as occlusion by other objects. A hierarchy of models containing both shape and relational information has been developed to deal with the flexible cartoon bodies. Although the region is the basic unit used in the analysis, the hierarchy makes use of intermediate models to group individual regions into larger, more meaningful functional units. These structures may be shared at a higher level in the hierarchy. Knowledge of model similarities may be applied to select alternative models and conserve some results of an incorrect model application. The various groupings account for differences among the characters or modifications in appearance due to changes in attitude. Context information plays a key role in the selection of models to deal with ambiguous shapes. By emphasising relationships between regions, the need for a precise description of shape is reduced. Occlusion interferes with the model-based analysis by obscuring the essential features required by the models: both the perceived shape of the regions and the inter-relationships between them are altered. A heuristic based on the analysis of line junctions is used to confirm occlusion as the cause of the failure of a model-to-region match. This heuristic, an extension of the T-joint techniques of polyhedral domains, deals with "curved" junctions and can be applied to cases of multi-layered occlusion. The heuristic was found to be most effective in dealing with occlusion between separate objects; standard instances of self-occlusion were more effectively handled at the model level. This thesis describes the development of the program, structuring the discussion around three main problem areas: models, occlusion, and the control aspects of the system. Relevant portions of the program's analyses are used to illustrate each problem area.
333
3D underwater monocular machine vision from 2D images in an attenuating medium. Randell, Charles James. 25 May 2017.
This dissertation presents a novel underwater machine vision technique which uses
the optical properties of water to extract range information from colour images. By
exploiting the fact that the attenuation of light in water is a function of frequency, an
intensity-range transformation is developed and implemented to provide monocular
vision systems with a three-dimensional scene reconstruction capability. The technique
can also be used with images that have no salient, contrasting features and there are no
restrictions on surface shapes.
From a generalized reflectance map based on the optical properties of water, the
closed form intensity-range transformation is derived to convert intensity images from
various spectral bands into a range map wherein the value of each "pixel" is the range to
the imaged surface. The technique is computationally efficient enough to be performed
in real time and does not require specialized illumination or similar restrictive conditions.
A calibration procedure is developed which enables the transformation to be practically
implemented. An alternate approach to estimating range from multispectral data based on
expanding the medium's transfer function and using these terms as elements in sensitivity
vectors is also presented and analyzed.
Mathematical analysis of the intensity-range transformation and associated
developments is provided in terms of its performance in noise and sensitivity to various
system parameters. Its performance as a function of light scattering is studied with the
aid of computer simulation. Results from transforming actual underwater images are also
presented. The results of this analysis and the demonstrated performance of the
intensity-range transformation endorse it as a practical enhancement to underwater
machine vision systems.
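As a rough illustration of how frequency-dependent attenuation encodes range, the following sketch recovers a per-pixel range map from two spectral bands using a simplified Beer-Lambert model. The constant K, the per-band attenuation coefficients, and the neglect of scattering and of the illumination path are all simplifying assumptions; the dissertation derives its closed-form transformation from a full reflectance map of the optical properties of water.

```python
import numpy as np

def range_from_two_bands(I_red, I_blue, c_red, c_blue, K, eps=1e-6):
    """Illustrative intensity-to-range conversion under a simplified Beer-Lambert model.

    I_red, I_blue : intensity images for two spectral bands of the same scene
    c_red, c_blue : per-band attenuation coefficients of the water [1/m] (c_red > c_blue)
    K             : calibration constant folding in the illumination and surface
                    reflectance ratio between the two bands (assumed uniform here)
    Returns a per-pixel range map in metres.
    """
    # Model: I_band = K_band * exp(-c_band * r). Taking the ratio of two bands
    # cancels the unknown absolute illumination and leaves r in the exponent.
    ratio = (I_red + eps) / (I_blue + eps)
    r = (np.log(K) - np.log(ratio)) / (c_red - c_blue)
    return np.clip(r, 0.0, None)   # negative ranges indicate calibration or noise artefacts
```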
334
Embodied Visual Object Recognition / Förkroppsligad objektigenkänning. Wallenberg, Marcus. January 2017.
Object recognition is a skill we as humans often take for granted. Due to our formidable object learning, recognition and generalisation skills, it is sometimes hard to see the multitude of obstacles that need to be overcome in order to replicate this skill in an artificial system. Object recognition is also one of the classical areas of computer vision, and many ways of approaching the problem have been proposed. Recently, visually capable robots and autonomous vehicles have increased the focus on embodied recognition systems and active visual search. These applications demand that systems can learn and adapt to their surroundings, and arrive at decisions in a reasonable amount of time, while maintaining high object recognition performance. This is especially challenging due to the high dimensionality of image data. In cases where end-to-end learning from pixels to output is needed, mechanisms designed to make inputs tractable are often necessary for less computationally capable embodied systems. Active visual search also means that mechanisms for attention and gaze control are integral to the object recognition procedure. Therefore, the way in which attention mechanisms should be introduced into feature extraction and estimation algorithms must be carefully considered when constructing a recognition system. This thesis describes work done on the components necessary for creating an embodied recognition system, specifically in the areas of decision uncertainty estimation, object segmentation from multiple cues, adaptation of stereo vision to a specific platform and setting, problem-specific feature selection, efficient estimator training, and attentional modulation in convolutional neural networks. Contributions include the evaluation of methods and measures for predicting the potential uncertainty reduction that can be obtained from additional views of an object, allowing for adaptive target observations. Separating a specific object from other parts of a scene often requires combining multiple cues, such as colour and depth, to obtain satisfactory results; a method for combining these using channel coding has therefore been evaluated. In order to make use of three-dimensional spatial structure in recognition, a novel stereo vision algorithm extension along with a framework for automatic stereo tuning have also been investigated. Feature selection and efficient discriminant sampling for decision tree-based estimators have also been implemented. Finally, attentional multi-layer modulation of convolutional neural networks for recognition in cluttered scenes has been evaluated. Several of these components have been tested and evaluated on a purpose-built embodied recognition platform known as Eddie the Embodied.
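As a rough sketch of the attentional modulation idea mentioned above (a generic formulation, not the specific multi-layer scheme evaluated in the thesis), a spatial attention map can be resized to a convolutional layer's resolution and used to rescale its activations:

```python
import torch
import torch.nn.functional as F

def modulate_features(feature_maps, attention_map, floor=0.1):
    """Generic spatial attention modulation of convolutional feature maps.

    feature_maps  : (N, C, H, W) activations from some convolutional layer
    attention_map : (N, 1, h, w) saliency/attention weights in [0, 1]
    floor         : minimum gain so unattended regions are suppressed, not erased
    """
    # Resize the attention map to this layer's spatial resolution, then rescale
    # every spatial location's activations across all channels.
    att = F.interpolate(attention_map, size=feature_maps.shape[-2:],
                        mode='bilinear', align_corners=False)
    gain = floor + (1.0 - floor) * att
    return feature_maps * gain
```

In a multi-layer scheme this modulation would be applied after each (or selected) convolutional layers during the forward pass, so that cluttered background regions contribute less at every stage of feature extraction.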
335
A cellular automaton-based system for the identification of topological features of carotid artery plaques. Delaney, Matthew. January 2014.
The formation of a plaque in one or both of the internal carotid arteries poses a serious threat to the lives of those in whom it occurs. This thesis describes a technique designed to detect the level of occlusion and provide topological information about such plaques. In order to avoid the cost of specialised hardware, only the sound produced by blood flow around the occlusion is used; this raises problems that prevent the application of existing medical imaging techniques, but these can be overcome by the application of a nonlinear technique that takes full advantage of the discrete nature of digital computers. Results indicate that both the level of occlusion and the presence or absence of various topological features can be determined in this way. Beginning with a review of existing work in medical imaging and in more general but related techniques, the EPI (Extreme Physical Information) process of Frieden (2004) is identified as the strongest approach to a situation where it is desirable to work with both signal and noise yet avoid the computational cost and other pitfalls of established techniques. The remainder of the thesis discusses attempts to automate the EPI process which, in the form given by Frieden (2004), requires a degree of human mathematical creative problem-solving. Initially, a numerical-methods-inspired approach based on genetic algorithms was attempted, but it was found to be both computationally costly and insufficiently true to the nature of the EPI equations. A second approach, based on the idea of creating a formal system allowing entropy, direction and logic to be manipulated together, proved to lack certain key properties and would require an amount of work beyond the scope of the project described in this thesis to be extended into a form usable for the EPI process. The approach upon which the imaging system described here is ultimately built is based on an abstracted form of constraint-logic programming, resulting in a cellular-automaton-based model which is shown to produce distinct images for different sizes and topologies of plaque in a reliable and human-interpretable way.
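For readers unfamiliar with the formalism, the sketch below shows a generic synchronous two-dimensional cellular automaton update, purely to illustrate the kind of model the system is built on. The totalistic rule shown here is only a stand-in: the thesis derives its own update rule from an abstracted constraint-logic formulation and the EPI principle.

```python
import numpy as np

def ca_step(grid, low=2, high=3, born=3):
    """One synchronous update of a generic 2D cellular automaton (illustrative only;
    the thesis's rule is derived from constraint-logic programming and EPI, not the
    simple totalistic rule used here). 'grid' is a 2D array of 0/1 cell states.
    """
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],   # centre weight 0: a cell does not count itself
                       [1, 1, 1]])
    # Count active neighbours with periodic (wrap-around) boundaries.
    neighbours = sum(np.roll(np.roll(grid, i, axis=0), j, axis=1) * kernel[i + 1, j + 1]
                     for i in (-1, 0, 1) for j in (-1, 0, 1))
    survive = (grid == 1) & (neighbours >= low) & (neighbours <= high)
    birth = (grid == 0) & (neighbours == born)
    return (survive | birth).astype(grid.dtype)
```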
336
Linking digitized video input with optical character recognition. 20 November 2014.
M.Com. (Informatics). This dissertation examines the field of computer vision, with special attention given to the recognition of alphanumeric characters in video images using OCR software. The study may be broadly divided into four sections. The first section offers an introduction to standard OCR (Optical Character Recognition) methods that have evolved over the years and have been incorporated into current commercial software packages. The second section covers the problem of reading characters in a dynamic environment, as well as the compatibility problems experienced with current OCR software products. The third section looks at solutions to the problems raised in section two and creates a framework for a generic model into which any application should fit. The generic model is then described in detail. The framework should provide a foundation for interested parties to build, modify or improve the model. The final section gives examples of how the model should present a solution; experimental results are examined and the model is critically evaluated.
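A minimal sketch of the kind of pipeline such a generic model frames, with OpenCV frame capture and the Tesseract engine (via pytesseract) as stand-in components; the dissertation's framework is deliberately independent of any particular OCR product, so the specific libraries and calls below are illustrative assumptions.

```python
import cv2
import pytesseract

def read_text_from_video(path, frame_step=30):
    """Illustrative digitized-video-to-OCR pipeline (stand-in components only):
    sample frames, pre-process them, and pass them to an OCR engine.
    """
    capture = cv2.VideoCapture(path)
    results = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % frame_step == 0:
            # Pre-processing stage: grayscale + Otsu binarisation to help recognition.
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            # Recognition stage: hand the cleaned frame to the OCR engine.
            text = pytesseract.image_to_string(binary)
            if text.strip():
                results.append((index, text.strip()))
        index += 1
    capture.release()
    return results
```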
337
A revised framework for human scene recognition. Linsley, Drew. January 2016.
Thesis advisor: Sean P. MacEvoy. For humans, healthy and productive living depends on navigating through the world and behaving appropriately along the way. But in order to do this, humans must first recognize their visual surroundings. The technical difficulty of this task is hard to comprehend: the number of possible scenes that can fall on the retina approaches infinity, and yet humans often effortlessly and rapidly recognize their surroundings. Understanding how humans accomplish this task has long been a goal of psychology and neuroscience and, more recently, has proven useful in inspiring and constraining the development of new algorithms for artificial intelligence (AI). In this thesis I begin by reviewing the current state of scene recognition research, drawing upon evidence from each of these areas, and discussing an unchallenged assumption in the literature: that scene recognition emerges from independently processing information about scenes' local visual features (i.e., the kinds of objects they contain) and global visual features (i.e., spatial parameters). Over the course of several projects, I challenge this assumption with a new framework for scene recognition that indicates a crucial role for information sharing between these resources. Development and validation of this framework will expand our understanding of scene recognition in humans and provide new avenues for research by extending these concepts to other domains spanning psychology, neuroscience, and AI. Thesis (PhD), Boston College, 2016. Submitted to: Boston College, Graduate School of Arts and Sciences. Discipline: Psychology.
338
THREE DIMENSIONAL SEGMENTATION AND DETECTION OF FLUORESCENCE MICROSCOPY IMAGES. David J. Ho. 10 June 2019.
Fluorescence microscopy is an essential tool for imaging subcellular structures in tissue. Two-photon microscopy enables imaging deeper into tissue using near-infrared light. The use of image analysis and computer vision tools to detect and extract information from the images remains challenging, due to the degradation of the microscopy volumes by blurring and noise during image acquisition and the complexity of the subcellular structures present in the volumes. In this thesis we describe methods for the segmentation and detection of fluorescence microscopy images in 3D. We segment tubule boundaries by distinguishing them from other structures using three-dimensional steerable filters; these filters can capture the strong directional tendencies of the voxels on a tubule boundary. We also describe multiple three-dimensional convolutional neural networks (CNNs) to segment nuclei. Training the CNNs usually requires a large set of labeled images, which is extremely difficult to obtain for biomedical images. We describe methods to generate synthetic microscopy volumes and to train our 3D CNNs using these synthetic volumes without using any real ground truth volumes. The locations and sizes of the nuclei are detected using one of our CNNs, known as the Sphere Estimation Network. Our methods are evaluated using real ground truth volumes and are shown to outperform other techniques.
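As a toy illustration of per-voxel segmentation with a 3D CNN (a minimal sketch only; the thesis's networks, and its Sphere Estimation Network in particular, are substantially different), a PyTorch model operating on microscopy volumes might look like this:

```python
import torch
import torch.nn as nn

class Tiny3DSegmenter(nn.Module):
    """Minimal 3D convolutional segmentation sketch for microscopy volumes
    (illustrative only; not the architecture described in the thesis).
    """
    def __init__(self, in_channels=1, features=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_channels, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(features, features, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # One output channel: per-voxel probability of belonging to a nucleus.
        self.head = nn.Conv3d(features, 1, kernel_size=1)

    def forward(self, volume):          # volume: (N, 1, D, H, W)
        return torch.sigmoid(self.head(self.encoder(volume)))
```

Such a model could in principle be trained on synthetic volumes with known labels, in the spirit of the synthetic-training strategy the abstract describes, before being applied to real microscopy data.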
339
Projector-based interactive visual processing. January 2011.
The recent trend of human-computer interaction technologies has revealed the potential of the projector as a powerful interaction tool. More than a pure display device, a projector has great strength that can largely change the way a traditional user interface works. Although some possibilities have been investigated in previous work, certain applications and approaches deserve further study. For example: 1) projection showing 3D information: viewing 3D models is usually achieved by projecting polarized light of different phases for the left and right eyes, the user is required to wear specially designed spectacles, and the cost of building such a system is high; 2) projection on flexible surfaces: most existing systems display information on flat rigid projection screens, and extending this to non-planar flexible surfaces is an interesting and useful research direction; 3) direct user-information interaction: existing systems using mouse and screen offer limited freedom of control and a low level of user experience, whereas direct manipulation of the displayed object by the user's hands is more natural; 4) mobile projector display: portable and embedded projectors are becoming more and more popular, but some fundamental problems, e.g. keystone correction, are not fully studied.

Motivated by these problems, we explore the potential of projectors in interactive information visualization and processing in this thesis. In particular, we make three contributions. First, we propose a computer vision solution for direct 3D object exhibition and manipulation without the user wearing spectacles. In our approach, a new 3D display interface is designed by projecting images on a hand-held foam sphere which can be moved freely by the user. By tracking the motion of the sphere and projecting motion-dependent images onto the sphere, a virtual 3D perception can be created. Using this interface, the user experiences holding the real object in their hands and is able to control the viewing angle freely.

Second, we extend projection from the traditional rigid screen to flexible surfaces. A new flexible display method is proposed, which can project information on a hand-held flexible surface (e.g. an ordinary white paper with a checker pattern at the back) that can be twisted freely. While the user twists the projection surface, the system recovers the deformation of the surface and projects well-tailored information onto the surface corresponding to the deformation. As a result, the viewer sees the information as if it were printed on the paper. Two applications, flexible image projection and curvilinear data slicing, are created to demonstrate the usefulness of the method. After the studies on fixed-position projection, we conduct an investigation of mobile projectors, which is becoming especially necessary with their rapid spread. We propose a hand-held movable projection method that can project keystone-free content onto a general flat surface without any markings or boundaries on the display screen. Compared with traditional static projection systems that keep the projector and screen in fixed positions, our projection scheme gives the user greater freedom of display control while producing undistorted images.

To verify the correctness of our methods, we built prototype systems using off-the-shelf devices and conducted extensive experiments, including both simulation and real experiments.
The results show that the proposed methods are effective and good performance has been achieved. In particular, the real-time speed and low cost make them quite appealing in many application areas, such as education, digital games, and medical applications. Capitalizing on the shrinking size, increasing portability, and decreasing cost of projectors, it is predictable that projector-based interactive processing will become more and more popular in the near future. We believe the research work in this thesis will provide a good foundation for further research and development on computer vision and projector-based applications. Li, Zhaorong. Adviser: Kin-Hong Wong. Thesis (Ph.D.), Chinese University of Hong Kong, 2011. Includes bibliographical references (leaves 133-142). Abstract also in Chinese.
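One of the fundamental problems raised above, keystone distortion for a hand-held projector, is commonly handled by pre-warping the projected image with a homography. A minimal sketch follows, assuming the four corners of the desired display quadrilateral have already been located in projector image coordinates (e.g. by a calibrated camera); this is a generic correction, not the specific method developed in the thesis.

```python
import cv2
import numpy as np

def prewarp_for_keystone(image, surface_corners):
    """Generic keystone pre-correction sketch (illustrative assumption, not the
    thesis's method). 'surface_corners' are the four points, in projector image
    coordinates and in top-left, top-right, bottom-right, bottom-left order,
    where the corners of an undistorted rectangle should land on the surface.
    """
    h, w = image.shape[:2]
    src = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])
    dst = np.float32(surface_corners)
    # Homography mapping the ideal rectangle onto the target quadrilateral;
    # projecting the warped image then appears rectangular on the surface.
    H = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, H, (w, h))
```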
340
Computer recognition of partially-occluded objects. January 1986.
by Chan Ming-hong. Bibliography: leaves 67-68. Thesis (M.Ph.), Chinese University of Hong Kong, 1986.