1 |
Some aspects of visual discomfort
O'Hare, Louise January 2013
Visual discomfort refers to the adverse sensations, such as headaches and eyestrain, experienced on viewing certain stimuli. These sensations can arise under particular viewing conditions, such as stereoscopic viewing and prolonged reading of text. Discomfort can also result from viewing stimuli with certain spatial properties, including stripes and filtered noise patterns of particular spatial frequency. This thesis explores the stimulus properties that cause discomfort within the framework of two theoretical explanations, both of which hold that the stimuli are difficult for the visual system to process. The first attributes discomfort to inefficient neural processing: neural activity requires energy to process information, and stimuli that demand a lot of energy to be processed might be uncomfortable. The second holds that uncomfortable stimuli are ineffective in driving the accommodative (focussing) response: accommodation relies on the stimulus as a cue, and an uninformative cue might cause discomfort through an uncertain accommodative response. The research investigates both possibilities using a combination of psychophysical experimentation, questionnaire-based surveys of non-clinical populations, and computational modelling. The implications of the work for clinical populations are also discussed.
|
2 |
Retina-V1 model of detectability across the visual field
Bradley, Chris Kent 22 September 2014
A practical model is proposed for predicting the detectability of targets at arbitrary locations in the visual field, in arbitrary gray-scale backgrounds, and under photopic viewing conditions. The major factors incorporated into the model include: (i) the optical point spread function of the eye, (ii) local luminance gain control (Weber's law), (iii) the sampling array of retinal ganglion cells, (iv) orientation and spatial-frequency dependent contrast masking, (v) broadband contrast masking, and (vi) efficient response pooling. The model is tested against previously reported threshold measurements on uniform backgrounds (the ModelFest data set and data from Foley et al. 2007), and against new measurements reported here for several ModelFest targets presented on uniform, 1/f noise, and natural backgrounds, at retinal eccentricities ranging from 0 to 10 deg. Although the model has few free parameters, it is able to account quite well for all the threshold measurements.
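As a rough illustration of factor (ii), local luminance gain control in the spirit of Weber's law can be sketched by dividing each sample's deviation from a local mean luminance by that local mean. This is a minimal one-dimensional sketch, not the model described in the abstract: the function name `local_contrast_1d`, the box-shaped averaging window, and the `radius` parameter are all assumptions made for illustration.

```python
def local_contrast_1d(luminances, radius):
    """Convert positive luminances to local (Weber-like) contrast.

    Hypothetical helper: each sample's deviation from the mean of a box
    window of half-width `radius` is divided by that local mean.
    Assumes all luminances are strictly positive.
    """
    n = len(luminances)
    contrasts = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        local_mean = sum(luminances[lo:hi]) / (hi - lo)
        contrasts.append((luminances[i] - local_mean) / local_mean)
    return contrasts


dim = local_contrast_1d([10, 20, 10], radius=1)
bright = local_contrast_1d([100, 200, 100], radius=1)
```

Because the deviation is normalised by the local mean, scaling every luminance by the same factor leaves the output unchanged (up to rounding), which is the property the gain-control stage is meant to capture.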
|
3 |
Higher-level representations of natural images
Miflah, Hussain Ismail Ahamed January 2018
The traditional view of vision is that neurons in early cortical areas process information about simple features (e.g. orientation and spatial frequency) in small, spatially localised regions of visual space (the neuron's receptive field). This piecemeal information is then fed forward into later stages of the visual system, where it is combined to form coherent and meaningful global (higher-level) representations. The overall aim of this thesis is to examine and quantify this higher-level processing: how we encode global features in natural images, and the extent to which our perception of these global representations is determined by the local features within images. Using the tilt after-effect as a tool, the first chapter examined the processing of a low-level, local feature and found that the orientation of a sinusoidal grating could be encoded in both a retinally and spatially non-specific manner. Chapter 2 then examined tilt after-effects for the global orientation of the image (i.e., uprightness). We found that image uprightness was also encoded in a retinally / spatially non-specific manner, but that this global property could be processed largely independently of its local orientation content. Chapter 3 investigated whether our increased sensitivity to cardinal (vertical and horizontal) structures compared to inter-cardinal (45° and 135° clockwise of vertical) structures influenced classification of unambiguous natural images. Participants required relatively less contrast to classify images when they retained near-cardinal as compared to near-inter-cardinal structures. Finally, in chapter 4, we examined category classification when images were ambiguous. Observers were biased to classify ambiguous images, created by combining structures from two distinct image categories, as carpentered (e.g., a house). This could not be explained by differences in sensitivity to local structures and is most likely the result of our long-term exposure to city views. Overall, these results show that higher-level representations are not fully dependent on the lower-level features within an image. Furthermore, our knowledge about the environment influences the extent to which we use local features to rapidly identify an image.
|
4 |
Surface Reflectance Estimation and Natural Illumination Statistics
Dror, Ron O., Adelson, Edward H., Willsky, Alan S. 01 September 2001
Humans recognize optical reflectance properties of surfaces such as metal, plastic, or paper from a single image without knowledge of illumination. We develop a machine vision system to perform similar recognition tasks automatically. Reflectance estimation under unknown, arbitrary illumination proves highly underconstrained due to the variety of potential illumination distributions and surface reflectance properties. We have found that the spatial structure of real-world illumination possesses some of the statistical regularities observed in the natural image statistics literature. A human or computer vision system may be able to exploit this prior information to determine the most likely surface reflectance given an observed image. We develop an algorithm for reflectance classification under unknown real-world illumination, which learns relationships between surface reflectance and certain features (statistics) computed from a single observed image. We also develop an automatic feature selection method.
|
5 |
Global Depth Perception from Familiar Scene Structure
Torralba, Antonio, Oliva, Aude 01 December 2001
In the absence of cues for absolute depth measurement, such as binocular disparity, motion, or defocus, the absolute distance between the observer and a scene cannot be measured. The interpretation of shading, edges and junctions may provide a 3D model of the scene, but it will not provide information about the actual "size" of the space. One possible source of information for absolute depth estimation is the image size of known objects. However, this is computationally complex due to the difficulty of the object recognition process. Here we propose a source of information for absolute depth estimation that does not rely on specific objects: we introduce a procedure for absolute depth estimation based on the recognition of the whole scene. The shape of the space of the scene and the structures present in the scene are strongly related to the scale of observation. We demonstrate that, by recognizing the properties of the structures present in the image, we can infer the scale of the scene, and therefore its absolute mean depth. We illustrate the interest in computing the mean depth of the scene with application to scene recognition and object detection.
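The object-based cue that the abstract mentions as an alternative (the image size of known objects) can be made concrete with the standard pinhole-camera relation: an object of real height H at distance Z projects to an image height h = f·H/Z, so Z = f·H/h. The sketch below illustrates only that textbook relation, not the scene-based procedure the authors propose; the function name and the example numbers are hypothetical.

```python
def distance_from_known_size(focal_length_px, real_height_m, image_height_px):
    """Pinhole-camera distance estimate, Z = f * H / h.

    Hypothetical helper: focal length and image height are in pixels,
    real height in metres, so the result is in metres.
    """
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    return focal_length_px * real_height_m / image_height_px


# A 1.7 m tall person imaged 170 px tall by a camera with f = 1000 px.
estimated_distance_m = distance_from_known_size(1000.0, 1.7, 170.0)
```

The estimate depends on first recognising the object and knowing its real size, which is the computational difficulty the abstract points to.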
|
6 |
Statistical mechanical models for image processing
16 October 2001
No description available.
|
7 |
Using computational models to study texture representations in the human visual system
Balas, Benjamin 07 February 2005
Traditionally, human texture perception has been studied using artificial textures made of random-dot patterns or abstract structured elements. At the same time, computer algorithms for the synthesis of natural textures have improved dramatically. The current study seeks to unify these two fields of research through a psychophysical assessment of a particular computational model, thus providing a sense of what image statistics are most vital for representing a range of natural textures. We employ Portilla and Simoncelli's 2000 model of texture synthesis for this task (a parametric model of analysis and synthesis designed to mimic computations carried out by the human visual system). We find an intriguing interaction between texture type (periodic vs. structured) and image statistics (autocorrelation function and filter magnitude correlations), suggesting different processing strategies may be employed for these two texture families under pre-attentive viewing.
|
8 |
Aspects of Fourier imaging
Hsiao, Wen-Hsin January 2008
A number of topics related to Fourier imaging are investigated. Relationships between the magnitude of errors in the amplitude and phase of the Fourier transform of images and the mean square error in reconstructed images are derived. The differing effects of amplitude and phase errors are evaluated, and "equivalent" amplitude and phase errors are derived. A model of the probability density function of the Fourier amplitudes of images is derived. The fundamental basis of phase dominance is studied and quantified. Inconsistencies in published counter-examples of phase dominance are highlighted. The key characteristics of natural images that lead to their observed power spectral behaviour with spatial frequency are determined.
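The link between Fourier-domain errors and reconstruction error can be illustrated with Parseval's theorem: for an unnormalised N-point DFT, the mean square error of the reconstruction equals the summed squared error of the coefficients divided by N². The sketch below checks this numerically on a short one-dimensional signal. It is an illustration under simple assumptions (the same phase offset `delta`, or the same relative amplitude error `eps`, applied to every coefficient) and is not the derivation in the thesis, whose definition of "equivalent" errors may differ.

```python
import cmath
import math


def dft(x):
    """Unnormalised discrete Fourier transform (direct O(N^2) sum)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]


def idft(spectrum):
    """Inverse of dft(), including the 1/N factor."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * k / n) for f in range(n)) / n
            for k in range(n)]


def mse(a, b):
    """Mean squared magnitude of the difference between two sequences."""
    return sum(abs(u - v) ** 2 for u, v in zip(a, b)) / len(a)


def spectral_mse(spec_a, spec_b):
    """Signal-domain MSE predicted from coefficient errors via Parseval."""
    n = len(spec_a)
    return sum(abs(u - v) ** 2 for u, v in zip(spec_a, spec_b)) / n ** 2


signal = [0.0, 1.0, 4.0, 2.0, 3.0, 5.0, 1.0, 0.5]
spectrum = dft(signal)

delta = 0.3                       # assumed phase error in radians
eps = 2 * math.sin(delta / 2)     # amplitude error with the same |error| per coefficient

phase_err = [c * cmath.exp(1j * delta) for c in spectrum]
amp_err = [c * (1 + eps) for c in spectrum]

mse_phase = mse(signal, idft(phase_err))
mse_amp = mse(signal, idft(amp_err))
```

In this toy setting a phase error of `delta` radians and a relative amplitude error of `2*sin(delta/2)` give the same mean square error, because each changes every coefficient by the same magnitude. A constant phase offset makes the reconstruction complex-valued, which is why `mse` uses magnitudes.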
|
9 |
Improving character recognition by thresholding natural images / Förbättra optisk teckeninläsning genom att segmentera naturliga bilder
Granlund, Oskar, Böhrnsen, Kai January 2017
Current state-of-the-art optical character recognition (OCR) algorithms are capable of extracting text from images under predefined conditions. OCR is extremely reliable for interpreting machine-written text with minimal distortions, but images taken in a natural scene are still challenging. In recent years, improving recognition rates in natural images has gained interest because more powerful handheld devices are in use. The main problems in recognising text in natural images are distortions such as illumination, font textures, and complex backgrounds. Different preprocessing approaches to separate text from its background have been researched lately. In our study, we assess the improvement achieved by two of these preprocessing methods, k-means and Otsu, by comparing the results they yield from an OCR algorithm. The study showed that the preprocessing brought some improvement in particular cases, but overall yielded worse accuracy than the unaltered images.
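For readers unfamiliar with the Otsu preprocessing step named above, the following is a minimal sketch of Otsu's method: it picks the grey level that maximises the between-class variance of the two resulting pixel groups, then binarises the image at that level. It assumes 8-bit grey values supplied as a flat list; the function names are illustrative, and this is not the code used in the study.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t maximising between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    Assumes integer grey values in range(levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue          # no pixels at or below t yet
        if weight1 == 0:
            break             # every pixel is at or below t
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Proportional to the between-class variance (constant factor dropped).
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Map pixels above the threshold to 1 and the rest to 0."""
    return [1 if p > threshold else 0 for p in pixels]


# Toy "image": dark text pixels around 10-12, bright background around 200-205.
toy_pixels = [10, 12, 11, 10, 200, 205, 203, 201, 12, 204]
toy_threshold = otsu_threshold(toy_pixels)
toy_mask = binarize(toy_pixels, toy_threshold)
```

A single global threshold like this works when text and background form two well-separated intensity groups, which is often not the case in natural scenes with uneven illumination.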
|
10 |
Character Recognition in Natural Images Utilising TensorFlow / Teckenigenkänning i naturliga bilder med TensorFlow
Viklund, Alexander, Nimstad, Emma January 2017
Convolutional Neural Networks (CNNs) are commonly used for character recognition. They achieve the lowest error rates on popular datasets such as SVHN and MNIST. However, research on CNNs for character classification in natural images covering the whole English alphabet is lacking. This thesis conducts an experiment in which TensorFlow is used to construct a CNN that is trained and tested on the Chars74K dataset, with 15 images per class for training and 15 images per class for testing. The aim is to achieve a higher accuracy than the non-CNN approach by de Campos et al. [1], which achieved 55.26%. The thesis explores data augmentation techniques for expanding the small training set and evaluates the result of applying rotation, stretching, translation and noise-adding. All of these methods apart from adding noise have a positive effect on the accuracy of the network. Furthermore, the experiment shows that with a three-layer convolutional neural network it is possible to create a character classifier that is as good as that of de Campos et al. It is believed that even better results could be achieved if more experiments were conducted on the parameters of the network and the augmentation.
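Two of the augmentation techniques evaluated in the thesis, translation and noise-adding, can be sketched in plain Python on a greyscale image stored as a list of rows. This is an assumption-laden illustration rather than the thesis's TensorFlow pipeline: the helper names, the zero fill value, and the Gaussian noise model are all hypothetical, and rotation and stretching are omitted because they need interpolation.

```python
import random


def translate(image, dx, dy, fill=0):
    """Shift an image by (dx, dy) pixels, padding exposed areas with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            new_y, new_x = y + dy, x + dx
            if 0 <= new_y < height and 0 <= new_x < width:
                shifted[new_y][new_x] = image[y][x]
    return shifted


def add_noise(image, sigma, rng):
    """Add Gaussian noise to each pixel and clip the result to 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]


def translated_copies(image, max_shift):
    """All shifts with |dx|, |dy| <= max_shift, including the unshifted image."""
    shifts = range(-max_shift, max_shift + 1)
    return [translate(image, dx, dy) for dy in shifts for dx in shifts]


tiny_image = [[1, 2],
              [3, 4]]
shifted_right = translate(tiny_image, dx=1, dy=0)
augmented_set = translated_copies(tiny_image, max_shift=1)
noisy = add_noise(tiny_image, sigma=5.0, rng=random.Random(0))
```

With only 15 training images per class, each original image can be expanded into several label-preserving variants in this way before training.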
|