Global ETD Search

1	Design of A Saccadic Active Vision System Wong, Winnie Sze-Wing January 2006 Human vision is remarkable. By limiting the main concentration of high-acuity photoreceptors to the eye's central fovea region, we efficiently view the world by redirecting the fovea between points of interest using eye movements called saccades. Part I describes a saccadic vision system prototype design. The dual-resolution saccadic camera detects objects of interest in a scene by processing low-resolution image information; it then revisits salient regions at high resolution. The end product is a dual-resolution image in which background information is displayed at low resolution and salient areas are captured at high acuity. This lends itself to a resource-efficient active vision system. Part II describes CMOS image sensor designs for active vision. Specifically, this discussion focuses on methods to determine regions of interest and achieve high dynamic range on the sensor. Electrical & Computer Engineering saccade active vision selective attention saliency map dynamic range
2	Estimation of visual focus for control of a FOA-based image coder / Estimering av visuellt fokus för kontroll av en FOA-baserad bildkodare Carlén, Stefan January 2003 A major feature of the human eye is the compressed sensitivity of the retina. An image coder that makes use of this can heavily compress the parts of the image that are not close to the focus of our eyes. Existing image coding schemes require that the gaze direction of the viewer be measured. However, it would be a great advantage if an estimator could predict the focus of attention (FOA) regions in the image. This report presents such an implementation, which is based on a model that mimics many of the biological features of the human visual system (HVS). For example, it uses a center-surround mechanism, which replicates the receptive fields of the neurons in the HVS. Extra features of the implementation are an extension to handle video sequences and the expansion of the FOAs. Tests of the system show good results on a large variety of images. Technology image coding focus of attention saliency map FOA expansion TECHNOLOGY
5	Improving the performance of airport luggage inspection by providing cognitive and perceptual supports to screeners Liu, Xi January 2008 Recently, concern about aviation security has focused on the work of airport security screeners who detect threat items in passengers' luggage. An effective method of training and screening is required to improve screeners' detection abilities and to cope with the unreliability of human screening performance. The overall aim of this thesis is to understand and define the potential visual and cognitive factors in the task of inspecting X-ray images of airport passengers' luggage, to examine the usability of perceptual feedback in this demanding task, and to develop a new salient-region method that assists screeners in detecting targets. The work is intended to yield knowledge of the skills involved in examining X-ray luggage images, provide insight into the design of training systems, and develop a method to significantly enhance screeners' detection ability. A questionnaire was developed for screeners to elicit expertise in the screening task and to investigate the effect of image features on visual attention. A series of experiments was designed to understand the screening task and explore how knowledge and skills develop with practice. The results indicate that training under time-stressed conditions is recommended to ensure adequately high detection ability in real-life situations, as screeners have to balance accuracy and speed under time pressure. Screeners' advantages are better detection ability and search skills, gained through experience of the search task. The hit rate of naive people improved with perceptual exposure to images of threat items; however, their scanning did not become more efficient. It was demonstrated that detection performance and search skills improve through practice with frequent exposure to targets in the search task, and that this ability partly transfers to novel targets. 
Learning in visual search for threat items is stimulus specific, such that familiarity with the stimulus and task is the source of performance enhancement. Threat items should be updated constantly, and a massive number of X-ray threat objects should be employed in training airport security screeners so as to enlarge object knowledge and enhance recognition ability. Perceptual feedback that circles areas with dwell durations longer than 1000 ms does not significantly improve observers' detection ability in the airport screening task. Features of bags and threat items influence initial attention and attention allocation in the search process. Salient regions, based purely on stimulus properties, not only contain most of the targets in X-ray images but also improve observers' detection performance, yielding a high hit rate, by forcing observers to scrutinize these areas carefully. 331.2
6	Visual Attention in Active Vision Systems : Attending, Classifying and Manipulating Objects Rasolzadeh, Babak January 2011 This thesis has presented a computational model for the combination of bottom-up and top-down attentional mechanisms. Furthermore, the use of this model has been demonstrated in a variety of applications of machine and robotic vision. We have observed that an attentional mechanism is imperative in any active vision system, machine as well as biological, since it not only reduces the amount of information that needs to be further processed (for, say, recognition or action) but also, because only the attended image regions are processed, makes such tasks more robust to large amounts of clutter and noise in the visual field. Using various feature channels such as color, orientation, texture, depth and symmetry as input, the presented model is able, with a pre-trained artificial neural network, to modulate a saliency map for a particular top-down goal, e.g. visual search for a target object. More specifically, it dynamically combines the unmodulated bottom-up saliency with the modulated top-down saliency by means of a biologically and psychophysically motivated temporal differential equation. This way the system is, for instance, able to detect important bottom-up cues even while in visual search mode (top-down) for a particular object. All the computational steps for yielding the final attentional map, which ranks regions in images according to their importance for the system, are shown to be biologically plausible. It has also been demonstrated that the presented attentional model facilitates tasks other than visual search. For instance, using the covert attentional peaks that the model returns, we can improve scene understanding and segmentation through clustering or scattering of the 2D/3D components of the scene, depending on the configuration of these attentional peaks and their relations to other attributes of the scene. 
More specifically, this is performed by means of entropy optimization of the scene under varying cluster configurations, i.e. different groupings of the various components of the scene. Qualitative experiments demonstrated the use of this attentional model on a robotic humanoid platform, controlling the overt attention of the robot in real time by specifying the saccadic movements of the robot head. These experiments also exposed another highly important aspect of the model: its temporal variability, as opposed to many other attentional (saliency) models that deal exclusively with static images. Here the dynamic aspects of the attentional mechanism proved to allow for a temporally varying trade-off between top-down and bottom-up influences depending on changes in the environment of the robot. The thesis has also put forward systematic and quantitative large-scale experiments on the actual benefits and uses of this kind of attentional model. To this end a simulated 2D environment was implemented, where the system could not “see” the entire environment and needed to perform overt shifts of attention (simulated saccades) in order to perform a visual search task for a pre-defined sought object. This allowed for a simple and rapid substitution of the core attentional model of the system with comparative computational models designed by other researchers. Nine such contending models were tested and compared quantitatively with the presented model. Given certain assumptions, these experiments showed that the attentional model presented in this work outperforms the other models in simple visual search tasks. / QC 20111228 visual attention saliency map computer vision robotics active vision machine learning
7	Visual saliency and eye movement: modeling and applications Rezazadegan Tavakoli, H. (Hamed) 04 November 2014 Abstract Humans are capable of narrowing their focus to the highlights of visual information in a fraction of a second in order to handle an enormous mass of data. Like humans, computers must deal with a tremendous amount of visual information. To replicate such a focusing mechanism, computer vision relies on techniques that filter out redundant information. Consequently, saliency has recently become a popular subject of discussion in the computer vision community, though it is a long-standing subject in the cognitive sciences rather than computer science. The popularity of saliency techniques, particularly in the computer vision domain, is largely due to their inexpensive and fast computation, which facilitates their use in many computer vision applications, e.g., image/video compression, object recognition, and tracking. This study investigates visual saliency modeling, which is the transformation of an image into a salience map such that the identified conspicuousness agrees with the statistics of human eye movements. It explores the extent to which image and video processing can be used to develop saliency techniques suitable for computer vision; e.g., it adopts a sparse sampling scheme and kernel density estimation to introduce a saliency measure for images. It also studies the role of eye movement in salience modeling. To this end, it introduces a particle-filter-based framework of saccade generation incorporated into a salience model. Moreover, eye movements and salience are exploited in several applications. 
The contributions of this study lie in the proposal of a number of salience models for image and video stimuli, a framework to incorporate a model of eye movement generation in salience modeling, and the investigation of the application of salience models and eye movements in tracking, background subtraction, scene recognition, and valence recognition. / Abstract (Finnish version, translated) Humans can direct their gaze in an instant to the essential elements of a scene, which requires the visual system to process enormous amounts of data. Like a human, a computer should be able to process a correspondingly large amount of visual information. Implementing such a mechanism in computer vision requires methods for filtering out redundant information. For this reason, saliency, or conspicuousness, has recently become a popular research topic in computer science and especially in the computer vision research community, although it has long been studied in the cognitive sciences. The prominence of saliency methods, particularly in computer vision, is mainly due to their computational efficiency, which in turn enables their use in many computer vision applications such as image and video compression, object recognition, and tracking. This dissertation studies the modeling of visual saliency, meaning the transformation of an image into a saliency map such that the computed conspicuousness matches the statistics derived from human eye movements. The work examines how image and video processing can be used to develop saliency methods for the needs of computer vision. For example, it presents a saliency measure for images that uses sparse sampling and kernel estimation. It also studies the role of eye movements in saliency modeling. To this end, it presents a particle-filtering approach to saccade generation that can be coupled with a saliency model. 
In addition, eye movements and saliency are exploited in several applications. The scientific contributions of the research include several saliency models for image and video stimuli, an approach to the computational modeling and generation of eye movements as part of a saliency model, and an investigation of the applicability of saliency models and eye movements to visual tracking, background subtraction, scene analysis, and valence recognition. computer vision pattern recognition saliency map vision system visual attention
8	Content-aware Video Compression Subramanian, Vivek January 2019 In a video there are certain regions in the image that viewers focus on more than others, which are called the salient regions or Regions of Interest (ROIs). This thesis aims to improve the perceived quality of videos by improving the quality of these ROIs while degrading the quality of the non-ROI regions of a frame, so that the bitrate stays the same as it would otherwise have been. This improvement is achieved by using saliency maps generated with an eye tracker or a deep neural network and providing this information to a modified video encoder. In this thesis the open source x264 encoder was chosen to make use of this information. The effects of ROI encoding are studied for high quality 720p videos by encoding them at low bitrates. The results indicate that ROI encoding can improve subjective video quality when carefully applied. / In a video there are certain parts of the image that viewers focus on more than others, and these are called "Regions of Interest". The goal of this thesis is to raise the video quality perceived by the viewer by reducing the compression level (and thereby raising the quality) in the eye-catching parts of the image, while raising the compression level in the remaining parts so that the bitrate stays the same as before the change. This improvement is made by using "Saliency Maps" that show the eye-catching parts of each frame. These saliency maps have either been detected with an eye tracker or computed by a neural network. The information is then used in a modified version of the open codec x264 according to a custom-designed algorithm. The effect of the change was studied by encoding high-quality source files at a low bitrate. The results indicate that this method can improve the perceived quality of a video if it is applied with the right strength. 
region-of-interest saliency map bitrate H.264 video compression quantization offset Engineering and Technology
9	Towards Explainable AI Using Attribution Methods and Image Segmentation Rocks, Garrett J 01 January 2023 With artificial intelligence (AI) becoming ubiquitous in a broad range of application domains, the opacity of deep learning models remains an obstacle to adoption within safety-critical systems. Explainable AI (XAI) aims to build trust in AI systems by revealing important inner mechanisms of what human users have treated as a black box. This thesis specifically aims to improve the transparency and trustworthiness of deep learning algorithms by combining attribution methods with image segmentation methods. This thesis has the potential to improve the trust in and acceptance of AI systems, leading to more responsible and ethical AI applications. An exploratory algorithm called ESAX is introduced; in some cases it achieves better performance on PIC testing than other top attribution methods. These results lay a foundation for future work in segmentation attribution. XAI AI Saliency Map XRAI Segmentation Human AI Trust Artificial Intelligence and Robotics Electrical and Electronics
10	Road Scene Content Analysis for Driver Assistance and Autonomous Driving Altun, Melih 24 August 2015 No description available. Electrical Engineering Computer Science Road scene content analysis saliency map entropy driven context-feature fusion
