1 |
Active stereo for AGV navigation. Li, Fuxing, January 1996.
No description available.
|
2 |
Target template guidance of eye movements during real-world search. Malcolm, George Law, January 2010.
Humans must regularly locate task-relevant objects when interacting with the world around them. Previous research has identified different types of information that the visual system can use to help locate objects in real-world scenes, including low-level image features and scene context. However, previous research using object arrays suggests that there may be another type of information that can guide real-world search: target knowledge. When a participant knows what a target looks like they generate and store a visual representation, or template, of it. This template then facilitates the search process. A complete understanding of real-world search needs to identify how a target template guides search through scenes. Three experiments in Chapter 2 confirmed that a target template facilitates real-world search. Eye-tracking showed that target knowledge facilitates both scanning and verification behaviours during search, but not the search-initiation process. Within the scanning epoch a target template facilitated gaze directing and shortened fixation durations. These results suggest that target knowledge affects both the activation map, which selects which regions of the scene to fixate, and the evaluation process that compares a fixated object to the internal representation of the target. With the exact behaviours that a target template facilitates now identified, Chapter 3 investigated the role that target colour played in template-guided search. Colour is one of the more interesting target features, as it has been shown to be preferred by the visual system over other features when guiding search through object arrays. Two real-world search experiments in Chapter 3 found that colour information had its strongest effect on the gaze-directing process, suggesting that the visual system relies heavily on colour information when searching for target-similar regions in the scene percept.
Although colour was found to facilitate the evaluation process too, both when rejecting a fixated object as a distracter and when accepting it as the target, this behaviour was influenced comparatively less. This suggests that the two main search behaviours, gaze directing and region evaluation, rely on different sets of template features. The gaze-directing process relies heavily on colour information, but knowledge of other target features will further facilitate the evaluation process. Chapter 4 investigated how target knowledge combines with other types of information to guide search. This is particularly relevant in real-world search, where several sources of guidance information are simultaneously available. A single experiment investigated how target knowledge and scene context combine to facilitate search. Both information types were found to facilitate scanning and verification behaviours. During the scanning epoch both facilitated the eye-guidance and object-evaluation processes. When both information sources were available to the visual system simultaneously, each search behaviour was facilitated additively. This suggests that the visual system processes target-template and scene-context information independently. Collectively, the results not only indicate the manner in which a target template facilitates real-world search but also update our understanding of real-world search and the visual system. These results can help increase the accuracy of future real-world search models by specifying the manner in which our visual system utilises target-template information, which target features are predominantly relied upon, and how target knowledge combines with other types of guidance information.
|
3 |
Real-Time Motion and Stereo Cues for Active Visual Observers. Björkman, Mårten, January 2002.
No description available.
|
4 |
Surveillance of Time-varying Geometry Objects using a Multi-camera Active-vision System. Mackay, Matthew Donald, 10 January 2012.
The recognition of time-varying geometry (TVG) objects (in particular, humans) and their actions is a complex task due to common real-world sensing challenges, such as obstacles and environmental variations, as well as due to issues specific to TVG objects, such as self-occlusion. Herein, it is proposed that a multi-camera active-vision system, which dynamically selects camera poses in real-time, be used to improve TVG action sensing performance by selecting camera views on-line for near-optimal sensing-task performance. Active vision for TVG objects requires an on-line sensor-planning strategy that incorporates information about the object itself, including its current action, and information about the state of the environment, including obstacles, into the pose-selection process. Thus, the focus of this research is the development of a novel methodology for real-time sensing-system reconfiguration (active vision), designed specifically for the recognition of a single TVG object and its actions in a cluttered, dynamic environment, which may contain multiple other dynamic (maneuvering) obstacles.
The proposed methodology was developed as a complete, customizable sensing-system framework which can be readily modified to suit a variety of specific TVG action-sensing tasks – a 10-stage pipeline real-time architecture. Sensor Agents capture and correct camera images, removing noise and lens distortion, and segment the images into regions of interest. A Synchronization Agent aligns multiple images from different cameras to a single ‘world-time.’ Point Tracking and De-Projection Agents detect, identify, and track points of interest in the resultant 2-D images, and form constraints in normalized camera coordinates using the tracked pixel coordinates. A 3-D Solver Agent combines all constraints to estimate world-coordinate positions for all visible features of the object-of-interest (OoI) 3-D articulated model. A Form-Recovery Agent uses an iterative process to combine model constraints, detected feature points, and other contextual information to produce an estimate of the OoI’s current form. This estimate is used by an Action-Recognition Agent to determine which action the OoI is performing, if any, from a library of known actions, using a feature-vector descriptor for identification. A Prediction Agent provides estimates of future OoI and obstacle poses, given past detected locations, and estimates of future OoI forms given the current action and past forms. Using all of the data accumulated in the pipeline, a Central Planning Agent implements a formal, mathematical optimization developed from the general sensing problem. The agent seeks to optimize a visibility metric, which is positively related to sensing-task performance, to select desirable, feasible, and achievable camera poses for the next sensing instant. Finally, a Referee Agent examines the complete set of chosen poses for consistency, enforces global rules not captured through the optimization, and maintains system functionality if a suitable solution cannot be determined.
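The ten agent stages above can be sketched as a simple staged data flow. This is illustrative Python only: the stage names follow the abstract, but each stage is a stub that records its execution, standing in for the real processing.

```python
# Illustrative sketch of the 10-stage agent pipeline described in the
# abstract. Stage names follow the text; the processing is stubbed out.

STAGE_NAMES = [
    "Sensor",             # capture/correct images, segment regions of interest
    "Synchronization",    # align multi-camera images to one 'world-time'
    "PointTracking",      # detect, identify, and track points of interest in 2-D
    "DeProjection",       # form constraints in normalized camera coordinates
    "Solver3D",           # estimate world-coordinate positions of OoI features
    "FormRecovery",       # iteratively estimate the OoI's current form
    "ActionRecognition",  # match against a library of known actions
    "Prediction",         # predict future OoI/obstacle poses and forms
    "CentralPlanning",    # optimize the visibility metric, choose camera poses
    "Referee",            # check chosen poses for consistency, enforce rules
]

def make_stage(name):
    """Return a stub agent that tags the shared state as it passes through."""
    def stage(state):
        state.setdefault("trace", []).append(name)
        return state
    return stage

def run_pipeline(frame):
    """Push the data for one sensing instant through all stages in order."""
    state = {"frame": frame}
    for name in STAGE_NAMES:
        state = make_stage(name)(state)
    return state
```

In a real implementation each stage would run concurrently on streaming data; the sequential loop here only makes the stage ordering explicit.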
In order to validate the proposed methodology, rigorous experiments are also presented herein. They confirm the basic assumptions of active vision for TVG objects, and characterize the gains in sensing-task performance. Simulated experiments provide a method for rapid evaluation of new sensing tasks. These experiments demonstrate a tangible increase in single-action recognition performance over the use of a static-camera sensing system. Furthermore, they illustrate the need for feedback in the pose-selection process, allowing the system to incorporate knowledge of the OoI’s form and action. Later real-world, multi-action and multi-level action experiments demonstrate the same tangible increase when sensing real-world objects that perform multiple actions which may occur simultaneously, or at differing levels of detail.
A final set of real-world experiments characterizes the real-time performance of the proposed methodology in relation to several important system design parameters, such as the number of obstacles in the environment, and the size of the action library. Overall, it is concluded that the proposed system tangibly increases TVG action-sensing performance, and can be generalized to a wide range of applications, including human-action sensing. Future research is proposed to develop similar methods to address deformable objects and multiple objects of interest.
|
7 |
From vision to drawn metaphor: an artistic investigation into the relationship between eye-tracking and drawing. Baker, Catherine, January 2012.
At its most essential, drawing consists of the making of marks on a surface; however, such an interpretation does not necessarily reflect the diverse practice of artists whose work seeks to challenge the conventions of drawing and establish new boundaries. This abstract documents a practice involving a new consideration for drawing, one which focuses on the active process of drawing as a physical and perceptual encounter. It proposes that eye movements and their associated cognitive processing can be considered a drawing-generating process. It does not seek to undermine the conventional three-way process of drawing involving eye, hand and brain, but presents ideas which push against the established boundaries of drawing practice, investigating new ways of making and new ways of considering the practice of drawing as a phenomenological contemplation. The proposition for drawing presented in this document has been developed through a practice-led enquiry over the last eight years and involves using scientific methodologies found within the area of Active Vision. By examining artworks produced within the early part of the period of time defined within this thesis, emergent ideas relating to the act of making in-situ drawings and the recollection of such experiences brought about a series of questions regarding the process of generating a drawing. As the practice developed, using data obtained from different eye-tracking experiments, the author explored the possibilities for drawing by using scientific methods of tracking the act of looking to investigate the relationship between the observer and the observed entity. Using the relationship between the drawn mark and visual responses to it as the basis for a practice-led period of research, this thesis presents the notion that, by using technologies designed for other disciplines, artists can explore the potential for drawing beyond the conventions cited above.
Through the use of eye-tracking data the artist and author seeks to firmly establish the use of this scientific methodology within an artistic framework. It is a framework that responds to new ways of thinking about spatiality and the relations between sight and thought, taking into account the value of experience within the production of art; how the physical act itself becomes the manifestation of a process of drawing, understanding and knowledge of the world around us.
|
8 |
Design of A Saccadic Active Vision System. Wong, Winnie Sze-Wing, January 2006.
Human vision is remarkable. By limiting the main concentration of high-acuity photoreceptors to the eye's central fovea region, we efficiently view the world by redirecting the fovea between points of interest using eye movements called saccades.

Part I describes a saccadic vision system prototype design. The dual-resolution saccadic camera detects objects of interest in a scene by processing low-resolution image information; it then revisits salient regions in high resolution. The end product is a dual-resolution image in which background information is displayed in low resolution and salient areas are captured at high acuity. This lends itself to a resource-efficient active vision system.

Part II describes CMOS image sensor designs for active vision. Specifically, this discussion focuses on methods to determine regions of interest and achieve high dynamic range on the sensor.
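The dual-resolution composite from Part I can be illustrated with a small array sketch. The inputs (a greyscale image and a list of salient boxes) and the downsampling by striding with pixel replication are illustrative assumptions, not the camera's actual mechanism.

```python
import numpy as np

def dual_resolution_composite(scene, salient_boxes, downsample=4):
    """Build a dual-resolution image: coarse background, full-resolution
    salient regions. Sketch only; boxes are (row0, row1, col0, col1)."""
    # Low-resolution background: stride-sample, then upsample by replication
    lo = scene[::downsample, ::downsample]
    bg = np.repeat(np.repeat(lo, downsample, axis=0), downsample, axis=1)
    bg = bg[:scene.shape[0], :scene.shape[1]]
    out = bg.copy()
    # Re-insert the salient regions at full acuity
    for (r0, r1, c0, c1) in salient_boxes:
        out[r0:r1, c0:c1] = scene[r0:r1, c0:c1]
    return out
```

The resource saving comes from only ever reading the salient boxes at full resolution; in this sketch the full image exists in memory, whereas the camera would acquire the two resolutions separately.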
|
9 |
Active visual scene exploration. Sommerlade, Eric Chris Wolfgang, January 2011.
This thesis addresses information theoretic methods for control of one or several active cameras in the context of visual surveillance. This approach has two advantages. Firstly, any system dealing with real inputs must take into account noise in the measurements and the underlying system model. Secondly, the control of cameras in surveillance often has different, potentially conflicting objectives. Information theoretic metrics not only yield a way to assess the uncertainty in the current state estimate, they also provide means to choose the observation parameters that optimally reduce this uncertainty. The latter property allows comparison of sensing actions with respect to different objectives, so that a preference among objectives can be specified, which the generated control then fulfils accordingly. The thesis provides arguments for the utility of information theoretic approaches to control visual surveillance systems, by addressing the following objectives in particular. Firstly, how to choose a zoom setting of a single camera to optimally track a single target with a Kalman filter. Emphasis is placed on arbitrating between loss of track due to noise in the observation process and information gain due to higher accuracy after successful observation. The resulting method adds a running average of the Kalman filter's innovation to the observation noise, which not only ameliorates tracking performance in the case of unexpected target motions, but also permits a higher maximum zoom setting. The second major contribution of this thesis is a term that addresses exploration of the supervised area in an information theoretic manner. The reasoning behind this term is to model the appearance of new targets in the supervised environment, and use this as prior uncertainty about the occupancy of areas currently not under observation.
Furthermore, this term uses the performance of an object detection method to gauge the information that observations of a single location can yield. Additionally, this thesis shows experimentally that a preference for control objectives can be set using a single scalar value, which linearly combines the objective functions of the two conflicting objectives of detection and exploration and results in the desired control behaviour. The third contribution is an objective function that addresses classification methods. The thesis shows in detail how to derive the information gained from classifying a single target, taking its gaze direction into account. Quantitative and qualitative validation shows the increase in performance when compared to standard methods.
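The zoom-control idea from this abstract, adding a running average of the Kalman filter's innovation to the observation noise, can be sketched in a single predict/update step. Variable names, the smoothing factor `alpha`, and the squared-norm form of the running average are assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

def kalman_step_adaptive(x, P, z, F, H, Q, R_base, innov_avg, alpha=0.9):
    """One predict/update cycle in which a running average of the squared
    innovation inflates the observation noise (sketch of the idea only)."""
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Innovation: measurement minus predicted measurement
    y = z - H @ x_pred
    # Running average of the squared innovation magnitude
    innov_avg = alpha * innov_avg + (1.0 - alpha) * float(y.T @ y)
    # Observation noise inflated by the innovation average: large recent
    # innovations (e.g. unexpected target motion) make the filter trust
    # measurements less, guarding against loss of track at high zoom
    R = R_base + innov_avg * np.eye(R_base.shape[0])
    # Standard Kalman update with the inflated noise
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ y
    P_new = (np.eye(P.shape[0]) - K @ H) @ P_pred
    return x_new, P_new, innov_avg
```

A zoom controller built on this would then pick the zoom setting whose expected posterior covariance minimises uncertainty while keeping the target within the narrower field of view.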
|