411 |
Visual servoing path-planning for generalized cameras and objects. Shen, Tiantian, 沈添天. January 2013 (has links)
Visual servoing (VS) is an automatic control technique that uses vision feedback to control robot motion. Eye-in-hand VS systems, with the vision sensor mounted directly on the robot end-effector, have received significant attention, in particular for the task of steering the vision sensor (usually a camera) from its present position to a desired one identified by image features shown in advance. The servo uses the difference between the present and the desired views (shown a priori) of some objects to compute real-time driving signals. This approach is also known as the “teach-by-showing” method. To accomplish such a task, many constraints and limits must be respected, such as the camera field of view (FOV), robot joint limits, and collision and occlusion avoidance. Path-planning technologies, as one branch of high-level control strategies, are explored in this thesis to impose these constraints on VS tasks with respect to different types of cameras and objects.
First, a VS path-planning strategy is proposed for a class of cameras that includes conventional perspective cameras, fisheye cameras, and catadioptric systems. These cameras are described by a unified mathematical model, and the strategy consists of designing image trajectories that allow the camera to reach the desired position while satisfying the camera FOV limit and end-effector collision avoidance. To this end, the proposed strategy introduces the projection of the available image features onto a virtual plane and the computation of a feasible camera trajectory through polynomial programming. The computed image trajectory is then tracked by an image-based visual servoing (IBVS) controller. Experimental results with a fisheye camera mounted on a 6-degree-of-freedom (6-DoF) robot arm illustrate the proposed strategy.
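As a rough illustration (not the thesis's actual controller), the classical IBVS law used to track a planned image trajectory can be sketched as follows, assuming NumPy, normalized image-point features, and estimated point depths; all names here are illustrative:

```python
import numpy as np

def interaction_matrix(points_xy, depths):
    """Stack the 2x6 interaction (image Jacobian) matrices of normalized
    image points (x, y) at estimated depths Z, as in classical IBVS."""
    rows = []
    for (x, y), Z in zip(points_xy, depths):
        rows.append([-1.0 / Z, 0.0, x / Z, x * y, -(1 + x * x), y])
        rows.append([0.0, -1.0 / Z, y / Z, 1 + y * y, -x * y, -x])
    return np.asarray(rows)

def ibvs_velocity(current, desired, depths, gain=0.5):
    """Camera velocity screw v = -lambda * L^+ (s - s*), driving the
    current features toward the (possibly planned) reference features."""
    L = interaction_matrix(current, depths)
    error = (np.asarray(current) - np.asarray(desired)).ravel()
    return -gain * np.linalg.pinv(L) @ error

# When the features already coincide with the reference, the commanded
# velocity is zero.
v = ibvs_velocity([(0.1, 0.2), (-0.3, 0.1), (0.2, -0.2)],
                  [(0.1, 0.2), (-0.3, 0.1), (0.2, -0.2)],
                  depths=[1.0, 1.2, 0.8])
```

Feeding the controller a sequence of planned reference features s*(t), rather than only the final view, is what lets the path-planning stage enforce FOV and collision constraints.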
Second, this thesis proposes a path-planning strategy for visual servoing with image moments, in which the observed features are not restricted to points. Image moments of solid objects such as circles and spheres are more intuitive features than the dominant feature points used in VS applications. The problem consists of planning a trajectory that ensures the convergence of the robot end-effector to the desired position while satisfying workspace (Cartesian-space) constraints on the robot end-effector and visibility constraints on these solid objects, in particular collision and occlusion avoidance. A solution based on polynomial parametrization is proposed and validated by simulation and experimental results.
Third, constrained optimization is combined with robot teach-by-demonstration to simultaneously address the visibility constraint, joint limits, and whole-arm collisions for robust vision-based control of a robot manipulator. User demonstration data generate safe regions for robot motion with respect to joint limits and potential whole-arm collisions. Constrained optimization uses these safe regions to generate new feasible trajectories under the visibility constraint that achieve the desired view of the target (e.g., a pre-grasping location) in new, undemonstrated locations. To fulfill these requirements, camera trajectories that traverse a set of selected control points are modeled and optimized using either quintic Hermite splines or polynomials with C2 continuity. Experiments with a 7-DoF articulated arm validate the proposed method. / published_or_final_version / Electrical and Electronic Engineering / Doctoral / Doctor of Philosophy
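A minimal sketch of the quintic-segment idea behind such C2-continuous trajectories, assuming NumPy (the thesis's actual spline construction and constraint handling are not shown here): matching position, velocity, and acceleration at both ends of each segment is exactly what yields C2 continuity when segments are chained through control points.

```python
import numpy as np

def quintic_segment(p0, v0, a0, p1, v1, a1, T):
    """Coefficients (low order first) of a quintic polynomial p(t) on
    [0, T] matching position, velocity, and acceleration at both ends --
    the boundary conditions that give C2 continuity across segments."""
    A = np.array([
        [1, 0, 0,    0,       0,       0],
        [0, 1, 0,    0,       0,       0],
        [0, 0, 2,    0,       0,       0],
        [1, T, T**2, T**3,    T**4,    T**5],
        [0, 1, 2*T,  3*T**2,  4*T**3,  5*T**4],
        [0, 0, 2,    6*T,     12*T**2, 20*T**3],
    ], dtype=float)
    b = np.array([p0, v0, a0, p1, v1, a1], dtype=float)
    return np.linalg.solve(A, b)

# Rest-to-rest motion from 0 to 1 over 2 seconds.
c = quintic_segment(0.0, 0.0, 0.0, 1.0, 0.0, 0.0, T=2.0)
p_end = np.polynomial.polynomial.polyval(2.0, c)  # position at t = T
```

In the optimization setting described above, the free boundary derivatives at interior control points become decision variables, constrained by the demonstrated safe regions and the visibility requirement.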
|
412 |
3D trajectory recovery in spatial and time domains from multiple images. Zhang, Xiongbo, 張雄波. January 2013 (has links)
Recovering 3D structure from multiple 2D images is a fundamental problem in computer vision. Most existing methods focus on the reconstruction of static points in 3D space; however, the reconstruction of trajectories traced by moving points also deserves full attention owing to its efficiency in structure modeling and description. Depending on whether points are moving in the spatial domain or in the time domain, trajectory recovery becomes a curve reconstruction problem or a non-rigid structure recovery problem, respectively. This thesis addresses several issues not considered by existing approaches to either problem.
For the curve reconstruction problem, we propose a dedicated method for planar curve reconstruction and an optimization method for general curve reconstruction. In the planar curve reconstruction method, measured projected curves, typically represented by point sequences, are fitted with B-splines before reconstruction, enabling the occlusion problem to be handled naturally. An optimization algorithm, guaranteed to converge, matches the fitted curves across images while enforcing the planarity constraint. In the general curve reconstruction method, Non-Uniform Rational B-Splines (NURBS) are employed for curve representation in 3D space, improving flexibility in curve description while maintaining the smoothness of the curve. Starting from measured point sequences of projected curves, a complete set of algorithms is developed and evaluated, including curve initialization and optimization of the initialized curve by minimizing the 2D reprojection error, defined as the 2D Euclidean distance from measured points to reprojected curves. Experiments show that the proposed methods are robust and efficient and produce high-quality reconstructions.
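As a small illustration of the fitting step (a sketch assuming SciPy, not the thesis's implementation), a measured point sequence can be fitted by a parametric cubic B-spline, and a reprojection-style error measured as the point-to-curve distance at the matched parameters:

```python
import numpy as np
from scipy.interpolate import splev, splprep

# A measured projected curve represented, as in the thesis, by a point
# sequence; here a noiseless quarter circle for illustration.
t = np.linspace(0.0, np.pi / 2, 30)
points = np.vstack([np.cos(t), np.sin(t)])

# Fit a parametric cubic B-spline through the sequence (s=0 means
# interpolation); a NURBS adds per-control-point weights on top of this
# B-spline representation.
tck, u = splprep(points, s=0.0, k=3)
fitted = np.vstack(splev(u, tck))

# Reprojection-style error: Euclidean distance from the measured points
# to the fitted curve evaluated at the matched parameter values.
err = np.linalg.norm(points - fitted, axis=0).max()
```

With noisy or occluded data, a positive smoothing factor `s` and the optimization over matched parameters described above replace this exact interpolation.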
For the non-rigid structure recovery problem, we propose two methods for the recovery of non-rigid structures, together with a strategy that automates the recovery process. On synthetic datasets, both proposed methods perform significantly better than existing methods when the measurements are contaminated by noise, and both recover the ground-truth solution when the measurements are noise-free, which no existing method has achieved so far. In the first method, namely the factorization-based method, the available constraints in non-rigid structure from motion are analyzed and the ambiguity of the solution space is clarified, leading to a straightforward approach that requires only the solution of several linear equations in the least-squares sense, instead of the non-linear optimization demanded by existing methods. In the second method, namely the bundle adjustment method, a modified trajectory basis model that is shown to be more flexible for non-rigid structure description is proposed. The method seeks the optimal non-rigid structure and camera matrices by alternately solving sets of linear equations in the least-squares sense. Experiments on real non-rigid motions show that the method improves reconstruction quality significantly. / published_or_final_version / Electrical and Electronic Engineering / Doctoral / Doctor of Philosophy
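The core rank observation behind factorization-based non-rigid structure from motion can be sketched in a few lines of NumPy (illustrative only; the thesis's constraint analysis and corrective transform are not reproduced): with K basis shapes under an affine camera, the 2F x P measurement matrix has rank at most 3K, so an SVD truncated at rank 3K splits it into motion and shape factors.

```python
import numpy as np

rng = np.random.default_rng(0)
F, P, K = 8, 20, 2                  # frames, points, basis shapes

# Synthetic measurements: each frame's 2D projections are an affine
# camera applied to a linear combination of K basis shapes, so the
# stacked 2F x P matrix W has rank at most 3K.
basis = rng.standard_normal((K, 3, P))
W = np.vstack([
    rng.standard_normal((2, 3)) @
    np.tensordot(rng.standard_normal(K), basis, axes=1)
    for _ in range(F)
])

# Factorization step: truncate the SVD at rank 3K to split W into motion
# and shape factors (defined up to an invertible 3K x 3K gauge transform,
# which the linear constraint equations then resolve).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 3 * K
motion, shape = U[:, :r] * s[:r], Vt[:r]
residual = np.linalg.norm(W - motion @ shape)
```

With noise-free measurements, as here, the truncated factorization reproduces W exactly; with noise, it is the best rank-3K approximation in the least-squares sense.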
|
413 |
Foveated object recognition by corner search. Arnow, Thomas Louis, 1946-. 29 August 2008 (has links)
Here we describe a gray-scale object recognition system based on foveated corner finding, the computation of sequential fixation points, and elements of Lowe’s SIFT transform. The system achieves rotation-, translation-, and limited scale-invariant object recognition, producing recognition decisions from data extracted at sequential fixation points. It is broken into two logical steps. The first is to develop principles of foveated visual search and automated fixation selection to accomplish corner search. The result is a new algorithm for finding corners, which is also a corner-based algorithm for aiming computed foveated visual fixations. In the algorithm, long saccades move the fovea to previously unexplored areas of the image, while short saccades improve the accuracy of putative corner locations. The system is tested on two natural scenes. As a comparison study, we compare fixations generated by the algorithm with those of human subjects viewing the same images, whose eye movements were recorded by an eyetracker; the fixation patterns are compared using an information-theoretic measure. Results show that the algorithm is a good locator of corners but does not correlate particularly well with human visual fixations. The second step is to use the located corners that meet certain goodness criteria as keypoints in a modified version of the SIFT algorithm. Two scales are implemented. This implementation creates a database of SIFT features of known objects. To recognize an unknown object, a corner is located and a feature vector created. The feature vector is compared with those in the database of known objects, and the process continues for each corner in the unknown object until enough information has been accumulated to reach a decision. The system was tested on 78 gray-scale objects, hand tools and airplanes, and shown to perform well. / text
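The database-comparison step can be sketched with Lowe's standard nearest-neighbour ratio test (a minimal NumPy illustration, not the dissertation's code; the descriptors and ratio threshold are toy values):

```python
import numpy as np

def match_feature(query, database, ratio=0.8):
    """Lowe-style nearest-neighbour match: accept the closest database
    descriptor only if it beats the runner-up by the ratio threshold,
    which rejects ambiguous matches."""
    d = np.linalg.norm(database - query, axis=1)
    best, second = np.argsort(d)[:2]
    return int(best) if d[best] < ratio * d[second] else None

# Toy database of descriptors extracted at corners of known objects.
db = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
hit = match_feature(np.array([0.95, 0.05, 0.0]), db)   # clear winner
miss = match_feature(np.array([0.5, 0.5, 0.0]), db)    # ambiguous
```

Accumulating such per-corner votes until a decision threshold is reached mirrors the sequential, fixation-by-fixation recognition process described above.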
|
414 |
Towards robust identification of slow moving animals in deep-sea imagery by integrating shape and appearance cues. Mehrnejad, Marzieh. 13 August 2015 (has links)
Underwater video data are a rich source of information for marine biologists. However, the large amount of recorded video creates a ‘big data’ problem, which emphasizes the need for automated detection techniques.
This work focuses on the detection of quasi-stationary crabs of various sizes in deep-sea images. Specific issues related to image quality, such as low contrast and non-uniform lighting, are addressed in a pre-processing step. The segmentation step, based on color, size, and shape considerations, identifies regions that potentially correspond to crabs. These regions are normalized to be invariant to scale and translation. Feature vectors are formed from the normalized regions and classified via supervised and unsupervised machine learning techniques. The proposed approach is evaluated experimentally on a video dataset from Ocean Networks Canada. The thesis provides an in-depth discussion of the performance of the proposed algorithms. / Graduate / 0544 / 0800 / 0547 / mars_mehr@hotmail.com
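The segment-then-normalize stage can be sketched as follows (an illustrative NumPy/SciPy toy, not the thesis's pipeline; the threshold and size filter stand in for its color/size/shape tests):

```python
import numpy as np
from scipy import ndimage

# Toy "pre-processed" frame: bright blobs on a dark background stand in
# for crab candidates after contrast and lighting correction.
frame = np.zeros((40, 40))
frame[5:12, 6:13] = 1.0      # candidate region A
frame[25:29, 28:32] = 1.0    # candidate region B

# Segmentation: threshold, then connected components, then a size filter
# standing in for the color/size/shape considerations.
labels, n = ndimage.label(frame > 0.5)
regions = [np.argwhere(labels == i + 1) for i in range(n)]
regions = [r for r in regions if len(r) >= 9]

def normalize(region):
    """Make a region's pixel coordinates invariant to translation and
    scale before forming a feature vector for the classifier."""
    centered = region - region.mean(axis=0)
    return centered / np.abs(centered).max()

feats = [normalize(r) for r in regions]
```

The normalized coordinate sets (or descriptors computed from them) then feed the supervised and unsupervised classifiers mentioned above.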
|
415 |
Reading between the lines: object localization using implicit cues from image tags. Hwang, Sung Ju. 10 November 2010 (has links)
Current uses of tagged images typically exploit only the most explicit information: the link between the nouns named and the objects present somewhere in the image. We propose to leverage “unspoken” cues that rest within an ordered list of image tags so as to improve object localization. We define three novel implicit features from an image’s tags—the relative prominence of each object as signified by its order of mention, the scale constraints implied by unnamed objects, and the loose spatial links hinted by the proximity of names on the list. By learning a conditional density over the localization parameters (position and scale) given these cues, we show how to improve both accuracy and efficiency when detecting the tagged objects. We validate our approach with 25 object categories from the PASCAL VOC and LabelMe datasets, and demonstrate its effectiveness relative to both traditional sliding windows as well as a visual context baseline. / text
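One way to picture the conditional-density idea (a toy NumPy sketch under the assumption of a single Gaussian per rank of mention; the paper's actual model and data are not reproduced here): fit p(scale | tag rank) from training images, then use its log-likelihood to re-rank candidate detection windows.

```python
import numpy as np

# Toy training data: (tag rank, object scale) pairs -- objects mentioned
# earlier in the tag list tend to occupy more of the image.
train = {1: [0.62, 0.55, 0.70], 2: [0.30, 0.25, 0.35], 3: [0.10, 0.12]}

# Fit a simple conditional density p(scale | rank): one Gaussian per
# rank of mention.
density = {r: (np.mean(s), np.std(s) + 1e-6) for r, s in train.items()}

def score(scale, rank):
    """Log-likelihood (up to a constant) of a candidate window's scale
    given the object's position in the tag list; high-scoring windows
    are searched first, improving both accuracy and efficiency."""
    mu, sigma = density[rank]
    return -0.5 * ((scale - mu) / sigma) ** 2 - np.log(sigma)
```

Under this toy fit, a large window is far more plausible for a first-mentioned tag than a small one, and vice versa for a late-mentioned tag.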
|
416 |
A factorization-based approach to projective reconstruction from line correspondences in multiple images. Ng, Tuen-pui, 吳端珮. January 2004 (has links)
published_or_final_version / abstract / toc / Electrical and Electronic Engineering / Master / Master of Philosophy
|
417 |
The application of human body tracking for the development of a visual interface. Wong, Shu-fai, 黃樹輝. January 2004 (has links)
published_or_final_version / abstract / toc / Computer Science and Information Systems / Master / Master of Philosophy
|
418 |
Image complexity measurement for predicting target detectability. Peters, Richard Alan, 1956-. January 1988 (has links)
Designers of automatic target recognition (ATR) algorithms need to compare the performance of different ATRs on a wide variety of imagery. The task would be greatly facilitated by an image complexity metric that correlates with the performance of a large number of ATRs. The ideal metric is independent of any specific ATR and does not require advance knowledge of the true targets in the image. No currently used metric meets both these criteria. Complete independence of ATRs and prior target information is neither possible nor desirable, since the metric must correlate with ATR performance. An image complexity metric derived from the common characteristics of a large set of ATRs and the attributes of typical targets may be sufficiently general for ATR comparison. Many real-time, tactical ATRs operate on forward-looking infrared (FLIR) imagery and identify, as potential targets, image regions of a specific size that are highly discernible by virtue of their contrast and edge strength. For such ATRs, an image complexity metric could be based on measurements of the mutual discernibility of image regions on various scales.
This paper: (1) reviews ATR algorithms in the public domain literature and investigates the common characteristics of both the algorithms and the imagery on which they operate; (2) shows that complexity measurement requires a complete segmentation of the image based on these commonalities; (3) presents a new method of scale-specific image segmentation that uses the mask-driven close-open transform, a novel implementation of a morphological operator; (4) reviews edge detection for discernibility measurement; (5) surveys image complexity metrics in the current literature and discusses their limitations; (6) proposes a new local feature discernibility metric based on relative contrast and edge strength; (7) derives a new global image complexity metric based on the probability distribution of local metrics; (8) compares the metric to the output of a specific ATR; and (9) makes suggestions for further work.
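The scale selectivity of a grey-scale close-open filter, the building block behind item (3), can be illustrated in one dimension (a SciPy sketch of the plain close-open, not the mask-driven variant introduced in the paper): features at least as large as the structuring element survive, while smaller ones are removed.

```python
import numpy as np
from scipy import ndimage

# Toy FLIR-like scan line: a wide bright target plus one-pixel clutter.
signal = np.zeros(30)
signal[10:18] = 5.0   # target-sized bright region
signal[25] = 5.0      # sub-scale bright clutter

# Grey-scale closing (fill small dark gaps) followed by grey-scale
# opening (remove small bright spikes) at a chosen scale: structures
# narrower than the structuring element are smoothed away, leaving a
# scale-specific view of the image.
size = 3
smoothed = ndimage.grey_opening(
    ndimage.grey_closing(signal, size=size), size=size)
```

Applying this at several structuring-element sizes yields the scale-specific segmentations on which region discernibility is then measured.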
|
419 |
Probabilistic frameworks for single view reconstruction using shape priors. Chen, Yu. January 2012 (has links)
No description available.
|
420 |
Texture ambiguity and occlusion in live 3D reconstruction. McIlroy, Paul Malcolm. January 2013 (has links)
No description available.
|