Global ETD Search

21	Continuous regression : a functional regression approach to facial landmark tracking Sánchez Lozano, Enrique January 2017 Facial Landmark Tracking (Face Tracking) is a key step for many Face Analysis systems, such as Face Recognition, Facial Expression Recognition, or Age and Gender Recognition, among others. The goal of Facial Landmark Tracking is to locate a sparse set of points defining a facial shape in a video sequence. These typically include the mouth, the eyes, the contour, or the nose tip. The state-of-the-art method for Face Tracking builds on Cascaded Regression, in which a set of linear regressors is used in a cascaded fashion, each receiving as input the output of the previous one and successively reducing the error with respect to the target locations. Despite its impressive results, Cascaded Regression suffers from several drawbacks, which are caused by the theoretical and practical implications of using Linear Regression. In the context of Face Alignment, Linear Regression is used to predict shape displacements from image features through a linear mapping. This linear mapping is learnt through the typical least-squares problem, in which a set of random perturbations is given. This means that each time a new regressor is to be trained, Cascaded Regression needs to generate perturbations and apply the sampling again. Moreover, existing solutions are not capable of incorporating incremental learning in real time. It is well known that person-specific models perform better than generic ones, and thus the possibility of personalising generic models whilst tracking is ongoing is a desirable property that has yet to be addressed. This thesis proposes Continuous Regression, a Functional Regression solution to the least-squares problem, resulting in the first real-time incremental face tracker. 
Briefly speaking, Continuous Regression approximates the samples by an estimation based on a first-order Taylor expansion, yielding a closed-form solution for the infinite set of shape displacements. This way, it is possible to model the space of shape displacements as a continuum, without the need for complex bases. Further, this thesis introduces a novel measure that allows Continuous Regression to be extended to spaces of correlated variables. This novel solution is incorporated into the Cascaded Regression framework, and its computational benefits for training under different configurations are shown. Then, it presents an approach for incremental learning within Cascaded Regression, and shows that its complexity allows for real-time implementation. To the best of my knowledge, this is the first incremental face tracker that is shown to operate in real time. The tracker is tested on an extensive benchmark, attaining state-of-the-art results thanks to the incremental learning capabilities. 006.4
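The sampling-based Cascaded Regression that the abstract above takes as its starting point can be illustrated with a minimal one-dimensional sketch. This is not the thesis's Continuous Regression (which replaces the sampling with a closed-form solution); it only shows the baseline idea of training each linear regressor by least squares on perturbations and feeding its output to the next stage. The function `toy_feature` is a hypothetical stand-in for image features, and all names and parameter values are assumptions made for illustration.

```python
import math

def fit_linear(features, targets):
    """Least-squares fit of targets ~ slope * feature + bias (1-D case)."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    var = sum((f - mean_f) ** 2 for f in features)
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    slope = cov / var if var > 0 else 0.0
    return slope, mean_t - slope * mean_f

def toy_feature(displacement):
    # Hypothetical stand-in for image features: a saturating,
    # non-linear function of the remaining shape displacement.
    return math.tanh(displacement)

def mean_squared(values):
    return sum(v * v for v in values) / len(values)

def train_cascade(displacements, n_stages=4):
    """Train one linear regressor per stage on the residual displacements."""
    remaining = list(displacements)
    regressors = []
    mse_per_stage = [mean_squared(remaining)]
    for _ in range(n_stages):
        feats = [toy_feature(d) for d in remaining]
        slope, bias = fit_linear(feats, remaining)
        regressors.append((slope, bias))
        # Each stage receives the output of the previous one:
        # apply the predicted update and keep what is left to correct.
        remaining = [d - (slope * f + bias) for d, f in zip(remaining, feats)]
        mse_per_stage.append(mean_squared(remaining))
    return regressors, mse_per_stage

# Sampled perturbations around the target location (assumed range).
perturbations = [-2.0 + 0.1 * i for i in range(41)]
regressors, mse_per_stage = train_cascade(perturbations)
```

In this toy setting `mse_per_stage` never increases from one stage to the next, because each least-squares fit could always fall back to predicting no update. The need to regenerate and resample perturbations for every stage is the cost that the abstract says Continuous Regression avoids.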
22	Automatic image annotation applied to habitat classification Torres Torres, Mercedes January 2015 Habitat classification, the process of mapping a site with its habitats, is a crucial activity for monitoring environmental biodiversity. Phase 1 classification, a 10-class four-tier hierarchical scheme, is the most widely used scheme in the UK. Currently, no automatic approaches have been developed and its classification is carried out exclusively by ecologists. This manual approach using surveyors is laborious, expensive and subjective. This thesis presents the first automatic system for Phase 1 classification. Our main contribution is an Automatic Image Annotation (AIA) framework for the automatic classification of Phase 1 habitats. This framework combines five elements to annotate unseen photographs: ground-taken geo-referenced photography, low-level visual features, medium-level semantic information, Random Projection Forests and location-based weighted predictions. Our second contribution consists of two fully-annotated ground-taken photograph datasets, the first publicly available databases specifically designed for the development of multimedia analysis techniques for ecological applications. Habitat 1K has over 1,000 photographs and 4,000 annotated habitats and Habitat 3K has over 3,000 images and 11,000 annotated habitats. This is the first time ground-taken photographs have been used for such ecological purposes. Our third contribution is a novel Random Forest-based classifier: Random Projection Forests (RPF). RPFs use Random Projections as a dimensionality reduction mechanism in their split nodes. This new design makes their training and testing phases more efficient than those of the traditional implementation of Random Forests. Our fourth contribution arises from the limitations that low-level features have when classifying visually similar classes. 
Low-level features have been proven to be inadequate for discriminating high-level semantic concepts, such as habitat classes. Currently, only humans possess such high-level knowledge. In order to obtain this knowledge, we create a new type of feature, called medium-level features, which use a Human-In-The-Loop approach to extract crucial semantic information. Our final contribution is a location-based voting system for RPFs. We benefit from the geographical properties of habitats to weight the predictions from the RPFs according to the geographical distance between unseen test photographs and photographs in the training set. Results show that ground-taken photographs are a promising source of information that can be successfully applied to Phase 1 classification. Experiments demonstrate that our AIA approach outperforms traditional Random Forests in terms of recall and precision. Moreover, both our modifications, the inclusion of medium-level knowledge and a location-based voting system, greatly improve recall and precision for even the most complex habitats. This makes our complete image-annotation system, to the best of our knowledge, the most accurate automatic alternative to manual habitat classification for the complete categorization of Phase 1 habitats. 333.95
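The location-based voting described in the abstract above can be sketched as a distance-weighted vote. The abstract does not state the weighting function, so the exponential decay, the `scale_km` parameter, the planar coordinates in kilometres, and the class labels below are all assumptions chosen for illustration rather than the thesis's actual scheme.

```python
import math
from collections import defaultdict

def weighted_vote(test_xy, votes, scale_km=1.0):
    """Combine (label, training-photo location) votes, down-weighting
    votes that come from training photographs far from the test photograph.

    The exponential decay is a hypothetical choice of weighting function.
    """
    scores = defaultdict(float)
    for label, train_xy in votes:
        distance = math.dist(test_xy, train_xy)  # planar distance, in km
        scores[label] += math.exp(-distance / scale_km)
    winner = max(scores, key=scores.get)
    return winner, dict(scores)

# Two distant votes for one class and one nearby vote for another.
votes = [
    ("woodland", (0.2, 0.1)),
    ("grassland", (5.0, 5.0)),
    ("grassland", (6.0, 4.0)),
]
label, scores = weighted_vote((0.0, 0.0), votes)
```

Here an unweighted majority vote would pick `grassland`, whereas the distance weighting favours the single nearby `woodland` vote, which is the kind of geographical prior the abstract describes.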
23	Motion correction in high-field MRI Sulikowska, Aleksandra January 2016 The work described in this thesis was conducted at the University of Nottingham in the Sir Peter Mansfield Imaging Centre, between September 2011 and 2014. Subject motion in high-resolution magnetic resonance imaging (MRI) is a major source of image artefacts. It is a very complex problem, due to the variety of physical motion types, imaging techniques, and k-space trajectories. Many techniques have been proposed over the years to correct images for motion, all looking for the best practical solution in clinical scanning, which would give cost-effective, robust and high-accuracy correction, without decreasing patient comfort or prolonging the scan time. Moreover, if the susceptibility-induced field changes due to head rotation are large enough, they will compromise motion correction methods. In this work a method for prospective correction of head motion for MR brain imaging at 7 T is proposed. It employs innovative NMR tracking devices not previously presented in the literature. The device presented in this thesis is characterized by a high accuracy of position measurements (0.06 ± 0.04 mm), is considered very practical, and stands a chance of being used in routine imaging in the future. This study also investigated the significance of the field changes induced by the susceptibility in the human brain due to small head rotations (±10 deg). The size and location of these field changes were characterized, and then the effects of the changes on the image were simulated. The results have shown that the field shift may be as large as |-18.3| Hz/deg. For a standard Gradient Echo sequence at 7 T and a typical head movement, the simulated image distortions were on average equal to 0.5%, and not larger than 15% of the brightest voxel. This is not likely to compromise motion correction, but may be significant in some imaging sequences. 538
24	A GPU parallel approach improving the density of patch based multi-view stereo reconstruction Haines, Benjamin A. January 2016 Multi-view stereo is the process of recreating three-dimensional data from a set of two or more images of a scene. The ability to acquire 3D data from 2D images is a core concept in computer vision with wide-ranging applications throughout areas such as 3D printing, robotics, recognition, navigation and a vast number of other fields. While 3D reconstruction has been increasingly well studied over the past decades, it is only with the recent evolution of CPU and GPU technologies that practical implementations, able to accurately, robustly and efficiently capture 3D data of photographed objects, have begun to emerge. Whilst current research has been shown to perform well under specific circumstances and for a subset of objects, there are still many practical and implementation issues that remain open problems for these techniques, most notably the ability to robustly reconstruct objects from sparse image sets or objects with low texture. Alongside a review of algorithms within the multi-view field, the work proposed in this thesis outlines a massively parallel patch-based multi-view stereo pipeline for static scene recovery. By utilising advances in GPU technology, a particle swarm algorithm implemented on the GPU forms the basis for improving the density of patch-based methods. The novelty of this approach is that it removes the reliance on feature matching and gradient descent, to better account for the optimisation of patches within textureless regions, with which current methods struggle. An enhancement to the photo-consistency matching metric, which is used to evaluate the optimisation of each patch, is then defined, specifically targeting the shortcomings of the photo-consistency metric when used inside a particle swarm optimisation and increasing its effectiveness over textureless areas. 
Finally, a multi-resolution reconstruction system based on a wavelet framework is presented to further improve the robustness of reconstruction over low-texture regions. 006.3
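As a rough illustration of the particle swarm optimisation mentioned in the abstract above, the following is a minimal serial CPU sketch, not the thesis's GPU pipeline. The cost function `photo_cost` is a hypothetical smooth stand-in for a photo-consistency score over two patch parameters, and the swarm size, iteration count, and coefficient values are assumed, commonly used defaults.

```python
import random

def photo_cost(depth, tilt):
    # Hypothetical stand-in for a photo-consistency cost over two patch
    # parameters; its minimum is at depth = 1.0, tilt = -2.0.
    return (depth - 1.0) ** 2 + (tilt + 2.0) ** 2

def particle_swarm(cost, n_particles=20, n_iters=100, seed=0,
                   inertia=0.7, c_personal=1.5, c_global=1.5):
    """Gradient-free minimisation of a 2-parameter cost with a particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # per-particle best positions
    best_cost = [cost(*p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # swarm-wide best
    history = [g_cost]
    for _ in range(n_iters):
        for i in range(n_particles):
            for k in range(2):
                # Pull each particle towards its own best and the swarm best.
                vel[i][k] = (inertia * vel[i][k]
                             + c_personal * rng.random() * (best_pos[i][k] - pos[i][k])
                             + c_global * rng.random() * (g_pos[k] - pos[i][k]))
                pos[i][k] += vel[i][k]
            c = cost(*pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
        history.append(g_cost)
    return g_pos, g_cost, history

best, best_value, history = particle_swarm(photo_cost)
```

Because the update uses only cost evaluations, no feature matches or gradients are needed, which is the property the abstract relies on for textureless regions. In a GPU setting the per-particle cost evaluations are the natural unit to run in parallel; this sketch evaluates them one after another.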
25	An investigation of automatic processing techniques for time-lapse microscope images Li, Yuexiang January 2016 The analysis of time-lapse microscope images has recently become a popular research topic. Processing techniques have been employed in such studies to extract important information about cells (e.g., cell number or alterations of cellular features) for various tasks. However, few studies provide acceptable results in practical applications because they cannot simultaneously solve the core challenges that are shared by most cell datasets: the image contrast is extremely low; the distribution of grey scale is non-uniform; images are noisy; the number of cells is large, etc. These factors also make manual processing an extremely laborious task. To improve the efficiency of related biological analyses and disease diagnoses, this thesis establishes a framework in these directions: a new segmentation method for cell images is designed as the foundation of an automatic approach for the measurement of cellular features. The newly proposed segmentation method achieves substantial improvements in the detection of cell filopodia. An automatic measuring mechanism for cell features is established in the designed framework. The measuring component enables the system to provide quantitative information about various cell features that are useful in biological research. A novel cell-tracking framework is constructed to monitor the alterations of cells with a cell-tracking accuracy above 90%. To address the issue of processing speed, two fast-processing techniques have been developed to complete edge detection and visual tracking. For edge detection, the new detector is a hybrid approach that is based on the Canny operator and fuzzy entropy theory. The method calculates the fuzzy entropy of gradients from an image to decide the threshold for the Canny operator. 
For visual tracking, a newly defined feature is employed in the fast-tracking mechanism to recognize different cell events, with a tracking accuracy of 97.66% and a processing speed of 0.578 s/frame. 621.381
