
Hybrid scene characterisation applied to natural images

Sakellariou, Georgios January 2012 (has links)
In this thesis, a combination of skeletonisation and graph matching techniques, coupled with a blend of supervised and unsupervised learning methodology, is applied to the task of characterising and classifying natural shapes. A novel navigation-based skeletonisation algorithm is used to gather low-level structural and morphological information about the shape. Subsequently, the data are converted into a series of attributed graphs, which characterise the image. Graphs of the same type can then be compared using an approximate graph matcher, which identifies a degree of similarity between them. Each degree of similarity corresponds to a data point in a conceptual space (as defined by Gärdenfors). The proposed method is applied to two distinct problems: the classification of leaf types, and the characterisation of river networks. The classification and characterisation systems are tested on a database of images of leaves and a collection of satellite images respectively. The novel navigation-based skeletonisation algorithm features several advantages: first, it allows the collection of topological and morphological information on the fly, which eliminates the need for any post-processing of the extracted skeletons. In addition, the adaptation of the algorithm to suit different applications is facilitated by the fact that any sort of morphological information can be included without alterations to the function of the algorithm. The conversion of the skeletons to attributed graphs is simplified by the existence of structural and morphological flags in the skeletal points. Lastly, concepts are created in the resulting conceptual space by means of a best-guess approach as well as a mechanism for accommodating external user input.
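As a rough illustration of the attributed-graph idea (not the thesis implementation), the sketch below builds a graph from hypothetical skeleton branch points and segments with networkx, and scores similarity with a crude signature comparison standing in for the approximate graph matcher; all attribute names and the similarity measure are illustrative assumptions.

```python
# Minimal sketch: shape skeleton -> attributed graph -> approximate similarity.
import networkx as nx
import numpy as np

def skeleton_to_graph(branch_points, segments):
    """branch_points: {id: (x, y)}; segments: [(id_a, id_b, length, curvature)]."""
    g = nx.Graph()
    for node_id, (x, y) in branch_points.items():
        g.add_node(node_id, x=x, y=y)
    for a, b, length, curvature in segments:
        g.add_edge(a, b, length=length, curvature=curvature)
    return g

def approximate_similarity(g1, g2):
    """Crude stand-in for an approximate graph matcher: compares padded degree
    histograms and total edge length, mapped to a score in (0, 1]."""
    def signature(g):
        degrees = sorted(d for _, d in g.degree())[:10]
        degrees = degrees + [0] * (10 - len(degrees))
        total_len = sum(data["length"] for _, _, data in g.edges(data=True))
        return np.array(degrees + [total_len], dtype=float)
    return 1.0 / (1.0 + np.linalg.norm(signature(g1) - signature(g2)))
```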

Calculating the curvature shape characteristics of the human body from 3D scanner data

Douros, I. January 2004 (has links)
In recent years, there have been significant advances in the development and manufacturing of 3D scanners capable of capturing detailed (external) images of whole human bodies. Such hardware offers the opportunity to collect information that could be used to describe, interpret and analyse the shape of the human body for a variety of applications where shape information plays a vital role (e.g. apparel sizing and customisation; medical research in fields such as nutrition, obesity/anorexia and perceptive psychology; ergonomics for vehicle and furniture design). However, the representations delivered by such hardware typically consist of unstructured or partially structured point clouds, whereas it would be desirable to have models that allow shape-related information to be more immediately accessible. This thesis describes a method of extracting the differential geometry properties of the body surface from unorganised point cloud datasets. In effect, this is a way of constructing curvature maps that allows the detection on the surface of features that are deformable (such as ridges) rather than reformable under certain transformations. Such features could subsequently be used to interpret the topology of a human body and to enable classification according to its shape, rather than its size (as is currently the standard practice for many of the applications concerned). The background, motivation and significance of this research are presented in chapter one. Chapter two is a literature review describing the previous and current attempts to model 3D objects in general and human bodies in particular, as well as the mathematical and technical issues associated with the modelling. Chapter three presents an overview of: the methodology employed throughout the research; the assumptions regarding the data to be processed; and the strategy for evaluating the results for each stage of the methodology. Chapter four describes an algorithm (and some variations) for approximating the local surface geometry around a given point of the input data set by means of a least-squares minimization. The output of such an algorithm is a surface patch described in an analytic (implicit) form. This is necessary for the next step described below. The case is made for using implicit surfaces rather than more popular 3D surface representations such as parametric forms or height functions. Chapter five describes the processing needed for calculating curvature-related characteristics for each point of the input surface. This utilises the implicit surface patches generated by the algorithm described in the previous chapter, and enables the construction of a "curvature map" of the original surface, which incorporates rich information such as the principal curvatures, shape indices and curvature directions. Chapter six describes a family of algorithms for calculating features such as ridges and umbilic points on the surface from the curvature map, in a manner that bypasses the problem of separating a vector field (i.e. the principal curvature directions) across the entire surface of an object. An alternative approach, using the focal surface information, is also considered briefly in comparison. The concluding chapter summarises the results from all steps of the processing and evaluates them in relation to the requirements set in chapter one. Directions for further research are also proposed.
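A minimal sketch of the local least-squares fitting and curvature computation described in chapters four and five, with one simplification: it fits a local height-function quadric rather than the implicit patches argued for in the thesis, so it illustrates the curvature-map idea, not the actual method.

```python
import numpy as np

def local_curvature(neighbourhood):
    """neighbourhood: (n, 3) points in a local frame whose z-axis approximates
    the surface normal at the query point (assumed to lie at the origin)."""
    x, y, z = neighbourhood[:, 0], neighbourhood[:, 1], neighbourhood[:, 2]
    # Least-squares fit of z ~ a*x^2 + b*x*y + c*y^2 + d*x + e*y + f.
    A = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    (a, b, c, d, e, f), *_ = np.linalg.lstsq(A, z, rcond=None)
    # Derivatives of the fitted patch at the origin.
    fx, fy = d, e
    fxx, fxy, fyy = 2 * a, b, 2 * c
    denom = 1.0 + fx**2 + fy**2
    K = (fxx * fyy - fxy**2) / denom**2                      # Gaussian curvature
    H = ((1 + fy**2) * fxx - 2 * fx * fy * fxy + (1 + fx**2) * fyy) / (2 * denom**1.5)
    disc = max(H**2 - K, 0.0) ** 0.5
    return H + disc, H - disc                                # principal curvatures k1 >= k2
```

A curvature map would then collect these principal curvatures (and quantities derived from them, such as shape indices) over every point of the scanned surface.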

3D hand pose regression with variants of decision forests

Tang, Danhang January 2015 (has links)
3D hand pose regression is a fundamental component in many modern human-computer interaction applications such as sign language recognition, virtual object manipulation and game control. This thesis focuses on 3D pose regression for a single hand from depth data. The problem has many challenges, including high degrees of freedom, severe viewpoint changes, self-occlusion and sensor noise. The main contribution of this work is a series of decision forest-based methods proposed in a progressive manner, each improving upon the previous, with state-of-the-art performance achieved in the end. The thesis first introduces a novel algorithm called the semi-supervised transductive regression (STR) forest, which combines transductive learning and semi-supervised learning to bridge the gap between synthetically generated, noise-free training data and real noisy data. Moreover, it incorporates a coarse-to-fine training quality function to handle viewpoint changes in a more efficient manner. As a patch-based method, the STR forest has high complexity during inference. To handle that, this thesis proposes the latent regression forest (LRF), a method that models the pose estimation problem as a coarse-to-fine search. This inherently combines the efficiency of a holistic method and the flexibility of a patch-based method, and thus achieves 62.5 FPS without CPU/GPU optimisation. Targeting the drawbacks of the LRF, a new algorithm called hierarchical sampling forests is proposed to model this problem as a progressive search guided by kinematic structure. Hence the intermediate results (partial poses) can be verified by a new efficient energy function, and consequently more accurate full poses can be produced. All these methods are thoroughly described, compared and published. In the conclusion, we discuss and analyse their differences, limitations and usage scenarios, and then propose a few ideas for future work.
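The decision-forest building block that all three variants extend can be sketched with an off-the-shelf regression forest; the feature dimensions, joint count and data below are placeholders, and real pipelines aggregate per-patch votes (e.g. with mean-shift) rather than using them directly.

```python
# Minimal sketch (not the thesis algorithms): a plain regression forest mapping
# depth-patch features to a full-hand pose vector.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_patches, n_features, n_joints = 2000, 64, 21

# Placeholder data: depth-difference features and (x, y, z) per joint.
X = rng.normal(size=(n_patches, n_features))
Y = rng.normal(size=(n_patches, n_joints * 3))

forest = RandomForestRegressor(n_estimators=20, max_depth=12, random_state=0)
forest.fit(X, Y)

pose_votes = forest.predict(X[:5])
print(pose_votes.shape)  # (5, 63): one pose vote per query patch
```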

View-dependent representation of shape and appearance from multiple view video

Volino, Marco January 2016 (has links)
Over the past decade, markerless performance capture, through multiple synchronised cameras, has emerged as an alternative to traditional motion capture techniques, allowing the simultaneous acquisition of shape, motion and appearance. This technology is capable of capturing the subtle details of human motion, e.g. clothing, skin and hair dynamics, which cannot be achieved through current marker-based capture techniques. Markerless performance capture has the potential to revolutionise digital content creation in many creative industries, but must overcome several hurdles before it can be seen as a practical mainstream technology. One limitation of the technology is the enormous size of the generated data. This thesis addresses issues surrounding compact appearance representation of virtual characters generated through markerless performance capture, optimisation of the underlying 3D geometry and delivery of interactive content over the internet. Current approaches to multiple camera texture representation effectively reduce the storage requirements by discarding huge amounts of view-dependent and dynamic appearance information. This information is important for reproducing the realism of the captured multiple view video. The first contribution of this thesis introduces a novel multiple layer texture representation (MLTR) for multiple view video. The MLTR preserves dynamic, view-dependent appearance information by resampling the captured frames into a hierarchical set of texture maps ordered by surface visibility. The MLTR also enables computationally efficient view-dependent rendering by pre-computing visibility testing and reduces projective texturing to a simple texture lookup. The representation is quantitatively evaluated and shown to reduce the storage cost by over 90% without a significant effect on visual quality. The second contribution outlines the ideal properties for the optimal representation of 4D video and takes steps towards achieving this goal. Using the MLTR, spatial and temporal consistency is enforced using a Markov random field framework, allowing video compression algorithms to make further storage reductions through increased spatial and temporal redundancies. An optical flow-based multiple camera alignment method is also introduced to reduce visual artefacts, such as blurring and ghosting, that are caused by approximate geometry and camera calibration errors. This results in clearer and sharper textures with a lower storage footprint. In order to facilitate high-quality free-viewpoint rendering, two shape optimisation methods are proposed. The first combines the strengths of the visual hull, multiple view stereo and temporally consistent geometry to match visually important features using a non-rigid iterative closest point method. The second is based on a bundle adjustment formulation which jointly refines shape and calibration. While these methods achieve the objective of enhancing the geometry and/or camera calibration parameters, further research is required to improve the resulting shape. Finally, it is shown how the methods developed in this thesis could be used to deliver interactive 4D video to consumers via a WebGL-enabled internet browser, e.g. Firefox or Chrome. Existing methods for parametric motion graphs are adapted and combined with an efficient WebGL renderer to allow interactive 4D character delivery over the internet. This demonstrates for the first time that 4D video has the potential to provide interactive content via the internet, which opens this technology up to the widest possible audience.
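A minimal sketch of the layered, visibility-ordered resampling behind the MLTR idea (not the thesis pipeline); the array shapes and the visibility score are assumptions, and the view-dependent blend is reduced to a weighted texture lookup.

```python
import numpy as np

def build_layers(colour_samples, visibility, n_layers=4):
    """colour_samples: (n_cams, H, W, 3); visibility: (n_cams, H, W) score per
    camera per texel (e.g. cosine of angle between view ray and surface normal)."""
    order = np.argsort(-visibility, axis=0)                 # best camera first
    layers = np.take_along_axis(colour_samples, order[..., None], axis=0)
    weights = np.take_along_axis(visibility, order, axis=0)
    return layers[:n_layers], weights[:n_layers]            # keep top layers only

def render_texel_map(layers, weights, view_weights):
    """Blend the texture layers with view-dependent per-layer weights."""
    w = weights * view_weights[:, None, None]
    w = w / np.clip(w.sum(axis=0, keepdims=True), 1e-8, None)
    return (layers * w[..., None]).sum(axis=0)              # (H, W, 3) image
```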

Image watermarking based on Shearlet transform

Ahmaderaghi, Baharak January 2016 (has links)
This thesis is dedicated to investigating image watermarking techniques based on the recently proposed transform called the 'Shearlet' as the watermark embedding domain. The aim is to obtain new levels of imperceptibility and robustness, which lead to higher data hiding capacity. With this idea in mind, new image watermarking algorithms in the Discrete Shearlet Transform (DST) domain are developed. First, combined with state-of-the-art spread spectrum embedding methodology, a new watermarking algorithm using the DST is designed in order to obtain better performance. The system was tested using five common types of image attack. The results indicated that a combination of the DST and spread spectrum embedding was more robust and more imperceptible compared with two well-known watermarking systems based on the DCT and DWT domains, using the same embedding strategy. Second, a new perceptual image watermarking scheme using the discrete Shearlet transform was developed by adapting a spatial visual model to the structure of the DST decompositions. The system performance was compared, under the same conditions and using the same embedding and extraction strategy, with two watermarking systems based on the DWT and DTCWT domains. Experimental results show the proposed method's efficiency by having higher imperceptibility and capacity and, at the same time, being more robust against some of the attacks. Finally, a Shearlet transform-based framework is proposed for blind watermarking, improving upon the other transform-based methods. In order to develop this blind watermarking system, statistical modelling was applied to describe the behaviour of discrete Shearlet transform coefficients across different sub-bands and resolutions. In order to investigate system performance, the obtained results are compared with watermarking systems based on the DWT using the same embedding strategy. The overall results indicate improvements in performance and support the claims made during this research about the use of the Discrete Shearlet Transform as a new embedding domain.
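For readers unfamiliar with spread-spectrum embedding, the sketch below shows the generic additive scheme in a transform domain, using a 2D DCT as a stand-in for the discrete Shearlet transform (which is not part of standard Python libraries); detection is non-blind here for simplicity, unlike the blind framework proposed in the thesis.

```python
# Minimal sketch: additive spread-spectrum watermarking in a transform domain.
import numpy as np
from scipy.fftpack import dct, idct

def dct2(x):
    return dct(dct(x, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(x):
    return idct(idct(x, axis=1, norm="ortho"), axis=0, norm="ortho")

def embed(image, bits, alpha=2.0, seed=7):
    """Add one pseudo-random pattern per bit, signed by the bit value."""
    coeffs = dct2(image.astype(float))
    rng = np.random.default_rng(seed)
    patterns = rng.choice([-1.0, 1.0], size=(len(bits),) + image.shape)
    for bit, pattern in zip(bits, patterns):
        coeffs += alpha * (1.0 if bit else -1.0) * pattern
    return idct2(coeffs)

def detect(watermarked, original, n_bits, seed=7):
    """Non-blind detection: correlate the coefficient difference with each pattern."""
    diff = dct2(watermarked.astype(float)) - dct2(original.astype(float))
    rng = np.random.default_rng(seed)
    patterns = rng.choice([-1.0, 1.0], size=(n_bits,) + diff.shape)
    return [bool((diff * pattern).sum() > 0) for pattern in patterns]
```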

Dictionary learning for scalable sparse image representation

Begovic, Bojana January 2016 (has links)
The modern era of signal processing has developed many technical tools for recording and processing large and growing amounts of data, together with algorithms specialised for data analysis. This gives rise to new challenges in terms of data processing and the modelling of data representations. Fields ranging from the experimental sciences, astronomy and computer vision to neuroscience and mobile networks are all in constant search of scalable and efficient data processing tools that would enable more effective analysis of continuous video streams containing millions of pixels. Therefore, the question of digital signal representation is still of high importance, despite the fact that it has been the topic of a significant amount of work in the past. Moreover, developing new data processing methods also affects the quality of everyday life, where devices such as the CCD sensors in digital cameras or cell phones are intensively used for entertainment purposes. Specifically, one of these novel processing tools is signal sparse coding, which represents signals as linear combinations of a few representational basis vectors, i.e. atoms, from an overcomplete dictionary. Applications that employ sparse representation are many, such as denoising, compression, regularisation in inverse problems and feature extraction. In this thesis we introduce and study a particular signal representation denoted as scalable sparse coding. It is based on a novel design for the dictionary learning algorithm, which has proven to be effective for scalable sparse representation of many modalities such as high-motion video sequences and natural and solar images. The proposed algorithm is built upon the foundation of the K-SVD framework, originally designed to learn non-scalable dictionaries for natural images. The scalable dictionary learning design is mainly motivated by the perceptual characteristics of the Human Visual System (HVS). Specifically, its core structure relies on the exploitation of spatial high-frequency image components and contrast variations in order to achieve identification of visual scene objects at all scalable levels. The implementation of HVS properties is carried out by introducing a semi-random Morphological Component Analysis (MCA) based initialisation of the scalable dictionary and regularisation of its atom update mechanism. Subsequently, this enables scalable sparse image reconstruction. In general, dictionary learning for sparse representations leads to state-of-the-art image restoration results for several different problems in the field of image processing. Experiments in this thesis show that these are equally achievable by accommodating all dictionary elements to tailor the scalable data representation and reconstruction, hence modelling data that admit sparse representation in a novel manner. Furthermore, the achieved results demonstrate and validate the practicality of the proposed scheme, making it a promising candidate for many practical applications involving time-scalable display, denoising and scalable compressive sensing (CS). The performed simulations include scalable sparse recovery for the representation of static and dynamic data changing over time, such as video sequences and natural images. Lastly, we contribute novel approaches for scalable denoising and contrast enhancement (CE), applied to solar images corrupted with pixel-dependent Poisson noise and zero-mean additive white Gaussian noise. Given that solar data contain noise introduced by charge-coupled devices within the on-board acquisition system, these artefacts have to be removed prior to image analysis. Thus, novel image denoising and contrast enhancement methods are necessary for the preprocessing of solar images.
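A minimal sketch of one (non-scalable) K-SVD-style iteration, the framework the scalable design builds on: sparse coding with OMP followed by a rank-1 SVD update of each atom. Matrix shapes and parameters are illustrative only.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def ksvd_iteration(Y, D, n_nonzero=5):
    """Y: (d, n) training signals; D: (d, K) dictionary with unit-norm atoms."""
    X = orthogonal_mp(D, Y, n_nonzero_coefs=n_nonzero)       # sparse codes, (K, n)
    for k in range(D.shape[1]):
        users = np.flatnonzero(X[k])                         # signals using atom k
        if users.size == 0:
            continue
        # Residual restricted to those signals, with atom k's contribution removed.
        E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]                                    # updated unit-norm atom
        X[k, users] = s[0] * Vt[0]                           # updated coefficients
    return D, X
```

Repeating this iteration to convergence recovers the standard K-SVD behaviour; the scalable variant described above additionally constrains initialisation and atom updates to respect scale structure.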

Sparse analysis model based dictionary learning and signal reconstruction

Dong, Jing January 2016 (has links)
Sparse representation has been studied extensively in the past decade in a variety of applications, such as denoising, source separation and classification. Earlier efforts focused on the well-known synthesis model, where a signal is decomposed as a linear combination of a few atoms of a dictionary. However, the analysis model, a counterpart of the synthesis model, did not receive much attention until recent years. The analysis model takes a different viewpoint on sparse representation: it assumes that the product of an analysis dictionary and a signal is sparse. Compared with the synthesis model, this model tends to be more expressive in representing signals, as a much richer union of subspaces can be described. This thesis focuses on the analysis model and aims to address its two main challenges: analysis dictionary learning (ADL) and signal reconstruction. In the ADL problem, the dictionary is learned from a set of training samples so that the signals can be represented sparsely based on the analysis model, thus offering the potential to fit the signals better than pre-defined dictionaries. Among the existing ADL algorithms, such as the well-known Analysis K-SVD, the dictionary atoms are updated sequentially. The first part of this thesis presents two novel analysis dictionary learning algorithms that update the atoms simultaneously. Specifically, the Analysis Simultaneous Codeword Optimization (Analysis SimCO) algorithm is proposed by adapting the SimCO algorithm, which was originally proposed for the synthesis model. In Analysis SimCO, the dictionary is updated using optimization on manifolds, under ℓ2-norm constraints on the dictionary atoms. This framework allows multiple dictionary atoms to be updated simultaneously in each iteration. However, similar to the existing ADL algorithms, the dictionary learned by Analysis SimCO may contain similar atoms. To address this issue, Incoherent Analysis SimCO is proposed by employing a coherence constraint and introducing a decorrelation step to enforce this constraint. The competitive performance of the proposed algorithms is demonstrated in experiments on recovering synthetic dictionaries and removing additive noise in images, as compared with existing ADL methods. The second part of this thesis studies how to reconstruct signals with learned dictionaries under the analysis model. This is demonstrated by a challenging application problem: multiplicative noise removal (MNR) in images. In existing sparsity-motivated methods, the MNR problem is addressed using pre-defined dictionaries, or learned dictionaries based on the synthesis model. However, the potential of analysis dictionary learning for the MNR problem has not been investigated. In this thesis, analysis dictionary learning is applied to MNR, leading to two new algorithms. In the first algorithm, a dictionary learned based on the analysis model is employed to form a regularization term, which can preserve image details while removing multiplicative noise. In the second algorithm, in order to further improve the recovery quality of smooth areas in images, a smoothness regularizer is introduced to the reconstruction formulation. This regularizer can be seen as an enhanced Total Variation (TV) term with an additional parameter controlling the level of smoothness. To address the optimization problem of this model, the Alternating Direction Method of Multipliers (ADMM) is adapted and a relaxation technique is developed to allow variables to be updated flexibly. Experimental results show the superior performance of the proposed algorithms as compared with three sparsity- or TV-based algorithms for a range of noise levels.
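As an illustration of how a learned analysis dictionary enters reconstruction, the sketch below solves a small analysis-model denoising problem, min_x 0.5||x - y||^2 + lam*||Omega x||_1, with ADMM; this is a generic toy formulation with additive noise, not the thesis's multiplicative-noise model or relaxation scheme.

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def analysis_denoise(y, Omega, lam=0.1, rho=1.0, n_iter=100):
    """ADMM with splitting z = Omega @ x; suitable for small dense problems."""
    n = y.shape[0]
    x = y.copy()
    z = Omega @ x
    u = np.zeros_like(z)
    A = np.eye(n) + rho * Omega.T @ Omega      # x-update system matrix
    for _ in range(n_iter):
        x = np.linalg.solve(A, y + rho * Omega.T @ (z - u))
        z = soft(Omega @ x + u, lam / rho)     # enforce sparsity of Omega @ x
        u = u + Omega @ x - z                  # dual update
    return x
```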

Appearance-based tracking of non-rigid objects

Leung, Po January 2008 (has links)
No description available.

A study of fundamental graphics software

Botting, R. J. January 1970 (has links)
No description available.

Spatially structured cognitive models of semantic information : the implications for computerised databases

Collins, Julie January 2006 (has links)
Existing theories of semantic cognition propose models of cognitive processing occurring in a conceptual space, where 'meaning' is derived from the spatial relationships between concepts' mapped locations within the space. Information visualisation is a growing area of research within the field of information retrieval, and methods for presenting database contents visually in the form of spatial data management systems (SDMSs) are being developed. This thesis combined these two areas of research to investigate the benefits associated with employing spatial-semantic mapping (documents represented as objects in two- and three-dimensional virtual environments are proximally mapped dependent on the semantic similarity of their content) as a tool for improving retrieval performance and navigational efficiency when browsing for information within such systems. Positive effects associated with the quality of document mapping were observed; improved retrieval performance and browsing behaviour were witnessed when mapping was optimal. It was also shown that using a third dimension for virtual environment (VE) presentation provides sufficient additional information regarding the semantic structure of the environment that performance is increased in comparison to using two dimensions for mapping. A model that describes the relationship between retrieval performance and browsing behaviour was proposed on the basis of these findings. Individual differences were not found to have any observable influence on retrieval performance or browsing behaviour when mapping quality was good. The findings from this work have implications both for cognitive modelling of semantic information and for designing and testing information visualisation systems. These implications are discussed in the conclusions of this work.
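A minimal sketch of spatial-semantic mapping in the spirit described above (not the systems studied in the thesis): TF-IDF document vectors are embedded in three dimensions with multidimensional scaling so that semantically similar documents land close together; the documents and parameters are toy examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import cosine_distances

docs = [
    "river networks extracted from satellite images",
    "leaf shape classification with skeleton graphs",
    "semantic information retrieval in virtual environments",
]

tfidf = TfidfVectorizer().fit_transform(docs)        # document term vectors
distances = cosine_distances(tfidf)                  # semantic dissimilarity
coords = MDS(n_components=3, dissimilarity="precomputed",
             random_state=0).fit_transform(distances)
print(coords)                                        # one 3D position per document
```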
