
Feature selection from higher order correlations

Zhang, Zhihong January 2012 (has links)
This thesis addresses problems in feature selection, focusing in particular on selecting features from higher order correlations. To this end, we present two supervised feature selection approaches, named Graph based Information-theoretic Feature Selection and Hypergraph based Information-theoretic Feature Selection respectively, which are capable of considering third or even higher order dependencies between the relevant features and of capturing the optimal size of the relevant feature subset. Furthermore, we develop two unsupervised feature selection methods which can evaluate features jointly rather than individually. In this case, larger feature combinations are considered. The reason for this is that although an individual feature may have limited relevance to a particular class, when taken in combination with other features it may be strongly relevant to the class. In Chapter 2, we thoroughly review the relevant literature on classifier-independent (filter-based) feature selection methods. One dominant direction of research in this area is exemplified by the so-called information-theoretic feature selection criteria, which measure the mutual dependence of two variables. Another influential direction is the graph-based feature selection methods, which select the features that best preserve the data similarity or a manifold structure derived from the entire feature set. We notice that most existing feature selection methods evaluate features individually or simply consider pairwise feature interaction, and hence cannot handle redundant features. Another shortcoming of existing feature selection methods is that most of them select features in a greedy way and do not provide a direct measure for judging whether to add additional features or not. To deal with this problem, they require the user to supply the number of selected features in advance. However, in real applications, it is hard to estimate the number of useful features before the feature selection process. This thesis addresses these weaknesses, and fills a gap in the literature on selecting features from higher order correlations. In Chapter 3 we propose a graph based information-theoretic approach to feature selection. There are three novel ingredients. First, by using mutual information (MI) as a pairwise feature similarity measure, we establish a novel feature graph framework which characterizes the informativeness between pairs of features. Secondly, we locate the relevant feature subset (RFS) from the feature graph by maximizing the features' average pairwise relevance. The RFS is expected to have little redundancy and very strong discriminating power. This strategy reduces the search space from the original feature set to the relatively smaller relevant feature subset, and thus enables efficient computation. Finally, based on the RFS, we evaluate the importance of the unselected features by using a new information-theoretic criterion referred to as the multidimensional interaction information (MII). The advantage of MII is that it can go beyond pairwise interaction and consider third or higher order feature interactions. As a result, we can evaluate features jointly, and thus avoid the redundancies that arise when features are evaluated individually. Experimental results demonstrate the effectiveness of our feature selection method on a number of standard data-sets.
In Chapter 4, we find that in some situations the graph representation of relational patterns can lead to a substantial loss of information. This is because, in real-world problems, objects and their features tend to exhibit multiple relationships rather than simple pairwise ones. This motivates us to establish a feature hypergraph (rather than a feature graph) to characterize the multiple relationships among features. We draw on recent work on hypergraph clustering to select the most informative feature subset (mIFS) from a set of objects using high-order (rather than pairwise) similarities. There are two novel ingredients. First, we use MII to measure the significance of different feature combinations with respect to the class labels. Secondly, we use hypergraph clustering to select the most informative feature subset (mIFS), which has both low redundancy and strong discriminating power. The advantage of MII is that it incorporates third or higher order feature interactions. The advantage of hypergraph clustering is that it extracts the most informative features, with the size of the most informative feature subset (mIFS) determined automatically. Experimental results demonstrate the effectiveness of our feature selection method on a number of standard data-sets. In addition to the supervised feature selection methods, we present two novel unsupervised feature selection methods in Chapter 5 and Chapter 6. Specifically, in Chapter 5 we propose a new two-step spectral regression technique for unsupervised feature selection. In the first step, we use kernel entropy component analysis (kECA) to transform the data into a lower-dimensional space so as to improve class separation. In the second step, we use ℓ1-norm regularization to select the features that best align with the data embedding resulting from kECA. The advantage of kECA is that the dimensionality-reducing data transformation maximally preserves entropy estimates for the input data whilst also best preserving the cluster structure of the data. Using ℓ1-norm regularization, we cast feature discriminant analysis into a regression framework which accommodates the correlations among features. As a result, we can evaluate joint feature combinations, rather than being confined to considering features individually. Experimental results demonstrate the effectiveness of our feature selection method on a number of standard face data-sets. In Chapter 6, by incorporating MII as a higher order similarity measure, we establish a novel hypergraph framework which is used for characterizing the multiple relationships within a set of samples (e.g. face samples under varying illumination conditions). Thus, the structural information latent in the data can be modeled more effectively. We then explore a strategy to select the discriminating feature subset on the basis of the hypergraph representation. The strategy is based on an unsupervised method which derives a hypergraph-embedding view of feature selection. We evaluate the strategy on a number of standard image datasets, and the results demonstrate the effectiveness of our feature selection method. We summarize the contributions of this thesis in Chapter 7, and analyze the developed methods. Finally, we give some suggestions for future work in feature selection.
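
As an illustration of the feature-graph construction described for Chapter 3, here is a minimal sketch (not the thesis's implementation): it builds a pairwise mutual-information feature graph over discretised features and scores each feature by its average pairwise relevance. The random data and the scikit-learn helper are assumptions used purely for illustration.

```python
# Minimal sketch: pairwise mutual-information feature graph and
# average pairwise relevance scores (illustrative, not the thesis's code).
import numpy as np
from sklearn.metrics import mutual_info_score

def mi_feature_graph(X_discrete):
    """Return a symmetric matrix W where W[i, j] is the mutual information
    between feature i and feature j (the edge weights of the feature graph)."""
    n_features = X_discrete.shape[1]
    W = np.zeros((n_features, n_features))
    for i in range(n_features):
        for j in range(i + 1, n_features):
            W[i, j] = W[j, i] = mutual_info_score(X_discrete[:, i], X_discrete[:, j])
    return W

def average_pairwise_relevance(W):
    """Score each feature by its average MI with all other features; a relevant
    feature subset can then be chosen by thresholding or ranking this score."""
    return W.sum(axis=1) / (W.shape[1] - 1)

# Example: rank the features of a small random discretised data set.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(200, 10))        # 200 samples, 10 discrete features
scores = average_pairwise_relevance(mi_feature_graph(X))
print(np.argsort(scores)[::-1])               # features ordered by relevance
```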

Spectral geometry for structural pattern recognition

El Ghawalby, Heyayda January 2011 (has links)
Graphs are used pervasively in computer science as representations of data with a network or relational structure, where the graph structure provides a flexible representation in which there is no fixed dimensionality for objects. However, the analysis of data in this form has proved an elusive problem; for instance, it suffers from a lack of robustness to structural noise. One way to circumvent this problem is to embed the nodes of a graph in a vector space and to study the properties of the point distribution that results from the embedding. This is a problem that arises in a number of areas, including manifold learning theory and graph-drawing. In this thesis, our first contribution is to investigate the heat kernel embedding as a route to computing geometric characterisations of graphs. The reason for turning to the heat kernel is that it encapsulates information concerning the distribution of path lengths, and hence node affinities, on the graph. The heat kernel of the graph is found by exponentiating the Laplacian eigensystem over time. The matrix of embedding co-ordinates for the nodes of the graph is obtained by performing a Young-Householder decomposition on the heat kernel. Once the embedding of the nodes is to hand, we proceed to characterise the graph in a geometric manner: we establish a graph characterisation based on differential geometry by computing sets of curvatures associated with the graph nodes, edges and triangular faces. The second contribution comes from the need to solve the problems that arise in the processing of noisy data over a graph. The principal difficulty of this task is how to preserve the geometrical structures existing in the initial data. Bringing together several distinct concepts that have received some independent recent attention in machine learning, we propose a framework to regularize real-valued or vector-valued functions on weighted graphs of arbitrary topology. The first of these concepts is drawn from spectral graph theory, which has been applied to a wide range of clustering and classification tasks over the last decades, taking into consideration the properties of the graph p-Laplacian as a nonlinear extension of the usual graph Laplacian. The second is the geometric point of view that comes from the heat kernel embedding of the graph into a manifold; in these techniques we use the geometry of the manifold by assuming that it has the structure of a Riemannian manifold. The third important conceptual framework comes from manifold regularization, which extends the classical framework of regularization in reproducing kernel Hilbert spaces to exploit the geometry of the embedded set of points. The proposed framework, based on the p-Laplacian operators, minimizes a weighted sum of two energy terms: a regularization term and an additional approximation term which helps to avoid the shrinkage effects produced during the regularization process. The data are structured by functions depending on data features, namely the curvature attributes associated with the geometric embedding of the graph. The third contribution is inspired by the concepts and techniques of the calculus of partial differential equations on graphs. We propose a new approach for embedding graphs on pseudo-Riemannian manifolds based on the wave kernel, which is the solution of the wave equation on the edges of a graph.
The eigensystem of the wave kernel is determined by the eigenvalues and eigenfunctions of the normalized adjacency matrix and can be used to solve the edge-based wave equation. By factorising the Gram matrix for the wave kernel, we determine the embedding co-ordinates of the nodes under the wave kernel. The techniques proposed throughout this thesis are investigated as a means of gauging the similarity of graphs. We experiment on sets of graphs representing the proximity of image features in different views of different objects in three different datasets, namely the York model house, COIL-20 and TOY databases.
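
To make the heat-kernel embedding concrete, the following is a minimal numpy sketch under the definitions given in the abstract: the heat kernel is obtained by exponentiating the Laplacian eigensystem over time, and a Young-Householder-style factorisation of the kernel yields the node co-ordinates. The use of the combinatorial Laplacian and the example graph are illustrative assumptions, not the thesis's exact setup.

```python
# Minimal sketch of the heat-kernel embedding of a graph (illustrative).
import numpy as np

def heat_kernel_embedding(A, t=1.0):
    """A: symmetric adjacency matrix of an undirected graph.
    Returns a matrix whose rows are the embedding co-ordinates of the nodes."""
    D = np.diag(A.sum(axis=1))
    L = D - A                                             # graph Laplacian
    evals, evecs = np.linalg.eigh(L)                      # L = Phi Lambda Phi^T
    H = evecs @ np.diag(np.exp(-t * evals)) @ evecs.T     # heat kernel H_t = exp(-t L)
    # Young-Householder: H_t = Y^T Y with Y = exp(-t Lambda / 2) Phi^T,
    # so the columns of Y (rows of Y.T) give node co-ordinates.
    Y = np.diag(np.exp(-0.5 * t * evals)) @ evecs.T
    return Y.T

# Example: embed a 4-node cycle graph at time t = 0.5.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(heat_kernel_embedding(A, t=0.5))
```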

A study of lawyers' information behaviour leading to the development of two methods for evaluating electronic resources

Makri, Stephann January 2009 (has links)
In this thesis we examine the information behaviour displayed by a broad cross-section of academic and practicing lawyers and feed our findings into the development of the Information Behaviour (IB) methods - two novel methods for evaluating the functionality and usability of electronic resources. We captured lawyers' information behaviour by conducting naturalistic observations, in which we asked participants to think aloud whilst using existing resources to 'find information required for their work.' Lawyers' information behaviours closely matched those observed in other disciplines by Ellis and others, serving to validate Ellis's existing model in the legal domain. Our findings also extend Ellis's model to include behaviours pertinent to legal information-seeking, broaden the scope of the model to cover information use (in addition to information-seeking) behaviours, and enhance the potential analytical detail of the model through the identification of a range of behavioural 'subtypes' and levels at which behaviours can operate. The identified behaviours were used as the basis for developing two methods for evaluating electronic resources - the IB functionality method (which mainly involves examining whether and how information behaviours are currently, or might in future be, supported by an electronic resource) and the IB usability method (which involves setting users behaviour-focused tasks, asking them to think aloud whilst performing the tasks, and identifying usability issues from the think-aloud data). Finally, the IB methods were themselves evaluated by stakeholders working for LexisNexis Butterworths - a large electronic legal resource development firm. Stakeholders were recorded using the methods, and focus group and questionnaire data were collected, with the aim of ascertaining how usable, useful and learnable they considered the methods to be and how likely they would be to use them in future. Overall, findings were positive regarding both methods, and useful suggestions for improving the methods were made.

Single and joint iterative decoding for higher order modulation schemes

Vital, Juan Carlos Serrato January 2008 (has links)
The research project described in this thesis concentrates on the study and application of specific channel coding techniques, in particular low-density parity-check (LDPC) codes, iterative decoding on Tanner graphs, and their application to previously proposed joint iterative receivers based on the turbo principle. The construction of random LDPC codes that fulfil certain desirable characteristics, such as large girth, specific ρ and γ values, and acceptable BER and FER performance for short code lengths, traditionally requires a high degree of processing power (i.e. CPU cycles) to run stochastic routines that first search all possible combinations for those that match the desired characteristics of the LDPC matrix, and second determine the bit-error rate (BER) and frame-error rate (FER) performance. The construction of well-structured LDPC codes by means of algebraic methods has provided LDPC codes that achieve excellent performance, with desirable structure in their LDPC matrices. However, of the universe of LDPC matrices, those created through well-structured procedures are a small group. Multiple procedures for modifying their characteristics, such as length and rate, have helped to increase the pool of LDPC codes based on well-structured procedures. This thesis studies the problem of constructing random LDPC codes with particular length, girth and column weight as design parameters, with reduced processing power, while providing at the same time a desirable structure that allows efficient use of memory and of parallel processing capacity to reduce delay through efficient encoding and decoding. Based on previous studies that analysed the same problem, an algorithm is introduced to construct the Girth-Partition and Shift (GPS) LDPC codes, which are half-rate quasi-cyclic (QC) LDPC codes. Several GPS constructions are analysed over the AWGN channel and the flat-fading channel. The effect of variations in their design parameters on BER and FER performance is included in this study. This work also includes the BER and FER performance of the parallel concatenation of different LDPC codes, some of which are based on well-structured procedures, such as Euclidean Geometries (EG), Projective Geometries (PG) and Margulis constructions based on the Cayley graph, while the rest are based on random procedures, such as Graphical Models (GM) and GPS-LDPC codes. The aims of analysing this scheme, combined with the aforementioned LDPC code constructions, include improving the BER and FER performance for short code lengths and reducing the encoding complexity. The BER and FER performance achieved by the parallel concatenation of the previously mentioned LDPC codes is further analysed in a joint demapping, parallel channel decoding and source decoding system. The impact of each component on the overall system performance is also examined.
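
The quasi-cyclic structure mentioned above can be illustrated with a small sketch. This is an assumed, generic expansion of a base matrix of circulant shifts into a QC-LDPC parity-check matrix, not the GPS construction itself: each non-negative entry becomes a cyclically shifted identity block and each -1 a zero block.

```python
# Minimal sketch: expand a base matrix of circulant shifts into a
# quasi-cyclic LDPC parity-check matrix (illustrative, not the GPS algorithm).
import numpy as np

def expand_qc_ldpc(base, z):
    """base: 2-D array of shift values (-1 marks an all-zero block); z: circulant size."""
    I = np.eye(z, dtype=int)
    rows = []
    for row in base:
        blocks = [np.zeros((z, z), dtype=int) if s < 0 else np.roll(I, s, axis=1)
                  for s in row]
        rows.append(np.hstack(blocks))
    return np.vstack(rows)

# Example: a tiny 2x4 base matrix expanded with circulant size z = 4 gives an
# 8x16 parity-check matrix of a nominally half-rate quasi-cyclic code.
base = np.array([[0, 1, -1, 2],
                 [3, -1, 0, 1]])
H = expand_qc_ldpc(base, z=4)
print(H.shape)          # (8, 16)
print(H.sum(axis=0))    # column weights of the expanded matrix
```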

Phase transition behaviour in constraint satisfaction problems

Grant, Stuart Alexander January 1997 (has links)
Many problems in artificial intelligence and computer science can be formulated as constraint satisfaction problems (CSPs). A CSP consists of a set of variables among which a set of constraints are imposed, with a solution corresponding to an assignment for every variable such that no constraints are violated. Most forms of CSP are NP-complete. Recent research has shown that the CSP exhibits a phase transition as a control parameter is varied. This transition lies between a region where most problems are easy and soluble, and a region where most problems are easy but insoluble. In the intervening phase transition region, the average problem difficulty is greatest. Phase transition behaviour can be exploited to create test beds of hard and easy problems for CSP algorithms. In this thesis, we study the phase transition of the binary CSP and examine various aspects of complete search algorithms for it. The phenomenon of exceptionally hard problems ('ehps') is examined in detail: these are rare searches on easy problems which become exceptionally expensive for a particular complete algorithm following a poor early search move. An explanation for the occurrence of ehps is proposed, and the relative susceptibility of certain algorithms to the phenomenon is explored. We then show that the phase transition paradigm can be applied to two tasks of polynomial cost complexity: attempting to establish arc and path consistency in a CSP. Phase transition behaviour analogous to that found when searching for a solution is demonstrated for these tasks, and the effectiveness and cost of establishing arc and path consistency is examined. The theme of establishing consistency in CSPs is extended by studying an algorithm which maintains arc consistency during search. Its performance is compared with that of an algorithm which maintains a lower level of consistency, and it is shown that the higher level of consistency reduces average search cost and ehp behaviour on many types of CSP. Finally, the subject of dynamically selecting the variable to instantiate at each stage in the search process is considered. We compare a number of heuristics which attempt to select the variable most likely to lead to failure, and show that the supposed principle behind these appears to be fundamentally flawed.
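
The control parameter underlying the phase transition can be illustrated as follows. This sketch assumes a standard model-B-style random binary CSP parameterisation (n variables, domain size m, constraint density p1, tightness p2) and the usual crossover prediction, the tightness at which the expected number of solutions falls to one; it is illustrative and not necessarily the exact model or parameterisation studied in the thesis.

```python
# Minimal sketch: random binary CSP generator and the predicted phase-transition
# point where the expected number of solutions equals one (illustrative).
import itertools, random

def random_binary_csp(n, m, p1, p2, seed=0):
    """n variables, domain size m, constraint density p1, tightness p2.
    Returns a dict mapping each constrained variable pair to its set of
    disallowed value pairs (nogoods)."""
    rng = random.Random(seed)
    constraints = {}
    for pair in itertools.combinations(range(n), 2):
        if rng.random() < p1:
            nogoods = rng.sample(list(itertools.product(range(m), repeat=2)),
                                 int(round(p2 * m * m)))
            constraints[pair] = set(nogoods)
    return constraints

def predicted_crossover_tightness(n, m, p1):
    """Tightness where the expected number of solutions is one:
    p2_crit = 1 - m**(-2 / (p1 * (n - 1)))."""
    return 1.0 - m ** (-2.0 / (p1 * (n - 1)))

print(predicted_crossover_tightness(n=20, m=10, p1=0.5))        # ~0.38
print(len(random_binary_csp(20, 10, p1=0.5, p2=0.38)))          # number of constrained pairs
```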

Negation in logic and deductive databases

Wang, Xuegang January 2000 (has links)
This thesis studies negation in logic and deductive databases. Among other things, two kinds of negation are discussed in detail: strong negation and nonmonotonic negation. In the logic part, we have constructed a first-order logic CF0 of strong negation with bounded quantifiers. The logic is based on constructive logics, in particular Thomason's logic CF. However, unlike in constructive logic, quantifiers in our system, as in Thomason's, are static rather than dynamic. For the logic CF0, the usual Kripke formal semantics is defined, but based on situations instead of conventional possible worlds. A sound and complete axiomatic system for CF0 is established, based on the axiomatic systems of constructive logics with strong negation and on Thomason's completeness proof techniques. CF0 is proposed as the underlying logic for situation theory, and thus the connection between CF0 and infon logic is briefly discussed. In the database part, based on a study of the main existing semantic theories for logic programs with nonmonotonic negation, we have defined a novel semantics for logic programs called the quasi-stable semantics. An important observation is that a nonmonotonic negation such as is required for logic programs should be computed by a nonmonotonic revision process. Only a process that allows one to withdraw, by revision, provisionally held negative information can hope to be adequate for modelling a nonmonotonic negation. In light of this, we propose a model of negation that owes much to the stable semantics but allows, through a mechanism of consistency-recovery, for just this withdrawal of previously assumed negative information. It has been proved that our new semantics maintains the desired features of both the well-founded semantics and the stable model semantics while overcoming their shortcomings. In addition, the quasi-stable semantics has been generalised to logic programs with both strong negation and nonmonotonic negation, giving rise to the quasi-answer set semantics.
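
As background for the quasi-stable semantics, the following minimal sketch shows the standard stable-model test that it builds on and revises (not the consistency-recovery mechanism itself): an interpretation is stable iff it equals the least model of the Gelfond-Lifschitz reduct of the program. The rule encoding and example program are illustrative assumptions.

```python
# Minimal sketch: stable models of a small ground normal logic program via the
# Gelfond-Lifschitz reduct (illustrative; not the thesis's quasi-stable semantics).
from itertools import chain, combinations

def least_model(definite_rules):
    """Least Herbrand model of a definite ground program, by forward chaining."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model

def is_stable(program, M):
    """Reduct: delete rules whose negative body intersects M, drop negative
    literals from the rest, then compare the least model of the reduct to M."""
    reduct = [(h, set(pos)) for (h, pos, neg) in program if not (set(neg) & M)]
    return least_model(reduct) == M

def stable_models(program, atoms):
    """Enumerate all stable models of a small ground normal program."""
    subsets = chain.from_iterable(combinations(atoms, r) for r in range(len(atoms) + 1))
    return [set(s) for s in subsets if is_stable(program, set(s))]

# Example: the program  p :- not q.   q :- not p.  has two stable models, {p} and {q}.
program = [("p", [], ["q"]), ("q", [], ["p"])]
print(stable_models(program, ["p", "q"]))   # [{'p'}, {'q'}]
```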

Parallel domain decomposition preconditioning for the adaptive finite element solution of elliptic problems in three dimensions

Nadeem, Sarfraz Ahmad January 2001 (has links)
A novel weakly overlapping two-level additive Schwarz domain decomposition preconditioning algorithm is presented which is appropriate for the parallel finite element solution of elliptic partial differential equations in three dimensions. This algorithm allows each processor to be assigned one or more subdomains and, as with most parallel domain decomposition solvers, each processor is able to solve its own subproblem(s) concurrently. The novel feature of the technique proposed here is that it requires just a single layer of overlap in the elements which make up each subdomain at each level of refinement, and it is shown that this amount of overlap is sufficient to yield an optimal preconditioner. The number of elements in this overlap region between subdomains is O(h^-2) as the mesh size h tends to 0. This is an improvement over the O(h^-3) overlapping elements required to obtain optimality for a conventional two-level additive Schwarz algorithm. The quality and effectiveness of this new algorithm is examined using both global uniform and local non-uniform refinement with two representative partitions of the domain. This preconditioning algorithm is then generalized such that the resulting preconditioner is suitable not only for symmetric problems but also for nonsymmetric and convection-dominated elliptic problems. This generalization, in the absence of theoretical or mathematical background, is based on empirical observations. Moreover, it turns out to be more effective and robust than the original symmetric preconditioning algorithm when applied to symmetric positive definite problems. This generalized algorithm is tested on symmetric, nonsymmetric and convection-dominated partial differential equations, where the number of iterations observed suggests that the preconditioner may in fact be optimal, i.e. the condition number of the preconditioned systems is bounded as the mesh is refined or the number of subdomains is increased. Because the Galerkin finite element discretization of convection-dominated problems suffers from non-physical oscillations in the solution unless the size of the elements is sufficiently small, we have extended our implementation of the generalized preconditioning algorithm to be applicable to systems arising from a more stable finite element discretization technique based upon streamline diffusion. Numerical experiments for a wide variety of problems are included to demonstrate the optimal or near-optimal behaviour and quality of this generalized algorithm. The parallel performance of the generalized preconditioning algorithm is also evaluated and analyzed. All the timings quoted are for an SGI Origin 2000 computer, and all software implementations described in this thesis have been coded and tested using ANSI C and the MPI communication library.
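
For readers unfamiliar with Schwarz methods, the following is a minimal one-level sketch of additive Schwarz preconditioning on a small 1D model problem; it is not the two-level weakly overlapping algorithm of the thesis. The preconditioner applies an exact local solve on each overlapping subdomain and sums the prolongated results, M^-1 r = sum_i R_i^T A_i^-1 R_i r.

```python
# Minimal sketch: one-level additive Schwarz preconditioner for a 1D Poisson
# matrix with two overlapping subdomains (illustrative model problem).
import numpy as np

def poisson_1d(n):
    """Standard tridiagonal finite-difference matrix for -u'' on n interior points."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def additive_schwarz(A, subdomains):
    """subdomains: list of index arrays (with overlap). Returns M^-1 as a dense matrix."""
    n = A.shape[0]
    M_inv = np.zeros((n, n))
    for idx in subdomains:
        R = np.zeros((len(idx), n))
        R[np.arange(len(idx)), idx] = 1.0             # restriction to the subdomain
        A_local = R @ A @ R.T                         # local subdomain operator
        M_inv += R.T @ np.linalg.inv(A_local) @ R     # prolongated exact local solve
    return M_inv

n = 20
A = poisson_1d(n)
subs = [np.arange(0, 11), np.arange(9, 20)]           # two subdomains, one overlapping strip
M_inv = additive_schwarz(A, subs)
# Compare conditioning with and without the preconditioner.
print(np.linalg.cond(A), np.linalg.cond(M_inv @ A))
```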

Issues in validation and executability of formal specifications in the Z notation

West, Margaret Mary January 2002 (has links)
The work considers issues in the execution of the Z notation in a logic programming language. A subset of Z which is capable of being animated is identified, together with the necessary theoretical foundations for the relationship of Z to its executable form. The thesis also addresses the transition from research results to potentially useful tools. The thesis has four major parts. Tools Survey: A survey of tools which support the animation of Z is presented, and the advantages (and disadvantages) to be gained from an animating system which uses a logic programming language are discussed. Requirements, particularly correctness, are described and discussed, and weaknesses in the current tools are identified. Correctness - Program Synthesis: If a program can be deduced directly from the specification, then it is partially correct with respect to the specification. This method of obtaining a program from a specification is one form of logic program synthesis. We examine such formal links between a specification (in Z) and an executable form, and also some translation techniques for synthesising a logic program from a Z specification. The techniques are illustrated by examples which reveal important shortcomings. Translation Rules to Gödel: New techniques for the animation of Z utilising the Gödel logic programming language are presented which circumvent these shortcomings. The techniques are realised via translation rules known as structure simulation. Two substantial case studies are examined as proof of concept. These indicate both the coverage of the Z notation by structure simulation and the practicality of the rules. Correctness - Abstract Approximation: Published criteria for correctness of an animation are compared and contrasted with the method of Abstract Interpretation (AI). In AI, a concrete semantics is related to an approximate one that explicitly exhibits an underlying structure present in the richer concrete structure. In our case, the concrete semantics is Z, associated with ZF set theory. The approximate semantics of the execution are the outputs of Z. The criteria are applied to a logic programming language (the original was applied to a functional language). Formal arguments are presented which show that the structure simulation rules obey the criteria for correctness. Finally, areas of work which had been omitted by the original authors are presented explicitly.

Correctness, precision and efficiency in the sharing analysis of real logic languages

Zaffanella, Enea January 2001 (has links)
For programming languages based on logic, a knowledge of variable sharing is important, for instance for their automatic parallelization and for many optimisations of the unification procedure, such as occurs-check reduction. Because of its usefulness, a considerable amount of research has been done on the design and development of techniques for the static analysis of variable sharing. Despite this fact, some of the most important issues related to the specification and implementation of a practical sharing analysis tool, such as the correctness, precision and efficiency of the analysis, have lacked satisfactory solutions. The thesis reports on our work in rectifying this situation and, thereby, enhancing the state of the art in sharing analysis. Our contributions include: a correctness proof for the set-sharing domain of Jacobs and Langen that does not depend on the presence of the occurs-check in the concrete unification procedure; the definition of a simpler abstraction of set-sharing that is guaranteed to achieve the same precision on both variable independence and groundness; the specification, on this new domain, of an abstract unification operator having polynomial complexity, as opposed to the exponential complexity of the abstract unification operator defined on set-sharing; the generalization of all the above results to a combined abstract domain including set-sharing, freeness and linearity information; an extensive experimental evaluation, including both the validation of the above results and the implementation and comparison of many other recent proposals for more precise sharing analysis domains; and the specification of a new representation for set-sharing which, by allowing for the definition of a family of widening operators, solves all the scalability problems of the analysis while having a negligible impact on its precision.
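
The set-sharing domain referred to above can be illustrated with a minimal sketch of its representation and of the two properties it answers directly, definite groundness and definite independence. An abstract element is a set of "sharing groups": sets of program variables that may be bound to terms sharing a common variable. The example element below is an assumption for illustration only.

```python
# Minimal sketch of the set-sharing representation and its direct queries
# (illustrative; abstract unification itself is considerably more involved).
def is_ground(sharing, x):
    """x is definitely ground iff it occurs in no sharing group."""
    return all(x not in group for group in sharing)

def independent(sharing, x, y):
    """x and y are definitely independent iff no sharing group contains both."""
    return all(not ({x, y} <= group) for group in sharing)

# Example: suppose analysis yields the abstract element {{X}, {Y, Z}} --
# X may share only with itself, while Y and Z may share with each other.
sharing = [frozenset({"X"}), frozenset({"Y", "Z"})]
print(is_ground(sharing, "W"))          # True: W occurs in no group
print(independent(sharing, "X", "Y"))   # True: no group contains both
print(independent(sharing, "Y", "Z"))   # False: they may share
```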

Interaction techniques with novel multimodal feedback for addressing gesture-sensing systems

Freeman, Euan January 2016 (has links)
Users need to be able to address in-air gesture systems, which means finding where to perform gestures and how to direct them towards the intended system. This is necessary for input to be sensed correctly and without unintentionally affecting other systems. This thesis investigates novel interaction techniques which allow users to address gesture systems properly, helping them find where and how to gesture. It also investigates audio, tactile and interactive light displays for multimodal gesture feedback; these can be used by gesture systems with limited output capabilities (like mobile phones and small household controls), allowing the interaction techniques to be used by a variety of device types. It investigates tactile and interactive light displays in greater detail, as these are not as well understood as audio displays. Experiments 1 and 2 explored tactile feedback for gesture systems, comparing an ultrasound haptic display to wearable tactile displays at different body locations and investigating feedback designs. These experiments found that tactile feedback improves the user experience of gesturing by reassuring users that their movements are being sensed. Experiment 3 investigated interactive light displays for gesture systems, finding this novel display type effective for giving feedback and presenting information. It also found that interactive light feedback is enhanced by audio and tactile feedback. These feedback modalities were then used alongside audio feedback in two interaction techniques for addressing gesture systems: sensor strength feedback and rhythmic gestures. Sensor strength feedback is multimodal feedback that tells users how well they can be sensed, encouraging them to find where to gesture through active exploration. Experiment 4 found that they can do this with 51mm accuracy, with combinations of audio and interactive light feedback leading to the best performance. Rhythmic gestures are continuously repeated gesture movements which can be used to direct input. Experiment 5 investigated the usability of this technique, finding that users can match rhythmic gestures well and with ease. Finally, these interaction techniques were combined, resulting in a single new interaction technique for addressing gesture systems. Using this interaction, users could direct their input with rhythmic gestures while using the sensor strength feedback to find a good location for addressing the system. Experiment 6 studied the effectiveness and usability of this technique, as well as the design space for combining the two types of feedback. It found that this interaction was successful, with users matching 99.9% of rhythmic gestures and achieving 80mm accuracy from target points. The findings show that gesture systems could successfully use this interaction technique to allow users to address them. Novel design recommendations for using rhythmic gestures and sensor strength feedback were created, informed by the experiment findings.
