About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
431

Data analysis in proteomics: novel computational strategies for modeling and interpreting complex mass spectrometry data

Sniatynski, Matthew John
Contemporary proteomics studies require computational approaches to deal with both the complexity and the volume of the data produced. The amalgamation of mass spectrometry -- the analytical tool of choice in proteomics -- with the computational and statistical sciences is still recent, and several avenues of exploratory data analysis and statistical methodology remain relatively unexplored. The current study focuses on three broad analytical domains, and develops novel exploratory approaches and practical tools in each. Data transform approaches are explored first. These methods re-frame the data, allowing features and trends that are not immediately evident to be visualized and exploited. An exploratory approach making use of the correlation transform is developed and used to identify mass-shift signals in mass spectra. This approach is applied to identify and map post-translational modifications on individual peptides, and to identify SILAC modification-containing spectra in a full-scale proteomic analysis. Secondly, matrix decomposition and projection approaches are explored; these use an eigen-decomposition to extract general trends from groups of related spectra. A data visualization approach built on these techniques is demonstrated, capable of conveying trends in large numbers of complex spectra, and a data compression and feature extraction technique suitable for use in spectral modeling is developed. Finally, a general machine learning approach is developed based on conditional random fields (CRFs). These models can handle arbitrary sequence modeling tasks, much like hidden Markov models (HMMs), but are far more robust to interdependent observational features and do not require limiting independence assumptions to remain tractable. The theory behind this approach is developed, and a simple machine learning fragmentation model is built to test the hypothesis that reproducible sequence-specific intensity ratios are present within the distribution of fragment ions originating from a common peptide bond breakage. After training, the model performs very well at associating peptide sequences with fragment ion intensity information, lending strong support to the hypothesis.
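To make the correlation-transform idea concrete, the sketch below scores a candidate mass shift by how strongly a spectrum correlates with a copy of itself shifted along the m/z axis; peak pairs separated by exactly that mass (such as a SILAC label delta) produce a high score. This is a minimal illustration only, assuming centroided peak lists and exact bin alignment; the function name, bin width, and toy peak values are invented for the example, and real spectra would need mass-tolerance handling -- it is not the thesis's actual implementation.

```python
import numpy as np

def mass_shift_score(mz, intensity, delta, bin_width=0.01):
    """Correlation between a binned spectrum and a copy of itself shifted
    by `delta` m/z units; a high score suggests peak pairs separated by
    that mass difference. Assumes delta >= bin_width."""
    # Bin the centroided peaks onto a uniform m/z grid (floor binning).
    idx = ((mz - mz.min()) / bin_width).astype(int)
    binned = np.zeros(idx.max() + 1)
    np.add.at(binned, idx, intensity)
    # Compare the binned spectrum with itself shifted by `delta`.
    lag = int(delta / bin_width)
    a, b = binned[:-lag], binned[lag:]
    return float(np.corrcoef(a, b)[0, 1])

# Toy spectrum: two of the three peaks sit 8.0142 Da apart
# (the mass delta of a heavy SILAC lysine label).
mz = np.array([501.27, 509.2842, 622.30])
inten = np.array([100.0, 95.0, 40.0])
print(mass_shift_score(mz, inten, 8.0142))  # strong: a matching peak pair exists
print(mass_shift_score(mz, inten, 5.0))     # near zero: no pair at this shift
```

Scanning `delta` over a range of candidate shifts turns this into a simple mass-shift profile from which label pairs can be read off.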
432

Learning and discovery in incremental knowledge acquisition

Suryanto, Hendra, Computer Science & Engineering, Faculty of Engineering, UNSW, January 2005
Knowledge Based Systems (KBS) have been actively investigated since the early period of AI. There are four common methods of building expert systems: modeling approaches, programming approaches, case-based approaches and machine-learning approaches. One particular technique is Ripple Down Rules (RDR), which may be classified as an incremental case-based approach. Knowledge needs to be acquired from experts in the context of individual cases viewed by them. In the RDR framework, the expert adds a new rule based on the context of an individual case. This task is simple and affects the expert's workflow only minimally. The added rule fixes an incorrect interpretation made by the KBS while having minimal impact on the KBS's previous correct performance. This provides incremental improvement. Despite these strengths of RDR, there are some limitations, including rule redundancy, lack of intermediate features and lack of models. This thesis addresses these RDR limitations by applying automatic learning algorithms to reorganize the knowledge base, to learn intermediate features and, possibly, to discover domain models. The redundancy problem occurs because rules are created in particular contexts even though they should have more general application. We address this limitation by reorganizing the knowledge base and removing redundant rules. Removal of redundant rules should also reduce the number of future knowledge acquisition sessions. Intermediate features improve modularity, because the expert can deal with features in groups rather than individually. In addition to the manual creation of intermediate features for RDR, we propose the automated discovery of intermediate features to speed up the knowledge acquisition process by generalizing existing rules. Finally, the Ripple Down Rules approach facilitates rapid knowledge acquisition because it can be initialized with a minimal ontology. Despite this minimal modeling, we propose that a more developed knowledge model can be extracted from an existing RDR KBS, which may be useful when applying an RDR KBS to other applications. The most useful of these three developments was the automated discovery of intermediate features, which made a significant difference to the number of knowledge acquisition sessions required.
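The rule structure the abstract builds on can be sketched compactly. Below is a minimal single-classification RDR tree in the common formulation: each rule carries an "except" branch consulted when it fires and an "else" branch consulted when it does not, so a rule added in the context of a misclassified case corrects that case without disturbing earlier correct behaviour. The toy medical rules are invented for illustration; the thesis's learning extensions (knowledge-base reorganization and intermediate-feature discovery) are not shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Case = dict  # a case is a simple attribute -> value mapping

@dataclass
class RDRNode:
    """One rule in a single-classification Ripple Down Rules tree."""
    condition: Callable[[Case], bool]
    conclusion: str
    except_child: Optional["RDRNode"] = None  # consulted when this rule fires
    else_child: Optional["RDRNode"] = None    # consulted when it does not

def classify(node: Optional[RDRNode], case: Case, default: str = "unknown") -> str:
    """The conclusion of the deepest satisfied rule wins."""
    result = default
    while node is not None:
        if node.condition(case):
            result = node.conclusion
            node = node.except_child  # look for a more specific exception
        else:
            node = node.else_child    # ripple down to the next alternative
    return result

# Toy knowledge base: a default rule plus one exception added by the expert
# after seeing a case the original rule got wrong.
kb = RDRNode(
    condition=lambda c: c["temp"] > 38.0,
    conclusion="fever",
    except_child=RDRNode(lambda c: c["post_exercise"], "normal"),
)
print(classify(kb, {"temp": 38.5, "post_exercise": False}))  # fever
print(classify(kb, {"temp": 38.5, "post_exercise": True}))   # normal
```

The appeal is exactly what the abstract describes: the exception is added in the context of one case, and every previously correct path through the tree is left untouched.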
433

Behavioural cloning: robust goal-directed control

Isaac, Andrew Paul, Computer Science & Engineering, Faculty of Engineering, UNSW, January 2009
Behavioural cloning is a simple and effective technique for automatically and non-intrusively producing comprehensible and implementable models of human control skill. It applies machine learning techniques to behavioural trace data in a transparent manner, and has been very successful in a wide range of domains. The limitations of early behavioural cloning work are that the clones lack goal structure, are not robust to variation, are sensitive to the nature of the training data, and often produce complicated models of the control skill. Recent behavioural cloning work has sought to address these limitations by adopting goal-structured task decompositions and combining control engineering representations with more sophisticated machine learning algorithms. These approaches have had some success, but only by compromising either transparency or robustness. This thesis addresses these limitations by investigating new behavioural cloning representations, control structures, data processing techniques, machine learning algorithms, and performance estimation and testing techniques. First, a novel hierarchical decomposition of control is developed, in which goal settings and the control skill needed to achieve them are learnt. This decomposition allows feedback control mechanisms to be combined with modular goal achievement. Data processing limitations are addressed by developing data-driven, correlative and sampling techniques that also inform the development of the learning algorithm. The behavioural cloning process is developed through experiments on simulated aircraft piloting tasks, and its generality is then tested through experiments on simulated gantry-crane control tasks. Compared to existing techniques, the process demonstrated a marked improvement, and the system handles novel goal settings and task structures under high-noise conditions. The ability to produce successful controllers was greatly improved by the developed control representation, data processing and learning techniques. The models produced are compact, though they tend to abstract the originating control behaviour. In conclusion, the control representation and cloning process address current limitations of behavioural cloning, and produce reliable, reusable and readable clones.
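The basic mechanics of cloning -- learning a readable controller directly from behavioural traces -- can be illustrated with a deliberately simple stand-in. The synthetic "pilot" trace below and the shallow decision tree are assumptions made for the example; they are not the thesis's hierarchical, goal-structured method, only the underlying idea of fitting a compact, inspectable model that maps logged states to the operator's commands.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)

# Synthetic behavioural trace: the "pilot" commands elevator roughly in
# proportion to altitude error, with noise (a stand-in for logged data).
alt_error = rng.uniform(-100.0, 100.0, 500)
pitch = rng.uniform(-5.0, 5.0, 500)
states = np.column_stack([alt_error, pitch])
elevator = 0.01 * alt_error + rng.normal(0.0, 0.05, 500)

# Clone the skill as a shallow tree: compact and human-readable,
# which is the transparency property the abstract emphasizes.
clone = DecisionTreeRegressor(max_depth=3).fit(states, elevator)
print(export_text(clone, feature_names=["alt_error", "pitch"]))

# The clone can then act as a controller, predicting a command for a new state.
print(clone.predict([[40.0, 1.2]]))
```

The printed tree makes the recovered control rule legible at a glance, which is precisely what distinguishes cloning from opaque black-box controllers.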
434

Graphical Models: Modeling, Optimization, and Hilbert Space Embedding

Zhang, Xinhua, xinhua.zhang.cs@gmail.com, January 2010
Over the past two decades, graphical models have been widely used as powerful tools for compactly representing distributions, while kernel methods have been used extensively to build rich representations. This thesis aims to combine graphical models with kernels to produce compact models with rich representational abilities. Graphical models are a powerful underlying formalism in machine learning. Their graph-theoretic properties provide both an intuitive modular interface for modelling interacting factors and a data structure that facilitates efficient learning and inference, and their probabilistic nature ensures the global consistency of the whole framework while providing a convenient interface between models and data. Kernel methods, on the other hand, provide an effective means of representing rich classes of features for general objects, and at the same time allow efficient search for the optimal model. Recently, kernels have been used to characterize distributions by embedding them into high-dimensional feature spaces. Interestingly, graphical models again decompose this characterization and lead to novel, direct ways of comparing distributions based on samples. Among the many uses of graphical models and kernels, this thesis is devoted to the following four areas.

Conditional random fields for multi-agent reinforcement learning. Conditional random fields (CRFs) are graphical models for modelling the probability of labels given observations. They have traditionally been trained on a set of observation and label pairs, under the assumption that, conditioned on the training data, the label sequences of different training examples are independent and identically distributed (iid). We extended the use of CRFs to a class of temporal learning algorithms, namely policy-gradient reinforcement learning (RL). Now the labels are no longer iid: they are actions that update the environment and affect the next observation. From an RL point of view, CRFs provide a natural way to model joint actions in a decentralized Markov decision process; they define how agents can communicate with each other to choose the optimal joint action. We tested our framework on a synthetic network alignment problem, a distributed sensor network, and a road traffic control system. Using the tree sampling of Hamze & de Freitas (2004) for inference, the RL methods employing CRFs clearly outperform those which do not model the proper joint policy.

Bayesian online multi-label classification. Gaussian density filtering (GDF) provides fast and effective inference for graphical models (Maybeck, 1982). Based on this natural online learner, we propose a Bayesian online multi-label classification (BOMC) framework which learns a probabilistic model of the linear classifier. The training labels are incorporated to update the posterior of the classifiers via a graphical model similar to TrueSkill (Herbrich et al., 2007), and inference is based on GDF with expectation propagation. Using samples from the posterior, we label the test data by maximizing the expected F-score. Our experiments on the Reuters RCV1-v2 dataset show that BOMC delivers significantly higher macro-averaged F-score than state-of-the-art online maximum margin learners such as LaSVM (Bordes et al., 2005) and passive-aggressive online learning (Crammer et al., 2006). The online nature of BOMC also allows us to efficiently use large amounts of training data.

Hilbert space embedding of distributions. Graphical models are also an essential tool in kernel measures of independence for non-iid data. Traditional information theory often requires density estimation, which makes it ill-suited to statistical estimation. Motivated by the fact that distributions often appear in machine learning via expectations, we can characterize the distance between distributions in terms of distances between means, especially means in reproducing kernel Hilbert spaces, known as kernel embeddings. Under this framework, undirected graphical models further allow us to factorize the kernel embedding onto cliques, which yields efficient measures of independence for non-iid data (Zhang et al., 2009). We show the effectiveness of this framework for ICA and sequence segmentation, and a number of further applications and research questions are identified.

Optimization in maximum margin models for structured data. Maximum margin estimation for structured data, e.g. (Taskar et al., 2004), is an important machine learning task in which graphical models also play a key role. Such problems are special cases of regularized risk minimization, for which bundle methods (BMRM, Teo et al., 2007) and the closely related SVMStruct (Tsochantaridis et al., 2005) are state-of-the-art general-purpose solvers. Smola et al. (2007b) proved that BMRM requires O(1/ε) iterations to converge to an ε-accurate solution, and we further show that this rate hits the lower bound. By exploiting the structure of the objective function, we devised an algorithm for the structured loss which converges to an ε-accurate solution in O(1/√ε) iterations. This algorithm originates from Nesterov's optimal first-order methods (Nesterov, 2003, 2005b).
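The embedding machinery in the third area reduces, for finite samples, to comparing averages of kernel evaluations: the squared RKHS distance between two kernel mean embeddings is the (biased) maximum mean discrepancy estimate sketched below. The RBF kernel and its bandwidth are illustrative choices, and the clique factorization for non-iid data described in the abstract is not shown.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2), evaluated for all pairs
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    """Biased estimate of the squared MMD: the RKHS distance between
    the kernel mean embeddings of the two samples."""
    return (rbf_kernel(X, X, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))
y_same = rng.normal(0.0, 1.0, size=(200, 2))
y_diff = rng.normal(2.0, 1.0, size=(200, 2))
print(mmd2(x, y_same))  # close to zero: samples from the same distribution
print(mmd2(x, y_diff))  # clearly larger: the distributions differ
```

Because the estimate is built entirely from expectations of kernel evaluations, no density estimation is needed, which is the advantage over classical information-theoretic measures that the abstract points out.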
436

Building better software: the applicability of a professional tool for automating quality assessment and fault detection

Di Stefano, Justin S.
Thesis (M.S.)--West Virginia University, 2008. Title from document title page. Document formatted into pages; contains vii, 83 p. : ill. (some col.). Vita. Includes abstract. Includes bibliographical references (p. 81-83).
437

Exact learning of first-order expressions from queries

Arias Robles, Marta.
Thesis (Ph.D.)--Tufts University, 2004. Adviser: Roni Khardon. Submitted to the Dept. of Computer Science. Includes bibliographical references (leaves 157-161). Access restricted to members of the Tufts University community. Also available via the World Wide Web.
438

Architecting system of systems: artificial life analysis of financial market behavior

Ergin, Nil Hande, January 2007 (PDF)
Thesis (Ph.D.)--University of Missouri--Rolla, 2007. Vita. The entire thesis text is included in the file. Title from title screen of thesis/dissertation PDF file (viewed November 27, 2007). Includes bibliographical references (p. 124-137).
439

Learning bilingual semantic frames

Wu, Zhaojun, January 2008
Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2008. Includes bibliographical references (leaves 70-75). Also available in electronic version.
440

Learning comprehensible theories from structured data

Ng, Kee Siong, January 2005
Thesis (Ph.D.)--Australian National University, 2005.
