Global ETD Search

1061	Machine Learning for Variant Detection and Population Analysis in Heterogeneous Cancer Sample Jiao, Wei 28 November 2013 (has links) Cancer is a complex and deadly disease that is caused by genetic lesions in somatic cells. Further research in computational methodology for detecting and characterizing somatic mutations is necessary in order to understand the comprehensive systems-level model of the roles of those lesions in cancer development. In the first project, I trained a set of supervised machine learning classifiers that classify false positive versus true positive somatic single nucleotide variants (SNVs). I was able to show an improvement of somatic SNV detection on the data set over the reported classifier. In the second project, we developed the PhyloSub model, which uses a nonparametric Bayesian prior over a set of trees to cluster SNVs and infer the subclonal phylogenetic structure of tumors with uncertainty from SNV sequencing data. Experiments showed that the PhyloSub model could infer the subclonal phylogenetic structure from both single and multiple tumor samples. Single nucleotide variant Machine learning Cancer heterogeneity 0715
1062	Utilizing Positron Emission Tomography in Lung Cancer Treatment Li, Heyse 04 December 2013 (has links) We explore both robust biologically guided intensity-modulated radiation therapy (BG-IMRT) and pattern recognition to identify responders to cancer treatment for lung cancer. Heterogeneous dose prescriptions that are derived from biological images are subject to uncertainty, due to potential noise in the image. We develop a robust optimization model to design BG-IMRT plans that are de-sensitized to uncertainty. Computational results show improvements in tumor control probability and deviation from prescription dose compared to a non-robust model, while maintaining tissue dose below toxicity levels. We applied machine learning algorithms to 4D gated positron emission tomography/computed tomography (PET/CT) scans. We identified classifiers which could outperform a naive classifier. Our work shows the potential of using machine learning algorithms to predict patient response. This could hopefully lead to more adaptive treatment plans, where the clinician would adapt the treatment based on the prediction provided at certain time intervals in the treatment. optimization robust radiation therapy IMRT lung cancer machine learning 0546
1064	Empirical learning methods for the induction of knowledge from optimization models Kirschner, Kenneth J. 08 1900 (has links) No description available. Machine learning Combinatorial optimization Chemical engineering Heart valves Diseases
1065	Pairwise rational kernels applied to metabolic network predictions Roche Lima, Abiel 06 April 2015 (has links) Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are a series of chemical reactions, in which the product from one reaction serves as the input to another reaction. Many pathways remain incompletely characterized, and in some of them not all enzyme components have been identified. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models are dependent on the annotation of the genes. This leads to error accumulation when the pathways are predicted from incorrectly annotated genes. Pairwise kernel frameworks have been used in supervised learning approaches, e.g., Pairwise Support Vector Machines (SVMs), to predict relationships among two pairs of entities. Rational kernels are based on transducers to manipulate sequence data, computing similarity measures between sequences or automata. Rational kernels take advantage of the smaller and faster representation and algorithms of weighted finite-state transducers. They have been effectively used in problems that handle large amounts of sequence information, such as protein essentiality, natural language processing and machine translation. We propose a new framework, Pairwise Rational Kernels (PRKs), to manipulate pairs of sequence data, as pairwise combinations of rational kernels. We develop experiments using SVM with PRKs applied to metabolic pathway predictions in order to validate our methods. As a result, we obtain faster execution times with PRKs than other kernels, while maintaining accurate predictions. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations. We also obtain a new type of Pairwise Rational Kernels based on automaton and transducer operations.
In this case, we define new operations over two pairs of automata to obtain new rational kernels. We also develop experiments to validate these new PRKs to predict metabolic networks. As a result, we obtain the best execution times when we compare them with other kernels and the previous PRKs. Machine Learning kernel methods Bioinformatics Metabolic network predictions
1066	A Homogeneous Hierarchical Scripted Vector Classification Network with Optimisation by Genetic Algorithm Wright, Hamish Michael January 2007 (has links) A simulated learning hierarchical architecture for vector classification is presented. The hierarchy used homogeneous scripted classifiers, maintaining similarity tables, and self-organising maps for the input. The scripted classifiers produced output, and guided learning with permutable script instruction tables. A large space of parametrised script instructions was created, from which many different combinations could be implemented. The parameter space for the script instruction tables was tuned using a genetic algorithm with the goal of optimizing the network's ability to predict class labels for bit pattern inputs. The classification system, known as Dura, was presented with various visual classification problems, such as detecting overlapping lines, locating objects, or counting polygons. The network was trained with a random subset from the input space, and was then tested over a uniformly sampled subset. The results showed that Dura could successfully classify these and other problems. The optimal scripts and parameters were analysed, allowing inferences about which scripted operations were important, and what roles they played in the learning classification system. Further investigations were undertaken to determine Dura's performance in the presence of noise, as well as the robustness of the solutions when faced with highly stochastic training sequences. It was also shown that robustness and noise tolerance in solutions could be improved through certain adjustments to the algorithm. These adjustments led to different solutions which could be compared to determine what changes were responsible for the increased robustness or noise immunity.
The behaviour of the genetic algorithm tuning the network was also analysed, leading to the development of a super solutions cache, as well as improvements in: convergence, fitness function, and simulation duration. The entire network was simulated using a program written in C++ using FLTK libraries for the graphical user interface. machine learning genetic algorithm statistical learning adaptive software
1067	Machine learning approach to reconstructing signalling pathways and interaction networks in biology Dondelinger, Frank January 2013 (has links) In this doctoral thesis, I present my research into applying machine learning techniques for reconstructing species interaction networks in ecology, reconstructing molecular signalling pathways and gene regulatory networks in systems biology, and inferring parameters in ordinary differential equation (ODE) models of signalling pathways. Together, the methods I have developed for these applications demonstrate the usefulness of machine learning for reconstructing networks and inferring network parameters from data. The thesis consists of three parts. The first part is a detailed comparison of applying static Bayesian networks, relevance vector machines, and linear regression with L1 regularisation (LASSO) to the problem of reconstructing species interaction networks from species absence/presence data in ecology (Faisal et al., 2010). I describe how I generated data from a stochastic population model to test the different methods and how the simulation study led us to introduce spatial autocorrelation as an important covariate. I also show how we used the results of the simulation study to apply the methods to presence/absence data of bird species from the European Bird Atlas. The second part of the thesis describes a time-varying, non-homogeneous dynamic Bayesian network model for reconstructing signalling pathways and gene regulatory networks, based on Lèbre et al. (2010). I show how my work has extended this model to incorporate different types of hierarchical Bayesian information sharing priors and different coupling strategies among nodes in the network. The introduction of these priors reduces the inference uncertainty by putting a penalty on the number of structure changes among network segments separated by inferred changepoints (Dondelinger et al., 2010; Husmeier et al., 2010; Dondelinger et al., 2012b).
Using both synthetic and real data, I demonstrate that using information sharing priors leads to a better reconstruction accuracy of the underlying gene regulatory networks, and I compare the different priors and coupling strategies. I show the results of applying the model to gene expression datasets from Drosophila melanogaster and Arabidopsis thaliana, as well as to a synthetic biology gene expression dataset from Saccharomyces cerevisiae. In each case, the underlying network is time-varying; for Drosophila melanogaster, as a consequence of measuring gene expression during different developmental stages; for Arabidopsis thaliana, as a consequence of measuring gene expression for circadian clock genes under different conditions; and for the synthetic biology dataset, as a consequence of changing the growth environment. I show that in addition to inferring sensible network structures, the model also successfully predicts the locations of changepoints. The third and final part of this thesis is concerned with parameter inference in ODE models of biological systems. This problem is of interest to systems biology researchers, as kinetic reaction parameters can often not be measured, or can only be estimated imprecisely from experimental data. Due to the cost of numerically solving the ODE system after each parameter adaptation, this is a computationally challenging problem. Gradient matching techniques circumvent this problem by directly fitting the derivatives of the ODE to the slope of an interpolant. I present an inference procedure for a model using nonparametric Bayesian statistics with Gaussian processes, based on Calderhead et al. (2008). I show that the new inference procedure improves on the original formulation in Calderhead et al. (2008) and I present the result of applying it to ODE models of predator-prey interactions, a circadian clock gene, a signal transduction pathway, and the JAK/STAT pathway. 006.3
1068	Dynamic Beamforming Optimization for Anti-Jamming and Hardware Fault Recovery Becker, Jonathan 16 May 2014 (has links) In recent years there has been a rapid increase in the number of wireless devices for both commercial and defense applications. Such unprecedented demand has increased device cost and complexity and also added a strain on the spectrum utilization of wireless communication systems. This thesis addresses these issues, from an antenna system perspective, by developing new techniques to dynamically optimize adaptive beamforming arrays for improved anti-jamming and reliability. Available frequency spectrum is a scarce resource, and therefore increased interference will occur as the wireless spectrum saturates. To mitigate unintentional interference, or intentional interference from a jamming source, antenna arrays are used to focus electromagnetic energy on a signal of interest while simultaneously minimizing radio frequency energy in directions of interfering signals. The reliability of such arrays, especially in commercial satellite and defense applications, can be addressed by hardware redundancy, but at the expense of increased volume, mass as well as component and design cost. This thesis proposes the development of new models and optimization algorithms to dynamically adapt beamforming arrays to mitigate interference and increase hardware reliability. The contributions of this research are as follows. First, analytical models are developed and experimental results show that small antenna arrays can thwart interference using dynamically applied stochastic algorithms. This type of in situ optimization, with an algorithm dynamically optimizing a beamformer to thwart interference sources with unknown positions, inside of an anechoic chamber has not been done before to our knowledge. Second, it is shown that these algorithms can recover from hardware failures and localized faults in the array.
Experiments were performed with a proof-of-concept four-antenna array. This is the first hardware demonstration showing an antenna array with live hardware fault recovery that is adapted by stochastic algorithms in an anechoic chamber. We also compare multiple stochastic algorithms in performing both anti-jamming and hardware fault recovery. Third, we show that stochastic algorithms can be used to continuously track and mitigate interfering signals that continuously move in an additive white Gaussian noise wireless channel. stochastic algorithms antenna arrays fault recovery reliability antenna machine learning
1069	Semi-Supervised and Latent-Variable Models of Natural Language Semantics Das, Dipanjan 01 September 2012 (has links) This thesis focuses on robust analysis of natural language semantics. A primary bottleneck for semantic processing of text lies in the scarcity of high-quality and large amounts of annotated data that provide complete information about the semantic structure of natural language expressions. In this dissertation, we study statistical models tailored to solve problems in computational semantics, with a focus on modeling structure that is not visible in annotated text data. We first investigate supervised methods for modeling two kinds of semantic phenomena in language. First, we focus on the problem of paraphrase identification, which attempts to recognize whether two sentences convey the same meaning. Second, we concentrate on shallow semantic parsing, adopting the theory of frame semantics (Fillmore, 1982). Frame semantics offers deep linguistic analysis that exploits the use of lexical semantic properties and relationships among semantic frames and roles. Unfortunately, the datasets used to train our paraphrase and frame-semantic parsing models are too small to lead to robust performance. Therefore, a common trait in our methods is the hypothesis of hidden structure in the data. To this end, we employ conditional log-linear models over structures, that are firstly capable of incorporating a wide variety of features gathered from the data as well as various lexica, and secondly use latent variables to model missing information in annotated data. Our approaches towards solving these two problems achieve state-of-the-art accuracy on standard corpora. For the frame-semantic parsing problem, we present fast inference techniques for jointly modeling the semantic roles of a given predicate. 
We experiment with linear program formulations, and use a commercial solver as well as an exact dual decomposition technique that breaks the role labeling problem into several overlapping components. Continuing with the theme of hypothesizing hidden structure in data for modeling natural language semantics, we present methods to leverage large volumes of unlabeled data to improve upon the shallow semantic parsing task. We work within the framework of graph-based semi-supervised learning, a powerful method that associates similar natural language types, and helps propagate supervised annotations to unlabeled data. We use this framework to improve frame-semantic parsing performance on unknown predicates that are absent in annotated data. We also present a family of novel objective functions for graph-based learning that result in sparse probability measures over graph vertices, a desirable property for natural language types. Not only are these objectives easier to numerically optimize, but they also result in smoothed distributions over predicates that are smaller in size. The experiments presented in this dissertation empirically demonstrate that missing information in text corpora contains considerable semantic information that can be incorporated into structured models for semantics, to significant benefit over the current state of the art. The methods in this thesis were originally presented by Das and Smith (2009, 2011, 2012), and Das et al. (2010, 2012). The thesis gives a more thorough exposition, relating and comparing the methods, and also presents several extensions of the aforementioned papers. computer science artificial intelligence natural language processing machine learning
1070	Attribute Learning using Joint Human and Machine Computation Law, Edith L.M. 01 August 2012 (has links) This thesis is centered around the problem of attribute learning -- using the joint effort of humans and machines to describe objects, e.g., determining that a piece of music is "soothing," that the bird in an image "has a red beak", or that Ernest Hemingway is a "Nobel Prize-winning author." In this thesis, we present new methods for solving the attribute-learning problem using the joint effort of the crowd and machines via human computation games. When creating a human computation system, typically two design objectives need to be simultaneously satisfied. The first objective is human-centric -- the task prescribed by the system must be intuitive, appealing and easy to accomplish for human workers. The second objective is task-centric -- the system must actually perform the task at hand. These two goals are often at odds with each other, especially in the casual game setting. This thesis shows that human computation games can accomplish both the human-centric and task-centric objectives, if we first design for humans, then devise machine learning algorithms to work around the limitations of human workers and complement their abilities in order to jointly accomplish the task of learning attributes. We demonstrate the effectiveness of our approach in three concrete problem settings: music tagging, bird image classification and noun phrase categorization. Contributions of this thesis include a framework for attribute learning, two new game mechanisms, experiments showing the effectiveness of the hybrid human and machine computation approach for learning attributes in vocabulary-rich settings and under the constraints of knowledge limitations, as well as deployed games played by tens of thousands of people, generating large datasets for machine learning. Attribute Learning Human Computation Games with a Purpose Machine Learning
