Global ETD Search

1	A nonparametric Bayesian perspective for machine learning in partially-observed settings Akova, Ferit 31 July 2014 Indiana University-Purdue University Indianapolis (IUPUI) / Robustness and generalizability of supervised learning algorithms depend on how well the labeled data set represents the real-life problem. In many real-world domains, however, we may not have full knowledge of the underlying data-generating mechanism, which may even evolve, continually introducing new classes. This constitutes a partially-observed setting, where it would be impractical to obtain a labeled data set exhaustively defined by a fixed set of classes. Traditional supervised learning algorithms, which assume an exhaustive training library, would misclassify a future sample of an unobserved class with probability one, leading to an ill-defined classification problem. Our goal is to address situations where this assumption is violated by a non-exhaustive training library, which is a very realistic yet overlooked issue in supervised learning. In this dissertation we pursue a new direction for supervised learning by defining self-adjusting models to relax the fixed model assumption imposed on classes and their distributions. We let the model adapt itself to the prospective data by dynamically adding new classes/components as the data demand, which in turn gradually makes the model more representative of the entire population. In this framework, we first employ suitably chosen nonparametric priors to model class distributions for observed as well as unobserved classes, and then utilize new inference methods to classify samples from observed classes and discover/model novel classes for those from unobserved classes.
This thesis presents the initial steps of an ongoing effort to address one of the most overlooked bottlenecks in supervised learning and indicates the potential for taking new perspectives in some of the most heavily studied areas of machine learning: novelty detection, online class discovery and semi-supervised learning. Statistical decision Nonparametric statistics -- Research Mathematical statistics Stochastic processes Boosting (Algorithms) Statistics -- Data processing Machine learning Computational linguistics Data mining Computational intelligence
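The self-adjusting idea in this abstract, classifying into known classes while opening a new class when the data demand it, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not name its prior or inference method, so the Chinese-restaurant-process-style scoring rule, the 1-D Gaussian likelihood with known variance, the greedy (MAP) assignment, and every name and parameter here (`assign_sequentially`, `alpha`, `prior_var`, and so on) are hypothetical choices, not the dissertation's method.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def assign_sequentially(samples, seed_classes, alpha=1.0, var=1.0,
                        prior_mean=0.0, prior_var=100.0):
    """Greedy assignment with a CRP-style rule (an assumption, see lead-in).

    seed_classes maps a label to its labeled training values (the possibly
    non-exhaustive training library). Each incoming sample is scored against
    every existing class as (class size) * likelihood, and against a
    brand-new class as alpha * (prior predictive density). The best score wins.
    """
    classes = {label: list(values) for label, values in seed_classes.items()}
    labels = []
    n_new = 0
    for x in samples:
        best_label, best_score = None, -1.0
        for label, members in classes.items():
            mean = sum(members) / len(members)  # plug-in class mean
            score = len(members) * gaussian_pdf(x, mean, var)
            if score > best_score:
                best_label, best_score = label, score
        # Score for opening a new class: broad prior over where it might sit.
        new_score = alpha * gaussian_pdf(x, prior_mean, prior_var + var)
        if new_score > best_score:
            best_label = f"new_{n_new}"
            n_new += 1
            classes[best_label] = []
        classes[best_label].append(x)  # the model grows as data arrive
        labels.append(best_label)
    return labels


if __name__ == "__main__":
    library = {"a": [0.0, 0.2, -0.1], "b": [5.0, 5.1, 4.9]}
    print(assign_sequentially([0.1, 5.05, 20.0, 20.3], library))
```

In this toy run the two samples near 20 lie far from both known classes, so the first one opens a new class and the second one joins it rather than being forced into `a` or `b`.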
2	System biology modeling : the insights for computational drug discovery Huang, Hui January 2014 Indiana University-Purdue University Indianapolis (IUPUI) / Traditional development of treatment strategies for diseases involves identifying target proteins related to disease states and interfering with these proteins using drug molecules. Computational drug discovery and virtual screening of thousands of chemical compounds have accelerated this process. The thesis presents a comprehensive framework of computational drug discovery using system biology approaches. The thesis mainly consists of two parts: disease biomarker identification and disease treatment discovery. The first part of the thesis focuses on research in biomarker identification for human diseases in the post-genomic era, with an emphasis on system biology approaches such as using protein interaction networks. There are two major types of biomarkers: a diagnostic biomarker is expected to detect a given type of disease in an individual with both high sensitivity and specificity; a predictive biomarker serves to predict drug response before treatment is started. Both are essential before we even start seeking any treatment for the patients. In this part, we first studied how the coverage of the disease genes, the protein interaction quality, and gene ranking strategies can affect the identification of disease genes. Second, we addressed the challenge of constructing a central database to collect system-level data such as protein interactions and pathways. Finally, we carried out biomarker identification using diabetes as a case study. The second part of the thesis mainly addresses how to find treatments after disease identification. It specifically focuses on computational drug repositioning because of its low cost, few translational issues, and other benefits.
First, we described how to implement literature mining approaches to build the disease-protein-drug connectivity map and demonstrated its superior performance compared to existing applications. Second, we presented a valuable drug-protein directionality database, which fills the gap left by the lack of alternatives to the experimental CMAP in the computational drug discovery field. We also extended the correlation-based ranking algorithms by including the underlying topology among proteins. Finally, we demonstrated how to study drug repositioning beyond the genomic level and from one dimension to two dimensions, with clinical side effects as prediction features. System Biology Drug Repositioning Machine Learning Side Effect Proteins -- Analysis -- Mathematics Machine learning -- Research -- Analysis Pharmacogenomics Drug development -- Research -- Analysis Drugs -- Side effects Systems biology -- Research Artificial intelligence
3	User Modeling and Optimization for Environmental Planning System Design Singh, Vidya Bhushan January 2014 Indiana University-Purdue University Indianapolis (IUPUI) / Environmental planning is very cumbersome work for environmentalists, government agencies like USDA and NRCS, and farmers. There are a number of conflicts and issues involved in such a decision-making process. This research is based on the work to provide a common platform for environmental planning called WRESTORE (Watershed Restoration using Spatio-Temporal Optimization of Resources). We have designed a system that can be used to provide the best management practices for environmental planning. A distributed system was designed to harness the high performance computing power of clusters/supercomputers in running various environmental model simulations. The system is designed to be a multi-user system, just like a multi-user operating system. A number of stakeholders can log on and run environmental model simulations simultaneously, seamlessly collaborate, and make collective judgments by visualizing their landscapes. In the research, we identified challenges in running such a system and proposed various solutions. One challenge was the lack of a fast optimization algorithm. In our research, several algorithms are utilized, such as the Genetic Algorithm (GA) and the Learning Automaton (LA). However, LA is criticized for its slow rate of convergence, and both LA and GA can get stuck in local optima. We tried to solve the multi-objective problems using LA in batch mode to make the learning faster and more accurate. For problems where evaluating the fitness functions is a bottleneck, such as running environmental model simulations, evaluating a number of such models in parallel can give considerable speed-up. In the multi-objective LA, different weight pair solutions were evaluated independently.
We created parallel versions of the algorithms to make them faster in practice. Additionally, we extended the parallelism concept with batch mode learning. Another challenge we faced was in User Modeling. There are a number of User Modeling techniques available, and selecting the best one is a hard problem. In this research, we modeled users' preferences and search criteria using an ANN (Artificial Neural Network). Training an ANN with limited data is not always feasible. There are many situations where a simple modeling technique works better if the learning data set is small. We formulated ways to fine-tune the ANN in the case of limited data and also introduced the concept of Deep Learning in User Modeling for environmental planning systems. User Modeling Environmental Planning Genetic Algorithm Neural Network Watershed Interactive Optimization Watershed restoration -- Research Mathematical optimization -- Research High performance computing -- Research Information modeling User interfaces (Computer systems) Spatial analysis (Statistics) Sustainable development Machine learning -- Research System analysis
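The abstract above notes that, in the multi-objective setting, different weight pairs were evaluated independently and that running expensive fitness evaluations in parallel can give a considerable speed-up. A minimal sketch of that pattern follows. It is an illustration under stated assumptions only: `simulate` is a hypothetical stand-in for an environmental model run, the weighted-sum scalarization is assumed, and a plain search over a small candidate list replaces the Learning Automaton and Genetic Algorithm named in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(design):
    """Hypothetical stand-in for an expensive environmental model simulation.

    Returns two objectives to minimize for a scalar design variable.
    """
    cost = design ** 2
    runoff = (design - 2.0) ** 2
    return cost, runoff


def best_for_weights(weights, candidates):
    """Pick the candidate minimizing the weighted sum of the two objectives."""
    w1, w2 = weights

    def scalarized(design):
        f1, f2 = simulate(design)
        return w1 * f1 + w2 * f2

    return min(candidates, key=scalarized)


def solve_all(weight_pairs, candidates, max_workers=4):
    """Solve one scalarized problem per weight pair, each in its own worker.

    The weight pairs do not depend on each other, so they can run
    concurrently. Threads keep the sketch simple; a CPU-bound simulation
    would more likely use processes or cluster jobs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda w: best_for_weights(w, candidates),
                                weight_pairs))
    return dict(zip(weight_pairs, results))


if __name__ == "__main__":
    pairs = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
    print(solve_all(pairs, [0.0, 0.5, 1.0, 1.5, 2.0]))
```

Each weight pair yields one point on the trade-off between the two objectives, so sweeping the pairs in parallel traces out a set of compromise solutions in roughly the time of the slowest single run.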
