  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Bayesian mixture models for frequent itemset mining

He, Ruofei January 2012
In binary-transaction data mining, traditional frequent itemset mining often produces results which are not straightforward to interpret. To overcome this problem, probability models are often used to produce more compact and conclusive results, albeit with some loss of accuracy. Bayesian statistics have been widely used in the development of probability models in machine learning in recent years, and these methods have many advantages, including their ability to avoid overfitting. In this thesis, we develop two Bayesian mixture models, with a Dirichlet distribution prior and a Dirichlet process (DP) prior respectively, to improve the previous non-Bayesian mixture model developed for transaction dataset mining. First, we develop a finite Bayesian mixture model by introducing conjugate priors to the model. Then, we extend this model to an infinite Bayesian mixture using a Dirichlet process prior. The Dirichlet process mixture model is a nonparametric Bayesian model which allows for the automatic determination of an appropriate number of mixture components. We implement the inference of both mixture models using two methods: a collapsed Gibbs sampling scheme and a variational approximation algorithm. Experiments on several benchmark problems have shown that both mixture models achieve better performance than the non-Bayesian mixture model. The variational algorithm is the faster of the two approaches, while the Gibbs sampling method achieves a more accurate result. The Dirichlet process mixture model can automatically grow to a proper complexity for a better approximation. However, these approaches also show that mixture models underestimate the probabilities of frequent itemsets. Consequently, these models have a higher sensitivity but a lower specificity.
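The collapsed Gibbs scheme mentioned in the abstract can be illustrated with a small sketch: a finite mixture of Bernoullis over binary transactions, with the component parameters integrated out under conjugate Beta priors and the mixing weights under a symmetric Dirichlet prior. This is an illustrative reconstruction under assumed priors, not the thesis implementation; all names (`collapsed_gibbs_sweep`, `a`, `b`, `alpha`) are invented.

```python
import numpy as np

def collapsed_gibbs_sweep(X, z, K, alpha=1.0, a=1.0, b=1.0, rng=None):
    """One sweep of collapsed Gibbs sampling for a finite Bayesian mixture
    of Bernoullis over binary transactions X (n x d). Item probabilities are
    integrated out under Beta(a, b) priors; the mixing weights carry a
    symmetric Dirichlet(alpha) prior, so only assignments z are sampled."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, d = X.shape
    counts = np.array([(z == k).sum() for k in range(K)], dtype=float)
    sums = np.array([X[z == k].sum(axis=0) for k in range(K)], dtype=float)
    for i in range(n):
        # remove transaction i from its current component's statistics
        counts[z[i]] -= 1.0
        sums[z[i]] -= X[i]
        # posterior predictive P(x_ij = 1 | component k) under conjugacy
        p1 = (sums + a) / (counts[:, None] + a + b)
        log_lik = (X[i] * np.log(p1) + (1 - X[i]) * np.log1p(-p1)).sum(axis=1)
        logp = np.log(counts + alpha) + log_lik
        probs = np.exp(logp - logp.max())
        z[i] = rng.choice(K, p=probs / probs.sum())
        counts[z[i]] += 1.0
        sums[z[i]] += X[i]
    return z
```

Repeated sweeps over well-separated transactions should settle into stable cluster assignments.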
32

Reliability Assessment of a Continuous-state Fuel Cell Stack System with Multiple Degrading Components

Wu, Xinying 23 September 2019
No description available.
33

XPRIME: A Method Incorporating Expert Prior Information into Motif Exploration

Poulsen, Rachel Lynn 16 April 2009
One of the primary goals of active research in molecular biology is to better understand the process of transcription regulation. An important objective in understanding transcription is identifying transcription factors that directly regulate target genes. Identifying these transcription factors is a key step toward eliminating genetic diseases or disease susceptibilities that are encoded inside deoxyribonucleic acid (DNA). There is much uncertainty and variation associated with transcription factor binding sites, requiring these sites to be represented stochastically. Although each transcription factor typically prefers to bind to a specific DNA word, it can bind to different variations of that word. In order to model these uncertainties, we use a Bayesian approach that allows the binding probabilities associated with the motif to vary. This project presents a new method for motif searching that uses expert prior information to scan DNA sequences for multiple known motif binding sites as well as new motifs. The method uses a mixture model in which each motif of interest is represented by a multinomial distribution with a Dirichlet prior placed on it. Expert prior information is given to search for known motifs, and diffuse priors are used to search for new motifs. The posterior distribution of each motif is then sampled using Markov chain Monte Carlo (MCMC) techniques, specifically Gibbs sampling.
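The conjugate multinomial-Dirichlet update at the heart of such a motif model can be sketched directly: given aligned binding-site strings, each column's Dirichlet prior (diffuse by default, informative when expert knowledge is available) is combined with the observed base counts, and a position weight matrix is drawn from the resulting Dirichlet posterior. This is a hypothetical illustration, not the XPRIME code; the function name and defaults are invented.

```python
import numpy as np

BASES = "ACGT"

def posterior_pwm(sites, prior=None, rng=None):
    """Sample a position weight matrix (PWM) from its Dirichlet posterior.
    `sites` is a list of equal-length DNA strings (aligned binding sites);
    each column's base probabilities get an independent Dirichlet prior."""
    if rng is None:
        rng = np.random.default_rng(0)
    w = len(sites[0])
    if prior is None:
        prior = np.ones((w, 4))  # diffuse prior: no expert information
    counts = np.zeros((w, 4))
    for s in sites:
        for j, base in enumerate(s):
            counts[j, BASES.index(base)] += 1
    # conjugacy: Dirichlet prior + multinomial counts -> Dirichlet posterior
    return np.vstack([rng.dirichlet(prior[j] + counts[j]) for j in range(w)])
```

Feeding an informative `prior` for a known motif biases the sampled PWM toward the expert consensus, which is the mechanism the abstract describes.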
34

A Bayesian Approach to Missile Reliability

Redd, Taylor Hardison 01 June 2011
Each year, billions of dollars are spent on missiles and munitions by the United States government. It is therefore vital to have a dependable method to estimate the reliability of these missiles. Such a method must take into account the age of a missile, the reliability of its different components, and the impact of different launch phases on its reliability. It must also estimate missile performance under a variety of test conditions, or modalities. Bayesian logistic regression is utilized to accurately make these estimates. This project presents both previously proposed methods and ways to combine these methods to accurately estimate the reliability of the cruise missile.
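As a rough illustration of Bayesian logistic regression for reliability as a function of age, the sketch below fits P(success) = logistic(beta0 + beta1 * age) with normal priors via random-walk Metropolis sampling. It is a generic sketch on synthetic data, not the project's model; the priors, step size, and all names are assumptions.

```python
import numpy as np

def log_posterior(beta, X, y, tau=10.0):
    """Log posterior for logistic regression with N(0, tau^2) priors."""
    eta = X @ beta
    log_lik = np.sum(y * eta - np.log1p(np.exp(eta)))
    log_prior = -0.5 * np.sum(beta ** 2) / tau ** 2
    return log_lik + log_prior

def metropolis_logistic(X, y, n_iter=2000, step=0.1, rng=None):
    """Random-walk Metropolis sampler for the regression coefficients.
    Returns the chain of sampled beta vectors (one row per iteration)."""
    if rng is None:
        rng = np.random.default_rng(0)
    beta = np.zeros(X.shape[1])
    lp = log_posterior(beta, X, y)
    chain = []
    for _ in range(n_iter):
        prop = beta + step * rng.standard_normal(beta.shape)
        lp_prop = log_posterior(prop, X, y)
        # accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < lp_prop - lp:
            beta, lp = prop, lp_prop
        chain.append(beta.copy())
    return np.array(chain)
```

On data where reliability genuinely declines with age, the posterior mass of the slope coefficient should sit below zero.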
35

An Evaluation of the Indian Buffet Process as Part of a Recommendation System / En utvärdering av Indian Buffet Process som en del av ett rekommendationssystem

Alinder, Helena, Nilsson, Josefin January 2018
This report investigates whether the Indian Buffet Process (IBP), a stochastic process that defines a probability distribution, can be used as part of a recommendation system. The report focuses on recommendation systems in which one type of object, for instance movies, is recommended to another type of object, for instance users. A concept for performing link prediction with the IBP is presented, along with a method for performing inference. Three papers related to the subject are presented, and their results are analyzed together with additional experiments on an implementation of the IBP. The report concludes that it is possible to use the IBP in a recommendation system when recommending one object to another. To use IBP priors in a recommendation system with real-life datasets, the report suggests a coupled version of the IBP model and, if possible, performing inference with a parallel Gibbs sampler.
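The generative "restaurant" description of the IBP can be sketched directly: an illustrative draw of a binary customer-by-dish matrix (e.g. users by latent features), not code from the report.

```python
import numpy as np

def sample_ibp(n_customers, alpha, rng=None):
    """Draw a binary feature matrix from the Indian Buffet Process prior.
    Row i is customer i (e.g. a user); columns are dishes (latent features).
    Customer 1 takes Poisson(alpha) dishes; customer i then takes each
    existing dish k with probability m_k / i, where m_k is how many earlier
    customers took it, plus Poisson(alpha / i) brand-new dishes."""
    if rng is None:
        rng = np.random.default_rng(0)
    first = rng.poisson(alpha)
    rows = [np.ones(first, dtype=int)]
    m = np.ones(first)                       # per-dish popularity counts
    for i in range(2, n_customers + 1):
        old = (rng.uniform(size=m.shape) < m / i).astype(int)
        new = rng.poisson(alpha / i)
        m = np.concatenate([m + old, np.ones(new)])
        # pad earlier rows with zeros for the newly introduced dishes
        rows = [np.concatenate([r, np.zeros(new, dtype=int)]) for r in rows]
        rows.append(np.concatenate([old, np.ones(new, dtype=int)]))
    return np.vstack(rows)
```

Because every column is introduced by a customer who takes it, each sampled dish has at least one taker, and the number of dishes grows with both `alpha` and the number of customers.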
36

Bayesian Model Checking Strategies for Dichotomous Item Response Theory Models

Toribio, Sherwin G. 16 June 2006
No description available.
37

Bayesian Degradation Analysis Considering Competing Risks and Residual-Life Prediction for Two-Phase Degradation

Ning, Shuluo 11 September 2012
No description available.
38

Bayesian Modeling for Isoform Identification and Phenotype-specific Transcript Assembly

Shi, Xu 24 October 2017
The rapid development of biotechnology has enabled researchers to collect high-throughput data for studying various biological processes at the genomic, transcriptomic, and proteomic levels. Due to the large noise in the data and the high complexity of diseases (such as cancer), it is a challenging task for researchers to extract biologically meaningful information that can help reveal the underlying molecular mechanisms. These challenges call for more efforts in developing efficient and effective computational methods to analyze the data at different levels so as to understand biological systems in different aspects. In this dissertation research, we have developed novel Bayesian approaches to infer alternative splicing mechanisms in biological systems using RNA sequencing data. Specifically, we focus on two research topics: isoform identification and phenotype-specific transcript assembly. For isoform identification, we develop a computational approach, SparseIso, to jointly model the existence and abundance of isoforms in a Bayesian framework. A spike-and-slab prior is incorporated into the model to enforce the sparsity of expressed isoforms. A Gibbs sampler is developed to sample the existence and abundance of isoforms iteratively. For transcript assembly, we develop a Bayesian approach, IntAPT, to assemble phenotype-specific transcripts from multiple RNA sequencing profiles. A two-layer Bayesian framework is used to model the existence of phenotype-specific transcripts and the transcript abundance in individual samples. Based on the hierarchical Bayesian model, a Gibbs sampling algorithm is developed to estimate the joint posterior distribution for phenotype-specific transcript assembly. The performance of our proposed methods is evaluated on simulation data, compared with existing methods, and benchmarked on real cell line data. We then apply our methods to breast cancer data to identify biologically meaningful splicing mechanisms associated with breast cancer. In future work, we will extend our methods to de novo transcript assembly to identify novel isoforms, and we will incorporate isoform-specific networks into our methods to better understand splicing mechanisms in biological systems.
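The spike-and-slab idea described for SparseIso can be illustrated in a simplified linear-Gaussian setting: each candidate isoform has an inclusion indicator (the spike) and, if included, a normally distributed abundance (the slab), and a Gibbs sampler alternates over both. This is a toy reconstruction under assumed conjugate priors, not SparseIso itself; all names and hyperparameters are invented.

```python
import numpy as np

def spike_slab_gibbs(y, A, n_iter=500, sigma2=1.0, tau2=4.0, pi=0.2, rng=None):
    """Gibbs sampler for a simplified spike-and-slab linear model:
    y ~ N(A @ theta, sigma2*I) with theta_k = z_k * beta_k, where
    z_k ~ Bernoulli(pi) marks an "expressed" isoform and beta_k ~ N(0, tau2)
    is its abundance. Returns posterior inclusion frequencies per isoform."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, K = A.shape
    z = np.zeros(K, dtype=int)
    beta = np.zeros(K)
    incl = np.zeros(K)
    for _ in range(n_iter):
        for k in range(K):
            # residual with column k's contribution removed
            r = y - A @ (z * beta) + A[:, k] * z[k] * beta[k]
            v = A[:, k] @ A[:, k] / sigma2 + 1.0 / tau2   # posterior precision
            mu = (A[:, k] @ r / sigma2) / v               # posterior mean
            # Bayes factor for z_k = 1 vs 0 with beta_k integrated out
            log_bf = 0.5 * (mu * mu * v - np.log(tau2 * v))
            p1 = pi / (pi + (1.0 - pi) * np.exp(-log_bf))
            z[k] = int(rng.uniform() < p1)
            beta[k] = rng.normal(mu, 1.0 / np.sqrt(v)) if z[k] else 0.0
        incl += z
    return incl / n_iter
```

Strong coefficients should be included in nearly every sweep while null coefficients stay near the prior inclusion rate, which is how sparsity is enforced.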
39

Bayesian Integration and Modeling for Next-generation Sequencing Data Analysis

Chen, Xi 01 July 2016
Computational biology currently faces challenges in a big data world with thousands of data samples across multiple disease types including cancer. The challenging problem is how to extract biologically meaningful information from large-scale genomic data. Next-generation Sequencing (NGS) can now produce high quality data at DNA and RNA levels. However, in cells there exist a lot of non-specific (background) signals that affect the detection accuracy of true (foreground) signals. In this dissertation work, under Bayesian framework, we aim to develop and apply approaches to learn the distribution of genomic signals in each type of NGS data for reliable identification of specific foreground signals. We propose a novel Bayesian approach (ChIP-BIT) to reliably detect transcription factor (TF) binding sites (TFBSs) within promoter or enhancer regions by jointly analyzing the sample and input ChIP-seq data for one specific TF. Specifically, a Gaussian mixture model is used to capture both binding and background signals in the sample data; and background signals are modeled by a local Gaussian distribution that is accurately estimated from the input data. An Expectation-Maximization algorithm is used to learn the model parameters according to the distributions on binding signal intensity and binding locations. Extensive simulation studies and experimental validation both demonstrate that ChIP-BIT has a significantly improved performance on TFBS detection over conventional methods, particularly on weak binding signal detection. To infer cis-regulatory modules (CRMs) of multiple TFs, we propose to develop a Bayesian integration approach, namely BICORN, to integrate ChIP-seq and RNA-seq data of the same tissue. Each TFBS identified from ChIP-seq data can be either a functional binding event mediating target gene transcription or a non-functional binding. 
The functional bindings of a set of TFs usually work together as a CRM to regulate the transcription processes of a group of genes. We develop a Gibbs sampling approach to learn the distribution of CRMs (a joint distribution of multiple TFs) based on their functional bindings and target gene expression. The robustness of BICORN has been validated on simulated regulatory network and gene expression data with respect to different noise settings. BICORN is further applied to breast cancer MCF-7 ChIP-seq and RNA-seq data to identify CRMs functional in promoter or enhancer regions. In tumor cells, the normal regulatory mechanism may be interrupted by genome mutations, especially somatic mutations that occur uniquely in tumor cells. Focusing on a specific type of genome mutation, structural variation (SV), we develop a novel pattern-based probabilistic approach, namely PSSV, to identify somatic SVs from whole genome sequencing (WGS) data. PSSV features a mixture model with hidden states representing different mutation patterns; PSSV can thus differentiate heterozygous and homozygous SVs in each sample, enabling the identification of somatic SVs with a heterozygous status in the normal sample and a homozygous status in the tumor sample. Simulation studies demonstrate that PSSV outperforms existing tools. PSSV has been successfully applied to breast cancer patient WGS data to identify somatic SVs of key factors associated with breast cancer development. In this dissertation research, we demonstrate the advantage of the proposed distributional learning-based approaches over conventional methods for NGS data analysis. Distributional learning is a powerful approach to gaining biological insights from high quality NGS data. Successful applications of the proposed Bayesian methods to breast cancer NGS data shed light on the underlying molecular mechanisms of breast cancer, enabling biologists and clinicians to identify major cancer drivers and develop new therapeutics for cancer treatment.
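The Gaussian mixture fitted by EM that ChIP-BIT uses to separate binding from background signals can be illustrated with a generic one-dimensional two-component fit. This sketch is not the ChIP-BIT implementation (which also models binding locations and estimates the background locally from input data); it shows only the core E-step/M-step updates.

```python
import numpy as np

def em_two_gaussian(x, n_iter=200):
    """EM for a two-component 1-D Gaussian mixture, e.g. separating
    background (low intensity) from binding (high intensity) signals.
    Returns (weights, means, variances, responsibilities)."""
    # initialize the means by splitting the data at its quartiles
    mu = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
    var = np.array([x.var(), x.var()]) + 1e-6
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component per observation
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return w, mu, var, resp
```

On well-separated foreground/background intensities, the fitted means and weights recover the two underlying populations.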
40

Estimação do tamanho populacional a partir de um modelo de captura-recaptura com heterogeneidade / Estimation of the population size from a capture-recapture model with heterogeneity

Pezzott, George Lucas Moraes 14 March 2014
In this work, we consider the estimation of the number of errors in a software product from a closed population. The process of estimating the population size is based on the capture-recapture method, which consists of having the software examined, in parallel, by a number of reviewers. The probabilistic model adopted accommodates situations in which the reviewers are independent and homogeneous (equally efficient) and in which each error belongs to a disjoint partition with respect to its detection probability. We propose an iterative process to obtain maximum likelihood estimates, in which the EM algorithm is used to estimate the nuisance parameters. The estimates of the population parameters were also obtained under the Bayesian approach, using Markov chain Monte Carlo (MCMC) simulation via a Gibbs sampling algorithm with latent variables inserted into the conditional posterior distributions. The two approaches were applied to simulated data and to two real data sets from the literature. (Research funded by Financiadora de Estudos e Projetos.)
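For intuition, the simplest capture-recapture estimator for two independent, equally efficient reviewers is Chapman's bias-corrected version of the Lincoln-Petersen formula, sketched below. This is a textbook special case, not the thesis's EM/MCMC procedure for partitioned detection probabilities.

```python
def chapman_estimate(found_a, found_b):
    """Chapman's bias-corrected Lincoln-Petersen estimate of the total
    population size (e.g. total errors in a software product) from two
    independent reviewers. `found_a` and `found_b` are the sets of error
    identifiers each reviewer caught; the overlap drives the estimate."""
    n1, n2 = len(found_a), len(found_b)
    m = len(found_a & found_b)  # errors found by both reviewers
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1
```

If each reviewer finds 60 errors and 30 are common, half of each reviewer's catch overlaps, so the estimated total is roughly double each catch (about 119 here after the bias correction).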
