  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Random finite sets for multitarget tracking with applications

Wood, Trevor M. January 2011 (has links)
Multitarget tracking is the process of jointly determining the number of targets present and their states from noisy sets of measurements. The difficulty of the multitarget tracking problem is that the number of targets present can change as targets appear and disappear, while the sets of measurements may contain false alarms and measurements of true targets may be missed. The theory of random finite sets was proposed as a systematic, Bayesian approach to solving the multitarget tracking problem. The conceptual solution is given by Bayes filtering for the probability distribution of the set of target states, conditioned on the sets of measurements received, known as the multitarget Bayes filter. A first-moment approximation to this filter, the probability hypothesis density (PHD) filter, provides a more computationally practical, but theoretically sound, solution. The central thesis of this work is that the random finite set framework is theoretically sound, compatible with the Bayesian methodology and amenable to immediate implementation in a wide range of contexts. In advancing this thesis, new links between the PHD filter and existing Bayesian approaches for manoeuvre handling and incorporation of target amplitude information are presented. A new multitarget metric which permits incorporation of target confidence information is derived, and new algorithms are developed which facilitate sequential Monte Carlo implementations of the PHD filter. Several applications of the PHD filter are presented, with a focus on tracking in sonar data; good results are presented for implementations on real active and passive sonar data. The PHD filter is also deployed to extract bacterial trajectories from microscopic visual data, aiding ongoing work in understanding bacterial chemotaxis. A performance comparison between the PHD filter and conventional multitarget tracking methods using simulated data is also presented, showing favourable results for the PHD filter.
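For readers unfamiliar with the PHD filter, the standard intensity recursion from the random-finite-set literature is summarised below in generic notation; this is background material, not a formula taken from the thesis itself.

```latex
% Standard PHD filter recursion (background; spawning term omitted).
% D is the intensity (first moment) of the multitarget state; p_S is the survival probability,
% f the single-target transition density, \gamma_k the birth intensity, p_D the detection
% probability, g the measurement likelihood, \kappa the clutter intensity, Z_k the measurement set.
D_{k|k-1}(x) = \gamma_k(x) + \int p_S(x')\, f(x \mid x')\, D_{k-1}(x')\, dx'
D_k(x) = \bigl[1 - p_D(x)\bigr] D_{k|k-1}(x)
       + \sum_{z \in Z_k} \frac{p_D(x)\, g(z \mid x)\, D_{k|k-1}(x)}
                               {\kappa(z) + \int p_D(x')\, g(z \mid x')\, D_{k|k-1}(x')\, dx'}
```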
52

Bayesian pathway analysis in epigenetics

Wright, Alan January 2013 (has links)
A typical gene expression data set consists of measurements of a large number of gene expressions on a relatively small number of subjects, classified according to two or more outcomes, for example cancer or non-cancer. The identification of associations between gene expressions and outcome is a huge multiple testing problem. Early approaches to this problem involved the application of thousands of univariate tests with corrections for multiplicity. Over the past decade, numerous studies have demonstrated that analyzing gene expression data structured into predefined gene sets can produce benefits in terms of statistical power and robustness when compared to alternative approaches. This thesis presents the results of research on gene set analysis. In particular, it examines the properties of some existing methods for the analysis of gene sets, and it introduces novel Bayesian methods for gene set analysis. A distinguishing feature of these methods is that the model is specified conditionally on the expression data, whereas other methods of gene set analysis and IGA generally make inferences conditionally on the outcome. Computer simulation is used to compare three common established methods for gene set analysis; in this simulation study, a new procedure for simulating gene expression data is introduced. The simulation studies are used to identify situations in which the established methods perform poorly. The Bayesian approaches developed in this thesis apply reversible jump Markov chain Monte Carlo (RJMCMC) techniques to model gene expression effects on phenotype. The reversible jump step in the modelling procedure allows posterior probabilities of gene-set activity to be produced. These mixture models reverse the generally accepted conditionality and model outcome given gene expression, which is a more intuitive assumption when modelling the pathway to phenotype. It is demonstrated that the two models proposed may be superior to the established methods studied. There is considerable scope for further development of this line of research, which is appealing in terms of the use of mixture model priors that reflect the belief that a relatively small number of genes, restricted to a small number of gene sets, are associated with the outcome.
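As a rough illustration of the kind of sparse mixture prior described at the end of the abstract, a textbook spike-and-slab form is sketched below; the thesis's exact specification may differ.

```latex
% Illustrative spike-and-slab prior on the effect \beta_g of gene g (not the thesis's exact model).
% \gamma_g = 1 marks gene g as active; a small \pi encodes the belief that few genes are associated
% with the outcome, and RJMCMC moves that add or delete genes yield posterior activity probabilities.
\beta_g \mid \gamma_g \sim (1-\gamma_g)\,\delta_0 + \gamma_g\,\mathcal{N}(0,\tau^2),
\qquad \gamma_g \sim \mathrm{Bernoulli}(\pi)
```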
53

New Advancements of Scalable Statistical Methods for Learning Latent Structures in Big Data

Zhao, Shiwen January 2016 (has links)
Constant technological advances have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, may be continuous values or counts. With the advancement of high-throughput technology, the abundance of such data has become unprecedentedly rich. Efficient statistical approaches are therefore crucial in this big data era. Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent Gaussian-distributed multivariate vector; with this assumption, the factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, in which the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions to these statistical methods, developed to address challenges in big data. The novel methods are applied in multiple real-world applications, including construction of condition-specific gene co-expression networks, estimation of shared topics among newsgroups, analysis of promoter sequences, analysis of political-economic risk data, and estimation of population structure from genotype data. / Dissertation
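The two latent-structure examples mentioned in the abstract have standard textbook forms, reproduced here for context in generic notation; the dissertation's extensions are not shown.

```latex
% Gaussian factor model: k latent factors induce a low-rank-plus-diagonal covariance.
x_i = \Lambda f_i + \epsilon_i, \quad f_i \sim \mathcal{N}(0, I_k), \quad \epsilon_i \sim \mathcal{N}(0, \Psi)
\;\;\Rightarrow\;\; \mathrm{Cov}(x_i) = \Lambda\Lambda^{\top} + \Psi
% Latent Dirichlet allocation: Dirichlet-distributed topic proportions per document d,
% a topic assignment z_{dn} per word, and a word drawn from that topic's distribution.
\theta_d \sim \mathrm{Dirichlet}(\alpha), \quad z_{dn} \sim \mathrm{Categorical}(\theta_d), \quad
w_{dn} \sim \mathrm{Categorical}(\beta_{z_{dn}})
```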
54

Bayesian Emulation for Sequential Modeling, Inference and Decision Analysis

Irie, Kaoru January 2016 (has links)
Advances in three related areas of state-space modeling, sequential Bayesian learning, and decision analysis are addressed, with the statistical challenges of scalability and associated dynamic sparsity. The key theme that ties the three areas together is Bayesian model emulation: solving challenging analysis and computational problems using creative model emulators. This idea defines theoretical and applied advances in non-linear, non-Gaussian state-space modeling, dynamic sparsity, decision analysis and statistical computation, across linked contexts of multivariate time series and dynamic network studies. Examples and applications in financial time series and portfolio analysis, macroeconomics, and internet studies from computational advertising demonstrate the utility of the core methodological innovations. Chapter 1 summarizes the three areas and the key idea of emulation in each. Chapter 2 discusses the sequential analysis of latent threshold models using emulating models that allow analytical filtering to enhance the efficiency of posterior sampling. Chapter 3 examines the emulator model in decision analysis, or synthetic model, which is equivalent to the loss function in the original minimization problem, and shows its performance in the context of sequential portfolio optimization. Chapter 4 describes a method for modeling streaming count data observed on a large network that relies on emulating the full, dependent network model with independent, conjugate sub-models customized to each set of flows. Chapter 5 reviews these advances and makes concluding remarks. / Dissertation
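For context, latent threshold models of the kind referenced in Chapter 2 are usually written in the following form in the dynamic-sparsity literature; this is generic background, not the thesis's emulator construction.

```latex
% Latent threshold dynamic regression (standard background form).
% A time-varying coefficient \beta_{jt} contributes only when its magnitude exceeds a latent
% threshold d_j, which induces dynamic sparsity in the effective coefficients b_t.
y_t = x_t^{\top} b_t + \nu_t, \qquad
b_{jt} = \beta_{jt}\,\mathbb{1}\bigl(|\beta_{jt}| \ge d_j\bigr), \qquad
\beta_{jt} = \mu_j + \phi_j(\beta_{j,t-1} - \mu_j) + \eta_{jt}
```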
55

An Introduction to the Theory and Applications of Bayesian Networks

Jaitha, Anant 01 January 2017 (has links)
Bayesian networks are a means of studying data. A Bayesian network gives structure to a data set by modeling the variables of the problem space in a graphical system and developing probability distributions over those variables. Statistical inference is then conducted over these probability distributions to draw meaning from them, making Bayesian networks an efficient way to explore large data sets and draw inferences from them. A number of real-world applications already exist and are being actively researched. This paper discusses the theory and applications of Bayesian networks.
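A minimal, self-contained sketch of what such a network encodes follows, using a hypothetical rain/sprinkler/wet-grass example that is not taken from the paper: the joint distribution factorises along the graph, and queries are answered by summing the joint over unobserved variables.

```python
# Hypothetical three-node Bayesian network: Rain -> Sprinkler, and (Rain, Sprinkler) -> WetGrass.
# The joint factorises as P(R, S, W) = P(R) * P(S | R) * P(W | R, S).

P_R = {True: 0.2, False: 0.8}                       # P(Rain)
P_S_given_R = {True: {True: 0.01, False: 0.99},     # P(Sprinkler | Rain)
               False: {True: 0.40, False: 0.60}}
P_W_given_RS = {(True, True): 0.99, (True, False): 0.80,   # P(WetGrass=True | Rain, Sprinkler)
                (False, True): 0.90, (False, False): 0.01}

def joint(r, s, w):
    """Joint probability of one full assignment, read off the chain-rule factorisation."""
    pw = P_W_given_RS[(r, s)]
    return P_R[r] * P_S_given_R[r][s] * (pw if w else 1.0 - pw)

# Inference by enumeration: P(Rain = True | WetGrass = True).
numerator = sum(joint(True, s, True) for s in (True, False))
evidence = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print("P(Rain | WetGrass) =", numerator / evidence)
```

Real applications replace the hand-written tables with distributions learned from data and use more efficient inference than brute-force enumeration, but the structure of the computation is the same.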
56

Polarised neutron diffraction measurements of PrBa2Cu3O6+x and the Bayesian statistical analysis of such data

Markvardsen, Anders Johannes January 2000 (has links)
The physics of the series Pr<sub>y</sub>Y<sub>1-y</sub>Ba<sub>2</sub>Cu<sub>3</sub>O<sub>6+x</sub>, and the ability of Pr to suppress superconductivity, have been the subject of frequent discussion in the literature for more than a decade. This thesis describes a polarised neutron diffraction (PND) experiment performed on PrBa<sub>2</sub>Cu<sub>3</sub>O<sub>6.24</sub> designed to probe the electronic structure. This experiment pushed the limits of what can be done using the PND technique. The problem is one of a limited number of measured Fourier components that need to be inverted to form a real-space image. To accomplish this inversion, the maximum entropy technique has been employed. In some cases, the maximum entropy technique has the ability to increase the resolution of ‘inverted’ data immensely, but this ability is found to depend critically on the choice of constants used in the method. To investigate this, a Bayesian robustness analysis of the maximum entropy method is carried out, resulting in an improvement of the maximum entropy technique for analysing PND data. Some results for nickel in the literature have been re-analysed and a comparison is made with different maximum entropy algorithms. Equipped with an improved data analysis technique and carefully measured PND data for PrBa<sub>2</sub>Cu<sub>3</sub>O<sub>6.24</sub>, a number of new interesting features are observed, putting constraints on existing theoretical models of Pr<sub>y</sub>Y<sub>1-y</sub>Ba<sub>2</sub>Cu<sub>3</sub>O<sub>6+x</sub> and leaving room for more questions to be answered.
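For reference, the maximum entropy reconstruction used for this kind of inversion is conventionally written as follows; this is the generic form from the literature, and the constants whose choice matters are typically the regularisation weight α and the default model m.

```latex
% Conventional maximum-entropy reconstruction of a density f from noisy Fourier data d.
% F_j(f) are the Fourier components predicted from the trial density f; m is the default model.
\hat{f} = \arg\max_f \Bigl[\alpha\, S(f) - \tfrac{1}{2}\chi^2(f)\Bigr], \qquad
S(f) = \sum_i \Bigl(f_i - m_i - f_i \ln\frac{f_i}{m_i}\Bigr), \qquad
\chi^2(f) = \sum_j \frac{\bigl(F_j(f) - d_j\bigr)^2}{\sigma_j^2}
```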
57

Análise Bayesiana de dois problemas em Astrofísica Relativística: neutrinos do colapso gravitacional e massas das estrelas de nêutrons / Bayesian analysis of two problems in Relativistic Astrophysics: neutrinos from gravitational collapse and mass distribution of neutron stars.

Lima, Rodolfo Valentim da Costa 19 April 2012 (has links)
O evento extraordinário de SN1987A vem sendo investigado há mais de vinte e cinco anos. O fascínio que cerca tal evento astronômico está relacionado com a observação em tempo real da explosão à luz da Física de neutrinos. Detectores espalhados pelo mundo observaram um surto de neutrinos que dias mais tarde foi confirmado como sendo da SN1987A. Kamiokande, IMB e Baksan apresentaram os eventos detectados, que permitiram o estudo de modelos para a explosão e o resfriamento da hipotética estrela de nêutrons remanescente. Até hoje não há um consenso sobre a origem do progenitor e a natureza do objeto compacto remanescente. O trabalho se divide em duas partes: estudo dos neutrinos de SN1987A através de Análise Estatística Bayesiana, com um modelo proposto de duas temperaturas que evidenciam dois surtos de neutrinos. A motivação está na hipótese do segundo surto como resultado da formação de matéria estranha no objeto compacto. A metodologia empregada foi a desenvolvida no trabalho de Loredo & Lamb (2002), que permite modelar e testar hipóteses sobre os modelos via Bayesian Information Criterion (BIC). Na segunda parte do trabalho, a mesma metodologia estatística é usada no estudo da distribuição de massas das estrelas de nêutrons usando a base de dados disponível (http://stellarcollapse.org), analisada utilizando somente o valor de massa de cada objeto e seu desvio padrão. Construindo uma função de verossimilhança e utilizando distribuições a priori, a hipótese de bimodalidade da distribuição das massas é testada contra uma distribuição unimodal sobre todas as massas dos objetos. O teste BIC indica forte tendência favorável à existência da bimodalidade, com valores centrados em 1.37 M☉ para objetos de baixa massa e 1.73 M☉ para objetos de alta massa, e a confirmação da fraca evidência de um terceiro pico esperado em 1.25 M☉. / The extraordinary event of SN1987A has been investigated for more than twenty-five years. The fascination surrounding this astronomical event lies in the real-time observation of the explosion in the light of neutrino physics. Detectors spread around the world observed a burst of neutrinos that was confirmed days later as coming from SN1987A. Kamiokande, IMB and Baksan reported the detected events, which allowed the study of models for the explosion and for the cooling of the hypothetical remnant neutron star. To this day there is no consensus on the origin of the progenitor or the nature of the remnant compact object. The work is divided into two parts: a study of the SN1987A neutrinos through Bayesian statistical analysis, using a proposed two-temperature model that points to two neutrino bursts. The motivation lies in the hypothesis that the second burst results from the formation of strange matter in the compact object. The methodology employed was developed in the work of Loredo & Lamb (2002), which allows models to be fitted and hypotheses about them to be tested via the Bayesian Information Criterion (BIC). In the second part of the work, the same statistical methodology is used to study the mass distribution of neutron stars using the available database (http://stellarcollapse.org), analysed using only the mass value of each object and its standard deviation. Constructing a likelihood function and using prior distributions, a hypothesis of bimodality in the mass distribution is tested against a unimodal distribution over all object masses. The BIC test indicates a strong trend in favour of bimodality, with values centred at 1.37 M☉ for low-mass objects and 1.73 M☉ for high-mass objects, and confirms the weak evidence for a third peak expected around 1.25 M☉.
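The model-selection criterion used in both parts of the work is the standard Bayesian Information Criterion, reproduced here in its generic form:

```latex
% BIC for a model with k free parameters, n observations, and maximised likelihood \hat{L};
% lower values are preferred, so the bimodal mass model is favoured only when its likelihood
% gain outweighs the penalty for its additional parameters.
\mathrm{BIC} = k \ln n - 2 \ln \hat{L}
```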
58

Statistical Methods for Characterizing Genomic Heterogeneity in Mixed Samples

Zhang, Fan 12 December 2016 (has links)
"Recently, sequencing technologies have generated massive and heterogeneous data sets. However, interpretation of these data sets is a major barrier to understand genomic heterogeneity in complex diseases. In this dissertation, we develop a Bayesian statistical method for single nucleotide level analysis and a global optimization method for gene expression level analysis to characterize genomic heterogeneity in mixed samples. The detection of rare single nucleotide variants (SNVs) is important for understanding genetic heterogeneity using next-generation sequencing (NGS) data. Various computational algorithms have been proposed to detect variants at the single nucleotide level in mixed samples. Yet, the noise inherent in the biological processes involved in NGS technology necessitates the development of statistically accurate methods to identify true rare variants. At the single nucleotide level, we propose a Bayesian probabilistic model and a variational expectation maximization (EM) algorithm to estimate non-reference allele frequency (NRAF) and identify SNVs in heterogeneous cell populations. We demonstrate that our variational EM algorithm has comparable sensitivity and specificity compared with a Markov Chain Monte Carlo (MCMC) sampling inference algorithm, and is more computationally efficient on tests of relatively low coverage (27x and 298x) data. Furthermore, we show that our model with a variational EM inference algorithm has higher specificity than many state-of-the-art algorithms. In an analysis of a directed evolution longitudinal yeast data set, we are able to identify a time-series trend in non-reference allele frequency and detect novel variants that have not yet been reported. Our model also detects the emergence of a beneficial variant earlier than was previously shown, and a pair of concomitant variants. Characterization of heterogeneity in gene expression data is a critical challenge for personalized treatment and drug resistance due to intra-tumor heterogeneity. Mixed membership factorization has become popular for analyzing data sets that have within-sample heterogeneity. In recent years, several algorithms have been developed for mixed membership matrix factorization, but they only guarantee estimates from a local optimum. At the gene expression level, we derive a global optimization (GOP) algorithm that provides a guaranteed epsilon-global optimum for a sparse mixed membership matrix factorization problem for molecular subtype classification. We test the algorithm on simulated data and find the algorithm always bounds the global optimum across random initializations and explores multiple modes efficiently. The GOP algorithm is well-suited for parallel computations in the key optimization steps. "
59

Identification and photometric redshifts for type-I quasars with medium- and narrow-band filter surveys / Identificação e redshifts fotométricos para quasares do tipo-I com sistemas de filtros de bandas médias e estreitas

Silva, Carolina Queiroz de Abreu 16 November 2015 (has links)
Quasars are valuable sources for several cosmological applications. In particular, they can be used to trace some of the heaviest halos, and their high intrinsic luminosities allow them to be detected at high redshifts. This implies that quasars (or active galactic nuclei, in a more general sense) have a huge potential to map the large-scale structure. However, this potential has not yet been fully realized, because instruments which rely on broad-band imaging to pre-select spectroscopic targets usually miss most quasars and, consequently, are not able to properly separate broad-line emitting quasars from other point-like sources (such as stars and low-resolution galaxies). This work is an initial attempt to investigate the realistic gains in the identification and separation of quasars and stars when medium- and narrow-band filters in the optical are employed. The main novelty of our approach is the use of Bayesian priors both for the angular distribution of stars of different types on the sky and for the distribution of quasars as a function of redshift. Since the evidence from these priors convolves the angular dependence of stars with the redshift dependence of quasars, this allows us to control for the near degeneracy between these objects. However, our results are inconclusive in quantifying the efficiency of star-quasar separation with this approach and, hence, some critical refinements and improvements are still necessary. / Quasares são objetos valiosos para diversas aplicações cosmológicas. Em particular, eles podem ser usados para localizar alguns dos halos mais massivos e suas luminosidades intrinsecamente elevadas permitem que eles sejam detectados a altos redshifts. Isso implica que quasares (ou núcleos ativos de galáxias, de um modo geral) possuem um grande potencial para mapear a estrutura em larga escala. Entretanto, esse potencial ainda não foi completamente atingido, porque instrumentos que se baseiam no imageamento por bandas largas para pré-selecionar alvos espectroscópicos perdem a maioria dos quasares e, consequentemente, não são capazes de separar adequadamente quasares com linhas de emissão largas de outras fontes pontuais (como estrelas e galáxias de baixa resolução). Esse trabalho é uma tentativa inicial de investigar os ganhos reais na identificação e separação de quasares e estrelas quando são usados filtros de bandas médias e estreitas. A principal novidade desse método é o uso de priors Bayesianos tanto para a distribuição angular de estrelas de diferentes tipos no céu quanto para a distribuição de quasares como função do redshift. Como a evidência desses priors é uma convolução entre a dependência angular das estrelas e a dependência em redshift dos quasares, isso permite que a degenerescência entre esses objetos seja levada em consideração. Entretanto, nossos resultados ainda são inconclusivos para quantificar a eficiência da separação entre estrelas e quasares utilizando esse método e, portanto, alguns refinamentos críticos são necessários.
60

Construção de redes usando estatística clássica e Bayesiana - uma comparação / Building complex networks through classical and Bayesian statistics - a comparison

Lina Dornelas Thomas 13 March 2012 (has links)
Nesta pesquisa, estudamos e comparamos duas maneiras de se construir redes. O principal objetivo do nosso estudo é encontrar uma forma efetiva de se construir redes, especialmente quando temos menos observações do que variáveis. A construção das redes é realizada através da estimação do coeficiente de correlação parcial com base na estatística clássica (inverse method) e na Bayesiana (priori conjugada Normal - Wishart invertida). No presente trabalho, para resolver o problema de se ter menos observações do que variáveis, propomos uma nova metodologia, a qual chamamos correlação parcial local, que consiste em selecionar, para cada par de variáveis, as demais variáveis que apresentam maior coeficiente de correlação com o par. Aplicamos essas metodologias em dados simulados e as comparamos traçando curvas ROC. O resultado mais atrativo foi que, mesmo com custo computacional alto, usar inferência Bayesiana é melhor quando temos menos observações do que variáveis. Em outros casos, ambas abordagens apresentam resultados satisfatórios. / This research studies and compares two different ways of building complex networks. The main goal of our study is to find an effective way to build networks, particularly when we have fewer observations than variables. We construct networks by estimating the partial correlation coefficient using classical statistics (the inverse method) and Bayesian statistics (Normal - inverse Wishart conjugate prior). In this work, in order to solve the problem of having fewer observations than variables, we propose a new methodology called local partial correlation, which consists of selecting, for each pair of variables, the other variables most correlated with the pair. We applied these methods to simulated data and compared them through ROC curves. The most attractive result is that, even though it has a high computational cost, using Bayesian inference is better when we have fewer observations than variables. In other cases, both approaches present satisfactory results.
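A minimal sketch of the classical "inverse method" mentioned above follows, assuming more observations than variables so that the sample covariance is invertible; the data are an illustrative toy set, and the Bayesian Normal-inverse-Wishart variant and the proposed local partial correlation are not shown.

```python
# Classical "inverse method": partial correlations are read off the inverse of the sample
# covariance matrix, and an edge is drawn when their magnitude exceeds a threshold.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))                     # 50 observations, 8 variables (toy data)

P = np.linalg.inv(np.cov(X, rowvar=False))           # precision matrix
D = np.sqrt(np.outer(np.diag(P), np.diag(P)))
partial_corr = -P / D                                # rho_ij|rest = -P_ij / sqrt(P_ii * P_jj)
np.fill_diagonal(partial_corr, 1.0)

edges = [(i, j) for i in range(8) for j in range(i + 1, 8)
         if abs(partial_corr[i, j]) > 0.3]           # arbitrary threshold for the toy network
print(edges)
```

When there are fewer observations than variables the sample covariance is singular and this inversion fails, which is precisely the regime where the abstract reports the Bayesian approach performing better.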
