21 |
Nonparametric Bayesian Models for Joint Analysis of Imagery and Text
Li, Lingbo January 2014
<p>It has become increasingly important to develop statistical models to manage large-scale, high-dimensional image data. This thesis presents novel hierarchical nonparametric Bayesian models for joint analysis of imagery and text. The thesis consists of two main parts.</p><p>The first part is based on single image processing. We first present a spatially dependent model for simultaneous image segmentation and interpretation. Given a corrupted image, by imposing spatial inter-relationships within imagery, the model not only improves reconstruction performance but also yields smooth segmentation. Then we develop an online variational Bayesian algorithm for dictionary learning to process large-scale datasets, based on online stochastic optimization with a natural gradient step. We show that the dictionary is learned simultaneously with image reconstruction on large natural images containing tens of millions of pixels.</p><p>The second part applies dictionary learning to the joint analysis of multiple images and text to infer relationships among images. We show that feature extraction and image organization with annotation (when available) can be integrated by unifying dictionary learning and hierarchical topic modeling. We present image organization in both "flat" and hierarchical constructions. Compared with traditional algorithms, in which feature extraction is separated from model learning, our algorithms not only fit the datasets better but also provide richer and more interpretable structures of images.</p> / Dissertation
|
22 |
A Bayesian Nonparametric Approach for Causal Inference with Missing Covariates
Zang, Huaiyu 09 June 2020
No description available.
|
23 |
Bayesian Microphone Array Processing / ベイズ法によるマイクロフォンアレイ処理
Otsuka, Takuma 24 March 2014
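Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Informatics / Kō No. 18412 / Jōhaku No. 527 / 新制||情||93 (University Library) / 31270 / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Hiroshi Okuno, Professor Tatsuya Kawahara, Associate Professor Marco Cuturi Cameto, Lecturer Kazuyoshi Yoshii / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM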
|
24 |
Online Clustering with Bayesian Nonparametrics
Scherreik, Matthew D. January 2020
No description available.
|
25 |
A New Nonparametric Procedure for the k-sample Problem
Wilcock, Samuel Phillip 18 September 2001
The k-sample data setting is one of the most common data settings in use today. The null hypothesis most often of interest for these methods is that the k samples have the same location. Several procedures are currently available for data of this type. The most often used method is commonly called the ANOVA F-test. This test assumes that all of the underlying distributions are normal, with equal variances. Thus the only allowable difference between the distributions is a possible shift under the alternative hypothesis. Under the null hypothesis, it is assumed that all k distributions are identical, not just equally located.
Current nonparametric methods for the k-sample setting require a variety of restrictions on the distribution of the data. The most commonly used method is that due to Kruskal and Wallis (1952). The method, commonly called the Kruskal-Wallis test, does not assume that the data come from normal populations, though they must still be continuous, but maintains the requirement that the populations must be identical under the null, and may differ only by a possible shift under the alternative.
In this work a new procedure is developed which is exactly distribution free when the distributions are equivalent and continuous under the null hypothesis, and simulations are used to study the properties of the test when the distributions are continuous and have the same medians under the null. The power of the statistic under alternatives is also studied. The test bears a resemblance to the two sample sign type tests, which will be pointed out as the development is shown. / Ph. D.
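For orientation, the Kruskal-Wallis statistic mentioned above can be sketched in a few lines of standard-library Python. This is an illustration of the classical rank-based test only, not of the new procedure developed in the thesis; the function names are hypothetical, and no tie correction is applied because the abstract assumes continuous data.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(*samples):
    """Kruskal-Wallis H statistic (no tie correction) for k samples."""
    pooled = [x for s in samples for x in s]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])  # rank sum of this sample
        total += rank_sum * rank_sum / len(s)
        start += len(s)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
```

For three perfectly separated samples such as `[1, 2, 3]`, `[4, 5, 6]`, `[7, 8, 9]`, the statistic is approximately 7.2; under the null hypothesis H is compared with a chi-squared distribution with k - 1 degrees of freedom for moderately large samples.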
|
26 |
A Bayesian nonparametric approach for the two-sample problem / Uma abordagem bayesiana não paramétrica para o problema de duas amostras
Console, Rafael de Carvalho Ceregatti de 19 November 2018
In this work, we discuss the so-called two-sample problem (Pearson and Neyman, 1930) from a nonparametric Bayesian perspective. Given two independent i.i.d. samples X1, ..., Xn and Y1, ..., Ym generated from P1 and P2, respectively, the two-sample problem consists in deciding whether P1 and P2 are equal. Assuming a nonparametric prior, we propose an evidence index for the null hypothesis H0: P1 = P2 based on the posterior distribution of the distance d(P1, P2) between P1 and P2. This evidence index is easy to compute, has an intuitive interpretation, and can also be justified in a Bayesian decision-theoretic context. Further, in a Monte Carlo simulation study, our method performed well compared with the well-known Kolmogorov-Smirnov test, the Wilcoxon test, and a recent testing procedure based on the Polya tree process proposed by Holmes et al. (2015). Finally, we applied our method to a data set of scale measurements from three different groups of patients who completed a questionnaire for the diagnosis of Alzheimer's disease.
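The idea of a posterior distribution over a distance d(P1, P2) can be illustrated with a small sketch. The abstract does not specify the prior, the distance, or the form of the evidence index, so everything here is an assumption chosen for illustration: the Bayesian bootstrap (Dirichlet(1, ..., 1) weights on the observed points) stands in for the nonparametric posterior, the Kolmogorov-Smirnov distance stands in for d, and all function names are hypothetical.

```python
import random


def bayesian_bootstrap_weights(n, rng):
    """Draw Dirichlet(1, ..., 1) weights by normalizing Exp(1) variates."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_ks_distance(xs, wx, ys, wy):
    """Sup-norm distance between two weighted empirical CDFs."""
    distance = 0.0
    for t in sorted(set(xs) | set(ys)):
        f1 = sum(w for x, w in zip(xs, wx) if x <= t)
        f2 = sum(w for y, w in zip(ys, wy) if y <= t)
        distance = max(distance, abs(f1 - f2))
    return distance


def posterior_distance_draws(xs, ys, n_draws=500, seed=0):
    """Approximate posterior draws of d(P1, P2) under the assumed prior."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        wx = bayesian_bootstrap_weights(len(xs), rng)
        wy = bayesian_bootstrap_weights(len(ys), rng)
        draws.append(weighted_ks_distance(xs, wx, ys, wy))
    return draws
```

A summary of these draws, such as the share that falls below a chosen tolerance, could serve as a rough indication of support for H0, though that particular summary is also an assumption and not the index proposed in the thesis.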
|
27 |
Nonparametric Bayesian Dictionary Learning and Count and Mixture Modeling
Zhou, Mingyuan January 2013
<p>Analyzing the ever-increasing data of unprecedented scale, dimensionality, diversity, and complexity poses considerable challenges to conventional approaches of statistical modeling. Bayesian nonparametrics constitute a promising research direction, in that such techniques can fit the data with a model that can grow with complexity to match the data. In this dissertation we consider nonparametric Bayesian modeling with completely random measures, a family of pure-jump stochastic processes with nonnegative increments. In particular, we study dictionary learning for sparse image representation using the beta process and the dependent hierarchical beta process, and we present the negative binomial process, a novel nonparametric Bayesian prior that unites the seemingly disjoint problems of count and mixture modeling. We show a wide variety of successful applications of our nonparametric Bayesian latent variable models to real problems in science and engineering, including count modeling, text analysis, image processing, compressive sensing, and computer vision.</p> / Dissertation
|
28 |
Bayesian Nonparametric Modeling and Inference for Multiple Object Tracking
January 2019
The problem of multiple object tracking seeks to jointly estimate the time-varying cardinality and trajectory of each object. Tracking multiple objects poses numerous challenges, including a time-varying number of measurements, varying constraints, and environmental conditions. In this thesis, the proposed statistical methods combine physics-based models with Bayesian nonparametric methods to address the main challenges in a tracking problem. In particular, Bayesian nonparametric methods are exploited to efficiently and robustly infer object identity and learn time-dependent cardinality; together with Bayesian inference methods, they are also used to associate measurements with objects and estimate the trajectories of objects. These methods differ fundamentally from existing methods, which are mainly based on random finite set theory.
The first contribution proposes dependent nonparametric models such as the dependent Dirichlet process and the dependent Pitman-Yor process to capture the inherent time-dependency in the problem at hand. These processes are used as priors for object state distributions to learn dependent information between previous and current time steps. Markov chain Monte Carlo sampling methods exploit the learned information to sample from posterior distributions and update the estimated object parameters.
The second contribution proposes a novel, robust, and fast nonparametric approach based on a diffusion process over infinite random trees to infer information on object cardinality and trajectory. This method follows the hierarchy induced by objects entering and leaving a scene and the time-dependency between unknown object parameters. Markov chain Monte Carlo sampling methods integrate the prior distributions over the infinite random trees with time-dependent diffusion processes to update object states.
The third contribution develops the use of hierarchical models to form a prior for statistically dependent measurements in a single object tracking setup. Dependency among the sensor measurements provides extra information, which is incorporated to improve tracking performance. The hierarchical Dirichlet process as a prior provides the flexibility required for inference. A Bayesian tracker is integrated with the hierarchical Dirichlet process prior to accurately estimate the object trajectory.
The fourth contribution proposes an approach to model both multiple dependent objects and multiple dependent measurements. This approach integrates dependent Dirichlet process modeling of the dependent objects with hierarchical Dirichlet process modeling of the measurements to fully capture the dependency among both objects and measurements. Bayesian nonparametric models can successfully associate each measurement with the corresponding object and exploit the dependency among them to infer the trajectories of objects more accurately. Markov chain Monte Carlo methods combine the dependent Dirichlet process with the hierarchical Dirichlet process to infer object identity and object cardinality.
Simulations demonstrate the improvement in multiple object tracking performance over approaches based on random finite set theory. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2019
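To make concrete how a Dirichlet process prior leaves the number of clusters (here, objects) open, the following sketch samples a partition from the Chinese restaurant process, the clustering rule induced by a Dirichlet process. It is a generic textbook construction and not the dependent or hierarchical models of the dissertation; the function name and the concentration parameter `alpha` are assumptions for illustration.

```python
import random


def chinese_restaurant_process(n, alpha, seed=0):
    """Sample cluster labels for n items; alpha > 0 controls new-cluster rate."""
    rng = random.Random(seed)
    counts = []        # counts[k] = number of items already in cluster k
    assignments = []   # assignments[i] = cluster label of item i
    for i in range(n):
        # Existing cluster k is chosen with probability counts[k] / (i + alpha),
        # a new cluster with probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        assignments.append(chosen)
    return assignments, counts
```

Calling `chinese_restaurant_process(100, alpha=1.0)` returns one label per item together with the cluster sizes; larger values of `alpha` tend to produce more clusters, which is the sense in which cardinality is learned from the data instead of being fixed in advance.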
|
29 |
Bayesian Text Analytics for Document Collections
Walker, Daniel David 15 November 2012
Modern document collections are too large to annotate and curate manually. As increasingly large amounts of data become available, historians, librarians and other scholars increasingly need to rely on automated systems to efficiently and accurately analyze the contents of their collections and to find new and interesting patterns therein. Modern techniques in Bayesian text analytics are becoming widespread and have the potential to revolutionize the way that research is conducted. Much work has been done in the document modeling community towards this end, though most of it is focused on modern, relatively clean text data. We present research for improved modeling of document collections that may contain textual noise or that may include real-valued metadata associated with the documents. This class of documents includes many historical document collections. Indeed, our specific motivation for this work is to help improve the modeling of historical documents, which are often noisy and/or have historical context represented by metadata. Many historical documents are digitized by means of Optical Character Recognition (OCR) from document images of old and degraded original documents. Historical documents also often include associated metadata, such as timestamps, which can be incorporated in an analysis of their topical content. Many techniques, such as topic models, have been developed to automatically discover patterns of meaning in large collections of text. While these methods are useful, they can break down in the presence of OCR errors. We show the extent to which this performance breakdown occurs. The specific types of analyses covered in this dissertation are document clustering, feature selection, unsupervised and supervised topic modeling for documents with and without OCR errors, and a new supervised topic model that uses Bayesian nonparametrics to improve the modeling of document metadata.
We present results in each of these areas, with an emphasis on studying the effects of noise on the performance of the algorithms and on modeling the metadata associated with the documents. In this research we effectively: improve the state of the art in both document clustering and topic modeling; introduce a useful synthetic dataset for historical document researchers; and present analyses that empirically show how existing algorithms break down in the presence of OCR errors.
|
30 |
Graph-based Modern Nonparametrics For High-dimensional Data
Wang, Kaijun January 2019
Developing nonparametric statistical methods and inference procedures for large, high-dimensional data has been a challenging frontier problem in statistics. To attack this problem, a clear trend has emerged in recent years toward a radically different viewpoint, "graph-based nonparametrics," which is the main research focus of this dissertation. The basic idea consists of two steps: (i) a representation step, which encodes the given data as graphs, and (ii) an analysis step, which applies statistical methods to the graph-transformed problem to systematically tackle various types of data structures. Under this general framework, this dissertation develops two major research directions. Chapter 2, based on Mukhopadhyay and Wang (2019a), introduces a new nonparametric method for the high-dimensional k-sample comparison problem that is distribution-free, robust, and continues to work even when the dimension of the data is larger than the sample size. The proposed theory is based on modern LP-nonparametrics tools and unexplored connections with spectral graph theory. The key is to construct a specially designed weighted graph from the data and to reformulate the k-sample problem as a community detection problem. The procedure is shown to possess various desirable properties along with a characteristic exploratory flavor that has practical consequences. The numerical examples show surprisingly good performance of our method under a broad range of realistic situations. Chapter 3, based on Mukhopadhyay and Wang (2019b), revisits some foundational questions about network modeling that are still unsolved. In particular, we present a unified statistical theory of the fundamental spectral graph methods (e.g., Laplacian, modularity, diffusion map, regularized Laplacian, Google PageRank model), which are often viewed as heuristics whose empirical success remains unexplained.
Despite half a century of research, this question has been one of the most formidable open issues, if not the core problem, in modern network science. Our approach integrates modern nonparametric statistics, mathematical approximation theory (of integral equations), and computational harmonic analysis in a novel way to develop a theory that unifies and generalizes the existing paradigm. From a practical standpoint, it is shown that this perspective can provide adequate guidance for designing next-generation computational tools for large-scale problems. As an example, we describe the high-dimensional change-point detection problem. Chapter 4 discusses some further extensions and applications of our methodologies to regularized spectral clustering and spatial graph regression problems. The dissertation concludes with a discussion of two important areas of future study. / Statistics
|