• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Graph-based Modern Nonparametrics For High-dimensional Data

Wang, Kaijun January 2019 (has links)
Developing nonparametric statistical methods and inference procedures for high-dimensional large data have been a challenging frontier problem of statistics. To attack this problem, in recent years, a clear rising trend has been observed with a radically different viewpoint--``Graph-based Nonparametrics," which is the main research focus of this dissertation. The basic idea consists of two steps: (i) representation step: code the given data using graphs, (ii) analysis step: apply statistical methods on the graph-transformed problem to systematically tackle various types of data structures. Under this general framework, this dissertation develops two major research directions. Chapter 2—based on Mukhopadhyay and Wang (2019a)—introduces a new nonparametric method for high-dimensional k-sample comparison problem that is distribution-free, robust, and continues to work even when the dimension of the data is larger than the sample size. The proposed theory is based on modern LP-nonparametrics tools and unexplored connections with spectral graph theory. The key is to construct a specially-designed weighted graph from the data and to reformulate the k-sample problem into a community detection problem. The procedure is shown to possess various desirable properties along with a characteristic exploratory flavor that has practical consequences. The numerical examples show surprisingly well performance of our method under a broad range of realistic situations. Chapter 3—based on Mukhopadhyay and Wang (2019b)—revisits some foundational questions about network modeling that are still unsolved. In particular, we present unified statistical theory of the fundamental spectral graph methods (e.g., Laplacian, Modularity, Diffusion map, regularized Laplacian, Google PageRank model), which are often viewed as spectral heuristic-based empirical mystery facts. Despite half a century of research, this question has been one of the most formidable open issues, if not the core problem in modern network science. Our approach integrates modern nonparametric statistics, mathematical approximation theory (of integral equations), and computational harmonic analysis in a novel way to develop a theory that unifies and generalizes the existing paradigm. From a practical standpoint, it is shown that this perspective can provide adequate guidance for designing next-generation computational tools for large-scale problems. As an example, we have described the high-dimensional change-point detection problem. Chapter 4 discusses some further extensions and application of our methodologies to regularized spectral clustering and spatial graph regression problems. The dissertation concludes with the a discussion of two important areas of future studies. / Statistics

Page generated in 0.0692 seconds