  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Distributed Bootstrap for Massive Data

Yang Yu (12466911) 27 April 2022
Modern massive data, with enormous sample size and tremendous dimensionality, are usually stored and processed on a cluster of nodes in a master-worker architecture. A shortcoming of this architecture is that inter-node communication can be over a thousand times slower than intra-node computation, which makes communication efficiency a desirable feature of distributed learning algorithms. In this dissertation, we tackle this challenge and propose communication-efficient bootstrap methods for simultaneous inference in the distributed computational framework.

First, we propose two generic distributed bootstrap methods, k-grad and n+k-1-grad, which apply a multiplier bootstrap at the master node to the gradients communicated across nodes. Based on them, we develop a communication-efficient method of producing an $\ell_\infty$-norm confidence region using distributed data with dimensionality not exceeding the local sample size. Our theory establishes the communication efficiency by providing a lower bound on the number of communication rounds $\tau_{\min}$ that warrants the statistical accuracy and efficiency, and by showing that $\tau_{\min}$ increases only logarithmically with the number of workers and the dimensionality. Our simulation studies validate our theory.

Then, we extend k-grad and n+k-1-grad to the high-dimensional regime and propose a distributed bootstrap method for simultaneous inference on high-dimensional distributed data. The method produces an $\ell_\infty$-norm confidence region based on a communication-efficient de-biased lasso, and we propose an efficient cross-validation approach to tune the method at every iteration. We theoretically prove a lower bound on the number of communication rounds $\tau_{\min}$ that warrants the statistical accuracy and efficiency. Furthermore, $\tau_{\min}$ increases only logarithmically with the number of workers and the intrinsic dimensionality, while remaining nearly invariant to the nominal dimensionality. We test our theory with extensive simulation studies and a variable screening task on a semi-synthetic dataset based on the US Airline On-Time Performance dataset.
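The idea of applying a multiplier bootstrap at the master node to worker gradients can be illustrated with a small sketch. This is not the dissertation's implementation: it assumes a toy mean-estimation problem with squared loss (so the inverse Hessian is the identity and can be dropped), equal local sample sizes, and standard Gaussian multipliers. The function names `simulate_worker_gradients` and `k_grad_quantile` are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_worker_gradients(k, n_local, d, seed=0):
    """Toy setup (assumption): each of k workers holds n_local standard-normal
    d-vectors; the loss is squared error, so a worker's averaged gradient at
    theta is theta minus its local sample mean."""
    rng = random.Random(seed)
    local_means = []
    for _ in range(k):
        rows = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_local)]
        local_means.append([sum(r[l] for r in rows) / n_local for l in range(d)])
    # With equal local sizes, the global estimate is the mean of local means.
    theta_hat = [sum(m[l] for m in local_means) / k for l in range(d)]
    grads = [[theta_hat[l] - m[l] for l in range(d)] for m in local_means]
    return grads, theta_hat


def k_grad_quantile(worker_grads, n_local, alpha=0.05, n_boot=500, seed=0):
    """Multiplier-bootstrap (1 - alpha) quantile of an l_inf-norm statistic,
    computed at the master from the k communicated gradients only."""
    rng = random.Random(seed)
    k = len(worker_grads)
    d = len(worker_grads[0])
    g_bar = [sum(g[l] for g in worker_grads) / k for l in range(d)]
    # Center each worker gradient and rescale by sqrt(n_local).
    centered = [[math.sqrt(n_local) * (g[l] - g_bar[l]) for l in range(d)]
                for g in worker_grads]
    stats = []
    for _ in range(n_boot):
        eps = [rng.gauss(0.0, 1.0) for _ in range(k)]  # one multiplier per worker
        combo = [sum(eps[j] * centered[j][l] for j in range(k)) / math.sqrt(k)
                 for l in range(d)]
        stats.append(max(abs(v) for v in combo))
    stats.sort()
    return stats[min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)]


grads, theta_hat = simulate_worker_gradients(k=20, n_local=50, d=3, seed=1)
c = k_grad_quantile(grads, n_local=50)
half_width = c / math.sqrt(20 * 50)  # region: theta_hat[l] +/- half_width for all l
```

In this sketch the master needs only one d-vector per worker, which is the sense in which the bootstrap step adds no extra communication. With only k multipliers, the quality of the approximation depends on the number of workers.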
2

Simultaneous Inference With Application To Dose-Response Study

Maharjan, Rachana 23 August 2022
No description available.
3

Essays on Inference in Linear Mixed Models

Kramlinger, Peter 28 April 2020
No description available.
4

Bayesian Simultaneous Intervals for Small Areas: An Application to Mapping Mortality Rates in U.S. Health Service Areas

Erhardt, Erik Barry 05 January 2004
It is customary when presenting a choropleth map of rates or counts to present only the estimates (mean or mode) of the parameters of interest. While this technique illustrates spatial variation, it ignores the variation inherent in the estimates. We describe an approach to present variability in choropleth maps by constructing 100(1-alpha)% simultaneous intervals. The result provides three maps (estimate with two bands). We propose two methods to construct simultaneous intervals from the optimal individual highest posterior density (HPD) intervals to ensure joint simultaneous coverage of 100(1-alpha)%. Both methods exhibit the main feature of multiplying the lower bound and dividing the upper bound of the individual HPD intervals by parameters 0
5

Bootstrap confidence sets under model misspecification

Zhilova, Mayya 07 December 2015
The thesis studies a multiplier bootstrap procedure for the construction of likelihood-based confidence sets in two cases. The first focuses on a single parametric model, while the second extends the construction to simultaneous confidence estimation for a collection of parametric models. Theoretical results justify the validity of the bootstrap procedure for a limited sample size, a large number of considered parametric models, growing parameter dimensions, and possible misspecification of the parametric assumptions. In the case of a single parametric model, the bootstrap approximation works if the cube of the parameter dimension is smaller than the sample size. The main result on bootstrap validity continues to apply even if the underlying parametric model is misspecified, under a so-called small modelling bias condition. If the true model deviates significantly from the considered parametric family, the bootstrap procedure is still applicable but becomes conservative: the size of the constructed confidence sets is increased by the modelling bias. For the construction of simultaneous confidence sets, we suggest a multiplier bootstrap procedure for estimating the joint distribution of the likelihood ratio statistics and for adjusting the confidence level for multiplicity. Theoretical results establish the bootstrap validity; the number of parametric models enters the resulting approximation error logarithmically. Here we also consider the case where the parametric models are misspecified. If the misspecification is significant, the bootstrap critical values exceed the true ones and the bootstrap confidence set becomes conservative. The theoretical approach includes a non-asymptotic square-root Wilks theorem, Gaussian approximation of the Euclidean norm of a sum of independent vectors, and comparison and anti-concentration bounds for the Euclidean norm of Gaussian vectors. Numerical experiments for misspecified regression models confirm our theoretical results.
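A likelihood-based multiplier bootstrap for a single parametric model can be sketched as follows. This is an illustration under stated assumptions, not the thesis's code: the model is a Gaussian mean with unit variance, the random weights are exponential with mean 1 and variance 1, and the function names are hypothetical. Each per-observation log-likelihood term is reweighted, and the quantile of the bootstrap likelihood-ratio statistic serves as the critical value for the real likelihood ratio.

```python
import math
import random


def bootstrap_critical_value(xs, alpha=0.05, n_boot=1000, seed=0):
    """(1 - alpha) quantile of the bootstrap likelihood-ratio statistic
    sup_theta L_boot(theta) - L_boot(theta_mle) for the unit-variance
    Gaussian mean model (assumption), with exponential(1) weights."""
    rng = random.Random(seed)
    n = len(xs)
    theta_mle = sum(xs) / n
    stats = []
    for _ in range(n_boot):
        u = [rng.expovariate(1.0) for _ in range(n)]  # mean 1, variance 1
        u_sum = sum(u)
        # The weighted log-likelihood is maximized at the weighted mean.
        theta_boot = sum(ui * xi for ui, xi in zip(u, xs)) / u_sum
        # Closed form of the weighted likelihood ratio in this model.
        stats.append(0.5 * u_sum * (theta_boot - theta_mle) ** 2)
    stats.sort()
    return stats[min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)]


def confidence_interval(xs, alpha=0.05, n_boot=1000, seed=0):
    """Set {theta : L(theta_mle) - L(theta) <= z}; in this model it is an
    interval of half-width sqrt(2 z / n) around the sample mean."""
    n = len(xs)
    theta_mle = sum(xs) / n
    z = bootstrap_critical_value(xs, alpha, n_boot, seed)
    half = math.sqrt(2.0 * z / n)
    return theta_mle - half, theta_mle + half


data_rng = random.Random(42)
sample = [data_rng.gauss(1.0, 1.0) for _ in range(200)]
lower, upper = confidence_interval(sample)
```

The closed-form statistic is specific to this toy model; for a general parametric family the weighted log-likelihood would have to be maximized numerically in each bootstrap draw.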
