41

On the use of $\alpha$-stable random variables in Bayesian bridge regression, neural networks and kernel processes

Jorge E Loria (18423207) 23 April 2024
<p dir="ltr">The first chapter considers the l_α regularized linear regression, also termed Bridge regression. For α ∈ (0, 1), Bridge regression enjoys several statistical properties of interest such</p><p dir="ltr">as sparsity and near-unbiasedness of the estimates (Fan & Li, 2001). However, the main difficulty lies in the non-convex nature of the penalty for these values of α, which makes an</p><p dir="ltr">optimization procedure challenging and usually it is only possible to find a local optimum. To address this issue, Polson et al. (2013) took a sampling based fully Bayesian approach to this problem, using the correspondence between the Bridge penalty and a power exponential prior on the regression coefficients. However, their sampling procedure relies on Markov chain Monte Carlo (MCMC) techniques, which are inherently sequential and not scalable to large problem dimensions. Cross validation approaches are similarly computation-intensive. To this end, our contribution is a novel non-iterative method to fit a Bridge regression model. The main contribution lies in an explicit formula for Stein’s unbiased risk estimate for the out of sample prediction risk of Bridge regression, which can then be optimized to select the desired tuning parameters, allowing us to completely bypass MCMC as well as computation-intensive cross validation approaches. Our procedure yields results in a fraction of computational times compared to iterative schemes, without any appreciable loss in statistical performance.</p><p><br></p><p dir="ltr">Next, we build upon the classical and influential works of Neal (1996), who proved that the infinite width scaling limit of a Bayesian neural network with one hidden layer is a Gaussian process, when the network weights have bounded prior variance. Neal’s result has been extended to networks with multiple hidden layers and to convolutional neural networks, also with Gaussian process scaling limits. The tractable properties of Gaussian processes then allow straightforward posterior inference and uncertainty quantification, considerably simplifying the study of the limit process compared to a network of finite width. Neural network weights with unbounded variance, however, pose unique challenges. In this case, the classical central limit theorem breaks down and it is well known that the scaling limit is an α-stable process under suitable conditions. However, current literature is primarily limited to forward simulations under these processes and the problem of posterior inference under such a scaling limit remains largely unaddressed, unlike in the Gaussian process case. To this end, our contribution is an interpretable and computationally efficient procedure for posterior inference, using a conditionally Gaussian representation, that then allows full use of the Gaussian process machinery for tractable posterior inference and uncertainty quantification in the non-Gaussian regime.</p><p><br></p><p dir="ltr">Finally, we extend on the previous chapter, by considering a natural extension to deep neural networks through kernel processes. Kernel processes (Aitchison et al., 2021) generalize to deeper networks the notion proved by Neal (1996) by describing the non-linear transformation in each layer as a covariance matrix (kernel) of a Gaussian process. In this way, each succesive layer transforms the covariance matrix in the previous layer by a covariance function. 
However, the covariance obtained by this process loses any possibility of representation learning since the covariance matrix is deterministic. To address this, Aitchison et al. (2021) proposed deep kernel processes using Wishart and inverse Wishart matrices for each layer in deep neural networks. Nevertheless, the approach they propose requires using a process that does not emerge from the limit of a classic neural network structure. We introduce α-stable kernel processes (α-KP) for learning posterior stochastic covariances in each layer. Our results show that our method is much better than the approach proposed by Aitchison et al. (2021) in both simulated data and the benchmark Boston dataset.</p>
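For reference, the penalty-prior correspondence invoked in the first chapter can be written out explicitly; this is the standard Bridge formulation from the cited literature (Polson et al., 2013), not notation copied from the thesis itself:

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \; \|y - X\beta\|_2^2 + \lambda \sum_{j=1}^{p} |\beta_j|^{\alpha}, \qquad \alpha \in (0, 1),$$

which is the posterior mode under the power exponential prior $\pi(\beta_j) \propto \exp(-\lambda |\beta_j|^{\alpha})$ on each coefficient. The non-convexity that motivates the fully Bayesian treatment is visible directly in the penalty term for $\alpha < 1$.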
42

Dimension Reduction and Variable Selection

Moradi Rekabdarkolaee, Hossein 01 January 2016
High-dimensional data are becoming increasingly available as data collection technology advances. Over the last decade, significant developments have taken place in high-dimensional data analysis, driven primarily by a wide range of applications in fields such as genomics, signal processing, and environmental studies. Statistical techniques such as dimension reduction and variable selection play important roles in high-dimensional data analysis. Sufficient dimension reduction provides a way to find the reduced space of the original space without a parametric model, and in recent years it has been widely applied in scientific fields such as genetics, brain imaging analysis, econometrics, and the environmental sciences. In this dissertation, we worked on three projects.

The first combines local modal regression and Minimum Average Variance Estimation (MAVE) to introduce a robust dimension reduction approach. In addition to being robust to outliers and heavy-tailed distributions, our proposed method has the same convergence rate as the original MAVE. Furthermore, we combine local-modal-based MAVE with an $L_1$ penalty to select informative covariates in a regression setting. This new approach can exhaustively estimate directions in the regression mean function and select informative covariates simultaneously, while being robust to possible outliers in the dependent variable.

The second project develops sparse adaptive MAVE (saMAVE). SaMAVE has advantages over the adaptive LASSO because it extends the adaptive LASSO to multi-dimensional and nonlinear settings without any model assumption, and advantages over sparse inverse dimension reduction methods in that it does not require any particular probability distribution on $\mathbf{X}$. In addition, saMAVE can exhaustively estimate the dimensions in the conditional mean function.

The third project extends the envelope method to multivariate spatial data. The envelope technique is a recent variant of the classical multivariate linear model, and the envelope estimator asymptotically has smaller variance than the maximum likelihood estimator (MLE). The current envelope methodology assumes independent observations. While the assumption of independence is convenient, it does not address the additional complications associated with spatial correlation. This work extends the envelope method to cases where independence is an unreasonable assumption, specifically multivariate data from spatially correlated processes. This novel approach provides estimates of the parameters of interest with smaller variance than the maximum likelihood estimator while still capturing the spatial structure in the data.
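For context, the MAVE estimator of Xia et al. (2002) on which the first two projects build solves a locally weighted least squares problem over the projection matrix; the notation below is the standard one from the dimension reduction literature, not the dissertation's:

$$\min_{B:\, B^\top B = I}\; \min_{a_j,\, b_j}\; \sum_{j=1}^{n} \sum_{i=1}^{n} w_{ij} \left[ y_i - a_j - b_j^\top B^\top (x_i - x_j) \right]^2,$$

where the $w_{ij}$ are kernel weights localizing the fit around each $x_j$. Roughly, the robust variant described above replaces the squared error with a modal regression loss, and the sparse variants add an $L_1$-type penalty to zero out uninformative covariates.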
43

Mixture of Factor Analyzers with Information Criteria and the Genetic Algorithm

Turan, Esra 01 August 2010
In this dissertation, we have developed and combined several statistical techniques in Bayesian factor analysis (BAYFA) and mixtures of factor analyzers (MFA) to overcome the shortcomings of existing methods. Information criteria are brought into the context of the BAYFA model as a decision rule for choosing the number of factors m, alongside the Press and Shigemasu method, Gibbs sampling, and Iterated Conditional Modes deterministic optimization. Because BAYFA is sensitive to the prior information on the factor pattern structure, the prior factor pattern structure is learned adaptively and directly from the observed data using the Sparse Root algorithm.

Clustering and dimensionality reduction have long been considered two of the fundamental problems in unsupervised learning and statistical pattern recognition. In this dissertation, we introduce a novel statistical learning technique by viewing MFA as a method for model-based density estimation that clusters high-dimensional data and simultaneously carries out factor analysis to reduce the curse of dimensionality in an expert data mining system. The typical EM algorithm can get trapped in one of many local maxima; it is slow to converge, may never reach the global optimum, and is highly dependent on initial values. We extend the EM algorithm proposed by Ghahramani (1997) for MFA using intelligent initialization techniques, K-means and a regularized Mahalanobis distance, and introduce the new Genetic Expectation Maximization (GEM) algorithm into MFA to overcome these shortcomings. Another shortcoming of the EM algorithm for MFA is that it assumes the variance of the error vector and the number of factors are the same for each mixture component. We propose a Two Stage GEM algorithm for MFA to relax this constraint and obtain different numbers of factors for each population. Our approach integrates statistical modeling procedures based on information criteria as a fitness function to determine the number of mixture clusters and, at the same time, to choose the number of factors that can be extracted from the data.
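For orientation, the MFA model being estimated has the standard low-rank-plus-diagonal mixture density; this is generic notation from the literature, not the dissertation's:

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(x \mid \mu_k,\; \Lambda_k \Lambda_k^\top + \Psi_k \right),$$

where $\Lambda_k$ is the factor loading matrix of the $k$-th component and $\Psi_k$ is diagonal. The Two Stage GEM algorithm described above relaxes the usual constraint that the number of columns of $\Lambda_k$ (the number of factors) and the error variance $\Psi_k$ be common across components.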
44

Statistical Learning of Proteomics Data and Global Testing for Data with Correlations

Donglai Chen (6405944) 15 May 2019
This dissertation consists of two parts. The first part is a collaborative project with Dr. Szymanski's group in Agronomy at Purdue to predict protein complex assemblies and interactions. Proteins in the leaf cytosol of Arabidopsis were fractionated using size exclusion chromatography (SEC) and mixed-bed ion exchange chromatography (IEX). Protein mass spectrometry data were obtained for the two separation platforms, with two replicates of each. We combine the four data sets and conduct a series of statistical learning steps: 1) data filtering, 2) two-round hierarchical clustering to integrate multiple data types, 3) validation of the clustering against known protein complexes, and 4) mining of the dendrogram trees to predict protein complexes. Our method is developed for integrative analysis of different data types, and it eliminates the difficulty of choosing an appropriate cluster number in clustering analysis. It provides a statistical learning tool to globally analyze the oligomerization state of a system of protein complexes.

The second part examines global hypothesis testing under sparse alternatives and arbitrarily strong dependence. Global tests are used to aggregate information and reduce the burden of multiple testing. A common situation in modern data analysis is that the variables with nonzero effects are sparse; this is the typical setting in genome-wide association study (GWAS) data. The minimum p-value and higher criticism tests are particularly effective and more powerful than the F test under sparse alternatives. However, arbitrarily strong dependence among variables poses a great challenge to the p-value calculation of these optimal tests. We develop a latent variable adjustment to correct the minimum p-value test. After adjustment, the test statistics become weakly dependent and the corresponding null distributions are valid. We show that if the latent variable is not related to the response variable, power can be improved. Simulation studies show that our method is more powerful than competing methods in settings with highly sparse signals and correlated marginal tests. We also demonstrate the method on a real dataset.
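To fix ideas, here is a minimal, permutation-calibrated sketch of the minimum p-value statistic that the second part corrects; it is NOT the latent-variable-adjusted test developed in the dissertation, and the design, effect size, and replicate count are illustrative:

    # Generic minimum p-value global test with a permutation null.
    set.seed(1)
    n <- 200; p <- 50
    X <- matrix(rnorm(n * p), n, p)       # illustrative design
    y <- 0.6 * X[, 1] + rnorm(n)          # one sparse nonzero effect

    min_p <- function(y, X) {
      # smallest marginal p-value across simple linear regressions
      min(apply(X, 2, function(x) summary(lm(y ~ x))$coefficients[2, 4]))
    }

    t_obs  <- min_p(y, X)
    t_null <- replicate(200, min_p(sample(y), X))  # permute the response
    mean(t_null <= t_obs)                          # global p-value

Permutation sidesteps the dependence problem by brute force; the dissertation's contribution is an analytic adjustment that keeps the optimal tests valid under strong dependence without this computational cost.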
45

Toward a Theory of Auto-modeling

Yiran Jiang (16632711) 25 July 2023
Statistical modeling aims at constructing a mathematical model for an existing data set. As a comprehensive concept, statistical modeling leads to a wide range of interesting problems. Modern parametric models, such as deepnets, have achieved remarkable success in quite a few application areas with massive data. Although powerful in practice, many fitted over-parameterized models potentially lose good statistical properties. For this reason, we propose a new framework named auto-modeling (AM). Philosophically, the mindset is to fit models to future observations rather than to the observed sample. Technically, after choosing an imputation model for generating future observations, we fit models to future observations by optimizing an approximation to the desired expected loss function, based on its sample counterpart and what we call an adaptive "duality function".

The first part of the dissertation (Chapters 2 to 7) focuses on the new philosophical perspective of the method and the details of the main framework. Technical details, including essential theoretical properties of the method, are also investigated. We demonstrate the superior performance of the proposed method in three applications: the many-normal-means problem, $n < p$ linear regression, and image classification.

The second part of the dissertation (Chapter 8) focuses on applying the AM framework to the construction of linear regression models. Our primary objective is to shed light on the stability issues associated with commonly used data-driven model selection methods such as cross-validation (CV). Furthermore, we highlight the philosophical distinctions between CV and AM. Theoretical properties and numerical examples presented in the study demonstrate the potential and promise of AM-based linear model selection. Additionally, we devise a conformal prediction method specifically tailored to quantifying the uncertainty of AM predictions in the context of linear regression.
46

Regression Analysis for Ordinal Outcomes in Matched Study Design: Applications to Alzheimer's Disease Studies

Austin, Elizabeth 09 July 2018
Alzheimer's Disease (AD) affects nearly 5.4 million Americans as of 2016 and is the most common form of dementia. The disease is characterized by the presence of neurofibrillary tangles and amyloid plaques [1]; the amount of plaques is measured by Braak stage, post-mortem. It is known that AD is positively associated with hypercholesterolemia [16]. As statins are the most widely used cholesterol-lowering drugs, there may be associations between statin use and AD. We hypothesize that those who use statins, specifically lipophilic statins, are more likely to have a low Braak stage in post-mortem analysis.

To address this hypothesis, we wished to fit a regression model for ordinal outcomes (e.g., high, moderate, or low Braak stage) using data collected from the National Alzheimer's Coordinating Center (NACC) autopsy cohort. As the outcomes were matched on the length of follow-up, a conditional likelihood-based method is often used to estimate the regression coefficients. However, it can be challenging to solve the conditional likelihood-based estimating equation numerically, especially when there are many matching strata. Given that the likelihood of a conditional logistic regression model is equivalent to the partial likelihood of a stratified Cox proportional hazards model, the existing R function for a Cox model, coxph(), can be used to estimate a conditional logistic regression model. We investigate whether this strategy can be extended to regression models for ordinal outcomes. More specifically, our aims are to (1) demonstrate the equivalence between the exact partial likelihood of a stratified discrete-time Cox proportional hazards model and the likelihood of a conditional logistic regression model; (2) prove equivalence, or lack thereof, between the exact partial likelihood of a stratified discrete-time Cox proportional hazards model and the conditional likelihood of models appropriate for multiple ordinal outcomes: an adjacent categories model, a continuation-ratio model, and a cumulative logit model; and (3) clarify how to set up a stratified discrete-time Cox proportional hazards model for multiple ordinal outcomes with matching using the existing coxph() R function, and how to interpret the resulting regression coefficient estimates.

We verified the theoretical results through simulation studies. We simulated data from the three models of interest: an adjacent categories model, a continuation-ratio model, and a cumulative logit model. We fit a Cox model to the data simulated from each model using the existing coxph() R function and compared the coefficient estimates obtained. Lastly, we fit a Cox model to the NACC dataset, using Braak stage, with three ordinal categories, as the outcome variable. We included predictors for age at death, sex, genotype, education, comorbidities, number of days having taken lipophilic statins, number of days having taken hydrophilic statins, and time to death, and we matched cases to controls on the length of follow-up. We discuss all findings and their implications in detail.
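The computational trick described above can be sketched in a few lines of R; the toy matched data and variable names below are illustrative, not from the NACC analysis:

    # Conditional logistic regression via the stratified Cox equivalence.
    library(survival)
    set.seed(1)
    n <- 120
    d <- data.frame(
      case    = rep(c(1, 0, 0, 0), n / 4),   # one case per matched set of 4
      statin  = rnorm(n),                    # illustrative exposure
      age     = rnorm(n, 75, 6),
      stratum = rep(seq_len(n / 4), each = 4)
    )

    # Give every subject the same "event time"; case status is the event.
    # The exact partial likelihood of this stratified Cox model equals the
    # conditional logistic regression likelihood, as discussed above.
    fit <- coxph(Surv(rep(1, n), case) ~ statin + age + strata(stratum),
                 data = d, method = "exact")
    summary(fit)  # coefficients are conditional log-odds ratios

This binary-outcome version is well established (it is how clogit() is implemented in the survival package); the dissertation's question is which of the ordinal-outcome models admit the same device.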
47

Beyond Disagreement-based Learning for Contextual Bandits

Pinaki Ranjan Mohanty (16522407) 26 July 2023
While instance-dependent contextual bandits have been previously studied, their analysis has been exclusively limited to pure disagreement-based learning. This approach lacks a nuanced understanding of disagreement and treats it in a binary and absolute manner. In our work, we aim to broaden the analysis of instance-dependent contextual bandits by studying them under the framework of disagreement-based learning in sub-regions. This framework allows for a more comprehensive examination of disagreement by considering its varying degrees across different sub-regions.

To lay the foundation for our analysis, we introduce key ideas and measures widely studied in the contextual bandit and disagreement-based active learning literature. We then propose a novel, instance-dependent contextual bandit algorithm for the realizable case in a transductive setting. Leveraging the ability to observe contexts in advance, our algorithm employs a sophisticated linear programming subroutine to identify and exploit sub-regions effectively. Next, we provide a series of results tying together previously introduced complexity measures and offer some insightful discussion of them. Finally, we enhance the existing regret bounds for contextual bandits by integrating the sub-region disagreement coefficient, thereby showcasing significant improvement in performance over the pure disagreement-based approach.

In the concluding section of this thesis, we briefly recap the work done and suggest potential future directions for further improving contextual bandit algorithms within the framework of disagreement-based learning in sub-regions. These directions offer opportunities for further research and development, aiming to refine and enhance the effectiveness of contextual bandit algorithms in practical applications.
48

GENERAL-PURPOSE STATISTICAL INFERENCE WITH DIFFERENTIAL PRIVACY GUARANTEES

Zhanyu Wang (13893375) 06 December 2023
<p dir="ltr">Differential privacy (DP) uses a probabilistic framework to measure the level of privacy protection of a mechanism that releases data analysis results to the public. Although DP is widely used by both government and industry, there is still a lack of research on statistical inference under DP guarantees. On the one hand, existing DP mechanisms mainly aim to extract dataset-level information instead of population-level information. On the other hand, DP mechanisms introduce calibrated noises into the released statistics, which often results in sampling distributions more complex and intractable than the non-private ones. This dissertation aims to provide general-purpose methods for statistical inference, such as confidence intervals (CIs) and hypothesis tests (HTs), that satisfy the DP guarantees. </p><p dir="ltr">In the first part of the dissertation, we examine a DP bootstrap procedure that releases multiple private bootstrap estimates to construct DP CIs. We present new DP guarantees for this procedure and propose to use deconvolution with DP bootstrap estimates to derive CIs for inference tasks such as population mean, logistic regression, and quantile regression. Our method achieves the nominal coverage level in both simulations and real-world experiments and offers the first approach to private inference for quantile regression.</p><p dir="ltr">In the second part of the dissertation, we propose to use the simulation-based ``repro sample'' approach to produce CIs and HTs based on DP statistics. Our methodology has finite-sample guarantees and can be applied to a wide variety of private inference problems. It appropriately accounts for biases introduced by DP mechanisms (such as by clamping) and improves over other state-of-the-art inference methods in terms of the coverage and type I error of the private inference. </p><p dir="ltr">In the third part of the dissertation, we design a debiased parametric bootstrap framework for DP statistical inference. We propose the adaptive indirect estimator, a novel simulation-based estimator that is consistent and corrects the clamping bias in the DP mechanisms. We also prove that our estimator has the optimal asymptotic variance among all well-behaved consistent estimators, and the parametric bootstrap results based on our estimator are consistent. Simulation studies show that our framework produces valid DP CIs and HTs in finite sample settings, and it is more efficient than other state-of-the-art methods.</p>
49

PROBABLY APPROXIMATELY CORRECT BOUNDS FOR ESTIMATING MARKOV TRANSITION KERNELS

Imon Banerjee (17555685) 06 December 2023
<p dir="ltr">This thesis presents probably approximately correct (PAC) bounds on estimates of the transition kernels of Controlled Markov chains (CMC’s). CMC’s are a natural choice for modelling various industrial and medical processes, and are also relevant to reinforcement learning (RL). Learning the transition dynamics of CMC’s in a sample efficient manner is an important question that is open. This thesis aims to close this gap in knowledge in literature.</p><p dir="ltr">In Chapter 2, we lay the groundwork for later chapters by introducing the relevant concepts and defining the requisite terms. The two subsequent chapters focus on non-parametric estimation. </p><p dir="ltr">In Chapter 3, we restrict ourselves to a finitely supported CMC with d states and k controls and produce a general theory for minimax sample complexity of estimating the transition matrices.</p><p dir="ltr">In Chapter 4 we demonstrate the applicability of this theory by using it to recover the sample complexities of various controlled Markov chains, as well as RL problems.</p><p dir="ltr">In Chapter 5 we move to a continuous state and action spaces with compact supports. We produce a robust, non-parametric test to find the best histogram based estimator of the transition density; effectively reducing the problem into one of model selection based on empricial processes.</p><p dir="ltr">Finally, in Chapter 6 we move to a parametric and Bayesian regime, and restrict ourselves to Markov chains. Under this setting we provide a PAC-Bayes bound for estimating model parameters under tempered posteriors.</p>
50

MONITORING AUTOCORRELATED PROCESSES

Tang, Weiping 02 August 2011
This thesis was submitted by Weiping Tang on August 2, 2011.

Several control schemes for monitoring process mean shifts, including the cumulative sum (CUSUM), weighted cumulative sum (WCUSUM), adaptive cumulative sum (ACUSUM), and exponentially weighted moving average (EWMA) schemes, perform well in detecting constant process mean shifts. However, a variety of dynamic mean shifts frequently occur, and few control schemes work efficiently in these situations, owing to the limited window for catching shifts, particularly when the mean decreases rapidly. This is precisely the case when one uses the residuals from autocorrelated data to monitor the process mean, a feature often referred to as forecast recovery. This thesis focuses on detecting a shift in the mean of a time series when a forecast-recovery dynamic pattern is observed in the mean of the residuals. Specifically, we examine in detail several particular cases of Autoregressive Integrated Moving Average (ARIMA) time series models. We introduce a new upper-sided control chart based on the EWMA scheme combined with the Fast Initial Response (FIR) feature. To assess chart performance we use the well-established Average Run Length (ARL) criterion, and we develop a non-homogeneous Markov chain method for computing the ARL of the proposed chart. We show numerically that the proposed procedure performs as well as or better than the WCUSUM chart introduced by Shu, Jiang and Tsui (2008), and better than the conventional CUSUM, ACUSUM, and Generalized Likelihood Ratio Test (GLRT) charts. The methods are illustrated on molecular weight data from a polymer manufacturing process.

Master of Science (MSc)
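As a minimal sketch of the chart family discussed above: an upper-sided EWMA with a FIR head start. The smoothing constant, limit multiplier, and head-start fraction below are illustrative, and the sketch does not compute the ARL (the thesis does that via a non-homogeneous Markov chain).

    # Upper-sided EWMA control chart with a FIR head start.
    ewma_fir <- function(x, lambda = 0.2, L = 2.7, fir = 0.5,
                         mu0 = 0, sigma = 1) {
      n <- length(x)
      z <- numeric(n)
      # FIR: start part-way toward the asymptotic control limit
      z_prev <- mu0 + fir * L * sigma * sqrt(lambda / (2 - lambda))
      for (t in seq_len(n)) {
        z[t] <- lambda * x[t] + (1 - lambda) * z_prev
        z_prev <- z[t]
      }
      # Exact time-varying upper control limit of the EWMA statistic
      t_idx <- seq_len(n)
      ucl <- mu0 + L * sigma *
        sqrt(lambda / (2 - lambda) * (1 - (1 - lambda)^(2 * t_idx)))
      list(statistic = z, ucl = ucl, first_signal = which(z > ucl)[1])
    }

    set.seed(1)
    x <- c(rnorm(20), rnorm(30, mean = 1))   # mean shift at t = 21
    ewma_fir(x)$first_signal                 # first alarm time (NA if none)

The head start is what lets the chart react quickly to shifts present at start-up, which is the relevant regime when monitoring residuals subject to forecast recovery.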
