Global ETD Search

51	Adaptive L1 regularized second-order least squares method for model selection Xue, Lin 11 September 2015 (has links) The second-order least squares (SLS) method for regression models, proposed by Wang (2003, 2004), is based on the first two conditional moments of the response variable given the observed predictor variables. Wang and Leblanc (2008) show that the SLS estimator (SLSE) is asymptotically more efficient than the ordinary least squares estimator (OLSE) if the third moment of the random error is nonzero. We apply the SLS method to variable selection problems and propose the adaptively weighted L1 regularized SLSE (L1-SLSE). The L1-SLSE is robust against the shape of error distributions in variable selection problems. Finite sample simulation studies show that the L1-SLSE is more efficient than L1-OLSE in the case of asymmetric error distributions. A real data application with L1-SLSE is presented to demonstrate the usage of this method. / October 2015
52	Statistical Discovery of Biomarkers in Metagenomics Abdul Wahab, Ahmad Hakeem January 2015 (has links) Metagenomics holds unyielding potential in uncovering relationships within microbial communities that have yet to be discovered, particularly because the field circumvents the need to isolate and culture microbes from their natural environmental settings. A common research objective is to detect biomarkers, that is, microbes that are associated with changes in status. For instance, determining such microbes across conditions such as healthy and diseased groups allows researchers to identify pathogens and probiotics. This is often achieved via analysis of differential abundance of microbes. The problem is that differential abundance analysis looks at each microbe individually without considering the possible associations the microbes may have with each other. This is not favorable, since microbes rarely act individually but within intricate communities involving other microbes. An alternative would be variable selection techniques such as Lasso or Elastic Net, which consider all the microbes simultaneously and conduct selection. However, Lasso often selects only a representative feature of a correlated cluster of features, and the Elastic Net may incorrectly select unimportant features too frequently and erratically due to high levels of sparsity and variation in the data. In this research paper, the proposed method AdaLassop is an augmented variable selection technique that overcomes the shortcomings of Lasso and Elastic Net. It provides researchers with a holistic model that takes into account the effects of selected biomarkers in the presence of other important biomarkers. For AdaLassop, variable selection on sparse ultra-high dimensional data is implemented using the Adaptive Lasso with p-values extracted from Zero Inflated Negative Binomial Regressions as augmented weights.
Comprehensive simulations involving varying correlation structures indicate that AdaLassop has optimal performance in the presence of multicollinearity. This is especially apparent as sample size grows. Application of AdaLassop on a metagenome-wide study of diabetic patients reveals both pathogens and probiotics that have been researched in the medical field. Adaptive Lasso Biomarker Metagenomics Variable Selection Statistics Adaptive Elastic Net
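As a rough illustration of the weighted-penalty idea behind an adaptive lasso, the following minimal sketch fits a lasso with per-coefficient penalty weights by coordinate descent. It is not the AdaLassop procedure: here the weights come from an unpenalised initial fit (the classic adaptive lasso choice), whereas the abstract derives them from p-values of Zero Inflated Negative Binomial regressions. The function names, the toy data, and the values of `lam`, `gamma`, and `eps` are all assumptions made for this example.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam * sum_j weights[j]*|b_j|
    by cyclic coordinate descent. X is a list of rows (hypothetical helper)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-6):
    """Two-stage fit: unpenalised initial estimate, then a weighted lasso.
    Assumption: weights are 1/|initial estimate|^gamma, not p-value based."""
    p = len(X[0])
    init = weighted_lasso(X, y, 0.0, [1.0] * p)  # lam = 0 gives a least squares fit
    weights = [1.0 / (abs(b) + eps) ** gamma for b in init]
    return weighted_lasso(X, y, lam, weights)


# Toy data: five predictors, only the first and third carry signal.
random.seed(0)
true_beta = [2.0, 0.0, -1.5, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in true_beta] for _ in range(200)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0, 0.5) for row in X]
beta_hat = adaptive_lasso(X, y, lam=0.05)
print(beta_hat)
```

Because `weighted_lasso` accepts any non-negative weight vector, weights built from another source (for instance p-values, as the abstract describes) could be passed in directly instead of calling `adaptive_lasso`.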
53	Reverse Engineering of Biological Systems 2014 July 1900 (has links) A gene regulatory network (GRN) consists of a set of genes and regulatory relationships between the genes. As outputs of the GRN, gene expression data contain important information that can be used to reconstruct the GRN to a certain degree. However, the reverse engineering of GRNs from gene expression data is a challenging problem in systems biology. Conventional methods fail in inferring GRNs from gene expression data because of the relatively small number of observations compared with the large number of genes. The inherent noise in the data makes the inference accuracy relatively low, and the combinatorial explosion inherent in the problem makes the inference task extremely difficult. This study aims at reconstructing the GRNs from time-course gene expression data based on GRN models using system identification and parameter estimation methods. The main content consists of three parts: (1) a review of the methods for reverse engineering of GRNs, (2) reverse engineering of GRNs based on linear models and (3) reverse engineering of GRNs based on a nonlinear model, specifically S-systems. In the first part, after the necessary background and challenges of the problem are introduced, various methods for the inference of GRNs are comprehensively reviewed from two aspects: models and inference algorithms. The advantages and disadvantages of each method are discussed. The second part focuses on inferring GRNs from time-course gene expression data based on linear models. First, the statistical properties of two sparse penalties, adaptive LASSO and SCAD, with an autoregressive model are studied. It is shown that the proposed methods using these two penalties can asymptotically reconstruct the underlying networks. This provides a solid foundation for these methods and their extensions. Second, the integration of multiple datasets should be able to improve the accuracy of the GRN inference.
A novel method, Huber group LASSO, is developed to infer GRNs from multiple time-course data, which is also robust to large noise and outliers that the data may contain. An efficient algorithm is also developed and its convergence analysis is provided. The third part can be further divided into two phases: estimating the parameters of S-systems with the system structure known and inferring the S-systems without knowing the system structure. Two methods, alternating weighted least squares (AWLS) and auxiliary function guided coordinate descent (AFGCD), have been developed to estimate the parameters of S-systems from time-course data. AWLS takes advantage of the special structure of S-systems and significantly outperforms one existing method, alternating regression (AR). AFGCD uses the auxiliary function and coordinate descent techniques to obtain an efficient iteration formula, and its convergence is theoretically guaranteed. Without knowing the system structure, taking advantage of the special structure of the S-system model, a novel method, pruning separable parameter estimation algorithm (PSPEA), is developed to locally infer the S-systems. PSPEA is then combined with a continuous genetic algorithm (CGA) to form a hybrid algorithm which can globally reconstruct the S-systems. Gene Regulatory Network S-systems Reverse Engineering LASSO
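For readers unfamiliar with the S-system model named in this abstract, its standard textbook form is sketched below. The notation is an assumption for illustration and is not taken from the thesis: X_i is the expression level of gene i, alpha_i and beta_i are non-negative rate constants, and g_ij and h_ij are kinetic orders.

```latex
\[
\frac{dX_i}{dt} \;=\; \alpha_i \prod_{j=1}^{n} X_j^{\,g_{ij}} \;-\; \beta_i \prod_{j=1}^{n} X_j^{\,h_{ij}},
\qquad i = 1, \dots, n .
\]
```

In this form, inferring the system structure amounts to deciding which kinetic orders g_ij and h_ij are nonzero, while parameter estimation with a known structure fits only the nonzero entries together with alpha_i and beta_i.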
54	Rôle des répétitions textuelles dans les Psaumes de la Pénitence de LASSUS Lessoil-Daelman, Marcelle January 1993 (has links) Textual repetitions abound in verses of the Seven Penitential Psalms of Lassus and this research attempts to discover their function. A total of one hundred and thirty-two verses were analyzed. The results of this investigation exhibit numerous mathematical figures underlying the entire work's structure, and the influence of repetitions is conspicuous in each figure's organization. Moreover, this study shows, in a smaller measure, the mutual influence between form and text expression. A detailed method of calculation is also provided which may eventually be applied to other works of the repertoire of the sixteenth century.
55	Computing a journal meta-ranking using paired comparisons and adaptive lasso estimators Vana, Laura, Hochreiter, Ronald, Hornik, Kurt 01 1900 (has links) (PDF) In a "publish-or-perish culture", the ranking of scientific journals plays a central role in assessing the performance in the current research environment. With a wide range of existing methods for deriving journal rankings, meta-rankings have gained popularity as a means of aggregating different information sources. In this paper, we propose a method to create a meta-ranking using heterogeneous journal rankings. Employing a parametric model for paired comparison data we estimate quality scores for 58 journals in the OR/MS/POM community, which together with a shrinkage procedure allows for the identification of clusters of journals with similar quality. The use of paired comparisons provides a flexible framework for deriving an aggregated score while eliminating the problem of missing data.
56	Penalized Regression Methods in the Study of Serum Biomarkers for Overweight and Obesity Vasquez, Monica M. January 2017 (has links) The study of circulating biomarkers and their association with disease outcomes has become progressively complex due to advances in the measurement of these biomarkers through multiplex technologies. Although the availability of numerous serum biomarkers is highly promising, multiplex assays present statistical challenges due to the high dimensionality of these data. In this dissertation, three studies are presented that address these challenges using L1 penalized regression methods. In the first part of the dissertation, an extensive simulation study is performed for the logistic regression model that compares the Least Absolute Shrinkage and Selection Operator (LASSO) method with five LASSO-type methods given scenarios that are present in serum biomarker research, such as high correlation between biomarkers, weak associations with the outcome, and a sparse number of true signals. Results show that the choice of the optimal LASSO-type method depends on the data structure and should be guided by the research objective. Methods are then applied to the Tucson Epidemiological Study of Airway Obstructive Disease (TESAOD) for the identification of serum biomarkers of overweight and obesity. Measurement of serum biomarkers using multiplex technologies may be more variable as compared to traditional single biomarker methods. Measurement error may induce bias in parameter estimation and complicate the variable selection process. In the second part of the dissertation, an existing measurement error correction method for penalized linear regression with L1 penalty has been adapted to accommodate validation data on a randomly selected subset of the study sample.
A simulation study and analysis of TESAOD data demonstrate that the proposed approach improves variable selection and reduces bias in parameter estimation for validation data as small as 10 percent of the study sample. In the third part of the dissertation, a measurement error correction method that utilizes validation data is proposed for the penalized logistic regression model with the L1 penalty. A simulation study and analysis of TESAOD data are used to evaluate the proposed method. Results show an improvement in variable selection. Biomarkers High-Dimensional LASSO Measurement Error Obesity Overweight
57	Statistical Modeling and Forecasting for Time Series With Trend Alraddadi, Rawiyah January 2021 (has links) No description available. Statistics LASSO forecasting wages regression autoregressive ARMA hyponatremia, normal empirical
58	Lasso for Autoregressive and Moving Average Coeffients via Residuals of Unobservable Time Series Hanh, Nguyen T. January 2018 (has links) No description available. Statistics
59	Bayesian Variable Selection for High-Dimensional Data with an Ordinal Response Zhang, Yiran January 2019 (has links) No description available. Biostatistics
60	Two Essays on High-Dimensional Robust Variable Selection and an Application to Corporate Bankruptcy Prediction Li, Shaobo 29 October 2018 (has links) No description available. Statistics Variable selection LASSO robust statistics default risk
