1

Search for Cosmic Sources of High Energy Neutrinos with the AMANDA-II Detector

Labare, Mathieu 26 January 2010 (has links)
AMANDA-II is a neutrino telescope comprising a three-dimensional array of optical sensors deployed in the South Pole glacier. Its detection principle rests on observing the Cherenkov radiation emitted by charged secondary particles produced when a high energy neutrino (> 100 GeV) interacts with the matter surrounding the detector. This work is based on data recorded by the AMANDA-II detector between 2000 and 2006 in order to search for cosmic sources of neutrinos. A potential signal must be extracted from the overwhelming background of muons and neutrinos originating from the interaction of primary cosmic rays within the atmosphere. The observation is limited to the northern hemisphere in order to be free of the atmospheric muon background, which is stopped by the Earth. Atmospheric neutrinos, however, constitute an irreducible background composing the main part of the 6100 events selected for this analysis. It is nevertheless possible to identify a point source of cosmic neutrinos by looking for a local excess breaking away from the isotropic background of atmospheric neutrinos; this search is coupled with a selection based on energy, whose spectrum differs between the two categories of neutrinos. An original statistical approach has been developed to optimize the detection of point sources whilst controlling the false discovery rate -- hence the confidence level -- of an observation. This method is based solely on knowledge of the background hypothesis, without any assumption on the neutrino production model of the sought sources. Moreover, the method naturally accounts for the trial factor inherent in multiple testing. The procedure was applied to the final sample of events collected by AMANDA-II.
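The abstract does not spell out the thesis's own procedure; as a generic, hypothetical illustration of FDR control that absorbs the trial factor of scanning many sky directions, here is a minimal Benjamini-Hochberg sketch over per-direction p-values (the toy data and all names are invented, not from the thesis):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of discoveries, controlling the FDR at level q."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    # BH: find the largest k with p_(k) <= q * k / m, reject the k smallest.
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True
    return mask

# Toy sky map: 10,000 directions with uniform background-only p-values,
# plus one injected "source" direction with a very small p-value.
rng = np.random.default_rng(0)
pvals = rng.uniform(size=10_000)
pvals[1234] = 1e-7
print(np.nonzero(benjamini_hochberg(pvals, q=0.05))[0])
```

Because the threshold scales with the number of tests m, no separate trial-factor correction is needed: scanning more directions automatically tightens the per-direction cut.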
2

Detecting differentially expressed genes while controlling the false discovery rate for microarray data

Jiao, Shuo. January 2009 (has links)
Thesis (Ph.D.)--University of Nebraska-Lincoln, 2009. / Title from title screen (site viewed March 2, 2010). PDF text: 100 p. : col. ill. ; 953 K. UMI publication number: AAT 3379821. Includes bibliographical references. Also available in microfilm and microfiche formats.
3

An adaptive single-step FDR controlling procedure

Iyer, Vishwanath January 2010 (has links)
This research focuses on identifying a single-step procedure that, by adapting to the data through estimation of the unknown parameters, asymptotically controls the false discovery rate when testing a large number of hypotheses simultaneously, and on exploring some characteristics of this procedure. / Statistics
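The abstract does not specify the procedure, so the following is only a stand-in for the general idea of adapting to the data by estimating an unknown parameter: a Storey-type estimate of the proportion of true nulls plugged into an adaptive BH step (the lambda tuning value and all names are assumptions):

```python
import numpy as np

def estimate_pi0(pvals, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses:
    p-values above lam are treated as (mostly) coming from nulls."""
    p = np.asarray(pvals)
    return min(1.0, (p > lam).mean() / (1.0 - lam))

def adaptive_bh(pvals, q=0.05, lam=0.5):
    """BH run at the sharpened level q / pi0_hat (adaptive BH)."""
    p = np.asarray(pvals)
    pi0 = estimate_pi0(p, lam)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= (q / pi0) * np.arange(1, m + 1) / m
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject, pi0
```

When many hypotheses are non-null, pi0_hat drops below 1 and the rejection threshold rises, which is the power gain that adaptivity buys.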
4

Regaining control of false findings in feature selection, classification, and prediction on neuroimaging and genomics data

Gossmann, Alexej January 2018 (has links)
The technological advances of past decades have led to the accumulation of large amounts of genomic and neuroimaging data, enabling novel strategies in precision medicine. These largely rely on machine learning algorithms and modern statistical methods for big biological datasets, which are data-driven rather than hypothesis-driven, and such methods often lack guarantees on the validity of the research findings. Because it can be a matter of life and death when computational methods are deployed in clinical practice, establishing guarantees on the validity of the results is essential for the advancement of precision medicine. This thesis proposes several novel sparse regression and sparse canonical correlation analysis techniques, which by design include guarantees on the false discovery rate in variable selection. Variable selection on biomedical data is essential for many areas of healthcare, including precision medicine, population stratification, drug development, and predictive modeling of disease phenotypes. Predictive machine learning models can directly affect the patient when used to aid diagnosis, and therefore they need to be thoroughly evaluated before deployment. We present a novel approach to validly reuse the test data for performance evaluation of predictive models. The proposed methods are validated in application to large genomic and neuroimaging datasets, where they confirm results from previous studies and also lead to new biological insights. In addition, this work focuses on making the proposed methods widely available to the scientific community through the release of free and open-source scientific software.
5

Topics in multiple hypotheses testing

Qian, Yi 25 April 2007 (has links)
It is common to test many hypotheses simultaneously in the application of statistics. The probability of making a false discovery grows with the number of statistical tests performed. When all the null hypotheses are true, and the test statistics are independent and continuous, the error rates from the family-wise error rate (FWER)- and the false discovery rate (FDR)-controlling procedures are equal to the nominal level. When some of the null hypotheses are not true, both procedures are conservative. In the first part of this study, we review the background of the problem and propose methods to estimate the number of true null hypotheses. The estimates can be used in FWER- and FDR-controlling procedures with a consequent increase in power. We conduct simulation studies and apply the estimation methods to data sets with biological or clinical significance. In the second part of the study, we propose a mixture model approach for the analysis of ChIP-chip high density oligonucleotide array data to study the interactions between proteins and DNA. If we could identify the specific locations where proteins interact with DNA, we could increase our understanding of many important cellular events. Most experiments to date are performed in culture on cell lines, bacteria, or yeast; future experiments will include those in developing tissues, organs, or cancer biopsies, and they are critical in understanding the function of genes and proteins. Here we investigate the ChIP-chip data structure and use a beta-mixture model to help identify the binding sites. To determine the appropriate number of components in the mixture model, we suggest the Anderson-Darling test. Our study indicates that it is a reasonable means of choosing the number of components in a beta-mixture model. The mixture model procedure has broad applications in biology and is illustrated with several data sets from bioinformatics experiments.
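As a hedged sketch of the beta-mixture idea (not the thesis's model, which also uses the Anderson-Darling test to choose the number of components), here is a two-component uniform-plus-Beta(a,1) mixture fit by EM, a common simplification for p-value data; all parameter choices are assumptions:

```python
import numpy as np

def fit_bum(pvals, n_iter=200):
    """EM for a two-component mixture of p-values:
    Uniform(0,1) nulls plus a Beta(a,1) signal component (a < 1)."""
    p = np.clip(np.asarray(pvals, dtype=float), 1e-12, 1 - 1e-12)
    pi0, a = 0.9, 0.5                                  # initial guesses
    for _ in range(n_iter):
        f1 = a * p ** (a - 1.0)                        # Beta(a,1) density
        w = (1 - pi0) * f1 / ((1 - pi0) * f1 + pi0)    # P(signal | p_i)
        pi0 = 1.0 - w.mean()                           # M-step: null weight
        a = -w.sum() / (w * np.log(p)).sum()           # weighted MLE of a
        a = min(a, 0.999)                              # keep density decreasing
    return pi0, a
```

The fitted posterior weights w can then be thresholded to flag likely binding sites, and the mixture fit itself yields an estimate of the proportion of true nulls.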
6

Recovering Sparse Differences Between Two High-Dimensional Covariance Matrices

Alharbi, Yousef S. 19 July 2017 (has links)
No description available.
7

Multiple Testing in Grouped Dependent Data

Clements, Nicolle January 2013 (has links)
This dissertation is focused on multiple testing procedures to be used in data that are naturally grouped or possess a spatial structure. We propose a `Two-Stage' procedure to control the False Discovery Rate (FDR) in situations where one-sided hypothesis testing is appropriate, such as astronomical source detection. Similarly, we propose a `Three-Stage' procedure to control the mixed directional False Discovery Rate (mdFDR) in situations where two-sided hypothesis testing is appropriate, such as vegetation monitoring in remote sensing NDVI data. The Two- and Three-Stage procedures have provable FDR/mdFDR control under certain dependence situations. We also present adaptive versions, which are examined in simulation studies. The `Stages' refer to testing hypotheses both group-wise and individually, motivated by the belief that the dependencies among the p-values associated with spatially oriented hypotheses occur more locally than globally. Thus, these `Staged' procedures test hypotheses in groups that incorporate the local, unknown dependencies of neighboring p-values. If a group is found significant, the individual p-values within that group are investigated further, as sketched below. For the vegetation monitoring data, we extend the investigation by providing spatio-temporal models and forecasts for some regions where significant change was detected through the multiple testing procedure. / Statistics
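A minimal sketch of the group-then-individual logic, not the dissertation's exact Two-Stage procedure: groups are screened by BH on Simes-combined p-values, and individual BH tests run only inside surviving groups. The within-group level scaling is one common hierarchical-testing heuristic and is an assumption here:

```python
import numpy as np

def simes(pvals):
    """Simes combination of the p-values within one group."""
    p = np.sort(np.asarray(pvals))
    n = p.size
    return (n * p / np.arange(1, n + 1)).min()

def bh_mask(pvals, q):
    """Boolean rejection mask from the Benjamini-Hochberg procedure."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True
    return mask

def two_stage(groups, q=0.05):
    """Stage 1: BH across group-level Simes p-values.
    Stage 2: BH within each selected group, at a level shrunk by the
    fraction of groups selected (one heuristic among several)."""
    group_p = np.array([simes(g) for g in groups])
    selected = bh_mask(group_p, q)
    frac = selected.sum() / len(groups) if len(groups) else 0.0
    return [
        bh_mask(g, q * frac) if sel else np.zeros(len(g), dtype=bool)
        for g, sel in zip(groups, selected)
    ]
```

Screening at the group level is what lets the procedure exploit locally dependent p-values: a whole neighborhood of weak signals can survive Stage 1 even when no single pixel would.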
8

False positives in the performance of actively managed investment funds in Brazil: measuring managers' luck in the estimated alphas

Jesus, Marcelo de 01 February 2011 (has links)
This study investigates, for the period between 2002 and 2009, the impact of luck on the performance of managers of actively managed Brazilian equity mutual funds that beat their benchmark. To that end, a new method, the False Discovery Rate (FDR) approach, was used to test this impact empirically. To measure luck and unluck precisely, i.e., the frequency of false positives (Type I errors) in the tails of the cross-section of the t-distribution associated with the funds' alphas, this approach was applied to measure in aggregate the skill of managers of actively managed equity funds in Brazil. The FDR approach offers a simple and objective method to estimate the proportions of skilled funds (with a positive alpha), zero-alpha funds, and unskilled funds (with a negative alpha) across the population. Applying the FDR technique, the research found that the majority of funds were zero-alpha, followed by truly unskilled funds, with only a small proportion of truly skilled funds.
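As a rough illustration of this FDR approach to fund skill (in the spirit of the method the thesis applies, with all variable names, the significance cutoff, and the lambda tuning value assumed), one can compute per-fund alpha t-statistics and subtract the expected "lucky" fraction from each significant tail:

```python
import numpy as np
from scipy.stats import norm

def fund_alpha_tstats(excess_returns, benchmark):
    """Per-fund OLS alpha t-statistics against a single benchmark factor.
    excess_returns: (T, n_funds) array; benchmark: (T,) array."""
    T = benchmark.shape[0]
    X = np.column_stack([np.ones(T), benchmark])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    resid = excess_returns - X @ coef
    sigma2 = (resid ** 2).sum(axis=0) / (T - 2)
    se_alpha = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[0, 0])
    t = coef[0] / se_alpha
    return t, 2 * norm.sf(np.abs(t))   # normal approximation to the t tail

def skill_proportions(t, pvals, gamma=0.10, lam=0.5):
    """Split funds into zero-alpha, skilled, and unskilled proportions."""
    pi0 = min(1.0, (pvals > lam).mean() / (1 - lam))   # Storey-type estimate
    s_pos = ((pvals <= gamma) & (t > 0)).mean()        # significant positive
    s_neg = ((pvals <= gamma) & (t < 0)).mean()        # significant negative
    skilled = max(0.0, s_pos - pi0 * gamma / 2)        # observed minus lucky
    unskilled = max(0.0, s_neg - pi0 * gamma / 2)
    return pi0, skilled, unskilled
```

The key move is that zero-alpha funds land in each significance tail at a known rate (pi0 * gamma / 2), so subtracting that rate separates genuine skill from luck without modeling the skilled funds at all.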
9

Regularized methods for high-dimensional and bi-level variable selection

Breheny, Patrick John 01 July 2009 (has links)
Many traditional approaches cease to be useful when the number of variables is large in comparison with the sample size. Penalized regression methods have proved to be an attractive approach, both theoretically and empirically, for dealing with these problems. This thesis focuses on the development of penalized regression methods for high-dimensional variable selection. The first part of this thesis deals with problems in which the covariates possess a grouping structure that can be incorporated into the analysis to select important groups as well as important members of those groups. I introduce a framework for grouped penalization that encompasses the previously proposed group lasso and group bridge methods, sheds light on the behavior of grouped penalties, and motivates the proposal of a new method, group MCP. The second part of this thesis develops fast algorithms for fitting models with complicated penalty functions such as grouped penalization methods. These algorithms combine the idea of local approximation of penalty functions with recent research into coordinate descent algorithms to produce highly efficient numerical methods for fitting models with complicated penalties. Importantly, I show these algorithms to be both stable and linear in the dimension of the feature space, allowing them to be efficiently scaled up to very large problems. In the third part of this thesis, I extend the idea of false discovery rates to penalized regression. The Karush-Kuhn-Tucker conditions describing penalized regression estimates provide testable hypotheses involving partial residuals. I use these hypotheses to connect the previously disparate fields of multiple comparisons and penalized regression, develop estimators for the false discovery rates of methods such as the lasso and elastic net, and establish theoretical results. Finally, the methods from all three sections are studied in a number of simulations and applied to real data from gene expression and genetic association studies.
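This is not the thesis's group MCP algorithm, but a sketch of the basic cyclic coordinate-descent kernel that such penalized-regression methods build on, written for the plain lasso under the assumption of standardized columns:

```python
import numpy as np

def soft_threshold(z, lam):
    """The closed-form solution of each one-dimensional lasso subproblem."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.
    Maintains the residual incrementally so each update is O(n)."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):                  # no convergence check, for brevity
        for j in range(p):
            r += X[:, j] * beta[j]           # add coordinate j back in
            z = X[:, j] @ r / n              # partial-residual correlation
            beta[j] = soft_threshold(z, lam) / col_ss[j]
            r -= X[:, j] * beta[j]           # update the residual
    return beta
```

Penalties like MCP or group bridge replace the soft-threshold step with the minimizer of their own one-dimensional (or one-group) subproblem, which is what the local-approximation idea in the abstract supplies.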
10

Marginal false discovery rate approaches to inference on penalized regression models

Miller, Ryan 01 August 2018 (has links)
Data containing large numbers of variables are becoming increasingly common, and sparsity-inducing penalized regression methods, such as the lasso, have become a popular analysis tool for these datasets due to their ability to perform variable selection naturally. However, quantifying the importance of the variables selected by these models is a difficult task. These difficulties are compounded by the tendency of the most predictive models, for example those chosen using procedures like cross-validation, to include substantial numbers of noise variables with no real relationship to the outcome. To address the task of performing inference on penalized regression models, this thesis proposes false discovery rate approaches for a broad class of penalized regression models. This work includes the development of an upper bound for the number of noise variables in a model, as well as local false discovery rate approaches that quantify the likelihood of each individual selection being a false discovery. These methods are applicable to a wide range of penalties, such as the lasso, elastic net, SCAD, and MCP; a wide range of models, including linear regression, generalized linear models, and Cox proportional hazards models; and are also extended to the group regression setting under the group lasso penalty. In addition to studying these methods in numerous simulation studies, their practical utility is demonstrated using real data from several high-dimensional genome-wide association studies.
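A rough sketch of the marginal false discovery rate reasoning described here, as a paraphrase under simplifying assumptions rather than the thesis's estimator: a pure-noise feature enters the lasso solution only when its correlation with the residuals clears the penalty level, which for a standardized column happens with a computable Gaussian probability.

```python
import numpy as np
from scipy.stats import norm

def mfdr_estimate(X, y, beta_hat, lam, sigma_hat):
    """Expected number of pure-noise features clearing the lasso threshold
    lam, divided by the number of features actually selected (a sketch)."""
    n, p = X.shape
    selected = int(np.sum(beta_hat != 0))
    # For a feature with no real association and standardized x_j,
    # x_j' r / n is approximately N(0, sigma^2 / n), so it crosses the
    # threshold with probability 2 * Phi(-lam * sqrt(n) / sigma).
    expected_false = 2 * p * norm.cdf(-lam * np.sqrt(n) / sigma_hat)
    return expected_false / max(selected, 1)
```

Sweeping lam over the lasso path turns this ratio into a curve, letting the analyst pick the penalty level whose expected proportion of noise selections they are willing to tolerate.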
