Global ETD Search

1	Réduction de dimension en régression logistique, application aux données actu-palu / Dimension reduction in logistic regression, application to actu-palu data Kwémou Djoukoué, Marius 29 September 2014 (has links) This thesis is devoted to variable selection and model selection in logistic regression. It has two parts, one applied and one methodological. The applied part analyzes data from a large socio-epidemiological survey called actu-palu. Such surveys typically involve a considerable number of explanatory variables, so the setting is high-dimensional by nature. Because of the curse of dimensionality, the logistic regression model cannot be applied directly. We proceed in two steps: the first reduces the number of variables using the Lasso, the Group Lasso and random forests; the second applies the logistic model to the subset of variables selected in the first step. These methods selected the variables relevant for identifying households at risk of a febrile episode in a child aged 2 to 10 in Dakar. The methodological part, made up of two sub-parts, establishes technical properties of estimators in the nonparametric logistic regression model. These estimators are obtained by penalized maximum likelihood, in one case with a Lasso or Group Lasso penalty and in the other with an ℓ0-type penalty. First, we propose weighted versions of the Lasso and Group Lasso estimators for the nonparametric logistic model and establish non-asymptotic oracle inequalities for them. A second set of results extends the model selection principle introduced by Birgé and Massart (2001) to logistic regression. This selection relies on penalized maximum likelihood criteria. In this context we propose model selection criteria and establish non-asymptotic oracle inequalities for the selected estimators. The penalty, which depends only on the data, is calibrated following the slope heuristic. All results of the methodological part are illustrated by numerical simulation studies. Régression logistique Logistic regression
2	The factors of influencing people to adopt public animal shelter dogs Chen, Ying-peng 27 July 2010 (has links) none shelter adoption Logistic regression
3	A combination procedure of universal kriging and logistic regression a thesis presented to the faculty of the Graduate School, Tennessee Technological University / Wu, Songfei. January 2008 (has links) Thesis (M.S.)--Tennessee Technological University, 2008. / Title from title page screen (viewed on Aug. 26, 2009). Bibliography: leaves 24-26. Kriging. Logistic regression analysis.
4	Sample comparisons using microarrays: - Application of False Discovery Rate and quadratic logistic regression Guo, Ruijuan 08 January 2008 (has links) In microarray analysis, researchers are interested in features that behave differently in diseased samples than in normal samples. The usual p-value method of selecting significant genes either gives too many false positives or fails to detect all the significant features. The False Discovery Rate (FDR) method controls false positives while still selecting significant features. We introduce Benjamini's method and Storey's method for controlling the FDR and apply both to human meningioma data. We find that Benjamini's method is more conservative and that, once the number of tests exceeds a threshold, increasing the number of tests decreases the number of significant genes. In the second chapter, we investigate ways to search for interesting gene expression patterns that cannot be detected by linear methods such as the t-test or ANOVA. We propose a novel approach that uses quadratic logistic regression to detect genes in the meningioma data that have a non-linear relationship with their phenotypes. With quadratic logistic regression, we can find genes whose expression correlates with their phenotypes both linearly and quadratically. Whether these genes are clinically significant is a very interesting question, since they would most likely be neglected by a traditional linear approach. FDR Logistic regression Microarray DNA microarrays Logistic regression analysis
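As an illustration of the FDR control mentioned in the abstract above, here is a minimal sketch of a step-up procedure. It assumes that "Benjamini's method" refers to the standard Benjamini-Hochberg procedure, which the record itself does not confirm. The function name `benjamini_hochberg`, the level `q` and the sample p-values are hypothetical and are not taken from the thesis.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of hypotheses rejected at FDR level q.

    Assumption: this is the textbook Benjamini-Hochberg step-up rule,
    not necessarily the exact variant used in the thesis.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= k * q / m.
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose p-value is among the k_max smallest.
    return sorted(order[:k_max])


if __name__ == "__main__":
    # Made-up p-values for ten hypothetical genes.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

Because the threshold `rank * q / m` shrinks as the number of tests `m` grows, adding more tests can reduce the number of rejections, which is consistent with the behaviour the abstract reports.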
5	Analys av bortfall i en uppföljningsundersökning av hälsa / Analysis of attrition in a longitudinal health study Udd, Mattias, Pettersson, Niklas January 2008 (has links) The LSH study started in 2003 at the Department of Health and Society at the University of Linköping. The purpose of the study was to examine the relationship between life conditions, stress and health. A total of 1007 people from ten different health centres in Östergötlands län participated. At the follow-up, a couple of years later, 795 of the 1007 participated. Of the 212 lost to attrition, 127 declined the follow-up, twelve were not invited (for example because of death) and the rest did not respond at all. The purpose of this paper is to find out to what degree attrition at the follow-up can be predicted using the information from the first survey, and which variables are important. The differences between types of attrition have also been examined. Simple and multiple binomial and multinomial logistic regression were used in the analysis. In total 34 variables were examined, and in the final model six variables remained with a significant relation to attrition. High BMI, regular smoking, high pulse and lack of daily exercise at the first survey were connected to a higher risk of an individual not participating in the follow-up. It is interesting that these factors are considered risk factors for unhealthy living. Other factors related to higher attrition were unemployment in the year before the first survey and having parents born in a country other than Sweden. The risk of attrition increased gradually as an individual showed more risk factors. The factors that led an individual to decline the follow-up rather than not respond at all were being in the older age groups of the survey and not being active in any type of association.
Attrition health study logistic regression multinomial logistic regression Statistics Statistik
6	Analys av bortfall i en uppföljningsundersökning av hälsa / Analysis of attrition in a longitudinal health study Udd, Mattias, Pettersson, Niklas January 2008 (has links) The LSH study started in 2003 at the Department of Health and Society at the University of Linköping. The purpose of the study was to examine the relationship between life conditions, stress and health. A total of 1007 people from ten different health centres in Östergötlands län participated. At the follow-up, a couple of years later, 795 of the 1007 participated. Of the 212 lost to attrition, 127 declined the follow-up, twelve were not invited (for example because of death) and the rest did not respond at all. The purpose of this paper is to find out to what degree attrition at the follow-up can be predicted using the information from the first survey, and which variables are important. The differences between types of attrition have also been examined. Simple and multiple binomial and multinomial logistic regression were used in the analysis. In total 34 variables were examined, and in the final model six variables remained with a significant relation to attrition. High BMI, regular smoking, high pulse and lack of daily exercise at the first survey were connected to a higher risk of an individual not participating in the follow-up. It is interesting that these factors are considered risk factors for unhealthy living. Other factors related to higher attrition were unemployment in the year before the first survey and having parents born in a country other than Sweden. The risk of attrition increased gradually as an individual showed more risk factors. The factors that led an individual to decline the follow-up rather than not respond at all were being in the older age groups of the survey and not being active in any type of association.
Attrition health study logistic regression multinomial logistic regression Statistics Statistik
7	A Comparison Of Remedy Methods For Logistic Regression When Data Are Collinear January 2016 (has links) Heng Wang
8	Modeling the NCAA Tournament Through Bayesian Logistic Regression Nelson, Bryan 18 July 2012 (has links) Many rating systems exist that order the Division I teams in Men's College Basketball that compete in the NCAA Tournament, such as seeding teams on an S-curve and the Pomeroy and Sagarin ratings, which reduce the process of choosing winners to a comparison of two numbers. Rather than creating a rating system, we analyze each matchup by using the difference between the teams' individual regular season statistics as the independent variables. We use an MCMC approach and logistic regression along with several model selection techniques to arrive at models for predicting the winner of each game. When given the 63 actual games in the 2012 tournament, eight of our models performed as well as Pomeroy's rating system and four did as well as Sagarin's rating system. Not allowing the models to fix their mistakes resulted in only one model outperforming both Pomeroy's and Sagarin's systems. / McAnulty College and Graduate School of Liberal Arts / Computational Mathematics / MS / Thesis
9	Topics in ordinal logistic regression and its applications Kim, Hyun Sun 15 November 2004 (has links) Sample size calculation methods for ordinal logistic regression are proposed to test statistical hypotheses. The author was motivated to do this work by the need for statistical analysis of the red imported fire ants data. The proposed methods use the concept of approximation by the moment-generating function. Some correction methods are also suggested. When a prior data set is available, an empirical method is explored. Application of the proposed methodology to the fire ant mating flight data is demonstrated. The proposed sample size and power calculation methods are applied in the hypothesis testing problems. Simulation studies are also conducted to illustrate their performance and to compare them with existing methods. Ordinal logistic regression Sample sizes
10	Factors Affect the Employment of Youth in China Li, Xiaoxue January 2009 (has links) Today's young people are better educated than ever but face a poor employment situation. At the beginning of this paper, I describe the situation both in the world and in China, revealing the poor employment prospects of youth. I then introduce the systems related to youth employment in China and the measures the government has taken to help graduates find jobs. The purpose of this paper is to analyze the employment of young people in China, especially those with medium and high levels of education, and to find which factors contribute to it and how. Using logistic regression in Stata, I find that the main factors are gender, age, living area, political status, major and educational level. The results reveal that discrimination and the gap between rural and urban areas are severe issues in China. Last but not least, I give some suggestions to both society and individuals for improving youth employment. employment logistic regression Economics Nationalekonomi

Search results