Global ETD Search

1	A case study on cumulative logit models with low frequency and mixed effects Alzubaidi, Samirah Hamid January 1900 Master of Science / Department of Statistics / Perla E. Reyes Cuellar / Data with ordinal responses are encountered in many research fields, such as the social, medical, agricultural, and financial sciences. In this paper, we present a case study on cumulative logit models with low frequency and mixed effects and discuss some strengths and limitations of the current methodology. Two plant pathologists requested our statistical advice on fitting a cumulative logit mixed model to assess the effect of six commercial products on the control of a seed and seedling disease in soybeans in vitro. When they attempted to estimate the model parameters using a generalized linear mixed model approach with PROC GLIMMIX, the model failed to converge. Three alternative approaches to solve the problem were examined: 1) stratifying the data to search for the random effect; 2) assuming the random effect would be small and reducing the model to a fixed-effects model; and 3) combining the original categories of the response variable into a smaller number of categories. In addition, we conducted a power analysis to evaluate the sample size required to detect treatment differences. The results of all the proposed solutions were similar. Collapsing categories for a cumulative/proportional odds model has little effect on estimation. The sample size used in the case study is enough to detect a large shift of frequencies between categories, but not moderate changes. Moreover, we do not have enough information to estimate a random effect. Even when it is present, the results regarding the fixed factors (pathogen, evaluation day, and treatment effects) are the same as those obtained by the fixed-model alternatives. All six products had a significant effect in slowing the effect of the pathogen, but the effects vary between pathogen species and assessment timing or date.
Cumulative Logit Multinomial Mixed
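The remark in entry 1 that collapsing categories has little effect under a proportional odds model can be illustrated with a minimal sketch. It assumes the usual parameterization P(Y <= j) = logistic(cutpoint_j - eta); the function name, cutpoints, and linear predictor value are hypothetical and are not taken from the thesis.

```python
import math

def cumulative_logit_probs(cutpoints, eta):
    """Category probabilities under a proportional-odds (cumulative logit) model.

    Assumes P(Y <= j) = logistic(cutpoints[j] - eta) with increasing cutpoints.
    """
    cdf = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    return [cdf[0]] + [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]

# Four ordered categories with hypothetical cutpoints and linear predictor.
full = cumulative_logit_probs([-1.0, 0.5, 2.0], eta=0.8)

# Merging the two middle categories amounts to dropping the middle cutpoint;
# the linear predictor (and hence the covariate effects) is unchanged.
collapsed = cumulative_logit_probs([-1.0, 2.0], eta=0.8)
```

In this sketch the collapsed middle probability equals the sum of the two original middle probabilities, which suggests why merging adjacent categories removes a cutpoint without altering the effect parameters.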
2	Incorporating Dependence Boundaries in Simulating Associated Discrete Data Haynes, Mary E 01 January 2014 In the study of associated discrete variables, limitations on the range of the possible association measures (Pearson correlation, odds ratio, etc.) arise from the form of the joint probability function between the variables. These limitations are known as the Fréchet bounds. The bounds for cases involving associated binary variables are explored in the context of simulating datasets with a desired correlation and set of marginal probabilities. A new method for creating such datasets is compared to an existing method that uses the multivariate probit. A method for simulating associated binary variables using a desired odds ratio and known marginal probabilities is also presented. The Fréchet bounds for correlation between dependent binomial and negative binomial variables are determined as families of ranges in various cases. An example of a realistic analysis involving the Fréchet bounds in a dependent binomial setting is presented. Bernoulli Emrich Piedmonte multinomial sampling NHANES Biostatistics
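For the binary case described in entry 2, the Fréchet bounds on the Pearson correlation follow from the limits on the joint probability P(X=1, Y=1). The sketch below is an illustration under that standard result, not the dissertation's code; the function name is hypothetical.

```python
import math

def bernoulli_correlation_bounds(p1, p2):
    """Return (lower, upper) Fréchet bounds on corr(X, Y) for Bernoulli marginals.

    Uses max(0, p1 + p2 - 1) <= P(X=1, Y=1) <= min(p1, p2).
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("marginal probabilities must lie strictly between 0 and 1")
    scale = math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))
    p11_min = max(0.0, p1 + p2 - 1.0)
    p11_max = min(p1, p2)
    return ((p11_min - p1 * p2) / scale, (p11_max - p1 * p2) / scale)

# Unequal marginals narrow the attainable range of correlations.
lower, upper = bernoulli_correlation_bounds(0.2, 0.7)
```

A simulation that asks for a correlation outside this range cannot succeed for the given marginals, so checking the bounds first is a cheap guard.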
3	Classification of Bone Cements Using Multinomial Logistic Regression Method Wei, Jinglun 29 April 2018 Bone cement surgery is a new technique widely used in the medical field nowadays. In this thesis I analyze 48 bone cement types using their content of 20 elements. My goal is to find a method to classify a newly found bone cement sample into these 48 categories. Here I use the multinomial logistic regression method to see whether it works or not. Because there are too few observations, I generate additional data by repeatedly adding white noise at appropriate scales to the original data, which yields a data set with over 100 times as many points as the original one. I then use the purposeful variable selection method, rather than stepwise selection, to pick the covariates I need. There are 15 covariates left after the selection, and I use my new data set to fit a multinomial logistic regression model with them. The model does not perform especially well in the goodness-of-fit test, but the result is still acceptable, and the diagnostic statistics also indicate a good performance. Combined with clinical experience and prior conditions, this model is helpful in this classification case. multinomial logistic regression classification bone cement
4	Co-relation of Variables Involved in the Occurrence of Crane Accidents in U.S. through Logit Modeling. Bains, Amrit Anoop Singh 2010 August 1900 One of the primary reasons for the escalating rates of injuries and fatalities in the construction industry is the complex, dynamic, and continually changing nature of construction work. Use of cranes has become imperative to overcome technical challenges, which has led to an escalation of danger on construction sites. Data from OSHA show that crane accidents increased rapidly from 2000 to 2004. By analyzing the characteristics of all the crane accident inspections, we can better understand the significance of the many variables involved in a crane accident. For this research, data were collected from the U.S. Department of Labor website via the OSHA database. The data encompass crane accident inspections for all the states. The data were divided into categories with respect to accident types, construction operations, degree of accident, fault, contributing factors, crane types, victim’s occupation, organs affected and load. Descriptive analysis was performed to complement the previous studies, the only difference being that both fatal and non-fatal accidents have been considered. Multinomial regression has been applied to derive probability models and correlation between different accident types and the factors involved for each crane accident type. A log likelihood test as well as a chi-square test was performed to validate the models. The results show that electrocution, crane tip over and crushed during assembly/disassembly have a higher probability of occurrence than other accident types. Load is not a significant factor for the crane accidents, and manual fault is a more probable cause of crane accidents than technical fault. Construction operations identified in the research were found to be significant for all the crane accident types.
Mobile crawler cranes, mobile truck cranes and tower cranes were found to be more susceptible. These probability models are limited as far as the inclusion of unforeseen variables in construction accidents is concerned. In fact, these models use the past to portray the future, and therefore significant changes in the variables involved need to be incorporated to obtain correct and useful results. Cranes Accidents Multinomial Regression Logit Modeling
5	Analysis of Whole Milk vs. Low-Fat Milk Consumption Among WIC Children Before Programmatic Changes Bayar, Emine 2011 May 1900 The Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) is one of the food assistance programs targeted at low-income women, infants and children up to age five, providing foods, nutrition education and other services. Recent updates in food packages provided by WIC include the addition of fruits, vegetables and whole wheat products as well as the removal of whole milk for women and children two years and older. This thesis concentrates on preschool children participating in the WIC program and their milk consumption habits prior to the programmatic changes. Analyzing the diet preferences of these children is crucial since a quarter of the population of children aged one through five participates in the WIC program; moreover, they are not eligible to receive whole milk with WIC food packages after the implementation of the revisions. The objective is to describe the profile of preschool WIC children and their milk consumption attributes based on the National Food and Nutrition (NATFAN) questionnaire designed and conducted by the Institute for Obesity Research and Program Evaluation at Texas A & M University before the release of the revised WIC food packages. Additionally, findings of the study are compared with the National Health and Nutrition Examination Survey (NHANES) 2005-2006 dataset results. Milk consumption preferences of WIC children are analyzed nationwide and impacts of race, ethnicity, regional, and other demographic characteristics are observed. Using both NATFAN and NHANES datasets provides a comparison of actual and self-reported participation outcomes. Discrete choice models were used in this analysis, in particular binary logit and multinomial logit models.
The results of the thesis indicate that WIC preschool children mostly drink whole milk (36.17 percent) and 2 percent fat milk (49.94 percent). Two-year-old participants, children located in the South and participants whose caregivers are younger and less educated are more likely to consume whole milk. Caucasian children are less likely to choose whole milk and more likely to choose reduced fat milk; African Americans are more likely to select whole milk. Furthermore, the diet preferences and knowledge of parents/caregivers play a major role in the milk consumption of children. Children whose caregivers are willing to give low-fat milk to children aged two to five are less likely to drink whole milk. Milk consumption WIC Children Multinomial Logit Model
6	Dirichlet Process Mixture Models for Nested Categorical Data Hu, Jingchen January 2015 This thesis develops Bayesian latent class models for nested categorical data, e.g., people nested in households. The applications focus on generating synthetic microdata for public release and imputing missing data for household surveys, such as the 2010 U.S. Decennial Census. The first contribution is methods for evaluating disclosure risks in fully synthetic categorical data. I quantify disclosure risks by computing Bayesian posterior probabilities that intruders can learn confidential values given the released data and assumptions about their prior knowledge. I demonstrate the methodology on a subset of data from the American Community Survey (ACS). The methods can be adapted to synthesizers for nested data, as demonstrated in later chapters of the thesis. The second contribution is a novel two-level latent class model for nested categorical data. Here, I assume that all configurations of groups and units are theoretically possible. I use a nested Dirichlet Process prior distribution for the class membership probabilities. The nested structure facilitates simultaneous modeling of variables at both group and unit levels. I illustrate the modeling by generating synthetic data and imputing missing data for a subset of data from the 2012 ACS household data. I show that the model can capture within group relationships more effectively than standard one-level latent class models. The third contribution is a version of the nested latent class model adapted for theoretically impossible combinations, e.g. a household with two household heads or a child older than her biological father. This version assigns zero probability to those impossible groups and units. I present a proof that the Markov Chain Monte Carlo (MCMC) sampling strategy estimates the desired target distribution.
I illustrate this model by generating synthetic data and imputing missing data for a subset of data from the 2011 ACS household data. The results indicate that this version can estimate the joint distribution more effectively than the previous version. / Dissertation Statistics Confidentiality Disclosure Latent Multinomial Synthetic
7	Migração inter-regional no Brasil : determinantes e perfil do migrante brasileiro no perído 1980-2000 Ribeiro Justo, Wellington January 2006 Using microdata from the 1980, 1991, and 2000 Demographic Censuses, this work describes migration patterns along several dimensions: inter-regional, interstate, rural-urban, and urban-urban. In terms of interstate migration, Minas Gerais is the state with the largest relative share of out-migrants, although that share declines over the period analyzed. In another dimension, inter-regional migration, the results point to an increase in the negative net stock of migrants from the Northeast to the other Brazilian regions, from just over 4 million in 1980 to more than 8 million in 2000. Over the same period, the South goes from a positive net balance of more than 200 thousand migrants to a negative net balance of more than 1.2 million migrants. As for rural-urban migration, the Northeast stands out as the main source of this type of migration, although this flow tends to decline over time, not only from the Northeast but from all Brazilian regions. The work also sought to provide evidence on the determinants of migration flows, exploring two dimensions that studies of migration flows in Brazil have seldom emphasized: the importance of income uncertainty in the decision to migrate and the importance of labor market characteristics.
To that end, always drawing on the Census microdata, the work highlights the importance of the expected income variable (income weighted by the chance of finding a job) and the effect of distance and population (through the spatial transformation matrix). The results, obtained from panel data and a spatial transformation of the variables, indicate that spatial control is essential for capturing the effect of the variables on the migration flow. Still based on the census microdata, the work provides evidence on the profile of the Brazilian internal migrant according to region of destination. By estimating a multinomial logit model for the decision to migrate and the choice of destination, the results made it possible to identify the differences between migrants and non-migrants, and among migrants themselves according to region of destination, for all census years. Among the evidence obtained, it is shown that, whatever the region of destination and the period of migration between 1980 and 2000, the Brazilian migrant has a profile distinct from that of the non-migrant: more educated, younger, predominantly male, and more likely to come from a state (UF) in relatively precarious social conditions. While differences among migrants by region of destination grow over 1980-1991, between 1991 and 2000 migrants become regionally more alike. Finally, again using the census microdata, the work tested whether Brazilian migrants form a positively selected group (that is, on average more able, entrepreneurial, motivated, and ambitious than the group of non-migrants).
The results indicate that there is, on average, an income difference in favor of migrants relative to non-migrants living in the states that receive them, as well as relative to non-migrants in their states of origin, even when controlling for a series of variables that are important in determining income. The results therefore suggest that Brazilian migrants constitute a positively selected group. Migration Multinomial logit Migrant selection Migrant profile
8	Escolha de cursos de graduação na Universidade Federal de Pernambuco : um estudo de seus determinantes Firmino Costa da Silva, Diego 31 January 2010 Conselho Nacional de Desenvolvimento Científico e Tecnológico / The main objective of this work was to analyze how students applying for places at the Universidade Federal de Pernambuco have chosen which career to pursue. It first studied how the candidates' socioeconomic characteristics influence the choice, and then analyzed how the salary returns associated with the available groups of courses combine with individual characteristics to influence the decision about which profession to pursue. For this purpose, data for the 2009 entrance exam (vestibular) were used from Covest, the commission that organizes the entrance exam of the Universidade Federal de Pernambuco. Data from PNAD 2008 were also used to estimate the average salary of each career. The courses offered by UFPE were divided into 9 groups according to the professional proximity of the careers, and the estimations used the multinomial logit and conditional logit econometric models. The first model uses only explanatory variables related to the individuals, whereas the second is used when some explanatory variable is a characteristic of the alternatives. The results point to the influence that personal variables, family background, and educational variables exert on the probability of choosing each of the available groups of courses.
It was also found that the salary return has no significant effect on the probability of choice once the variables representing individual characteristics are taken into account. Salary Conditional Logit Multinomial Logit Occupational Choice
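Entry 8 distinguishes the multinomial logit (explanatory variables describe the individual) from the conditional logit (explanatory variables describe the alternatives). A minimal sketch of the two probability formulas follows; the function names and all numeric values are hypothetical illustrations, not estimates from the study.

```python
import math

def softmax(utilities):
    """Turn a list of utilities into choice probabilities (numerically stable)."""
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def multinomial_logit_probs(x, betas):
    """x: covariates of one individual; betas: one coefficient vector per
    alternative (the baseline alternative has all zeros)."""
    return softmax([sum(b * xi for b, xi in zip(beta, x)) for beta in betas])

def conditional_logit_probs(z, gamma):
    """z: one attribute vector per alternative; gamma: coefficients shared
    by all alternatives."""
    return softmax([sum(g * zi for g, zi in zip(gamma, z_alt)) for z_alt in z])

# Hypothetical individual (intercept, family income index) and three course groups.
p_mnl = multinomial_logit_probs([1.0, 0.3], [[0.0, 0.0], [0.5, 1.2], [-0.2, 0.4]])

# Hypothetical expected log salary for each course group, with one shared coefficient.
p_cl = conditional_logit_probs([[7.5], [8.1], [7.9]], [0.6])
```

The design difference is where the coefficients vary: across alternatives in the multinomial logit, and not at all in the conditional logit, where only the attributes differ between alternatives.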
9	Modeling Transition Probabilities for Loan States Using a Bayesian Hierarchical Model Monson, Rebecca Lee 30 November 2007 A Markov Chain model can be used to model loan defaults because loans move through delinquency states as the borrower fails to make monthly payments. The transition matrix contains in each location a probability that a borrower in a given state one month moves to the possible delinquency states the next month. In order to use this model, it is necessary to know the transition probabilities, which are unknown quantities. A Bayesian hierarchical model is postulated because there may not be sufficient data for some rare transition probabilities. Using a hierarchical model, similarities between types or families of loans can be taken advantage of to improve estimation, especially for those probabilities with little associated data. The transition probabilities are estimated using MCMC and the Metropolis-Hastings algorithm. MCMC Hierarchical Model multinomial Dirichlet Statistics and Probability
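The idea in entry 9 of borrowing strength for sparsely observed transitions can be sketched with a much simpler stand-in: a conjugate Dirichlet prior on each row of the transition matrix, which has a closed-form posterior mean. This is not the hierarchical Metropolis-Hastings model of the thesis; the state names, counts, and pseudo-counts below are hypothetical.

```python
def posterior_mean_transitions(counts, prior):
    """Posterior mean of each transition-matrix row under a Dirichlet prior.

    counts[i][j]: observed moves from state i to state j in one month.
    prior[i][j]:  Dirichlet pseudo-counts for the same move.
    """
    rows = []
    for count_row, prior_row in zip(counts, prior):
        total = sum(count_row) + sum(prior_row)
        rows.append([(c + a) / total for c, a in zip(count_row, prior_row)])
    return rows

# Hypothetical states: current, 30 days delinquent, default.
counts = [[90, 10, 0],
          [5, 3, 2],
          [0, 0, 0]]   # no observed moves out of the last state
prior = [[1.0, 1.0, 1.0]] * 3

P = posterior_mean_transitions(counts, prior)
```

Rows with little or no data fall back toward the prior, while well-observed rows are dominated by their counts; a hierarchical model would additionally learn the prior from related loan types.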
10	Diagnostics in some Discrete Choice Models Nagel, Herbert, Hatzinger, Reinhold January 1990 Discrete choice models form a class of models widely used in econometrics for modelling the individual choice from a finite set of alternatives. The most widely used model is the multinomial logit model, implicitly assuming independence of irrelevant alternatives. A generalization is the nested multinomial logit model, relaxing this strong assumption. Viewing both models as nonlinear regression models, a set of diagnostics is derived. This includes a hat matrix, measures of leverage, influence and residuals and an approximation to the parameters for case deletion. In an example for the multinomial logit model a good performance of these diagnostics is observed and the parameter approximation by the proposed formula is better than a one step Newton-Raphson procedure. In an example for the nested logit model a constructed outlier with high influence is revealed by the measures of leverage and residual, but the parameter approximation is insufficient. (author's abstract) / Series: Forschungsberichte / Institut für Statistik
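The independence of irrelevant alternatives mentioned in entry 10 can be checked numerically: under the multinomial logit, the ratio of two choice probabilities does not change when a third alternative is removed. The sketch below uses hypothetical alternatives and utilities chosen only for illustration.

```python
import math

def logit_choice_probs(utilities):
    """utilities: dict mapping each alternative to its systematic utility."""
    m = max(utilities.values())
    exps = {alt: math.exp(u - m) for alt, u in utilities.items()}
    total = sum(exps.values())
    return {alt: e / total for alt, e in exps.items()}

# Hypothetical travel-mode utilities.
full = logit_choice_probs({"car": 1.0, "bus": 0.4, "train": 0.7})
reduced = logit_choice_probs({"car": 1.0, "bus": 0.4})

# Both ratios equal exp(1.0 - 0.4), with or without the train alternative.
ratio_full = full["car"] / full["bus"]
ratio_reduced = reduced["car"] / reduced["bus"]
```

The nested logit relaxes exactly this property by allowing alternatives within a nest to substitute more closely for one another.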
