51 |
Bayesian estimation of factor analysis models with incomplete data
Merkle, Edgar C., January 2005
Thesis (Ph. D.)--Ohio State University, 2005. / Title from first page of PDF file. Document formatted into pages; contains xi, 106 p.; also includes graphics. Includes bibliographical references (p. 103-106). Available online via OhioLINK's ETD Center
|
52 |
Inequality of opportunity : measurement and impact on economic growth / Inégalité d'opportunité : mesure et effet sur la croissance économique
Teyssier, Geoffrey, 17 November 2017
This thesis is about the measurement of inequality of opportunity and its impact on economic growth. Chapter 1 studies the axiomatic properties of two prominent measurement approaches. In both cases, the population is partitioned into groups of people sharing the same circumstances, those income determinants that are beyond individual control (e.g. sex or parental background) and that shape one's opportunities. Inequality of opportunity is then measured by applying an inequality index to a counterfactual distribution in which each individual is attributed the representative income of their group. The first approach takes the representative income of a group to be its arithmetic mean. When a large number of small groups are considered, these means can be poorly estimated. To mitigate this issue, the second approach, called parametric, assumes that circumstances have no interaction effect and takes the representative income to be the OLS predicted value of income regressed on circumstances. Chapter 1 shows that the parametric approach has poor axiomatic properties, especially with respect to a between-group version of the transfer principle. Chapter 2 provides a methodology to circumvent the current lack of microdata on parental background circumstances, a major driver of inequality of opportunity. The idea is to retrieve the parental background of adults living with their parents thanks to the structure of household survey data, and then to apply a missing data procedure, multiple imputation, to obtain estimates of inequality of opportunity that are representative of the overall adult population. These estimates are shown to be close to their "true" counterparts, based on direct questions about parental background contained in the Brazilian PNAD 1996 survey. Chapter 3 empirically investigates a recent and promising explanation for the inconclusiveness of the traditional growth-inequality literature: income inequality does not matter for growth, while its components (inequality of opportunity and the residual component, inequality of effort) do. This explanation is validated in Brazil at the municipality level over the period 1980-2010, where inequalities of opportunity and effort are respectively detrimental and beneficial to subsequent growth, as expected. Their effects are robust and significant, in contrast to that of total income inequality.
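The two measurement approaches described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the author's code: the function names, the toy incomes, and the choice of the mean log deviation as the inequality index are all hypothetical, and circumstances are assumed to be coded as 0/1 dummies that are not collinear.

```python
import math
from collections import defaultdict


def mean_log_deviation(incomes):
    """Mean log deviation; an assumed choice of inequality index."""
    mu = sum(incomes) / len(incomes)
    return sum(math.log(mu / y) for y in incomes) / len(incomes)


def group_mean_counterfactual(incomes, circumstances):
    """First approach: everyone receives the arithmetic mean of their group."""
    totals = defaultdict(lambda: [0.0, 0])
    for y, c in zip(incomes, circumstances):
        totals[c][0] += y
        totals[c][1] += 1
    return [totals[c][0] / totals[c][1] for c in circumstances]


def ols_counterfactual(incomes, circumstances):
    """Parametric approach: OLS fitted values with main effects only."""
    rows = [[1.0, *map(float, c)] for c in circumstances]  # intercept + dummies
    k = len(rows[0])
    # Normal equations (X'X) beta = X'y, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, incomes))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting (assumes X'X is invertible).
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return [sum(b * x for b, x in zip(beta, r)) for r in rows]


# Hypothetical data: circumstances are (sex, parental background) dummies.
incomes = [10, 12, 20, 22, 30, 34, 60, 64]
circumstances = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]

nonparametric = group_mean_counterfactual(incomes, circumstances)
parametric = ols_counterfactual(incomes, circumstances)
print("group-mean IOp:", mean_log_deviation(nonparametric))
print("parametric IOp:", mean_log_deviation(parametric))
```

In this toy data the two counterfactual distributions differ because the group means contain an interaction between the two circumstances, which the main-effects regression cannot reproduce.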
|
53 |
Three-Level Multiple Imputation: A Fully Conditional Specification Approach
January 2015
abstract: Currently, there is a clear gap in the missing data literature for three-level models. To date, the literature has focused only on the theoretical and algorithmic work required to implement three-level imputation using the joint model (JM) method of imputation, leaving relatively little work done on the fully conditional specification (FCS) method. Moreover, the literature lacks any methodological evaluation of three-level imputation. Thus, this thesis serves two purposes: (1) to develop an algorithm to implement FCS in the context of a three-level model and (2) to evaluate both imputation methods. The simulation investigated a random intercept model under both 20% and 40% missing data rates. The findings of this thesis suggest that the estimates for both JM and FCS were largely unbiased, gave good coverage, and produced similar results. The sole exception for both methods was the slope for the level-3 variable, which was modestly biased. The bias exhibited by the methods could be due to the small number of clusters used. This finding suggests that future research ought to investigate and establish clear recommendations for the number of clusters required by these imputation methods. To conclude, this thesis serves as a preliminary start in tackling a much larger issue and gap in the current missing data literature. / Dissertation/Thesis / Masters Thesis Psychology 2015
|
54 |
Imputação múltipla de dados faltantes: exemplo de aplicação no Estudo Pró-Saúde / Multiple imputation of missing data: application in the Pro-Saude Program
Thaís de Paulo Rangel, 05 March 2013
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Missing data are a common problem in epidemiologic studies and, depending on how they occur, the resulting estimates may be biased. The literature offers several techniques for dealing with the problem, and multiple imputation has received increasing attention in recent years. This dissertation presents the results of applying multiple imputation of missing data in the context of the Pró-Saúde Study, a longitudinal study among technical and administrative staff at a university in Rio de Janeiro, Brazil. In the first paper, after simulating the occurrence of missing data, the color/race variable of the female participants was imputed and analyzed with a previously established survival model whose outcome was self-reported history of uterine leiomyoma. The procedure was replicated 100 times to determine the distribution of the coefficients and standard errors of the estimates for the variable of interest. Although the data were collected cross-sectionally (baseline data of the Pró-Saúde Study, gathered in 1999 and 2001), the participants' follow-up history was reconstructed from their self-reports, creating a situation in which the Cox proportional hazards model could be applied. In the scenarios evaluated, imputation gave satisfactory results, including in the performance evaluation. The technique performed well when the missing-data mechanism was MAR (Missing At Random) and the nonresponse rate was 10%. Imputing the data and combining the estimates from the 10 resulting datasets (m=10) gave a bias of 0.0011 for the black category and 0.0015 for the brown (mixed-race) category, corroborating the efficiency of imputation in this scenario. Other configurations gave similar results. In the second paper, a tutorial was developed to guide the application of multiple imputation in epidemiologic studies, which should facilitate use of the technique by Brazilian researchers not yet familiar with the procedure. The basic steps and decisions needed to impute a dataset are presented, and one of the scenarios from the first paper is used as an application example. All analyses were conducted in the R statistical software, version 2.15, and the scripts used are presented at the end of the text.
|
55 |
Predicting risk of cyberbullying victimization using lasso regression
Olaya Bucaro, Orlando, January 2017
The increased online presence and use of technology by today's adolescents have created new places where bullying can occur. The aim of this thesis is to specify a model that can accurately predict the risk of cyberbullying victimization. The data come from a survey conducted at five secondary schools in Pereira, Colombia. A logistic regression model with random effects is used to predict cyberbullying exposure. Predictors are selected by lasso, tuned by cross-validation. The covariates considered include demographic variables, dietary habit variables, parental mediation variables, school performance variables, physical health variables, mental health variables and health risk variables such as alcohol and drug consumption. The final model retains demographic, mental health and parental mediation variables, and excludes dietary habit, school performance, physical health and health risk variables. The final model has an overall prediction accuracy of 88%.
|
56 |
應用多重插補法在包含遺漏資料的離散選擇模型 / Applying Multiple Imputation to the Discrete Choice Model with Missing Data
簡廷翰, Jian, Ting Han, Unknown Date
This paper uses the logit model, a discrete choice model, to analyze incomplete data. Two ways of handling the missing values are compared in terms of their parameter estimates: complete case analysis, which removes every observation with a missing value, and multiple imputation.
The multiple imputation method used is Multiple Imputation by Chained Equations (MICE), proposed by Buuren et al. (2007), and the estimates from the imputed data sets are pooled using Rubin's (1987) method. Box plots of the parameter bias from the simulation show that the estimates from the imputed data differ little from the true parameters. The number of imputations has little effect on the estimates. When the missing percentage is large, the estimates from the imputed data are more stable than those from complete case analysis.
In the real data analysis, the standard deviations of the estimates from complete case analysis tend to be larger than those from the imputed data, which agrees with the simulation.
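The pooling step mentioned in this abstract can be illustrated with a minimal sketch of Rubin's (1987) combining rules for a single coefficient. The function name `pool_rubin` and the numbers below are hypothetical and are not taken from the thesis; the sketch assumes each imputed data set yields a point estimate and its squared standard error.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical logit coefficient from m = 5 imputed data sets.
coefs = [0.52, 0.48, 0.55, 0.50, 0.45]
ses_squared = [0.010, 0.011, 0.009, 0.010, 0.012]
pooled, pooled_se = pool_rubin(coefs, ses_squared)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {pooled_se:.3f}")
```

The between-imputation term is what distinguishes the pooled standard error from a simple average: it grows when the imputed data sets disagree, which reflects the extra uncertainty caused by the missing values.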
|
57 |
Essays in Political Methodology
Blackwell, Matthew, 24 July 2012
This dissertation contributes three novel methodologies to the field of political science. In the first chapter, I describe how to make causal inferences in the face of dynamic strategies. Traditional causal inference methods assume that these dynamic decisions are made all at once, an assumption that forces a choice between omitted variable bias and post-treatment bias. I resolve this dilemma by adapting methods from biostatistics and use these methods to estimate the effectiveness of an inherently dynamic process: a candidate's decision to "go negative." Drawing on U.S. statewide elections (2000-2006), I find, in contrast to the previous literature, that negative advertising is an effective strategy for non-incumbents. In the second chapter, I develop a method for handling measurement error. Social scientists devote considerable effort to mitigating measurement error during data collection but then ignore the issue during analysis. Although many statistical methods have been proposed for reducing measurement error-induced biases, few have been widely used because of implausible assumptions, high levels of model dependence, difficult computation, or inapplicability with multiple mismeasured variables. This chapter develops an easy-to-use alternative without these problems; it treats missing data as a special case of extreme measurement error and corrects for both. In the final chapter, I introduce a "game-changers" model for detecting changepoints in the distribution of contributions data; the model allows for overdispersion, a key feature of contributions data. While many extant changepoint models force researchers to choose the number of changepoints ex ante, the game-changers model incorporates a Dirichlet process prior in order to estimate the number of changepoints along with their locations. I demonstrate the usefulness of the model in data from the 2012 Republican primary and the 2008 U.S. Senate elections. / Government
|
58 |
Changes in the sexual function of male patients with rectal cancer over a 2-year period from diagnosis to 24-month follow-up: A prospective, multicenter, cohort study / 男性直腸癌に対する腹腔鏡下根治術後の性機能推移:多施設共同前向き観察研究
Sakamoto, Takashi, 23 March 2021
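Kyoto University / New-system course doctorate / Doctor of Medical Science / Degree No. Kō 23075 / Medical doctorate No. 4702 / 新制||医||1049 (University Library) / Kyoto University Graduate School of Medicine, Division of Medicine / (Chief examiner) Professor Koji Kawakami, Professor Naoki Kondo, Professor Osamu Ogawa / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Medical Science / Kyoto University / DFAM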
|
59 |
(Re)-Examining the Influence of Program Placement on the Academic Achievement of Students with Learning Disabilities
McKibbin, Steven, 17 July 2020
This study explored the relationship between several variables known to influence achievement in Canadian Grade 6 students with Learning Disabilities (LD) who received instruction in either a regular class or specialized program placement. The main independent variable was program placement while the influence of four other independent variables was explored (i.e., level of academic need; prior achievement; socioeconomic status and sex). The dependent variable was a standardized, large-scale assessment of achievement. Hierarchical multiple regression was conducted on a secondary data file in order to address the following research questions: i) Does placement in a regular or specialized program influence the educational outcomes for Grade 6 students with LD, after controlling for the influence of prior achievement in Grade 3?; ii) Is there a relationship between the sociodemographic variables of sex and/or socioeconomic status and achievement for students with LD placed in either a regular or specialized program?; and iii) What influence does the student’s level of academic need have on achievement, beyond program placement, and after controlling for the influence of the other variables in the model? Results revealed that the variables of program placement and prior achievement were significant predictors of scholastic success only when the level of academic need variable was not taken into account. When the follow-up analysis focused on a relatively matched group of students with similar academic need, none of the predictors in the regression model significantly influenced achievement -- including program placement. These results provide important insight into the nuanced relationship of the ecological variables known to affect learning in students with LD placed in regular or specialized programs for instruction. 
Implications are discussed for stakeholders in Ontario’s public education system in terms of the optimum service delivery model for students with LD, and the inclusive education debate in Canada and abroad.
|
60 |
Model-based Multiple Imputation by Chained-equations for Multilevel Data below the Limit of Detection
Xu, Peixin, 24 May 2022
No description available.
|