  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
151

Métodos de imputação de dados aplicados na área da saúde

Nunes, Luciana Neves January 2007 (has links)
Missing data are a very common problem in health research. Faced with them, researchers often simply exclude subjects with non-response on one or more variables, because most traditional statistical techniques were developed for complete data sets. This exclusion, however, can produce invalid inferences, particularly when the subjects who remain in the analysis differ from those who were excluded. Over the last two decades, imputation methods have been developed to address this problem; their underlying idea is to fill in the missing data with plausible values. The most complex of these methods is multiple imputation. This dissertation aims to disseminate the multiple imputation method through two papers. The first describes two multiple imputation techniques and applies them to a real data set. The second compares multiple imputation with two single-imputation techniques in an application to a risk model for surgical mortality. Both applications use secondary data previously analyzed by Klück (2004).
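The core of multiple imputation described in this abstract — complete the data several times, analyze each completed set, then pool with Rubin's rules — can be illustrated with a minimal Python sketch. The imputation model here (resampling observed values) is a deliberately simple stand-in, not the technique used in the dissertation; the variable, sample sizes, and missingness rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one variable with ~20% of values missing completely at random.
x = rng.normal(loc=50.0, scale=10.0, size=200)
mask = rng.random(x.size) < 0.2
x_obs = x[~mask]

m = 20                                    # number of imputed data sets
estimates, variances = [], []
for _ in range(m):
    # Simple stochastic imputation: draw missing values from the empirical
    # distribution of the observed ones (a stand-in for a proper imputation
    # model such as regression with added noise).
    completed = x.copy()
    completed[mask] = rng.choice(x_obs, size=mask.sum(), replace=True)
    estimates.append(completed.mean())                      # per-set estimate
    variances.append(completed.var(ddof=1) / completed.size)  # its variance

# Rubin's rules: pool the m estimates and combine the variance components.
q_bar = np.mean(estimates)                # pooled point estimate
u_bar = np.mean(variances)                # within-imputation variance
b = np.var(estimates, ddof=1)             # between-imputation variance
total_var = u_bar + (1 + 1 / m) * b
print(q_bar, np.sqrt(total_var))
```

The between-imputation term `b` is what single imputation discards: it carries the extra uncertainty due to not knowing the missing values.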
154

Distance estimation for mixed continuous and categorical data with missing values

Azevedo, Glauco Gomes de 04 June 2018 (has links)
In this work we propose a methodology for estimating pairwise distances between mixed continuous and categorical data points with missing values. Distance estimation underlies many regression and classification methods, such as nearest neighbors and discriminant analysis, and clustering techniques such as k-means and k-medoids. Classical approaches to missing data rely on mean imputation, which can underestimate the variance, or on regression-based imputation. Unfortunately, when the goal is to estimate the distance between observations, data imputation may perform poorly and bias the results toward the imputation model. Here we estimate the pairwise distances directly, treating the missing data as random. The joint distribution of the data is approximated by a multivariate mixture model for mixed continuous and categorical data. We present an EM-type algorithm for estimating the mixture and a general methodology for estimating the distance between observations. Simulations show that the proposed method performs well on both simulated and real data.
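The idea of treating missing coordinates as random rather than imputing them can be sketched for the simplest case: an expected squared Euclidean distance under an independent per-feature Gaussian. This is a deliberate simplification of the dissertation's mixture model for mixed data — the function name and the plug-in estimates below are illustrative assumptions, not its actual algorithm.

```python
import numpy as np

def expected_sq_dist(xi, xj, mu, var):
    """Expected squared Euclidean distance between two observations with
    NaN-coded missing coordinates, treating each missing coordinate as a
    draw from an independent Gaussian N(mu[k], var[k])."""
    d = 0.0
    for k in range(len(mu)):
        mi, mj = np.isnan(xi[k]), np.isnan(xj[k])
        if not mi and not mj:
            d += (xi[k] - xj[k]) ** 2          # both observed: plain term
        elif mi and mj:
            d += 2.0 * var[k]                  # E[(X - Y)^2] for X, Y iid
        else:
            obs = xj[k] if mi else xi[k]
            d += (mu[k] - obs) ** 2 + var[k]   # E[(X - obs)^2]
    return d

X = np.array([[1.0, np.nan, 3.0],
              [2.0, 0.5,    np.nan]])
mu = np.nanmean(X, axis=0)           # crude plug-in feature means
var = np.array([1.0, 1.0, 1.0])      # assumed known here for simplicity
print(expected_sq_dist(X[0], X[1], mu, var))   # → 3.0
```

Note the `+ var[k]` term: it is exactly the variance contribution that mean imputation would drop, which is why imputation tends to understate distances.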
155

Parametric LCA approaches for efficient design / Approches d'ACV paramétriques pour une conception performante

Kozderka, Michal 13 December 2016 (has links)
Ces travaux de recherche portent sur la problématique de la mise en pratique de l'analyse de cycle de vie (ACV). La question principale est : comment faire une ACV plus rapide et plus facilement accessible pour la conception des produits ? Nous nous concentrons sur deux problématiques qui prolongent l'inventaire de Cycle de Vie (ICV) : • recherche des données manquantes : Comment ranger les données manquantes selon leur importance? Comment traiter l'intersection des aspects qualitatifs et les aspects quantitatifs des données manquantes? • Modélisation du cycle de vie : Comment réutiliser le cycle de vie existant pour un nouveau produit? Comment développer un modèle de référence? Pour la recherche des solutions nous avons utilisé l'approche "Case study" selon Robert Yin. Nos contributions font résultat de trois études de cas, dont la plus importante est l'ACV du High Impact Polypropylene (HIPP) recyclé. Nous avons publié les résultats de celle-ci dans la revue scientifique Journal of Cleaner Production. Suite aux études de cas nous proposons deux approches d'amélioration d'efficacité en ICV : nous proposons l'analyse de sensibilité préalable pour classifier les données manquantes selon leur impact sur les résultats d'ACV. L'approche combine les aspects quantitatifs avec les aspects quantitatifs en protégeant le respect des objectives d'étude. Nous appelons cette protection "LCA Poka-Yoké". La modélisation du cycle de vie peut être assistée grace à la méthode basée sur l'algorithme de King. Pour la continuation de la recherche nous proposons huit perspectives, dont six font l'objet d'intégration des nouvelles approches d'amélioration dans les concepts d'ACV basés sur la norme ISO 14025 ou dans le projet de la Commission Européenne PEF. 
/ This work addresses the different issues that put a brake to using Lifecycle assessment (LCA) in product design by answering the main question of the research: How to make Lifecycle assessment faster and easier accessible for manufactured product design? In the LCA methodology we have identified two issues to deal with and their consecutive scientific locks : • Research of missing data : How to organize missing data? How to respect quantitative and qualitative dimensions? • Modeling of the lifecycle scenario : How to translate methodological choices into the lifecycle scenario model? How to transform the reference scenario into a new one? We have dealt with these issues using the scientific approach Case study according toRobert Yin. Our contributions are based on three case studies, between which the most important is study of High Impact Polypropylene recycling in the automotive industry. We have published it in the Journal of Cleaner Production. As result of our research we present two methods to improve efficiency of the LifecycleInventory Analysis (LCI) : To organize the missing data: Preliminary sensitivity analysis with LCA Poka-Yoke ; To help with scenario modeling: Method of workflows factorization, based on Reverse engineering. For further research we propose eight perspectives, mostly based on integration of our methods into Product Category Rules (PCR)-based platforms like EPD International or the European PEF.
156

Methods for handling missing data due to a limit of detection in longitudinal lognormal data

Dick, Nicole Marie January 1900 (has links)
Master of Science / Department of Statistics / Suzanne Dubnicka / In animal science, challenge model studies often produce longitudinal data. Many times the lognormal distribution is useful in modeling the data at each time point. Escherichia coli O157 (E. coli O157) studies measure and record the concentration of colonies of the bacteria. There are times when the concentration of colonies present is too low, falling below a limit of detection, and a zero is recorded for the concentration. Researchers employ a method of enrichment to determine whether E. coli O157 was truly absent: this enrichment process searches for bacteria colony concentrations a second time to confirm or refute the previous measurement. If enrichment comes back without evidence of any bacteria colonies, a zero remains as the observed concentration; if it comes back with bacteria colonies present, a minimum value is imputed for the concentration. At the conclusion of the study the data are log10-transformed. One problem with the transformation is that the log of zero is mathematically undefined, so any observed concentrations still recorded as a zero after enrichment cannot be log-transformed. Current practice carries the zero value from the lognormal data to the normal data. The purpose of this report is to evaluate methods for handling missing data due to a limit of detection and to provide results for various analyses of the longitudinal data. Multiple methods of imputing a value for the missing data are compared, and each method is analyzed by fitting three different models using SAS. To determine which method most accurately explains the data, a simulation study was conducted.
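The limit-of-detection problem the abstract describes — zeros that cannot be log10-transformed — is commonly handled by substituting a fraction of the detection limit before transforming. A minimal Python sketch follows; the detection limit, the data values, and the two substitution rules (LOD/2 and LOD/√2, both standard in the LOD literature) are illustrative assumptions, not necessarily the methods compared in the report, which uses SAS.

```python
import numpy as np

LOD = 100.0   # detection limit, illustrative value

# Observed concentrations; zeros mark samples still below the detection
# limit after enrichment.
conc = np.array([0.0, 350.0, 1200.0, 0.0, 80000.0])

def log10_with_lod(conc, lod, method="half"):
    """Impute non-detects before the log10 transform.
    'half' substitutes LOD/2; 'sqrt2' substitutes LOD/sqrt(2)."""
    fill = lod / 2.0 if method == "half" else lod / np.sqrt(2.0)
    out = np.where(conc < lod, fill, conc)   # replace non-detects
    return np.log10(out)                     # now defined everywhere

print(log10_with_lod(conc, LOD))
```

Because every zero is replaced by a strictly positive value, the log10 transform is defined for the whole series and the longitudinal models can use all time points.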
157

多重插補法在線上使用者評分之應用 / Managing online user-generated product reviews using multiple imputation methods

李岑志, Li, Cen Jhih Unknown Date (has links)
Online user-generated product reviews have become a rich source of product quality information for both producers and customers. Many e-commerce websites therefore let customers rate products with an overall score, and some also collect text comments. However, customers usually comment only on the features they care about, and later customers may omit features that earlier reviews have already mentioned; when the text comments are quantified into feature scores, the resulting data contain many missing values. Customers may also mention features that influence neither their satisfaction nor sales volume, so it is important to identify the features that matter most, allowing manufacturers to fix the most important defects. This research models customer reviews and their influence on the overall rating, and asks whether, after the missing values are filled in, the critical features can be identified and the imputed feature ratings authentically reflect customer opinion. Many previous approaches impute the whole data set at once, ignoring the possibility that a review is influenced by earlier reviews. We propose a multiple imputation method that accounts for this temporal structure, verify its effectiveness through simulation, and apply it to Amazon customer reviews of three generations of Canon digital cameras (SX210, SX230, SX260). The results indicate that the method accurately estimates the effect of each feature on the overall product rating.
158

Linear Discriminant Analysis with Repeated Measurements

Skinner, Evelina January 2019 (has links)
The classification of observations based on repeated measurements taken on the same subject over a given period of time or under different conditions is a common procedure in many disciplines, such as medicine, psychology and environmental studies. In this thesis, repeated measurements follow the Growth Curve model and are classified using linear discriminant analysis. The aim is to examine both the effect of missing data on classification accuracy and the effect of additional data on classification robustness. The results indicate that an increasing amount of missing data leads to a progressive decline in classification accuracy. The effect of additional data on robustness is less predictable and can only be characterised as a general tendency towards improved robustness.
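The classical linear discriminant rule underlying this thesis can be sketched in a few lines of Python: each subject's repeated measurements form a vector, two groups share a covariance matrix, and Fisher's rule projects onto the pooled-covariance-weighted mean difference. The Growth Curve mean structure that the thesis adds is omitted here; the group means, covariance, and sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each subject contributes p repeated measurements; equicorrelated
# covariance mimics within-subject dependence over time.
p, n = 4, 100
mu1, mu2 = np.zeros(p), np.full(p, 2.0)
cov = 0.5 * np.eye(p) + 0.5            # 1.0 on diagonal, 0.5 off-diagonal
g1 = rng.multivariate_normal(mu1, cov, size=n)
g2 = rng.multivariate_normal(mu2, cov, size=n)

# Fisher's linear discriminant with a pooled covariance estimate.
xbar1, xbar2 = g1.mean(axis=0), g2.mean(axis=0)
s_pooled = 0.5 * (np.cov(g1.T) + np.cov(g2.T))
w = np.linalg.solve(s_pooled, xbar1 - xbar2)     # discriminant direction
threshold = w @ (xbar1 + xbar2) / 2.0            # midpoint cut-off

def classify(x):
    return 1 if w @ x > threshold else 2

acc = np.mean([classify(x) == 1 for x in g1] + [classify(x) == 2 for x in g2])
print(acc)   # accuracy on the training sample
```

Dropping measurement occasions (missing data) shrinks `p` and the separation between groups, which is the mechanism behind the accuracy decline the thesis reports.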
159

Recommendations Regarding Q-Matrix Design and Missing Data Treatment in the Main Effect Log-Linear Cognitive Diagnosis Model

Ma, Rui 11 December 2019 (has links)
Diagnostic classification models used in conjunction with diagnostic assessments classify individual respondents as masters or nonmasters of individual attributes. Previous researchers (Madison & Bradshaw, 2015) recommended that items on the assessment measure all patterns of attribute combinations to ensure classification accuracy, but in practice certain attributes may not be measurable by themselves. Moreover, model estimation requires a large sample size, yet in reality the data may contain unanswered items. The current study therefore sought to provide suggestions for choosing between two alternative Q-matrix designs when an attribute cannot be measured in isolation, and for using maximum likelihood estimation in the presence of missing responses. The factorial ANOVA results of this simulation study indicate that adding items measuring some attributes rather than all attributes is more optimal, and that other missing data treatments should be sought if the percentage of missing responses exceeds 5%.
160

Development and Evaluation of Infilling Methods for Missing Hydrologic and Chemical Watershed Monitoring Data

Johnston, Carey Andrew 30 September 1999 (has links)
Watershed monitoring programs generally do not have perfect data collection success rates, due to a variety of field and laboratory factors. A major source of error in many stream-gaging records is lost or missing data caused by malfunctioning stream-side equipment; studies estimate that between 5 and 20 percent of stream-gaging data may be marked as missing for one reason or another. Methods that reconstruct or infill missing data produce larger data sets, which generally yield better estimates of the sampled parameter and permit practical application of the data in hydrologic or water quality calculations. This study utilizes data from a watershed monitoring program operating in the Northern Virginia area to: (1) identify and summarize the major reasons for the occurrence of missing data; (2) provide recommendations for reducing the occurrence of missing data; (3) describe methods for infilling missing chemical data; (4) develop and evaluate methods for infilling values to replace missing chemical data; and (5) recommend different infilling methods for various conditions. An evaluation of different infilling methods for chemical data over a variety of factors (e.g., amount of annual rainfall, whether the missing chemical parameter is strongly correlated with flow, amount of missing data) is performed using Monte Carlo modeling. Using the results of the Monte Carlo modeling, a Decision Support System (DSS) is developed for easy application of the most appropriate infilling method. / Master of Science
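For a chemical parameter that is strongly correlated with flow — one of the conditions the study evaluates — a simple infilling approach is to regress the parameter on flow over the observed days and predict the missing ones. The Python sketch below illustrates this under assumed names and an assumed log-log relationship; it is not the study's DSS or its actual infilling methods.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic daily stream flow (complete record) and a chemical parameter
# related to flow through a noisy power law, with ~15% of sample days missing.
flow = rng.lognormal(mean=3.0, sigma=0.5, size=365)
true_conc = 2.0 * flow ** 0.7 * rng.lognormal(0.0, 0.1, size=365)
conc = true_conc.copy()
missing = rng.random(365) < 0.15
conc[missing] = np.nan

# Infill by regressing log(conc) on log(flow) over the observed days,
# then back-transforming the predictions for the missing days.
obs = ~missing
slope, intercept = np.polyfit(np.log(flow[obs]), np.log(conc[obs]), 1)
conc[missing] = np.exp(intercept + slope * np.log(flow[missing]))

# Since the data are synthetic, we can check the infilled values directly.
rel_err = np.abs(conc[missing] - true_conc[missing]) / true_conc[missing]
print(rel_err.mean())
```

This is the kind of candidate method a Monte Carlo evaluation can score against alternatives (e.g., mean substitution or interpolation) under varying rainfall, correlation strength, and missing-data fractions.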
