11

A tale of two applications: closed-loop quality control for 3D printing, and multiple imputation and the bootstrap for the analysis of big data with missingness

Wenbin Zhu (12226001) 20 April 2022 (has links)
1. A Closed-Loop Machine Learning and Compensation Framework for Geometric Accuracy Control of 3D Printed Products

Additive manufacturing (AM) systems enable direct printing of three-dimensional (3D) physical products from computer-aided design (CAD) models. Despite the many advantages that AM systems have over traditional manufacturing, one significant limitation that impedes their wide adoption is geometric inaccuracy: shape deviations between the printed product and the nominal CAD model. Machine learning for shape deviations can enable geometric accuracy control of 3D printed products via the generation of compensation plans, which are modifications of CAD models, informed by the machine learning algorithm, that reduce deviations in expectation. However, existing machine learning and compensation frameworks cannot accommodate deviations of fully 3D shapes with different geometries. The feasibility of existing frameworks for geometric accuracy control is further limited by resource constraints in AM systems that prevent the printing of multiple copies of new shapes.

We present a closed-loop machine learning and compensation framework that can improve geometric accuracy control of 3D shapes in AM systems. Our framework is based on a Bayesian extreme learning machine (BELM) architecture that leverages data and deviation models from previously printed products to transfer deviation models, and more accurately capture deviation patterns, for new 3D products. The closed-loop nature of compensation under our framework, in which past compensated products that do not adequately meet dimensional specifications are fed into the BELMs to re-learn the deviation model, enables the identification of effective compensation plans and satisfies resource constraints by printing only one new shape at a time.

The power and cost-effectiveness of our framework are demonstrated with two validation experiments that involve different geometries for a Markforged Metal X AM machine printing 17-4 PH stainless steel products. As demonstrated in our case studies, our framework can reduce shape inaccuracies by 30% to 60% (depending on a shape's geometric complexity) in at most two iterations, with three training shapes and one or two test shapes per geometry across the iterations. We also perform an additional validation experiment using a third geometry to establish the capabilities of our framework for prospective shape deviation prediction of 3D shapes that have never been printed before. This third experiment indicates that choosing one suitable class of past products for prospective prediction and model transfer, instead of including all past printed products with different geometries, can be sufficient for obtaining deviation models with good predictive performance. Ultimately, our closed-loop machine learning and compensation framework provides an important step towards accurate and cost-efficient deviation modeling and compensation for fully 3D printed products using a minimal number of printed training and test shapes, and thereby can advance AM as a high-quality manufacturing paradigm.

2. Multiple Imputation and the Bootstrap for the Analysis of Big Data with Missingness

Inference can be a challenging task for Big Data. Two significant issues are that Big Data frequently exhibit complicated missing data patterns, and that the complex statistical models and machine learning algorithms typically used to analyze Big Data do not have convenient quantification of uncertainties for estimators. These two difficulties have previously been addressed using multiple imputation and the bootstrap, respectively. However, it is not clear how multiple imputation and bootstrap procedures can be effectively combined to perform statistical inferences on Big Data with missing values. We investigate a practical framework for the combination of multiple imputation and bootstrap methods. Our framework is based on two principles: distribution of multiple imputation and bootstrap calculations across parallel computational cores, and the quantification of sources of variability involved in bootstrap procedures that use subsampling techniques via random effects or hierarchical models. This framework effectively extends the scope of existing methods for multiple imputation and the bootstrap to a broad range of Big Data settings.

We perform simulation studies for linear and logistic regression across Big Data settings with different rates of missingness to characterize the frequentist properties and computational efficiencies of combinations of multiple imputation and the bootstrap. We further illustrate how effective combinations of multiple imputation and the bootstrap for Big Data analyses can be identified in practice by means of both the simulation studies and a case study on COVID infection status data. Ultimately, our investigation demonstrates how the flexible combination of multiple imputation and the bootstrap under our framework can enable valid statistical inferences in an effective manner for Big Data with missingness.
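One simple way such a combination can look, sketched here as an illustration rather than the specific framework investigated in the thesis, is to bootstrap the incomplete data first and then multiply impute each replicate. The data-generating model, the stochastic regression imputer, and all constants below are assumptions made for the example; the outer loop over replicates is independent across iterations, which is what makes distribution across parallel cores natural.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data (not the thesis's simulations): y depends linearly on a
# fully observed x, and y is missing at random with probability sigmoid(x).
n = 2000
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)
missing = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
y_obs = np.where(missing, np.nan, y)

def impute(x, y):
    """One stochastic regression imputation of missing y given x."""
    obs = ~np.isnan(y)
    X = np.column_stack([np.ones(obs.sum()), x[obs]])
    coef, *_ = np.linalg.lstsq(X, y[obs], rcond=None)
    resid_sd = np.std(y[obs] - X @ coef)
    out = y.copy()
    mis = np.isnan(y)
    out[mis] = coef[0] + coef[1] * x[mis] + rng.normal(0.0, resid_sd, mis.sum())
    return out

# "Bootstrap, then impute": resample the incomplete data B times and draw M
# imputations per replicate; each replicate is independent, so in practice
# the outer loop is what gets farmed out to parallel cores.
B, M = 200, 5
means = np.empty((B, M))
for b in range(B):
    idx = rng.integers(0, n, n)          # bootstrap resample of the rows
    for m in range(M):
        means[b, m] = impute(x[idx], y_obs[idx]).mean()

point = means.mean()                     # pooled point estimate of E[y]
se = means.mean(axis=1).std(ddof=1)      # bootstrap SE across replicates
print(f"E[y] estimate: {point:.3f} (bootstrap SE {se:.3f})")
```

Because the missingness here depends on the fully observed x, the complete-case mean is biased while the imputation-corrected bootstrap estimate recovers the true mean of 2.0; the reverse ordering ("impute, then bootstrap") is the other combination a practical framework must weigh.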
12

Análise de dados categorizados com omissão em variáveis explicativas e respostas / Categorical data analysis with missingness in explanatory and response variables

Poleto, Frederico Zanqueta 08 April 2011 (has links)
We present methodological developments for analyzing data with missingness, together with studies designed to understand the results of such analyses. We examine Bayesian and classical sensitivity analyses for data with missing categorical responses, and show that the subjective components of each approach can influence results in non-trivial ways, irrespective of the sample size; conclusions therefore need to be carefully evaluated. Specifically, we show that prior distributions commonly regarded as non-informative or slightly informative may actually be quite informative for non-identifiable parameters, and that the choice of over-parameterized model can also drastically impact the results. When there is missingness in explanatory variables, a marginal model for the covariates must also be specified even if interest lies only in the conditional model. Incorrect specification of either the model for the covariates or the model for the missingness mechanism leads to biased inferences for the model of interest.

Previously published work falls into two streams: either it uses flexible semi-/non-parametric distributions for the covariates and identifies the model by assuming a non-informative missingness mechanism, or it employs parametric distributions for the covariates and allows a more general, informative missingness mechanism. We analyze binary responses, combining an informative missingness mechanism with a non-parametric model for the continuous covariates via a mixture induced by a Dirichlet process prior. When interest lies only in moments of the response distribution, we propose a new classical sensitivity analysis for incomplete responses that avoids distributional assumptions and employs easily interpreted sensitivity parameters. The procedure is particularly appealing for the analysis of continuous data, an area where analyses traditionally assume normality and/or rely on hard-to-interpret sensitivity parameters. All analyses are illustrated with real data sets.
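The appeal of an easily interpreted sensitivity parameter can be seen in a minimal pattern-mixture sketch for the mean of an incomplete continuous response. This is a generic illustration, not the specific procedure proposed in the thesis, and the data and constants are simulated assumptions: delta is the assumed average difference between missing and observed responses, so a subject-matter expert can reason about its plausible range directly.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated incomplete continuous responses (illustrative only).
y = rng.normal(loc=10.0, scale=2.0, size=1000)
observed = rng.random(1000) < 0.7
pi_obs = observed.mean()            # fraction of responses observed
mean_obs = y[observed].mean()       # mean among observed responses

# Pattern-mixture decomposition of the mean:
#   E[Y] = pi_obs * E[Y | observed] + (1 - pi_obs) * E[Y | missing],
# with E[Y | missing] = E[Y | observed] + delta.  The sensitivity
# parameter delta has a direct interpretation: the assumed average
# difference between missing and observed responses.
estimates = {}
for delta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    estimates[delta] = pi_obs * mean_obs + (1 - pi_obs) * (mean_obs + delta)
    print(f"delta = {delta:+.1f}  ->  E[Y] estimate = {estimates[delta]:.2f}")
```

Setting delta = 0 recovers the complete-case estimate (an ignorable missingness assumption); tracing the estimate over a plausible range of delta shows how strongly the conclusion depends on the untestable assumption about the missing responses, without any distributional assumption on Y.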
