1. Investigating the Utility of Age-Dependent Cranial Vault Thickness as an Aging Method for Juvenile Skeletal Remains on Dry Bone, Radiographic and Computed Tomography Scans. Kamnikar, Kelly R, 07 May 2016.
Age estimation, a component of the biological profile, contributes significantly to the creation of a post-mortem profile of an unknown set of human remains. The goals of this study are to: (1) refine the juvenile age estimation method of cranial vault thickness (CVT) through MARS modeling, (2) test the method on known-age samples, and (3) compare CVT and dental development age estimations. Data for this study come from computed tomography (CT) scans, radiographic images, and dry bone. CVT was measured at seven cranial landmarks (nasion, glabella, bregma, vertex, vertex radius, lambda, and opisthocranion). Results indicate that CVT models vary in their predictive ability; vertex and lambda produce the best results. Predicted values and prediction intervals for CVT are wider and less accurate than dental development age estimates. Aging by CVT could benefit from a larger known-age sample composed of individuals older than 6 years.
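A minimal sketch of the kind of MARS fit involved, assuming the third-party py-earth package and invented stand-in data (the study's actual models, landmarks, and samples are not reproduced here):

```python
import numpy as np
from pyearth import Earth  # MARS implementation (py-earth)

rng = np.random.default_rng(0)
# Synthetic stand-in: vault thickness (mm) rising nonlinearly with age (years)
age = rng.uniform(0, 15, 300)
thickness = 2.0 + 0.6 * np.sqrt(age) + rng.normal(0, 0.3, 300)

# MARS places piecewise-linear basis functions at data-driven knots
model = Earth(max_degree=1)
model.fit(thickness.reshape(-1, 1), age)   # predict age from thickness
print(model.summary())                      # shows the selected basis functions
print(model.predict(np.array([[3.5]])))     # age estimate at 3.5 mm thickness
```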
2. An OLS-Based Method for Causal Inference in Observational Studies. Xu, Yuanfang, 07 1900.
Indiana University-Purdue University Indianapolis (IUPUI)

Observational data are frequently used for causal inference of treatment effects on prespecified outcomes. Several widely used causal inference methods have adopted the method of inverse propensity score weighting (IPW) to alleviate the influence of confounding. However, IPW-type methods, including the doubly robust methods, are prone to large variation in the estimation of causal effects due to possible extreme weights. In this research, we developed an ordinary least-squares (OLS)-based causal inference method, which does not involve the inverse weighting of the individual propensity scores.
We first considered the scenario of a homogeneous treatment effect. We proposed a two-stage estimation procedure that leads to a model-free estimator of the average treatment effect (ATE). At the first stage, two summary scores, the propensity and mean scores, are estimated nonparametrically using regression splines. The targeted ATE is then obtained as a plug-in estimator with a closed-form expression. Our simulation studies showed that this model-free estimator of the ATE is consistent, asymptotically normal, and has superior operational characteristics in comparison to the widely used IPW-type methods. We then extended our method to the scenario of heterogeneous treatment effects by adding an additional stage that models the covariate-specific treatment-effect function nonparametrically while maintaining the model-free feature and the simplicity of OLS-based estimation. The estimated covariate-specific function serves as an intermediate step in the estimation of the ATE and can thus be used to study treatment-effect heterogeneity.
We discussed ways of using advanced machine learning techniques in the proposed method to accommodate high-dimensional covariates. We applied the proposed method to a case study evaluating the effect of an early combination of biologic and non-biologic disease-modifying antirheumatic drugs (DMARDs), compared with a step-up treatment plan, in children with newly diagnosed juvenile idiopathic arthritis (JIA). The proposed method gives strong evidence of a significant effect of early combination at the 0.05 level: on average, early aggressive use of biologic DMARDs leads to around 1.2 to 1.7 more reduction in the clinical juvenile disease activity score at 6 months than the step-up plan for treating JIA.
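The dissertation's estimator builds on two nonparametrically estimated summary scores. As a rough, weighting-free illustration of the same spirit, here is a minimal regression-adjustment sketch in which mean scores are fit by OLS on spline bases; the data, the single confounder, and the plug-in formula are invented for illustration and are not the dissertation's actual estimator:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-2, 2, size=(n, 1))                   # a single confounder
propensity = 1 / (1 + np.exp(-x[:, 0]))               # true propensity score
a = rng.binomial(1, propensity)                       # treatment assignment
y = 1.0 * a + np.sin(x[:, 0]) + rng.normal(0, 1, n)   # outcome, true ATE = 1

def spline_ols():
    # OLS on a regression-spline basis; no inverse propensity weighting anywhere
    return make_pipeline(SplineTransformer(degree=3, n_knots=8),
                         LinearRegression())

# Stage 1: estimate mean scores nonparametrically, one model per arm
m1 = spline_ols().fit(x[a == 1], y[a == 1])   # E[Y | A=1, X]
m0 = spline_ols().fit(x[a == 0], y[a == 0])   # E[Y | A=0, X]

# Stage 2: plug-in estimator of the ATE, averaging predicted contrasts
ate_hat = np.mean(m1.predict(x) - m0.predict(x))
print(f"estimated ATE: {ate_hat:.3f} (truth: 1.0)")
```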
3. Robust Conic Quadratic Programming Applied To Quality Improvement: A Robustification Of CMARS. Ozmen, Ayse, 01 October 2010.
In this thesis, we study and use Conic Quadratic Programming (CQP) for purposes of operational research, especially for quality improvement in manufacturing. The importance and benefit of CQP in this area have already been demonstrated in previous works. There, the complexity of the regression method Multivariate Adaptive Regression Splines (MARS), which in particular means sensitivity to noise in the data, was penalized in the form of a so-called Tikhonov regularization, which was expressed and studied as a CQP problem. This led to the new method CMARS; it is more model-based and employs continuous, well-structured convex optimization, which enables the use of interior point methods and their codes such as MOSEK. In this study, we generalize the regression problem by including uncertainty in the model, especially in the input data.
CMARS, recently developed as an alternative to MARS, is powerful in handling complex and heterogeneous data. However, the MARS and CMARS methods assume that the data contain fixed variables; in fact, data include noise in both the output and input variables. Consequently, the solutions of the optimization problem can show a remarkable sensitivity to perturbations in the parameters of the problem. In this study, we include the existence of uncertainty about future scenarios in CMARS and robustify it with robust optimization, which deals with data uncertainty. That kind of optimization was introduced by Aharon Ben-Tal and Arkadi Nemirovski, and used by Laurent El Ghaoui in the area of data mining. It incorporates various kinds of noise and perturbations into the programming problem. This robustification of CQP with robust optimization is compared with previous contributions based on Tikhonov regularization, and with the traditional MARS method.
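A minimal sketch of the conic quadratic (second-order cone) form that Tikhonov-regularized, CMARS-style regression takes, written with cvxpy; the basis matrix, penalty matrix, and bound below are placeholder assumptions rather than the thesis's actual model:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 10
B = rng.normal(size=(n, p))       # stand-in for a MARS basis-function matrix
y = B @ rng.normal(size=p) + 0.1 * rng.normal(size=n)
L = np.eye(p)                     # stand-in for a smoothness penalty matrix
M = 5.0                           # assumed bound on the penalized coefficients

alpha = cp.Variable(p)
t = cp.Variable()

# Tikhonov regularization as a CQP: minimize the residual norm subject to
# a second-order cone bound on the penalized coefficient vector.
prob = cp.Problem(cp.Minimize(t),
                  [cp.SOC(t, B @ alpha - y),              # ||B a - y||_2 <= t
                   cp.norm(L @ alpha) <= np.sqrt(M)])     # ||L a||_2 <= sqrt(M)
prob.solve()   # any SOCP-capable solver works (e.g., ECOS, Clarabel, MOSEK)
print("optimal residual norm:", t.value)
```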
4. An Efficient Robust Concept Exploration Method and Sequential Exploratory Experimental Design. Lin, Yao, 31 August 2004.
Experimentation and approximation are essential for efficiency and effectiveness in concurrent engineering analyses of large-scale complex systems. The approximation-based design strategy is not fully utilized in industrial applications, in which designers have to deal with multi-disciplinary, multi-variable, multi-response, and multi-objective analyses using very complicated and expensive-to-run computer analysis codes or physical experiments. With current experimental design and metamodeling techniques, it is difficult for engineers to develop acceptable metamodels for irregular responses and to achieve good design solutions in large design spaces at low cost. To circumvent this problem, engineers tend either to adopt low-fidelity simulations or models, with which important response properties may be lost, or to restrict the study to very small design spaces. Information from expensive physical or computer experiments is often used for validation in late design stages rather than as an analysis tool in early-stage design. This increases the possibility of expensive re-design and lengthens time-to-market.

In this dissertation, two methods, the Sequential Exploratory Experimental Design (SEED) and the Efficient Robust Concept Exploration Method (E-RCEM), are developed to address these problems. The SEED and E-RCEM methods help develop acceptable metamodels for irregular responses with expensive experiments and achieve satisficing design solutions in large design spaces with limited computational or monetary resources. It is verified that more accurate metamodels are developed and better design solutions are achieved with SEED and E-RCEM than with traditional approximation-based design methods. SEED and E-RCEM facilitate the full utility of the simulation-and-approximation-based design strategy in engineering and scientific applications.

Several preliminary approaches for metamodel validation with additional validation points are proposed in this dissertation, after verifying that the most widely used method, leave-one-out cross-validation, is theoretically inappropriate for testing the accuracy of metamodels. A comparison of the performance of kriging and MARS metamodels is also presented. A sequential metamodeling approach is then proposed to utilize different types of metamodels along the design timeline.

Several single-variable and two-variable examples, and two engineering examples, the design of pressure vessels and the design of unit cells for linear cellular alloys, are used in this dissertation to support these studies.
5. Predicting bid prices in construction projects using non-parametric statistical models. Pawar, Roshan, 15 May 2009.
Bidding is a very competitive process in the construction industry; each competitor's business depends on winning or losing these bids. Contractors would like to predict the bids that may be submitted by their competitors, which would help them obtain contracts and grow their business. Unit prices estimated for each quantity differ from contractor to contractor. These unit costs depend on factors such as the historical data used for estimating unit costs, vendor quotes, market surveys, the amount of material estimated, the number of projects the contractor is working on, equipment rental costs, the amount of equipment owned by the contractor, and the risk averseness of the estimator. These factors are broadly similar when estimators are estimating the costs of similar projects. Thus, there is a relationship between the projects that a particular contractor has bid in previous years and the cost the contractor is likely to quote for future projects, and this relationship can be used to predict the bids that the contractor might quote for future projects. For example, a contractor may use historical data from a certain year for bidding on a certain type of project; the unit prices may be adjusted for size, time, and location, but the basis for bidding on projects of similar types is the same. Statistical tools can be used to model the underlying relationship between the final cost of the project quoted by a contractor and the quantities of materials or amount of tasks performed in a project. There are a number of statistical modeling techniques, but a model used for predicting costs should be flexible enough to adapt to any underlying pattern.
Data such as the amount of work to be performed for a certain line item, a material cost index, a labor cost index, and a unique identifier for each participating contractor are used to predict the bids that a contractor might quote for a certain project. To perform the analysis, artificial neural networks and multivariate adaptive regression splines are used. The results obtained from the two techniques are compared, and multivariate adaptive regression splines are found to predict the cost better than artificial neural networks.
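A minimal sketch of such a comparison, assuming the third-party py-earth package for MARS and scikit-learn's multilayer perceptron for the ANN; the feature names mirror the abstract, but the data are synthetic, and the conclusion on real bid data is the thesis's finding, not this sketch's:

```python
import numpy as np
from pyearth import Earth                      # MARS implementation (py-earth)
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 500
X = np.column_stack([
    rng.uniform(100, 10_000, n),   # quantity for a line item
    rng.uniform(90, 130, n),       # material cost index
    rng.uniform(90, 130, n),       # labor cost index
])
# Synthetic bid: quantity scaled by the cost indices, plus noise
y = 5.0 * X[:, 0] * (X[:, 1] + X[:, 2]) / 200 + rng.normal(0, 5_000, n)

mars = Earth(max_degree=2)
ann = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(32, 16),
                                 max_iter=2000, random_state=0))

for name, model in [("MARS", mars), ("ANN", ann)]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean CV R^2 = {scores.mean():.3f}")
```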
6. Bayesian Hierarchical, Semiparametric, and Nonparametric Methods for International New Product Diffusion. Hartman, Brian Matthew, August 2010.
Global marketing managers are keenly interested in being able to predict the sales of their new products. Understanding how a product is adopted over time allows the managers to optimally allocate their resources. With the world becoming ever more global, there are strong and complex interactions between the countries of the world. My work explores how to describe the relationships between those countries and determines the best way to leverage that information to improve sales predictions.
In Chapter II, I describe how diffusion speed has changed over time. The most recent major study on this topic, by Christophe Van den Bulte, investigated new product diffusions in the United States. Van den Bulte notes that a similar study is needed in the international context, especially in developing countries. Additionally, his model contains the implicit assumption that the diffusion speed parameter is constant throughout the life of a product. I model the time component as a nonparametric function, allowing the speed parameter the flexibility to change over time. I find that early in the product's life, the speed parameter is higher than expected. Additionally, as the Internet has grown in popularity, the speed parameter has increased.
In Chapter III, I examine whether the interactions can be described through a reference hierarchy in addition to the cross-country word-of-mouth effects already in the literature. I also expand the word-of-mouth effect by relating the magnitude of the effect to the distance between the two countries. The current literature only applies that effect equally to the n closest countries (forming a neighbor set). This also leads to an analysis of how best to measure the distance between two countries. I compare four possible distance measures: distance between the population centroids, trade flow, tourism flow, and cultural similarity. Including the reference hierarchy improves the predictions by 30 percent over the current best model.
Finally, in Chapter IV, I look more closely at the Bass diffusion model. It is prominently used in the marketing literature and is the basis of my analysis in Chapter III. All of the current formulations include the implicit assumption that the regression parameters are equal for each country, yet a one-dollar increase in GDP should have more of an effect in a poor country than in a rich country. A Dirichlet process prior enables me to cluster the countries by their regression coefficients. Incorporating the distance measures can improve the predictions by 35 percent in some cases.
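For reference, a minimal sketch of the standard Bass diffusion curve that Chapters III and IV build on; the parameter values below are arbitrary illustrations, not estimates from this work:

```python
import numpy as np

def bass_adoption(t, p, q, m):
    """Cumulative adopters at time t under the Bass model.

    F(t) = (1 - exp(-(p + q) t)) / (1 + (q / p) exp(-(p + q) t)),
    where p is the innovation coefficient, q the imitation coefficient,
    and m the market potential.
    """
    e = np.exp(-(p + q) * t)
    return m * (1.0 - e) / (1.0 + (q / p) * e)

t = np.arange(0, 15)
cumulative = bass_adoption(t, p=0.03, q=0.38, m=1_000_000)  # typical magnitudes
new_adopters = np.diff(cumulative)                          # per-period sales
print(np.round(new_adopters[:5]))
```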
7. An Algorithm For The Forward Step Of Adaptive Regression Splines Via Mapping Approach. Kartal Koc, Elcin, 01 September 2012.
In high-dimensional data modeling, Multivariate Adaptive Regression Splines (MARS) is a well-known nonparametric regression technique for approximating the nonlinear relationship between a response variable and the predictors with the help of splines. MARS uses piecewise linear basis functions, separated from each other at breaking points (knots), for function estimation. The model is built in a two-step procedure: forward selection and backward elimination. In the first step, a general model including too many basis functions, and hence too many knot points, is generated; in the second, the basis functions contributing least to the overall fit are eliminated. In the conventional adaptive spline procedure, knots are selected from the set of distinct data points, which makes the forward selection procedure computationally expensive and leads to high local variance. To avoid these drawbacks, the knot points can be selected from a subset of the data points, which amounts to data reduction. In this study, a new method (called S-FMARS) is proposed that selects the knot points using a self-organizing map-based approach, which transforms the original data points to a lower-dimensional space. Thus, fewer knot points need to be evaluated for model building in the forward selection step of the MARS algorithm. Results obtained from simulated datasets and six real-world datasets show that the proposed method is time-efficient in model construction without degrading model accuracy or prediction performance. In this study, the proposed approach is implemented in the MARS and CMARS methods as an alternative to their forward step, improving them by decreasing their computing time.
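A minimal sketch of the data-reduction idea, assuming the third-party minisom package: train a small self-organizing map and use its codebook vectors as the reduced candidate-knot set. The map size and training settings are arbitrary, and this shows only the flavor of S-FMARS, not its exact algorithm:

```python
import numpy as np
from minisom import MiniSom  # lightweight SOM implementation

rng = np.random.default_rng(3)
X = rng.normal(size=(5000, 4))   # original data points: 5000 knot candidates

# Train a small SOM; its codebook vectors summarize the data distribution.
som = MiniSom(8, 8, input_len=4, sigma=1.5, learning_rate=0.5, random_seed=0)
som.train_random(X, num_iteration=1000)

# Use the 64 codebook vectors as the reduced candidate-knot set, so the
# forward step of MARS evaluates 64 candidates instead of 5000.
candidate_knots = som.get_weights().reshape(-1, 4)
print(candidate_knots.shape)     # (64, 4)
```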
8. A Computational Approach To Nonparametric Regression: Bootstrapping CMARS Method. Yazici, Ceyda, 01 September 2011.
Bootstrapping is a resampling technique that treats the original data set as a population and draws samples from it with replacement. The technique is widely used, especially in mathematically intractable problems. In this study, it is used to obtain the empirical distributions of the parameters, in order to determine whether they are statistically significant, in a special case of nonparametric regression, Conic Multivariate Adaptive Regression Splines (CMARS). The CMARS method, which uses conic quadratic optimization, is a modified version of a well-known nonparametric regression model, Multivariate Adaptive Regression Splines (MARS). Although it performs better with respect to several criteria, the CMARS model is more complex than that of MARS. To overcome this problem, and to improve CMARS performance further, three different bootstrapping regression methods, namely Random-X, Fixed-X, and Wild bootstrap, are applied to four data sets of different sizes and scales. The performances of the models are then compared using various criteria, including accuracy, precision, complexity, stability, robustness, and efficiency. Random-X yields more precise, accurate, and less complex models, particularly for medium-size and medium-scale data, even though it is the least efficient method.
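A minimal sketch of the three resampling schemes for a plain linear fit, in NumPy; the Rademacher weights in the wild bootstrap are one common choice among several:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

beta_hat = ols(X, y)
resid = y - X @ beta_hat

def one_replicate(scheme):
    if scheme == "random-x":                 # resample (x_i, y_i) pairs
        idx = rng.integers(0, n, n)
        return ols(X[idx], y[idx])
    if scheme == "fixed-x":                  # keep X fixed, resample residuals
        e = rng.choice(resid, size=n, replace=True)
        return ols(X, X @ beta_hat + e)
    if scheme == "wild":                     # keep X and residuals, flip signs
        v = rng.choice([-1.0, 1.0], size=n)  # Rademacher weights
        return ols(X, X @ beta_hat + resid * v)

for scheme in ["random-x", "fixed-x", "wild"]:
    boots = np.array([one_replicate(scheme) for _ in range(500)])
    print(scheme, "slope SE:", boots[:, 1].std().round(4))
```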
9. A comparison of some methods of modeling baseline hazard function in discrete survival models. Mashabela, Mahlageng Retang, 20 September 2019.
MSc (Statistics), Department of Statistics

The baseline parameter vector in a discrete-time survival model is determined by the number of time points: the larger the number of time points, the higher the dimension of the baseline parameter vector, which often leads to biased maximum likelihood estimates. One way to overcome this problem is to use a simpler parametrization that contains fewer parameters. A simulation approach was used to compare the accuracy of three variants of penalised regression spline methods in smoothing the baseline hazard function. Root mean squared error (RMSE) analysis suggests that all the smoothing methods generally performed better than the model with a discrete baseline hazard function, although no single smoothing method outperformed the others. These methods were also applied to data on age at first alcohol intake in Thohoyandou. The results from the real-data application suggest that there were no significant differences among the estimated models. Consumption of other drugs, having a parent who drinks, being male, and having been abused in life are associated with high chances of drinking alcohol very early in life.

Funding: NRF
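A minimal sketch of the underlying modeling choice, using statsmodels on invented person-period data: a discrete-time hazard fit by logistic regression, once with one parameter per time point and once with a low-dimensional spline basis on time. The thesis compares penalised regression splines; the unpenalised B-spline basis below only illustrates the dimension-reduction idea:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
# Person-period ("long") data: one row per subject per time point at risk,
# with event = 1 in the interval where the event occurs.
rows = []
for i in range(300):
    x = rng.normal()
    for t in range(1, 13):
        h = 1 / (1 + np.exp(-(-3.0 + 0.15 * t + 0.5 * x)))  # true hazard
        event = rng.random() < h
        rows.append({"id": i, "time": t, "x": x, "event": int(event)})
        if event:
            break
df = pd.DataFrame(rows)

# Discrete baseline: one dummy per time point (many parameters).
m_discrete = smf.glm("event ~ C(time) + x", df,
                     family=sm.families.Binomial()).fit()
# Smoothed baseline: a 4-dimensional B-spline basis on time.
m_spline = smf.glm("event ~ bs(time, df=4) + x", df,
                   family=sm.families.Binomial()).fit()
print(m_discrete.aic, m_spline.aic)  # fewer baseline parameters, similar fit
```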
10. Semiparametric Varying Coefficient Models for Matched Case-Crossover Studies. Ortega Villa, Ana Maria, 23 November 2015.
Semiparametric modeling is a combination of parametric and nonparametric models in which some functions follow a known form and others follow an unknown form. In this dissertation we make contributions to semiparametric modeling for matched case-crossover data.

In matched case-crossover studies, it is generally accepted that the covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model: any stratum effect is removed by conditioning on the fixed number of cases and controls in the stratum. However, some matching covariates, such as time and/or spatial location, often play an important role as effect modifiers, and failure to include them leads to incorrect statistical estimation, prediction, and inference. Hence, in this dissertation we propose several approaches that allow the inclusion of time and spatial location, as well as other effect modifications such as heterogeneous subpopulations, in the data.

To address modification due to time, three methods are developed: the first is a parametric approach, the second is a semiparametric penalized approach, and the third is a semiparametric Bayesian approach. We demonstrate the advantage of the one-stage semiparametric approaches using both a simulation study and an epidemiological example of a 1-4 bi-directional case-crossover study of childhood aseptic meningitis with drinking water turbidity.

To address modifications due to time and spatial location, two methods are developed. The first is a semiparametric spatial-temporal varying coefficient model for a small number of locations; the second is a semiparametric spatial-temporal varying coefficient model appropriate when the number of locations among the subjects is medium to large. We demonstrate the accuracy of these approaches using simulation studies and, when appropriate, an epidemiological example of a 1-4 bi-directional case-crossover study.

Finally, to explore further effect modifications by heterogeneous subpopulations among strata, we propose a nonparametric Bayesian approach constructed with Dirichlet process priors, which clusters subpopulations and assesses heterogeneity. We demonstrate the accuracy of our approach using a simulation study, as well as an example of a 1-4 bi-directional case-crossover study.

Ph.D.
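A minimal sketch of the conditional logistic regression at the core of a 1-4 matched case-crossover analysis, using statsmodels on synthetic strata; the varying-coefficient, Bayesian, and Dirichlet-process extensions in the dissertation go well beyond this baseline:

```python
import numpy as np
import pandas as pd
from statsmodels.discrete.conditional_models import ConditionalLogit

rng = np.random.default_rng(6)
rows = []
for s in range(200):                  # one stratum per case with 4 control times
    exposure = rng.normal(size=5)     # exposure at case time + 4 control times
    # Within-stratum case probability driven by exposure (true log-OR = 0.7)
    logits = 0.7 * exposure
    case = rng.choice(5, p=np.exp(logits) / np.exp(logits).sum())
    for j in range(5):
        rows.append({"stratum": s, "y": int(j == case),
                     "exposure": exposure[j]})
df = pd.DataFrame(rows)

# Conditioning on one case per stratum removes all stratum-level effects.
model = ConditionalLogit(df["y"], df[["exposure"]], groups=df["stratum"])
print(model.fit().summary())
```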