Global ETD Search

1	Semi-parametric estimation in Tobit regression models Chen, Chunxia January 1900 Master of Science / Department of Statistics / Weixing Song / In the classical Tobit regression model, the regression error term is often assumed to have a zero-mean normal distribution with unknown variance, and the regression function is assumed to be linear. If the normality assumption is violated, the commonly used maximum likelihood estimate becomes inconsistent. Moreover, the likelihood function becomes very complicated if the regression function is nonlinear, even if the error density is normal, which makes the maximum likelihood estimation procedure hard to implement. In the fully nonparametric setup, where both the regression function and the distribution of the error term [epsilon] are unknown, some nonparametric estimators for the regression function have been proposed. Although the assumption of knowing the distribution is strict, it is widely adopted in the Tobit regression literature and is also supported by many empirical studies in econometric research. In fact, a majority of the relevant research assumes that [epsilon] has a normal distribution with mean 0 and unknown standard deviation. In this report, we develop a semi-parametric estimation procedure for the regression function by assuming that the error term follows a distribution from a class of zero-mean symmetric location and scale families. A minimum distance estimation procedure for estimating the parameters of the regression function when it has a specified parametric form is also constructed. Compared with the existing semiparametric and nonparametric methods in the literature, our method should be more efficient in that more information, in particular the knowledge of the distribution of [epsilon], is used. Moreover, the computation is relatively inexpensive. Given that many applications do assume that [epsilon] has a normal or other known distribution, the current work no doubt provides some more practical tools for statistical inference in the Tobit regression model. Semi-parametric Tobit regression models Statistics (0463)
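For readers unfamiliar with the classical setup this abstract starts from, the following is a minimal sketch of the Tobit log-likelihood under normal errors. It is not the report's semi-parametric or minimum distance procedure. It assumes left-censoring at zero, a single covariate with an intercept, and hypothetical names (`tobit_loglik`, `beta0`, `beta1`) chosen only for illustration.

```python
import math


def tobit_loglik(beta0, beta1, sigma, xs, ys):
    """Log-likelihood of a classical Tobit model, y = max(0, beta0 + beta1*x + eps).

    Assumptions (illustrative, not taken from the report): eps ~ N(0, sigma^2),
    censoring from below at zero, one covariate plus an intercept.
    """
    total = 0.0
    for x, y in zip(xs, ys):
        mu = beta0 + beta1 * x
        if y > 0:
            # Uncensored observation: log of the normal density of y.
            z = (y - mu) / sigma
            total += -0.5 * z * z - 0.5 * math.log(2 * math.pi) - math.log(sigma)
        else:
            # Censored observation: log P(mu + eps <= 0) = log Phi(-mu / sigma),
            # written with erfc for better accuracy in the tail.
            total += math.log(0.5 * math.erfc(mu / (sigma * math.sqrt(2))))
    return total
```

The two branches show why the likelihood mixes a density term with a distribution-function term, which is the source of the sensitivity to the error distribution that the abstract describes. For very large `mu / sigma` the censored term can underflow to `log(0)`; a production implementation would guard against that.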
2	Short Term Load Forecasting Using Semi-Parametric Method and Support Vector Machines Jordaan, JA, Ukil, A 23 September 2009 Accurate short term load forecasting plays a very important role in power system management. As electrical load data is highly non-linear in nature, in the proposed approach, we first separate out the linear and the non-linear parts, and then forecast the load using the non-linear part only. The semiparametric spectral estimation method is used to decompose a load data signal into a harmonic linear signal model and a non-linear trend. A support vector machine is then used to predict the non-linear trend. The final predicted signal is then found by adding the support vector machine predicted trend and the linear signal part. With careful determination of the linear component, the performance of the proposed method seems to be more robust than using only the raw load data, and in many cases the predicted signal of the proposed method is more accurate when we have only a small training set. Short Term Load Forecasting Semi-Parametric Method
3	Semiparametric single-index model for estimating optimal individualized treatment strategy Song, Rui, Luo, Shikai, Zeng, Donglin, Zhang, Hao Helen, Lu, Wenbin, Li, Zhiguo 13 February 2017 Unlike the standard treatment discovery framework, which is used for finding single treatments for a homogeneous group of patients, personalized medicine involves finding therapies that are tailored to each individual in a heterogeneous group. In this paper, we propose a new semiparametric additive single-index model for estimating individualized treatment strategies. The model assumes a flexible, nonparametric link function for the interaction between treatment and predictive covariates. We estimate the rule via monotone B-splines and establish the asymptotic properties of the estimators. Both simulations and a real data application demonstrate that the proposed method has competitive performance. Personalized medicine single index model semi-parametric inference
4	Will Mortality Rate of HIV-Infected Patients Decrease After Starting Antiretroviral Therapy (ART)? Bahakeem, Shaher 07 1900 Indiana University-Purdue University Indianapolis (IUPUI) / Background: Many authors have indicated that the mortality risk of HIV-infected patients is higher immediately following the start of Antiretroviral Therapy (ART). However, the mortality rate of HIV-infected patients is expected to decrease after starting ART, potentially complicating accurate statistical estimation of patient survival and, more generally, effective monitoring of the evolution of the worldwide epidemic. Method: In this thesis, we determine whether the mortality of HIV patients increases or decreases after the initiation of ART, using flexible survival modelling techniques. To achieve this objective, the study uses semi-parametric statistical models for fitting and estimating survival time using different covariates. A combination of the Weibull distribution with splines is compared to the usual Weibull, exponential, and gamma parametric models, and to the Cox semi-parametric model. The objective is to compare these models and find the best-fitting one, so that it can be used to improve modeling of survival time and to explore the pattern of change in mortality rates for a cohort of HIV-infected patients recruited in a care and treatment program in Uganda. Results: The analysis shows that the flexible Weibull survival models fit better than the usual parametric and semi-parametric models according to the AIC criterion. Conclusion: The mortality of HIV patients is high right after the initiation of ART and decreases rapidly thereafter. Antiretroviral therapy Flexible Survival Models Parametric Models Semi parametric Models
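The model comparison described in this abstract can be illustrated with a small sketch of a plain Weibull hazard, a right-censored log-likelihood, and the AIC. This is not the thesis's spline-based model; the function names and the event coding (1 for an observed death, 0 for censoring) are assumptions made for illustration only.

```python
import math


def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape / scale) * (t / scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_loglik(times, events, shape, scale):
    """Log-likelihood for right-censored survival times.

    Each observation contributes event * log h(t) - H(t), where
    H(t) = (t / scale)**shape is the cumulative hazard.
    """
    total = 0.0
    for t, d in zip(times, events):
        cumulative_hazard = (t / scale) ** shape
        total += d * math.log(weibull_hazard(t, shape, scale)) - cumulative_hazard
    return total


def aic(loglik, n_params):
    """Akaike information criterion; the model with the smaller value is preferred."""
    return 2 * n_params - 2 * loglik
```

With `shape < 1` the hazard falls over time, which is the qualitative pattern the conclusion reports; `shape == 1` reduces to the exponential model with a constant hazard. Comparing `aic(weibull_loglik(...), 2)` against the exponential fit's `aic(..., 1)` mirrors the kind of comparison the abstract mentions, under these simplifying assumptions.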
5	Generalized Semiparametric Approach to the Analysis of Variance Pathiravasan, Chathurangi Heshani Karunapala 01 August 2019 The one-way analysis of variance (ANOVA) rests on several assumptions and is used to compare the means of two or more independent groups of a factor. To relax the normality assumption in one-way ANOVA, recent studies have considered an exponential distortion, or tilt, of a reference distribution. The reason for the exponential distortion has not been investigated before; thus the main objective of the study is to examine the reason behind it closely. In doing so, a new generalized semi-parametric approach for one-way ANOVA is introduced. The proposed method compares not only the means but also the variances of any type of distribution. Simulation studies show that the proposed method performs more favorably than classical ANOVA. The method is demonstrated on meteorological radar data and credit limit data. The asymptotic distribution of the proposed estimator was determined in order to test the hypothesis of equality of one-sample multivariate distributions. The power comparison for one-sample multivariate distributions reveals a significant power improvement of the proposed chi-square test over Hotelling's T-square test for non-normal distributions. A bootstrap paradigm is incorporated for testing equidistribution of multiple samples. In power comparison simulations for multiple large samples, the proposed test outperforms other existing parametric, nonparametric and semi-parametric approaches for non-normal distributions. analysis of variance hypothesis testing semi-parametric approach testing equidistributions
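As a hedged illustration of what an exponential tilt of a reference distribution means, the sketch below assumes the common density-ratio form g(x) = exp(alpha + beta*x) * f(x), which the abstract does not spell out, and takes the standard normal as the reference density f. Under these assumptions the tilted density is again normal with its mean shifted to beta. The function names are hypothetical.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def tilted_pdf(x, beta):
    """Exponential tilt of the N(0, 1) density: exp(alpha + beta*x) * f(x).

    alpha = -beta**2 / 2 is the normalizing constant that makes the tilted
    function integrate to one when the reference f is standard normal.
    """
    alpha = -0.5 * beta * beta
    return math.exp(alpha + beta * x) * normal_pdf(x)
```

Here `tilted_pdf(x, beta)` agrees with `normal_pdf(x, mean=beta)` for every `x`, so a shift in group means corresponds to a nonzero tilt parameter. The generalized approach in the thesis also compares variances, which this one-parameter sketch does not attempt to capture.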
6	A semi-parametric approach to estimating item response functions Liang, Longjuan 22 June 2007 No description available. Item response theory semi-parametric approach logistic function monotonic polynomial
7	Statistical inference for inequality measures based on semi-parametric estimators Kpanzou, Tchilabalo Abozou 12 1900 Thesis (PhD)--Stellenbosch University, 2011. / ENGLISH ABSTRACT: Measures of inequality, also used as measures of concentration or diversity, are very popular in economics, especially for measuring the inequality in income or wealth within a population and between populations. However, they have applications in many other fields, e.g. in ecology, linguistics, sociology, demography, epidemiology and information science. A large number of measures have been proposed to measure inequality. Examples include the Gini index, the generalized entropy, the Atkinson and the quintile share ratio measures. Inequality measures are inherently dependent on the tails of the population (underlying distribution), and therefore their estimators are typically sensitive to data from these tails (nonrobust). For example, income distributions often exhibit a long tail to the right, leading to the frequent occurrence of large values in samples. Since the usual estimators are based on the empirical distribution function, they are usually nonrobust to such large values. Furthermore, heavy-tailed distributions often occur in real-life data sets; remedial action therefore needs to be taken in such cases. The remedial action can be either a trimming of the extreme data or a modification of the (traditional) estimator to make it more robust to extreme observations. In this thesis we follow the second option, modifying the traditional empirical distribution function as estimator to make it more robust. Using results from extreme value theory, we develop more reliable distribution estimators in a semi-parametric setting. These new estimators of the distribution then form the basis for more robust estimators of the measures of inequality. These estimators are developed for the four most popular classes of measures, viz. Gini, generalized entropy, Atkinson and quintile share ratio. Properties of such estimators are studied, especially via simulation. Using limiting distribution theory and the bootstrap methodology, approximate confidence intervals are derived. Through the various simulation studies, the proposed estimators are compared to the standard ones in terms of mean squared error, relative impact of contamination, confidence interval length and coverage probability. In these studies the semi-parametric methods show a clear improvement over the standard ones. The theoretical properties of the quintile share ratio have not been studied much. Consequently, we also derive its influence function as well as the limiting normal distribution of its nonparametric estimator. These results have not previously been published. In order to illustrate the methods developed, we apply them to a number of real-life data sets. Using such data sets, we show how the methods can be used in practice for inference. In order to choose between the candidate parametric distributions, use is made of a measure of sample representativeness from the literature. These illustrations show that the proposed methods can be used to reach satisfactory conclusions in real-life problems. / AFRIKAANS SUMMARY: Measures of inequality, which are also used as measures of concentration or diversity, are very popular in economics, especially for quantifying inequality in income or wealth within a population and between populations. They also have applications in many other disciplines, for example ecology, linguistics, sociology, demography, epidemiology and information science. Various measures of inequality already exist. Examples include the Gini index, the generalized entropy measure, the Atkinson measure and the quintile share ratio. Measures of inequality are inherently dependent on the tails of the population (underlying distribution), and their estimators are therefore typically sensitive to data from such tails (nonrobust). Income distributions, for example, often have long right tails, which can lead to the occurrence of large values in samples. The traditional estimators are based on the empirical distribution function and are therefore usually not robust to such large values. Since heavy-tailed distributions often occur in real data, corrections must be made in such cases. These corrections can consist of either trimming the extreme data or adapting the traditional estimators to make them more robust to extreme values. In this thesis the second option is followed: the traditional empirical distribution function as estimator is adapted to make it more robust. Using results from extreme value theory, more reliable estimators of distributions are developed in a semi-parametric setting. These new estimators of the distribution then form the basis for more robust estimators of measures of inequality. These estimators are developed for the four most popular classes of measures, namely Gini, generalized entropy, Atkinson and quintile share ratio. Properties of these estimators are studied, especially by means of simulation studies. Approximate confidence intervals are developed using limiting distribution theory and the bootstrap methodology. The proposed estimators are compared with traditional estimators in various simulation studies. The comparison is made in terms of mean squared error, relative impact of contamination, confidence interval length and coverage probability. In these studies the semi-parametric methods show a clear improvement over the traditional methods. The theoretical properties of the quintile share ratio have not yet received much attention in the literature. Consequently, we derive its influence function as well as the asymptotic distribution of its nonparametric estimator. In order to illustrate the methods developed, they are applied to a number of real data sets. These applications show how the methods can be used for inference in practice. A method from the literature for sample representativeness is presented and used to choose between the candidate parametric distributions. These examples show that the proposed methods can be used fruitfully to reach satisfactory conclusions in practice. Extreme value theory Semi-parametric estimation Confidence intervals
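To make the sensitivity of the usual estimators concrete, here is a minimal sketch of the standard empirical (plug-in) Gini index. It is the traditional estimator the abstract contrasts against, not the thesis's semi-parametric, extreme-value-based estimator. The function name and the restriction to nonnegative values with a positive total are assumptions for illustration.

```python
def gini(values):
    """Empirical Gini index based on the sorted sample.

    Uses G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n, where x_(i) is
    the i-th smallest value. Assumes nonnegative values with a positive total.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty sample with a positive total")
    rank_weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * rank_weighted / (n * total) - (n + 1.0) / n
```

On a toy sample such as `[1, 2, 3, 4, 5]`, appending one large value like `100` raises the estimate sharply, which shows in miniature the nonrobustness to right-tail observations that motivates the modified estimators described in the abstract.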
8	Essays on semi-parametric Bayesian econometric methods Wu, Ruochen January 2019 This dissertation consists of three chapters on semi-parametric Bayesian econometric methods. Chapter 1 applies a semi-parametric method to demand systems and compares how well different approaches to linearly estimating the widely used Almost Ideal demand model, by either iteration or approximation, recover the true elasticities. Chapter 2, co-authored with Dr. Melvyn Weeks, introduces a new semi-parametric Bayesian Generalized Least Squares estimator, which employs the Dirichlet Process prior to cope with potential heterogeneity in the error distributions. Two methods are discussed as special cases of the GLS estimator: the Seemingly Unrelated Regression for equation systems and the Random Effects Model for panel data, which can be applied in many fields, such as the demand analysis in Chapter 1. Chapter 3 focuses on subset selection for the efficiencies of firms, and addresses the influence of heterogeneity in the distributions of efficiencies on subset selection by applying the semi-parametric Bayesian Random Effects Model introduced in Chapter 2.
9	Hierarchical Multi-Bottleneck Classification Method And Its Application to DNA Microarray Expression Data Xiong, Xuejian, Wong, Weng Fai, Hsu, Wen Jing 01 1900 The recent development of DNA microarray technology is creating a wealth of gene expression data. Typically these datasets have high dimensionality and considerable variety. Analysis of DNA microarray expression data is a fast-growing research area that interfaces various disciplines such as biology, biochemistry, computer science and statistics. It has been concluded that clustering and classification techniques can be successfully employed to group genes based on the similarity of their expression patterns. In this paper, a hierarchical multi-bottleneck classification method is proposed and applied to classify a publicly available gene microarray expression dataset of the budding yeast Saccharomyces cerevisiae. / Singapore-MIT Alliance (SMA) DNA microarray gene expression data Semi-parametric mixture identification
10	Bayesian Modeling and Computation for Mixed Data Cui, Kai January 2012 Multivariate or high-dimensional data with mixed types are ubiquitous in many fields of study, including science, engineering, social science, finance, health and medicine, and joint analysis of such data entails both statistical models flexible enough to accommodate them and novel methodologies for computationally efficient inference. Such joint analysis is potentially advantageous in many statistical and practical aspects, including shared information, dimension reduction, efficiency gains, increased power and better control of error rates. This thesis mainly focuses on two types of mixed data: (i) mixed discrete and continuous outcomes, especially in a dynamic setting; and (ii) multivariate or high-dimensional continuous data with potential non-normality, where each dimension may have different degrees of skewness and tail behavior. Flexible Bayesian models are developed to jointly model these types of data, with a particular interest in exploring and utilizing the factor model framework. Much emphasis has also been placed on the ability to scale the statistical approaches and computation efficiently up to problems with long mixed time series or increasingly high-dimensional heavy-tailed and skewed data. To this end, in Chapter 1, we start by reviewing the mixed data challenges. In Chapter 2, we develop generalized dynamic factor models for mixed-measurement time series. The framework allows mixed-scale measurements in different time series, with the different measurements having distributions in the exponential family conditional on time-specific dynamic latent factors. Efficient computational algorithms for Bayesian inference are developed that can be easily extended to long time series. Chapter 3 focuses on the problem of joint modeling of high-dimensional data with potential non-normality, where the mixed skewness and/or tail behaviors in different dimensions are accurately captured via the proposed heavy-tailed and skewed factor models. Chapter 4 further explores the properties of, and efficient Bayesian inference for, the generalized semiparametric Gaussian variance-mean mixtures family, and introduces it as a potentially useful family for modeling multivariate heavy-tailed and skewed data. / Dissertation Statistics Factor analysis High-dimensional Mixed data Non-Gaussian Semi-parametric Sparse
