Global ETD Search

91	Nonlinear models for neural networks. Brittain, Susan. January 2000 The most commonly used applications of hidden-layer feedforward neural networks are to fit curves to regression data or to provide a surface from which a classification rule can be found. From a statistical viewpoint, the principle underpinning these networks is that of nonparametric regression, with sigmoidal curves being located and scaled so that their sum approximates the data well, and the underlying mechanism is that of nonlinear regression, with the weights of the network corresponding to parameters in the regression model, and the objective function implemented in the training of the network defining the error structure. The aim of the present study is to use these statistical insights to critically appraise the reliability and the precision of the predicted outputs from a trained hidden-layer feedforward neural network. / Thesis (M.Sc.)-University of Natal, Pietermaritzburg, 2000. Neural Networks (Computer Science)
92	Inference from finite population sampling : a unified approach. January 2007 In this thesis, we have considered the inference aspects of sampling from a finite population. There are significant differences between traditional statistical inference and finite population sampling inference. In the case of finite population sampling, the statistician is free to choose his own sampling design and is not confined to independent and identically distributed observations as is often the case with traditional statistical inference. We look at the correspondence between the sampling design and the sampling scheme. We also look at methods used for drawing samples. The non-existence theorems (Godambe (1955), Hanurav and Basu (1971)) are also discussed. Since the minimum variance unbiased estimator does not exist for infinite populations, a number of estimators need to be considered for estimating the same parameter. We discuss the admissibility properties of estimators and the use of sufficient statistics and the Rao-Blackwell Theorem for the improvement of inefficient inadmissible estimators. Sampling strategies using auxiliary information relating to the population need to be used, as no sampling strategy can provide an efficient estimator of the population parameter in all situations. Finally, a few well-known sampling strategies are studied and compared under a super population model. / Thesis (M.Sc.)-University of KwaZulu-Natal, Westville, 2007. Sampling (Statistics) Prediction theory.
93	Aspects of categorical data analysis. Govender, Yogarani. January 1998 The purpose of this study is to investigate and understand data which are grouped into categories. At the outset, the study presents a review of early research contributions and controversies surrounding categorical data analysis. The concept of sparseness in a contingency table refers to a table where many cells have small frequencies. Previous research findings showed that incorrect results were obtained in the analysis of sparse tables. Hence, attention is focussed on the effect of sparseness on modelling and analysis of categorical data in this dissertation. Cressie and Read (1984) suggested a versatile alternative, the power-divergence statistic, to statistics proposed in the past. This study includes a detailed discussion of the power-divergence goodness-of-fit statistic with areas of interest covering a review on the minimum power-divergence estimation method and evaluation of model fit. The effects of sparseness are also investigated for the power-divergence statistic. Comparative reviews on the accuracy, efficiency and performance of the power-divergence family of statistics under large and small sample cases are presented. Statistical applications on the power-divergence statistic have been conducted in SAS (Statistical Analysis Software). Further findings on the effect of small expected frequencies on accuracy of the X² test are presented from the studies of Tate and Hyer (1973) and Lawal and Upton (1976). Other goodness-of-fit statistics which bear relevance to the sparse multinomial case are discussed. They include Zelterman's (1987) D² goodness-of-fit statistic, Simonoff's (1982, 1983) goodness-of-fit statistics as well as Koehler and Larntz's tests for log-linear models.
On addressing contradictions for the sparse sample case under asymptotic conditions and an increase in sample size, discussions are provided on Simonoff's use of nonparametric techniques to find the variances as well as his adoption of the jackknife and bootstrap techniques. / Thesis (M.Sc.)-University of Natal, Durban, 1998. Multivariate analysis. Categories (Mathematics)
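As an illustration of the power-divergence family mentioned in entry 93, the following is a minimal sketch of the Cressie and Read (1984) statistic, 2/(λ(λ+1)) · Σ Oᵢ[(Oᵢ/Eᵢ)^λ − 1], for a single multinomial sample. The function name `power_divergence` and the restriction to λ > −1 are assumptions made here for simplicity; the thesis itself reports using SAS, and this is not its code.

```python
import math


def power_divergence(observed, expected, lam):
    """Cressie-Read power-divergence statistic (sketch, lam > -1 only).

    lam = 1 reproduces Pearson's X^2 when the totals of `observed` and
    `expected` agree; lam = 0 is taken as the limiting case, the
    likelihood-ratio statistic G^2.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if lam <= -1:
        raise ValueError("this sketch only handles lam > -1")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")

    # Cells with an observed count of zero contribute 0 for lam > -1.
    cells = [(o, e) for o, e in zip(observed, expected) if o > 0]

    if lam == 0:
        # Limit lam -> 0: G^2 = 2 * sum O * log(O / E)
        return 2.0 * sum(o * math.log(o / e) for o, e in cells)

    total = sum(o * ((o / e) ** lam - 1.0) for o, e in cells)
    return 2.0 * total / (lam * (lam + 1.0))


# A sparse-looking table: one empty cell, equal expected frequencies.
observed = [3, 0, 7, 10]
expected = [5.0, 5.0, 5.0, 5.0]
pearson = power_divergence(observed, expected, 1.0)
g_squared = power_divergence(observed, expected, 0.0)
cressie_read = power_divergence(observed, expected, 2.0 / 3.0)
```

Comparing the three values on a table with small cells gives a rough feel for how members of the family can disagree under sparseness, which is the issue the abstract raises.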
94	Modelling CD4+ count over time in HIV positive patients initiated on HAART in South Africa using linear mixed models. Yende Zuma, Nonhlanhla. January 2009 HIV is among the highly infectious and pathogenic diseases with a high mortality rate. The spread of HIV is influenced by several individual-based epidemiological factors such as age, gender, mobility, sexual partner profile and the presence of sexually transmitted infections (STI). CD4+ count over time provided the first surrogate marker of HIV disease progression and is currently used for clinical management of HIV-positive patients. The CD4+ count as a key disease marker is repeatedly measured among those individuals who test HIV positive to monitor the progression of the disease, since it is known that HIV/AIDS is a long wave event. This gives rise to what is commonly known as longitudinal data. The aim of this project is to determine if the patients' weight, baseline age, sex, viral load and clinic site influence the rate of change in CD4+ count over time. We will use data of patients who commenced highly active antiretroviral therapy (HAART) from the Centre for the AIDS Programme of Research in South Africa (CAPRISA) in the AIDS Treatment Project (CAT) between June 2004 and September 2006, including two years of follow-up for each patient. Analysis was done using linear mixed models methods for longitudinal data. The results showed that a larger increase in CD4+ count over time was observed in females and individuals who were younger. However, upon fitting baseline log viral load in the model instead of the log viral load at all visits, a larger increase in CD4+ count was observed in females and in individuals who were younger, had higher baseline log viral load and had lower weight. / Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2009. AIDS (Disease) HIV infections.
95	Stochastic volatility effects on defaultable bonds. Mkize, Thembisile. January 2009 We study the effects of stochastic volatility on defaultable bonds using the first-passage structural approach. In this approach Black and Cox (1976) argued that default can happen at any time. This then led to the development of a first-passage model, in which a firm (company) default occurs when its value falls to a barrier. In the first-passage model the firm debt is considered to be a single pure discount bond and default occurs only if the firm value falls below the face value of the bond at maturity. Here the firm's debt can be viewed as a portfolio composed of a risk-free bond and a short put option on the value of a firm. The classic Black-Scholes-Merton model only considers a single liability and the solvency is tested at the maturity date, while the extended Black-Scholes-Merton model allows for default at any time before maturity to cater for more complex capital structures and was delivered by Geske, Black-Cox, Leland, Leland and Toft and others. In this work a review of the effect of stochastic volatility on defaultable bonds is given. In addition a study from the first-passage structural approach and reduced-form approach is made. We also introduce symmetry analysis to study some of the equations that appear in option-pricing models. This approach is quite recent and has produced successful results. In this work we lay the foundation of this method. Keywords: Stochastic Volatility, Defaultable bonds, Lie Symmetries. / Thesis (M.Sc.)-University of KwaZulu-Natal, Westville, 2009. Bond issues. Stochastic processes.
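The decomposition of a firm's debt into a risk-free bond and a short put, described in entry 95, can be sketched with the classic constant-volatility Black-Scholes-Merton formulas. This is only an illustration of that decomposition: it assumes constant volatility and default at maturity only, so it is neither the stochastic-volatility nor the first-passage model studied in the thesis, and all function and parameter names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def _d1_d2(v, face, r, sigma, t):
    d1 = (math.log(v / face) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    return d1, d1 - sigma * math.sqrt(t)


def bs_call(v, face, r, sigma, t):
    """Black-Scholes call on firm value v with strike `face` (equity value)."""
    d1, d2 = _d1_d2(v, face, r, sigma, t)
    return v * norm_cdf(d1) - face * math.exp(-r * t) * norm_cdf(d2)


def bs_put(v, face, r, sigma, t):
    """Black-Scholes put on firm value v with strike `face`."""
    d1, d2 = _d1_d2(v, face, r, sigma, t)
    return face * math.exp(-r * t) * norm_cdf(-d2) - v * norm_cdf(-d1)


def merton_debt_value(v, face, r, sigma, t):
    """Risky zero-coupon debt = risk-free bond minus a put on firm value."""
    return face * math.exp(-r * t) - bs_put(v, face, r, sigma, t)


def credit_spread(v, face, r, sigma, t):
    """Yield on the risky debt in excess of the risk-free rate."""
    debt = merton_debt_value(v, face, r, sigma, t)
    return -math.log(debt / face) / t - r


# Hypothetical inputs: firm value 100, debt face value 80, 5-year maturity.
debt = merton_debt_value(100.0, 80.0, 0.05, 0.25, 5.0)
equity = bs_call(100.0, 80.0, 0.05, 0.25, 5.0)
spread = credit_spread(100.0, 80.0, 0.05, 0.25, 5.0)
```

By put-call parity, `debt + equity` recovers the firm value, which is a convenient sanity check on the decomposition.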
96	Evaluation of strategies to combine multiple biomarkers in diagnostic testing. Mohammed, Muna Balla Elshareef. January 2012 A challenge in clinical medicine is that of correct diagnosis of disease. Medical researchers invest considerable time and effort to enhance accurate disease diagnosis. Diagnostic tests are important components in modern medical practice. The receiver operating characteristic (ROC) is a commonly used statistical tool for describing the discriminatory accuracy and performance of a diagnostic test. A popular summary index of discriminatory accuracy is the area under the ROC curve (AUC). In the era of high-dimensional data, scientists are evaluating hundreds to multiple thousands of biomarkers simultaneously. A critical challenge is the combination of these markers into models that give insight into disease. In infectious disease, markers are often evaluated in the host as well as in the microorganism or virus causing infection, adding more complexity to the analysis. In addition to providing an improved understanding of factors associated with infection and disease development, combinations of relevant markers are important to diagnose and treat disease. Taken together, this presents many novel and major challenges to, and extends the role of, the statistical analyst. In this thesis, we will address the problem of how to select from multiple markers using existing methods. Logistic regression models offer a simple method for combining markers. We applied resampling methods (e.g., cross-validation and bootstrap) to adjust for overfitting associated with model selection. We simulated several multivariate models to evaluate the performance of the resampling approaches in this setting. We applied the methods to data collected from a study of tuberculosis immune reconstitution inflammatory syndrome (TB-IRIS) in Cape Town.
Baseline levels of five biomarkers were evaluated and we used this dataset to evaluate whether a combination of these biomarkers could accurately discriminate between Tuberculosis Immune Reconstitution Inflammatory Syndrome (TB-IRIS) and non TB-IRIS patients, applying AUC analysis and resampling methods. / Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2012. Biochemical markers. Drug development.
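The AUC used in entry 96 has a simple empirical form: it is the proportion of (diseased, non-diseased) pairs in which the diseased subject has the higher score, with ties counted as one half (the Mann-Whitney statistic). The sketch below assumes the scores are some combined marker value, for example a fitted linear predictor from a logistic regression; the function name and the example scores are hypothetical and not taken from the thesis.

```python
def empirical_auc(scores_diseased, scores_healthy):
    """Empirical AUC as the Mann-Whitney proportion of correctly ordered pairs."""
    if not scores_diseased or not scores_healthy:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0   # pair ordered correctly
            elif d == h:
                wins += 0.5   # ties count as half
    return wins / (len(scores_diseased) * len(scores_healthy))


# Hypothetical combined-marker scores for cases and controls.
cases = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]
auc = empirical_auc(cases, controls)  # 8 of the 9 pairs are ordered correctly
```

An AUC computed on the same data used to fit the combination tends to be optimistic, which is why the abstract applies cross-validation and the bootstrap to adjust it.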
97	Use of statistical modelling and analyses of malaria rapid diagnostic test outcome in Ethiopia. Ayele, Dawit Getnet. 12 December 2013 The transmission of malaria is among the leading public health problems in Ethiopia. More than 75% of the total area of Ethiopia is malarious. Identifying the infectiousness of malaria by socio-economic, demographic and geographic risk factors based on the malaria rapid diagnosis test (RDT) survey results has several advantages for planning, monitoring and controlling, and eventual malaria eradication efforts. Such a study requires thorough understanding of the disease process and associated factors. However, such studies are limited. Therefore, the aim of this study was to use different statistical tools suitable to identify socioeconomic, demographic and geographic risk factors of malaria based on the malaria rapid diagnosis test (RDT) survey results in Ethiopia. A total of 224 clusters of about 25 households were selected from the Amhara, Oromiya and Southern Nation Nationalities and People (SNNP) regions of Ethiopia. Accordingly, a number of binary response statistical analysis models were used. Multiple correspondence analysis was carried out to identify the association among socioeconomic, demographic and geographic factors. Moreover, a number of binary response models such as survey logistic, GLMM, GLMM with spatial correlation, joint models and semi-parametric models were applied. To test and investigate how well the observed malaria RDT result, use of mosquito nets and use of indoor residual spray data fit the expectations of the model, a Rasch model was used. The fitted models have their own strengths and weaknesses. Application of these models was carried out by analysing data on malaria RDT result.
The data used in this study, which was conducted from December 2006 to January 2007 by The Carter Center, are from a baseline malaria indicator survey in the Amhara, Oromiya and Southern Nation Nationalities and People (SNNP) regions of Ethiopia. The correspondence analysis and survey logistic regression model were used to identify predictors which affect malaria RDT results. The effects of the identified socioeconomic, demographic and geographic factors were subsequently explored by fitting a generalized linear mixed model (GLMM), i.e., to assess the covariance structures of the random components (to assess the association structure of the data). To examine whether the data displayed any spatial autocorrelation, i.e., whether surveys that are near in space have malaria prevalence or incidence that is more similar than that of surveys that are far apart, spatial statistics analysis was performed. This was done by introducing a spatial autocorrelation structure in the GLMM. Moreover, the customary two-variable joint modelling approach was extended to a three-variable joint effect by exploring the joint effect of malaria RDT result, use of mosquito nets and indoor residual spray in the last twelve months. Assessing the association between these outcomes was also of interest. Furthermore, the relationships between the response and some confounding covariates may have unknown functional form. This led to proposing the use of semiparametric additive models, which are less restrictive in their specification. Therefore, generalized additive mixed models were used to model nonparametrically the effect of age, family size, number of rooms per person, number of nets per person, altitude and number of months the room was sprayed. The result from the study suggests that the correct use of mosquito nets, indoor residual spraying and other preventative measures, coupled with factors such as the number of rooms in a house, is associated with a decrease in the incidence of malaria as determined by the RDT.
However, the study also suggests that the poor are less likely to use these preventative measures to effectively counteract the spread of malaria. In order to determine whether or not the limited number of respondents had undue influence on the malaria RDT result, a Rasch model was used. The result shows that none of the responses had such influence. Therefore, application of the Rasch model has supported the viability of all sixteen (socio-economic, demographic and geographic) items for measuring malaria RDT result, use of indoor residual spray and use of mosquito nets. From the analysis it can be seen that the scale shows high reliability. Hence, the result from the Rasch model supports the analysis carried out in previous models. / Thesis (Ph.D.)-University of KwaZulu-Natal, Pietermaritzburg, 2013. Mathematical statistics. Probabilities. Linear models (Statistics)
98	Statistical modelling of availability of major food cereals in Lesotho : application of regression models and diagnostics. Khoeli, Makhala Bernice. January 2012 Oftentimes, application of regression models to analyse cereals data is limited to estimating and predicting crop production or yield. The general approach has been to fit the model without much consideration of the problems that accompany application of regression models to real life data, such as collinearity, models not fitting the data correctly and violation of assumptions. These problems may interfere with applicability and usefulness of the models, and compromise validity of results if they are not corrected when fitting the model. We applied regression models and diagnostics on national and household data to model availability of main cereals in Lesotho, namely, maize, sorghum and wheat. The application includes the linear regression model, regression and collinearity diagnostics, Box-Cox transformation, ridge regression, quantile regression, logistic regression and its extensions with multiple nominal and ordinal responses. The linear model with a first-order autoregressive process AR(1) was used to determine factors that affected availability of cereals at the national level. Case deletion diagnostics were used to identify extreme observations with influence on different quantities of the fitted regression model, such as estimated parameters, predicted values, and covariance matrix of the estimates. Collinearity diagnostics detected the presence of more than one collinear relationship coexisting in the data set. They also determined variables involved in each relationship, and assessed potential negative impact of collinearity on estimated parameters. Ridge regression remedied collinearity problems by controlling inflation and instability of estimates. The Box-Cox transformation corrected non-constant variance, longer and heavier tails of the distribution of data.
These remedies increased applicability and usefulness of the linear models in modeling availability of cereals. Quantile regression, as a robust regression, was applied to the household data as an alternative to classical regression. Classical regression estimates from the ordinary least squares method are sensitive to distributions with longer and heavier tails than the normal distribution, as well as to outliers. Quantile regression estimates appear to be more efficient than least squares estimates for a wide range of error term distributions. We studied availability of cereals further by categorizing households according to availability of different cereals, and applied the logistic regression model and its extensions. Logistic regression was applied to model availability and non-availability of cereals. Multinomial logistic regression was applied to model availability with nominal multiple categories. Ordinal logistic regression was applied to model availability with ordinal categories and this made full use of available information. The three variants of the logistic regression model gave results that are in agreement, which are also in agreement with the results from the linear regression model and quantile regression model. / Thesis (Ph.D.)-University of KwaZulu-Natal, Durban, 2012. Mathematical statistics. Probabilities. Linear models (Statistics)
99	Analysis of Financial Data using a Difference-Poisson Autoregressive Model. Baroud, Hiba. January 2011 Box and Jenkins methodologies have massively contributed to the analysis of time series data. However, the assumptions used in these methods impose constraints on the type of the data. As a result, difficulties arise when we apply those tools to a more generalized type of data (e.g. count, categorical or integer-valued data) rather than the classical continuous or more specifically Gaussian type. Papers in the literature proposed alternate methods to model discrete-valued time series data; among these methods is Pegram's operator (1980). We use this operator to build an AR(p) model for integer-valued time series (including both positive and negative integers). The innovations follow the differenced Poisson distribution, or Skellam distribution. While the model includes the usual AR(p) correlation structure, it can be made more general. In fact, the operator can be extended in a way where it is possible to have components which contribute to positive correlation, while at the same time have components which contribute to negative correlation. As an illustration, the process is used to model the change in a stock's price, where three variations are presented: Variation I, Variation II and Variation III. The first model disregards outliers; however, the second and third include large price changes associated with the effect of large volume trades and market openings. Parameters of the model are estimated using Maximum Likelihood methods. We use several model selection criteria to select the best order for each variation of the model as well as to determine which is the best variation of the model. The most adequate order for all the variations of the model is AR(3). While the best fit for the data is Variation II, residuals' diagnostic plots suggest that Variation III represents a better correlation structure for the model.
Pegram's Operator. Skellam Distribution. Negative Correlations. Actuarial Science.
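The Skellam (differenced Poisson) innovation distribution named in entry 99 is the distribution of K = N1 − N2 for independent Poisson counts N1 and N2, so its probability mass function can be sketched as a truncated convolution sum. The code below covers only this distribution, not Pegram's operator or the AR(p) model; the names `skellam_pmf`, `mu1`, `mu2` and the truncation length `terms` are assumptions for illustration.

```python
import math


def poisson_pmf(n, mu):
    """Poisson probability P(N = n) for mu > 0, computed on the log scale."""
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))


def skellam_pmf(k, mu1, mu2, terms=200):
    """P(N1 - N2 = k) for independent N1 ~ Poisson(mu1), N2 ~ Poisson(mu2).

    Sums P(N1 = j + k) * P(N2 = j) over j, truncated after `terms` values;
    the truncation is adequate only for moderate mu1 and mu2.
    """
    start = max(0, -k)  # j + k must be a non-negative count
    return sum(
        poisson_pmf(j + k, mu1) * poisson_pmf(j, mu2)
        for j in range(start, start + terms)
    )


# Hypothetical rates: upward moves at rate 2, downward moves at rate 3.
support = range(-60, 61)
probs = [skellam_pmf(k, 2.0, 3.0) for k in support]
mean = sum(k * p for k, p in zip(support, probs))              # close to mu1 - mu2
variance = sum((k - mean) ** 2 * p for k, p in zip(support, probs))  # close to mu1 + mu2
```

Because the support includes negative integers, this distribution suits signed price changes, which plain Poisson innovations cannot represent.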
100	Markovian Approaches to Joint-life Mortality with Applications in Risk Management. Ji, Min. 28 July 2011 The combined survival status of the insured lives is a critical problem when pricing and reserving insurance products with more than one life. Our preliminary experience examination of bivariate annuity data from a large Canadian insurance company shows that the relative risk of mortality for an individual increases after the loss of his/her spouse, and that the increase is especially dramatic shortly after bereavement. This preliminary result is supported by the empirical studies over the past 50 years, which suggest dependence between a husband and wife. The dependence between a married couple may be significant in risk management of joint-life policies. This dissertation progressively explores Markovian models in pricing and risk management of joint-life policies, illuminating their advantages in dependent modeling of joint time-until-death (or other exit time) random variables. This dissertation argues that in the modeling of joint-life dependence, Markovian models are flexible, transparent, and easily extended. Multiple state models have been widely used in historic data analysis, particularly in the modeling of failures that have event-related dependence. This dissertation introduces a "common shock" factor into a standard Markov joint-life mortality model, and then extends it to a semi-Markov model to capture the decaying effect of the "broken heart" factor. The proposed models transparently and intuitively measure the extent of three types of dependence: the instantaneous dependence, the short-term impact of bereavement, and the long-term association between lifetimes. Some copula-based dependence measures, such as upper tail dependence, can also be derived from Markovian approaches. Very often, death is not the only mode of decrement.
Entry into long-term care and voluntary prepayment, for instance, can affect reverse mortgage terminations. The semi-Markov joint-life model is extended to incorporate more exit modes, to model joint-life reverse mortgage termination speed. The event-triggered dependence between a husband and wife is modeled. For example, one spouse's death increases the survivor's inclination to move close to kin. We apply the proposed model specifically to develop the valuation formulas for roll-up mortgages in the UK and Home Equity Conversion Mortgages in the US. We test the significance of each termination mode and then use the model to investigate the mortgage insurance premiums levied on Home Equity Conversion Mortgage borrowers. Finally, this thesis extends the semi-Markov joint-life mortality model to having stochastic transition intensities, for modeling joint-life longevity risk in last-survivor annuities. We propose a natural extension of Gompertz' law to have correlated stochastic dynamics for its two parameters, and incorporate it into the semi-Markov joint-life mortality model. Based on this preliminary joint-life longevity model, we examine the impact of mortality improvement on the cost of a last survivor annuity, and investigate the market prices of longevity risk in last survivor annuities using risk-neutral pricing theory. Markov. Semi-Markov. Joint lives. Mortality. Longevity. Actuarial Science.
