51. Multivariate volatility modelling in modern finance. Bongers, Martin B (2008)
Includes abstract and bibliographical references (leaves 100-101). The aim of the study is to ascertain whether the information gained from the more complicated multivariate matrix-decomposition models can be used to better forecast the covariance matrix and to produce a Value-at-Risk (VaR) estimate that more appropriately describes fat-tailed financial time series.
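
The end goal described here, turning a covariance forecast into a Value-at-Risk number, can be illustrated in a few lines. The sketch below is not the thesis's method: the covariance matrix, portfolio weights and Student-t degrees of freedom are all hypothetical, and the t-quantile merely stands in for the fat-tailed behaviour the abstract refers to.

```python
import numpy as np
from scipy import stats

# Hypothetical one-day-ahead covariance forecast (3 assets, annualised) and weights.
sigma = np.array([[0.040, 0.010, 0.006],
                  [0.010, 0.030, 0.008],
                  [0.006, 0.008, 0.020]]) / 252.0  # annual -> daily
w = np.array([0.5, 0.3, 0.2])

port_sd = np.sqrt(w @ sigma @ w)  # portfolio std dev from w' Sigma w

# 99% one-day VaR under a normal assumption ...
var_normal = -stats.norm.ppf(0.01) * port_sd

# ... and under a Student-t with 5 df, rescaled to unit variance,
# which fattens the tails relative to the normal.
nu = 5
var_t = -stats.t.ppf(0.01, df=nu) * np.sqrt((nu - 2) / nu) * port_sd

print(f"Normal 99% VaR: {var_normal:.4%}, Student-t 99% VaR: {var_t:.4%}")
```
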
52. Portfolio construction in South Africa with regard to the exchange rate. Holdsworth, Christopher G (2006)
Includes bibliographical references (leaves 120-123). In South Africa the exchange rate receives a large amount of attention because of its volatility and its perceived effect on share returns. This dissertation examines the international literature on exchange rate exposure and replicates the methods in a South African context to determine the model that detects the most exchange rate exposure. With this model, and a few variations, the persistence of exchange rate exposure is examined; a few shares are found to consistently act as the best hedges against R/$ depreciation, and similarly a few shares are consistently the best at exploiting rand strength. With this in mind, two hedging techniques are compared against the ITRIX exchange-traded fund on their ability to protect against R/$ depreciation while simultaneously providing market-related returns. Both methods were successful: they hedged against R/$ depreciation while still participating in the recent rapid growth on the JSE.
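
A common starting point in the exchange-rate-exposure literature is an augmented market model, in which a share's return is regressed on the market return and the change in the exchange rate, with the latter coefficient measuring exposure. The sketch below fits such a model to simulated data; every number, and the sign convention that a positive change means rand depreciation, is an illustrative assumption rather than a result from the dissertation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 250  # hypothetical daily observations

market = rng.normal(0.0005, 0.010, n)   # market return
dfx = rng.normal(0.0, 0.008, n)         # R/$ log change (+ = rand depreciation)
# A share that gains when the rand weakens (gamma > 0), e.g. an exporter.
share = 0.0002 + 1.1 * market + 0.6 * dfx + rng.normal(0, 0.012, n)

X = sm.add_constant(np.column_stack([market, dfx]))
fit = sm.OLS(share, X).fit()
print(fit.params)    # intercept, market beta, exchange-rate exposure gamma
print(fit.pvalues)
```
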
53. Discriminant analysis: a review of its application to the classification of grape cultivars. Blignaut, Rennette Julia (1989)
The aim of this study was to calculate a classification function for discriminating between five grape cultivars, with a view to determining the cultivar of an unknown grape juice. Various multivariate statistical techniques, such as principal component analysis, cluster analysis, correspondence analysis and discriminant analysis, were applied. Discriminant analysis proved to be the most appropriate technique for the problem at hand, and an in-depth study of it was therefore undertaken. Discriminant analysis was the most appropriate technique for classifying these grape samples into distinct cultivars because it utilises prior information about population membership. This thesis is divided into two main sections. The first section (Chapters 1 to 5) is a review of discriminant analysis, describing various aspects of this technique and matters related thereto. In the second section (Chapter 6) the theory discussed in the first section is applied to the problem at hand, and the results obtained when discriminating between the different grape cultivars are given. Chapter 1 gives a general introduction to the subject of discriminant analysis, including certain basic derivations used in this study. Two approaches to discriminant analysis are discussed in Chapter 2, namely the parametric and non-parametric approaches; the emphasis in this review is placed on the classical approach, while non-parametric approaches such as the k-nearest-neighbour technique, the kernel method and ranking are briefly discussed. Chapter 3 deals with estimating the probability of misclassification. Chapter 4 discusses variable selection techniques. Chapter 5 briefly deals with sequential and logistic discrimination techniques, as well as the estimation of missing values. A final summary and conclusion is given in Chapter 7. Appendices A to D illustrate some of the results obtained from the practical analyses.
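
As a minimal illustration of the study's end use, assigning a juice of unknown origin to a cultivar, the sketch below trains a linear discriminant function on simulated chemical measurements for five cultivars. The labels, number of variables and distributions are all hypothetical.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
cultivars = ["A", "B", "C", "D", "E"]  # placeholder cultivar labels

# Hypothetical chemical measurements per juice sample: 40 samples per
# cultivar, 6 variables, each cultivar shifted in mean.
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(40, 6)) for i in range(5)])
y = np.repeat(cultivars, 40)

lda = LinearDiscriminantAnalysis().fit(X, y)

unknown = rng.normal(loc=2.0, scale=1.0, size=(1, 6))  # juice of unknown origin
print(lda.predict(unknown), lda.predict_proba(unknown).round(3))
```
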
54. Time series analysis of count data with an application to the incidence of cholera. Holloway, Jennifer Patricia (2011)
Includes bibliographical references (leaves 88-93). This dissertation comprises a study of the application of count-data time series models to weekly counts of cholera cases recorded in Beira, Mozambique. The study specifically looks at two classes of time series models for count data, namely observation-driven and parameter-driven, and investigates two models from each class. The autoregressive conditional Poisson (ACP) and double autoregressive conditional Poisson (DACP) models are considered under the observation-driven class, while the parameter-driven models used are the Poisson-gamma and stochastic autoregressive mean (SAM) models. An in-depth case study of the cholera counts is presented in which the four selected models are compared. In addition, the time series models are compared to static Poisson and negative binomial regression, thereby indicating the benefits gained from using count-data time series models when the counts exhibit serial correlation. In the process of comparing the models, the effect of environmental drivers on outbreaks of cholera is observed and discussed.
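
The observation-driven ACP model captures serial correlation by letting the conditional Poisson mean depend on past counts and past means. A minimal simulation of an ACP(1,1) process, with illustrative parameter values, is sketched below.

```python
import numpy as np

rng = np.random.default_rng(2)

# ACP(1,1): y_t | past ~ Poisson(lam_t), lam_t = omega + alpha*y_{t-1} + beta*lam_{t-1}.
omega, alpha, beta = 2.0, 0.4, 0.3  # illustrative; alpha + beta < 1 for stationarity
n = 300
lam = np.empty(n)
y = np.empty(n, dtype=int)
lam[0] = omega / (1 - alpha - beta)  # stationary mean as a starting value
y[0] = rng.poisson(lam[0])
for t in range(1, n):
    lam[t] = omega + alpha * y[t - 1] + beta * lam[t - 1]
    y[t] = rng.poisson(lam[t])

print(y[:20], lam.mean())  # autocorrelated counts, mean near omega/(1-alpha-beta)
```
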
55. A comparison of methods for analysing interval-censored and truncated survival data. Davidse, Alistair (2004)
Bibliography: leaves 49-50. This thesis examines three methods for analysing right-censored data: the Cox proportional hazards model (Cox, 1972), the Buckley-James regression model (Buckley and James, 1979) and the accelerated failure time model. These models are extended to incorporate the analysis of interval-censored and left-truncated data. The models are compared in an attempt to determine whether one performs better than the others in terms of goodness-of-fit and predictive power. Plots of the residuals and random effects from the Cox proportional hazards model are also examined.
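
For the right-censored baseline the thesis starts from, a Cox proportional hazards fit is routine in standard survival software; the interval-censored and left-truncated extensions require modified likelihood contributions. The sketch below shows only that baseline case on simulated data, using the lifelines library, with a made-up covariate effect and censoring scheme.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 200

age = rng.normal(50, 10, n)
# Hypothetical event times with a covariate effect, plus independent right-censoring.
true_time = rng.exponential(scale=np.exp(4 - 0.03 * (age - 50)))
censor = rng.exponential(scale=60, size=n)

df = pd.DataFrame({
    "duration": np.minimum(true_time, censor),
    "event": (true_time <= censor).astype(int),  # 1 = observed, 0 = censored
    "age": age,
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="event")
cph.print_summary()
```
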
56. Techniques for handling clustered binary data. Hanslo, Monique (2002)
Bibliography: leaves 143-153. Over the past few decades there has been increasing interest in clustered studies, and hence much research has gone into the analysis of data arising from them. It is erroneous to treat clustered data, where observations within a cluster are correlated with each other, as one would treat independent data. It has been found that point estimates are not as greatly affected by clustering as are the standard deviations of the estimates; as a consequence, confidence intervals and hypothesis tests are severely affected. One therefore has to approach the analysis of clustered data with caution, and methods that specifically deal with correlated data have been developed. The analysis is further complicated when the outcome variable of interest is binary rather than continuous. Methods for estimating proportions and their variances, calculating confidence intervals, and testing the homogeneity of proportions have been developed over the years (Donner and Klar, 1993; Donner, 1989; Rao and Scott, 1992). The methods developed within the context of experimental design generally involve incorporating the effect of clustering in the analysis. This cluster effect is quantified by the intracluster correlation and needs to be taken into account when estimating proportions, comparing proportions and calculating sample sizes. In the context of observational studies, the effect of clustering is expressed by the design effect, which is the inflation in the variance of an estimate that is due to selecting a cluster sample rather than an independent sample. Another important aspect of the analysis of complex sample data that is often neglected is sampling weights: each individual may not have the same probability of being selected, and these weights adjust for that fact (Little et al., 1997). Methods for modelling correlated binary data have also been discussed quite extensively. Among the many models proposed for analysing clustered binary data, two approaches have been studied and compared: the population-averaged and the cluster-specific approach. The population-averaged model focuses on estimating the effect of a set of covariates on the marginal expectation of the response. One example of the population-averaged approach to parameter estimation is generalised estimating equations, proposed by Liang and Zeger (1986), which assumes that elements within a cluster are independent and then imposes a correlation structure on the set of responses. This is useful in longitudinal studies, where a subject is regarded as a cluster; the parameters then describe how the population-averaged response, rather than a specific subject's response, depends on the covariates. Cluster-specific models, on the other hand, introduce cluster-to-cluster variability by including random effects terms, specific to the cluster, as linear predictors in the regression model (Neuhaus et al., 1991). Unlike the special case of correlated Gaussian responses, the parameters of a cluster-specific model for binary data describe different effects on the responses from those obtained under the population-averaged model. For longitudinal data, the parameters of a cluster-specific model describe how a specific individual's probability of a response depends on the covariates.

The decision to use either of these modelling approaches depends on the questions of interest. Cluster-specific models are useful for studying the effects of cluster-varying covariates and when an individual's response, rather than the population-average response, is the focus. The population-averaged model is useful when interest lies in how the average response across clusters changes with covariates; a criticism of this approach is that there may be no individual with the characteristics described by the population-averaged model.
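
The population-averaged approach described above is directly available in standard software. The sketch below fits a GEE logistic model with an exchangeable working correlation to simulated clustered binary data; the cluster sizes, effect sizes and correlation structure are illustrative assumptions, not values from the thesis.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_clusters, m = 50, 8  # 50 clusters of 8 binary responses each

cluster = np.repeat(np.arange(n_clusters), m)
u = rng.normal(0, 1.0, n_clusters)[cluster]  # shared cluster effect -> intracluster correlation
x = rng.normal(size=n_clusters * m)
p = 1 / (1 + np.exp(-(-0.5 + 0.8 * x + u)))
y = rng.binomial(1, p)

df = pd.DataFrame({"y": y, "x": x, "cluster": cluster})

# Population-averaged model: GEE with an exchangeable working correlation.
gee = smf.gee("y ~ x", groups="cluster", data=df,
              family=sm.families.Binomial(),
              cov_struct=sm.cov_struct.Exchangeable()).fit()
print(gee.summary())
```
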
57. Statistical aspects of bioavailability. Fresen, John Lawrence (1985)
Includes bibliography. In 1984 it became legal for pharmacists to offer customers a cheaper generic alternative to a given prescription, the motivation being the excessively high cost of brand-name drugs. The substitution of a generic alternative for a brand-name drug is based on the assumption that drugs with a comparable chemical composition will have a similar therapeutic effect. That this supposition is not always true has been demonstrated by a number of particular drugs, digoxin being perhaps the most vivid example. The objective of this thesis is to review the statistical aspects associated with (i) measuring the bioavailability of a drug (Chapter 2) and (ii) establishing the equivalence of a new and a standard formulation of a drug (Chapter 3). In the process of reviewing the literature, two problems were identified. Firstly, it is commonly assumed that bioavailability parameters follow either the normal or the lognormal distribution; this assumption is difficult to defend, and procedures based on such assumptions are therefore suspect. Secondly, bioavailability is inherently multivariate, whereas in practice univariate procedures are employed. Efron's bootstrap method, which does not rest on assumptions about the underlying distribution, is proposed as a tool for assessing bioequivalence. A new measure of bioequivalence, the Index of Concordance, is proposed; this index can be computed with equal ease for univariate or multivariate data using the bootstrap (Chapter 5). The bootstrap idea of resampling the data can also be applied to compartmental modelling of bioavailability data, one result of which is a nonparametric estimate of the underlying distribution of the bioavailability parameters (Chapter 6). The bootstrap is, on its own, a fascinating concept, and a review of it is given in Chapter 4.
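
A bootstrap assessment of bioequivalence can be sketched in a few lines: resample each formulation's AUC values, recompute the test-to-reference ratio of means, and read off a percentile interval. The data below are simulated, and the acceptance band in the final comment is the common 0.80-1.25 regulatory convention, given only for orientation; neither is the thesis's Index of Concordance.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical AUC values (a standard bioavailability measure) for a
# reference (brand-name) and test (generic) formulation; arbitrary units.
auc_ref = rng.lognormal(mean=3.00, sigma=0.25, size=24)
auc_test = rng.lognormal(mean=2.95, sigma=0.25, size=24)

B = 5000
ratios = np.empty(B)
for b in range(B):
    r = rng.choice(auc_ref, size=auc_ref.size, replace=True)
    t = rng.choice(auc_test, size=auc_test.size, replace=True)
    ratios[b] = t.mean() / r.mean()

lo, hi = np.percentile(ratios, [5, 95])  # 90% percentile bootstrap interval
print(f"Test/reference mean AUC ratio: 90% CI ({lo:.3f}, {hi:.3f})")
# A common regulatory convention accepts bioequivalence if this interval
# lies within (0.80, 1.25).
```
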
58. The Impact of the Carry Trade on Global Currency Markets. Smit, Steven (2020)
This work analyses the effect of the carry trade factor, statistically derived from a comprehensive basket of currencies, on currencies in various heuristically defined global risk appetite regimes. Findings of a heightened (lessened) impact of this factor on Emerging/Commodity (Developed/European) currencies in the presence of high risk are presented. The risk appetite process is additionally analysed with a Markov-switching model, providing evidence of three inherent regimes whose properties are roughly consistent with findings in the literature.
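
Markov-switching models of the kind used for the risk appetite process estimate regime-specific means and variances together with the probabilities of switching between regimes. The sketch below fits a three-regime model to a simulated risk-appetite proxy with statsmodels; the regime means, volatilities and persistence are illustrative, not the paper's estimates.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)

# Simulate a crude risk-appetite proxy alternating between calm, neutral
# and stressed regimes with different means and volatilities.
means, sds = [0.5, 0.0, -1.0], [0.3, 0.6, 1.5]
states = np.zeros(600, dtype=int)
for t in range(1, 600):
    states[t] = states[t - 1] if rng.random() < 0.95 else rng.integers(0, 3)
y = np.array([rng.normal(means[s], sds[s]) for s in states])

mod = sm.tsa.MarkovRegression(y, k_regimes=3, switching_variance=True)
res = mod.fit()
print(res.summary())
print(res.smoothed_marginal_probabilities[:5])  # per-period regime probabilities
```
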
59. Analysis of clustered competing risks with application to a multicentre clinical trial. Familusi, Mary Ajibola (2016)
The usefulness of time-to-event (survival) analysis has given it wide applicability in statistical modelling research. The methodological developments of time-to-event analysis that have been widely adopted are: (i) the Kaplan-Meier method, for estimating the survival function; (ii) the log-rank test, for comparing the equality of two or more survival distributions; (iii) the Cox proportional hazards model, for examining covariate effects on the hazard function; and (iv) the accelerated failure time model, for examining covariate effects on the survival function. Nonetheless, in the assessment of time-to-event endpoints, if subjects can fail from multiple mutually exclusive causes, the data are said to have competing risks. For competing risks data, the Fine and Gray proportional hazards model for sub-distributions has gained popularity because of its convenience in directly assessing the effect of covariates on the cumulative incidence function. Furthermore, competing risks data sometimes cannot be considered independent because of a clustered design, for instance in registry cohorts or multi-centre clinical trials. The Fine and Gray model has been extended to the analysis of clustered time-to-event data by including random-centre effects, or frailties, in the sub-distribution hazard. This research focuses on the analysis of clustered competing risks with an application to the Investigation of the Management of Pericarditis (IMPI) trial dataset. IMPI is a multi-centre clinical trial carried out at 19 centres in 8 African countries, with the principal objective of assessing the effectiveness and safety of adjunctive prednisolone and Mycobacterium indicus pranii immunotherapy in reducing the composite outcome of death, constriction, or cardiac tamponade requiring pericardial drainage in patients with probable or definite tuberculous pericarditis. The clinical objective of this thesis is therefore to analyse the time to these outcomes. In addition, the risk factors associated with these outcomes were determined, and the effect of prednisolone and M. indicus pranii was examined, adjusting for these risk factors and treating centres as a random effect. Using the Cox proportional hazards model, it was found that age, weight, New York Heart Association (NYHA) class, hypotension, creatinine and peripheral oedema show a statistically significant association with the composite outcome. Furthermore, weight, NYHA class, hypotension, creatinine and peripheral oedema show a statistically significant association with death; NYHA class and hypotension show a statistically significant association with cardiac tamponade; and prednisolone, gender, NYHA class, tachycardia, haemoglobin level, peripheral oedema, pulmonary infiltrate and HIV status show a statistically significant association with constriction. A 0.1 significance level was used to identify significant variables in the univariate models, using a forward stepwise regression method. The random effect was found to be significant for the incidence of the composite outcome of death, cardiac tamponade and constriction, and for the individual outcome of constriction, but accounting for it only slightly changed the estimated covariate effects. Accounting for death as a competing event to cardiac tamponade or constriction does not affect the covariate effects on these outcomes.

In addition, in the multivariate models that adjust for other risk factors, there was no significant difference in the primary outcome between patients who received prednisolone and those who received placebo, or between those who received M. indicus pranii immunotherapy and those who received placebo.
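
Competing-risks analyses of this kind start from the cumulative incidence function (CIF): the probability of failing from one cause by time t in the presence of the other causes. Fitting a Fine and Gray model with random centre effects then requires specialised survival software; the sketch below shows only the nonparametric first step, an Aalen-Johansen CIF estimate on simulated data, with outcome codes and follow-up times that are purely hypothetical.

```python
import numpy as np
from lifelines import AalenJohansenFitter

rng = np.random.default_rng(7)
n = 300

# Hypothetical follow-up times (months) with three mutually exclusive outcomes:
# 0 = censored, 1 = constriction (event of interest), 2 = death (competing risk).
time = rng.exponential(scale=24, size=n)
event = rng.choice([0, 1, 2], size=n, p=[0.40, 0.35, 0.25])

ajf = AalenJohansenFitter()
ajf.fit(time, event, event_of_interest=1)
print(ajf.cumulative_density_.tail())  # cumulative incidence of constriction
```
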
60. Empirical statistical modelling for crop yields predictions: Bayesian and uncertainty approaches. Adeyemi, Rasheed Alani (2015)
Includes bibliographical references. This thesis explores uncertainty statistics to model agricultural crop yields in a situation where there are neither sampling observations nor historical records. The Bayesian approach to a linear regression model is useful for the prediction of crop yield when there are data quantity issues and model structure uncertainty, and when the regression model involves a large number of explanatory variables. Data quantity issues might occur when a farmer is cultivating a new crop variety, moving to a new farming location or introducing a new farming technology, where the situation may warrant a change in current farming practice. The first part of this thesis involved the collection of data from domain experts and the elicitation of probability distributions. Uncertainty statistics, the foundations of uncertainty theory and the data gathering procedures are discussed in detail. We proposed a procedure for estimating uncertainty distributions, which was then implemented on agricultural data to fit uncertainty distributions to five cereal crop yields. A Delphi method was introduced and used to fit uncertainty distributions to multiple experts' data on sesame seed yield. The thesis defined an uncertainty distance and derived a distance for the difference between two uncertainty distributions; we lastly estimated the distance between a hypothesised distribution and an uncertainty normal distribution. Although the applicability of uncertainty statistics is limited to one-sample models, the approach provides a fast way to establish a standard for process parameters. Where no sampling observations exist, or they are very expensive to acquire, the approach provides an opportunity to engage experts and arrive at a model for guiding decision making. In the second part, we fitted a full dataset obtained from an agricultural survey of small-scale farmers to a linear regression model using direct Markov chain Monte Carlo (MCMC), Bayesian estimation (with a uniform prior) and the maximum likelihood estimation (MLE) method. The three procedures yielded similar mean estimates, but the credible intervals from the Bayesian estimates were narrower than the confidence intervals from the MLE method. The predictive performance of the estimated model was then assessed using simulated data for a set of covariates. Furthermore, the dataset was randomly split in two: an informative prior was estimated from one half, the "old data", using ordinary least squares (OLS), and three models were then fitted to the second half, the "new data": a general linear model (GLM) (M1), a Bayesian model with a non-informative prior (M2) and a Bayesian model with the informative prior (M3). Leave-one-out cross-validation (LOOCV) was used to compare the predictive performance of these models. The Bayesian models showed better predictive performance than M1; M3 (with the expert prior) had moderate average cross-validation (CV) error and CV standard error, while the GLM performed worst, with the smallest average CV error but the highest CV standard error among the models. In model M3 the predictor variables were significant at 95% credible intervals, whereas most variables were not significant under models M1 and M2. The model with the informative prior also had narrower credible intervals than the non-informative-prior and GLM models.

The results indicated that variability and uncertainty in the data were reasonably reduced by the incorporation of the expert (informative) prior. We lastly investigated the residual plots of these models to assess their predictive performance. Bayesian model averaging (BMA) was later introduced to address the model structure uncertainty of a single model; BMA allows the computation of a weighted average over possible combinations of predictors. An approximate AIC weight was then proposed for model selection, instead of frequentist hypothesis testing or raw comparison among a set of competing candidate models; the method is flexible and easier to interpret than raw AIC or the Bayesian information criterion (BIC), which approximates the Bayes factor. Zellner's g-prior was considered appropriate, as it has been widely used in linear models: it preserves the correlation structure among predictors in its prior covariance and yields closed-form marginal likelihoods, which lead to large computational savings by avoiding sampling the parameter space as in MCMC-based BMA. We lastly determined a single optimal model from all possible combinations of models and computed the log-likelihood of each model.
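
The closed-form marginal likelihoods mentioned at the end can be made concrete. Under Zellner's g-prior, the Bayes factor of a model with k predictors against the intercept-only model reduces to a simple function of its R-squared, so all-subset model comparison needs nothing beyond OLS fits. The sketch below applies this with g = n to simulated data; the predictor names and effect sizes are hypothetical, and the formula follows the standard g-prior result rather than the thesis's specific implementation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(8)
n, names = 120, ["rain", "fert", "temp", "ph"]  # hypothetical yield drivers

X = rng.normal(size=(n, 4))
yld = 5 + 1.2 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(0, 1.0, n)  # temp, ph irrelevant

def r_squared(y, Xsub):
    """R^2 of an OLS fit with intercept."""
    Z = np.column_stack([np.ones(len(y)), Xsub])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

g = float(n)  # unit-information g-prior
bf = {}
for k in range(len(names) + 1):
    for sub in itertools.combinations(range(len(names)), k):
        if k == 0:
            bf[sub] = 1.0  # intercept-only reference model
            continue
        r2 = r_squared(yld, X[:, list(sub)])
        # Closed-form Bayes factor vs the null model under Zellner's g-prior.
        bf[sub] = (1 + g) ** ((n - k - 1) / 2) / (1 + g * (1 - r2)) ** ((n - 1) / 2)

total = sum(bf.values())  # uniform prior over all 16 candidate models
post = {sub: b / total for sub, b in bf.items()}
best = max(post, key=post.get)
print("Top model:", [names[i] for i in best], round(post[best], 3))
for j, nm in enumerate(names):  # posterior inclusion probabilities
    print(nm, round(sum(p for sub, p in post.items() if j in sub), 3))
```
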