161

Comparative approaches to handling missing data, with particular focus on multiple imputation for both cross-sectional and longitudinal models.

Hassan, Ali Satty Ali. January 2012 (has links)
Much data-based research is characterized by the unavoidable problem of incompleteness that results from missing or erroneous values. This thesis discusses some of the strategies and basic issues in statistical data analysis for addressing the missing data problem, and deals with both missing covariates and missing outcomes. We restrict our attention to methodologies that address a specific missing data pattern, namely monotone missingness. The thesis is divided into two parts. The first part places particular emphasis on the so-called missing at random (MAR) assumption, but focuses the bulk of its attention on multiple imputation techniques. The main aim of this part is to investigate various modelling techniques through application studies, to identify the most appropriate techniques, and to gain insight into their suitability for incomplete data analysis. The thesis first deals with the problem of missing covariate values when estimating regression parameters under a monotone missing covariate pattern. The study compares different imputation techniques, namely Markov chain Monte Carlo (MCMC), regression, propensity score (PS) and last observation carried forward (LOCF) imputation. The results from the application study revealed that there are universally best methods for dealing with missing covariates when the missing data pattern is monotone: of the methods explored, the MCMC and regression methods of imputation for estimating regression parameters under monotone missingness were preferable to the PS and LOCF methods. The study is also concerned with a comparative analysis of techniques applied to incomplete Gaussian longitudinal outcome (response) data subject to random dropout. Three methods are assessed and investigated, namely multiple imputation (MI), inverse probability weighting (IPW) and direct likelihood analysis. The findings in general favoured MI over IPW for continuous outcomes, even when the MAR mechanism holds. The findings further suggest that the MI and direct likelihood techniques lead to accurate and equivalent results, as both techniques arrive at the same substantive conclusions. The study also compares and contrasts several statistical methods for analyzing incomplete non-Gaussian longitudinal outcomes when the underlying study is subject to ignorable dropout. The methods considered include weighted generalized estimating equations (WGEE), multiple imputation after generalized estimating equations (MI-GEE) and the generalized linear mixed model (GLMM). The study found that the MI-GEE method was considerably robust, doing better than all the other methods for both small and large sample sizes, regardless of the dropout rate.
The primary interest of the second part of the thesis falls under non-ignorable dropout (MNAR) modelling frameworks that rely on sensitivity analysis for modelling incomplete Gaussian longitudinal data. The aim of this part is to deal with non-random dropout by explicitly modelling the assumptions that caused the dropout, incorporating this additional sub-model into the model for the measurement data, and assessing the sensitivity of the modelling assumptions. The study pays attention to the analysis of repeated Gaussian measures subject to potentially non-random dropout, in order to study the influence that the dropout process might exert on inference. We consider the construction of a particular type of selection model, namely the Diggle-Kenward model, as a tool for assessing the sensitivity of a selection model to its modelling assumptions. The major conclusions drawn were that there was evidence in favour of an MAR process rather than an MCAR process in the context of the assumed model, and that further insight into the data was needed through comparison of various sensitivity analysis frameworks. Lastly, two families of models were compared and contrasted to investigate the potential influence that dropout might exert on inference about the dependent measurement data, and to deal with incomplete sequences. The models were based on the selection and pattern-mixture frameworks used in sensitivity analysis to jointly model the distribution of the dropout process and the longitudinal measurement process. The results of the sensitivity analysis were in agreement and hence led to similar parameter estimates. Additional confidence in the findings was gained as both models led to similar results for significant effects such as marginal treatment effects. / Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2012.
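To make the multiple-imputation workflow discussed above concrete, the sketch below imputes a single monotone-missing covariate by regression imputation with parameter draws and pools the regression estimates with Rubin's rules. The simulated data, variable names and the MAR mechanism are illustrative assumptions, not the thesis's application data.

```python
# Hedged sketch: regression-based multiple imputation for one monotone-missing
# covariate, with Rubin's rules pooling. All data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, m = 500, 20                           # sample size, number of imputations

x1 = rng.normal(size=n)                  # fully observed covariate
x2 = 0.5 * x1 + rng.normal(size=n)       # covariate to be made missing
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# MAR mechanism: probability of missingness in x2 depends only on observed x1.
miss = rng.random(n) < 1 / (1 + np.exp(-x1))
x2_obs = np.where(miss, np.nan, x2)
obs = ~np.isnan(x2_obs)

def fit_ols(X, y):
    """Coefficients and their covariance for y ~ X (intercept added)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sigma2 = resid @ resid / (len(y) - X1.shape[1])
    return beta, sigma2 * np.linalg.inv(X1.T @ X1)

# Imputation model for x2 given x1 and y, fitted on the complete cases.
b_imp, cov_imp = fit_ols(np.column_stack([x1[obs], y[obs]]), x2_obs[obs])
sigma_imp = np.std(x2_obs[obs]
                   - np.column_stack([np.ones(obs.sum()), x1[obs], y[obs]]) @ b_imp)

estimates, variances = [], []
for _ in range(m):
    # Draw imputation-model coefficients to propagate parameter uncertainty
    # (a simplified form of proper MI; the residual scale is kept fixed).
    b_draw = rng.multivariate_normal(b_imp, cov_imp)
    Xmis = np.column_stack([np.ones((~obs).sum()), x1[~obs], y[~obs]])
    x2_imp = x2_obs.copy()
    x2_imp[~obs] = Xmis @ b_draw + rng.normal(scale=sigma_imp, size=(~obs).sum())
    beta, cov = fit_ols(np.column_stack([x1, x2_imp]), y)
    estimates.append(beta)
    variances.append(np.diag(cov))

# Rubin's rules: pooled estimate, within- and between-imputation variance.
Q = np.mean(estimates, axis=0)
U = np.mean(variances, axis=0)
B = np.var(estimates, axis=0, ddof=1)
T = U + (1 + 1 / m) * B
print("pooled coefficients:", Q, "pooled SEs:", np.sqrt(T))
```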
162

Likelihood based statistical methods for estimating HIV incidence rate.

Gabaitiri, Lesego. January 2013 (has links)
Estimation of current levels of human immunodeficiency virus (HIV) incidence is essential for monitoring the impact of an epidemic, determining public health priorities, assessing the impact of interventions and for planning purposes. However, there are often insufficient data on incidence compared to prevalence. A direct approach is to estimate incidence from longitudinal cohort studies. Although this approach can provide a direct and unbiased measure of incidence for the settings where the study is conducted, it is often too expensive and time consuming. An alternative approach is to estimate incidence from a cross-sectional survey using biomarkers that distinguish between recent and non-recent (longstanding) infections. The original biomarker-based approach proposes the detection of HIV-1 p24 antigen in the pre-seroconversion period to identify persons with acute infection for estimating HIV incidence. However, this approach requires large sample sizes in order to obtain reliable estimates of HIV incidence, because the duration of antigenemia before antibody detection is short, about 22.5 days. Subsequently, another method, involving a dual antibody testing system, was developed. In stage one, a sensitive test is used to diagnose HIV infection, and in the second stage a less sensitive test is used to distinguish between longstanding and recent infections among those who tested positive for HIV in stage one. The question is: how do we combine these data with other relevant information, such as the period an individual takes to move from being undetectable to being detectable by the less sensitive test, to estimate incidence? The main objective of this thesis is therefore to develop likelihood-based methods that can be used to estimate HIV incidence when data are derived from cross-sectional surveys and disease classification is achieved by combining two biomarker or assay tests. The thesis builds on the dual antibody testing approach and extends the statistical framework that uses the multinomial distribution to derive maximum likelihood estimators of HIV incidence for different settings. In order to improve incidence estimation, we develop a model for estimating HIV incidence that incorporates information on previous (past) prevalence, and derive maximum likelihood estimators of incidence assuming the incidence density is constant over a specified period. Later, we extend the method to settings where a proportion of subjects remain non-reactive to the less sensitive test long after seroconversion. Diagnostic tests used to determine recent infection are prone to errors. To address this problem, we consider a method that simultaneously adjusts for sensitivity and specificity. In addition, we show that sensitivity is similar to the proportion of subjects who eventually transit the "recent infection" state. We also relax the assumption of constant incidence density by proposing a linear incidence density, to accommodate settings where incidence might be declining or increasing. We extend the standard adjusted model for estimating incidence to settings where some subjects who tested positive for HIV antibodies were not tested with the less sensitive test, resulting in missing outcome data. Models for the risk factors (covariates) of HIV incidence are considered in the penultimate chapter. We use data from the Botswana AIDS Impact Survey (BAIS) III of 2008 to illustrate the proposed methods. General conclusions and recommendations for future work are provided in the final chapter.
/ Thesis (Ph.D.)-University of KwaZulu-Natal, Pietermaritzburg, 2013.
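As a rough illustration of the cross-sectional, assay-based approach the thesis builds on, the sketch below computes the familiar snapshot incidence estimator, recent infections divided by approximate person-time at risk, with a crude delta-method standard error. The counts and the mean recency window are invented, and the thesis's multinomial maximum likelihood framework generalizes well beyond this simple form.

```python
# Hedged sketch: simple cross-sectional incidence estimator for a dual-assay
# ("recent" vs "longstanding") survey. All numbers are illustrative assumptions.
import numpy as np

n_neg = 4200     # HIV-negative (susceptible) respondents
n_recent = 60    # HIV-positive and classified "recent" by the less sensitive assay
omega = 0.5      # assumed mean duration of the "recent" state, in years

# Incidence rate per person-year: recent cases / approximate person-time at risk.
lam = n_recent / (n_neg * omega)

# Crude delta-method standard error, ignoring uncertainty in omega.
se = lam * np.sqrt(1.0 / n_recent + 1.0 / n_neg)
print(f"estimated incidence: {100 * lam:.2f} per 100 person-years "
      f"(approx. 95% CI {100 * (lam - 1.96 * se):.2f} to {100 * (lam + 1.96 * se):.2f})")
```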
163

Managing security risk and predicting disciplinary incidents: the contribution of the importation and deprivation models and of the LS/CMI

Charton, Thibault January 2008 (has links)
Master's thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal
164

Multi-label feature selection with application to musical instrument recognition

Sandrock, Trudie December 2013 (has links)
Thesis (PhD)--Stellenbosch University, 2013. / ENGLISH ABSTRACT: An area of data mining and statistics that is currently receiving considerable attention is the field of multi-label learning. Problems in this field are concerned with scenarios where each data case can be associated with a set of labels instead of only one. In this thesis, we review the field of multi-label learning and discuss the lack of suitable benchmark data available for evaluating multi-label algorithms. We propose a technique for simulating multi-label data, which allows good control over different data characteristics and which could be useful for conducting comparative studies in the multi-label field. We also discuss the explosion in data in recent years, and highlight the need for some form of dimension reduction in order to alleviate some of the challenges presented by working with large datasets. Feature (or variable) selection is one way of achieving dimension reduction, and after a brief discussion of different feature selection techniques, we propose a new technique for feature selection in a multi-label context, based on the concept of independent probes. This technique is empirically evaluated using simulated multi-label data, and it is shown to achieve, with a reduced set of features, classification accuracy similar to that achieved with the full set of features. The proposed technique for feature selection is then also applied to the field of music information retrieval (MIR), specifically the problem of musical instrument recognition. An overview of the field of MIR is given, with particular emphasis on the instrument recognition problem. The particular goal of (polyphonic) musical instrument recognition is to automatically identify the instruments playing simultaneously in an audio clip, which is not a simple task. We specifically consider the case of duets – in other words, where two instruments are playing simultaneously – and approach the problem as a multi-label classification problem. In our empirical study, we illustrate the complexity of musical instrument data and again show that our proposed feature selection technique is effective in identifying relevant features, thereby reducing the complexity of the dataset without negatively impacting on performance.
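The sketch below illustrates one way a probe-based feature selection rule can be implemented in a multi-label setting: permuted "probe" columns are appended, a multi-label random forest is fitted, and only original features whose importance exceeds that of the best probe are kept. The synthetic data, the use of random-forest importances and the cut-off rule are assumptions for illustration; the thesis's own independent-probe procedure may differ in its details.

```python
# Hedged sketch: probe-based feature selection for multi-label data.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, Y = make_multilabel_classification(n_samples=300, n_features=30,
                                      n_classes=5, random_state=0)

n_probes = 10
# Probes: permuted copies of randomly chosen columns, so they share the
# marginal distribution of real features but carry no signal about Y.
probe_cols = rng.choice(X.shape[1], size=n_probes, replace=False)
probes = np.column_stack([rng.permutation(X[:, j]) for j in probe_cols])
X_aug = np.hstack([X, probes])

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_aug, Y)
imp = clf.feature_importances_
threshold = imp[X.shape[1]:].max()            # best-scoring probe
selected = np.where(imp[:X.shape[1]] > threshold)[0]
print(f"kept {len(selected)} of {X.shape[1]} features:", selected)
```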
165

Contracting under Heterogeneous Beliefs

Ghossoub, Mario 25 May 2011 (has links)
The main motivation behind this thesis is the lack of belief subjectivity in problems of contracting, and especially in problems of demand for insurance. The idea that an underlying uncertainty in contracting problems (e.g. an insurable loss in problems of insurance demand) is a given random variable on some exogenously determined probability space is so engrained in the literature that one can easily forget that the notion of an objective uncertainty is only one possible approach to the formulation of uncertainty in economic theory. On the other hand, the subjectivist school led by De Finetti and Ramsey challenged the idea that uncertainty is totally objective, and advocated a personal view of probability (subjective probability). This ultimately led to Savage's approach to the theory of choice under uncertainty, where uncertainty is entirely subjective and it is only one's preferences that determine one's probabilistic assessment. It is the purpose of this thesis to revisit the "classical" insurance demand problem from a purely subjectivist perspective on uncertainty. To do so, we will first examine a general problem of contracting under heterogeneous subjective beliefs and provide conditions under which we can show the existence of a solution and then characterize that solution. One such condition will be called "vigilance". We will then specialize the study to the insurance framework, and characterize the solution in terms of what we will call a "generalized deductible contract". Subsequently, we will study some mathematical properties of collections of vigilant beliefs, in preparation for future work on the idea of vigilance. This and other envisaged future work will be discussed in the concluding chapter of this thesis. In the chapter preceding the concluding chapter, we will examine a model of contracting for innovation under heterogeneity and ambiguity, simply to demonstrate how the ideas and techniques developed in the first chapter can be used beyond problems of insurance demand.
166

Actuarial Inference and Applications of Hidden Markov Models

Till, Matthew Charles January 2011 (has links)
Hidden Markov models have become a popular tool for modeling long-term investment guarantees. Many different variations of hidden Markov models have been proposed over the past decades for modeling indexes such as the S&P 500, and they capture the tail risk inherent in the market to varying degrees. However, goodness-of-fit testing, such as residual-based testing, for hidden Markov models is a relatively undeveloped area of research. This work focuses on hidden Markov model assessment, and develops a stochastic approach to deriving a residual set that is ideal for standard residual tests. This result allows hidden-state models to be tested for goodness-of-fit with the well-developed testing strategies for single-state models. This work also focuses on parameter uncertainty for the popular long-term equity hidden Markov models. There is a special focus on underlying states that represent lower returns and higher volatility in the market, as these states can have the largest impact on investment guarantee valuation. A Bayesian approach for the hidden Markov models is applied to address the issue of parameter uncertainty and the impact it can have on investment guarantee models. Also in this thesis, the areas of portfolio optimization and portfolio replication under a hidden Markov model setting are further developed. Different strategies for optimization and portfolio hedging under hidden Markov models are presented and compared using real-world data. The impact of parameter uncertainty, particularly for model parameters associated with higher market volatility, is once again a focus, and the effects of not taking parameter uncertainty into account when optimizing or hedging in a hidden Markov model setting are demonstrated.
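As a minimal illustration of the regime-switching (hidden Markov) return models referred to above, the sketch below simulates monthly log-returns from a two-state Gaussian HMM and evaluates the model log-likelihood with the scaled forward algorithm. The regime parameters are invented, not fitted to the S&P 500, and the thesis's residual-based tests and Bayesian treatment are not reproduced here.

```python
# Hedged sketch: two-state Gaussian HMM for monthly log-returns, with
# simulation and log-likelihood via the scaled forward algorithm.
import numpy as np
from scipy.stats import norm

# Assumed regime parameters: state 0 = calm, state 1 = turbulent.
mu = np.array([0.008, -0.010])
sigma = np.array([0.035, 0.080])
P = np.array([[0.96, 0.04],           # one-month transition matrix
              [0.20, 0.80]])
pi0 = np.array([0.5, 0.5])            # initial state distribution

def simulate(n, rng):
    s = np.zeros(n, dtype=int)
    r = np.zeros(n)
    s[0] = rng.choice(2, p=pi0)
    r[0] = rng.normal(mu[s[0]], sigma[s[0]])
    for t in range(1, n):
        s[t] = rng.choice(2, p=P[s[t - 1]])
        r[t] = rng.normal(mu[s[t]], sigma[s[t]])
    return r, s

def log_likelihood(r):
    """Scaled forward algorithm; returns log p(r | assumed parameters)."""
    alpha = pi0 * norm.pdf(r[0], mu, sigma)
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, len(r)):
        alpha = (alpha @ P) * norm.pdf(r[t], mu, sigma)
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

r, s = simulate(600, np.random.default_rng(1))
print("simulated months in the turbulent state:", s.sum())
print("log-likelihood under the assumed parameters:", log_likelihood(r))
```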
167

An introduction to Gerber-Shiu analysis

Huynh, Mirabelle January 2011 (has links)
A valuable analytical tool for understanding the event of ruin is the Gerber-Shiu discounted penalty function. It acts as a unified means of identifying ruin-related quantities that may help insurers understand their vulnerability to ruin. This thesis provides an introduction to the basic concepts and common techniques used in Gerber-Shiu analysis. Chapter 1 introduces the insurer's surplus process in the ordinary Sparre Andersen model. Defective renewal equations, the Dickson-Hipp transform, and Lundberg's fundamental equation are reviewed. Chapter 2 introduces the classical Gerber-Shiu discounted penalty function. Two framework equations are derived by conditioning on the first drop in surplus below its initial value, and by conditioning on the time and amount of the first claim. A detailed discussion is provided for each of these conditioning arguments. The classical Poisson model (where interclaim times are exponentially distributed) is then considered, as is the case where claim sizes are exponentially distributed. Chapter 3 introduces the Gerber-Shiu function in the delayed renewal model, which allows the time until the first claim to be distributed differently from subsequent interclaim times. We determine a functional relationship between the Gerber-Shiu function in the ordinary Sparre Andersen model and the Gerber-Shiu function in the delayed model for a class of first interclaim time densities which includes the equilibrium density for the stationary renewal model and the exponential density. To conclude, Chapter 4 introduces a generalized Gerber-Shiu function in which the penalty function includes two additional random variables: the minimum surplus level before ruin, and the surplus immediately after the claim preceding the claim that causes ruin. This generalized Gerber-Shiu function allows for the study of random variables which otherwise could not be studied using the classical definition of the function. Additionally, it is assumed that the size of a claim is dependent on the interclaim time that precedes it. As in Chapter 2, a detailed discussion of each of the two conditioning arguments is provided. Using the uniqueness property of Laplace transforms, the forms of the joint defective discounted densities of interest are determined. The classical Poisson model and the exponential claim size assumption are also revisited.
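For orientation, the sketch below treats the simplest member of the Gerber-Shiu family: with discount rate delta = 0 and penalty w equal to 1, the function reduces to the ruin probability, which has a closed form in the classical Poisson model with exponential claims. The parameter values, the finite simulation horizon and the "safe level" cut-off are illustrative assumptions.

```python
# Hedged sketch: ruin probability psi(u) in the classical compound Poisson
# model with exponential claims (the Gerber-Shiu function with delta = 0,
# w = 1), comparing Monte Carlo with the known closed form.
import numpy as np

lam, beta, c, u0 = 1.0, 1.0, 1.2, 5.0   # claim rate, claim-size parameter, premium rate, initial surplus
# Closed form, valid under positive loading (c > lam / beta):
psi_exact = (lam / (c * beta)) * np.exp(-(beta - lam / c) * u0)

def ruined(u, rng, max_claims=2000, safe_level=100.0):
    """Track surplus at claim epochs; True if it ever falls below zero."""
    for _ in range(max_claims):
        wait = rng.exponential(1.0 / lam)            # interclaim time
        u += c * wait - rng.exponential(1.0 / beta)  # premium income minus claim
        if u < 0:
            return True
        if u > safe_level:                           # ruin from here is negligible
            return False
    return False

rng = np.random.default_rng(0)
n_paths = 5000
psi_mc = np.mean([ruined(u0, rng) for _ in range(n_paths)])
print(f"closed form psi(u): {psi_exact:.4f}, Monte Carlo estimate: {psi_mc:.4f}")
```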
168

Directional Control of Generating Brownian Path under Quasi Monte Carlo

Liu, Kai January 2012 (has links)
Quasi-Monte Carlo (QMC) methods are playing an increasingly important role in computational finance. This is attributed to the increased complexity of derivative securities and the sophistication of financial models. Simple closed-form solutions for finance applications typically do not exist, and hence numerical methods are needed to approximate their solutions. The QMC method has been proposed as an alternative to the Monte Carlo (MC) method to accomplish this objective. Unlike that of MC methods, the efficiency of QMC-based methods is highly dependent on the dimensionality of the problem. In particular, numerous studies have documented, under the Black-Scholes model, the critical role of the generating matrix used to simulate the Brownian paths. Numerical results support the notion that a generating matrix that reduces the effective dimension of the underlying problem is able to increase the efficiency of QMC. Consequently, dimension reduction methods such as principal component analysis, the Brownian bridge, the linear transformation and the orthogonal transformation have been proposed to further enhance QMC. Motivated by these results, we first propose a new measure to quantify effective dimension. We then propose a new dimension reduction method, which we refer to as the directional control (DC) method. The proposed DC method has the advantage that it depends explicitly on the given function of interest. Furthermore, by assigning appropriately the direction of importance of the given function, the proposed method optimally determines the generating matrix used to simulate the Brownian paths. Because of the flexibility of the proposed method, it can be shown that many existing dimension reduction methods are special cases of the proposed DC method. Finally, many numerical examples are provided to support the competitive efficiency of the proposed method.
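The sketch below illustrates the role of the generating matrix discussed above: the same scrambled Sobol points are mapped to Brownian paths through an increment (Cholesky-type) construction and through a PCA construction, and both are used to price an arithmetic Asian call under Black-Scholes. The product, its parameters and the use of PCA (rather than the thesis's directional control method) are assumptions for illustration; it also assumes a recent SciPy with the scipy.stats.qmc module.

```python
# Hedged sketch: two generating matrices A with A A^T = [min(t_i, t_j)]
# applied to the same Sobol-based normals, then used for Asian-option pricing.
import numpy as np
from scipy.stats import norm, qmc

S0, K, r, sigma, T, d = 100.0, 100.0, 0.03, 0.2, 1.0, 16   # d monitoring dates
t = np.linspace(T / d, T, d)
n = 2**12

sob = qmc.Sobol(d=d, scramble=True, seed=1)
U = np.clip(sob.random(n), 1e-12, 1 - 1e-12)   # guard against ppf(0) or ppf(1)
Z = norm.ppf(U)                                # standard normals from Sobol points

dt = np.diff(np.concatenate(([0.0], t)))
A_std = np.tril(np.ones((d, d))) * np.sqrt(dt)        # increment construction
C = np.minimum.outer(t, t)
eigval, eigvec = np.linalg.eigh(C)
order = np.argsort(eigval)[::-1]
A_pca = eigvec[:, order] * np.sqrt(eigval[order])     # PCA construction

def asian_price(A):
    W = Z @ A.T                                       # Brownian paths at t_1..t_d
    S = S0 * np.exp((r - 0.5 * sigma**2) * t + sigma * W)
    payoff = np.maximum(S.mean(axis=1) - K, 0.0)
    return np.exp(-r * T) * payoff.mean()

print("increment construction:", asian_price(A_std))
print("PCA construction:      ", asian_price(A_pca))
```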
169

Economic Pricing of Mortality-Linked Securities

Zhou, Rui January 2012 (has links)
In previous research on pricing mortality-linked securities, the no-arbitrage approach is often used. However, this method, which takes market prices as given, is difficult to implement in today's embryonic market, where there are few traded securities. In particular, with limited market price data, identifying a risk-neutral measure requires strong assumptions. In this thesis, we approach the pricing problem from a different angle by considering economic methods. We propose pricing approaches for both a competitive market and a non-competitive market. In the competitive market, we treat the pricing exercise as a Walrasian tâtonnement process, in which prices are determined through a gradual calibration of supply and demand. Such a pricing framework provides us with a pair of supply and demand curves. From these curves we can tell whether there will be any trade between the counterparties and, if so, at what price the mortality-linked security will be traded. This method does not require the market prices of other mortality-linked securities as input, which spares us the problems associated with the lack of market price data. We extend the pricing framework to incorporate population basis risk, which arises when a pension plan relies on standardized instruments to hedge its longevity risk exposure. This extension allows us to obtain the price and trading quantity of mortality-linked securities in the presence of population basis risk. The resulting supply and demand curves help us understand how population basis risk affects the behaviors of agents. We apply the method to a hypothetical longevity bond, using real mortality data from different populations. Our illustrations show that, interestingly, population basis risk can affect the price of a mortality-linked security in different directions, depending on the properties of the populations involved. We have also examined the impact of transitory mortality jumps on trading in a competitive market. Mortality dynamics are subject to jumps, due to events such as the Spanish flu of 1918. Such jumps can have a significant impact on the prices of mortality-linked securities, and therefore should be taken into account in modeling. Although several single-population mortality models with jump effects have been developed, they are not adequate for trades in which population basis risk exists. We first develop a two-population mortality model with transitory jump effects, and then use the proposed mortality model to examine how mortality jumps may affect the supply and demand of mortality-linked securities. Finally, we model the pricing process in a non-competitive market as a bargaining game. Nash's bargaining solution is applied to obtain a unique trading contract. With no requirement of a competitive market, this approach is more appropriate for the current mortality-linked security market. We compare this approach with the other proposed pricing method, and find that both pricing methods lead to Pareto-optimal outcomes.
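As a toy illustration of the Walrasian tâtonnement idea, the sketch below adjusts the price of a hypothetical longevity bond in proportion to excess demand until supply and demand balance. The linear supply and demand curves are invented placeholders, not the utility-based curves derived in the thesis.

```python
# Hedged toy sketch of a Walrasian tatonnement price adjustment.
def demand(p):   # hedger's demand for the longevity bond, decreasing in price
    return max(0.0, 10.0 - 4.0 * p)

def supply(p):   # issuer's supply, increasing in price
    return max(0.0, -2.0 + 6.0 * p)

p, step = 1.0, 0.05
for _ in range(1000):
    excess = demand(p) - supply(p)
    if abs(excess) < 1e-8:
        break
    p += step * excess            # raise the price when demand exceeds supply
print(f"equilibrium price ~ {p:.4f}, quantity traded ~ {demand(p):.4f}")
```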
170

Modelling and hedging of actuarial operations for groups with multiple states

Pociello García, Enrique 12 May 2000 (has links)
The work presented here focuses on the modelling of an actuarial operation for a group with multiple states, on top of which we consider covering adverse random fluctuations in claims experience. By an actuarial operation for a group with multiple states we mean a life operation that contemplates different states of the insured (disabled, active, employed, unemployed, etc.). These operations are usually associated with disability benefits or with other benefits complementary to life cover. One of our objectives is to set out, in general terms, the modelling of multiple-state operations. To this end, we describe a general multiple-state model, based on Markov and semi-Markov stochastic processes, that allows a systematic, general and mathematically rigorous treatment of any multiple-state operation defined in continuous or discrete time. We extend the study to the particular case of a disability operation. We show that applying probability tables fitted to the characteristics of the group is not sufficient to guarantee its solvency, since solvency also depends on the structure of the group. To guarantee solvency, we propose the application of claims-difference reinsurance. This form of reinsurance, whose definition and application we develop within the framework of a multiple-state operation, covers adverse deviations in claims experience that are due to the structure of the group itself. The application of claims-difference reinsurance makes it possible to fully guarantee the solvency of the group, considering the premiums paid by its members as the only source of income. In this context, solvency is understood as the capacity to meet the present and future obligations contracted by the group with its own members. In order to design an optimal strategy, we combine claims-difference reinsurance with the other instrument available to the group to ensure its solvency, a security loading, so that the actuarial present value of the total cost (reinsurance premiums plus the loaded premiums of the insurance operation) is minimized. By its own dynamics, the group is subject to the entry of new insureds; we analyse the effects generated by these entries and their repercussions on the group. As a result of the different proposals made, we formulate different variants of claims-difference reinsurance.
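A minimal sketch of the discrete-time Markov machinery behind such multiple-state operations is given below: a three-state (active, disabled, dead) transition matrix is iterated to obtain the actuarial present value of a disability annuity. The transition probabilities and interest rate are invented for illustration; the thesis works with a more general continuous- and discrete-time (semi-)Markov framework and with the claims-difference reinsurance built on top of it.

```python
# Hedged sketch: discrete-time three-state Markov model and the actuarial
# present value of a disability annuity of 1 per year paid while disabled.
import numpy as np

P = np.array([[0.92, 0.05, 0.03],    # from active
              [0.10, 0.82, 0.08],    # from disabled
              [0.00, 0.00, 1.00]])   # dead is absorbing
v = 1 / 1.03                         # annual discount factor
n_years = 40

state = np.array([1.0, 0.0, 0.0])    # insured starts active
apv = 0.0
for k in range(1, n_years + 1):
    state = state @ P                # state distribution after k years
    apv += v**k * state[1]           # benefit of 1 if disabled at time k
print(f"APV of the disability annuity: {apv:.4f}")
```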
