Global ETD Search

111	Comparison of methods to calculate measures of inequality based on interval data Neethling, Willem Francois 12 1900 Thesis (MComm), Stellenbosch University, 2015. / ENGLISH ABSTRACT: In recent decades, economists and sociologists have taken an increasing interest in the study of income attainment and income inequality. Many of these studies have used census data, but social surveys have also increasingly been utilised as sources for these analyses. In these surveys, respondents’ incomes are most often not measured in true amounts, but in categories of which the last category is open-ended. The reason is that income is seen as sensitive data and/or is sometimes difficult to reveal. Continuous data divided into categories is often more difficult to work with than ungrouped data. In this study, we compare different methods to convert grouped data to data where each observation has a specific value or point. For some methods, all the observations in an interval receive the same value; an example is the midpoint method, where all the observations in an interval are assigned the midpoint. Other methods include random methods, where each observation receives a random point between the lower and upper bound of the interval. For some methods, random and non-random, a distribution is fitted to the data and a value is calculated according to the distribution. The non-random methods that we use are the midpoint, Pareto means and lognormal means methods; the random methods are the random midpoint, random Pareto and random lognormal methods. Since our focus falls on income data, which usually follows a heavy-tailed distribution, we use the Pareto and lognormal distributions in our methods. The above-mentioned methods are applied to simulated and real datasets. The raw values of these datasets are known, and are categorised into intervals. These methods are then applied to the interval data to reconvert the interval data to point data. 
To test the effectiveness of these methods, we calculate some measures of inequality. The measures considered are the Gini coefficient, quintile share ratio (QSR), the Theil measure and the Atkinson measure. The estimated measures of inequality, calculated from each dataset obtained through these methods, are then compared to the true measures of inequality. / AFRIKAANSE OPSOMMING (Afrikaans abstract, in English translation): Over recent decades, economists and sociologists have shown an increasing interest in studies of income attainment and income inequality. Many of the studies use census data, but the use of social surveys as sources for the analyses has also increased noticeably. In the surveys, a person's income is mostly indicated in categories, of which the last interval is open, rather than as numerical values. The reason for the categories is that income data is regarded as sensitive and is sometimes also difficult to report. Continuous data divided into categories is, most of the time, more difficult to work with than ungrouped data. In this study, several methods are compared for converting grouped data to data where each observation has a numerical value. For some of the methods, the same value is given to all the observations in an interval, for example the midpoint method, where each value receives the midpoint of the interval. Other methods are random methods, where each observation receives a random value between the lower and upper bound of the interval. For some of the methods, random and non-random, a distribution is fitted to the data and a value is calculated according to the distribution. The non-random methods used are the midpoint, Pareto means and lognormal means methods, and the random methods are the random midpoint, random Pareto and random lognormal methods. Our focus is on income data, which usually follows a heavy-tailed distribution, and for this reason we use the Pareto and lognormal distributions in our methods. 
All the methods are applied to simulated and real datasets. The raw values of the datasets are known and are categorised into intervals. The methods are then applied to the interval data to convert it back to data where each observation has a numerical value. To test the effectiveness of the methods, a few measures of inequality are calculated. The measures include the Gini coefficient, the quintile share ratio (QSR), and the Theil and Atkinson measures. The estimated measures of inequality, calculated from the datasets obtained through the methods, are then compared with the true measures of inequality. Interval data UCTD
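As a rough illustration of the abstract in entry 111, the following Python sketch converts interval data to point data with a midpoint rule and with a random draw inside each interval, then computes a Gini coefficient from the result. It is not the thesis's implementation: the function names, the uniform distribution used for the random draw, and the restriction to closed intervals (no open-ended last category) are all assumptions made for this example.

```python
import random


def midpoint_values(intervals, counts):
    """Give every observation in an interval the interval midpoint.

    intervals: list of (lower, upper) bounds (closed intervals assumed).
    counts: number of observations falling in each interval.
    """
    values = []
    for (lower, upper), n in zip(intervals, counts):
        values.extend([(lower + upper) / 2.0] * n)
    return values


def random_uniform_values(intervals, counts, seed=0):
    """Give every observation a random point between the interval bounds.

    A uniform draw is an assumption of this sketch; the abstract only says
    the point lies between the lower and upper bound.
    """
    rng = random.Random(seed)
    values = []
    for (lower, upper), n in zip(intervals, counts):
        values.extend(rng.uniform(lower, upper) for _ in range(n))
    return values


def gini(values):
    """Gini coefficient of non-negative values via the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical income brackets and respondent counts, for illustration only.
brackets = [(0, 1000), (1000, 5000), (5000, 20000)]
respondents = [50, 30, 20]
gini_midpoint = gini(midpoint_values(brackets, respondents))
gini_random = gini(random_uniform_values(brackets, respondents))
```

Comparing `gini_midpoint` and `gini_random` with the Gini coefficient of the raw values, when those are known, mirrors the kind of comparison the abstract describes.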
112	Machine learning for systems pathology Verleyen, Wim January 2013 Systems pathology attempts to introduce more holistic approaches to pathology and to integrate clinicopathological information with “-omics” technology. This doctorate researches two examples of a systems approach to pathology: (1) a personalized patient output prediction for ovarian cancer and (2) an analytical approach that differentiates between individual and collective tumour invasion. In the ovarian cancer study, clinicopathological measurements and proteomic biomarkers are analysed with a set of newly engineered bioinformatic tools. These tools are based upon feature selection, survival analysis with Cox proportional hazards regression, and a novel Monte Carlo approach. Clinical and pathological data prove to have highly significant information content, as expected; molecular data, however, have little information content alone and are significant only when the selected most-informative variables are placed in the context of the patient's clinical and pathological measures. Furthermore, classifiers based on support vector machines (SVMs) that predict one-year progression-free survival (PFS) and three-year overall survival (OS) with high accuracy show how adding carefully selected molecular measures to clinical and pathological knowledge can enable personalized prognosis predictions. Finally, the high performance of these classifiers is validated on an additional data set. The second study, an analytical approach that differentiates between individual and collective tumour invasion, analyses a set of morphological measures. These measurements are collected with a newly developed process that combines automated imaging analysis for data collection with a Bayesian network analysis to probabilistically connect morphological variables with tumour invasion modes. 
Cell-cell contact is the morphological feature that best discriminates between the individual and collective invasion modes. Smaller invading groups were typified by smoother cellular surfaces than those invading collectively in larger groups. Interestingly, elongation was evident in all invading cell groups and was not a specific feature of single-cell invasion as a surrogate of epithelial-mesenchymal transition. In conclusion, the combination of automated imaging analysis and Bayesian network analysis provides insight into the morphological variables associated with the transition of cancer cells between invasion modes. We show that only two morphologically distinct modes of invasion exist. The two studies performed in this thesis illustrate the potential of a systems approach to pathology and the need for quantitative approaches in order to reveal the system behind pathology. 610.285
113	BAYES RISK ANALYSIS OF REGIONAL REGRESSION ESTIMATES OF FLOODS Metler, William Arledge 02 1900 This thesis defines a methodology for evaluating the worth of streamflow data using a Bayes risk approach. When regional streamflow data are used in a regression analysis, the Bayes risk can be computed by considering the probability of error in the regionalized estimates of bridge or culvert design parameters. Cost curves for over- and underestimation of the design parameter can be generated based on the error of the estimate. The Bayes risk can then be computed by integrating the probability of estimation error over the cost curves. The methodology may then be used to analyze the regional data collection effort by considering the worth of data for a record site relative to the other sites contributing to the regression equations. The methodology is illustrated using a set of actual streamflow data from Missouri. The cost curves for over- and underestimation of the streamflow design parameter for bridges and culverts are hypothesized so that the Bayes risk can be computed and the results of the analysis discussed. The discussion demonstrates the small-sample bias that is introduced into the estimate of the design parameter for the construction of bridges and culverts. The conclusions are that the small-sample bias in the estimation of large floods can be substantial and that the Bayes risk methodology can evaluate the relative worth of data when the data are used in regionalization. Flood forecasting
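The step in entry 113 of integrating the probability of estimation error over the cost curves can be sketched numerically as an expected cost. The Python below is only an illustration under stated assumptions: the standard normal error density, the linear cost slopes, the sign convention (positive error means overestimation), and every function name are hypothetical and do not come from the thesis.

```python
import math


def normal_pdf(e, mean=0.0, sd=1.0):
    """Assumed density of the estimation error e (normal, for illustration)."""
    z = (e - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def asymmetric_cost(e, over_rate=1.0, under_rate=3.0):
    """Hypothetical cost curves: linear, steeper for underestimation (e < 0)."""
    return over_rate * e if e >= 0 else under_rate * (-e)


def bayes_risk(pdf, cost, lo, hi, steps=20000):
    """Approximate the integral of cost(e) * pdf(e) over [lo, hi]
    with the composite trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (pdf(lo) * cost(lo) + pdf(hi) * cost(hi))
    for k in range(1, steps):
        e = lo + k * h
        total += pdf(e) * cost(e)
    return total * h


# Expected cost under the assumed error density and cost curves.
risk = bayes_risk(normal_pdf, asymmetric_cost, -8.0, 8.0)
```

With these particular assumptions the integral has the closed form (over_rate + under_rate) / sqrt(2π), which gives a quick check on the numerical result.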
114	Optimal design for experiments with mixtures 陳令由, Chan, Ling-yau. January 1986 Mathematics / Doctoral / Doctor of Philosophy Mixtures - Statistical methods. Experimental design. Mathematical optimization.
115	Construction and testing of causal models in voting behaviour with reference to Hong Kong Lui, Kwok-man, Richard., 呂國民. January 1996 Politics and Public Administration / Doctoral / Doctor of Philosophy
116	Multilevel models for survival analysis in dental research Wong, Chun-mei, May., 王春美. January 2005 Dentistry / Doctoral / Doctor of Philosophy Survival analysis (Biometry)
117	New recursive parameter estimation algorithms in impulsive noise environment with application to frequency estimation and system identification Lau, Wing-yi., 劉穎兒. January 2006 Electrical and Electronic Engineering / Master / Master of Philosophy Signal processing - Statistical methods. Parameter estimation. Algorithms.
118	Statistical analysis of the infectivity and fatality of an emerging epidemic Xu, Ying, 徐穎 January 2009 Statistics and Actuarial Science / Doctoral / Doctor of Philosophy Epidemiology - Statistical methods. SARS (Disease) - Epidemiology.
119	Statistical evaluation of mixed DNA stains Choy, Yan-tsun., 蔡恩浚. January 2009 Statistics and Actuarial Science / Master / Master of Philosophy DNA fingerprinting. Forensic genetics - Statistical methods.
120	Statistical analysis of temporal and spatial variations in suicide data Yang, Kit-ling., 楊潔玲. January 2009 Statistics and Actuarial Science / Master / Master of Philosophy
