Comparison of methods to calculate measures of inequality based on interval data

Thesis (MComm), Stellenbosch University, 2015.

ENGLISH ABSTRACT: In recent decades, economists and sociologists have taken an increasing interest in the study of
income attainment and income inequality. Many of these studies have used census data, but
social surveys have also increasingly been utilised as sources for these analyses. In these
surveys, respondents’ incomes are most often recorded not as exact amounts but in categories,
the last of which is open-ended. The reason is that income is regarded as sensitive
and/or respondents sometimes find it difficult to state.
Continuous data divided into categories is often more difficult to work with than ungrouped data.
In this study, we compare different methods to convert grouped data to data where each
observation has a specific value or point. For some methods, all the observations in an interval
receive the same value; an example is the midpoint method, where all the observations in an
interval are assigned the midpoint. Other methods include random methods, where each
observation receives a random point between the lower and upper bound of the interval. For
some methods, random and non-random, a distribution is fitted to the data and a value is
calculated according to the distribution.
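
The two simplest ideas described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the example intervals and counts are hypothetical, every interval is assumed to have a finite upper bound (an open-ended last category would need a separate rule), and the random variant simply draws uniformly within each interval, which may differ from how the thesis defines its random methods.

```python
import random


def midpoint_values(intervals, counts):
    """Give every observation in an interval the interval's midpoint."""
    values = []
    for (lower, upper), n in zip(intervals, counts):
        values.extend([(lower + upper) / 2.0] * n)
    return values


def random_uniform_values(intervals, counts, seed=0):
    """Give every observation a uniform random point inside its interval.

    Assumption: a uniform draw; the abstract does not specify the
    distribution used by the thesis's random methods.
    """
    rng = random.Random(seed)
    values = []
    for (lower, upper), n in zip(intervals, counts):
        values.extend(rng.uniform(lower, upper) for _ in range(n))
    return values


# Hypothetical grouped income data: (lower, upper) bounds and frequencies.
intervals = [(0, 1000), (1000, 5000), (5000, 20000)]
counts = [3, 2, 1]

print(midpoint_values(intervals, counts))
# [500.0, 500.0, 500.0, 3000.0, 3000.0, 12500.0]
print(random_uniform_values(intervals, counts))
```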
The non-random methods that we use are the midpoint, Pareto means and lognormal means
methods; the random methods are the random midpoint, random Pareto and random
lognormal methods. Since our focus falls on income data, which usually follows a heavy-tailed
distribution, we use the Pareto and lognormal distributions in our methods.
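
One plausible reading of a "means" method is that each observation receives the conditional mean of the fitted distribution within its interval. The sketch below illustrates that idea only; it is an assumption, not the procedure documented in the thesis. The parameters mu, sigma and alpha are taken as given (how the thesis fits them is not described in the abstract), and all function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF; math.erf accepts +/- infinity."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _log(x):
    """Natural log extended so that log(0) is -infinity."""
    return math.log(x) if x > 0 else -math.inf


def lognormal_interval_mean(lower, upper, mu, sigma):
    """E[X | lower < X <= upper] for X ~ Lognormal(mu, sigma).

    upper may be math.inf for an open-ended last category.
    """
    a, b = _log(lower), _log(upper)
    num = normal_cdf((b - mu - sigma ** 2) / sigma) - normal_cdf((a - mu - sigma ** 2) / sigma)
    den = normal_cdf((b - mu) / sigma) - normal_cdf((a - mu) / sigma)
    return math.exp(mu + sigma ** 2 / 2.0) * num / den


def pareto_tail_mean(lower, alpha):
    """E[X | X > lower] for a Pareto tail with shape alpha (needs alpha > 1)."""
    if alpha <= 1:
        raise ValueError("the Pareto mean is infinite for alpha <= 1")
    return alpha * lower / (alpha - 1.0)


# Hypothetical parameters, chosen only to exercise the functions.
print(lognormal_interval_mean(1000, 5000, mu=8.0, sigma=1.0))
print(pareto_tail_mean(20000, alpha=2.0))  # 40000.0
```

The Pareto formula is useful mainly for the open-ended top category, where a midpoint does not exist.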
The above-mentioned methods are applied to simulated and real datasets. The raw values of
these datasets are known, and are categorised into intervals. These methods are then applied
to the interval data to convert it back to point data. To test the effectiveness of these
methods, we calculate some measures of inequality. The measures considered are the Gini
coefficient, quintile share ratio (QSR), the Theil measure and the Atkinson measure. The
estimated measures of inequality, calculated from each dataset obtained through these
methods, are then compared to the true measures of inequality.

AFRIKAANSE OPSOMMING: Oor die afgelope dekades het ekonome en sosioloë ʼn toenemende belangstelling getoon in
studies aangaande inkomsteverkryging en inkomste-ongelykheid. Baie van die studies maak
gebruik van sensus data, maar die gebruik van sosiale opnames as bronne vir die ontledings
het ook merkbaar toegeneem. In die opnames word die inkomste van ʼn persoon meestal in
kategorieë aangedui waar die laaste interval oop is, in plaas van numeriese waardes. Die rede
vir die kategorieë is dat inkomste data as sensitief beskou word en soms is dit ook moeilik om
aan te dui.
Kontinue data wat in kategorieë opgedeel is, is meeste van die tyd moeiliker om mee te werk as
ongegroepeerde data. In dié studie word verskeie metodes vergelyk om gegroepeerde data om
te skakel na data waar elke waarneming ʼn numeriese waarde het. Vir van die metodes word
dieselfde waarde aan al die waarnemings in ʼn interval gegee, byvoorbeeld die ‘midpoint’
metode waar elke waarde die middelpunt van die interval verkry. Ander metodes is ewekansige
metodes waar elke waarneming ʼn ewekansige waarde kry tussen die onder- en bogrens van die
interval. Vir sommige van die metodes, ewekansig en nie-ewekansig, word ʼn verdeling oor die
data gepas en ʼn waarde bereken volgens die verdeling.
Die nie-ewekansige metodes wat gebruik word, is die ‘midpoint’, ‘Pareto means’ en ‘Lognormal
means’ en die ewekansige metodes is die ‘random midpoint’, ‘random Pareto’ en ‘random
lognormal’. Ons fokus is op inkomste data, wat gewoonlik ʼn swaar stertverdeling volg, en om
hierdie rede maak ons gebruik van die Pareto en lognormaal verdelings in ons metodes.
Al die metodes word toegepas op gesimuleerde en werklike datastelle. Die rou waardes van die
datastelle is bekend en word in intervalle gekategoriseer. Die metodes word dan op die interval
data toegepas om dit terug te skakel na data waar elke waarneming ʼn numeriese waardes het.
Om die doeltreffendheid van die metodes te toets word ʼn paar maatstawwe van ongelykheid
bereken. Die maatstawwe sluit in die Gini koeffisiënt, ‘quintile share ratio’ (QSR), die Theil en
Atkinson maatstawwe. Die beraamde maatstawwe van ongelykheid, wat bereken is vanaf die
datastelle verkry deur die metodes, word dan vergelyk met die ware maatstawwe van
ongelykheid.
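
The four inequality measures named in the abstract can be sketched with their standard textbook definitions. This is a minimal illustration under stated assumptions: incomes are strictly positive, the QSR uses a simple cut of the lowest and highest fifth of observations by count, and the Atkinson parameter epsilon = 0.5 is an arbitrary default. The abstract does not say which variants or parameter values the thesis uses, and the function names are hypothetical.

```python
import math


def gini(values):
    """Gini coefficient from the rank-weighted sum of sorted incomes."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1.0) / n


def quintile_share_ratio(values):
    """Total income of the top fifth divided by that of the bottom fifth.

    Assumption: fifths are cut by observation count, with no interpolation.
    """
    xs = sorted(values)
    k = max(1, len(xs) // 5)
    return sum(xs[-k:]) / sum(xs[:k])


def theil(values):
    """Theil T index: mean of (x / mean) * ln(x / mean)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x / mean) * math.log(x / mean) for x in values) / n


def atkinson(values, epsilon=0.5):
    """Atkinson index: 1 - (equally distributed equivalent income / mean)."""
    n = len(values)
    mean = sum(values) / n
    if epsilon == 1:
        ede = math.exp(sum(math.log(x) for x in values) / n)
    else:
        ede = (sum(x ** (1.0 - epsilon) for x in values) / n) ** (1.0 / (1.0 - epsilon))
    return 1.0 - ede / mean


# Hypothetical incomes, used only to exercise the functions.
incomes = [1, 2, 3, 4, 10]
print(gini(incomes), quintile_share_ratio(incomes), theil(incomes), atkinson(incomes))
```

Computing these once on the raw values and once on values reconstructed from the intervals gives the kind of true-versus-estimated comparison the abstract describes.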

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/97780
Date: 12 1900
Creators: Neethling, Willem Francois
Contributors: De Wet, Tertius, Neethling, Ariane, Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science
Publisher: Stellenbosch : Stellenbosch University
Source Sets: South African National ETD Portal
Language: en_ZA
Detected Language: Unknown
Type: Thesis
Format: 167 pages
Rights: Stellenbosch University