Identifikation av icke-representativa svar i frågeundersökningar genom detektion av multivariata avvikare [Identification of non-representative survey responses through detection of multivariate outliers]

Large-scale surveys are an important part of United Minds' offering to clients, not least the public opinion poll Väljarbarometern. A risk associated with surveys is satisficing: sub-optimal response behaviour that impairs the ability to describe the sampled population correctly from the results. The purpose of this study is to use multivariate outlier detection methods to identify observations assumed to be non-representative of the population. The possibility of categorizing responses generated through satisficing as outliers is investigated. With regard to the character of the Väljarbarometern dataset, three existing algorithms are adapted to detect these outliers. In addition, a number of randomly generated observations are added to the data, and all algorithms correctly label them as outliers. The anomaly scores generated by each algorithm are compared, and the Otey algorithm is concluded to be the most effective for the purpose, above all because it takes correlation between variables into account. A plausible cut-off value for outliers and the separation between non-representative and representative outliers are discussed. The resulting recommendation is to handle observations labelled as outliers through respondent follow-up or, if that is not possible, through downweighting inversely proportional to the anomaly scores.

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-219546
Date January 2014
Creators Galvenius, Hugo
Publisher Uppsala universitet, Matematiska institutionen
Source Sets DiVA Archive at Uppsala University
Language Swedish
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
Relation UPTEC STS, 1650-8319 ; 14003
