Return to search

The Effect of 5-anonymity on a classifier based on neural network that is applied to the adult dataset

Privacy issues relating to having data made public is relevant with the introduction of the GDPR. To limit problems related to data becoming public, intentionally or via an event such as a security breach, anonymization of datasets can be employed. In this report, the impact of the application of 5-anonymity to the adult dataset on a classifier based on a neural network predicting whether people had an income exceeding $50,000 was investigated using precision, recall and accuracy. The classifier was trained using the non-anonymized data, the anonymized data, and the non-anonymized data with those attributes which were suppressed in the anonymized data removed. The result was that average accuracy dropped from 0.82 to 0.76, precision from 0.58 to 0.50, and recall increased from 0.82 to 0.87. The average values and distributions seem to support the estimation that the majority of the performance impact of anonymization in this case comes from the suppression of attributes.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:his-17918
Date January 2019
CreatorsPaulson, Jörgen
PublisherHögskolan i Skövde, Institutionen för informationsteknologi
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0095 seconds