Return to search

Privacy-aware data generation : Using generative adversarial networks and differential privacy

Today we are surrounded by IOT devices that constantly generate different kinds of data about its environment and its users. Much of this data could be useful for different research purposes and development, but a lot of this collected data is privacy-sensitive for the individual person. To protect the individual's privacy, we have data protection laws. But these restrictions by laws also dramatically reduce the amount of data available for research and development. Therefore it would be beneficial if we could find a work around that respects people's privacy without breaking the laws while still maintaining the usefulness of data. The purpose of this thesis is to show how we can generate privacy-aware data from a dataset by using Generative Adversarial Networks (GANS) and Differential Privacy (DP), that maintains data utility. This is useful because it allows for the sharing of privacy-preserving data, so that the data can be used in research and development with concern for privacy. GANS is used for generating synthetic data. DP is an anonymization technique of data. With the combination of these two techniques, we generate synthetic-privacy-aware data from an existing open-source Fitbit dataset. The specific type of GANS model that is used is called CTGAN and differential privacy is achieved with the help of gaussian noise. The results from the experiments performed show many similarities between the original dataset and the experimental datasets. The experiments performed very well at the Kolmogorov Smirnov test, with the lowest P-value of all experiments sitting at 0.92. The conclusion that is drawn is that this is another promising methodology for creating privacy-aware-synthetic data, that maintains reasonable data utility while still utilizing DP techniques to achieve data privacy.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-191512
Date January 2022
CreatorsHübinette, Felix
PublisherUmeå universitet, Institutionen för datavetenskap
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess
RelationUMNAD ; 1301

Page generated in 0.0025 seconds