Spelling suggestions: "subject:"privacy preserving data publishing"" "subject:"privacy preserving data ublishing""
1 |
Privacy in Complex Sample Based SurveysShawn A Merrill (11806802) 20 December 2021 (has links)
In the last few decades, there has been a dramatic uptick in the issues related to protecting user privacy in released data, both in statistical databases and anonymized records. Privacy-preserving data publishing is a field established to handle these releases while avoiding the problems that plagued many earlier attempts. This issue is of particular importance for governmental data, where both the release and the privacy requirements are frequently governed by legislature (e.g., HIPAA, FERPA, Clery Act). This problem is doubly compounded by the complex survey methods employed to counter problems in data collection. The preeminent definition for privacy is that of differential privacy, which protects users by limiting the impact that any individual can have on the result of any query. <br><br>The thesis proposes models for differentially private versions of current survey methodologies and, discusses the evaluation of those models. We focus on the issues of missing data and weighting which are common techniques employed in complex surveys to counter problems with sampling and response rates. First we propose a model for answering queries on datasets with missing data while maintaining differential privacy. Our model uses k-Nearest Neighbor imputation to replicate donor values while protecting the privacy of the donor. Our model provides significantly better bias reduction in realistic experiments using existing data, as well as providing less noise than a naive solution. Our second model proposes a method of performing Iterative Proportional Fitting (IPF) in a differentially private manner, a common technique used to ensure that survey records are weighted consistently with known values. We also focus on the general philosophical need to incorporate privacy when creating new survey methodologies, rather than assuming that privacy can simply be added at a later step.
|
2 |
Towards a Privacy Preserving Framework for Publishing Longitudinal DataSehatkar, Morvarid January 2014 (has links)
Recent advances in information technology have enabled public organizations and corporations to collect and store huge amounts of individuals' data in data repositories. Such data are powerful sources of information about an individual's life such as interests, activities, and finances. Corporations can employ data mining and knowledge discovery techniques to extract useful knowledge and interesting patterns from large repositories of individuals' data. The extracted knowledge can be exploited to improve strategic decision making, enhance business performance, and improve services. However, person-specific data often contain sensitive information about individuals and publishing such data poses potential privacy risks. To deal with these privacy issues, data must be anonymized so that no sensitive information about individuals can be disclosed from published data while distortion is minimized to ensure usefulness of data in practice. In this thesis, we address privacy concerns in publishing longitudinal data. A data set is longitudinal if it contains information of the same observation or event about individuals collected at several points in time. For instance, the data set of multiple visits of patients of a hospital over a period of time is longitudinal. Due to temporal correlations among the events of each record, potential background knowledge of adversaries about an individual in the context of longitudinal data has specific characteristics. None of the previous anonymization techniques can effectively protect longitudinal data against an adversary with such knowledge. In this thesis we identify the potential privacy threats on longitudinal data and propose a novel framework of anonymization algorithms in a way that protects individuals' privacy against both identity disclosure and attribute disclosure, and preserves data utility. Particularly, we propose two privacy models: (K,C)^P -privacy and (K,C)-privacy, and for each of these models we propose efficient algorithms for anonymizing longitudinal data. An extensive experimental study demonstrates that our proposed framework can effectively and efficiently anonymize longitudinal data.
|
3 |
Rigorous and Flexible Privacy Protection Framework for Utilizing Personal Spatiotemporal Data / 個人時空間データ利活用のための厳密で柔軟なプライバシ保護フレムワークYang, Cao 23 March 2017 (has links)
京都大学 / 0048 / 新制・課程博士 / 博士(情報学) / 甲第20508号 / 情博第636号 / 新制||情||110(附属図書館) / 京都大学大学院情報学研究科社会情報学専攻 / (主査)教授 吉川 正俊, 教授 田中 克己, 教授 岡部 寿男 / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM
|
Page generated in 0.1004 seconds