Return to search

Semantic attack on transaction data anonymised by set-based generalisation

Publishing data that contains information about individuals may lead to privacy breaches. However, data publishing is useful to support research and analysis. Therefore, privacy protection in data publishing becomes important and has received much recent attention. To improve privacy protection, many researchers have investigated how secure the published data is by designing de-anonymisation methods to attack anonymised data. Most of the de-anonymisation methods consider anonymised data in a syntactic manner. That is, items in a dataset are considered to be contextless or even meaningless literals, and they have not considered the semantics of these data items. In this thesis, we investigate how secure the anonymised data is under attacks that use semantic information. More specifically, we propose a de-anonymisation method to attack transaction data anonymised by set-based generalisation. Set-based generalisation protects data by replacing one item by a set of items, so that the identity of an individual can be hidden. Our goal is to identify those items that are added to a transaction during generalisation. Our attacking method has two components: scoring and elimination. Scoring measures semantic relationship between items in a transaction, and elimination removes items that are deemed not to be in the original transaction. Our experiments on both real and synthetic data show that set-based generalisation may not provide adequate protection for transaction data, and about 70% of the items added to the transactions during generalisation can be detected by our method with a precision greater than 85%.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:655956
Date January 2015
CreatorsOng, Hoang
PublisherCardiff University
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://orca.cf.ac.uk/74553/

Page generated in 0.0123 seconds