1 |
Privacy Preserving Data Mining Operations without Disrupting Data Quality
B. Swapna, R. VijayaPrakash, 01 December 2012
Data mining operations have become prevalent because they
extract trends and patterns that support good business
decisions. They typically operate on large historical databases
or data warehouses to obtain actionable knowledge, or
business intelligence. Many tools have emerged to perform
these operations, because manual analysis is impractical:
the data is too large and processing it by hand takes too
long. As a result, the data mining domain is advancing at a
rapid pace. While data mining operations are very useful for
obtaining business intelligence, they also have a drawback:
they can expose sensitive information held in the database,
and people may misuse that access to obtain sensitive
information illegally. Preserving the privacy of data is
therefore also important. To this end, many Privacy Preserving
Data Mining (PPDM) algorithms have been developed that
sanitize data to prevent data mining algorithms from
extracting sensitive information from the databases. / Data mining operations help discover business intelligence from
historical data. The extracted business intelligence, or actionable
knowledge, supports well-informed decisions that lead to profit
for the organization that uses it. While mining is performed,
the privacy of data must be given the utmost importance. To
achieve this, PPDM (Privacy Preserving Data Mining) sanitizes
the database in a way that prevents the discovery of
association rules. However, sanitization modifies the data and
thus degrades its quality. This paper proposes a new
technique and algorithms that perform privacy preserving
data mining operations while ensuring that data quality is not
lost. The empirical results show that the proposed technique is
useful and can be used in real-world applications.
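
The abstract does not describe the proposed algorithm, so the following is only a minimal sketch of the general idea it refers to: sanitizing a transaction database so that an itemset behind an association rule falls below a support threshold. The function names (support, sanitize), the choice of which item to remove, and the toy data are all assumptions made for illustration, not the authors' method.

```python
from typing import FrozenSet, List, Tuple

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def sanitize(
    transactions: List[Transaction],
    sensitive: FrozenSet[str],
    victim: str,
    max_support: float,
) -> Tuple[List[Transaction], int]:
    """Remove `victim` from supporting transactions, one at a time, until the
    support of `sensitive` is at most `max_support`.

    Returns the sanitized copy and the number of modified transactions.
    (Hypothetical helper: real PPDM algorithms choose victims more carefully.)
    """
    result = list(transactions)
    changed = 0
    for i, t in enumerate(result):
        if support(result, sensitive) <= max_support:
            break
        if sensitive <= t:
            result[i] = t - {victim}
            changed += 1
    return result, changed


# Toy database (assumed data, not from the paper).
db = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread", "milk"}),
    frozenset({"eggs"}),
]
sensitive = frozenset({"bread", "milk"})
clean, changed = sanitize(db, sensitive, victim="milk", max_support=0.25)
```

In this toy run the itemset is hidden, but the support of "milk" on its own also drops, which is the kind of side effect on data quality that the abstract says the proposed technique aims to avoid.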
|
2 |
Privacy Preserving Data Mining using Unrealized Data Sets: Scope Expansion and Data Compression
Fong, Pui Kuen, 16 May 2013
In previous research, the author developed a novel PPDM method, Data Unrealization, that preserves both the privacy and the utility of discrete-value training samples. That method transforms original samples into unrealized ones and guarantees 100% accurate decision tree mining results. This dissertation extends the scope of that research with three contributions: (1) it expands the application of Data Unrealization to other data mining algorithms, (2) it introduces data compression methods that reduce storage requirements for unrealized training samples and increase data mining performance, and (3) it adds a second level of privacy protection that works together with Data Unrealization.
From an application perspective, this dissertation proves that statistical information (i.e. counts, probability and information entropy) can be retrieved precisely from unrealized training samples, so that Data Unrealization is applicable to all counting-based, probability-based and entropy-based data mining models with 100% accuracy.
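
To illustrate why exact counts are sufficient, here is a minimal sketch, assuming a plain list of discrete class labels, of how probability and information entropy are derived from counts alone. It does not implement Data Unrealization; the function names and sample labels are hypothetical. The point is only that any representation from which the counts can be recovered exactly yields the same probabilities and entropy.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, Iterable


def class_counts(labels: Iterable[Hashable]) -> Counter:
    """Count how often each discrete value occurs."""
    return Counter(labels)


def probabilities(counts: Counter) -> Dict[Hashable, float]:
    """Relative frequency of each value, computed from counts only."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items()}


def entropy(counts: Counter) -> float:
    """Shannon entropy in bits, computed from counts only."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in counts.values() if n > 0)


# Assumed sample labels for illustration.
labels = ["yes", "yes", "no", "no"]
counts = class_counts(labels)
probs = probabilities(counts)
bits = entropy(counts)
```

A decision tree learner that splits on information gain needs nothing beyond such counts per attribute value and class, which is consistent with the abstract's claim about counting-based, probability-based and entropy-based models.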
For data compression, this dissertation introduces a new number sequence, the J-Sequence, as a means to compress training samples through the J-Sampling process. J-Sampling converts the samples into a list of numbers with many repetitions. Applying run-length encoding to the resulting list further compresses the samples into constant storage space regardless of the sample size. In this way, the storage requirement of the sample database becomes O(1) and the time complexity of a statistical database query becomes O(1).
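
The run-length encoding step can be sketched as follows. This is a generic illustration in Python on an assumed list of integers; it does not reproduce the J-Sequence or J-Sampling, whose definitions are not given in the abstract, and the function names are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse each run of equal adjacent values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, length in pairs for _ in range(length)]


# Assumed input: a list with many repeated values.
numbers = [3, 3, 3, 7, 7, 1]
encoded = rle_encode(numbers)
decoded = rle_decode(encoded)
```

The encoded size depends on the number of runs rather than on the length of the list, so a list made of a bounded number of long runs stays small no matter how many samples it represents.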
J-Sampling is used as an encryption approach to the unrealized samples already protected by Data Unrealization; meanwhile, data mining can be performed on these samples without decryption. In order to retain privacy preservation and to handle data compression internally, a column-oriented database management system is recommended to store the encrypted samples. / Graduate / 0984 / fong_bee@hotmail.com
|