• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • Tagged with
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Privacy Preserving Data Mining Operations without Disrupting Data Quality

B.Swapna, R.VijayaPrakash 01 December 2012 (has links)
Data mining operations have become prevalent as they can extract trends or patterns that help in taking good business decisions. Often they operate on large historical databases or data warehouses to obtain actionable knowledge or business intelligence that helps in taking well informed decisions. In the data mining domain there came many tools to perform data mining operations. These tools are best used to obtain actionable knowledge from data. Manually doing this is not possible as the data is very huge and takes lot of time. Thus the data mining domain is being improved in a rapid pace. While data mining operations are very useful in obtaining business intelligence, they also have some drawbacks that are they get sensitive information from the database. People may misuse the freedom given by obtaining sensitive information illegally. Preserving privacy of data is also important. Towards this end many Privacy Preserving Data Mining (PPDM) algorithms came into existence that sanitize data to prevent data mining algorithms from extracting sensitive information from the databases. / Data mining operations help discover business intelligence from historical data. The extracted business intelligence or actionable knowledge helps in taking well informed decisions that leads to profit to the organization that makes use of it. While performing mining privacy of data has to be given utmost importance. To achieve this PPDM (Privacy Preserving Data Mining) came into existence by sanitizing database that prevents discovery of association rules. However, this leads to modification of data and thus disrupting the quality of data. This paper proposes a new technique and algorithms that can perform privacy preserving data mining operations while ensuring that the data quality is not lost. The empirical results revealed that the proposed technique is useful and can be used in real world applications.
2

Privacy Preserving Data Mining using Unrealized Data Sets: Scope Expansion and Data Compression

Fong, Pui Kuen 16 May 2013 (has links)
In previous research, the author developed a novel PPDM method – Data Unrealization – that preserves both data privacy and utility of discrete-value training samples. That method transforms original samples into unrealized ones and guarantees 100% accurate decision tree mining results. This dissertation extends their research scope and achieves the following accomplishments: (1) it expands the application of Data Unrealization on other data mining algorithms, (2) it introduces data compression methods that reduce storage requirements for unrealized training samples and increase data mining performance and (3) it adds a second-level privacy protection that works perfectly with Data Unrealization. From an application perspective, this dissertation proves that statistical information (i. e. counts, probability and information entropy) can be retrieved precisely from unrealized training samples, so that Data Unrealization is applicable for all counting-based, probability-based and entropy-based data mining models with 100% accuracy. For data compression, this dissertation introduces a new number sequence – J-Sequence – as a mean to compress training samples through the J-Sampling process. J-Sampling converts the samples into a list of numbers with many replications. Applying run-length encoding on the resulting list can further compress the samples into a constant storage space regardless of the sample size. In this way, the storage requirement of the sample database becomes O(1) and the time complexity of a statistical database query becomes O(1). J-Sampling is used as an encryption approach to the unrealized samples already protected by Data Unrealization; meanwhile, data mining can be performed on these samples without decryption. In order to retain privacy preservation and to handle data compression internally, a column-oriented database management system is recommended to store the encrypted samples. / Graduate / 0984 / fong_bee@hotmail.com

Page generated in 0.0133 seconds