As privacy concerns become more important, effective and efficient security techniques will become critical to those who are charged with protecting sensitive information. Agencies that disseminate numerical data encounter a common disclosure control problem called the complementary cell suppression problem. In this problem, cell values in a statistical table that are considered sensitive must be suppressed before the table is made public. However, suppressing only these cells may not provide adequate protection, since their values may be inferred from the available marginal subtotals. To ensure that the values of the sensitive cells cannot be estimated within a specified degree of precision, additional non-sensitive cells, called complementary cells, must also be suppressed. Since suppressing non-sensitive cells diminishes the utility of the released data, the objective in the complementary cell suppression problem is to minimize the information lost due to complementary suppression while guaranteeing that the sensitive cells are adequately protected. The resulting constrained optimization problem is known to be NP-hard and has been a major focus of research in statistical data security.
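The inference risk described above can be illustrated with a minimal sketch. It is not taken from the dissertation: the function name `recover_suppressed`, the 3x3 table, and its values are hypothetical, and the check covers only exact recovery from row and column totals, not the interval estimates within a specified precision that the full problem considers.

```python
def recover_suppressed(table, row_totals, col_totals):
    """Fill in suppressed cells (None) that follow exactly from the marginals.

    Hypothetical helper for illustration: whenever a row or column contains
    exactly one suppressed cell, that cell equals the total minus the rest.
    """
    cells = [row[:] for row in table]  # work on a copy
    n_rows, n_cols = len(cells), len(col_totals)
    progress = True
    while progress:
        progress = False
        for i in range(n_rows):
            missing = [j for j in range(n_cols) if cells[i][j] is None]
            if len(missing) == 1:
                known = sum(v for v in cells[i] if v is not None)
                cells[i][missing[0]] = row_totals[i] - known
                progress = True
        for j in range(n_cols):
            missing = [i for i in range(n_rows) if cells[i][j] is None]
            if len(missing) == 1:
                known = sum(cells[i][j] for i in range(n_rows)
                            if cells[i][j] is not None)
                cells[missing[0]][j] = col_totals[j] - known
                progress = True
    return cells


# Assumed example data: the true table is
# [[10, 20, 30], [15, 5, 25], [12, 18, 8]].
row_totals = [60, 45, 38]
col_totals = [37, 43, 63]

# Only the sensitive cell (0, 0) is suppressed: its row total exposes it.
primary_only = [[None, 20, 30], [15, 5, 25], [12, 18, 8]]
exposed = recover_suppressed(primary_only, row_totals, col_totals)

# Three complementary cells are suppressed as well, so every affected
# row and column holds two unknowns and no cell is exactly recoverable.
with_complements = [[None, None, 30], [None, None, 25], [12, 18, 8]]
protected = recover_suppressed(with_complements, row_totals, col_totals)
```

In this sketch `exposed[0][0]` is recovered as 10, while `protected[0][0]` stays `None`. Passing this exact-recovery check is necessary but not sufficient for protection, since an attacker may still bound a suppressed value more tightly than the required precision allows.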
Several heuristic methods have been developed to find good solutions for the complementary cell suppression problem. More recently, genetic algorithms (GAs) have been used to improve upon these solutions. A problem with these GA-based approaches is that the vast majority of the solutions produced do not protect the sensitive cells. This is because the genetic operators used do not maintain the associations between cells that provide the protection. Consequently, the GA has to include an additional procedure for repairing the solutions. This dissertation details an improved GA-based method for the complementary cell suppression problem that addresses this limitation by designing more effective genetic operators. Specifically, it mitigates the problem of chromosomal repair by developing a crossover operator that maintains the necessary associations. The study also designs an improved mutation operator that exploits domain knowledge to increase the probability of finding good-quality solutions. The proposed GA was evaluated by comparing it to extant methods based on the quality of its evolved solutions and its computational efficiency.
Identifier | oai:union.ndltd.org:nova.edu/oai:nsuworks.nova.edu:gscis_etd-1135 |
Date | 01 January 2010 |
Creators | Ditrich, Eric |
Publisher | NSUWorks |
Source Sets | Nova Southeastern University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | CEC Theses and Dissertations |