  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

New Record Ordering Heuristics for Multivariate Microaggregation

Heaton, William 01 January 2011
Microaggregation is a method of statistical disclosure control that attempts to reconcile the need to release information to researchers with the need to protect the privacy of individual records in a dataset. Under microaggregation, records are divided into groups containing at least k members. The actual data values of the members are replaced by the mean value of the group, so that each record in the group is indistinguishable from at least k-1 other records. The goal of microaggregation is to create groups of similar records such that information loss is minimized, where information loss is the sum of squared deviations between the actual data values and the group means. Optimal multivariate microaggregation is an NP-hard problem, and heuristics have been proposed to generate solutions in reasonable running time. New heuristics are desirable either for producing groups with lower information loss or for producing groups with similar information loss at lower computational complexity. Some of the best-performing existing microaggregation heuristics are based on record ordering, since it has been proven that, for a given ordering of records, the optimal set of groups for that particular ordering can be computed efficiently. This dissertation improves on previous heuristics that order the records in a dataset and subsequently use this ordering to generate high-quality microaggregated k-partitions. This was accomplished by using heuristics from the traveling salesman problem (TSP) literature to order the records more effectively. In particular, two tour construction heuristics - the Greedy heuristic and the Quick Boruvka heuristic - that are comparable in complexity to extant microaggregation methods were investigated. Next, three tour improvement heuristics - 2-opt, 3-opt, and Lin-Kernighan - were applied to the constructed tours to investigate whether a further reduction in information loss could be achieved. The tour improvement heuristics - particularly the 3-opt and Lin-Kernighan heuristics - provided microaggregation solutions better than the best previously known solutions across several datasets and values of k.
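The information-loss measure described in this abstract can be illustrated with a minimal Python sketch. It is not the dissertation's code: the function names `microaggregate` and `fixed_size_groups`, the toy data, and the simple rule of cutting an ordering into consecutive blocks of k records (leftovers joining the last block) are all assumptions made for illustration.

```python
def group_mean(records):
    """Component-wise mean (centroid) of a non-empty list of records."""
    d = len(records[0])
    return [sum(r[j] for r in records) / len(records) for j in range(d)]


def microaggregate(data, groups):
    """Replace every record by its group centroid.

    Returns the masked records and the information loss, i.e. the sum of
    squared deviations between original records and group centroids.
    """
    masked = [None] * len(data)
    sse = 0.0
    for idx in groups:
        centroid = group_mean([data[i] for i in idx])
        for i in idx:
            masked[i] = centroid
            sse += sum((x - c) ** 2 for x, c in zip(data[i], centroid))
    return masked, sse


def fixed_size_groups(order, k):
    """Hypothetical grouping rule: cut an ordering into consecutive groups
    of k records; leftover records join the last group, so every group
    has at least k members."""
    n = len(order)
    if n < k:
        raise ValueError("need at least k records")
    full = n - n % k
    groups = [list(order[i:i + k]) for i in range(0, full, k)]
    groups[-1].extend(order[full:])
    return groups


# Toy one-dimensional data (illustrative values only).
data = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
order = sorted(range(len(data)), key=lambda i: data[i][0])
groups = fixed_size_groups(order, k=3)
masked, sse = microaggregate(data, groups)
print(groups, sse)  # groups [[0, 1, 2], [3, 4, 5]], information loss 4.0
```

A better record ordering puts similar records next to each other, which is what lowers the sum of squared deviations for whatever grouping is then built on that ordering.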
2

Contribucions a la microagregació per a la protecció de dades estadístiques (Contributions to Microaggregation for the Protection of Statistical Data)

Torres Aragó, Àngel 08 September 2003
THESIS SUMMARY

This Ph.D. thesis deals with topics related to the protection of the confidentiality of statistical data disseminated by statistical offices. Beyond presenting the state of the art of the most relevant perturbative techniques for statistical disclosure control of continuous microdata, the general objective of this thesis is to analyze and improve such techniques through the use of mathematical statistics. Improvements are achieved in at least one of the following three directions:

1) Increase the protection level, i.e. the level of protection of sensitive information in the data being published.
2) Decrease information loss, i.e. the loss of data utility caused by the application of statistical disclosure control techniques.
3) Decrease computational complexity, i.e. the computation time inherent to the application of statistical disclosure control techniques.

The analysis and improvement mentioned in the general objectives of this thesis have been applied to a specific statistical disclosure control technique for continuous microdata. This technique, known as microaggregation, basically consists of clustering individual records in the data set in order to reduce disclosure risk.

The contributions of this thesis can be classified as follows:

1. Contributions to univariate microaggregation methods, which are mainly used to treat univariate continuous microdata.
2. Contributions to multivariate microaggregation methods, which are mainly used to treat multivariate continuous microdata (observations of several variables).
3. Comparative measures for perturbative methods.

1. Univariate microaggregation
1.1. An analytical study using order statistics has been carried out to assess the security of individual ranking microaggregation.
1.2. The quality of individual ranking microaggregation has been compared with the quality of other statistical disclosure control methods for continuous microdata. Quality is measured as the balance between information loss and disclosure risk.

2. Multivariate microaggregation
2.1. A new multivariate microaggregation method called "modified maximum distance" (MMD) has been presented. MMD is a modification of a previous method called "maximum distance" (MD). The computational complexities of MMD and MD have been compared.
2.2. The quality of MMD has been compared with the quality of other statistical disclosure control methods for continuous microdata. Quality is again measured as the balance between information loss and disclosure risk.
2.3. An analytical study has been carried out to compute the number of possible partitions of a set of p observed variables into h-1 sets of size s and one set of size s+r, where p=hs+r.
2.4. A study has been carried out on the number of variables that the sets of a partition should contain when the MMD method is run on it, so that the resulting modified data set has good quality.
2.5. A study has been performed on the combination of variables within the sets that form a partition. Together with the previous study on the number of variables, it gives the user of multivariate microaggregation a guideline for deciding how many and which variables should form the partition of the set of variables on which MMD is to be run in order to obtain a modified data set with better quality.

3. Comparative measures
3.1. A distinction has been established between the different natures of the measures used to compare perturbative methods.
3.2. A weighting of the various measures has been proposed which takes into account the natures identified in the previous item.
3.3. A new disclosure risk measure has been introduced which consists of a confidence interval based on standard deviations (rather than on ranks, as proposed in previous work). This new measure is especially suited for skewed data.
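The counting problem mentioned in item 2.3 can be sketched in Python. This is a standard multinomial count, not necessarily the derivation used in the thesis. It assumes the sets are unlabeled and that 0 <= r < s, so that h and r are the quotient and remainder of p divided by s; the function name `count_partitions` is hypothetical.

```python
from math import factorial


def count_partitions(p: int, s: int) -> int:
    """Number of ways to split p variables into h-1 unlabeled sets of size s
    and one set of size s+r, where p = h*s + r.

    Assumption (not stated in the abstract): 0 <= r < s, so h and r are the
    quotient and remainder of p divided by s.
    """
    if s < 1 or p < s:
        raise ValueError("need 1 <= s <= p")
    h, r = divmod(p, s)
    if r == 0:
        # All h sets have the same size, so any of the h! reorderings of
        # the sets describes the same partition.
        return factorial(p) // (factorial(s) ** h * factorial(h))
    # The larger set is distinguishable by its size; only the h-1 sets of
    # size s are interchangeable.
    return factorial(p) // (
        factorial(s) ** (h - 1) * factorial(s + r) * factorial(h - 1)
    )


print(count_partitions(7, 2))  # two pairs and one triple: 105
```

For example, with p=7 and s=2 (so h=3, r=1) there are 35 ways to choose the set of size 3 and 3 ways to pair up the remaining four variables, giving 105 partitions.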
3

Population-Based Ant Colony Optimization for Multivariate Microaggregation

Askut, Ann Ahu 01 January 2013
Numerous organizations collect and distribute non-aggregate personal data for a variety of purposes, including demographic and public health research. In these situations, the data distributor is responsible for protecting the anonymity and personal information of individuals. Microaggregation is one of the most commonly used statistical disclosure control methods. In microaggregation, the set of original records is first partitioned into several groups such that the records in the same group are similar to each other and each group contains at least k records. Each record is then replaced by the mean value of its group (the centroid). The confidentiality of records is protected because each group has at least k records, so each record is indistinguishable from at least k-1 other records in the microaggregated dataset. The goal of this process is to keep within-group homogeneity high and information loss low, where information loss is the sum of squared deviations between the actual records and the group centroids. Several heuristics have been proposed for the NP-hard minimum information loss microaggregation problem. Among the most promising methods is the multivariate Hansen-Mukherjee (MHM) algorithm, which uses a shortest path algorithm to identify the best partition consistent with a specified ordering of records. Developing improved heuristics for ordering multivariate points for microaggregation remains an open research challenge. This dissertation adapts a version of the population-based ant colony optimization algorithm (P-ACO) to order records, and applies the MHM algorithm iteratively to improve the quality of the grouping. Results of computational experiments on benchmark test problems indicate that the P-ACO/MHM-based microaggregation algorithm yields information loss comparable to or lower than that obtained by extant methods.
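The idea of finding the best partition consistent with a fixed record ordering can be sketched as a small dynamic program, which is equivalent to a shortest path over cut positions. This is an illustrative sketch and not the MHM implementation used in the dissertation. It assumes groups are consecutive runs of the ordering with between k and 2k-1 records each; the function name `optimal_partition` and the toy data are hypothetical.

```python
import math


def optimal_partition(ordered, k):
    """Best split of an already ordered record list into consecutive groups.

    Assumption: every group holds between k and 2k-1 records. Returns the
    minimum information loss (sum of squared deviations from group
    centroids) and the groups as half-open (start, end) index pairs.
    """
    n = len(ordered)
    if n < k:
        raise ValueError("need at least k records")
    d = len(ordered[0])

    # Prefix sums let the loss of any segment be computed in O(d).
    pre = [[0.0] * d]
    pre_sq = [0.0]
    for rec in ordered:
        pre.append([a + x for a, x in zip(pre[-1], rec)])
        pre_sq.append(pre_sq[-1] + sum(x * x for x in rec))

    def seg_sse(i, j):
        # Loss of records i..j-1: sum ||x||^2 - ||sum x||^2 / m.
        m = j - i
        norm_sq = sum((pre[j][t] - pre[i][t]) ** 2 for t in range(d))
        return (pre_sq[j] - pre_sq[i]) - norm_sq / m

    best = [math.inf] * (n + 1)  # best[j]: min loss for the first j records
    back = [-1] * (n + 1)        # back[j]: start of the last group
    best[0] = 0.0
    for j in range(k, n + 1):
        for i in range(max(0, j - 2 * k + 1), j - k + 1):
            if best[i] == math.inf:
                continue
            cost = best[i] + seg_sse(i, j)
            if cost < best[j]:
                best[j], back[j] = cost, i

    # Walk the back pointers to recover the groups.
    groups, j = [], n
    while j > 0:
        groups.append((back[j], j))
        j = back[j]
    groups.reverse()
    return best[n], groups


# Toy ordered one-dimensional data (illustrative values only).
records = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,), (13.0,)]
loss, groups = optimal_partition(records, k=3)
print(loss, groups)  # loss 7.0 with groups [(0, 3), (3, 7)]
```

Because the partition step is exact for a given ordering, the quality of the final result depends entirely on the ordering, which is the part the abstract says the ant colony heuristic is used to improve.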
4

On the use of economic price theory to determine the optimum levels of privacy and information utility in microdata anonymisation

Zielinski, Marek Piotr 09 June 2010
Statistical data, for example in the form of microdata, is used by different organisations as a basis for creating knowledge to assist in their planning and decision-making activities. However, before microdata can be made available for analysis, it needs to be anonymised in order to protect the privacy of the individuals whose data is released. The protection of privacy requires us to hide or obscure the released data. On the other hand, making data useful for its users implies that we should provide data that is accurate, complete and precise. Ideally, we should maximise both the level of privacy and the level of information utility of a released microdata set. However, as we increase the level of privacy, the level of information utility decreases. Without guidelines for selecting the levels of privacy and information utility, it is difficult to determine the optimum balance between the two goals. The objective and constraints of this optimisation problem can be captured naturally with concepts from Economic Price Theory. In this thesis, we present an approach based on Economic Price Theory for guiding the process of microdata anonymisation such that optimum levels of privacy and information utility are achieved. / Thesis (PhD)--University of Pretoria, 2010. / Computer Science / unrestricted
5

The role of extracellular polymeric substances from microbes in soil aggregate stabilization in semiarid grasslands

Zethof, Jeroen Hendricus Theodoor 19 July 2021
Soil structural stability plays a pivotal role in landscape preservation when a protective vegetation cover is lacking. For example, under semiarid climates seasonal rainfall cannot sustain a full vegetation cover, but still causes soil erosion. With the loss of (fertile) soil material, ecosystem productivity declines and less C can be stored. In natural semiarid systems, soil erosion is a spatially heterogeneous process, whereby highly erodible spots alternate with areas of improved soil structure under the sparse canopy cover, creating a very heterogeneous landscape. Although the physical protection by the plant canopy is well understood, the potential influence of soil archaea and bacteria on soil structural stability in relation to plants and parent material is less well known. Mainly from studies under controlled conditions, we know that certain archaeal and bacterial species have the ability to produce extracellular polymeric substances (EPS), forming an extracellular matrix. As the formed matrix connects soil particles, EPS seem to have the potential to play a substantial role in soil aggregation, thereby controlling soil erodibility. Little is known about this gluing process by EPS and its importance under natural conditions, as most evidence is derived from controlled conditions in the laboratory. This dissertation aims to unravel the role of EPS from soil archaea and bacteria in soil aggregate stabilization in semiarid grasslands by considering the potential role of plant species and parent material in this process. The sparse vegetation in semiarid grasslands provides a useful gradient in soil organic C contents to study these processes. Improved conditions for soil microbes producing EPS can be found at the root surface, while the bare canopy interspaces lack C/resources. Two sites were selected in southeast Spain, mainly differing in graphitic C, inorganic C and nitrogen contents. 
At both sites, soil adjacent to the widely occurring legume shrub Anthyllis cytisoides and to grass tussocks of Macrochloa tenacissima was sampled during two campaigns. The first sampling campaign in April 2017 focused on the topsoil, whereby a distance gradient from the plant stem to the bare intercanopy area was sampled. The second sampling campaign in April 2018 focused more on the effect of plant roots on soil archaeal and bacterial communities by including the rhizosphere. As the parent material of the Rambla Honda site, one of the two study sites, contains a substantial amount of graphitic C, several methods were tested to quantify the different types of C in these soils to understand their role in shaping EPS contents. Furthermore, the quantification of graphitic C contents opened the possibility to study a potential interaction between graphite minerals and microbes. Although graphitic C contents explained part of the variance in the microbial community, no direct link with EPS-saccharide contents was found. EPS contents were relatively high in the rhizosphere, most notably at the legume shrub Anthyllis cytisoides, and were linked to the enrichment of N-fixing bacteria. However, outside the root-influenced soil, EPS contents were still substantial, whereby the abundance of microbial species previously associated with biofilm formation in other environments indicated that EPS synthesis is not restricted to the rhizosphere. Soil aggregation was linked to EPS-saccharide contents, whereby two mechanisms were hypothesized. Firstly, the strong link between soil wettability and EPS-saccharide content in the soil of the carbonate-poor Rambla Honda site indicated that aggregates become stabilized by hydrophobic bonds created by the EPS. Secondly, results from the carbonate-rich Alboloduy site indicate that EPS have a facilitating role in creating stable aggregates by precipitating carbonates on the EPS structure. 
This likely leads to a higher soil structural stability, as carbonate bonds are more stable when prolonged drought reduces soil biological activity and thereby EPS contents. Overall, EPS play a substantial role in soil aggregate stabilization in semiarid grasslands, whereby EPS contents were increased by legume plants through the enrichment of EPS-producing bacteria.
