Global ETD Search

61	A comparative analysis of database sanitization techniques for privacy-preserving association rule mining / En jämförande analys av tekniker för databasanonymisering inom sekretessbevarande associationsregelutvinning Mårtensson, Charlie January 2023 Association rule hiding (ARH) is the process of modifying a transaction database to prevent sensitive patterns (association rules) from being discovered by data miners. An optimal ARH technique hides all sensitive patterns while leaving all nonsensitive patterns public. In practice, however, many ARH algorithms cause undesirable side effects, such as failing to hide sensitive rules or mistakenly hiding nonsensitive ones. Evaluating the utility of ARH algorithms therefore involves measuring the side effects they cause. A wide array of ARH techniques is in use, with evolutionary algorithms in particular gaining popularity in recent years. However, previous research in the area has focused on incremental improvement of existing algorithms. No work was found that compares the performance of ARH algorithms without the incentive of promoting a newly suggested algorithm as superior. To fill this research gap, this project compares three ARH algorithms developed between 2019 and 2022 (ABC4ARH, VIDPSO, and SA-MDP) using identical and unbiased parameters. The algorithms were run on three real databases and three synthetic ones of various sizes, in each case given four different sets of sensitive rules to hide. Their performance was measured in terms of side effects, runtime, and scalability (i.e., performance on increasing database size). The performance of the algorithms varied considerably depending on the characteristics of the input data, with no algorithm consistently outperforming the others at mitigating side effects. VIDPSO was the most efficient in terms of runtime, while ABC4ARH maintained the most robust performance as the database size increased. 
However, results matching the quality of those in the papers originally describing each algorithm could not be reproduced, showing a clear need for validating the reproducibility of research before the results can be trusted. / Association rule hiding (ARH) is a process that modifies a transaction database to prevent sensitive patterns (so-called association rules) from being discovered through data mining. An optimal ARH technique successfully hides all sensitive patterns while all nonsensitive patterns remain openly available. In practice, however, ARH algorithms commonly cause unwanted side effects. For example, they may fail to hide certain sensitive rules or hide nonsensitive rules by mistake. Evaluating the usefulness of ARH algorithms therefore includes measuring these side effects. Among the large selection of ARH techniques, evolutionary algorithms in particular have grown in popularity in recent years. Previous research in the area has, however, focused on incremental improvement of existing algorithms. No research was found that compared ARH algorithms without the underlying incentive of highlighting the superiority of a newly developed algorithm. This project aims to fill this gap in the research by comparing three ARH algorithms developed between 2019 and 2022 (ABC4ARH, VIDPSO, and SA-MDP) using identical and independent parameters. The algorithms were run on six databases, three taken from the real world and three synthetic ones of varying size, and were in every case given four different sets of sensitive rules to hide. Performance was measured in terms of side effects, execution time, and scalability (i.e., performance as the database size grows). The algorithms’ performance varied considerably depending on the characteristics of the input data. No algorithm was consistently superior to the others at minimizing side effects. VIDPSO was the most time-efficient, while ABC4ARH was the most robust in handling growing input. Results on par with those measured in the research papers that originally presented each algorithm could not be reproduced, which indicates a need to validate the reproducibility of research before its results can be considered reliable. Association rule hiding privacy-preserving data mining evolutionary algorithms performance evaluation Associationsregeldöljning sekretessbevarande datautvinning evolutionära algoritmer prestandaevaluering Computer and Information Sciences Data- och informationsvetenskap
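The side effects named in the abstract above (sensitive rules that stay discoverable, nonsensitive rules that get hidden by mistake) can be counted by comparing the rules mined before and after sanitization. The following is a minimal sketch under stated assumptions, not the thesis's evaluation code: the function name `side_effects`, the rule encoding as (antecedent, consequent) tuples, and the extra "ghost rules" category are illustrative choices.

```python
def side_effects(rules_before, rules_after, sensitive):
    """Compare rule sets mined before and after sanitization.

    Each rule is any hashable value; here (antecedent, consequent) tuples.
    All names are hypothetical and chosen for illustration only.
    """
    rules_before = set(rules_before)
    rules_after = set(rules_after)
    # Only rules that were minable in the original data can be "sensitive".
    sensitive = set(sensitive) & rules_before
    nonsensitive = rules_before - sensitive

    return {
        # Sensitive rules that are still discoverable after sanitization.
        "hiding_failures": sensitive & rules_after,
        # Nonsensitive rules that were hidden by mistake.
        "lost_rules": nonsensitive - rules_after,
        # Rules that did not exist before but appear after sanitization
        # (an assumed extra category, not mentioned in the abstract).
        "ghost_rules": rules_after - rules_before,
    }
```

With such a helper, algorithms run on the same database and the same sensitive rule set can be compared by the sizes of the three returned sets.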
62	Anonymous Opt-Out and Secure Computation in Data Mining Shepard, Samuel Steven 09 November 2007 No description available. Computer Science collusion resistance secure sum edge-disjoint hamiltonian cycle bit-partitioned privacy-preserving data mining anonymous opt-out ID assignment
63	Privacy-Preserving Ontology Publishing for EL Instance Stores: Extended Version Baader, Franz, Kriegel, Francesco, Nuradiansyah, Adrian 20 June 2022 We make a first step towards adapting an existing approach for privacy-preserving publishing of linked data to Description Logic (DL) ontologies. We consider the case where both the knowledge about individuals and the privacy policies are expressed using concepts of the DL EL, which corresponds to the setting where the ontology is an EL instance store. We introduce the notions of compliance of a concept with a policy and of safety of a concept for a policy, and show how optimal compliant (safe) generalizations of a given EL concept can be computed. In addition, we investigate the complexity of the optimality problem. info:eu-repo/classification/ddc/004 ddc:004
64	Un modèle rétroactif de réconciliation utilité-confidentialité sur les données d’assurance Rioux, Jonathan 04 1900 Confidential data sharing is a concern for many stakeholders, regardless of the field. Research is evolving rapidly, but the lack of solutions suited to the reality of an enterprise slows the adoption of good business practices for protecting sensitive information. In this thesis we propose a modular, scalable and complete solution named PEPS, configured for use in the insurance domain. We evaluate the entire cycle of confidential sharing, from data management to disclosure, including the management of external forces and anonymization. PEPS stands out in that it uses the context of the problem at hand and domain-specific information to adjust itself and maximize the use of the anonymized dataset. To this end, we present a strongly contextualized anonymity algorithm as well as performance measures adjusted to experience analyses. / Privacy-preserving data sharing is a challenge for almost any enterprise nowadays, whatever its field of expertise. Research is evolving at a rapid pace, but there is still a lack of adapted and adaptable solutions for best business practices regarding the management and sharing of privacy-aware datasets. To address this problem, we offer PEPS, a modular, upgradeable and end-to-end system tailored to the needs of insurance companies and researchers. We take into account the entire cycle of sharing data, from data management to publication, while negotiating with external forces and policies. Our system distinguishes itself by taking advantage of domain-specific and problem-specific knowledge to tailor itself to the situation and increase the utility of the resulting dataset. 
To this end, we also present a strongly contextualised privacy algorithm and adapted utility measures to evaluate the performance of a successful disclosure of experience analysis. Partage confidentiel de données Gestion de la confidentialité Données d’assurance Privacy-preserving data sharing Confidentiality management Insurance data Utility measures for anonymized datasets
65	Energy efficient secure and privacy preserving data aggregation in Wireless Sensor Networks Memon, Irfana 12 November 2013 Wireless sensor networks are composed of sensor nodes able to measure certain environmental parameters, process the collected information, and communicate by radio without any other infrastructure. Communication with other nodes consumes the most energy. Data collection protocols for wireless sensor networks must therefore have minimizing communication as their primary objective. A technique often used for this is data aggregation. Wireless sensor networks are often deployed in open environments and are therefore vulnerable to security attacks. This thesis is a contribution to the design of secure protocols for wireless sensor networks. We classify the main data aggregation protocols that have security properties. We propose a new aggregation protocol (ESPPA). ESPPA is based on the construction of a secure spanning tree and uses a shuffling technique to ensure confidentiality and privacy. Our algorithm for constructing (and reconstructing) the secure spanning tree takes possible sensor node failures into account. Our simulation results show that ESPPA provides security in terms of confidentiality and privacy, and generates less communication than SMART. Finally, we propose an extension of the secure spanning tree construction scheme that identifies nodes that are redundant in terms of sensing coverage and puts them to sleep. Our simulation results show the effectiveness of the proposed extension. 
/ WSNs are formed by sensor nodes that have the ability to sense the environment, process the sensed information, and communicate via radio without any additional prior backbone infrastructure. In WSNs, communication with other nodes is the most energy-consuming task. Hence, the primary objective in designing protocols for WSNs is to minimize communication overhead. This is often achieved using in-network data aggregation. As WSNs are often deployed in open environments, they are vulnerable to security attacks. This thesis contributes toward the design of energy-efficient, secure and privacy-preserving data aggregation protocols for WSNs. First, we classify the main existing secure and privacy-preserving data aggregation protocols for WSNs in the literature. We then propose an energy-efficient secure and privacy-preserving data aggregation (ESPPA) scheme for WSNs. The ESPPA scheme is tree-based and achieves confidentiality and privacy using a shuffling technique. We propose a secure tree construction (ST) and tree-reconstruction scheme. Simulation results show that the ESPPA scheme effectively preserves privacy and confidentiality, and has less communication overhead than SMART. Finally, we propose an extension of the ST scheme, called the secure coverage tree (SCT) construction scheme. SCT applies sleep scheduling. Through simulations, we show the efficacy and efficiency of the SCT scheme. Besides the work on secure and privacy-preserving data aggregation, during my research period we have also worked on another interesting topic (i.e., composite event detection for WSNs). Appendix B presents a complementary work on composite event detection for WSNs. Réseaux de capteurs sans fil Agrégation de données Consommation d’énergie Wireless sensor networks Aggregation tree construction Data-aggregation Energy efficiency 004
66	Secure and Efficient Comparisons between Untrusted Parties Beck, Martin 11 September 2018 A vast number of online services are based on users contributing their personal information. Examples are manifold, including social networks, electronic commerce, sharing websites, lodging platforms, and genealogy. In all cases user privacy depends on collective trust in all involved intermediaries, such as service providers, operators, administrators or even help desk staff. A single adversarial party in the whole chain of trust voids user privacy. Moreover, the number of intermediaries is ever growing. Thus, user privacy must be preserved at every time and stage, independent of the intrinsic goals of any involved party. Furthermore, alongside these new services, traditional offline analytic systems are being replaced by online services run in large data centers. Centralized processing of electronic medical records, genomic data or other health-related information is anticipated due to advances in medical research, better analytic results based on large amounts of medical information, and lowered costs. In these scenarios privacy is of utmost concern due to the large amount of personal information contained within the centralized data. We focus on the challenge of privacy-preserving processing of genomic data, specifically comparing genomic sequences. The problem that arises is how to efficiently compare the private sequences of two parties while preserving the confidentiality of the compared data. It follows that the privacy of the data owner must be preserved, which means that as little information as possible must be leaked to any party participating in the comparison. Leakage can happen at several points during a comparison. The secured inputs for the comparing party might leak some information about the original input, or the output might leak information about the inputs. 
In the latter case, results of several comparisons can be combined to infer information about the confidential input of the party under observation. Genomic sequences serve as a use case, but the proposed solutions are more general and can be applied to the generic field of privacy-preserving comparison of sequences. The solution should be efficient, such that a comparison runs in time linear in the length of the input sequences and thus incurs acceptable costs for a typical use case. To tackle the problem of efficient, privacy-preserving sequence comparisons, we propose a framework consisting of three main parts. a) The basic protocol presents an efficient sequence comparison algorithm, which transforms a sequence into a set representation, so that distance measures over input sequences can be approximated using distance measures over sets. The sets are then represented by an efficient data structure, the Bloom filter, which allows evaluation of certain set operations without storing the actual elements of the possibly large set. This representation yields low distortion when comparing similar sequences. Operations on the set representation are carried out using efficient, partially homomorphic cryptographic systems to keep the inputs confidential. The output can be adjusted to return either the actual approximated distance or the result of an in-range check of the approximated distance. b) Building upon this efficient basic protocol, we introduce the first mechanism to reduce the success of inference attacks by detecting and rejecting similar queries in a privacy-preserving way. This is achieved by generating generalized commitments for inputs. The generalization is done by treating inputs as messages received from a noisy channel, to which error correction from coding theory is applied. In this way, similar inputs are defined as inputs whose generalized inputs have a Hamming distance below a certain predefined threshold. 
We present a protocol that performs a zero-knowledge proof to assess whether the generalized input is indeed a generalization of the actual input. Furthermore, we generalize a very efficient inference attack on privacy-preserving sequence comparison protocols and use it to evaluate our inference-control mechanism. c) The third part of the framework lightens the computational load of the client taking part in the comparison protocol by presenting a compression mechanism for partially homomorphic cryptographic schemes. It reduces the transmission and storage overhead induced by the semantically secure homomorphic encryption schemes, as well as encryption latency. The compression is achieved by constructing an asymmetric stream cipher such that the generated ciphertext can be converted into a ciphertext of an associated homomorphic encryption scheme without revealing any information about the plaintext. This is the first compression scheme available for partially homomorphic encryption schemes. Compression of ciphertexts of fully homomorphic encryption schemes is several orders of magnitude slower at the conversion from the transmission ciphertext to the homomorphically encrypted ciphertext. Indeed, our compression scheme achieves optimal conversion performance. It further allows keystreams to be generated offline and thus supports offloading to trusted devices. In this way, transmission, storage and power efficiency are improved. We give security proofs for all relevant parts of the proposed protocols and algorithms to evaluate their security. A performance evaluation of the core components demonstrates the practicability of our proposed solutions, including a theoretical analysis and practical experiments to show the accuracy as well as the efficiency of approximations and probabilistic algorithms. Several variations and configurations to detect similar inputs are studied during an in-depth discussion of the inference-control mechanism. 
A human mitochondrial genome database is used for the practical evaluation to compare genomic sequences and detect similar inputs as described by the use case. In summary, we show that it is indeed possible to construct an efficient and privacy-preserving (genomic) sequence comparison while controlling the amount of information that leaves the comparison. To the best of our knowledge, we also contribute to the field by proposing the first efficient privacy-preserving inference detection and control mechanism, as well as the first ciphertext compression system for partially homomorphic cryptographic systems. info:eu-repo/classification/ddc/004 ddc:004
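Part a) of the abstract above describes turning a sequence into a set, storing the set in a Bloom filter, and approximating a sequence distance by a set measure. The sketch below illustrates only that plaintext idea. It is not the thesis's protocol: the homomorphic encryption layer is omitted entirely, and the use of q-grams, the Jaccard estimate, and the parameters `q`, `m` and `k` are assumptions made for illustration.

```python
import hashlib


def qgrams(seq, q=4):
    """Set representation of a sequence: all overlapping substrings of length q."""
    return {seq[i:i + q] for i in range(len(seq) - q + 1)}


def bloom(items, m=2048, k=3):
    """Insert items into an m-bit Bloom filter (stored as an int) using k hashes."""
    bits = 0
    for item in items:
        for j in range(k):
            digest = hashlib.sha256(f"{j}:{item}".encode()).digest()
            bits |= 1 << (int.from_bytes(digest[:8], "big") % m)
    return bits


def bloom_jaccard(a, b):
    """Estimate set similarity from two filters: |bits(a AND b)| / |bits(a OR b)|."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0
```

Two filters built with the same `m` and `k` can be compared without either side storing the q-grams themselves; a distance follows as one minus the estimated similarity. The estimate degrades as the filters fill up, so `m` has to be sized for the expected sequence length.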
67	Privacy-preserving Building Occupancy Estimation via Low-Resolution Infrared Thermal Cameras Zhu, Shuai January 2021 Building occupancy estimation has become an important topic for sustainable buildings and has attracted more attention during the pandemics. Estimating building occupancy is a significant problem in computer vision, a field that has achieved breakthroughs in recent years. However, machine learning algorithms for computer vision demand large datasets, which may contain users’ private information, to train reliable models. As privacy issues pose a severe challenge in the field of machine learning, this work aims to develop a privacy-preserving, machine-learning-based method for people counting using a low-resolution thermal camera with 32 × 24 pixels. The method is applicable to counting people in different scenarios, namely in spaces smaller than the field of view (FoV) of the camera as well as in large spaces that exceed the FoV of the camera. In the first scenario, counting people in small spaces, we directly count people within the FoV of the camera using Multiple Object Detection (MOD) techniques. Our MOD method achieves up to 56.8% mean average precision (mAP). In the second scenario, we use Multiple Object Tracking (MOT) techniques to track people entering and exiting the space. We record the number of people who entered and exited, and then calculate the number of people present based on the tracking results. The MOT method reaches 47.4% multiple object tracking accuracy (MOTA), 78.2% multiple object tracking precision (MOTP), and 59.6% identification F-Score (IDF1). Apart from the method, we create a novel thermal image dataset containing 1770 thermal images with proper annotation. / Estimating how many people are in a building has become an important topic for sustainable buildings and has received more attention during the pandemics. Estimating building occupancy is a major problem in computer vision, while computer vision has seen a breakthrough in recent years. Machine learning algorithms for computer vision, however, require large datasets, which may contain users’ private information, to train reliable models. Because privacy issues pose a serious challenge in machine learning, this work aims to develop a privacy-preserving, machine-learning-based method for people counting using a low-resolution thermal camera with 32 × 24 pixels. The method can be used to count people in different scenarios, i.e., in spaces smaller than the camera’s FoV and in large spaces that exceed the camera’s FoV. In the first scenario, counting people in small spaces, we directly count people within the camera’s FoV using MOD techniques. Our MOD method achieves up to 56.8% mAP. In the second scenario, we use MOT techniques to track people entering and leaving the room. We record the number of people going in and out and then calculate the number of people from the tracking results. The MOT method yields 47.4% MOTA, 78.2% MOTP and 59.6% IDF1. In addition to the method, we create a new thermal image dataset containing 1770 thermal images with correct annotation. Building occupancy estimation People counting Privacy-preserving Low-resolution thermal camera Multiple Object Detection Multiple Object Tracking Uppskattning av bebyggelse personräkning integritetsbevarande värmekamera med låg upplösning detektering av flera objekt spårning av flera objekt Computer and Information Sciences Data- och informationsvetenskap
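The second scenario in the abstract above derives occupancy from the number of tracked people who entered and exited. One common way to turn tracker output into such counts is a virtual line across the doorway. The sketch below assumes that approach, which the abstract does not specify; the function name, the track format (a list of vertical positions per track id), and the default line position at the middle of the 24-pixel image height are all hypothetical.

```python
def count_occupancy(tracks, line_y=12.0, initial=0):
    """Count entries and exits from object tracks crossing a virtual line.

    tracks: dict mapping a track id to the list of y-coordinates (in pixels)
            observed for that person over time.
    Assumption: y > line_y is the inside of the monitored space.
    """
    entered = exited = 0
    for ys in tracks.values():
        if len(ys) < 2:
            continue  # a single observation cannot show a crossing
        starts_inside = ys[0] > line_y
        ends_inside = ys[-1] > line_y
        if not starts_inside and ends_inside:
            entered += 1
        elif starts_inside and not ends_inside:
            exited += 1
    # Occupancy cannot drop below zero even if the tracker misses entries.
    occupancy = max(initial + entered - exited, 0)
    return entered, exited, occupancy
```

Comparing only the first and last position of each track keeps the count stable when a person hesitates around the line, at the cost of missing people who enter and leave within a single track.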
68	Vers une plateforme holistique de protection de la vie privée dans les services géodépendants Sahnoune, Zakaria 04 1900 No description available. Services géodépendants Mesure de confiance Quantification de risques Position jumelle Position indicatrice Réseaux pair-à-pair Transfert inconscient Information mutuelle Logique floue Location-based services Location privacy Location privacy-preserving mechanism Risk quantification Trust measurement Twin positions Telltale positions P2P networks Oblivious transfer Mutual information Fuzzy logic
69	Construction of Secure and Efficient Private Set Intersection Protocol Kumar, Vikas January 2013 Private set intersection (PSI) is a two-party protocol in which both parties possess a private set and, at the end of the protocol, one party (the client) learns the intersection while the other party (the server) learns nothing. Motivated by some interesting practical applications, several provably secure and efficient PSI protocols have appeared in the literature in the recent past. Some of the proposed solutions are secure in the honest-but-curious (HbC) model while the others are secure in the (stronger) malicious model. Security in the latter is traditionally achieved by following the classical approach of attaching a zero-knowledge proof of knowledge (ZKPoK) (and/or using the so-called cut-and-choose technique). These approaches prevent the parties from deviating from normal protocol execution, albeit with significant computational overhead and increased complexity in the security argument, which includes, in the case of ZKPoK, knowledge extraction through rewinding. We critically investigate a subset of the existing protocols. Our study reveals some interesting points about the so-called provable security guarantee of some of the proposed solutions. Surprisingly, we point out some gaps in the security arguments of several protocols. We also discuss an attack on a protocol when it is executed multiple times between the same client and server. The attack, in fact, indicates some limitation in the existing security definition of PSI. On the positive side, we show how to correct the security argument for the above-mentioned protocols and show that in the HbC model the security can be based on standard computational assumptions such as the RSA and Gap Diffie-Hellman problems. For one protocol, we give an improved version and prove its security in the HbC model under a standard computational assumption. 
For the malicious model, we construct two PSI protocols using deterministic blind signatures, i.e., Boldyreva’s blind signature and Chaum’s blind signature, which do not involve ZKPoK or the cut-and-choose technique. Chaum’s blind signature gives a new protocol in the RSA setting, and Boldyreva’s blind signature gives a protocol in the gap Diffie-Hellman setting that is quite similar to an existing protocol but is efficient and does not involve ZKPoK. Data Protection Private Set Intersection (PSI) Simulation-based Proofs Malicious Model Secure Two-Party Computation Cryptographic Protocols Honest-But-Curious Model Data Security Models HbC Model Gap Diffie-Hellman Setting Computer Science
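To make the role of a Chaum-style RSA blind signature in PSI concrete, the toy sketch below shows one well-known way such a signature can drive the exchange: the server signs blinded client items, the client unblinds them, and both sides compare hashes of signatures. It is an illustration under stated assumptions, not the protocol constructed in the thesis. It offers no malicious-model safeguards, the key is built from two small Mersenne primes that are far too small for real use, and all function names are hypothetical. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import hashlib
import math
import secrets

# Toy RSA key (assumption: demonstration only, insecure key size).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # server's private exponent


def h(item):
    """Hash an item into Z_N."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % N


def tag(signature):
    """Second hash applied to a signature before comparison."""
    return hashlib.sha256(str(signature).encode()).hexdigest()


def server_tags(server_set):
    """Server: publish tags of its own signed items."""
    return {tag(pow(h(s), D, N)) for s in server_set}


def client_blind(item):
    """Client: hide H(item) behind a random factor r^E."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return r, (h(item) * pow(r, E, N)) % N


def server_sign(blinded):
    """Server: sign without seeing the underlying item."""
    return pow(blinded, D, N)


def client_unblind(signed, r):
    """Client: strip r, leaving H(item)^D mod N."""
    return (signed * pow(r, -1, N)) % N


def intersect(client_set, server_set):
    """Run both roles locally and return the client's view of the intersection."""
    tags = server_tags(server_set)
    found = set()
    for item in client_set:
        r, blinded = client_blind(item)
        signature = client_unblind(server_sign(blinded), r)
        if tag(signature) in tags:
            found.add(item)
    return found
```

Unblinding works because (H(x) · r^E)^D ≡ H(x)^D · r (mod N), so multiplying by r⁻¹ leaves exactly the signature the server would have produced on H(x). The signature is deterministic, which is why equal items on both sides yield equal tags.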
70	Privacy preserving software engineering for data driven development Tongay, Karan Naresh 14 December 2020 The exponential rise in the generation of data has introduced many new areas of research, including data science, data engineering, machine learning, and artificial intelligence, to name a few. It has become important for any industry or organization to precisely understand and analyze data in order to extract value from it. The value of the data can only be realized when it is put into practice in the real world, and the most common approach to do this in the technology industry is through software engineering. This brings into the picture the area of privacy-oriented software engineering, and with it the rise of data protection regulations such as GDPR (General Data Protection Regulation), PDPA (Personal Data Protection Act), etc. Many organizations, governments and companies that have accumulated huge amounts of data over time may conveniently use the data to increase business value, but at the same time the privacy aspects associated with the sensitivity of data, especially people’s personal information, can easily be circumvented while designing a software engineering model for these types of applications. Even before the software engineering phase of any data processing application, there can often be one or many data sharing agreements or privacy policies in place. Every organization may have its own way of maintaining data privacy practices for data-driven development. There is a need to generalize or categorize these approaches into tactics that could be referred to by other practitioners who are trying to integrate data privacy practices into their development. 
This qualitative study provides an understanding of various approaches and tactics that are being practised within the industry for privacy preserving data science in software engineering, and discusses a tool for data usage monitoring to identify unethical data access. Finally, we studied strategies for secure data publishing and conducted experiments using sample data to demonstrate how these techniques can be helpful for securing private data before publishing. / Graduate Data Privacy Privacy Data Engineering Software Engineering Data Driven Developers Data Science Privacy Preserving Data Driven Development Machine Learning One class SVM Data Usage Monitoring Health data k-anonymity l-diversity differential privacy Information management Secure data sharing Survey Audits and access control Data Privacy Tactics
