Global ETD Search

51	Privacy-preserving Synthetic Data Generation for Healthcare Planning / Sekretessbevarande syntetisk generering av data för vårdplanering Yang, Ruizhi January 2021 (has links) Recently, a variety of machine learning techniques have been applied to different healthcare sectors, and the results appear to be promising. One such sector is healthcare planning, in which patient data is used to produce statistical models for predicting the load on different units of the healthcare system. This research introduces an attempt to design and implement a privacy-preserving synthetic data generation method adapted explicitly to patients’ health data and for healthcare planning. A Privacy-preserving Conditional Generative Adversarial Network (PPCGAN) is used to generate synthetic data of Healthcare events, where a well-designed noise is added to the gradients in the training process. The concept of differential privacy is used to ensure that adversaries cannot reveal the exact training samples from the trained model. Notably, the goal is to produce digital patients and model their journey through the healthcare system. / Nyligen har en mängd olika maskininlärningstekniker tillämpats på olika hälso- och sjukvårdssektorer, och resultaten verkar lovande. En sådan sektor är vårdplanering, där patientdata används för att ta fram statistiska modeller för att förutsäga belastningen på olika enheter i sjukvården. Denna forskning introducerar ett försök att utforma och implementera en sekretessbevarande syntetisk datagenereringsmetod som uttryckligen anpassas till patienters hälsodata och för vårdplanering. Ett sekretessbevarande villkorligt generativt kontradiktoriskt nätverk (PPCGAN) används för att generera syntetisk data från hälsovårdshändelser, där ett väl utformat brus läggs till gradienterna i träningsprocessen. Begreppet differentiell integritet används för att säkerställa att motståndare inte kan avslöja de exakta träningsproven från den tränade modellen. Målet är särskilt att producera digitala patienter och modellera deras resa genom sjukvården. Synthetic data generation differential privacy generative network GAN Moments Accountant Markov modeling. Syntetisk datagenerering differentiell integritet generativt nätverk GAN Moments Accountant Markov -modellering. Elektroteknik och elektronik
52	Social Graph Anonymization / Anonymisation de graphes sociaux Nguyen, Huu-Hiep 04 November 2016 (has links) La vie privée est une préoccupation des utilisateurs des réseaux sociaux. Les réseaux sociaux sont une source de données précieuses pour des analyses scientifiques ou commerciales. Cette thèse aborde trois problèmes de confidentialité des réseaux sociaux: l'anonymisation de graphes sociaux, la détection de communautés privées et l'échange de liens privés. Nous abordons le problème d'anonymisation de graphes via la sémantique de l'incertitude et l'intimité différentielle. Pour la première, nous proposons un modèle général appelé Uncertain Adjacency Matrix (UAM) qui préserve dans le graphe anonymisé les degrés des nœuds du graphe non-anonymisé. Nous analysons deux schémas proposés récemment et montrons leur adaptation dans notre modèle. Nous aussi présentons notre approche dite MaxVar. Pour la technique d'intimité différentielle, le problème devient difficile en raison de l'énorme espace des graphes anonymisés possibles. Un grand nombre de systèmes existants ne permettent pas de relâcher le budget contrôlant la vie privée, ni de déterminer sa borne supérieure. Dans notre approche nous pouvons calculer cette borne. Nous introduisons le nouveau schéma Top-m-Filter de complexité linéaire et améliorons la technique récente EdgeFlip. L'évaluation de ces algorithmes sur une large gamme de graphes donne un panorama de l'état de l'art. Nous présentons le problème original de la détection de la communauté dans le cadre de l'intimité différentielle. Nous analysons les défis majeurs du problème et nous proposons quelques approches pour les aborder sous deux angles: par perturbation d'entrée (schéma LouvainDP) et par perturbation d'algorithme (schéma ModDivisive) / Privacy is a serious concern of users in daily usage of social networks. Social networks are a valuable data source for large-scale studies on social organization and evolution and are usually published in anonymized forms. This thesis addresses three privacy problems of social networks: graph anonymization, private community detection and private link exchange. First, we tackle the problem of graph anonymization via uncertainty semantics and differential privacy. As for uncertainty semantics, we propose a general obfuscation model called Uncertain Adjacency Matrix (UAM) that keep expected node degrees equal to those in the unanonymized graph. We analyze two recently proposed schemes and show their fitting into the model. We also present our scheme Maximum Variance (MaxVar) to fill the gap between them. Using differential privacy, the problem is very challenging because of the huge output space of noisy graphs. A large body of existing schemes on differentially private release of graphs are not consistent with increasing privacy budgets as well as do not clarify the upper bounds of privacy budgets. In this thesis, such a bound is provided. We introduce the new linear scheme Top-m-Filter (TmF) and improve the existing technique EdgeFlip. Thorough comparative evaluation on a wide range of graphs provides a panorama of the state-of-the-art's performance as well as validates our proposed schemes. Second, we present the problem of community detection under differential privacy. We analyze the major challenges behind the problem and propose several schemes to tackle them from two perspectives: input perturbation (LouvainDP) and algorithm perturbation (ModDivisive) Réseaux sociaux Incertaine matrice d'adjacence Maximum Variance Vie privée différentielle Top-M-Filte Détection de communautés LouvainDP ModDivisive Échange intime des liens $(\alpha \beta)$-Échange Social networks Uncertain Adjacency Matrix Maximum Variance Differential privacy Top-M-Filter Community detection LouvainDP ModDivisive Private link exchange $(\alpha \beta)$-Exchange 005.8
53	Privacy preserving software engineering for data driven development Tongay, Karan Naresh 14 December 2020 (has links) The exponential rise in the generation of data has introduced many new areas of research including data science, data engineering, machine learning, artificial in- telligence to name a few. It has become important for any industry or organization to precisely understand and analyze the data in order to extract value out of the data. The value of the data can only be realized when it is put into practice in the real world and the most common approach to do this in the technology industry is through software engineering. This brings into picture the area of privacy oriented software engineering and thus there is a rise of data protection regulation acts such as GDPR (General Data Protection Regulation), PDPA (Personal Data Protection Act), etc. Many organizations, governments and companies who have accumulated huge amounts of data over time may conveniently use the data for increasing business value but at the same time the privacy aspects associated with the sensitivity of data especially in terms of personal information of the people can easily be circumvented while designing a software engineering model for these types of applications. Even before the software engineering phase for any data processing application, often times there can be one or many data sharing agreements or privacy policies in place. Every organization may have their own way of maintaining data privacy practices for data driven development. There is a need to generalize or categorize their approaches into tactics which could be referred by other practitioners who are trying to integrate data privacy practices into their development. This qualitative study provides an understanding of various approaches and tactics that are being practised within the industry for privacy preserving data science in software engineering, and discusses a tool for data usage monitoring to identify unethical data access. Finally, we studied strategies for secure data publishing and conducted experiments using sample data to demonstrate how these techniques can be helpful for securing private data before publishing. / Graduate Data Privacy Privacy Data Engineering Software Engineering Data Driven Developers Data Science Privacy Preserving Data Driven Development Machine Learning One class SVM Data Usage Monitoring Health data k-anonymity l-diversity differential privacy Information management Secure data sharing Survey Audits and access control Data Privacy Tactics
54	Facilitating mobile crowdsensing from both organizers’ and participants’ perspectives / Facilitation de la collecte participative des données mobiles (mobile crowdsensing) au point de vue des organisateurs et des participants Wang, Leye 18 May 2016 (has links) La collecte participative des données mobiles est un nouveau paradigme dédié aux applications de détection urbaines utilisant une foule de participants munis de téléphones intelligents. Pour mener à bien les tâches de collecte participative des données mobiles, diverses préoccupations relatives aux participants et aux organisateurs doivent être soigneusement prises en considération. Pour les participants, la principale préoccupation porte sur la consommation d'énergie, le coût des données mobiles, etc. Pour les organisateurs, la qualité des données et le budget sont les deux préoccupations essentielles. Dans cette thèse, deux mécanismes de collecte participative des données mobiles sont proposés : le téléchargement montant collaboratif des données et la collecte clairsemée des données mobiles. Pour le téléchargement montant collaboratif des données, deux procédés sont proposés 1) « effSense », qui fournit la meilleure solution permettant d’économiser la consommation d'énergie aux participants ayant un débit suffisant, et de réduire le coût des communications mobiles aux participants ayant un débit limité; 2) « ecoSense », qui permet de réduire le remboursement incitatif par les organisateurs des frais associés au coût des données mobiles des participants. Dans la collecte clairsemée des données mobiles, les corrélations spatiales et temporelles entre les données détectées sont exploitées pour réduire de manière significative le nombre de tâches allouées et, par conséquent, le budget associé aux organisateurs, tout en assurant la qualité des données. De plus, l’intimité différentielle est afin de répondre au besoin de préservation de la localisation des participants / Mobile crowdsensing is a novel paradigm for urban sensing applications using a crowd of participants' sensor-equipped smartphones. To successfully complete mobile crowdsensing tasks, various concerns of participants and organizers need to be carefully considered. For participants, primary concerns include energy consumption, mobile data cost, privacy, etc. For organizers, data quality and budget are two critical concerns. In this dissertation, to address both participants' and organizers' concerns, two mobile crowdsensing mechanisms are proposed - collaborative data uploading and sparse mobile crowdsensing. In collaborative data uploading, participants help each other through opportunistic encounters and data relays in the data uploading process of crowdsensing, in order to save energy consumption, mobile data cost, etc. Specifically, two collaborative data uploading procedures are proposed (1) effSense, which helps participants with enough data plan to save energy consumption, and participants with little data plan to save mobile data cost; (2) ecoSense, which reduces organizers' incentive refund that is paid for covering participants' mobile data cost. In sparse mobile crowdsensing, spatial and temporal correlations among sensed data are leveraged to significantly reduce the number of allocated tasks thus organizers' budget, still ensuring data quality. Specifically, a sparse crowdsensing task allocation framework, CCS-TA, is implemented with compressive sensing, active learning, and Bayesian inference techniques. Furthermore, differential privacy is introduced into sparse mobile crowdsensing to address participants' location privacy concerns Consommation d'énergie Coût des données mobiles Qualité des données Confidentialité de la localisation Allocation de tâches Intimité différentielle Mobile crowdsensing Energy consumption Mobile data cost Data quality Location privacy Delay-tolerant data uploading Task allocation Differential privacy
55	DISTRIBUTED MACHINE LEARNING OVER LARGE-SCALE NETWORKS Frank Lin (16553082) 18 July 2023 (has links) <p>The swift emergence and wide-ranging utilization of machine learning (ML) across various industries, including healthcare, transportation, and robotics, have underscored the escalating need for efficient, scalable, and privacy-preserving solutions. Recognizing this, we present an integrated examination of three novel frameworks, each addressing different aspects of distributed learning and privacy issues: Two Timescale Hybrid Federated Learning (TT-HF), Delay-Aware Federated Learning (DFL), and Differential Privacy Hierarchical Federated Learning (DP-HFL). TT-HF introduces a semi-decentralized architecture that combines device-to-server and device-to-device (D2D) communications. Devices execute multiple stochastic gradient descent iterations on their datasets and sporadically synchronize model parameters via D2D communications. A unique adaptive control algorithm optimizes step size, D2D communication rounds, and global aggregation period to minimize network resource utilization and achieve a sublinear convergence rate. TT-HF outperforms conventional FL approaches in terms of model accuracy, energy consumption, and resilience against outages. DFL focuses on enhancing distributed ML training efficiency by accounting for communication delays between edge and cloud. It also uses multiple stochastic gradient descent iterations and periodically consolidates model parameters via edge servers. The adaptive control algorithm for DFL mitigates energy consumption and edge-to-cloud latency, resulting in faster global model convergence, reduced resource consumption, and robustness against delays. Lastly, DP-HFL is introduced to combat privacy vulnerabilities in FL. Merging the benefits of FL and Hierarchical Differential Privacy (HDP), DP-HFL significantly reduces the need for differential privacy noise while maintaining model performance, exhibiting an optimal privacy-performance trade-off. Theoretical analysis under both convex and nonconvex loss functions confirms DP-HFL’s effectiveness regarding convergence speed, privacy performance trade-off, and potential performance enhancement with appropriate network configuration. In sum, the study thoroughly explores TT-HF, DFL, and DP-HFL, and their unique solutions to distributed learning challenges such as efficiency, latency, and privacy concerns. These advanced FL frameworks have considerable potential to further enable effective, efficient, and secure distributed learning.</p> Network engineering Signal processing Knowledge representation and reasoning Modelling and simulation Planning and decision making Numerical analysis Optimisation Federated Learning frameworks differential privacy composition network optimization algorithm distributed machine learning (ML) Convergence analysis Edge Intelligence edge cloud computing

Page generated in 0.0824 seconds