  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Rigorous and Flexible Privacy Protection Framework for Utilizing Personal Spatiotemporal Data / 個人時空間データ利活用のための厳密で柔軟なプライバシ保護フレムワーク

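Yang, Cao 23 March 2017
Kyoto University / 0048 / Doctorate by coursework (new system) / Doctor of Informatics / Kō No. 20508 / Jōhaku No. 636 / 新制||情||110 (University Library) / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Masatoshi Yoshikawa (吉川 正俊), Prof. Katsumi Tanaka (田中 克己), Prof. Yasuo Okabe (岡部 寿男) / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM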
32

RISK INTERPRETATION OF DIFFERENTIAL PRIVACY

Jiajun Liang (13190613) 31 July 2023
How to set privacy parameters is a crucial problem for the consistent application of differential privacy (DP) in practice. The current privacy parameters do not provide direct guidance on this problem. Moreover, different databases may leak information to varying degrees, allowing attackers to strengthen their attacks with the available information. This dissertation provides an additional interpretation of current DP notions by introducing a framework that directly considers the worst-case average failure probability of attackers under different levels of knowledge.

To achieve this, we introduce a novel measure of attacker knowledge and establish a dual relationship between (type I error, type II error) and (prior, average failure probability). By leveraging this framework, we propose an interpretable paradigm to consistently set privacy parameters on different databases with varying levels of leaked information.

Furthermore, we characterize the minimax limit of private parameter estimation, driven by $1/(n(1-2p))^2+1/n$, where $p$ represents the worst-case probability risk and $n$ is the number of data points. This characterization is more interpretable than the current lower bound $\min\{1/(n\epsilon^2),1/(n\delta^2)\}+1/n$ on $(\epsilon,\delta)$-DP. Additionally, we identify the phase transition of private parameter estimation based on this limit and provide suggestions for protocol designs to achieve optimal private estimation.

Last, we consider a federated learning setting where the data are stored in a distributed manner and privacy-preserving interactions are required. We extend the proposed interpretation to federated learning, considering two scenarios: protecting against privacy breaches against local nodes and protecting against privacy breaches against the center. Specifically, we consider a non-convex sparse federated parameter estimation problem and apply it to generalized linear models. We tackle two challenges in this setting. First, we encounter the issue of initialization, because the privacy requirements limit the number of queries to the database. Second, we overcome the heterogeneity in the distribution among local nodes to identify low-dimensional structures.
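
For context, the (type I error, type II error) view mentioned in this abstract has a standard form in the DP literature, sketched here. This is the usual hypothesis-testing characterization of $(\epsilon,\delta)$-DP, not the dissertation's new framework, and the symbols $\alpha$ (type I error) and $\beta$ (type II error) are notation assumed for this note. For any test that tries to distinguish two neighboring databases from the mechanism's output:

$$\alpha + e^{\epsilon}\beta \ge 1-\delta, \qquad e^{\epsilon}\alpha + \beta \ge 1-\delta.$$

As $\epsilon \to 0$ with $\delta = 0$, both inequalities reduce to $\alpha + \beta \ge 1$, which is what random guessing achieves, so the attacker's test gains nothing from the output.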
33

DIFFERENTIALLY PRIVATE SUBLINEAR ALGORITHMS

Tamalika Mukherjee (16050815) 07 June 2023
Collecting user data is crucial for advancing machine learning, social science, and government policies, but the privacy of the users whose data is being collected is a growing concern. Differential Privacy (DP) has emerged as the standard notion for privacy protection with robust mathematical guarantees. Analyzing such massive amounts of data in a privacy-preserving manner motivates the need to study differentially private algorithms that are also super-efficient.

This thesis initiates a systematic study of differentially private sublinear-time and sublinear-space algorithms. The contributions of this thesis are two-fold. First, we design some of the first differentially private sublinear algorithms for many fundamental problems. Second, we develop general DP techniques for designing differentially private sublinear algorithms.

We give the first DP sublinear algorithm for clustering by generalizing a subsampling framework from the non-DP sublinear-time literature. We give the first DP sublinear algorithm for estimating the maximum matching size. Our DP sublinear algorithm for estimating the average degree of the graph achieves a better approximation than previous works. We give the first DP algorithm for releasing $L_2$-heavy hitters in the sliding window model and a pure $L_1$-heavy hitter algorithm in the same model, which improves upon previous works.

We develop general techniques that address the challenges of designing sublinear DP algorithms. First, we introduce the concept of Coupled Global Sensitivity (CGS). Intuitively, the CGS of a randomized algorithm generalizes the classical notion of global sensitivity of a function by considering a coupling of the random coins of the algorithm when run on neighboring inputs. We show that one can achieve pure DP by adding Laplace noise proportional to the CGS of an algorithm. Second, we give a black-box DP transformation for a specific class of approximation algorithms. We show that such algorithms can be made differentially private without sacrificing accuracy, as long as the function has small global sensitivity. In particular, this transformation gives rise to sublinear DP algorithms for many problems, including triangle counting, the weight of the minimum spanning tree, and norm estimation.
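
The abstract above builds on the idea of adding Laplace noise proportional to a sensitivity. As a point of reference, here is a minimal sketch of the classical Laplace mechanism using ordinary global sensitivity; it does not implement Coupled Global Sensitivity, and the function names (`laplace_noise`, `laplace_mechanism`, `private_count`) are hypothetical names chosen for this illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value plus Laplace noise of scale sensitivity / epsilon.

    `sensitivity` is the classical global sensitivity of the query
    (an assumption of this sketch, not the thesis's CGS).
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)


def private_count(records, predicate, epsilon, rng):
    """Counting query: adding or removing one record changes the count by at most 1."""
    true_count = sum(1 for record in records if predicate(record))
    return laplace_mechanism(true_count, 1.0, epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(7)
    ages = [23, 35, 41, 29, 52, 61, 38]
    print(private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy; the noise has mean zero, so repeated releases average toward the true value (which is also why repeated releases consume privacy budget).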
34

Data Security and Privacy under the Binary Cloak

Ji, Tianxi 26 August 2022
No description available.
35

Differential privacy and machine learning: Calculating sensitivity with generated data sets / Differential privacy och maskininlärning: Beräkning av sensitivitet med genererade dataset

Lundmark, Magnus, Dahlman, Carl-Johan January 2017
Privacy has never been more important to maintain than in today’s information society. Companies and organizations collect large amounts of data about their users. This information is considered valuable because of its statistical usage, which provides insight into areas such as medicine, economics, or behavioural patterns among individuals. A technique called differential privacy has been developed to ensure that the privacy of individuals is maintained. It makes it possible to create useful statistics while the privacy of the individual is maintained. However, the disadvantage of differential privacy is the magnitude of the randomized noise applied to the data in order to hide the individual. This research examined whether it is possible to improve the usability of the privatized result by using machine learning to generate a data set that the noise can be based on. The purpose of the generated data set is to provide a local representation of the underlying data set that is safe to use when calculating the magnitude of the randomized noise. The results of this research have determined that this approach is currently not a feasible solution, but they demonstrate possible directions for further research to improve the usability of differential privacy. The research indicates that limiting the noise to a lower bound calculated from the underlying data set might be enough to meet all privacy requirements. Furthermore, the accuracy of the machine learning algorithm, and its impact on the usability of the noise, was not fully investigated and could be of interest in future studies. / [Swedish abstract, translated] Never before has privacy been more important to maintain than in today's information society, where companies and organizations collect large amounts of data about their users. Most of this information is regarded as valuable and can be used to create statistics that in turn can give insight into areas such as medicine, economics, or behavioural patterns among individuals.
To ensure that an individual's privacy is maintained, a technique called differential privacy has been developed. It enables the production of useful statistics while the individual's privacy is maintained. Differential privacy does have a drawback, however: the magnitude of the randomized noise used to hide the individual in a query on the data. This study examined whether this noise could be improved by using machine learning to generate a data set on which the noise could be based. The idea was that the generated data set would provide a local representation of the underlying data set that would be safe to use when calculating the magnitude of the randomized noise. The research shows that this approach is currently not supported by the results. The magnitude of the calculated noise was not large enough and therefore resulted in an unacceptable amount of leaked information. The research does show, however, that limiting the noise to a lower bound calculated from the local data set may be enough to meet all privacy requirements. Further research is needed to ensure that this provides the necessary level of privacy. Furthermore, the accuracy of the machine learning algorithm and its impact on the usability of the noise were not investigated, which could be a direction for further studies.
36

Optimizing Linear Queries Under Differential Privacy

Li, Chao 01 September 2013
Private data analysis on statistical data has been addressed in much recent literature. The goal of such analysis is to measure statistical properties of a database without revealing information about the individuals who participate in the database. Differential privacy is a rigorous privacy definition that protects individual information using output perturbation: a differentially private algorithm produces statistically indistinguishable outputs whether or not the database contains a tuple corresponding to an individual. It is straightforward to construct differentially private algorithms for many common tasks, and there are published algorithms to support various tasks under differential privacy. However, methods to design error-optimal algorithms for most non-trivial tasks are still unknown. In particular, we are interested in error-optimal algorithms for sets of linear queries. A linear query is a sum of counts of tuples that satisfy a certain condition, which covers many aggregation tasks including count, sum, and histogram. We present the matrix mechanism, a novel mechanism for answering sets of linear queries under differential privacy. The matrix mechanism makes a clear distinction between a set of queries submitted by users, called the query workload, and an alternative set of queries to be answered under differential privacy, called the query strategy. The answer to the query workload can then be computed using the answer to the query strategy. Given a query workload, the query strategy determines the distribution of the output noise, and the power of the matrix mechanism comes from adaptively choosing a query strategy that minimizes the output noise. Our analyses also provide a theoretical measure of the quality of different strategies for a given workload. This measure is then used in accurate and approximate formulations of the optimization problem that outputs the error-optimal strategy. 
We present a lower bound on the error of answering each workload under the matrix mechanism. The bound reveals that the hardness of a query workload is related to the spectral properties of the workload when it is represented in matrix form. In addition, we design an approximate algorithm whose generated strategies outperform state-of-the-art mechanisms under (epsilon, delta)-differential privacy. Those strategies lead to more accurate data analysis while preserving a rigorous privacy guarantee. Moreover, we also combine the matrix mechanism with a novel data-dependent algorithm, which achieves differential privacy by adding noise that is adapted to the input data and to the given query workload.
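
The workload/strategy split described in this abstract can be illustrated with a small sketch. It assumes the pure ε-DP (Laplace) variant, a strategy matrix whose pseudo-inverse is supplied by the caller, and hypothetical helper names (`matmul`, `l1_sensitivity`, `expected_total_squared_error`, `answer_workload`); it is not the thesis's optimization procedure, only the answering step and the resulting error measure.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def l1_sensitivity(strategy):
    """Largest column L1 norm: the most one tuple can change the strategy answers."""
    return max(sum(abs(row[j]) for row in strategy)
               for j in range(len(strategy[0])))


def expected_total_squared_error(workload, strategy, strategy_pinv, epsilon):
    """Total expected squared error of the workload answers.

    Each strategy answer gets Laplace noise with variance 2 * scale^2, and the
    workload answers are (workload @ strategy_pinv) applied to the noisy answers.
    """
    scale = l1_sensitivity(strategy) / epsilon
    recon = matmul(workload, strategy_pinv)
    return 2.0 * scale ** 2 * sum(v * v for row in recon for v in row)


def answer_workload(workload, strategy, strategy_pinv, counts, epsilon, rng):
    """Answer the strategy queries with Laplace noise, then derive workload answers."""
    scale = l1_sensitivity(strategy) / epsilon
    noisy = [[sum(a * x for a, x in zip(row, counts))
              + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)]
             for row in strategy]
    return [row[0] for row in matmul(matmul(workload, strategy_pinv), noisy)]


if __name__ == "__main__":
    # Workload: prefix sums over a 4-bin histogram.
    prefix = [[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]]
    identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    # Inverse of the prefix matrix (first differences), used when the
    # workload itself serves as the strategy.
    diff = [[1, 0, 0, 0], [-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1]]
    print(expected_total_squared_error(prefix, prefix, diff, 1.0))
    print(expected_total_squared_error(prefix, identity, identity, 1.0))
    print(answer_workload(prefix, identity, identity, [5, 3, 8, 2], 1.0,
                          random.Random(0)))
```

For this toy workload, answering the noisy histogram (identity strategy) and summing gives a lower expected error than perturbing the prefix sums directly, which is the kind of gap that strategy selection aims to exploit.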
37

Balancing Privacy and Accuracy in IoT using Domain-Specific Features for Time Series Classification

Lakhanpal, Pranshul 01 June 2023
ε-Differential Privacy (DP) has been widely used for anonymizing data to protect sensitive information and for machine learning (ML) tasks. However, there is a trade-off between privacy and ML accuracy, since ε-DP reduces the model’s accuracy for classification tasks. Moreover, few studies have applied DP to time series from sensors and Internet-of-Things (IoT) devices. In this work, we aim to bring the accuracy of ML models trained on ε-DP data as close as possible to that of ML models trained on non-anonymized data for two different physiological time series. We propose to transform time series into domain-specific 2D (image) representations such as scalograms, recurrence plots (RP), and their joint representation as inputs for training classifiers. Using these image representations makes our proposed approach secure by preventing data leaks, since these image transformations are irreversible. These images allow us to apply state-of-the-art image classifiers and obtain accuracy comparable to classifiers trained on non-anonymized data by exploiting additional information, such as textured patterns, from these images. To achieve classifier performance with anonymized data close to that with non-anonymized data, it is important to identify the value of ε and the input feature. Experimental results demonstrate that the performance of the ML models with scalograms and RP was comparable to ML models trained on their non-anonymized versions. Motivated by these promising results, we design an end-to-end IoT ML edge-cloud architecture capable of detecting input drifts that employs our technique to train ML models on ε-DP physiological data. Our classification approach ensures the privacy of individuals while processing and analyzing the data at the edge securely and efficiently.
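
To make the recurrence-plot representation mentioned above concrete, here is a minimal sketch. It assumes the simplest textbook variant (a scalar series, no time-delay embedding, a fixed distance threshold); the function name `recurrence_plot` and the `threshold` parameter are choices made for this illustration and may differ from what the thesis uses.

```python
def recurrence_plot(series, threshold):
    """Return a binary matrix R with R[i][j] = 1 when |x_i - x_j| <= threshold.

    The matrix is symmetric with ones on the diagonal, and it can be
    rendered as a black-and-white image for an image classifier.
    """
    n = len(series)
    return [[1 if abs(series[i] - series[j]) <= threshold else 0
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.1, 0.9, 0.0]
    for row in recurrence_plot(signal, threshold=0.2):
        print("".join("#" if cell else "." for cell in row))
```

Because each cell records only whether two samples are close, the original amplitudes are not stored in the image, which is the property the abstract relies on when it describes the transformation as irreversible.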
38

Addressing Fundamental Limitations in Differentially Private Machine Learning

Nandi, Anupama January 2021
No description available.
39

Differentially Private Federated Learning Algorithms for Sparse Basis Recovery

Ajinkya K Mulay (18823252) 14 June 2024
Sparse basis recovery is an important learning problem when the number of model dimensions ($p$) is much larger than the number of samples ($n$). However, there has been little work that studies sparse basis recovery in the Federated Learning (FL) setting, where the Differential Privacy (DP) of the client data must also be protected. Notably, the performance guarantees of existing DP-FL algorithms (such as DP-SGD) degrade significantly when the system is ill-determined (i.e., $p \gg n$), and thus they fail to accurately learn the true underlying sparse model. The goal of my thesis is therefore to develop DP-FL sparse basis recovery algorithms that can recover the true underlying sparse basis provably accurately even when $p \gg n$, while still guaranteeing the differential privacy of the client data.

During my PhD studies, we developed three DP-FL sparse basis recovery algorithms for this purpose. Our first algorithm, SPriFed-OMP, based on the Orthogonal Matching Pursuit (OMP) algorithm, can achieve high accuracy even when $n = O(\sqrt{p})$ under the stronger Restricted Isometry Property (RIP) assumption for least-square problems. Our second algorithm, Humming-Bird, based on a carefully modified variant of the Forward-Backward Algorithm (FoBA), can achieve differentially private sparse recovery for the same setup while requiring the much weaker Restricted Strong Convexity (RSC) condition. We further extend Humming-Bird to support loss functions beyond least-square satisfying the RSC condition. To the best of our knowledge, these are the first DP-FL results guaranteeing sparse basis recovery in the $p \gg n$ setting.
40

A Study on Private and Secure Federated Learning / プライベートで安全な連合学習

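Kato, Fumiyuki 25 March 2024
Kyoto University / Doctorate by coursework (new system) / Doctor of Informatics / Kō No. 25427 / Jōhaku No. 865 / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Takayuki Ito (伊藤 孝行), Prof. Tomohiro Kuroda (黒田 知宏), Prof. Yasuo Okabe (岡部 寿男), Masatoshi Yoshikawa (吉川 正俊; Professor Emeritus, Kyoto University) / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM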
