Global ETD Search

21	Privacy Preserving in Online Social Network Data Sharing and Publication Tianchong Gao (7428566) 17 October 2019 (has links) <p>Following the trend of online data sharing and publishing, researchers raise their concerns about the privacy problem. Online Social Networks (OSNs), for example, often contain sensitive information about individuals. Therefore, anonymizing network data before releasing it becomes an important issue. This dissertation studies the privacy preservation problem from the perspectives of both attackers and defenders. </p> <p><br></p> <p>To defenders, preserving the private information while keeping the utility of the published OSN is essential in data anonymization. At one extreme, the final data equals the original one, which contains all the useful information but has no privacy protection. At the other extreme, the final data is random, which has the best privacy protection but is useless to the third parties. Hence, the defenders aim to explore multiple potential methods to strike a desirable tradeoff between privacy and utility in the published data. This dissertation draws on the very fundamental problem, the definition of utility and privacy. It draws on the design of the privacy criterion, the graph abstraction model, the utility method, and the anonymization method to further address the balance between utility and privacy. </p> <p><br></p> <p>To attackers, extracting meaningful information from the collected data is essential in data de-anonymization. De-anonymization mechanisms utilize the similarities between attackers’ prior knowledge and published data to catch the targets. This dissertation focuses on the problems that the published data is periodic, anonymized, and does not cover the target persons. There are two thrusts in studying the de-anonymization attacks: the design of seed mapping method and the innovation of generating-based attack method. To conclude, this dissertation studies the online data privacy problem from both defenders’ and attackers’ point of view and introduces privacy and utility enhancement mechanisms in different novel angles.</p> Computer Engineering privacy protection online social networks differential privacy anonymization de-anonymization
22	Protecting sensitive information from untrusted code Roy, Indrajit 13 December 2010 (has links) As computer systems support more aspects of modern life, from finance to health care, security is becoming increasingly important. However, building secure systems remains a challenge. Software continues to have security vulnerabilities due to reasons ranging from programmer errors to inadequate programming tools. Because of these vulnerabilities we need mechanisms that protect sensitive data even when the software is untrusted. This dissertation shows that secure and practical frameworks can be built for protecting users' data from untrusted applications in both desktop and cloud computing environment. Laminar is a new framework that secures desktop applications by enforcing policies written as information flow rules. Information flow control, a form of mandatory access control, enables programmers to write powerful, end-to-end security guarantees while reducing the amount of trusted code. Current programming abstractions and implementations of this model either compromise end-to-end security guarantees or require substantial modifications to applications, thus deterring adoption. Laminar addresses these shortcomings by exporting a single set of abstractions to control information flows through operating system resources and heap-allocated objects. Programmers express security policies by labeling data and represent access restrictions on code using a new abstraction called a security region. The Laminar programming model eases incremental deployment, limits dynamic security checks, and supports multithreaded programs that can access heterogeneously labeled data. In large scale, distributed computations safeguarding information requires solutions beyond mandatory access control. An important challenge is to ensure that the computation, including its output, does not leak sensitive information about the inputs. For untrusted code, access control cannot guarantee that the output does not leak information. This dissertation proposes Airavat, a MapReduce-based system which augments mandatory access control with differential privacy to guarantee security and privacy for distributed computations. Data providers control the security policy for their sensitive data, including a mathematical bound on potential privacy violations. Users without security expertise can perform computations on the data; Airavat prevents information leakage beyond the data provider's policy. Our prototype implementation of Airavat demonstrates that several data mining tasks can be performed in a privacy preserving fashion with modest performance overheads. / text Security Operating systems Java virtual machine Cloud computing Information flow control Access control Differential privacy
23	Programmation sûre en précision finie : Contrôler les erreurs et les fuites d'informations Gazeau, Ivan 14 October 2013 (has links) (PDF) Dans cette thèse, nous analysons les problèmes liés à la représentation finie des nombres réels et nous contrôlons la déviation induite par cette approximation. Nous nous intéressons particulièrement à deux problèmes. Le premier est l'étude de l'influence de la représentation finie sur les protocoles de confidentialité différentielle. Nous présentons une méthode pour étudier les perturbations d'une distribution de probabilité causées par la représentation finie des nombres. Nous montrons qu'une implémentation directe des protocoles théoriques pour garantir la confidentialité différentielle n'est pas fiable, tandis qu'après l'ajout de correctifs, la propriété est conservée en précision finie avec une faible perte de confidentialité. Notre deuxième contribution est une méthode pour étudier les programmes qui ne peuvent pas être analysés par composition à cause de branchements conditionnels au comportement trop erratique. Cette méthode, basée sur la théorie des systèmes de réécriture, permet de partir de la preuve de l'algorithme en précision exacte pour fournir la preuve que le programme en précision finie ne déviera pas trop. Précision finie Analyse statique Differential privacy
24	Klassificeringsalgoritmer vs differential privacy : Effekt på klassificeringsalgoritmer vid användande av numerisk differential privacy / Classification algorithms vs differential privacy : Effect of using numerical differential privacy on classification algorithms Olsson, Mattias January 2018 (has links) Data mining är ett samlingsnamn för ett antal tekniker som används för att analysera datamängder och finna mönster, exempelvis genom klassificering. Anonymisering innefattar en rad tekniker för att skydda den personliga integriteten. Den här studien undersöker hur stor påverkansgrad anonymisering med tekniken differential privacy har på möjligheten att klassificera en datamängd. Genom ett experiment undersöks ett antal magnituder av anonymisering och vilken effekt de har på möjligheten att klassificera data. Klassificering av den anonymiserade datamängden jämförs mot klassificering av den råa datamängden. Liknande studier har genomförts med k-anonymitet som anonymiseringsteknik där möjligheten att klassificera förbättrades genom generalisering. Resultatet från den här studien å andra sidan visar att möjligheten att klassificera sjunker något, vilket beror på att differential privacy sprider ut informationen i datamängden över ett bredare spektrum. Detta försvårar generellt för klassificeringsalgoritmerna att hitta karakteriserande mönster i datamängden och de lyckas därmed inte få lika hög grad av korrekt klassificering. Data mining klassificering algoritmer anonymisering differential privacy runkeeper Computer and Information Sciences Data- och informationsvetenskap
25	New Paradigms and Optimality Guarantees in Statistical Learning and Estimation Wang, Yu-Xiang 01 December 2017 (has links) Machine learning (ML) has become one of the most powerful classes of tools for artificial intelligence, personalized web services and data science problems across fields. Within the field of machine learning itself, there had been quite a number of paradigm shifts caused by the explosion of data size, computing power, modeling tools, and the new ways people collect, share, and make use of data sets. Data privacy, for instance, was much less of a problem before the availability of personal information online that could be used to identify users in anonymized data sets. Images, videos, as well as observations generated over a social networks, often have highly localized structures, that cannot be captured by standard nonparametric models. Moreover, the “common task framework” that is adopted by many sub- disciplines of AI has made it possible for many people to collaboratively and repeated work on the same data set, leading to implicit overfitting on public benchmarks. In addition, data collected in many internet services, e.g., web search and targeted ads, are not iid, but rather feedbacks specific to the deployed algorithm. This thesis presents technical contributions under a number of new mathematical frameworks that are designed to partially address these new paradigms. • Firstly, we consider the problem of statistical learning with privacy constraints. Under Vapnik’s general learning setting and the formalism of differential privacy (DP), we establish simple conditions that characterizes the private learnability, which reveals a mixture of positive and negative insight. We then identify generic methods that reuses existing randomness to effectively solve private learning in practice; and discuss weaker notions of privacy that allows for more favorable privacy-utility tradeoff. • Secondly, we develop a few generalizations of trend filtering, a locally-adaptive nonparametric regression technique that is minimax in 1D, to the multivariate setting and to graphs. We also study specific instances of the problems, e.g., total variation denoising on d-dimensional grids more closely and the results reveal interesting statistical computational trade-offs. • Thirdly, we investigate two problems in sequential interactive learning: a) off- policy evaluation in contextual bandits, that aims to use data collected from one algorithm to evaluate the performance of a different algorithm; b) the problem of adaptive data analysis, that uses randomization to prevent adversarial data analysts from a form of “p-hacking” through multiple steps of sequential data access. In the above problems, we will provide not only performance guarantees of algorithms but also certain notions of optimality. Whenever applicable, careful empirical studies on synthetic and real data are also included. machine learning statistics differential privacy nonparametric regression trend filtering contextual bandits
26	Quantifying Information Leakage via Adversarial Loss Functions: Theory and Practice January 2020 (has links) abstract: Modern digital applications have significantly increased the leakage of private and sensitive personal data. While worst-case measures of leakage such as Differential Privacy (DP) provide the strongest guarantees, when utility matters, average-case information-theoretic measures can be more relevant. However, most such information-theoretic measures do not have clear operational meanings. This dissertation addresses this challenge. This work introduces a tunable leakage measure called maximal $\alpha$-leakage which quantifies the maximal gain of an adversary in inferring any function of a data set. The inferential capability of the adversary is modeled by a class of loss functions, namely, $\alpha$-loss. The choice of $\alpha$ determines specific adversarial actions ranging from refining a belief for $\alpha =1$ to guessing the best posterior for $\alpha = \infty$, and for the two specific values maximal $\alpha$-leakage simplifies to mutual information and maximal leakage, respectively. Maximal $\alpha$-leakage is proved to have a composition property and be robust to side information. There is a fundamental disjoint between theoretical measures of information leakages and their applications in practice. This issue is addressed in the second part of this dissertation by proposing a data-driven framework for learning Censored and Fair Universal Representations (CFUR) of data. This framework is formulated as a constrained minimax optimization of the expected $\alpha$-loss where the constraint ensures a measure of the usefulness of the representation. The performance of the CFUR framework with $\alpha=1$ is evaluated on publicly accessible data sets; it is shown that multiple sensitive features can be effectively censored to achieve group fairness via demographic parity while ensuring accuracy for several \textit{a priori} unknown downstream tasks. Finally, focusing on worst-case measures, novel information-theoretic tools are used to refine the existing relationship between two such measures, $(\epsilon,\delta)$-DP and R\'enyi-DP. Applying these tools to the moments accountant framework, one can track the privacy guarantee achieved by adding Gaussian noise to Stochastic Gradient Descent (SGD) algorithms. Relative to state-of-the-art, for the same privacy budget, this method allows about 100 more SGD rounds for training deep learning models. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2020 Electrical engineering Differential Privacy Fairness Generative adversarial models Information leakage measure Loss function Privacy/Censoring
27	A Study on Federated Learning Systems in Healthcare Smith, Arthur, M.D. 18 August 2021 (has links) No description available. Computer Science Health Care Information Technology Medicine Federated Learning Differential Privacy Homomorphic Encryption
28	Privacy-awareness in the era of Big Data and machine learning / Integritetsmedvetenhet i eran av Big Data och maskininlärning Vu, Xuan-Son January 2019 (has links) Social Network Sites (SNS) such as Facebook and Twitter, have been playing a great role in our lives. On the one hand, they help connect people who would not otherwise be connected before. Many recent breakthroughs in AI such as facial recognition [49] were achieved thanks to the amount of available data on the Internet via SNS (hereafter Big Data). On the other hand, due to privacy concerns, many people have tried to avoid SNS to protect their privacy. Similar to the security issue of the Internet protocol, Machine Learning (ML), as the core of AI, was not designed with privacy in mind. For instance, Support Vector Machines (SVMs) try to solve a quadratic optimization problem by deciding which instances of training dataset are support vectors. This means that the data of people involved in the training process will also be published within the SVM models. Thus, privacy guarantees must be applied to the worst-case outliers, and meanwhile data utilities have to be guaranteed. For the above reasons, this thesis studies on: (1) how to construct data federation infrastructure with privacy guarantee in the big data era; (2) how to protect privacy while learning ML models with a good trade-off between data utilities and privacy. To the first point, we proposed different frameworks em- powered by privacy-aware algorithms that satisfied the definition of differential privacy, which is the state-of-the-art privacy-guarantee algorithm by definition. Regarding (2), we proposed different neural network architectures to capture the sensitivities of user data, from which, the algorithm itself decides how much it should learn from user data to protect their privacy while achieves good performance for a downstream task. The current outcomes of the thesis are: (1) privacy-guarantee data federation infrastructure for data analysis on sensitive data; (2) privacy-guarantee algorithms for data sharing; (3) privacy-concern data analysis on social network data. The research methods used in this thesis include experiments on real-life social network dataset to evaluate aspects of proposed approaches. Insights and outcomes from this thesis can be used by both academic and industry to guarantee privacy for data analysis and data sharing in personal data. They also have the potential to facilitate relevant research in privacy-aware representation learning and related evaluation methods. Differential Privacy Machine Learning Deep Learning Big Data Computer Sciences Datavetenskap (datalogi)
29	Side-channel Threats on Modern Platforms: Attacks and Countermeasures Zhang, Xiaokuan January 2021 (has links) No description available. Computer Science Computer Engineering side channels ARM cache attacks smartphones differential privacy
30	An Analysis of Notions of Differential Privacy for Edge-Labeled Graphs / En analys av olika uppfattningar om differentiell integritet i grafer med kantetiketter Christensen, Robin January 2020 (has links) The user data in social media platforms is an excellent source of information that is beneficial for both commercial and scientific purposes. However, recent times has seen that the user data is not always used for good, which has led to higher demands on user privacy. With accurate statistical research data being just as important as the privacy of the user data, the relevance of differential privacy has increased. Differential privacy allows user data to be accessible under certain privacy conditions at the cost of accuracy in query results, which is caused by noise. The noise is based on a tuneable constant ε and the global sensitivity of a query. The query sensitivity is defined as the greatest possible difference in query result between the queried database and a neighboring database. Where the neighboring database is defined to differ by one record in a tabular database, there are multiple neighborhood notions for edge-labeled graphs. This thesis considers the notions of edge neighborhood, node neighborhood, QL-edge neighborhood and QL-outedges neighborhood. To study these notions, a framework was developed in Java to function as a query mechanism for a graph database. ArangoDB was used as a storage for graphs, which was generated by parsing data sets in the RDF format as well as through a graph synthesizer in the developed framework. Querying a database in the framework is done with Apache TinkerPop, and a Laplace distribution is used when generating noise for the query results. The framework was used to study the privacy and utility trade-off of different histogram queries on a number of data sets, while employing the different notions of neighborhood in edge-labeled graphs. The level of privacy is determined by the value on ε, and the utility is defined as a measurement based on the L1-distance between the true and noisy result. In the general case, the notions of edge neighborhood and QL-edge neighborhood are the better alternatives in terms of privacy and utility. Although, there are indications that node neighborhood and QL-outedges neighborhood are considerable options for larger graphs, where the level of privacy for edge neighborhood and QL-edge neighborhood appears to be negligible based on utility measurements. Differential privacy Analysis framework Query mechanism Graph databases Graph neighbors Neighborhood notions Computer Engineering Datorteknik

Search results