Global ETD Search

41	Computing Compliant Anonymisations of Quantified ABoxes w.r.t. EL Policies Baader, Franz, Kriegel, Francesco, Nuradiansyah, Adrian, Peñaloza, Rafael 28 December 2021 (has links) We adapt existing approaches for privacy-preserving publishing of linked data to a setting where the data are given as Description Logic (DL) ABoxes with possibly anonymised (formally: existentially quantified) individuals and the privacy policies are expressed using sets of concepts of the DL EL. We provide a chacterization of compliance of such ABoxes w.r.t. EL policies, and show how optimal compliant anonymisations of ABoxes that are non-compliant can be computed. This work extends previous work on privacy-preserving ontology publishing, in which a very restricted form of ABoxes, called instance stores, had been considered, but restricts the attention to compliance. The approach developed here can easily be adapted to the problem of computing optimal repairs of quantified ABoxes. info:eu-repo/classification/ddc/006 ddc:006
42	Secure and Reliable Data Outsourcing in Cloud Computing Cao, Ning 31 July 2012 (has links) "The many advantages of cloud computing are increasingly attracting individuals and organizations to outsource their data from local to remote cloud servers. In addition to cloud infrastructure and platform providers, such as Amazon, Google, and Microsoft, more and more cloud application providers are emerging which are dedicated to offering more accessible and user friendly data storage services to cloud customers. It is a clear trend that cloud data outsourcing is becoming a pervasive service. Along with the widespread enthusiasm on cloud computing, however, concerns on data security with cloud data storage are arising in terms of reliability and privacy which raise as the primary obstacles to the adoption of the cloud. To address these challenging issues, this dissertation explores the problem of secure and reliable data outsourcing in cloud computing. We focus on deploying the most fundamental data services, e.g., data management and data utilization, while considering reliability and privacy assurance. The first part of this dissertation discusses secure and reliable cloud data management to guarantee the data correctness and availability, given the difficulty that data are no longer locally possessed by data owners. We design a secure cloud storage service which addresses the reliability issue with near-optimal overall performance. By allowing a third party to perform the public integrity verification, data owners are significantly released from the onerous work of periodically checking data integrity. To completely free the data owner from the burden of being online after data outsourcing, we propose an exact repair solution so that no metadata needs to be generated on the fly for the repaired data. The second part presents our privacy-preserving data utilization solutions supporting two categories of semantics - keyword search and graph query. For protecting data privacy, sensitive data has to be encrypted before outsourcing, which obsoletes traditional data utilization based on plaintext keyword search. We define and solve the challenging problem of privacy-preserving multi- keyword ranked search over encrypted data in cloud computing. We establish a set of strict privacy requirements for such a secure cloud data utilization system to become a reality. We first propose a basic idea for keyword search based on secure inner product computation, and then give two improved schemes to achieve various stringent privacy requirements in two different threat models. We also investigate some further enhancements of our ranked search mechanism, including supporting more search semantics, i.e., TF Ã— IDF, and dynamic data operations. As a general data structure to describe the relation between entities, the graph has been increasingly used to model complicated structures and schemaless data, such as the personal social network, the relational database, XML documents and chemical compounds. In the case that these data contains sensitive information and need to be encrypted before outsourcing to the cloud, it is a very challenging task to effectively utilize such graph-structured data after encryption. We define and solve the problem of privacy-preserving query over encrypted graph-structured data in cloud computing. By utilizing the principle of filtering-and-verification, we pre-build a feature-based index to provide feature-related information about each encrypted data graph, and then choose the efficient inner product as the pruning tool to carry out the filtering procedure." cloud computing security privacy preserving cloud storage data reliability data availability keyword search ranked search graph query
43	Practical Private Information Retrieval Olumofin, Femi George January 2011 (has links) In recent years, the subject of online privacy has been attracting much interest, especially as more Internet users than ever are beginning to care about the privacy of their online activities. Privacy concerns are even prompting legislators in some countries to demand from service providers a more privacy-friendly Internet experience for their citizens. These are welcomed developments and in stark contrast to the practice of Internet censorship and surveillance that legislators in some nations have been known to promote. The development of Internet systems that are able to protect user privacy requires private information retrieval (PIR) schemes that are practical, because no other efficient techniques exist for preserving the confidentiality of the retrieval requests and responses of a user from an Internet system holding unencrypted data. This thesis studies how PIR schemes can be made more relevant and practical for the development of systems that are protective of users' privacy. Private information retrieval schemes are cryptographic constructions for retrieving data from a database, without the database (or database administrator) being able to learn any information about the content of the query. PIR can be applied to preserve the confidentiality of queries to online data sources in many domains, such as online patents, real-time stock quotes, Internet domain names, location-based services, online behavioural profiling and advertising, search engines, and so on. In this thesis, we study private information retrieval and obtain results that seek to make PIR more relevant in practice than all previous treatments of the subject in the literature, which have been mostly theoretical. We also show that PIR is the most computationally efficient known technique for providing access privacy under realistic computation powers and network bandwidths. Our result covers all currently known varieties of PIR schemes. We provide a more detailed summary of our contributions below: Our first result addresses an existing question regarding the computational practicality of private information retrieval schemes. We show that, unlike previously argued, recent lattice-based computational PIR schemes and multi-server information-theoretic PIR schemes are much more computationally efficient than a trivial transfer of the entire PIR database from the server to the client (i.e., trivial download). Our result shows the end-to-end response times of these schemes are one to three orders of magnitude (10--1000 times) smaller than the trivial download of the database for realistic computation powers and network bandwidths. This result extends and clarifies the well-known result of Sion and Carbunar on the computational practicality of PIR. Our second result is a novel approach for preserving the privacy of sensitive constants in an SQL query, which improves substantially upon the earlier work. Specifically, we provide an expressive data access model of SQL atop of the existing rudimentary index- and keyword-based data access models of PIR. The expressive SQL-based model developed results in between 7 and 480 times improvement in query throughput than previous work. We then provide a PIR-based approach for preserving access privacy over large databases. Unlike previously published access privacy approaches, we explore new ideas about privacy-preserving constraint-based query transformations, offline data classification, and privacy-preserving queries to index structures much smaller than the databases. This work addresses an important open problem about how real systems can systematically apply existing PIR schemes for querying large databases. In terms of applications, we apply PIR to solve user privacy problem in the domains of patent database query and location-based services, user and database privacy problems in the domain of the online sales of digital goods, and a scalability problem for the Tor anonymous communication network. We develop practical tools for most of our techniques, which can be useful for adding PIR support to existing and new Internet system designs. PIR private information retrieval access privacy SQL privacy large database privacy location privacy Tor privacy-preserving e-commerce Computer Science
44	Secure and high-performance big-data systems in the cloud Tang, Yuzhe 21 September 2015 (has links) Cloud computing and big data technology continue to revolutionize how computing and data analysis are delivered today and in the future. To store and process the fast-changing big data, various scalable systems (e.g. key-value stores and MapReduce) have recently emerged in industry. However, there is a huge gap between what these open-source software systems can offer and what the real-world applications demand. First, scalable key-value stores are designed for simple data access methods, which limit their use in advanced database applications. Second, existing systems in the cloud need automatic performance optimization for better resource management with minimized operational overhead. Third, the demand continues to grow for privacy-preserving search and information sharing between autonomous data providers, as exemplified by the Healthcare information networks. My Ph.D. research aims at bridging these gaps. First, I proposed HINDEX, for secondary index support on top of write-optimized key-value stores (e.g. HBase and Cassandra). To update the index structure efficiently in the face of an intensive write stream, HINDEX synchronously executes append-only operations and defers the so-called index-repair operations which are expensive. The core contribution of HINDEX is a scheduling framework for deferred and lightweight execution of index repairs. HINDEX has been implemented and is currently being transferred to an IBM big data product. Second, I proposed Auto-pipelining for automatic performance optimization of streaming applications on multi-core machines. The goal is to prevent the bottleneck scenario in which the streaming system is blocked by a single core while all other cores are idling, which wastes resources. To partition the streaming workload evenly to all the cores and to search for the best partitioning among many possibilities, I proposed a heuristic based search strategy that achieves locally optimal partitioning with lightweight search overhead. The key idea is to use a white-box approach to search for the theoretically best partitioning and then use a black-box approach to verify the effectiveness of such partitioning. The proposed technique, called Auto-pipelining, is implemented on IBM Stream S. Third, I proposed ǫ-PPI, a suite of privacy preserving index algorithms that allow data sharing among unknown parties and yet maintaining a desired level of data privacy. To differentiate privacy concerns of different persons, I proposed a personalized privacy definition and substantiated this new privacy requirement by the injection of false positives in the published ǫ-PPI data. To construct the ǫ-PPI securely and efficiently, I proposed to optimize the performance of multi-party computations which are otherwise expensive; the key idea is to use addition-homomorphic secret sharing mechanism which is inexpensive and to do the distributed computation in a scalable P2P overlay. Cloud Big-data Security Efficiency Performance Streaming Multi-core Index Key-value stores Privacy preserving Performance optimization Log-structured systems
45	Practical Private Information Retrieval Olumofin, Femi George January 2011 (has links) In recent years, the subject of online privacy has been attracting much interest, especially as more Internet users than ever are beginning to care about the privacy of their online activities. Privacy concerns are even prompting legislators in some countries to demand from service providers a more privacy-friendly Internet experience for their citizens. These are welcomed developments and in stark contrast to the practice of Internet censorship and surveillance that legislators in some nations have been known to promote. The development of Internet systems that are able to protect user privacy requires private information retrieval (PIR) schemes that are practical, because no other efficient techniques exist for preserving the confidentiality of the retrieval requests and responses of a user from an Internet system holding unencrypted data. This thesis studies how PIR schemes can be made more relevant and practical for the development of systems that are protective of users' privacy. Private information retrieval schemes are cryptographic constructions for retrieving data from a database, without the database (or database administrator) being able to learn any information about the content of the query. PIR can be applied to preserve the confidentiality of queries to online data sources in many domains, such as online patents, real-time stock quotes, Internet domain names, location-based services, online behavioural profiling and advertising, search engines, and so on. In this thesis, we study private information retrieval and obtain results that seek to make PIR more relevant in practice than all previous treatments of the subject in the literature, which have been mostly theoretical. We also show that PIR is the most computationally efficient known technique for providing access privacy under realistic computation powers and network bandwidths. Our result covers all currently known varieties of PIR schemes. We provide a more detailed summary of our contributions below: Our first result addresses an existing question regarding the computational practicality of private information retrieval schemes. We show that, unlike previously argued, recent lattice-based computational PIR schemes and multi-server information-theoretic PIR schemes are much more computationally efficient than a trivial transfer of the entire PIR database from the server to the client (i.e., trivial download). Our result shows the end-to-end response times of these schemes are one to three orders of magnitude (10--1000 times) smaller than the trivial download of the database for realistic computation powers and network bandwidths. This result extends and clarifies the well-known result of Sion and Carbunar on the computational practicality of PIR. Our second result is a novel approach for preserving the privacy of sensitive constants in an SQL query, which improves substantially upon the earlier work. Specifically, we provide an expressive data access model of SQL atop of the existing rudimentary index- and keyword-based data access models of PIR. The expressive SQL-based model developed results in between 7 and 480 times improvement in query throughput than previous work. We then provide a PIR-based approach for preserving access privacy over large databases. Unlike previously published access privacy approaches, we explore new ideas about privacy-preserving constraint-based query transformations, offline data classification, and privacy-preserving queries to index structures much smaller than the databases. This work addresses an important open problem about how real systems can systematically apply existing PIR schemes for querying large databases. In terms of applications, we apply PIR to solve user privacy problem in the domains of patent database query and location-based services, user and database privacy problems in the domain of the online sales of digital goods, and a scalability problem for the Tor anonymous communication network. We develop practical tools for most of our techniques, which can be useful for adding PIR support to existing and new Internet system designs. PIR private information retrieval access privacy SQL privacy large database privacy location privacy Tor privacy-preserving e-commerce Computer Science
46	FULLY HOMOMORPHIC ENCRYPTION BASED DATA ACCESS FRAMEWORK FOR PRIVACY-PRESERVING HEALTHCARE ANALYTICS Ganduri, Sri Lasya 01 December 2021 (has links) The main aim of this thesis is to develop a library for integrating fully homomorphic encryption-based computations on a standard database. The fully homomorphic encryption is an encryption scheme that allows functions to be performed directly on encrypted data without the requirement of decrypting the data and yields the same results as if the functions were run on the plaintext. This implementation is a promising solution for preserving the privacy of the health care system, where millions of patients’ data are stored. The personal health care tools gather medical data and store it in a database. Upon importing this library into the database, the data that is being entered into the database is encrypted and the computations can be performed on the encrypted data without decrypting. Cipher Texts Contact-Tracing-Application FHE (Fully Homomorphic Encryption) healthcare PHE (Partially homomorphic encryption) Privacy-preserving encryption
47	Towards Building Privacy-Preserving Language Models: Challenges and Insights in Adapting PrivGAN for Generation of Synthetic Clinical Text Nazem, Atena January 2023 (has links) The growing development of artificial intelligence (AI), particularly neural networks, is transforming applications of AI in healthcare, yet it raises significant privacy concerns due to potential data leakage. As neural networks memorise training data, they may inadvertently expose sensitive clinical data to privacy breaches, which can engender serious repercussions like identity theft, fraud, and harmful medical errors. While regulations such as GDPR offer safeguards through guidelines, rooted and technical protections are required to address the problem of data leakage. Reviews of various approaches show that one avenue of exploration is the adaptation of Generative Adversarial Networks (GANs) to generate synthetic data for use in place of real data. Since GANs were originally designed and mainly researched for generating visual data, there is a notable gap for further exploration of adapting GANs with privacy-preserving measures for generating synthetic text data. Thus, to address this gap, this study aims at answering the research questions of how a privacy-preserving GAN can be adapted to safeguard the privacy of clinical text data and what challenges and potential solutions are associated with these adaptations. To this end, the existing privGAN framework—originally developed and tested for image data—was tailored to suit clinical text data. Following the design science research framework, modifications were made while adhering to the privGAN architecture to incorporate reinforcement learning (RL) for addressing the discrete nature of text data. For synthetic data generation, this study utilised the 'Discharge summary' class from the Noteevents table of the MIMIC-III dataset, which is clinical text data in American English. The utility of the generated data was assessed using the BLEU-4 metric, and a white-box attack was conducted to test the model's resistance to privacy breaches. The experiment yielded a very low BLEU-4 score, indicating that the generator could not produce synthetic data that would capture the linguistic characteristics and patterns of real data. The relatively low white-box attack accuracy of one discriminator (0.2055) suggests that the trained discriminator was less effective in inferring sensitive information with high accuracy. While this may indicate a potential for preserving privacy, increasing the number of discriminators proves less favourable results (0.361). In light of these results, it is noted that the adapted approach in defining the rewards as a measure of discriminators’ uncertainty can signal a contradicting learning strategy and lead to the low utility of data. This study underscores the challenges in adapting privacy-preserving GANs for text data due to the inherent complexity of GANs training and the required computational power. To obtain better results in terms of utility and confirm the effectiveness of the privacy measures, further experiments are required to consider a more direct and granular rewarding system for the generator and to obtain an optimum learning rate. As such, the findings reiterate the necessity for continued experimentation and refinement in adapting privacy-preserving GANs for clinical text. Generative Adversarial Networks privacy-preserving language models clinical text data reinforcement learning synthetic data Information Systems
48	Ex Ante Approaches for Security, Privacy, and Enforcement in Spectrum Sharing Bahrak, Behnam 17 December 2013 (has links) Cognitive radios (CRs) are devices that are capable of sensing the spectrum and using its free portions in an opportunistic manner. The free spectrum portions are referred to as white spaces or spectrum holes. It is widely believed that CRs are one of the key enabling technologies for realizing a new regulatory spectrum management paradigm, viz. dynamic spectrum access (DSA). CRs often employ software-defined radio (SDR) platforms that are capable of executing artificial intelligence (AI) algorithms to reconfigure their transmission/reception (TX/RX) parameters to communicate efficiently while avoiding interference with licensed (a.k.a. primary or incumbent) users and unlicensed (a.k.a. secondary or cognitive) users. When different stakeholders share a common resource, such as the case in spectrum sharing, security, privacy, and enforcement become critical considerations that affect the welfare of all stakeholders. Recent advances in radio spectrum access technologies, such as CRs, have made spectrum sharing a viable option for significantly improving spectrum utilization efficiency. However, those technologies have also contributed to exacerbating the difficult problems of security, privacy and enforcement. In this dissertation, we review some of the critical security and privacy threats that impact spectrum sharing. We also discuss ex ante (preventive) approaches which mitigate the security and privacy threats and help spectrum enforcement. / Ph. D. Cognitive radio networks Dynamic Spectrum Access Spectrum Enforcement Policy Reasoning Spectrum Learning Geolocation Databases Privacy-preserving Techniques
49	Toward Privacy-Preserving and Secure Dynamic Spectrum Access Dou, Yanzhi 19 January 2018 (has links) Dynamic spectrum access (DSA) technique has been widely accepted as a crucial solution to mitigate the potential spectrum scarcity problem. Spectrum sharing between the government incumbents and commercial wireless broadband operators/users is one of the key forms of DSA. Two categories of spectrum management methods for shared use between incumbent users (IUs) and secondary users (SUs) have been proposed, i.e., the server-driven method and the sensing-based method. The server-driven method employs a central server to allocate spectrum resources while considering incumbent protection. The central server has access to the detailed IU operating information, and based on some accurate radio propagation model, it is able to allocate spectrum following a particular access enforcement method. Two types of access enforcement methods -- exclusion zone and protection zone -- have been adopted for server-driven DSA systems in the current literature. The sensing-based method is based on recent advances in cognitive radio (CR) technology. A CR can dynamically identify white spaces through various incumbent detection techniques and reconfigure its radio parameters in response to changes of spectrum availability. The focus of this dissertation is to address critical privacy and security issues in the existing DSA systems that may severely hinder the progress of DSA's deployment in the real world. Firstly, we identify serious threats to users' privacy in existing server-driven DSA designs and propose a privacy-preserving design named P²-SAS to address the issue. P²-SAS realizes the complex spectrum allocation process of protection-zone-based DSA in a privacy-preserving way through Homomorphic Encryption (HE), so that none of the IU or SU operation data would be exposed to any snooping party, including the central server itself. Secondly, we develop a privacy-preserving design named IP-SAS for the exclusion-zone- based server-driven DSA system. We extend the basic design that only considers semi- honest adversaries to include malicious adversaries in order to defend the more practical and complex attack scenarios that can happen in the real world. Thirdly, we redesign our privacy-preserving SAS systems entirely to remove the somewhat- trusted third party (TTP) named Key Distributor, which in essence provides a weak proxy re-encryption online service in P²-SAS and IP-SAS. Instead, in this new system, RE-SAS, we leverage a new crypto system that supports both a strong proxy re-encryption notion and MPC to realize privacy-preserving spectrum allocation. The advantages of RE-SAS are that it can prevent single point of vulnerability due to TTP and also increase SAS's service performance dramatically. Finally, we identify the potentially crucial threat of compromised CR devices to the ambient wireless infrastructures and propose a scalable and accurate zero-day malware detection system called GuardCR to enhance CR network security at the device level. GuardCR leverages a host-based anomaly detection technique driven by machine learning, which makes it autonomous in malicious behavior recognition. We boost the performance of GuardCR in terms of accuracy and efficiency by integrating proper domain knowledge of CR software. / Ph. D. Dynamic spectrum access Cognitive radio networks Privacy-preserving system Malware detection Machine learning Homomorphic encryption Proxy re-encryption
50	PRIVACY PRESERVING AND EFFICIENT MACHINE LEARNING ALGORITHMS Efstathia Soufleri (19184887) 21 July 2024 (has links) <p dir="ltr">Extensive data availability has catalyzed the expansion of deep learning. Such advancements include image classification, speech, and natural language processing. However, this data-driven progress is often hindered by privacy restrictions preventing the public release of specific datasets. For example, some vision datasets cannot be shared due to privacy regulations, particularly those containing images depicting visually sensitive or disturbing content. At the same time, it is imperative to deploy deep learning efficiently, specifically Deep Neural Networks (DNNs), which are the core of deep learning. In this dissertation, we focus on achieving efficiency by reducing the computational cost of DNNs in multiple ways.</p><p dir="ltr">This thesis first tackles the privacy concerns arising from deep learning. It introduces a novel methodology that synthesizes and releases synthetic data, instead of private data. Specifically, we propose Differentially Private Image Synthesis (DP-ImgSyn) for generating and releasing synthetic images used for image classification tasks. These synthetic images satisfy the following three properties: (1) they have DP guarantees, (2) they preserve the utility of private images, ensuring that models trained using synthetic images result in comparable accuracy to those trained on private data, and (3) they are visually dissimilar from private images. The DP-ImgSyn framework consists of the following steps: firstly, a teacher model is trained on private images using a DP training algorithm. Subsequently, public images are used for initializing synthetic images, which are optimized in order to be aligned with the private dataset. This optimization leverages the teacher network's batch normalization layer statistics (mean, standard deviation) to inject information from the private dataset into the synthetic images. Third, the synthetic images and their soft labels obtained from the teacher model are released and can be employed for neural network training in image classification tasks.</p><p dir="ltr">As a second direction, this thesis delves into achieving efficiency in deep learning. With neural networks widely deployed for tackling diverse and complex problems, the resulting models often become parameter-heavy, demanding substantial computational resources for deployment. To address this challenge, we focus on quantizing the weights and the activations of DNNs. In more detail, we propose a method for compressing neural networks through layer-wise mixed-precision quantization. Determining the optimal bit widths for each layer is a non-trivial task, given the fact that the search space is exponential. Thus, we employ a Multi-Layer Perceptron (MLP) trained to determine the suitable bit-width for each layer. The Kullback-Leibler (KL) divergence of softmax outputs between the quantized and full precision networks is the metric used to gauge quantization quality. We experimentally investigate the relationship between KL divergence and network size, noting that more aggressive quantization correlates with higher divergence and vice versa. The MLP is trained using the layer-wise bit widths as labels and their corresponding KL divergence as inputs. To generate the training set, pairs of layer-wise bit widths and their respective KL divergence values are obtained through Monte Carlo sampling of the search space. This approach aims to reduce the computational cost of DNN deployment, while maintaining high classification accuracy.</p><p dir="ltr">Additionally, we aim to enhance efficiency in machine learning by introducing a computationally efficient method for action recognition on compressed videos. Rather than decompressing videos for action recognition tasks, our approach performs action recognition directly on the compressed videos. This is achieved by leveraging the modalities within the compressed video format, specifically motion vectors, residuals, and intra-frames. To process each modality, we deploy three neural networks. Our observations indicate a hierarchy in convergence behavior: the network processing intra-frames tend to converge to a flatter minimum than the network processing residuals, which, in turn, converge to a flatter minimum than the motion vector network. This hierarchy motivates our strategy for knowledge transfer among modalities to achieve flatter minima, generally associated with better generalization. Based on this insight, we propose Progressive Knowledge Distillation (PKD), a technique that incrementally transfers knowledge across modalities. This method involves attaching early exits, known as Internal Classifiers (ICs), to the three networks. PKD begins by distilling knowledge from the motion vector network, then the residual network, and finally the intra-frame network, sequentially improving the accuracy of the ICs. Moreover, we introduce Weighted Inference with Scaled Ensemble (WISE), which combines outputs from the ICs using learned weights, thereby boosting accuracy during inference. The combination of PKD and WISE demonstrates significant improvements in efficiency and accuracy for action recognition on compressed videos.</p><p dir="ltr">In summary, this dissertation contributes to advancing privacy preserving and efficient machine learning algorithms. The proposed methodologies offer practical solutions for deploying machine learning systems in real-world scenarios by addressing data privacy and computational efficiency. Through innovative approaches to image synthesis, neural network compression, and action recognition, this work aims to foster the development of robust and scalable machine learning frameworks for diverse computer vision applications.</p> Computer vision Deep learning Neural networks privacy preserving machine learning computer vision neural network compression action recognition compressed videos

Search results