1 |
Quality of service in cloud computing: Data model; resource allocation; and data availability and security. Akintoye, Samson Busuyi, January 2019.
Philosophiae Doctor - PhD / Recently, a massive migration of enterprise applications to the cloud has been recorded in the Information Technology (IT) world. The number of cloud providers offering their services and the number of cloud customers interested in using such services are rapidly increasing. However, one of the challenges of cloud computing is Quality-of-Service management, which denotes the level of performance, reliability, and availability offered by cloud service providers. Quality-of-Service is fundamental to cloud service providers, who must find the right tradeoff between Quality-of-Service levels and operational cost. In order to find the optimal tradeoff, cloud service providers need to comply with service level agreement contracts, which define the agreement between cloud service providers and cloud customers. Service level agreements are expressed in terms of quality of service (QoS) parameters such as availability, scalability, performance, and service cost. On the other hand, if the cloud service provider violates the service level agreement contract, the cloud customer can file for damages and claim penalties that can result in revenue losses and, potentially, damage to the provider's reputation. Thus, the goal of any cloud service provider is to meet the service level agreements while reducing the total cost of offering its services.
|
2 |
The Role of Social Ties in Dynamic Networks. Zuo, Xiang, 07 April 2016.
Social networks are everywhere, from face-to-face activities to online social networks such as Flickr, YouTube and Facebook. In social networks, ties (relationships) are connections between people. The change of social relationships over time consequently leads to the evolution of the social network structure. At the same time, ties serve as carriers to transfer pieces of information from one person to another.
Studying social ties is critical to understanding the fundamental processes behind the network. Although many studies on social networks have been carried out over the past several decades, most of the work either used small in-lab datasets or focused on directly connected static relations while ignoring indirect relations and the dynamic nature of real networks. Today, because of the emergence of online social networks, more and more large longitudinal social datasets are becoming available. These real social datasets are fundamental to understanding network evolution processes in more depth. In this thesis, we study the role of social ties in dynamic networks using datasets from various domains of online social networks.
Networks, especially social networks, often exhibit a dual dynamic nature: the structure of the graph changes (by node and edge insertion and removal), and information flows through the network. Our work focuses on both aspects of network dynamics. The purpose of this work is to better understand the role of social ties in network evolution and change over time, and to determine what social factors help shape individuals' choices in negative behavior. We first developed a metric that measures the strength of indirectly connected ties. We validated the accuracy of this indirect tie metric on real-world social datasets from four domains.
Another important aspect of this research is the study of edge creation and the forecasting of future graph structure in time-evolving networks. We aim to develop algorithms that explain the edge-formation properties and processes which govern network evolution. We also designed algorithms that identify the next spreaders in an information propagation process several steps ahead, and use them to predict diffusion paths.
Next, because different social ties, or social ties in different contexts, exert different influence between people, we examined the influence of social ties on behavior contagion, particularly on a negative behavior: cheating. Our recent work included the study of social factors that motivate or limit the contagion of cheating in a large real-world online social network. We tested several factors drawn from sociology and psychology that explain cheating behavior but had remained untested outside of controlled laboratory experiments or small, survey-based studies.
In addition, this work analyzed online social networks with large datasets, since certain inherent influences or patterns only emerge or become visible when dealing with massive data. We analyzed the world's largest online gaming community, the Steam Community, collecting data on 3,148,289 users and 44,725,277 edges. We also made interesting observations of cheating influence that were not observed in previous in-lab experiments.
Besides providing an empirically based understanding of social ties and their influence in large-scale evolving networks, our work has high practical importance for using social influence to maintain a fair online community environment and for building systems to detect, prevent, and mitigate undesirable influence.
|
3 |
Latin American Data Drought: An Assessment of Available River Observation Data in Select Latin American Countries and Development of a Web-Based Application for a Hydrometeorological Database System in Spanish. Bolster, Stephen Joseph, 01 December 2014.
The demand for and collection of hydrometeorological data are growing to support hydrologic and hydraulic analyses and other studies. These data can amount to extensive information that requires sound data management to enable efficient storage, access, and use. While much of the globe is using technology to efficiently collect and store hydrometeorological data, other parts, such as developing countries, are unable to do so. This thesis presents an assessment of available river observation data in Latin American countries in Central America and the Caribbean. The assessment analyzes 1) access to available data, 2) spatial density of data, and 3) the temporal extents of data. This assessment determines that there are sections of the study area that constitute a data drought or have limited data available. Furthermore, the development of an internationalized HydroServer Lite, a lightweight web-based application for database and data management, is undertaken. A pilot program of the translated system in Spanish is established with an agency in each of the following countries: Guatemala, Honduras, and Nicaragua. The internationalized version of HydroServer Lite promises to be a useful tool for these groups. While full implementation is currently underway, benefits include improved database management, access to data, and connectivity to global groups seeking to aid developing countries with hydrometeorological data.
|
4 |
Biopharmaceuticals in Europe. Investigating their early diffusion and influencing factors through a cross-national perspective. Veszelei, Ivar, January 2024.
Abstract: The path to patient access post-market approval is anything but straightforward. While some medications seamlessly find their way to those in need, others encounter significant obstacles. Biopharmaceuticals offer benefits in the treatment of many diseases. However, the adoption of these new biological medicines varies widely across European countries, in part because of their often high costs. Despite the significant growth in approvals and the economic expansion observed in the biopharmaceutical market, there are few cross-national comparative studies focused on the utilization of biologics to provide future guidance.

Objective: The study aimed to better understand the disparities in the early diffusion of new biologics across Europe. The research questions included identifying the extent of data availability and variations in the early diffusion of biopharmaceuticals across European countries, as well as investigating macro-level factors influencing their early diffusion.

Methods: A cross-sectional study was undertaken to analyze the diffusion patterns of 17 biopharmaceuticals, approved between 2015 and 2019, across European countries between 2015 and 2022. The study addressed data availability, diffusion rates measured in Defined Daily Doses per 1,000 population, and relative rankings between countries to assess early diffusion over the initial four years following market authorization. Additionally, macro-level factors influencing early diffusion were identified through meetings with policy researchers and experts.

Results: Data availability varied: 12 out of 29 countries provided complete data on inpatient and outpatient care, 10 provided limited data, and 7 provided no data. Introduction patterns varied between medicines, with Tildrakizumab and Follitropin delta being introduced in the fewest countries. Germany, Norway, Denmark, and Sweden demonstrated the highest early diffusion rates, while Estonia, Scotland, Romania, and Lithuania had the lowest. Three major categories of macro-level factors were identified: the country's healthcare system, its health technology assessment of new medicines, and early awareness, each with associated feasible analytical comparative metrics to provide future guidance.

Conclusions: This study revealed significant variability in the early diffusion of biopharmaceuticals and inconsistent data availability between European countries. The study also provides a valuable framework for further research on the key macro-level factors influencing biopharmaceutical introduction, aiming to enhance accessibility and efficiency in Europe's biopharmaceutical healthcare landscape.
|
5 |
Secure and Reliable Data Outsourcing in Cloud Computing. Cao, Ning, 31 July 2012.
"The many advantages of cloud computing are increasingly attracting individuals and organizations to outsource their data from local to remote cloud servers. In addition to cloud infrastructure and platform providers, such as Amazon, Google, and Microsoft, more and more cloud application providers are emerging which are dedicated to offering more accessible and user friendly data storage services to cloud customers. It is a clear trend that cloud data outsourcing is becoming a pervasive service. Along with the widespread enthusiasm on cloud computing, however, concerns on data security with cloud data storage are arising in terms of reliability and privacy which raise as the primary obstacles to the adoption of the cloud. To address these challenging issues, this dissertation explores the problem of secure and reliable data outsourcing in cloud computing. We focus on deploying the most fundamental data services, e.g., data management and data utilization, while considering reliability and privacy assurance. The first part of this dissertation discusses secure and reliable cloud data management to guarantee the data correctness and availability, given the difficulty that data are no longer locally possessed by data owners. We design a secure cloud storage service which addresses the reliability issue with near-optimal overall performance. By allowing a third party to perform the public integrity verification, data owners are significantly released from the onerous work of periodically checking data integrity. To completely free the data owner from the burden of being online after data outsourcing, we propose an exact repair solution so that no metadata needs to be generated on the fly for the repaired data. The second part presents our privacy-preserving data utilization solutions supporting two categories of semantics - keyword search and graph query. For protecting data privacy, sensitive data has to be encrypted before outsourcing, which obsoletes traditional data utilization based on plaintext keyword search. We define and solve the challenging problem of privacy-preserving multi- keyword ranked search over encrypted data in cloud computing. We establish a set of strict privacy requirements for such a secure cloud data utilization system to become a reality. We first propose a basic idea for keyword search based on secure inner product computation, and then give two improved schemes to achieve various stringent privacy requirements in two different threat models. We also investigate some further enhancements of our ranked search mechanism, including supporting more search semantics, i.e., TF × IDF, and dynamic data operations. As a general data structure to describe the relation between entities, the graph has been increasingly used to model complicated structures and schemaless data, such as the personal social network, the relational database, XML documents and chemical compounds. In the case that these data contains sensitive information and need to be encrypted before outsourcing to the cloud, it is a very challenging task to effectively utilize such graph-structured data after encryption. We define and solve the problem of privacy-preserving query over encrypted graph-structured data in cloud computing. By utilizing the principle of filtering-and-verification, we pre-build a feature-based index to provide feature-related information about each encrypted data graph, and then choose the efficient inner product as the pruning tool to carry out the filtering procedure."
|
6 |
Data availability and requirements for flood hazard mapping in South Africa. Els, Zelda.
Thesis (MSc)--Stellenbosch University, 2011. / ENGLISH ABSTRACT: Floods have been identified as one of the major natural hazards occurring in South Africa. A disaster risk assessment forms the first phase in planning for effective disaster risk management by identifying and assessing all hazards that occur within a geographical area, as required by the Disaster Management Act (Act No. 57 of 2002). The National Water Act (Act No. 36 of 1998) requires that flood lines be determined for areas where high-risk dams exist and where new town developments occur. However, very few flood hazard maps exist in South Africa for rural areas, and the data required for flood modelling analysis are very limited, particularly in rural areas. This study investigated whether flood hazard maps can be created using the existing data sources. A literature review of flood modelling methodologies, data requirements, and flood hazard mapping was carried out, and all available flood-related data sources in South Africa were assessed. The most appropriate data sources were identified and used to assess an evaluation site. By combining GIS and hydraulic modelling, results were obtained that indicate the likely extent, frequency, and depth of predicted flood events. The results indicate that hydraulic modelling can be performed using the existing data sources, but that not enough data are available for calibrating and validating the model. The limitations of the available data are discussed and recommendations for the collection of better data are provided.
|
7 |
Sécurité et disponibilité des données stockées dans les nuages / Data availability and security in cloud storage. Relaza, Théodore Jean Richard, 12 February 2016.
With the development of the Internet, computing has come to rely essentially on communications between servers, user workstations, networks, and data centers. In the early 2000s, two trends emerged: the provision of applications as services and the virtualization of infrastructure. The convergence of these two trends gave rise to a unifying concept, Cloud Computing. Data storage then appears as a central element of the problem of moving processes and resources into the cloud. Whether it is a simple outsourcing of storage for backup purposes, the use of hosted software services, or the virtualization of the company's computing infrastructure at a third-party provider, data security is crucial. This security has three dimensions: data availability, integrity, and confidentiality. The context of our work is storage virtualization dedicated to Cloud Computing. This work was carried out within the SVC (Secured Virtual Cloud) project, financed by the French National Fund for the Digital Society ("Investment for the Future" program). It led to the development of a storage virtualization middleware, named CloViS (Cloud Virtualized Storage), which is entering a valorization phase driven by SATT Toulouse-Tech-Transfer. CloViS is a data management middleware developed in the IRIT laboratory that allows the virtualization of heterogeneous, distributed storage resources with uniform and transparent access. A distinctive feature of CloViS is that it matches user needs with system availability through qualities of service defined on virtual volumes. Our contribution to this field concerns data distribution techniques that improve data availability and the reliability of I/O operations in CloViS. Indeed, faced with the explosion in data volumes, replication cannot be a sustainable solution; erasure-resilient codes or threshold schemes appear to be a valid alternative for controlling storage volumes. However, no data consistency protocol is, to date, adapted to these new distribution methods. We therefore propose data consistency protocols adapted to these different data distribution techniques, and we analyze them to highlight their respective advantages and disadvantages. Indeed, the choice of a data distribution technique and its associated data consistency protocol is based on performance criteria, notably read and write availability, the use of system resources (such as the storage space used), and the average number of messages exchanged during read and write operations.
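To make the replication-versus-coding tradeoff mentioned above concrete, the sketch below compares storage overhead and availability for simple n-way replication and an (n, k) erasure/threshold scheme, assuming independent node failures with an illustrative failure probability; the parameter values are examples, not figures from the thesis.

```python
from math import comb

def replication(copies, p_fail):
    """Data survives if at least one replica survives."""
    overhead = copies                      # stored bytes per original byte
    availability = 1 - p_fail ** copies
    return overhead, availability

def erasure_code(k, n, p_fail):
    """(n, k) code or threshold scheme: any k of the n fragments reconstruct the data."""
    overhead = n / k
    availability = sum(comb(n, i) * (1 - p_fail) ** i * p_fail ** (n - i)
                       for i in range(k, n + 1))
    return overhead, availability

p = 0.05  # assumed independent failure probability per storage node
print(replication(3, p))      # (3,   0.999875)
print(erasure_code(6, 9, p))  # (1.5, ~0.9994) -- half the storage for comparable availability
```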
|
8 |
MELQART : un système d'exécution de mashups avec disponibilité de données / MELQART: a mashup execution system with data availability. Othman Abdallah, Mohamad, 26 February 2014.
This thesis presents MELQART, a mashup execution system that ensures data availability. A mashup is a Web application that combines data from heterogeneous providers (Web services). These data are aggregated to form a homogeneous result displayed in components called mashlets. Previous work on mashups has mainly focused on how mashups operate, on construction tools, and on their use and interaction with users. In this thesis, we focus on data management in mashups, and more specifically on the availability and freshness of mashup data. Improving data availability takes into account the dynamic nature of mashup data. It ensures (1) access to the required data even if the provider is unavailable, (2) the freshness of these data, and (3) data sharing between mashups in order to avoid retrieving the same data multiple times. For this purpose, we have defined a formal mashup description model that allows the specification of data availability features. The mashup execution scheme is defined according to this model, with functionalities that improve the availability and freshness of mashed-up data; these functionalities are orthogonal to the mashup execution process. The MELQART system implements this approach and validates it by executing several mashup instances under unpredictable communication failures with data providers.
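A minimal sketch of the generic caching pattern behind points (1) and (2) above: keep a local copy of a provider's data, refresh it when it exceeds a freshness bound, and fall back to the last known copy when the provider cannot be reached. The class name, the freshness bound, and the error handling are illustrative assumptions, not MELQART's actual design.

```python
import time

class FreshnessCache:
    """Serve provider data from a local copy when the provider is down,
    and refresh the copy when it is older than a freshness bound."""

    def __init__(self, fetch, max_age_s=60.0):
        self.fetch = fetch          # callable that queries the data provider
        self.max_age_s = max_age_s  # assumed freshness requirement
        self.value = None
        self.fetched_at = None

    def get(self):
        stale = (self.fetched_at is None
                 or time.time() - self.fetched_at > self.max_age_s)
        if stale:
            try:
                self.value = self.fetch()
                self.fetched_at = time.time()
            except Exception:
                if self.value is None:   # no cached copy to fall back on
                    raise
        return self.value

# Example: wrap a flaky provider call; repeated get() calls reuse the cached copy.
cache = FreshnessCache(lambda: {"temperature": 21.5}, max_age_s=30.0)
print(cache.get())
```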
|
9 |
Increasing data availability in mobile ad-hoc networks: A community-centric and resource-aware replication approach / Vers une meilleure disponibilité des données dans les réseaux ad-hoc mobiles : Proposition d'une méthodologie de réplication fondée sur la notion de communauté d'intérêt et le contrôle des ressources. Torbey Takkouz, Zeina, 28 September 2012.
A Mobile Ad-hoc Network (MANET) is a self-configured, infrastructure-less network. It consists of autonomous mobile nodes that communicate over bandwidth-constrained wireless links. Nodes in a MANET are free to move randomly and organize themselves arbitrarily. They can join or quit the network in an unpredictable way; such rapid and untimely disconnections may cause network partitioning. In such cases, the network faces multiple difficulties, and one major problem is data availability. Data replication is a possible solution to increase data availability. However, implementing replication in a MANET is not a trivial task, for two major reasons: the environment is resource-constrained, and its dynamicity makes replication decisions very difficult. In this thesis, we propose a fully decentralized replication model for MANETs, called CReaM (Community-Centric and Resource-Aware Replication Model). It is designed to cause as little additional network traffic as possible. To preserve device resources, a monitoring mechanism is proposed: when the consumption of a resource exceeds a predefined threshold, replication is initiated with the goal of balancing the load caused by requests over other nodes. The data item to replicate is selected depending on the type of resource that triggered the replication process. The best data item to replicate in case of high CPU consumption is the one that can best alleviate the load of the node, i.e., a highly requested data item. Conversely, in case of low battery, rare data items are to be replicated (a data item is considered rare when it is tagged with a hot topic, i.e., a topic with a large community of interested users, but has not yet been disseminated to other nodes). To this end, we introduce a data item classification based on multiple criteria, e.g., data rarity, level of demand, and semantics of the content. To select the replica holder, we propose a lightweight solution for collecting information about the interests of participating users. Users interested in the same topic form a so-called community of interest. Through a tag analysis, a data item is assigned to one or more communities of interest. Based on this analysis of the social usage of the data, replicas are placed close to the centers of the communities of interest, i.e., on the nodes with the highest connectivity to the members of the community. The evaluation results show that CReaM meets its main objectives: it imposes a dramatically lower overhead than traditional periodic replication systems (less than 50% on average), while maintaining data availability at a level comparable to that of competing approaches.
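The resource-triggered selection logic described above can be sketched as follows; the threshold values, item attributes, and selection rules are illustrative assumptions rather than CReaM's actual policies.

```python
# Hypothetical thresholds for illustration; CReaM's values and inference
# rules are defined in the thesis.
THRESHOLDS = {"cpu": 0.80, "battery": 0.20, "storage": 0.90}

def replication_trigger(usage):
    """Return the resource whose consumption crosses its threshold, if any.
    `usage` maps resource name -> consumed fraction (battery -> remaining fraction)."""
    if usage["cpu"] > THRESHOLDS["cpu"]:
        return "cpu"
    if usage["battery"] < THRESHOLDS["battery"]:
        return "battery"
    if usage["storage"] > THRESHOLDS["storage"]:
        return "storage"
    return None

def pick_item(trigger, items):
    """Choose what to replicate: offload popular items under CPU pressure,
    push rare hot-topic items off the node when the battery is about to die."""
    if trigger == "cpu":
        return max(items, key=lambda it: it["requests"])
    if trigger == "battery":
        rare = [it for it in items if it["hot_topic"] and it["copies"] == 1]
        return rare[0] if rare else None
    return None

items = [{"id": "a", "requests": 120, "hot_topic": False, "copies": 3},
         {"id": "b", "requests": 5,   "hot_topic": True,  "copies": 1}]
trigger = replication_trigger({"cpu": 0.91, "battery": 0.65, "storage": 0.40})
print(trigger, pick_item(trigger, items))  # cpu {'id': 'a', ...}
```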
|
10 |
Secret sharing approaches for secure data warehousing and on-line analysis in the cloud / Approches de partage de clés secrètes pour la sécurisation des entrepôts de données et de l'analyse en ligne dans le nuage. Attasena, Varunya, 22 September 2015.
Cloud business intelligence is an increasingly popular solution for delivering decision support capabilities via elastic, pay-per-use resources. However, data security is one of the top concerns when dealing with sensitive data. Many security issues are raised by data storage in a public cloud, including data privacy, availability, integrity, backup and recovery, and safe data transfer. Moreover, security risks may come both from cloud service providers and from intruders, while cloud data warehouses must be both highly protected and effectively refreshed and analyzed through on-line analytical processing. Hence, users seek secure data warehouses at the lowest possible storage and access cost within the pay-as-you-go paradigm. In this thesis, we propose two novel approaches for securing cloud data warehouses: base-p verifiable secret sharing (bpVSS) and flexible verifiable secret sharing (fVSS). Secret sharing encrypts and distributes data over several cloud service providers, thus enforcing data privacy and availability. bpVSS and fVSS address five shortcomings of existing secret sharing-based approaches. First, they allow on-line analytical processing. Second, they enforce data integrity with the help of both inner and outer signatures. Third, they help users minimize the cost of cloud warehousing by limiting the global share volume; moreover, fVSS balances the load among service providers with respect to their pricing policies. Fourth, fVSS improves secret sharing security by imposing a new constraint: no group of cloud service providers can hold enough shares to reconstruct or break the secret. Fifth, fVSS allows refreshing the data warehouse even when some service providers fail. To evaluate the efficiency of bpVSS and fVSS, we theoretically study the factors that impact our approaches with respect to security, complexity, and monetary cost in the pay-as-you-go paradigm. We also validate the relevance of our approaches experimentally with the Star Schema Benchmark and demonstrate their superiority to related, existing methods.
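For readers unfamiliar with the underlying primitive, the sketch below shows classic Shamir (k, n) secret sharing, on which verifiable schemes such as bpVSS and fVSS build; the thesis's actual constructions add inner/outer signatures, base-p encoding, and provider-dependent share volumes, none of which are shown here.

```python
import random

PRIME = 2**31 - 1  # field size; must exceed any secret value

def split(secret, k, n):
    """Create n shares; any k of them reconstruct the secret."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, j, PRIME) for j, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]  # one share per provider

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split(123456789, k=3, n=5)          # distribute over 5 cloud providers
print(reconstruct(shares[:3]))                # any 3 shares recover 123456789
print(reconstruct(shares[:2]) == 123456789)   # fewer than k shares: almost surely False
```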
|