1 |
Integritetsstudie av Ceph : En mjukvarubaserad lagringsplattform / Integrity Study of Ceph: A Software-Based Storage Platform
Lundberg, Sebastian, Sönnerfors, Peter January 2015
Modern organizations typically operate a costly IT infrastructure that must provide high performance and security to keep daily operations running. Software-based storage platforms are one way to reduce these costs: by imitating advanced technology that already exists in traditional storage platforms, the same functionality can be implemented as software on commodity hardware. Ceph is one such platform that is now being deployed and aims to offer this choice. We consider software-based storage solutions to be unproven, so we conducted a series of predefined tests to examine whether Ceph can guarantee data integrity. We observed that Ceph failed to maintain data integrity after completing its recovery process when the storage cluster was under high utilization.
|
2 |
Evaluation of Storage Systems for Big Data Analytics
January 2017
abstract: Recent trends in big data storage systems show a shift from disk-centric models to memory-centric models. The primary challenges faced by these systems are speed, scalability, and fault tolerance. It is therefore of interest to investigate how these two models perform for big data applications. This thesis studies the performance of Ceph (a disk-centric model) and Alluxio (a memory-centric model) and evaluates whether a hybrid model provides any performance benefits for big data applications. To this end, an application, TechTalk, is created that uses Ceph to store data and Alluxio to perform data analytics. The functionalities of the application include offline lecture storage, live recording of classes, content analysis, and reference generation. The knowledge base of videos is constructed by analyzing the offline data using machine learning techniques. This training dataset provides the knowledge needed to construct the index of an online stream. The indexed metadata enables students to search, view, and access the relevant content. The performance of the application is benchmarked in different use cases to demonstrate the benefits of the hybrid model. / Dissertation/Thesis / Masters Thesis Computer Science 2017
|
3 |
Scalable Data Management for Object-based Storage Systems
Wadhwa, Bharti 19 August 2020
Parallel I/O performance is crucial to sustain scientific applications on large-scale High-Performance Computing (HPC) systems. Large-scale distributed storage systems, in particular object-based storage systems, face severe challenges in managing data efficiently. Inefficient data management leads to poor I/O and storage performance in HPC applications and scientific workflows. Some of the main challenges for efficient data management arise from poor resource allocation, load imbalance in object storage targets, and inflexible data sharing between applications in a workflow. In addition, parallel I/O makes it challenging to shoehorn in new interfaces, such as taking advantage of multiple layers of storage and support for analysis in the data path. Solving these challenges to improve the performance and efficiency of object-based storage systems is crucial, especially for the upcoming era of exascale systems.
This dissertation is focused on solving these major challenges in object-based storage systems by providing scalable data management strategies. In the first part of the dissertation (Chapter 3), we present a resource-contention-aware load balancing tool (iez) for large-scale distributed object-based storage systems. In Chapter 4, we extend iez to support Progressive File Layout in the object-based storage system Lustre. In the second part (Chapter 5), we present a technique to facilitate data sharing in scientific workflows using object-based storage, with our proposed tool Workflow Data Communicator. In the last part of this dissertation, we present a solution for transparent data management in the multi-layer storage hierarchy of present and next-generation HPC systems. This dissertation shows that by intelligently employing scalable data management techniques, the flexibility and performance of scientific applications and workflows in object-based storage systems can be enhanced manyfold. Our proposed data management strategies can guide the software design of next-generation HPC storage systems to efficiently support data for scientific applications and workflows. / Doctor of Philosophy / Large-scale object-based storage systems face severe challenges in managing data efficiently for HPC applications and workflows. These storage systems often manage and share data inflexibly, without considering the load imbalance and resource contention in the underlying multi-layer storage hierarchy. This dissertation first studies how resource contention and inflexible data sharing mechanisms impact the storage and I/O performance of HPC applications, and then presents a series of techniques, tools, and algorithms to provide efficient and scalable data management for current and next-generation HPC storage systems.
|
4 |
On Codes for Private Information Retrieval and Ceph Implementation of a High-Rate Regenerating Code
Vinayak, R January 2017
Error-control codes, which are used extensively in communication systems, have also proven very useful in data storage during the past decade. This thesis deals with two types of codes for data storage, one pertaining to privacy and the other to reliability.
In many scenarios, a user accessing critical data from a server would not want the server to learn the identity of the data retrieved. This problem, called Private Information Retrieval (PIR), was first formally introduced by Chor et al., who gave protocols for PIR in the case where multiple copies of the same data are stored on non-communicating servers. The PIR protocols that came later also followed this replication model. The problem with data replication is the high storage overhead involved, which leads to large storage costs. Later, Fazeli, Vardy and Yaakobi introduced the notion of a PIR code, which enables information-theoretic PIR with low storage overhead. In the first part of this thesis, constructions of PIR codes for certain parameter values are presented. These constructions are based on a variant of conventional Reed-Muller (RM) codes called binary Projective Reed-Muller (PRM) codes. A lower bound on the block length of systematic PIR codes is derived, and the PRM-based PIR codes are shown to be optimal with respect to this bound in some special cases. The codes constructed here have smaller block lengths than the short-block-length PIR codes known in the literature. The generalized Hamming weights of binary PRM codes are also studied.
The other work described here is the implementation and evaluation of an erasure code called the Coupled Layer (CL) code in the Ceph distributed storage system. Erasure codes are used in distributed storage to ensure reliability. An additional desirable feature of codes used in this setting is the ability to handle node repair efficiently. The Minimum Storage Regenerating (MSR) version of the CL code downloads the optimal amount of data from other nodes during repair of a failed node, and even the disk reads during this process are optimal for that storage overhead. The CL-Near-MSR code, a variant of CL-MSR, can also efficiently handle a restricted set of multiple node failures. Four example CL codes were evaluated on a 26-node Amazon cluster, and performance metrics such as network bandwidth, disk reads, and repair time were measured. A repair-time reduction of the order of 3 was observed for one of those codes in comparison with a Reed-Solomon code having the same parameters. To the best of our knowledge, such large gains in repair performance have never been demonstrated before.
|
5 |
Разработка инфраструктуры и серверного приложения для проекта «Мониторинг IT-конференций» : магистерская диссертация / Development of infrastructure and server application for the project "Monitoring IT conferences"
Сухарев, Н. В., Sukharev, N. V. January 2021
The purpose of this work is to develop the server side of the application and the infrastructure components for the project "Monitoring IT conferences". Research methods: analysis, comparison, systematization, and generalization of data on existing and newly developed infrastructure components, and trial of modern approaches to building infrastructure architecture. As a result, two virtual machines were configured to run Kubernetes and Gitlab Runner; persistent storage components were set up for PostgreSQL, RabbitMQ, and an S3 storage based on Rook Ceph; a Django-based application was created to provide an API to the client application; and a Gitlab CI configuration was written that builds the application image and deploys it to Kubernetes.
The resulting application provides content management functionality for service administrators (uploading videos to the S3 storage, labeling them with a tag system, linking conferences to speakers) and an HTTP API for the client application that supports registration, authentication via JWT tokens, hierarchical search over the tag system, and the issuing of signed links to the S3 storage for watching videos.
|
6 |
Korrelationen zwischen kephalometrischen Werten und dem Knochenangebot intraoraler Spenderregionen für präimplantologische Knochenaugmentationen / Correlations between cephalometric values and bone volumes of intraoral harvest sites for pre-implantation bone grafts
Sevinc, Tayhan 30 March 2021
No description available.
|
7 |
Différenciation génétique des populations humaines pour les gènes de la réponse aux médicaments / Genetic Differentiation of Human Populations for Genes Involved in Drug Response
Patillon, Blandine 16 July 2014
Response to drug treatment can vary widely between individuals, both in therapeutic effect (efficacy) and in adverse reactions (toxicity). Genetic factors affecting drug pharmacodynamics and pharmacokinetics play a major role in this inter-individual variability. Some of these factors are heterogeneously distributed among human populations. Local adaptation of populations to their environment partly explains these differences. Indeed, during human evolution, populations had to cope with changes in their chemical environment that triggered selective pressures on genes involved in xenobiotic response. Those genes are the same ones that influence drug response today.
The tremendous recent advances in genotyping and sequencing technologies now provide access to genome-wide patterns of genetic variation in a growing number of human populations, facilitating our understanding of the genetic mechanisms underlying complex traits such as drug response. Population genetic tools allow the identification of variants showing an unusual pattern of genetic differentiation among human populations and the determination of the role played by natural selection in shaping the atypical patterns observed.
In this thesis, we applied these tools to both SNP-chip genotyping data and Next Generation Sequencing data to analyze the genetic differentiation patterns of human populations for genes involved in drug response. We show that a nearly complete selective sweep in East Asia in the genomic region of the VKORC1 gene is responsible for a heterogeneous distribution of the VKORC1 functional variant and can explain the inter-population genetic differences in response to oral anti-vitamin K anticoagulants. Extending the analysis to all major pharmacogenes, we identified new variants of potential relevance to pharmacogenetics that could explain inter-population and inter-individual differences in drug response. Finally, through a comprehensive analysis of the NAT2 gene, we provide evidence of a homogenizing selection process targeting a functional variant associated with a very slow acetylation phenotype. These results emphasize the crucial role of natural selection in inter-population and inter-individual variability in drug response. They also illustrate the relevance of population genetics studies for a better understanding of the genetic component underlying drug response and complex traits.
|
8 |
Différenciation génétique des populations humaines pour les gènes de la réponse aux médicaments / Genetic Differentiation of Human Populations for Genes Involved in Drug Response
Patillon, Blandine 16 July 2014
Individuals do not all respond in the same way to the same drug treatment, either pharmacologically (efficacy) or toxicologically (adverse effects). Genetic factors affecting the pharmacokinetics and pharmacodynamics of drugs play a decisive role in this inter-individual variability in response. Some of these factors are distributed heterogeneously among human populations. These differences are explained in part by local adaptation of populations to their environment. Over the course of their history, humans have had to face changes in their chemical environment, which led to natural selection pressures on the genes involved in the body's response to xenobiotics. These same genes influence drug response today.
The remarkable acceleration of progress in genetics now gives access to the genetic variability of human populations across the whole genome, facilitating the discovery and understanding of the genetic mechanisms underlying complex traits such as drug response. In particular, the tools of population genetics make it possible to identify variants showing an unusual level of genetic differentiation between human populations and to determine to what extent natural selection played a role in the atypical patterns observed.
In this thesis, we applied these tools to genotyping and sequencing data to analyze the genetic differentiation patterns of human populations for drug-response genes. We thereby demonstrated that recent positive selection in East Asia in the genomic region of the VKORC1 gene was responsible for a heterogeneous distribution of the functional variant of VKORC1, which underlies the differences between human populations in genetic sensitivity to oral anticoagulants of the anti-vitamin K type. Then, by extending our analysis to all major pharmacogenes, we identified new variants of potential interest in pharmacogenetics for explaining differences in drug response between human populations and between individuals. Finally, an in-depth study of the NAT2 gene allowed us to reveal a process of homogenizing selection targeting a functional variant associated with a very slow acetylation phenotype. These results highlight the decisive influence of natural selection on the variability of drug response between populations and individuals. They show the contribution of population genetics to a better understanding of the genetic component of drug response and of complex traits.
|