Global ETD Search

51	Data Transfer and Management through the IKAROS framework : Adopting an asynchronous non-blocking event driven approach to implement the Elastic-Transfer's IMAP client-server connection Gkikas, Nikolaos January 2015 Given the current state of input/output (I/O) and storage devices in petascale systems, incremental solutions would be ineffective when implemented in exascale environments. According to “The International Exascale Software Roadmap” by Dongarra et al., existing I/O architectures are not sufficiently scalable, especially because current shared file systems have limitations when used in large-scale environments. These limitations are: bandwidth does not scale economically to large-scale systems; I/O traffic on the high-speed network can affect and be affected by other unrelated jobs; and I/O traffic on the storage server can affect and be affected by other unrelated jobs. Future applications on exascale computers will require I/O bandwidth proportional to their computational capabilities. To avoid these limitations, C. Filippidis, C. Markou, and Y. Cotronis proposed the IKAROS framework. This thesis project expands the capabilities of the publicly available elastic-transfer (eT) module, which was derived directly from IKAROS. The eT module uses Google’s Gmail service as a utility for efficient meta-data management. Gmail is accessed through the Internet Message Access Protocol (IMAP), and the existing version of the eT framework implements the IMAP client-server connection through the “Inbox” module from the Node Package Manager (NPM) registry of Node.js. This module was used as a proof of concept, but in a production environment this implementation undermines the system’s scalability and allocates the system’s resources inefficiently when a large number of concurrent requests arrive at the eT’s meta-data server (MDS).
This thesis solves this problem by adopting an asynchronous, non-blocking, event-driven approach to implement the IMAP client-server connection. This was done by integrating the “Imap” module from the NPM repository and modifying it to suit the eT framework. Additionally, since the JavaScript Object Notation (JSON) format has become one of the most widespread data-interchange formats, eT’s meta-data scheme is modified so that the system’s meta-data can easily be parsed as JSON objects. This feature creates a framework with wider compatibility and interoperability with external systems. The operational behavior of the new module was evaluated through a set of data transfer experiments over a wide area network environment. These experiments were performed to ensure that the changes in the system’s architecture did not affect its performance. / Given the current state of input/output (I/O) and storage devices for petascale systems, incremental solutions would become ineffective if implemented in exascale environments. According to “The International Exascale Software Roadmap” by Dongarra et al., current I/O architectures are not sufficiently scalable, especially because current shared file systems have limitations when used in large-scale environments. These limitations are: bandwidth does not scale economically in large-scale systems; I/O traffic on the high-speed network can affect and be affected by other unrelated jobs; and I/O traffic on the storage server can affect and be affected by other unrelated jobs. Future applications on exascale computers will require I/O bandwidth proportional to their computational capacity. To avoid these limitations, C. Filippidis, C. Markou, and Y. Cotronis proposed the IKAROS framework. This thesis extends the functionality of the publicly available elastic-transfer (eT) module, which was developed from IKAROS.
The existing version of the eT framework implements Internet Message Access Protocol (IMAP) client-server communication through the “Inbox” module from the Node Package Manager (NPM) of Node.js. This module was used as a proof of concept, but in a real environment this implementation undermines the system’s scalability when a large number of hosts connect to the system. Each client individually requests information related to the system’s metadata from the IMAP server, which leads to an inefficient allocation of the system’s resources when a large number of hosts are connected to the eT framework at the same time. This thesis solves the problem by using an asynchronous, non-blocking, event-driven approach to implement an IMAP client-server connection. This is done by integrating the “Imap” module, taken from the NPM repository, and modifying it to suit the eT framework. Because the JavaScript Object Notation (JSON) format has become one of the most widespread data-interchange formats, eT’s metadata structure is also modified to make the system’s metadata easy to convert into JSON objects. This functionality gives broader compatibility and interoperability with external systems. The new module’s operational behavior was evaluated and tested through a series of data transfer experiments in a wide area network environment. These experiments were carried out to confirm that the changes in the system’s architecture did not affect its performance.
parallel file systems, distributed file systems, IKAROS file system, elastic-transfer, grid computing, storage systems, I/O limitations, exascale, low power consumption, low cost devices, synchronous blocking, asynchronous non-blocking, event-driven, JSON
Communication Systems
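As an illustration of the asynchronous, non-blocking, event-driven approach named in the abstract above, the following minimal Python sketch serves many concurrent metadata lookups from a single event loop and parses each record as a JSON object. It is not the thesis's Node.js code: the function names, the record fields, and the simulated latency are all assumptions made for the example.

```python
import asyncio
import json

# Hypothetical metadata record; the field names are illustrative only.
SAMPLE_RECORD = json.dumps(
    {"file": "run42.dat", "size": 1048576, "nodes": ["n1", "n2"]}
)


async def fetch_metadata(request_id: int, latency: float = 0.01) -> dict:
    """Simulate one metadata lookup without blocking the event loop."""
    # The sleep stands in for waiting on a remote server; while this
    # coroutine waits, the event loop is free to run other lookups.
    await asyncio.sleep(latency)
    record = json.loads(SAMPLE_RECORD)  # metadata parsed as a JSON object
    record["request_id"] = request_id
    return record


async def serve_concurrently(n_requests: int) -> list:
    """Start all lookups at once and collect the results in request order."""
    return await asyncio.gather(*(fetch_metadata(i) for i in range(n_requests)))


def run_demo(n_requests: int = 100) -> list:
    """Run the sketch from synchronous code."""
    return asyncio.run(serve_concurrently(n_requests))


if __name__ == "__main__":
    results = run_demo()
    print(len(results), results[0]["file"])
```

Because each lookup yields control while it waits, the whole batch should finish in about the time of a single lookup rather than the sum of all of them, and no thread per client is needed.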
52	Extensible Networked-storage Virtualization with Metadata Management at the Block Level Flouris, Michail D. 24 September 2009 Increased scaling costs and a lack of desired features are driving the evolution of high-performance storage systems from centralized architectures and specialized hardware to decentralized, commodity storage clusters. Existing systems try to address storage cost and management issues at the filesystem level. Besides dictating the use of a specific filesystem, however, this approach leads to increased complexity and load imbalance towards the file-server side, which in turn increase the cost of scaling. In this thesis, we examine these problems at the block level. This approach has several advantages, such as transparency, cost-efficiency, better resource utilization, simplicity and easier management. First, we explore the mechanisms, the merits, and the overheads associated with advanced metadata-intensive functionality at the block level by providing block-level versioning. We find that block-level versioning has low overhead and offers transparency and simplicity advantages over filesystem-based approaches. Second, we study the problem of providing the extensibility required by the diverse and changing needs of applications that may share a single storage system. We provide support for (i) adding desired functions as block-level extensions, and (ii) flexibly combining them to create modular I/O hierarchies. In this direction, we design, implement and evaluate Violin, an extensible block-level storage virtualization framework with support for metadata-intensive functions. Extending Violin, we build Orchestra, an extensible framework for cluster storage virtualization and scalable storage sharing at the block level. We show that Orchestra's enhanced block interface can substantially simplify the design of higher-level storage services, such as cluster filesystems, while remaining scalable.
Finally, we consider the problem of consistency and availability in decentralized commodity clusters. We propose RIBD, a novel storage system that handles both data and metadata consistency issues at the block layer. RIBD uses the notion of consistency intervals (CIs) to provide fine-grained consistency semantics on sequences of block-level operations by means of a lightweight transactional mechanism. RIBD relies on Orchestra's virtualization mechanisms and uses a roll-back recovery mechanism based on low-overhead block-level versioning. We evaluate RIBD on a cluster of 24 nodes and find that it performs comparably to two popular cluster filesystems, PVFS and GFS, while offering stronger consistency guarantees. Operating Systems, Computer Systems, Distributed Systems, File Systems, Cluster Storage, Block-level Versioning, Storage Virtualization, Commodity Clusters, Block-level Storage, Disk Storage, Shared Virtual Disk, Extensible Virtual Storage, Consistency, Availability 0984
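The abstract above combines block-level versioning with roll-back recovery over a sequence of block operations. The toy Python sketch below illustrates that general idea only; it is not the Violin, Orchestra, or RIBD implementation, and the class name, method names, and in-memory layout are assumptions invented for the example. Writes made since the last commit form one open interval that can be either committed or discarded as a unit.

```python
from typing import List, Optional, Tuple


class VersionedBlockDevice:
    """Toy in-memory block device that keeps old versions of each block."""

    def __init__(self, num_blocks: int, block_size: int = 4) -> None:
        self.block_size = block_size
        self.version = 0  # number of the last committed version
        # Per block: list of (version, data) pairs, oldest first.
        self._history: List[List[Tuple[int, bytes]]] = [
            [(0, bytes(block_size))] for _ in range(num_blocks)
        ]

    def write(self, block: int, data: bytes) -> None:
        """Record data under the next, not yet committed, version."""
        if len(data) != self.block_size:
            raise ValueError("data must fill exactly one block")
        pending = self.version + 1
        history = self._history[block]
        if history[-1][0] == pending:
            history[-1] = (pending, data)  # overwrite inside the open interval
        else:
            history.append((pending, data))  # older versions stay untouched

    def read(self, block: int, version: Optional[int] = None) -> bytes:
        """Return the newest data at or before version (default: latest write)."""
        limit = self.version + 1 if version is None else version
        for ver, data in reversed(self._history[block]):
            if ver <= limit:
                return data
        raise KeyError(f"no data for block {block} at version {version}")

    def commit(self) -> None:
        """Close the open interval, making its writes a permanent version."""
        self.version += 1

    def rollback(self) -> None:
        """Discard every write made since the last commit."""
        for history in self._history:
            while history[-1][0] > self.version:
                history.pop()
```

For example, after committing `b"aaaa"` to block 0 and then writing `b"bbbb"` without committing, calling `rollback()` makes `read(0)` return `b"aaaa"` again, while `read(0, version=0)` still returns the initial zero-filled block.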
54	Distribuerade datalagringssystem för tjänsteleverantörer : Undersökning av olika användningsfall för distribuerade datalagringssystem / Distributed Data Storage Systems for Service Providers : Investigation of different use cases for distributed data storage systems Ahmed, Tanvir Saif, Markovic, Bratislav January 2016 This thesis investigates three different use cases in data storage: Cold Storage, High Performance Storage and Virtual Machine Storage. The purpose of the report is to give an overview of commercial distributed file systems and a deeper investigation of distributed file systems based on open source, and thereby to find an optimal solution for these use cases. The investigation included analyzing and comparing previous works in which performance measurements, data protection and costs were compared, and highlighting various functionalities (snapshotting, multi-tenancy, data deduplication, data replication) that characterize modern distributed file systems. Both commercial and open distributed file systems were examined. A cost estimate for commercial and open distributed file systems was also made to determine the profitability of these two types of distributed file system. After the comparison and analysis of previous works, the open distributed file system Ceph proved to be well suited as a solution given the requirements set as goals for High Performance Storage and Virtual Machine Storage. The cost estimate showed that it was more profitable to implement an open distributed file system. This investigation can be used as guidance when choosing between different distributed file systems. / In this thesis, three different use cases within the field of data storage are studied: Cold Storage, High Performance Storage and Virtual Machine Storage.
The purpose of the survey is to give an overview of commercial distributed file systems and a deeper study of open source distributed file systems in order to find the optimal solution for these use cases. Within the study, previous works concerning performance, data protection and costs were analyzed and compared in order to identify the functionalities (snapshotting, multi-tenancy, data deduplication and data replication) that distinguish modern distributed file systems. Both commercial and open distributed file systems were examined. A cost estimate for commercial and open distributed file systems was made in order to determine the profitability of these two types of distributed file systems. After comparing and analyzing previous works, it was clear that the open source distributed file system Ceph was suitable as a solution in accordance with the objectives that were set for High Performance Storage and Virtual Machine Storage. The cost estimate showed that it was more profitable to implement an open distributed file system. This study can be used as guidance in choosing between different distributed file systems. Cold Storage, High Performance Storage, Virtual Machine Storage, use cases, snapshotting, multi-tenancy, data deduplication, data replication, distributed file systems Computer Systems, Computer Engineering
