Return to search

KTHFS – A HIGHLY AVAILABLE ANDSCALABLE FILE SYSTEM

KTHFS is a highly available and scalable file system built from the version 0.24 of the Hadoop Distributed File system. It provides a platform to overcome the limitations of existing distributed file systems. These limitations include scalability of metadata server in terms of memory usage, throughput and its availability. This document describes KTHFS architecture and how it addresses these problems by providing a well coordinated distributed stateless metadata server (or in our case, Namenode) architecture. This is backed with the help of a persistence layer such as NDB cluster. Its primary focus is towards High Availability of the Namenode. It achieves scalability and recovery by persisting the metadata to an NDB cluster. All namenodes are connected to this NDB cluster and hence are aware of the state of the file system at any point in time. In terms of High Availability, KTHFS provides Multi-Namenode architecture. Since these namenodes are stateless and have a consistent view of the metadata, clients can issue requests on any of the namenodes. Hence, if one of these servers goes down, clients can retry its operation on the next available namenode. We next discuss the evaluation of KTHFS in terms of its metadata capacity for medium and large size clusters, throughput and high availability of the Namenode and an analysis of the underlying NDBcluster. Finally, we conclude this document with a few words on the ongoing and future work in KTHFS.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-117918
Date January 2013
CreatorsD'Souza, Jude Clement
PublisherKTH, Skolan för informations- och kommunikationsteknik (ICT)
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess
RelationTrita-ICT-EX ; 2013:30

Page generated in 0.0022 seconds