Global ETD Search

11	Technologie vysoké dostupnosti MS SQL Serveru / High availability Microsoft Sql Server Pyszko, Pavel January 2015 The thesis gives a complete theoretical overview of the high availability technologies in Microsoft SQL Server. For each technology, it provides deployment guidance, analyzes the technology from a security perspective, identifies the advantages and disadvantages of using it in practice, and determines its optimal use. The high availability technologies are compared with each other, and their availability in the individual versions of MS SQL Server is listed. The thesis contains three scenarios with practical examples of using high availability technologies. It also analyzes the high availability features in Oracle and compares them with the high availability technologies in MS SQL Server.
12	KTHFS – A HIGHLY AVAILABLE AND SCALABLE FILE SYSTEM D'Souza, Jude Clement January 2013 KTHFS is a highly available and scalable file system built from version 0.24 of the Hadoop Distributed File System. It provides a platform to overcome the limitations of existing distributed file systems, namely the scalability of the metadata server in terms of memory usage and throughput, and its availability. This document describes the KTHFS architecture and how it addresses these problems with a well-coordinated, distributed, stateless metadata server (in our case, Namenode) architecture backed by a persistence layer such as an NDB cluster. Its primary focus is high availability of the Namenode. It achieves scalability and recovery by persisting the metadata to an NDB cluster. All namenodes are connected to this NDB cluster and hence are aware of the state of the file system at any point in time. For high availability, KTHFS provides a multi-Namenode architecture. Since these namenodes are stateless and have a consistent view of the metadata, clients can issue requests to any of the namenodes. Hence, if one of these servers goes down, a client can retry its operation on the next available namenode. We then evaluate KTHFS in terms of its metadata capacity for medium and large clusters and the throughput and high availability of the Namenode, and analyze the underlying NDB cluster. Finally, we conclude with a few words on ongoing and future work in KTHFS. Namenode NDB cluster MySQL cluster KTHFS HDFS metadata High Availability Scalability throughput Engineering and Technology
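The client-side retry described in the KTHFS abstract can be sketched as follows. This is only an illustration of the idea, not KTHFS code: the namenodes are modeled as plain callables, and `NamenodeUnavailable`, `call_with_failover`, `down`, and `healthy` are hypothetical names introduced here.

```python
from typing import Callable, Optional, Sequence


class NamenodeUnavailable(Exception):
    """Raised by a (hypothetical) namenode stub that cannot serve a request."""


def call_with_failover(namenodes: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each stateless namenode in turn and return the first successful reply."""
    last_error: Optional[Exception] = None
    for namenode in namenodes:
        try:
            return namenode(request)
        except NamenodeUnavailable as err:
            last_error = err  # this namenode is down; move on to the next one
    raise NamenodeUnavailable("no namenode could serve the request") from last_error


def down(_request: str) -> str:
    """Stub for a namenode that has failed."""
    raise NamenodeUnavailable("namenode is down")


def healthy(request: str) -> str:
    """Stub for a namenode that answers every request."""
    return f"ok:{request}"
```

Because every namenode is assumed to be stateless and to see the same metadata, the retry needs no session hand-over: the same request is simply replayed against the next server.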
13	High Availability in Lifecycle Management of Cloud-Native Network Functions : A Near-Zero Downtime Database Version Change Prototype Zhang, Ziheng January 2023 Ensuring high system availability is a crucial goal for many organizations, such as Ericsson. In this context, databases play a significant role, as they are a fundamental element that affects system availability within today's complex technological environments. Mitigating downtime and maintaining high availability during database version changes are essential to ensure seamless continuity of business and system operations, such as data transactions, queries, and administrative tasks. In this project, we developed a prototype system to facilitate near-zero downtime during database version changes, thus preserving service availability and ensuring the process remains transparent to end users. Contrary to traditional database versioning approaches in the telecommunication industry, which require extensive downtime for data backup, validation, and migration, our system applies the established Blue-Green release strategy in a novel way. It uses the Logical Replication feature of PostgreSQL for data synchronization and automates the entire database version change operation for cloud-native deployments with the Kubernetes Operator Pattern, ensuring uninterrupted external access to the system during the version change process. This approach holds significant potential to improve database management practices, leading to enhanced system availability and reliability for applications deployed on cloud-native infrastructure. Kubernetes Kubernetes Operator Pattern Database High Availability Computer and Information Sciences
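The Blue-Green cutover condition described in this abstract can be illustrated with a small sketch. It is not the prototype's code: the real system relies on PostgreSQL logical replication and a Kubernetes operator, whereas `Database`, `Router`, `try_cutover`, and the change counter below are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Database:
    """Hypothetical stand-in for one database instance."""
    name: str
    version: str
    applied_changes: int = 0  # count of changes written or replicated so far


class Router:
    """Hypothetical stand-in for the endpoint that clients connect to."""

    def __init__(self, active: Database) -> None:
        self.active = active


def try_cutover(router: Router, blue: Database, green: Database) -> bool:
    """Switch traffic to green only once it has replayed every change from blue."""
    if green.applied_changes < blue.applied_changes:
        return False  # green is still catching up; keep serving from blue
    router.active = green
    return True
```

Clients keep talking to the router throughout, so the version change stays invisible to them as long as the switch happens only after the new instance has caught up.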
14	Resilire: Achieving High Availability Through Virtual Machine Live Migration Lu, Peng 16 October 2013 High availability is a critical feature of data centers, cloud, and cluster computing environments. Replication is a classical approach to increase service availability by providing redundancy. However, traditional replication methods are increasingly unattractive for deployment due to several limitations such as application-level non-transparency, non-isolation of applications (causing security vulnerabilities), complex system management, and high cost. Virtualization overcomes these limitations through another layer of abstraction, and provides high availability through virtual machine (VM) live migration: a guest VM image running on a primary host is transparently checkpointed and migrated, usually at a high frequency, to a backup host, without pausing the VM; the VM is resumed from the latest checkpoint on the backup when a failure occurs. A virtual cluster (VC) generalizes the VM concept for distributed applications and systems: a VC is a set of multiple VMs deployed on different physical machines connected by a virtual network. This dissertation presents a set of VM live migration techniques, their implementations in the Xen hypervisor and Linux operating system kernel, and experimental studies conducted using benchmarks (e.g., SPEC, NPB, Sysbench) and production applications (e.g., Apache webserver, SPECweb). We first present a technique for reducing VM migration downtimes called FGBI. FGBI reduces the dirty memory updates that must be migrated during each migration epoch by tracking memory at block granularity. Additionally, it determines memory blocks with identical content and shares them to reduce the increased memory overheads due to block-level tracking granularity, and uses a hybrid compression mechanism on the dirty blocks to reduce the migration traffic. 
We implement FGBI in the Xen hypervisor and conduct experimental studies, which reveal that the technique reduces the downtime by 77% and 45% over competitors including LLM and Remus, respectively, with a performance overhead of 13%. We then present a lightweight, globally consistent checkpointing mechanism for virtual clusters, called VPC, which checkpoints the VC for immediate restoration after (one or more) VM failures. VPC predicts the checkpoint-caused page faults during each checkpointing interval, in order to implement a lightweight checkpointing approach for the entire VC. Additionally, it uses a globally consistent checkpointing algorithm, which preserves the global consistency of the VMs' execution and communication states, and only saves the updated memory pages during each checkpointing interval. Our Xen-based implementation and experimental studies reveal that VPC reduces the solo VM downtime by as much as 45% and reduces the entire VC downtime by as much as 50% over competitors including VNsnap, with a memory overhead of 9% and performance overhead of 16%. The dissertation's third contribution is a VM resumption mechanism, called VMresume, which restores a VM from a (potentially large) checkpoint on slow-access storage in a fast and efficient way. VMresume predicts and preloads the memory pages that are most likely to be accessed after the VM's resumption, minimizing the performance degradation that cascading page faults could otherwise cause on VM resumption. Our experimental studies reveal that VM resumption time is reduced by an average of 57% and the VM's unusable time is reduced by 73.8% over native Xen's resumption mechanism. Traditional VM live migration mechanisms are based on hypervisors. However, hypervisors are increasingly becoming the source of several major security attacks and flaws. We present a mechanism called HSG-LM that does not involve the hypervisor during live migration. 
HSG-LM is implemented in the guest OS kernel so that the hypervisor is completely bypassed throughout the entire migration process. The mechanism exploits a hybrid strategy that reaps the benefits of both pre-copy and post-copy migration mechanisms, and uses a speculation mechanism that improves the efficiency of handling post-copy page faults. We modify the Linux kernel and develop a new page fault handler inside the guest OS to implement HSG-LM. Our experimental studies reveal that the technique reduces the downtime by as much as 55%, and reduces the total migration time by as much as 27% over competitors including Xen-based pre-copy, post-copy, and self-migration mechanisms. In a virtual cluster environment, one of the main challenges is to ensure equal utilization of all the available resources while avoiding overloading a subset of machines. We propose an efficient load balancing strategy using VM live migration, called DCbalance. Unlike previous work, DCbalance records the history of mappings to inform future placement decisions, and uses a workload-adaptive live migration algorithm to minimize VM downtime. We improve Xen's original live migration mechanism, implement the DCbalance technique, and conduct experimental studies. Our results reveal that DCbalance reduces the decision generating time by 79%, the downtime by 73%, and the total migration time by 38%, over competitors including the OSVD virtual machine load balancing mechanism and the DLB (Xen-based) dynamic load balancing algorithm. The dissertation's final contribution is a technique for VM live migration in Wide Area Networks (WANs), called FDM. In contrast to live migration in Local Area Networks (LANs), VM migration in WANs involves migrating disk data, besides memory state, because the source and the target machines do not share the same disk service. 
FDM is a fast and storage-adaptive migration mechanism that transmits both memory state and disk data with short downtime and total migration time. FDM uses page cache to identify data that is duplicated between memory and disk, so as to avoid transmitting the same data unnecessarily. We implement FDM in Xen, targeting different disk formats including raw and Qcow2. Our experimental studies reveal that FDM reduces the downtime by as much as 87%, and reduces the total migration time by as much as 58% over competitors including pre-copy or post-copy disk migration mechanisms and the disk migration mechanism implemented in BlobSeer, a widely used large-scale distributed storage service. / Ph. D. High Availability Virtual Machine Live Migration Checkpointing Load Balancing Downtime Xen Hypervisor
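To see why tracking memory at block granularity, as FGBI does, reduces the data migrated per epoch, consider the following minimal sketch. It compares two memory snapshots rather than intercepting writes as a hypervisor would, and the page size, block size, and function names are assumptions made only for illustration.

```python
from typing import List


def dirty_units(before: bytes, after: bytes, unit_size: int) -> List[int]:
    """Return the indices of the fixed-size units whose content changed."""
    if len(before) != len(after):
        raise ValueError("memory snapshots must have the same length")
    return [
        offset // unit_size
        for offset in range(0, len(before), unit_size)
        if before[offset:offset + unit_size] != after[offset:offset + unit_size]
    ]


def bytes_to_transfer(before: bytes, after: bytes, unit_size: int) -> int:
    """Bytes sent if every dirty unit is transferred whole.

    Assumes the snapshot length is a multiple of unit_size.
    """
    return len(dirty_units(before, after, unit_size)) * unit_size


PAGE_SIZE = 4096   # assumed page size
BLOCK_SIZE = 64    # assumed block size, chosen only for illustration

before = bytes(2 * PAGE_SIZE)      # two zero-filled pages
modified = bytearray(before)
modified[100] = 1                  # a single byte is written in the first page
after = bytes(modified)

page_cost = bytes_to_transfer(before, after, PAGE_SIZE)    # whole page is dirty
block_cost = bytes_to_transfer(before, after, BLOCK_SIZE)  # only one block is dirty
```

The finer granularity means more tracking state per page, which is the memory overhead the abstract says FGBI offsets by sharing identical blocks.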
15	A High-Availability Architecture for the Dynamic Domain Name System Filippi, Geoffrey George 09 June 2008 The Domain Name System (DNS) provides a mapping between host names and Internet Protocol (IP) addresses. Hosts that are configured using the Dynamic Host Configuration Protocol (DHCP) can have their assigned IP addresses updated in a Dynamic DNS (DDNS). DNS and DDNS are critical components of the Internet. Most applications use host names rather than IP addresses, allowing the underlying operating system (OS) to translate these host names to IP addresses on behalf of the application. When the DDNS service is unavailable, applications that use DNS cannot contact the hosts served by that DDNS server. Unfortunately, the current DDNS implementation cannot continue to operate when a master DNS server fails. Although a slave DNS server can continue to translate names to addresses, new IP addresses or changes to existing IP addresses cannot be added. Therefore, those new hosts cannot be reached through the DDNS. A new architecture is presented that eliminates this single point of failure. In this design, instead of storing resource records in a flat text file, all name servers connect to a Lightweight Directory Access Protocol (LDAP) directory to store and retrieve resource records. These directory servers replicate all resource records across each other using a multi-master replication mechanism. The DHCP servers can add records to any of the functioning DNS servers in the event of an outage. In this scheme, all DNS servers use anycast with the Border Gateway Protocol (BGP). This allows any of the DNS servers to answer queries sent to a single IP address. The DNS clients always use the same IP address to send queries. The routing system removes routes to non-functional name servers and delivers the request to the closest (according to network metrics) available DNS server. 
This thesis also describes a concrete implementation of this system that was created to demonstrate the viability of this solution. A reference implementation was built in a laboratory to represent an Internet Service Provider (ISP) with three identical regions. This implementation was built using Quagga as the BGP routing software running on a set of core routers and on each of the DNS servers. The Berkeley Internet Name Daemon (BIND) was used as an implementation of the DNS. The BIND Simplified Database Backend (SDB) interface was used to allow the DNS server to store and retrieve resource records in an LDAP directory. The Fedora Directory Server was used as a multi-master LDAP directory. DHCP service was provided by the Internet Systems Consortium's (ISC) DHCP server. The objectives for the design were high-availability, scalability and consistency. These properties were analyzed using the metrics of downtime during failover, replication overhead, and latency of replication. The downtime during failover was less than one second. The precision of this metric was limited by the synchronization provided by the Network Time Protocol (NTP) implementation used in the laboratory. The network traffic overhead for a three-way replication was shown to be only 3.5 times non-replicated network traffic. The latency of replication was also shown to be less than one second. The results show the viability of this approach and indicate that this solution should be usable over a wide area network, serving a large number of clients. / Master of Science DNS DDNS BGP anycast DHCP replication LDAP multi-master high-availability reliability
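The anycast behavior described in this abstract, where the routing system drops routes to failed name servers and delivers each query to the closest remaining one, can be sketched as a simple selection rule. This is only an illustration: `NameServer`, `route_query`, and the integer `metric` are hypothetical, and real BGP route selection involves far more than a single cost value.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NameServer:
    name: str
    metric: int      # routing cost from the client; lower means closer
    healthy: bool    # False once the route to this server has been withdrawn


def route_query(servers: Iterable[NameServer]) -> NameServer:
    """Pick the closest server that still has a route, as anycast routing would."""
    reachable = [server for server in servers if server.healthy]
    if not reachable:
        raise RuntimeError("no name server is reachable")
    return min(reachable, key=lambda server: server.metric)
```

The client never changes the address it queries; only the set of reachable servers behind that address changes.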
16	Optimizing recovery protocols for replicated database systems García Muñoz, Luis Hector 02 September 2013 Nowadays, information technology and computing systems have a great influence on daily life. Among current computer systems, distributed systems are some of the most important because of their scalability, fault tolerance, performance improvements, and high availability. Replicated systems are a special case of distributed systems. This Ph.D. thesis focuses on replicated databases because of their widespread use, which requires, among other properties, low response times, high throughput, load balancing among replicas, data consistency, data integrity, and fault tolerance. In this context, developing applications that use replicated databases raises difficulties that can be reduced by using lower-level support services such as group communication and membership services. The services provided by group communication systems (GCS) hide communication details and simplify the design of replication and recovery protocols. This thesis surveys the alternatives and strategies used in replication and recovery protocols for replicated databases. It also reviews concepts related to group communication systems and virtual synchrony. It characterizes and classifies database replication protocols according to their support for (and interaction with) recovery, focusing on protocols that rely on a GCS. Since current commercial DBMSs allow programmers and database administrators to sacrifice some consistency in order to improve performance, it is important to select the appropriate level of consistency. In replicated databases, consistency is closely related to the isolation level established among transactions. One of the main proposals of this thesis is a recovery protocol for a certification-based replication protocol. Certification-based database replication protocols provide a good basis for developing recovery protocols when the snapshot isolation level is used. At that level, readsets do not need to be transferred between replicas or checked in the certification phase. In addition, these protocols maintain a historical list of writesets that is used to certify transactions, and this list provides the information needed to transfer the state missed by the recovering replica. The thesis evaluates the performance of the basic recovery protocol, based on transferring the writeset list, and of an optimized version that compacts the information to be transferred. It presents the results of testing the recovery protocol implementation in the supporting middleware. The second proposal applies the compaction principle to a recovery protocol designed for weak-voting replication protocols. Its aim is to minimize the time needed to transfer and apply the writesets missed by the recovering replica, yielding a more efficient recovery protocol. The performance of this algorithm was verified through simulation using the OMNeT++ simulation framework. The simulation results show that this recovery protocol performs well in multiple scenarios. Finally, the correctness of both recovery protocols is justified in Chapter 5. / García Muñoz, LH. (2013). Optimizing recovery protocols for replicated database systems [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/31632 Distributed systems Replicated systems Replicated databases High availability Replication protocols Recovery protocols Computer Languages and Systems
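The two ideas in this abstract, certification against a writeset list under snapshot isolation and compaction of that list for recovery, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's protocol: writesets are modeled as dictionaries from item identifier to value, commit versions are integers, and `certify` and `compact` are hypothetical names.

```python
from typing import Dict, List, Tuple

Writeset = Dict[str, int]          # item identifier -> new value
LogEntry = Tuple[int, Writeset]    # (commit version, writeset)


def certify(log: List[LogEntry], start_version: int, writeset: Writeset) -> bool:
    """Pass certification if no writeset committed after the snapshot overlaps ours.

    Only writesets are compared; readsets are not needed under snapshot isolation.
    """
    return all(
        version <= start_version or not (committed.keys() & writeset.keys())
        for version, committed in log
    )


def compact(log: List[LogEntry], last_applied: int) -> Writeset:
    """Merge the writesets a recovering replica missed, keeping the latest value per item."""
    merged: Writeset = {}
    for version, committed in sorted(log, key=lambda entry: entry[0]):
        if version > last_applied:
            merged.update(committed)  # later versions overwrite earlier ones
    return merged
```

With compaction, an item updated many times while the replica was down is transferred and applied once, which is where the saving in recovery time would come from.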
17	Resource Allocation in Network Function Virtualization with Workload-Dependent Unavailability / 負荷依存の不可用性を伴うネットワーク機能仮想化における資源割り当て 朱, 梦菲 23 May 2024 Kyoto University / New-system course doctorate / Doctor of Informatics / Degree No. Kō 25511 / Jōhaku No. 884 / 新制||情||148 (University Library) / Department of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University / Examiners: Prof. Eiji Oki (chief examiner), Prof. Hiroshi Harada, Prof. Nobuo Yamashita / Conferred under Article 4, Paragraph 1 of the Degree Regulations / DFAM Resource Allocation Workload-Dependent Unavailability High availability Backup and recovery Service continuity
18	Dinaminis kompiuterinių sistemų infrastruktūros atnaujinimo modelis, pagrįstas atviro kodo sprendimais / Dynamic update model of computer systems infrastructure based on open source solutions Pachomov, Artiom 17 July 2014 This thesis analyzes the dynamic update of software at an organization with stagnant, outdated system infrastructure, making the most of free, open source solutions. A dynamic model is formed that includes centralized user (identity) management, high availability clustering, data replication, and data integrity solutions. The thesis also analyzes how to migrate services and how to build a supporting part of the IT infrastructure that optimizes system maintenance and performance. Informatics Data replication Cluster Uninterrupted service supply Service migration Clustering High availability
19	Live updates in High-availability (HA) clouds Sanagari, Vivek January 2018 Background. High availability (HA) is a cloud's ability to keep functioning after one or more hardware or software components fail. Its purpose is to minimize system downtime and data loss. Many service providers guarantee a Service Level Agreement that includes an uptime percentage for the computing service, calculated from the available time and the system downtime, excluding planned outages. The aim of the thesis is to update virtual machines running in the cloud without interrupting the user, by redirecting the resources and services running on them to an alternative virtual machine before the original VM is updated. Objectives. • The first objective is to investigate existing solutions for high availability and, if possible, adapt them to our aim. The alternative is to design our own solution. • The second objective is to implement the solution in an OpenStack environment. As an alternative, we can try a smaller-scale implementation under a virtualization platform such as VirtualBox. • The final objective is to run experiments to quantify the effectiveness of our solution in terms of overhead and degree of seamlessness to the users. Methods. An environment with multiple virtual machines may be created to represent multiple virtual servers in the cloud. The state of the service provided by the primary virtual machine is saved to persistent storage and the client is redirected to an alternate virtual machine. At that point the primary virtual machine may reboot for an update or for any other reason. Results. The mean CPU utilization on the server and the host in scenario 1 is 0.34% and 3.2% respectively. The mean CPU utilization on the primary server and the host in scenario 2 during the failover cycle is 2.0% and 9.7% respectively. 
The mean CPU utilization on the secondary server and the host in scenario 2 during the failover cycle is 0.99% and 8.0% respectively. The mean memory usage on the server in scenario 1 is 16%. The mean memory usage on the primary and secondary servers in scenario 2 during the failover cycle is 37% and 48% respectively. Failover in the high availability environment takes 6.8 seconds, and an off-line node takes 1.5 seconds to rejoin the cluster as on-line when told to. Network traffic, measured in kilobits per second, is 1.2 Kb/s on port 80 in scenario 2 and 1.4 Kb/s between the client and the server in scenario 1. In addition, data traffic is captured on ports 5405 (Pacemaker/Corosync, UDP), 2224 (Pcsd, TCP), and 7788 (DRBD, TCP). The traffic captured on these ports represents the network overhead due to HA. During the failover cycle, additional traffic of 45 Kb/s, 1.2 Kb/s, and 7.0 Kb/s flows on ports 5405, 2224, and 7788 respectively. Conclusions. Our experimental results show that handling live updates in the high availability environment costs approximately 1.1-1.7% more CPU in HA mode than on a stand-alone server. Memory utilization is around 21-32% higher for live updates on the HA system than for the standard server. The network traffic overhead induced by the ports used by the high availability environment (5405, 2224, 7788) is approximately 53 Kb/s, while the minimum overhead is approximately 16 Kb/s. The final and most important metric is the failover time, which indicates the seamlessness of the service, since the environment needs to provide uninterrupted service to the users. The failover time of the HA model is only about 6.8 seconds, leaving the environment highly available. 
However, the user may notice a slight interruption for requests made during this span. Cloud Cluster Corosync Distributed Replicated Block Device High Availability Pacemaker Virtual machine Web server Web Application Engineering and Technology
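The memory and network overhead figures in the conclusions of this abstract can be reproduced from the reported results. The sketch below assumes that the memory overhead is the difference, in percentage points, between each HA server and the stand-alone server, and that the network overhead is the sum of the extra traffic on the three HA ports; the variable names are introduced here for illustration.

```python
# Figures quoted in the abstract (mean values).
memory_standalone = 16   # % memory usage on the server, scenario 1
memory_primary = 37      # % memory usage on the primary server, scenario 2
memory_secondary = 48    # % memory usage on the secondary server, scenario 2

# Extra traffic during the failover cycle, in Kb/s, keyed by port.
extra_traffic_kbps = {5405: 45.0, 2224: 1.2, 7788: 7.0}

# Memory overhead as a difference in percentage points.
memory_overhead_low = memory_primary - memory_standalone
memory_overhead_high = memory_secondary - memory_standalone

# Network overhead as the sum over the HA ports, rounded to one decimal place.
total_ha_traffic_kbps = round(sum(extra_traffic_kbps.values()), 1)
```

Under these assumptions the differences are 21 and 32 percentage points and the summed traffic is 53.2 Kb/s, in line with the "21-32%" and "approximately 53" stated in the conclusions.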
20	Reaching High Availability in Connected Car Backend Applications Yadav, Arpit 08 September 2017 The connected car segment places high demands on the exchange of data between the car on the road and a variety of services in the backend. By the end of 2020, connected services will be mainstream automotive offerings. According to the Telefónica Connected Car Industry Report 2014, the overall number of vehicles with built-in internet connectivity will increase from 10% of the overall market today to 90% by the end of the decade [1]. Connected car solutions will soon become one of the major business drivers for the industry; they already have a significant impact on the development of existing solutions and on the aftersales market. It has been more than three decades since the first software component was introduced in cars, and since then a vast number of different services have been introduced, creating an ecosystem of complex applications, architectures, and platforms. The complexity of the connected car ecosystem results in a range of new challenges. The backend applications must be scalable and flexible enough to accommodate loads created by random user and device behavior. To deliver superior uptime, backend systems must be highly integrated and automated to guarantee the lowest possible failure rate, high availability, and the fastest time-to-market. Connected car services increasingly rely on cloud-based service delivery models to improve user experiences and enhance features for millions of vehicles and their users on a daily basis. Software applications are becoming more complex, and the number of components that are involved and interact with each other is extremely large. 
In such systems, a fault can easily propagate and affect other components, resulting in a complex problem that is difficult to detect and debug. A robust and resilient architecture is therefore needed that ensures the continuous availability of the system in the wake of component failures, making the overall system highly available. The goal of the thesis is to gain insight into the development of highly available applications and to explore the area of fault tolerance. The thesis outlines different design patterns, describes the capabilities of fault tolerance libraries for the Java platform, designs the most appropriate solution for developing a highly available application, and evaluates its behavior under stress and load testing using Chaos Monkey methodologies. Chaos Monkey Connected cars Cloud Connected Car High Availability Cloud Computing ddc:004 Computer science
