  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Developing for Resilience: Introducing a Chaos Engineering tool

Monge Solano, Ignacio, Matók, Enikő January 2020
Software complexity continues to accelerate as new tools, frameworks, and technologies become available. This, in turn, increases its fragility and liability. Despite the amount of investment to test and harden their systems, companies still pay the price of failure. To withstand this fast-paced development environment and ensure software availability, large-scale systems must be built with resilience in mind. Chaos Engineering is a new practice that aims to address some of these challenges. In this thesis, the methodology, requirements, and iterations of the system design and architecture for a chaos engineering tool are presented. In a matter of only a couple of months and the working hours of two engineers, it was possible to build a tool that is able to shed light on the attributes that make the targeted system resilient as well as the weaknesses in its failure handling mechanisms. This tool greatly reduces the otherwise manual testing labor and allows software engineering teams to find potentially costly failures. These results prove the benefits that many companies could experience in their return on investment by adopting the practice of Chaos Engineering.
2

Minimizing Blast Radius of Chaos Engineering Experiments via Steady-State Metrics Forecasting / Minimera sprängradien för Chaos Engineering-experiment via prognoser för steady-state mätvärden

Navin Shetty, Dhruv January 2023
Chaos Engineering (CE) intentionally disrupts distributed systems by introducing faults into the system to better understand and improve their resilience. By studying these intentional disruptions, CE provides insights that help enhance system performance and the overall user experience. However, two main challenges exist: reducing the negative impact, or “blast radius”, of these CE experiments without diluting their value, and identifying a standardized set of metrics to monitor during such experiments. This research addresses these challenges by monitoring application and system-level metrics known as the Golden Signals, and a steady-state metric called the Apdex score, during a CE experiment. Using Pearson and Spearman correlation analyses alongside Granger causality tests, a strong connection between the Golden Signals and the Apdex score is identified. The study also introduces a new health-check system design that uses the Apdex score to automatically stop a CE experiment if a preset threshold is violated. Furthermore, the design introduces a method for early termination of the CE experiment based on forecasted Apdex scores. This method not only limits potential system damage but also reveals key system weaknesses, striking a balance between risk and discovery.
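As an illustration of the Apdex-based health check mentioned in the abstract above, here is a minimal sketch. It is not the thesis's implementation. The Apdex formula is the standard one (satisfied samples plus half of the tolerating samples, divided by all samples, with the tolerating band running from T to 4T). The function names, the latency values, and the interpretation of "threshold violated" as "Apdex drops below a minimum" are assumptions made for this example.

```python
from typing import Iterable


def apdex(latencies_s: Iterable[float], threshold_s: float) -> float:
    """Standard Apdex score for response times given in seconds.

    Samples at or below T are 'satisfied', samples in (T, 4T] are
    'tolerating', and everything slower is 'frustrated'.
    """
    samples = list(latencies_s)
    if not samples:
        raise ValueError("at least one latency sample is required")
    satisfied = sum(1 for t in samples if t <= threshold_s)
    tolerating = sum(1 for t in samples if threshold_s < t <= 4 * threshold_s)
    return (satisfied + tolerating / 2) / len(samples)


def should_abort(latencies_s: Iterable[float], threshold_s: float,
                 min_apdex: float) -> bool:
    """Hypothetical health check: abort when Apdex falls below min_apdex.

    Assumption: a 'violated threshold' means the score dropped under a
    preset minimum; the abstract does not specify the exact rule.
    """
    return apdex(latencies_s, threshold_s) < min_apdex


if __name__ == "__main__":
    # Made-up latencies observed during a fault-injection window.
    window = [0.1, 0.2, 0.6, 3.0]
    print(apdex(window, threshold_s=0.5))
    print(should_abort(window, threshold_s=0.5, min_apdex=0.8))
```

A forecasting-based variant, as the abstract describes, would apply the same check to predicted rather than observed Apdex values. The forecasting model itself is outside the scope of this sketch.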
