1

Autoscaling through Self-Adaptation Approach in Cloud Infrastructure. A Hybrid Elasticity Management Framework Based Upon MAPE (Monitoring-Analysis-Planning-Execution) Loop, to Ensure Desired Service Level Objectives (SLOs)

Butt, Sarfraz S. January 2019 (has links)
The project aims to propose a MAPE-based hybrid elasticity management framework on the basis of insights accrued during a systematic analysis of the relevant literature. Each stage of the MAPE process acts independently as a black box in the proposed framework while interacting with its neighbouring stages. Because the framework is modular, the underlying algorithms in any stage can be replaced with more suitable ones without affecting the other stages. The hybrid framework enables proactive and reactive autoscaling approaches to be implemented simultaneously within the same system. The proactive approach provides the core decision-making logic and operates on forecast data, while the reactive approach, based on actual data, acts as a damage-control measure that is activated only when the proactive approach runs into problems. Thus, the benefits of both worlds, pre-emption as well as reliability, can be achieved through the proposed framework. It uses time series analysis (moving average method / exponential smoothing) and threshold-based static rules (with multiple monitoring intervals and dual threshold settings) during the analysis and planning phases of the MAPE loop, respectively. The mathematical illustration of the framework incorporates multiple parameters, namely VM initiation delay / release criterion, network latency, system oscillations, threshold values, smart kill, etc. The research concludes that the recommended parameter settings depend primarily on the chosen autoscaling objective and are often conflicting in nature. Thus, no single autoscaling system with one set of values can meet all objectives simultaneously, irrespective of the reliability of the underlying framework. The project successfully implements a complete cloud infrastructure and autoscaling environment on the experimental platforms, i.e., OpenStack and CloudSim Plus. In a nutshell, the research provides a solid understanding of the autoscaling phenomenon, devises a MAPE-based hybrid elasticity management framework and explores its implementation potential on OpenStack and CloudSim Plus.
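For illustration, a minimal sketch of how such a hybrid proactive/reactive planning step could be wired together, assuming a simple exponential-smoothing forecaster and invented capacity and threshold values (all names and numbers here are ours, not taken from the thesis):

```python
# Hybrid proactive/reactive scaling step in the spirit of a MAPE loop:
# exponential smoothing supplies the forecast (proactive path), while a
# dual-threshold rule on the observed load acts as the reactive fallback.
import math

def exp_smooth(history, alpha=0.5):
    """One-step-ahead forecast via simple exponential smoothing."""
    forecast = history[0]
    for x in history[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast

def plan_vms(history, current_vms, capacity_per_vm=100, upper=0.8, lower=0.3):
    observed = history[-1]
    predicted = exp_smooth(history)

    # Proactive decision: size the pool for the forecast load.
    desired = max(1, math.ceil(predicted / (capacity_per_vm * upper)))

    # Reactive damage control: override if actual utilisation on the current
    # pool already breaches one of the dual thresholds.
    utilisation = observed / (current_vms * capacity_per_vm)
    if utilisation > upper:
        desired = max(desired, current_vms + 1)
    elif utilisation < lower:
        desired = min(desired, current_vms - 1)
    return max(1, desired)

if __name__ == "__main__":
    load_history = [40, 55, 70, 90, 120, 160]   # requests/s, synthetic
    print(plan_vms(load_history, current_vms=2))
```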
2

Workload characterization, controller design and performance evaluation for cloud capacity autoscaling

Ali-Eldin Hassan, Ahmed January 2015 (has links)
This thesis studies cloud capacity auto-scaling, or how to provision and release resources to a service running in the cloud based on its actual demand, using an automatic controller. As the performance of server systems depends on the system design, the system implementation, and the workloads the system is subjected to, we focus on these aspects with respect to designing auto-scaling algorithms. Towards this goal, we design and implement two auto-scaling algorithms for cloud infrastructures. The algorithms predict the future load for an application running in the cloud. We discuss the different approaches to designing an auto-scaler that combines reactive and proactive control methods and is able to handle long-running requests, e.g., tasks running for longer than the actuation interval, in a cloud. We compare the performance of our algorithms with state-of-the-art auto-scalers and evaluate the controllers' performance with a set of workloads. As any controller is designed with assumptions on the operating conditions and system dynamics, the performance of an auto-scaler varies with different workloads.

In order to better understand workload dynamics and evolution, we analyze a six-year-long workload trace of the sixth most popular Internet website. In addition, we analyze a workload from one of the largest Video-on-Demand streaming services in Sweden. We discuss the popularity of objects served by the two services, the spikes in the two workloads, and the invariants in the workloads. We also introduce a measure for the disorder in a workload, i.e., the amount of burstiness. The measure is based on Sample Entropy, an empirical statistic used in biomedical signal processing to characterize biomedical signals. The introduced measure can be used to characterize workloads based on their burstiness profiles. We compare our measure with the literature on quantifying burstiness in a server workload, and show its advantages.

To better understand the tradeoffs between using different auto-scalers with different workloads, we design a framework to compare auto-scalers and give probabilistic guarantees on the performance in worst-case scenarios. Using different evaluation criteria and more than 700 workload traces, we compare six state-of-the-art auto-scalers that we believe represent the development of the field in the past 8 years. Knowing that the auto-scalers' performance depends on the workloads, we design a workload analysis and classification tool that assigns a workload to its most suitable elasticity controller out of a set of implemented controllers. The tool has two main components: an analyzer and a classifier. The analyzer analyzes a workload and feeds the analysis results to the classifier. The classifier assigns a workload to the most suitable elasticity controller based on the workload characteristics and a set of predefined business level objectives. The tool is evaluated with a set of collected real workloads and a set of generated synthetic workloads. Our evaluation results show that the tool can help a cloud provider improve the QoS provided to customers.
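Sample Entropy itself is a standard statistic; for reference, a compact implementation applied to synthetic traces (not the traces studied in the thesis) could look like this:

```python
import numpy as np

def sample_entropy(series, m=2, r=None):
    """Sample Entropy of a 1-D series: -ln(A/B), where B counts pairs of
    length-m templates within tolerance r and A counts the same for m+1."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * x.std()

    def count_similar(length):
        # Both template lengths use the same n - m starting positions,
        # matching the usual definition of Sample Entropy.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates)):
            # Chebyshev distance between template i and every template.
            dist = np.max(np.abs(templates - templates[i]), axis=1)
            count += np.sum(dist <= r) - 1     # exclude the self-match
        return count

    b = count_similar(m)
    a = count_similar(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")

# A bursty trace should score higher (more disorder) than a smooth one.
smooth = np.sin(np.linspace(0, 8 * np.pi, 400)) * 50 + 100
bursty = smooth + np.random.default_rng(0).normal(0, 25, size=400)
print(sample_entropy(smooth), sample_entropy(bursty))
```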
3

Comparison of Auto-Scaling Policies Using Docker Swarm / Jämförelse av autoskalningspolicies med hjälp av Docker Swarm

Adolfsson, Henrik January 2019 (has links)
When deploying software applications in the cloud, two similar software components are used: Virtual Machines and Containers. In recent years containers have seen an increase in popularity and usage, in part because of tools such as Docker and Kubernetes. Virtual Machines (VMs) have also seen an increase in usage as more companies move to solutions in the cloud with services like Amazon Web Services, Google Compute Engine, Microsoft Azure and DigitalOcean. There are also solutions that use auto-scaling, a technique where VMs are commissioned and deployed as load increases in order to increase application performance. As the application load decreases, VMs are decommissioned to reduce costs. In this thesis we implement and evaluate auto-scaling policies that use both Virtual Machines and Containers. We compare four different policies, including two baseline policies. For the non-baseline policies we define a policy that uses a single Container for every Virtual Machine and a policy that uses several Containers per Virtual Machine. To compare the policies we deploy an image-serving application and run workloads to test them. We find that the choice of deployment strategy and policy matters for response time and error rate. We also find that deploying applications as described in the method is estimated to take roughly 2 to 3 minutes.
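As a purely illustrative sketch of how the two non-baseline policy shapes translate container demand into VM counts (the capacity figures below are invented, not the thesis's measurements):

```python
import math

def target_replicas(load_rps, capacity_per_container=50):
    """Containers needed to serve the observed request rate."""
    return max(1, math.ceil(load_rps / capacity_per_container))

def vms_needed(replicas, containers_per_vm):
    """VMs to keep commissioned under a given packing policy."""
    return math.ceil(replicas / containers_per_vm)

for load in (40, 180, 400):
    replicas = target_replicas(load)
    print(load,
          vms_needed(replicas, containers_per_vm=1),   # one container per VM
          vms_needed(replicas, containers_per_vm=4))   # several containers per VM
```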
4

Optimized Autoscaling of Cloud Native Applications

Åsberg, Niklas January 2021 (has links)
Software containers are changing the way distributed applications are executed and managed on cloud computing resources. Autoscaling allows containerized applications and services to run resiliently with high availability without the need for user intervention. However, specifying an auto-scaling policy that can guarantee that no performance violations will take place is an extremely hard task, and doomed to fail unless considerable care is taken. Existing autoscaling solutions try to solve this problem but fail to consider application-specific parameters when doing so, thus causing poor resource utilization and/or unsatisfactory quality of service in certain dynamic workload scenarios.

This thesis proposes an autoscaling solution that enables cloud native applications to autoscale based on application-specific parameters. The proposed solution consists of a profiling strategy that detects key parameters affecting the performance of autoscaling, and an autoscaling algorithm that automatically enforces autoscaling decisions based on the parameters derived from the profiling strategy.

The proposed solution is compared and evaluated against the default auto-scaling feature in Kubernetes during different realistic user scenarios. Results from the testing scenarios indicate that the proposed solution, which uses application-specific parameters, outperforms the default autoscaling feature of Kubernetes in resource utilization while keeping SLO violations at a minimum.
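For context, the default Kubernetes Horizontal Pod Autoscaler that such solutions are typically compared against follows a simple, documented ratio rule; a sketch of that rule (the function name and replica bounds are ours, the 0.1 tolerance is the Kubernetes default):

```python
import math

def hpa_desired_replicas(current_replicas, current_metric, target_metric,
                         tolerance=0.1, min_replicas=1, max_replicas=10):
    """Kubernetes' documented HPA rule:
    desired = ceil(current * currentMetric / targetMetric),
    with a tolerance band to avoid scaling on small deviations."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas                  # within tolerance: no change
    desired = math.ceil(current_replicas * ratio)
    return min(max(desired, min_replicas), max_replicas)

# E.g. 3 pods at 90% average CPU against a 60% target -> scale to 5 pods.
print(hpa_desired_replicas(3, current_metric=90, target_metric=60))
```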
5

Predictive Autoscaling of Systems using Artificial Neural Networks

Lundström, Christoffer, Heiding, Camilla January 2021 (has links)
Autoscalers handle the scaling of instances in a system automatically based on specified thresholds such as CPU utilization. Reactive autoscalers do not take the delay of initiating a new instance into account, which may lead to overutilization. By applying machine learning methodology to predict future loads and the desired number of instances, it is possible to preemptively initiate scaling such that new instances are available before demand occurs. Leveraging efficient scaling policies keeps the costs and energy consumption low while ensuring the availability of the system. In this thesis, the predictive capability of different multilayer perceptron configurations is investigated to elicit a suitable model for a telecom support system. The results indicate that it is possible to accurately predict future load using a multilayer perceptron regressor model. However, the possibility of reproducing the results in a live environment is questioned as the dataset used is derived from a simulation.
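As an illustration of the general approach (not the thesis's model or data, which come from a telecom support system simulation), a multilayer perceptron regressor can be trained on a sliding window of past load and its forecast mapped to an instance count:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Synthetic load trace: a daily-looking cycle plus noise, standing in for
# the (non-public) telecom workload.
t = np.arange(2000)
load = 100 + 60 * np.sin(2 * np.pi * t / 288) + rng.normal(0, 5, t.size)

window = 12                                  # past samples used as features
X = np.array([load[i:i + window] for i in range(len(load) - window)])
y = load[window:]                            # load one step ahead

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(X[:-200], y[:-200])                # hold out the tail for evaluation

predicted = model.predict(X[-200:])
capacity_per_instance = 40                   # illustrative capacity assumption
instances = np.ceil(predicted / capacity_per_instance).astype(int)
print(instances[:10])
```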
6

A performance study for autoscaling big data analytics containerized applications : Scalability of Apache Spark on Kubernetes

Vennu, Vinay Kumar, Yepuru, Sai Ram January 2022 (has links)
Container technologies are rapidly changing how distributed applications are executed and managed on cloud computing resources. As containers can be deployed on a large scale, there is a tremendous need for container orchestration tools like Kubernetes that are highly automatic in deployment, scaling, and management. In recent times, the adoption of container technologies like Docker has seen a rise in internal usage, commercial offerings, and various application fields ranging from High-Performance Computing to geo-distributed (Edge or IoT) applications. Big Data analytics is another field where there is a trend to run applications (e.g., Apache Spark) as containers for elastic workloads and multi-tenant service models by leveraging container orchestration tools like Kubernetes. Despite the abundant research on the performance impact of containerizing big data applications, to the best of our knowledge, studies that focus on specific aspects like scalability and resource management remain largely unexplored, which leaves a research gap. This research studies the performance impact of autoscaling a big data analytics application on Kubernetes using autoscaling mechanisms such as the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA). The state-of-the-art autoscaling mechanisms available for scaling containerized applications on Kubernetes and the available big data benchmarking tools for generating workload on frameworks like Spark are identified through a literature review. Apache Spark is selected as a representative big data application due to its ecosystem and industry-wide adoption by enterprises. In particular, a series of experiments are conducted by adjusting resource parameters (such as CPU requests and limits) and autoscaling mechanisms to measure run-time metrics like execution time and CPU utilization. Our experiment results show that while Spark achieves better execution times when configured to scale with VPA, it also exhibits overhead in CPU utilization. In contrast, autoscaling big data applications with HPA adds overhead in terms of both execution time and CPU utilization. The research from this thesis can be used by researchers and other cloud practitioners working with big data applications to evaluate autoscaling mechanisms and derive better performance and resource utilization.
7

Performance Evaluation of Kubernetes Autoscaling strategies on GKE clusters / Prestandautverdering av autoskalningsstrategier på GKE-kluster

Nilsen, Johanna January 2023 (has links)
Cloud computing and containerisation have experienced significant growth in recent years. With cloud providers requiring users to specify resource limits and requests, the need for performance and resource optimisation has emerged in the cloud computing domain. This thesis focuses on examining three autoscaling approaches in the Kubernetes container orchestrator: Hybrid Pod Autoscaler, Vertical Pod Autoscaler (VPA), and Horizontal Pod Autoscaler (HPA). To conduct the analysis, a production-grade microservice was deployed on a GKE cluster, replicating the workload of the host company Nordnet Bank AB, a pan-Nordic platform for savings and investments. The main objective was to investigate the impact of the different autoscalers on the 50th and 99th percentile response times. The study also aimed to investigate whether a hybrid pod autoscaler, combining VPA and HPA, could outperform HPA and VPA in terms of response time and CPU usage. Additionally, the study aimed to identify the service metrics that an orchestrator can use to achieve response times similar to those obtained when resources are over-provisioned. The research findings indicate that response times varied significantly depending on the autoscaling strategy. While the 50th percentile response times remained consistent, the 99th percentile exhibited greater variation. Among the strategies, HPA demonstrated consistent performance, albeit with greater variability in the 99th percentile response times. The VPA strategy, in contrast, resulted in higher response times for both the 50th and 99th percentiles compared to the baseline. The hybrid approach generally outperformed VPA in terms of response times while showing comparable performance to HPA, although with slightly greater variability. CPU usage patterns of the hybrid approach were more closely aligned with HPA than VPA. CPU usage and request rate were effectively used as service metrics for orchestrators in achieving acceptable 99th percentile response times, as demonstrated by both HPA and the hybrid approach. Nevertheless, these findings are contingent on the specific autoscaler configuration, microservice, and workload model used in this study and may not be universally applicable.
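The evaluation metric used throughout is straightforward to reproduce; a sketch with synthetic latency samples standing in for the measured traces (the distributions and values below are invented, not the thesis's results):

```python
import numpy as np

def latency_percentiles(samples_ms):
    """p50 and p99 response times for one experiment run."""
    return np.percentile(samples_ms, 50), np.percentile(samples_ms, 99)

# Synthetic latency traces standing in for the per-strategy measurements.
rng = np.random.default_rng(1)
runs = {
    "hpa":    rng.lognormal(mean=3.7, sigma=0.35, size=5000),
    "vpa":    rng.lognormal(mean=3.9, sigma=0.45, size=5000),
    "hybrid": rng.lognormal(mean=3.7, sigma=0.40, size=5000),
}
for strategy, samples in runs.items():
    p50, p99 = latency_percentiles(samples)
    print(f"{strategy:6s}  p50={p50:6.1f} ms  p99={p99:6.1f} ms")
```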
8

Towards SLO-aware Resource Scheduling for Serverless Inference Workloads

Tripathy, Abhijit 08 August 2023 (has links)
The rapid advancement of Machine Learning (ML) and Deep Learning (DL) has revolutionized various domains, necessitating efficient and cost-effective ML inference capabilities. Function-as-a-Service (FaaS) has emerged as a promising approach for hosting ML inference services, providing a serverless computing environment that streamlines development cycles and offers scalability and simplified infrastructure management. However, existing autoscaling strategies employed by popular FaaS platforms often overlook critical factors such as response time and tail latency. Additionally, Python's Global Interpreter Lock (GIL) poses challenges for parallel computing in high-request traffic scenarios. This thesis addresses the need for efficient and cost-effective Machine Learning (ML) inference capabilities by exploring batching and autoscaling strategies for Serverless Inference instances. The study proposes a prototype FaaS framework that provides adaptive request batching, reactive autoscaling policies, and SLO monitoring, thus allowing Serverless Inference workloads to meet their SLO targets even during peak traffic. The proposed approach aims to optimize resource utilization, mitigate tail latency, and improve overall system performance. / Master of Science / Machine Learning (ML) and Deep Learning (DL) are advanced techniques that allow computers to learn from data and make predictions or decisions without being explicitly programmed. This has led to significant advancements in various fields. Inference refers to the process of applying a trained ML model to new data to make predictions or extract insights. In the context of ML, there is a growing need for efficient and cost-effective inference capabilities. A new approach called Function-as-a-Service (FaaS) has emerged that can address this need. FaaS is a way of abstracting the server infrastructure away from the developers. This means developers can focus on writing the ML code without worrying about managing the underlying infrastructure. FaaS offers benefits such as scalability, simplified infrastructure management, and faster development cycles. However, existing FaaS platforms face challenges in ensuring fast response times and handling high levels of incoming requests. This thesis aims to address these challenges by proposing a prototype FaaS framework. The framework incorporates adaptive request batching, reactive autoscaling policies, and Service-Level Objectives (SLOs) monitoring. Request batching allows the framework to process multiple requests together, improving efficiency. Autoscaling policies ensure the system dynamically adjusts its resources based on the incoming workload. Monitoring SLOs helps track and meet performance targets, even during peak traffic. By optimizing resource utilization, reducing delays in processing requests, and improving overall system performance, the proposed approach seeks to provide efficient and cost-effective ML inference capabilities in a serverless environment.
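As a sketch of the adaptive request batching idea described above (the batch size, timeout and class names are illustrative assumptions, not the thesis's implementation):

```python
import queue
import time

class RequestBatcher:
    """Minimal adaptive batcher: a batch is released when it reaches
    max_batch_size or when the oldest queued request has waited max_wait_s,
    whichever comes first."""

    def __init__(self, max_batch_size=8, max_wait_s=0.05):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self._queue = queue.Queue()

    def submit(self, request):
        self._queue.put((time.monotonic(), request))

    def next_batch(self):
        batch = []
        deadline = None
        while len(batch) < self.max_batch_size:
            timeout = None if deadline is None else max(0.0, deadline - time.monotonic())
            try:
                arrived, request = self._queue.get(timeout=timeout)
            except queue.Empty:
                break                      # deadline passed: release what we have
            if deadline is None:
                deadline = arrived + self.max_wait_s
            batch.append(request)
        return batch

if __name__ == "__main__":
    batcher = RequestBatcher(max_batch_size=4, max_wait_s=0.1)
    for i in range(6):
        batcher.submit({"input": i})
    print(batcher.next_batch())   # first 4 requests, released on size
    print(batcher.next_batch())   # remaining 2, released after max_wait_s
```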
9

Intelligent autoscaling in Kubernetes : the impact of container performance indicators in model-free DRL methods / Intelligent autoscaling in Kubernetes : påverkan av containerprestanda-indikatorer i modellfria DRL-metoder

Praturlon, Tommaso January 2023 (has links)
A key challenge in the field of cloud computing is to automatically scale software containers in a way that accurately matches the demand for the services they run. To manage such components, container orchestrator tools such as Kubernetes are employed, and in the past few years, researchers have attempted to optimise its autoscaling mechanism with different approaches. Recent studies have showcased the potential of Actor-Critic Deep Reinforcement Learning (DRL) methods in container orchestration, demonstrating their effectiveness in various use cases. However, despite the availability of solutions that integrate multiple container performance metrics to evaluate autoscaling decisions, a critical gap exists in understanding how model-free DRL algorithms interact with a state space based on those metrics. Thus, the primary objective of this thesis is to investigate the impact of the state space definition on the performance of model-free DRL methods in the context of horizontal autoscaling within Kubernetes clusters. In particular, our findings reveal distinct behaviours associated with various sets of metrics. Notably, those sets that exclusively incorporate parameters present in the reward function demonstrate superior effectiveness. Furthermore, our results provide valuable insights when compared to related works, as our experiments demonstrate that a careful metric selection can lead to remarkable Service Level Agreement (SLA) compliance, with as low as 0.55% violations and even surpassing baseline performance in certain scenarios.
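To make the state-space question concrete, here is a small hypothetical sketch of assembling a state vector from a chosen subset of container metrics and of a reward that penalises SLA violations; the metric names, SLA value and weights are ours, not the thesis's:

```python
import numpy as np

def make_state(metrics, include=("cpu_util", "replicas", "p95_latency_ms")):
    """State vector assembled from a chosen subset of container metrics.
    Which subset to include is exactly the design choice studied here."""
    return np.array([metrics[name] for name in include], dtype=np.float32)

def reward(metrics, sla_latency_ms=200.0, cost_weight=0.1):
    """Illustrative reward: penalise SLA violations, lightly penalise replicas."""
    violation = max(0.0, metrics["p95_latency_ms"] - sla_latency_ms) / sla_latency_ms
    return -violation - cost_weight * metrics["replicas"]

observed = {"cpu_util": 0.72, "replicas": 4, "p95_latency_ms": 240.0,
            "memory_util": 0.55, "rps": 310.0}

# A state built only from quantities that also appear in the reward
# (latency, replicas) versus one that adds extra metrics such as memory.
print(make_state(observed, include=("p95_latency_ms", "replicas")))
print(make_state(observed, include=("p95_latency_ms", "replicas", "memory_util")))
print(reward(observed))
```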
10

Gestion autonomique de l'élasticité multi-couche des applications dans le Cloud : vers une utilisation efficiente des ressources et des services du Cloud / Crosslayer elasticity management for Cloud : towards an efficient usage of Cloud resources and services

Dupont, Simon 26 April 2016 (has links)
Cloud computing, through its layered model and access to its on-demand services, has changed the way infrastructures are managed (IaaS) and software is produced (SaaS). With the advent of IaaS elasticity, the amount of resources can be automatically adjusted according to demand to satisfy a certain level of quality of service (QoS) for customers while minimizing the underlying operating costs. The current elasticity model, based on adjusting IaaS resources through basic autoscaling services, is reaching its limits in terms of responsiveness and adaptation granularity. Moreover, although it is an essential feature of Cloud computing, elasticity remains poorly tooled, which prevents the various Cloud actors from fully enjoying its benefits. In this thesis, we propose to extend the concept of elasticity to the higher layers of the Cloud, and more precisely to the SaaS level. We present the new concept of software elasticity, defined as the ability of the software to adapt, ideally in an autonomous way, to cope with workload changes and/or limitations of IaaS elasticity. Elasticity is therefore considered in a cross-layer, multi-level way, through the adaptation of all kinds of Cloud resources. To this end, we present a model for the autonomic management of multi-layer elasticity and the associated framework ElaStuff. In order to equip and industrialize the elasticity management process, we propose the perCEPtion monitoring tool, based on complex event processing, which enables administrators to set up advanced observation of the Cloud system. In addition, we propose a domain-specific language (DSL) for multi-layer elasticity, called ElaScript, which makes it possible to express, simply and effectively, reconfiguration plans orchestrating elasticity actions at different levels. Finally, our proposal to extend Cloud elasticity to higher layers, particularly to SaaS, is validated experimentally from several perspectives (QoS, energy, responsiveness and accuracy of the scaling, etc.).
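Since ElaScript itself is not reproduced in the abstract, the following is only a hypothetical sketch of what a cross-layer reconfiguration plan could express, ordering infrastructure-level and software-level actions in a single plan; none of the names or actions below come from the thesis:

```python
# Hypothetical illustration of a cross-layer reconfiguration plan -- this is
# NOT the ElaScript syntax, just a sketch of mixing IaaS and SaaS actions.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Action:
    layer: str                      # "iaas" or "saas"
    description: str
    run: Callable[[], None]

def execute_plan(plan: List[Action]):
    for action in plan:
        print(f"[{action.layer}] {action.description}")
        action.run()

plan = [
    Action("iaas", "add one VM to the web tier", lambda: None),
    Action("saas", "degrade image quality while the VM boots", lambda: None),
    Action("saas", "restore full image quality once the VM is ready", lambda: None),
]
execute_plan(plan)
```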
