11 |
Job Schedule and Cloud Auto-Scaling for Repetitive Computation
Dannetun, Victor (January 2016)
Cloud computing’s growing popularity is based on the cloud’s flexibility and the availability of a huge amount of resources. Today, cloud providers offer a wide range of predefined solutions, VM (virtual machine) sizes, and customization options that differ in performance, support, and price. This thesis investigates how to minimize cost within specified performance goals for a commercial service whose computation occurs in a repetitive pattern. A promising multilevel queue scheduling scheme and a set of auto-scaling rules are presented that fulfil computation deadlines and job prioritization while lowering server cost. In addition, an investigation into the optimal VM size in terms of cost and performance points out further areas of cloud service optimization.
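The combination of multilevel queue scheduling and deadline-driven scaling mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Job` fields, the convention that level 0 is the highest priority, and the `servers_needed` rule (a lower bound that assumes work can be spread perfectly across servers) are all assumptions made for illustration.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int      # queue level; 0 is served first (assumed convention)
    runtime_s: float   # estimated runtime in seconds
    deadline_s: float  # seconds from now by which the job must finish


class MultilevelQueue:
    """One FIFO queue per priority level; lower levels are drained first."""

    def __init__(self, levels: int):
        self.queues = [deque() for _ in range(levels)]

    def submit(self, job: Job) -> None:
        self.queues[job.priority].append(job)

    def next_job(self):
        # Scan levels from highest priority (0) downwards.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None

    def pending(self):
        return [job for queue in self.queues for job in queue]


def servers_needed(jobs) -> int:
    """Lower bound on servers so that all work due by each deadline fits.

    Hypothetical scaling rule: for every deadline, the cumulative runtime
    due by then, divided by the time available, gives a required capacity.
    """
    needed, work = 0, 0.0
    for job in sorted(jobs, key=lambda j: j.deadline_s):
        work += job.runtime_s
        needed = max(needed, math.ceil(work / job.deadline_s))
    return needed
```

A scaling loop built on this sketch would compare `servers_needed(mq.pending())` with the number of running servers and add or remove VMs accordingly, while workers pull jobs with `next_job()`.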
|
12 |
Approche dirigée par les contrats de niveaux de service pour la gestion de l'élasticité du "nuage" / SLA-driven cloud elasticity management approach
Kouki, Yousri (09 December 2013)
Cloud computing promises to completely revolutionize the way resources are managed. Thanks to elasticity, resources can be provisioned within minutes to satisfy a required level of Quality of Service (QoS) formalized by Service Level Agreements (SLAs) between different cloud actors. The main challenge for service providers is to maintain their consumers' satisfaction while minimizing the service costs due to resource fees. From the SaaS point of view, for example, this challenge can be addressed in an ad hoc manner by allocating and releasing resources based on a set of predefined rules, as Amazon Auto Scaling does. However, doing it right, in a way that maintains end-user satisfaction while optimizing service cost, is not a trivial task. First, the difficulty of profiling service performance may compromise the accuracy of capacity planning. Second, several parameters must be taken into account, such as multiple resource types, non-negligible resource initialization time, and the IaaS billing model. To that end, we propose a complete solution for cloud service level management. We first introduce CSLA (Cloud Service Level Agreement), a dedicated language for describing SLAs for cloud services. It addresses SLA violations in a fine-grained way through functionality/QoS degradation and an advanced penalty model. We then propose HybridScale, an SLA-driven auto-scaling framework. It implements cloud elasticity in a triple hybrid way: reactive-proactive management, vertical-horizontal scaling, and cross-layer (application-infrastructure) scaling. Our solution is experimentally validated on Amazon EC2.
|
13 |
Performance evaluation of auto scaling mechanisms in private clouds for supporting a web service application
CAMPOS, Eliomar Gomes (03 August 2015)
Submitted by Haroudo Xavier Filho (haroudo.xavierfo@ufpe.br) on 2016-03-11T13:40:44Z
Made available in DSpace on 2016-03-11T13:40:44Z (GMT).
Previous issue date: 2015-08-03 / FACEPE
Composite web services, also known as mashups, are useful for building added-value products on the web. Cloud computing environments have been widely used for hosting web services because available resources can be increased or decreased through automatic mechanisms (i.e., auto scaling). Such elastic behavior eases the task of reaching satisfactory performance at peaks of demand without wasting resources. However, it is hard to determine the right components to tune when the performance of such systems needs adjusting. This study evaluates the performance of auto scaling mechanisms for private clouds hosting an event recommendation web service. A hierarchical modeling approach is used to cope with the complexity of such a system and to represent specific details of these mechanisms. Our study applies parametric sensitivity analysis to several performance metrics of the models, such as the mean execution time of the auto scaling monitoring, the mean time of VM instantiation, and the mean response time perceived by the web service user. We also carried out a general full factorial experiment in order to calculate the relevance and effect of each factor involved in the auto scaling and virtual machine (VM) instantiation processes. For the auto scaling monitoring, we analyze the following factors: the collection period of a metric, the number of monitored virtual machines, and the monitoring time of a metric. For the instantiation process, the following factors were chosen: VM type, VM image size, and VM caching. This analysis makes it possible to check the impact of the parameters on the system response time and to point out effective ways of improving performance.
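A full factorial experiment of the kind described in this abstract runs every combination of factor levels and then compares mean responses per level. The sketch below illustrates the idea for the three instantiation factors named in the abstract; the level values and the function names `full_factorial` and `main_effect` are hypothetical and are not taken from the dissertation.

```python
from itertools import product
from statistics import mean

# Factor names follow the abstract; the levels are invented for illustration.
FACTORS = {
    "vm_type": ["small", "medium"],
    "image_size_gb": [1, 4],
    "caching": [False, True],
}


def full_factorial(factors):
    """Return one run configuration per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def main_effect(runs, responses, factor):
    """Mean response at each level of one factor, averaged over the others."""
    by_level = {}
    for run, response in zip(runs, responses):
        by_level.setdefault(run[factor], []).append(response)
    return {level: mean(values) for level, values in by_level.items()}
```

With two levels per factor this yields 2 x 2 x 2 = 8 runs. Measured instantiation times would be passed as `responses`, in the same order as the runs, to see how much each factor shifts the mean.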
|
14 |
Cloud Auto-Scaling Control Engine Based on Machine Learning
You, Yantian (January 2018)
With the development of modern data centers and networks, many service providers have moved most of their computing functions to the cloud. Given the limits on network bandwidth and on hardware and virtual resources, managing the different virtual resources in a cloud environment so as to achieve better resource allocation is a major problem. Although some cloud infrastructures provide simple default auto-scaling and orchestration mechanisms, such as the OpenStack Heat service, these usually depend on a single parameter, such as CPU utilization, and cannot respond to network changes in a timely manner.
This thesis investigates different auto-scaling mechanisms and designs an online control engine that cooperates with different OpenStack service APIs based on various network resource data. Two auto-scaling engines, a Heat orchestration-based engine and a machine learning-based online control engine, have been developed and compared for different client request patterns. Two machine learning methods, a neural network and linear regression, were considered for generating a control signal from real-time network data. This thesis also shows the network's non-linear behavior under heavy traffic and proposes a scaling policy based on deep network analysis.
The results show that, in offline training, the neural network and linear regression achieve 81.5% and 84.8% accuracy respectively. In online testing with different client request patterns, however, the neural network results differed from what we expected, while linear regression gave much better results. The model comparison showed that the two auto-scaling mechanisms behave similarly for a SMOOTH-load pattern. For the SPIKEY-load pattern, however, the linear regression-based online control engine responded faster to network changes, while the Heat orchestration service showed some delay. Compared with the proposed scaling policy, which uses fewer web servers while keeping response latency acceptable, both auto-scaling models waste network resources.
|
15 |
Optimizing Resource Allocation in Kubernetes : A Hybrid Auto-Scaling Approach / Optimering av resurstilldelning i Kubernetes : En hybrid auto-skalningsansats
Chiminelli, Brando (January 2023)
This thesis addresses the challenges of resource management in cloud environments, specifically in the context of running resource-optimized applications on Kubernetes. The scale and growth of cloud services, coupled with the dynamic nature of workloads, make it difficult to manage resources efficiently and control costs. The objective of this thesis is to explore proactive autoscaling of virtual resources based on traffic demand, aiming to improve on the current reactive approach, the Horizontal Pod Autoscaler (HPA), which relies on predefined rules and threshold values. Proactive autoscaling allows resource allocation to be optimized ahead of time, leading to improved resource utilization and cost savings. The aim is to strike a balance between resource utilization and the risk of Service Level Agreement (SLA) violations while optimizing resource usage for microservices. The study involves generating predictions and assessing resource utilization for both the current HPA implementation and the proposed solution. Comparing resource utilization and cost implications makes it possible to determine the economic feasibility and benefits of adopting the new approach. The analysis shows significant improvements in CPU utilization and resource consumption with the proposed approach compared to the current HPA implementation. The proactive strategy handles the same number of requests with fewer replicas, resulting in improved efficiency. The proposed solution has the potential to be applied to any type of service running on Kubernetes, at low computational cost. In conclusion, the analysis demonstrates the potential for resource optimization and cost savings through the proposed approach. By adopting proactive strategies and accurately predicting resource needs, organizations can achieve efficient resource utilization, system robustness, and compliance with SLAs. Further research and enhancements can be explored based on the findings of this analysis.
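The contrast between reactive and proactive scaling drawn in this abstract can be sketched in a few lines. `hpa_desired_replicas` follows the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, simplified here by omitting the tolerance band and stabilization behavior. `proactive_replicas` is a hypothetical forecast-based rule added for illustration only; it is not the method proposed in the thesis, and its parameter names are assumptions.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """Reactive rule: scale in proportion to how far the metric is from target.

    desired = ceil(current_replicas * current_metric / target_metric)
    """
    return math.ceil(current_replicas * current_metric / target_metric)


def proactive_replicas(forecast_rps: float,
                       rps_per_replica: float,
                       min_replicas: int = 1) -> int:
    """Hypothetical proactive rule: size the deployment for predicted traffic.

    forecast_rps would come from a traffic prediction model; rps_per_replica
    is an assumed, separately measured capacity of a single replica.
    """
    return max(min_replicas, math.ceil(forecast_rps / rps_per_replica))
```

For example, four replicas averaging 90% CPU against a 60% target give `hpa_desired_replicas(4, 90, 60)`, which evaluates to 6, but only after the load has already risen. The proactive rule can request capacity before the predicted traffic arrives.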
|